Creascience :: Consulting in Statistics Statistics Training 
 Exploratory Data Analysis – Understanding Your Data through the Use of Graphics

EDA1
 
Duration: 1 day
 
 

:: Course Summary

We are awash in an ocean of information! Displaying and graphing data have become a fundamental part of data analysis. Visualization is used in initial data inspection and exploration, in model building and validation, and in communicating results. Graphical techniques are thus central to the process of abstracting knowledge from information.
This one-day course will provide the practitioner/data analyst with the graphics and plotting procedures required to effectively explore data sets and to help reveal their underlying structure.

:: Learning Objectives

  • Understand the fundamentals of static graphing tools for both univariate and multivariate data
  • Use graphical techniques to explore the nature of data distributions
  • Understand scaling/aspect ratio
  • Appreciate the problems of graphing large data sets
  • Recognize the usefulness of interactive graphics and learn how to incorporate them into your data analysis
  • Understand linked brushing and how to use this technique
  • Use the tour technique to look for high-dimensional structure
  • Take a dataset and use course procedures to search for interesting features
  • Acquire the practical skills needed to apply course techniques to your data
  • Learn to judge the quality of a graph

:: Target Audience

This course will help practitioners in applied data analysis in business and government; it will also be valuable to students and researchers in physical and life sciences, engineering, computer sciences and statistics.

:: Prerequisite

Participants are assumed to have a basic knowledge of statistics; the necessary basics are nevertheless summarized during the course as required.
  

:: Topics Covered

    Static graphics
    • Histograms – their use and misuse
    • Quantile plotting – exploring the nature of the data distribution
    • Box plots – looking at the data for many variables
    • Jittering – when you have many data points of the same value
    • Hexagon binning – a tool to use when your data set is large
    • SPLOM – looking at more than one scatterplot
    • Parallel coordinate plots – plotting many variables on parallel axes in a single display
    • Mosaic plots – what to do with multivariate categorical data
    Dynamic/interactive graphics
    • Conditioning your plots – revealing structure in subsets of your data
    • Touring – taking a spin to visualize the nature of high-dimensional data
    • Selection & identification – presenting information about a point or clusters of points
    • Linking & brushing – multiple views of the same data shown simultaneously
    • Transformations and scaling your plots – should the axis be changed?
    • Missing values – what is there and not there?
    Some of the plotting software demonstrated in the course (all open source):
    • R-lattice
    • GGobi
    • Mondrian
    • ggplot
    • iplots
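
Two of the static-graphics topics listed above, jittering and hexagon binning, can be illustrated with a short sketch. It is written in plain Python rather than in any of the packages named in the list, and it only computes the numbers behind the plots. The function names (`jitter`, `hex_cell`, `hexbin_counts`), the noise amount and the hexagon size are all assumptions made for this illustration, not course material.

```python
import math
import random
from collections import Counter


def jitter(values, amount, rng):
    """Add uniform noise in [-amount, amount] so tied values stop overlapping."""
    return [v + rng.uniform(-amount, amount) for v in values]


def hex_cell(x, y, size):
    """Axial (q, r) index of the pointy-top hexagon (circumradius `size`)
    that contains the point (x, y)."""
    # Fractional axial coordinates of the point.
    q = (math.sqrt(3) / 3 * x - y / 3) / size
    r = (2 / 3 * y) / size
    s = -q - r
    # Round to the nearest hexagon, repairing the coordinate that moved most
    # so that q + r + s stays equal to zero.
    rq, rr, rs = round(q), round(r), round(s)
    dq, dr, ds = abs(rq - q), abs(rr - r), abs(rs - s)
    if dq > dr and dq > ds:
        rq = -rr - rs
    elif dr > ds:
        rr = -rq - rs
    return rq, rr


def hexbin_counts(points, size):
    """Number of points falling in each hexagonal cell."""
    return Counter(hex_cell(x, y, size) for x, y in points)


rng = random.Random(0)

# Jittering: 300 ratings with only three distinct values (made-up data).
ratings = [1, 2, 2, 3, 3, 3] * 50
jittered = jitter(ratings, 0.2, rng)

# Hexagon binning: 5000 simulated points reduced to a few cell counts.
points = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(5000)]
counts = hexbin_counts(points, size=0.5)
```

Plotting one shaded hexagon per entry of `counts`, instead of 5000 overlapping symbols, is what makes the technique useful for large data sets.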

:: Course Content

The course is divided into two sessions. The first (morning) session explores static graphics, including the traditional histogram, cumulative distribution plots and Q-normal plots. Jittering, box plots, hexagon binning and mosaic plots for displaying categorical data are also discussed. The session covers the graphing of both univariate and multivariate data sets. Discussion focuses on how to apply these tools and use them properly within a problem-solving environment, and the session also reviews the potential problems of graphing large data sets.
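
As a rough illustration of two of the static tools from the morning session, the sketch below computes histogram counts for two bin widths and the coordinates of a normal quantile plot. It uses only the Python standard library, not the software demonstrated in the course. The simulated sample, the function names and the plotting positions (i - 0.5) / n are assumptions chosen for the example.

```python
import random
from statistics import NormalDist


def histogram_counts(values, bin_width, origin=0.0):
    """Count values per bin; bin k covers [origin + k*w, origin + (k+1)*w)."""
    counts = {}
    for v in values:
        k = int((v - origin) // bin_width)
        counts[k] = counts.get(k, 0) + 1
    return dict(sorted(counts.items()))


def normal_quantile_pairs(values):
    """Return (theoretical quantile, observed value) pairs for a normal
    quantile plot of the data."""
    xs = sorted(values)
    n = len(xs)
    std_normal = NormalDist()
    # Plotting positions (i - 0.5) / n: one common convention, assumed here.
    return [
        (std_normal.inv_cdf((i - 0.5) / n), x)
        for i, x in enumerate(xs, start=1)
    ]


random.seed(1)
# Simulated sample (an assumption): 200 draws from a normal distribution.
sample = [random.gauss(10.0, 2.0) for _ in range(200)]

# Same data, two bin widths: the apparent shape depends on this choice,
# which is one way a histogram can be misused.
coarse = histogram_counts(sample, bin_width=4.0)
fine = histogram_counts(sample, bin_width=0.5)

# Points that fall close to a straight line suggest a roughly normal sample.
pairs = normal_quantile_pairs(sample)
```

Plotting the second element of each pair against the first gives the quantile plot; comparing `coarse` with `fine` shows how strongly a histogram depends on its bin width.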

The second (afternoon) session is dedicated to dynamic and interactive graphics. Topics include sub-setting and conditioning, linked windows and graphical tours, all applied within the context of the familiar scatterplot, bar chart and parallel coordinate plot. Direct manipulation of the plots includes scaling, moving points, linked brushing and identification using categorical variables. A special section of the session is devoted to the important problem of dealing with missing data. This session introduces GGobi, an open source, dynamic and interactive graphics program.
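
The ideas behind linked brushing, conditioning and missing-value inspection can be sketched without any graphics at all. The plain-Python example below is not GGobi's interface: the toy records, the variable names and every function in it are hypothetical. It shows only the bookkeeping: a brush selects record ids in one view, and every other view highlights the same ids.

```python
# Hypothetical toy data set; None marks a missing value.
records = [
    {"id": 0, "group": "a", "x": 1.0, "y": 2.0, "z": 5.0},
    {"id": 1, "group": "a", "x": 2.0, "y": None, "z": 4.0},
    {"id": 2, "group": "b", "x": 3.0, "y": 1.5, "z": None},
    {"id": 3, "group": "b", "x": 4.0, "y": 3.5, "z": 1.0},
]


def brush(rows, var, low, high):
    """Ids of the rows whose value of `var` lies inside the brush [low, high]."""
    return {r["id"] for r in rows if r[var] is not None and low <= r[var] <= high}


def linked_points(rows, x_var, y_var, selected):
    """Points for another view; each carries the shared highlight flag.
    Rows with a missing coordinate cannot be drawn and are skipped."""
    return [
        (r[x_var], r[y_var], r["id"] in selected)
        for r in rows
        if r[x_var] is not None and r[y_var] is not None
    ]


def condition_on(rows, var):
    """Split the rows into one subset per level of a categorical variable."""
    subsets = {}
    for r in rows:
        subsets.setdefault(r[var], []).append(r)
    return subsets


def missing_counts(rows, variables):
    """Number of missing values per variable."""
    return {v: sum(r[v] is None for r in rows) for v in variables}


selected = brush(records, "x", 1.5, 3.5)               # brush drawn in a view of x
view_xz = linked_points(records, "x", "z", selected)   # same ids highlighted in (x, z)
by_group = condition_on(records, "group")              # one panel per group
n_missing = missing_counts(records, ["x", "y", "z"])
```

In this toy data, record 2 is brushed but has no `z` value, so it silently drops out of the (x, z) view. This is the kind of effect that makes a dedicated look at missing values worthwhile.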

Two class demonstrations will review the techniques discussed and show the procedures for generating the different types of graphical applications. These demonstrations are complemented with two tutorial sessions in which the participants work along with the instructor in order to analyze a data set using several of the graphing/plotting procedures presented. Participants are encouraged to use their own data.

 

Upcoming Public Sessions

 Location              Date               Language   Seats Left   Regular Price   Early-Registration Price
 Montreal, Canada      October 7, 2008    English    7            CA$650.00       CA$552.50
 San Francisco, USA    October 21, 2008   English    7            CA$750.00       CA$637.50
 

Offered Discounts

  • Register more than 6 weeks before a session date and get a 15% discount (Displayed above if available).
  • Register 2 persons or more and get a 10% discount (Applied at checkout).
  • Register for 2 sessions or more and get a 10% discount (Applied at checkout).
 