:: Topics Covered
- Histograms - their use and misuse
- Quantile plotting - exploring the nature of the data distribution
- Box plots - looking at the data for many variables
- Jittering - when you have many data points of the same value
- Hexagon binning - a tool to use when your data set is large
- SPLOM - looking at more than one scatterplot
- Parallel coordinate plots - plotting many variables in one dimension
- Mosaic plots - what to do with multivariate categorical data
Dynamic/interactive graphics
- Conditioning your plots - revealing structure in subsets of your data
- Touring - taking a spin to visualize the nature of high dimensional data
- Selection & identification - presenting information about a point or clusters of points
- Linking & brushing - multiple views of the same data shown simultaneously
- Transformations and scaling your plots - should the axis be changed?
- Missing values - what is there and not there?
Some of the plotting software demonstrated in the course (all open source):
- R-lattice
- GGobi
- Mondrian
- ggplot
- iplots
:: Course Content
The course is divided
into two sections: the first (morning) session explores the use of static
graphics, including the traditional histogram, cumulative distribution
plots, and Q-normal plots. Jittering, box plots, hexagon binning, and the
use of mosaic plots for displaying categorical data will also be discussed.
The graphing of both univariate and multivariate data sets is included
during this session. Discussion will focus on how to apply these tools
properly within a problem-solving environment.
This session will also review the potential problems of graphing large
data sets.
The second (afternoon)
session is dedicated to dynamic and interactive graphics. Topics include
sub-setting and conditioning, linked windows, and graphical tours, all
applied within the context of the familiar scatterplot, bar chart, and
parallel coordinate plot. Direct manipulation of the plots
includes scaling, moving points, linked brushing and identification
using categorical variables. A special section of the session is devoted
to the important problem of dealing with missing data. This session
introduces GGobi, an open-source program for dynamic and interactive
graphics.
Two class demonstrations
will review the techniques discussed and show the procedures for generating
the different types of graphical applications. These demonstrations
are complemented with two tutorial sessions in which the participants
work along with the instructor in order to analyze a data set using
several of the graphing/plotting procedures presented. Participants
are encouraged to use their own data.