A chart is a diagram that makes information easier to understand by showing how two or more sets of data are related.
Two common types of chart are the pie chart and the bar chart.
Charts are also often called "graphs".
Bar Graphs
Histogram
Frequency Polygon
Pie chart
Pictograph
Map chart
Line charts
Dot graphs
Box plot
Radar chart
Bubble chart
Bar Graphs
The bar chart (or column chart) is the simplest and most versatile of statistical diagrams. It is used for comparing the frequency, count, total or average of data in different categories.
A bar graph displays discrete data in separate columns. A double bar graph can be used to compare two data sets. Categories are considered unordered and can be rearranged alphabetically, by size, etc.
A bar chart such as this is used to show how different sets of information compare.
Source: City Population, http://www.citypopulation.de/cities.html
In a vertical bar chart (or Column chart in Excel) the height
of a bar is used to represent the frequency. All bars must be the same
width and evenly spaced along the x (category) axis. Hence bars should
never overlap. The gaps between the bars are used to emphasise
the distinction between the categories or discrete values. As a general
rule, the gaps should be about 30-70% of the width of the bar. If the
gaps are too wide the chart will become unnecessarily large. Sometimes
it may be appropriate to leave either no gap or a very small gap.
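As a rough illustration of the gap rule above, the following Python sketch computes where each bar starts for a given bar width and gap. The function name `bar_positions` and the default gap of 50% of the bar width are assumptions made for this example only.

```python
def bar_positions(n_bars, bar_width=1.0, gap_ratio=0.5):
    """Return the left edge of each bar along the category axis.

    gap_ratio is the gap as a fraction of the bar width (hypothetical
    parameter); the general rule in the text suggests about 0.3 to 0.7.
    """
    if gap_ratio < 0:
        raise ValueError("a negative gap would make the bars overlap")
    step = bar_width * (1 + gap_ratio)  # distance between successive left edges
    return [i * step for i in range(n_bars)]


print(bar_positions(4, bar_width=2.0, gap_ratio=0.5))
```

Because every bar uses the same width and the same step, the bars stay evenly spaced and cannot overlap.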
A horizontal bar chart: If there are a large number of bars, or the category labels are long,
it is sensible to rotate the bar chart so that the bars are horizontal.
Otherwise the category labels would have to be printed vertically,
making them difficult to read.
Note that the category axis is now vertical and the frequency axis is horizontal.
In Excel a horizontal bar chart is called simply a Bar chart.
A component (or stacked) bar chart is used to show the
breakdown of a total measurement into several categories. In the
component chart it is only possible to judge the size of the bottom
category precisely. However, the total is clearly visible and it is
easier to see the proportion of the total in each component.
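The point about the bottom category can be seen from how a stacked bar is built. The sketch below uses made-up values and a hypothetical helper, `stack_components`. It returns where each segment starts and ends, along with its share of the total.

```python
def stack_components(values):
    """Return (bottom, top, share_of_total) for each component of one stacked bar."""
    total = sum(values)
    segments = []
    bottom = 0
    for v in values:
        segments.append((bottom, bottom + v, v / total))
        bottom += v  # the next component starts where this one ends
    return segments


# Hypothetical breakdown of a total of 100 into three components.
for segment in stack_components([20, 30, 50]):
    print(segment)
```

Only the first segment starts at zero, which is why only the bottom category can be read directly off the axis. The last segment ends at the total.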
Advantages
- Visually strong
- Can easily compare two or three data sets
Disadvantages
- Graph categories can be reordered to emphasize certain effects
- Use only with discrete data
Histogram
A histogram displays continuous data in ordered columns. The categories are
intervals of a continuous measure such as time, length in inches, temperature, etc.
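A minimal sketch of the grouping step behind a histogram is given below. It assumes equal-width classes; the function name, the class boundaries and the temperature readings are all invented for the example.

```python
def histogram_counts(data, start, width, n_bins):
    """Count how many values fall into each of n_bins equal-width classes.

    Class i covers start + i*width <= x < start + (i+1)*width.
    Values outside the overall range are ignored in this sketch.
    """
    counts = [0] * n_bins
    for x in data:
        i = int((x - start) // width)
        if 0 <= i < n_bins:
            counts[i] += 1
    return counts


temps = [12.5, 14.1, 15.0, 17.8, 18.2, 19.9, 21.3, 24.0]  # hypothetical readings
print(histogram_counts(temps, start=10, width=5, n_bins=3))
```

The individual readings are lost once they are counted into classes, so exact values cannot be read back from the finished chart.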
Advantages
- Visually strong
- Can compare to normal curve
- The vertical axis is usually a frequency count of the items falling into each category
Disadvantages
- Cannot read exact values because data is grouped into categories
- More difficult to compare two data sets
- Use only with continuous data
Frequency Polygon
A frequency polygon (1) can be made from a line graph by shading in the area
beneath the graph. It can be made from a histogram by joining midpoints
of each column.
Advocates of the frequency polygon argue that
the purpose of a histogram is to show the shape of the data
distribution and removing the bars makes the shape clearer and smoother.
Critics argue that the class boundaries are then more difficult to see.
Also the y-axis is more difficult to interpret once the bars are
removed.
The main problem with frequency polygons is deciding what to do with
the endpoints. If one imagines an additional empty bar at each end of
the histogram, then the polygon would look like the chart in (1). However, this gives the impression that a few of the marks were
negative and some over 100! On the other hand, the chart (2) makes the areas at either end too small and is clearly wrong.
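The endpoint problem can be reproduced with a short sketch. The class layout (marks from 0 to 100 in classes of width 20), the counts and the function name `frequency_polygon` are assumptions for illustration.

```python
def frequency_polygon(start, width, counts, anchor_ends=True):
    """Return (x, y) points at the midpoint of each histogram class.

    With anchor_ends=True an empty class is imagined at each end, so the
    polygon starts and finishes on the horizontal axis.
    """
    points = [(start + (i + 0.5) * width, c) for i, c in enumerate(counts)]
    if anchor_ends:
        points.insert(0, (start - 0.5 * width, 0))
        points.append((start + (len(counts) + 0.5) * width, 0))
    return points


# Hypothetical marks grouped into classes 0-20, 20-40, ..., 80-100.
for point in frequency_polygon(0, 20, [1, 4, 9, 5, 2]):
    print(point)
```

With these classes the two anchor points fall at -10 and 110, outside the range of possible marks. This is the misleading impression described above.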
Disadvantages
- Anchors at both ends may imply zero as data points
- Use only with continuous data
Pie chart
A pie chart displays data as a percentage of the whole. Each pie section
should have a label and percentage. A total data number should be included.
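The percentage and the angle of each pie section follow directly from the counts. The sketch below uses invented data and a hypothetical helper named `pie_slices`.

```python
def pie_slices(counts):
    """Return (label, percent_of_total, angle_in_degrees) for each category."""
    total = sum(counts.values())
    return [(label, 100 * n / total, 360 * n / total) for label, n in counts.items()]


votes = {"A": 50, "B": 30, "C": 20}  # hypothetical data, total 100
for label, percent, angle in pie_slices(votes):
    print(f"{label}: {percent:.0f}% ({angle:.0f} degrees)")
print("Total:", sum(votes.values()))
```

Printing the total alongside the percentages covers the recommendation to include a total data number.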
Advantages
- Visually appealing
- Shows percent of total for each category
Disadvantages
- No exact numerical data
- Hard to compare 2 data sets
- "Other" category can be a problem
- Total unknown unless specified
- Best for 3 to 7 categories
- Use only with discrete data
Pictograph
A pictograph uses an icon to represent a quantity of data values in order
to decrease the size of the graph. A key must be used to explain the icon.
There are three ways in which the pictures can be used to represent the data:
- varying both the length and width of the picture, and hence its area. In this type of pictogram each data value (frequency, percentage, etc.) is represented by the area of a picture which relates to the data. A square root rule must be applied to ensure the pictures are in the correct proportions: for the area to be proportional to the data value, the length and width must each be scaled by the square root of the ratio between values;
- keeping the width of the picture constant and stretching the length;
- keeping the width and length constant but stacking multiple copies of the picture.
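The square root rule mentioned in the first option can be sketched as follows. The function name `picture_scale` and the idea of a reference value are assumptions introduced for the example.

```python
import math


def picture_scale(value, reference_value):
    """Factor by which to scale BOTH length and width of the reference picture
    so that the picture's area is proportional to value."""
    return math.sqrt(value / reference_value)


# A value four times the reference needs a picture only twice as long and wide.
print(picture_scale(400, 100))
```

Scaling the length and width by the value ratio itself would make the area grow with the square of the value and exaggerate the difference.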
Advantages
- Easy to read
- Visually appealing
- Handles large data sets easily using keyed icons
Disadvantages
- Hard to quantify partial icons
- Icons must be of consistent size
- Best for only 2-6 categories
- Very simplistic
Map chart
A map chart displays data by shading sections of a map, and must include
a key. A total data number should be included.
Advantages
- Good visual appeal
- Overall trends show well
Disadvantages
- Needs limited categories
- No exact numerical values
- Color key can skew visual interpretation
Line charts
Line plot
A line plot is a graph using marks (e.g., X, ·) above a number on a number line to show the frequency of data.
It can be used as an initial record of discrete data values.
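A line plot is simple enough to draw as text. The sketch below assumes small single-digit whole-number data so that the columns line up; `line_plot` is a hypothetical helper, not a standard function.

```python
from collections import Counter


def line_plot(values):
    """Build a text line plot: one X per observation, stacked above its value."""
    counts = Counter(values)
    ticks = range(min(values), max(values) + 1)
    rows = []
    for level in range(max(counts.values()), 0, -1):
        row = " ".join("X" if counts[t] >= level else " " for t in ticks)
        rows.append(row.rstrip())
    rows.append(" ".join(str(t) for t in ticks))  # the number line itself
    return "\n".join(rows)


print(line_plot([1, 2, 2, 3, 3, 3, 5]))
```

In the printed result the empty column above 4 shows a gap in the data, and every original value can still be read back from the plot.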
Advantages
- Quick analysis of data
- Shows range, minimum & maximum, gaps & clusters, and outliers easily
- Exact values retained
Disadvantages
- Not as visually appealing
- Best for under 50 data values
- Needs small range of data
Line graph
A line graph plots continuous data as points and then joins them with
a line. Multiple data sets can be graphed together, but a key must be
used.
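Joining the points with straight lines amounts to linear interpolation between neighbouring measurements. A minimal sketch, assuming the points are sorted by x and using a hypothetical function name and invented readings:

```python
def interpolate(points, x):
    """Estimate y at x by reading along the straight line between the
    two neighbouring plotted points (points must be sorted by x)."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("x lies outside the plotted range")


readings = [(0, 10), (2, 20), (4, 16)]  # hypothetical (time, value) pairs
print(interpolate(readings, 1))
print(interpolate(readings, 3))
```

Such estimates are only reasonable when the underlying quantity really is continuous between the measured points.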
Advantages
- Can compare multiple continuous data sets easily
- Interim data can be inferred from graph line
Disadvantages
- Use only with continuous data
Dot graphs
Scatterplot
An XY chart, or scatterplot, is used to display bivariate (i.e. two-variable) numerical
data, either discrete or continuous, where pairs of measurements have been recorded
on the same objects.
Example 1
Each pair of measurements, e.g. (height, weight),
is represented by a single point on the plot. The independent x-variable
is plotted on the horizontal axis, while the dependent y-variable is
plotted on the vertical axis.
Points are not joined by lines on an XY chart.
Example 2
Unlike other charts (e.g. bar charts) it is not necessary to include
the zero mark on either axis. The purpose of the graph is not to show
the absolute value of the measurements but rather to reveal any
relationship that may exist between them. Including the zero marks may
cause the data to be squashed into a corner of the plot area (see chart
to left). Instead the scales on the axes should be chosen to allow the
data to fill the plot area.
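One simple way to choose such scales is to take the range of the data and add a small margin. This is a sketch of one possible approach; the function name and the 5% padding are assumptions, not a recommendation from the original text.

```python
def axis_limits(values, padding=0.05):
    """Return (low, high) axis limits that hug the data, with a small margin
    on each side expressed as a fraction of the data range."""
    lo, hi = min(values), max(values)
    margin = (hi - lo) * padding
    return lo - margin, hi + margin


heights_cm = [150, 162, 171, 180, 190]  # hypothetical measurements
print(axis_limits(heights_cm))
```

For these heights the axis runs from just under 150 to just over 190 rather than starting at zero, so the points fill the plot area.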
Example 3
A scientist or researcher may add a trendline (or regression line) to
a scatterplot to show the underlying relationship.
The equation (or formula) used to generate the plotting points for
the trendline may be shown.
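The text does not say how the trendline is fitted. The sketch below assumes ordinary least squares, which is a common choice for a straight regression line. The data and the function name are invented for illustration.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


heights = [150, 160, 170, 180, 190]  # hypothetical, in cm
weights = [52, 58, 66, 74, 80]       # hypothetical, in kg
slope, intercept = least_squares_line(heights, weights)
print(f"weight = {slope:.2f} * height + {intercept:.1f}")
```

The printed formula is the kind of equation that may be shown next to the trendline on the chart.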
Advantages
- Shows a trend in the data relationship
- Retains exact data values and sample size
- Shows minimum/maximum and outliers
Disadvantages
- Hard to visualize results in large data sets
- Flat trend line gives inconclusive results
- Data on both axes should be continuous
Dotplot
A dotplot is an informal alternative to a histogram for displaying
continuous data. In a dotplot each data value is plotted as a dot
on a horizontal axis. Where two values are separated by less than
a certain increment the dots are stacked in a column. If the
increment is made too small then it is impossible to see the shape of
the distribution. However, if the increment is made too large then
a single column of dots is obtained.
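The effect of the increment can be tried out with a small sketch. As a simplifying assumption it groups values into fixed columns one increment wide, which only approximates the stacking rule described above; `dotplot_columns` is a hypothetical helper.

```python
from collections import Counter


def dotplot_columns(values, increment):
    """Group values into columns one increment wide.

    Returns {column_start: number_of_dots_stacked_in_that_column}.
    """
    columns = Counter(int(v // increment) for v in values)
    return {k * increment: columns[k] for k in sorted(columns)}


data = [1.2, 1.4, 2.1, 2.3, 2.4, 3.8]  # hypothetical measurements
print(dotplot_columns(data, 1))   # a moderate increment shows the shape
print(dotplot_columns(data, 10))  # too large: everything lands in one column
```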
Stem and Leaf Plot
Stem and leaf plots record data values in rows, and can easily be made
into a histogram. Large data sets can be accommodated by splitting stems.
A stem-and-leaf plot is essentially a dotplot in which the plotting
symbol is replaced by the data value itself. It provides an
informal alternative to a histogram when carrying out exploratory data
analysis. In a traditional stem-and-leaf plot a data value (e.g. 27)
is split into two components: the stem (i.e. 2, representing 20) and
the leaf (i.e. 7). The stems are then written down once, while the leaves
are stacked up alongside the stem to which they are attached.
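The splitting of each value into a stem and a leaf can be sketched in a few lines. The example assumes non-negative whole numbers split at the tens digit; the helper name and the data are made up.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return {stem: [leaves]} for non-negative whole numbers,
    e.g. 27 is split into stem 2 and leaf 7."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        plot[stem].append(leaf)
    return dict(plot)


for stem, leaves in stem_and_leaf([27, 21, 35, 38, 30, 42]).items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

Each printed row has the same outline as the corresponding histogram bar turned on its side, while still showing the individual values.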
Advantages
- Concise representation of data
- Shows range, minimum & maximum, gaps & clusters, and outliers easily
- Can handle extremely large data sets
Disadvantages
- Not visually appealing
- Does not easily indicate measures of centrality for large data sets
Box plot
A boxplot (or box-and-whisker plot) is a concise graph showing the five-point summary
(more commonly called the five-number summary). The five figures in question
are the minimum, lower quartile, median, upper quartile and maximum values. These five
vital statistics are plotted on an axis (vertical or horizontal). Multiple
boxplots can be drawn side by side to compare more than one data set.
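The five figures can be computed with Python's standard `statistics` module. Several conventions exist for calculating quartiles; this sketch assumes the "inclusive" method, and other methods can give slightly different quartiles. The function name `five_number_summary` is chosen for the example.

```python
import statistics


def five_number_summary(data):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    # n=4 asks for the three cut points that divide the data into quarters.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return min(data), q1, median, q3, max(data)


print(five_number_summary([1, 2, 3, 4, 5, 6, 7, 8, 9]))
```

Only these five numbers are drawn, which is why a boxplot copes well with large data sets but does not retain the exact values.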
Advantages
- Shows 5-point summary and outliers
- Easily compares two or more data sets
- Handles extremely large data sets easily
Disadvantages
- Not as visually appealing as other graphs
- Exact values not retained
Radar chart
A radar chart is the clockface form of a line chart. The category (x) variable is
plotted at equally spaced points around the clock. The y-variable is plotted as a radius,
so each category has its own y-axis radiating from the centre. Lines are used to connect
points belonging to the same series. It is an ideal chart to use when the categories have
a natural cyclical order, e.g. seasons of the year. However, it relies heavily on the use
of colour to distinguish between series and is therefore of little value in monochrome
media.
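The geometry of a radar chart can be sketched as a conversion from (category, value) to plotting coordinates. The example assumes the first category sits at 12 o'clock and the rest follow clockwise; the function name and the seasonal values are invented.

```python
import math


def radar_points(values):
    """Convert one series into (x, y) points: categories are equally spaced
    around the clock, and each value is the distance from the centre."""
    n = len(values)
    points = []
    for i, r in enumerate(values):
        angle = 2 * math.pi * i / n  # measured clockwise from 12 o'clock
        points.append((r * math.sin(angle), r * math.cos(angle)))
    return points


seasons = [3, 5, 4, 2]  # hypothetical values for spring, summer, autumn, winter
for x, y in radar_points(seasons):
    print(f"({x:.1f}, {y:.1f})")
```

Joining the points in order, and then the last point back to the first, gives the closed outline for one series.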
Bubble chart
A bubble chart is an extension of the XY-chart, where each data marker is drawn as a
circular bubble and the area of the bubble is used to show the value of a third variable.
Varying the colours of the bubbles can also be used to show a fourth categorical variable.
Data labels can be used to identify particular points of interest. Unfortunately people
find it very difficult to compare the areas of different sized circles.
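Since it is the area, not the radius, that carries the third variable, the radius has to grow with the square root of the value. A minimal sketch, in which the function name and the `area_per_unit` scale are assumptions:

```python
import math


def bubble_radius(value, area_per_unit=1.0):
    """Radius of a bubble whose area equals value * area_per_unit."""
    return math.sqrt(value * area_per_unit / math.pi)


small, large = bubble_radius(1), bubble_radius(4)
print(large / small)  # ratio of the radii for a fourfold difference in value
```

A value four times larger gives a bubble only about twice as wide, which helps explain why readers tend to misjudge differences between bubbles.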
Resources used:
http://www.leeds.ac.uk/languages/resource/english/graphs/tren1.htm
http://home.ched.coventry.ac.uk/Volume/vol0/chart.htm
http://mainland.cctt.org/mathsummer/JosephBond/StemAndPlots/stem-and-leaf_std.htm
http://math.youngzones.org/stat_graph.html
http://www.stats.govt.nz/about-us/policies-and-guidelines/data-use/graphic-guidelines-when+to+use.htm