English for Masters
Perm State University


English course for master's students of the Faculty of Physics


Describing graphs and charts, Section 6



A chart is a diagram that makes information easier to understand by showing how two or more sets of data are related.

There are two common types of chart: the pie chart and the bar chart.
Charts are also often called "graphs".

Bar Graphs
Histogram
Frequency Polygon
Pie chart
Pictograph
Map chart
Line charts
Dot graphs
Box plot
Radar chart
Bubble chart

 

Bar Graphs

The bar-chart (or column chart) is the simplest and most versatile of statistical diagrams. It is used for comparing the frequency, count, total or average of data in different categories.

A bar graph displays discrete data in separate columns.
A double bar graph can be used to compare two data sets. Categories are considered unordered and can be rearranged alphabetically, by size, etc.

A bar chart such as this is used to show how different sets of information compare.
Source: City Population http://www.citypopulation.de/cities.html

In a vertical bar chart (or Column chart in Excel) the height of a bar is used to represent the frequency. All bars must be the same width and evenly spaced along the x (category) axis. Hence bars should never overlap. The gaps between the bars are used to emphasise the distinction between the categories or discrete values. As a general rule, the gaps should be about 30-70% of the width of the bar. If the gaps are too wide the chart will become unnecessarily large. Sometimes it may be appropriate to leave either no gap or a very small gap.
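
The spacing rule above can be turned into a small calculation. The following is a minimal sketch in Python, not taken from the source: the function name `bar_layout` and its parameters are illustrative, and it assumes gaps appear only between bars, not at the ends of the axis.

```python
def bar_layout(n_bars, axis_length, gap_ratio=0.5):
    """Return (bar_width, gap_width, left_edges) for evenly spaced bars.

    gap_ratio is the gap expressed as a fraction of the bar width;
    the text recommends roughly 0.3 to 0.7.
    """
    if n_bars < 1:
        raise ValueError("need at least one bar")
    # n bars and n - 1 gaps must fill the axis:
    #   n * w + (n - 1) * gap_ratio * w = axis_length
    bar_width = axis_length / (n_bars + (n_bars - 1) * gap_ratio)
    gap_width = gap_ratio * bar_width
    left_edges = [i * (bar_width + gap_width) for i in range(n_bars)]
    return bar_width, gap_width, left_edges
```

For example, four bars on an axis 11 units long with a 50% gap come out 2 units wide with 1-unit gaps.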

A horizontal bar chart: If there are a large number of bars, or the category labels are long, it is sensible to rotate the bar chart so that the bars are horizontal. Otherwise the category labels would have to be printed vertically, making them difficult to read.
Note that the category axis is now vertical and the frequency axis is horizontal.

In Excel a horizontal bar chart is called simply a Bar chart.
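
A horizontal bar chart is easy to sketch in plain text, which also shows why long category labels read better in this orientation. This is an illustrative Python sketch only; the function name `text_bar_chart` and the idea of drawing bars with `#` characters are assumptions, not part of the source.

```python
def text_bar_chart(data, width=40, symbol="#"):
    """Render {label: value} as horizontal bars scaled to `width` characters."""
    largest = max(data.values())
    label_width = max(len(label) for label in data)
    lines = []
    for label, value in data.items():
        # The longest bar gets `width` symbols; the others are proportional.
        bar = symbol * round(value / largest * width)
        lines.append(f"{label:<{label_width}} | {bar} {value}")
    return "\n".join(lines)
```

The category axis (the labels) runs down the page and the frequency axis runs across it, as noted above.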

A component (or stacked) bar chart is used to show the breakdown of a total measurement into several categories. In the component chart it is only possible to judge the size of the bottom category precisely. However, the total is clearly visible and it is easier to see the proportion of the total in each component.
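
The reason only the bottom category can be judged precisely is that every other segment starts at a different height. A minimal Python sketch, with a hypothetical function name and made-up category labels, makes the baselines explicit:

```python
def stack_components(components):
    """Return ([(label, bottom, top), ...], total) for one stacked bar."""
    segments = []
    bottom = 0
    for label, value in components:
        # Each segment sits on top of the running total of those before it.
        segments.append((label, bottom, bottom + value))
        bottom += value
    return segments, bottom
```

Only the first segment has its bottom at zero; the returned total is the full height of the bar.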

Advantages
  • Visually strong
  • Can easily compare two or three data sets
Disadvantages
  • Graph categories can be reordered to emphasize certain effects
  • Use only with discrete data


 

Histogram

A histogram displays continuous data in ordered columns. Categories are of continuous measure such as time, inches, temperature, etc.
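
Grouping continuous data into ordered classes can be sketched as follows. This is an illustrative Python example, not the source's method: the name `histogram_counts`, the choice of equal-width classes, and the rule that the top edge belongs to the last class are all assumptions.

```python
def histogram_counts(values, low, high, n_bins):
    """Count values in n_bins equal-width classes covering [low, high]."""
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for v in values:
        if not low <= v <= high:
            raise ValueError(f"{v} is outside [{low}, {high}]")
        # A value equal to `high` is placed in the last class.
        index = min(int((v - low) // width), n_bins - 1)
        counts[index] += 1
    edges = [low + i * width for i in range(n_bins + 1)]
    return edges, counts
```

The counts give the column heights; the individual values are no longer recoverable, which is the first disadvantage listed below.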

Advantages
  • Visually strong
  • Can compare to normal curve
  • The vertical axis is usually a frequency count of the items falling into each category
Disadvantages
  • Cannot read exact values because data is grouped into categories
  • More difficult to compare two data sets
  • Use only with continuous data


 

Frequency Polygon

A frequency polygon (1) can be made from a line graph by shading in the area beneath the graph. It can also be made from a histogram by joining the midpoints of the tops of the columns.
Advocates of the frequency polygon argue that the purpose of a histogram is to show the shape of the data distribution and removing the bars makes the shape clearer and smoother. Critics argue that the class boundaries are then more difficult to see.  Also the y-axis is more difficult to interpret once the bars are removed.

The main problem with frequency polygons is deciding what to do with the endpoints. If one imagines an additional empty bar at each end of the histogram, then the polygon would look like the chart in (1). However, this gives the impression that a few of the marks were negative and some over 100! On the other hand, the chart (2) makes the areas at either end too small and is clearly wrong.
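
The endpoint problem is easy to reproduce numerically. The sketch below is a hedged Python illustration (the function name `frequency_polygon` and the sample class edges are assumptions): it joins the column midpoints and, optionally, adds an imagined empty class at each end.

```python
def frequency_polygon(edges, counts, anchor=True):
    """Return (x, y) vertices joining the midpoints of histogram column tops.

    With anchor=True an empty class is imagined at each end so that the
    polygon returns to zero.
    """
    points = [((lo + hi) / 2, c) for lo, hi, c in zip(edges, edges[1:], counts)]
    if anchor:
        first_width = edges[1] - edges[0]
        last_width = edges[-1] - edges[-2]
        points.insert(0, (edges[0] - first_width / 2, 0))
        points.append((edges[-1] + last_width / 2, 0))
    return points
```

With marks grouped into classes of width 20 from 0 to 100, the anchored polygon starts at x = -10 and ends at x = 110, which is exactly the misleading impression of negative marks and marks over 100 described above.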

Advantages
  • Visually appealing
Disadvantages
  • Anchors at both ends may imply zero as data points
  • Use only with continuous data


 

Pie chart

A pie chart displays data as a percentage of the whole.
Each pie section should have a label and percentage.
A total data number should be included.
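
Converting raw counts into pie sections is a matter of dividing by the total. A minimal Python sketch, with an illustrative function name and made-up categories, returns the percentage and the angle of each section together with the total that should be stated on the chart:

```python
def pie_slices(data):
    """Return ([(label, percent, angle_in_degrees), ...], total)."""
    total = sum(data.values())
    if total <= 0:
        raise ValueError("total must be positive")
    slices = [(label, 100 * value / total, 360 * value / total)
              for label, value in data.items()]
    return slices, total
```

A category holding 30 of 100 items, for instance, takes 30% of the pie, or a section of 108 degrees.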
Advantages
  • Visually appealing
  • Shows percent of total for each category
Disadvantages
  • No exact numerical data
  • Hard to compare 2 data sets
  • "Other" category can be a problem
  • Total unknown unless specified
  • Best for 3 to 7 categories
  • Use only with discrete data

 

Pictograph

A pictograph uses an icon to represent a quantity of data values in order to decrease the size of the graph. A key must be used to explain the icon.

There are three ways in which the pictures can be used to represent the data:

  1. varying both the length and width of the picture - and hence its area;

    In this type of pictogram each data value (frequency, percentage, etc.) is represented by the area of a picture which relates to the data.
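    A square root rule must be applied to ensure the pictures are in the correct proportions: because area grows with the square of the linear size, the length and width are each scaled by the square root of the ratio between the data values.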
  2. keeping the width of the picture constant and stretching the length;

  3. keeping the width and length constant but stacking multiple copies of the picture.
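
For the first method, the square root rule can be written as a one-line calculation. This Python sketch is illustrative only; the function name `picture_scale` and the idea of a reference value drawn at scale 1 are assumptions.

```python
import math


def picture_scale(value, reference_value):
    """Linear scale factor that makes picture AREA proportional to value."""
    if value < 0 or reference_value <= 0:
        raise ValueError("values must be non-negative, reference positive")
    return math.sqrt(value / reference_value)
```

A value four times the reference is drawn twice as long and twice as wide, not four times as long and wide, which would make its area sixteen times larger.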

Advantages
  • Easy to read
  • Visually appealing
  • Handles large data sets easily using keyed icons
Disadvantages
  • Hard to quantify partial icons
  • Icons must be of consistent size
  • Best for only 2-6 categories
  • Very simplistic


 

Map chart

A map chart displays data by shading sections of a map, and must include a key. A total data number should be included.
Advantages
  • Good visual appeal
  • Overall trends show well
Disadvantages
  • Needs limited categories
  • No exact numerical values
  • Color key can skew visual interpretation

 

Line charts

Line plot

A line plot is a graph using marks (e.g., X, ·) above a number on a number line to show the frequency of data.

It can be used as an initial record of discrete data values.
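
A line plot can be produced directly as text. The following Python sketch is a rough illustration for small sets of integer values; the function name `line_plot` and the layout are assumptions.

```python
from collections import Counter


def line_plot(values, mark="X"):
    """Stack one mark per occurrence above each integer on a number line."""
    counts = Counter(values)
    numbers = range(min(values), max(values) + 1)
    cell = max(len(str(n)) for n in numbers) + 1  # width of one column
    rows = []
    for level in range(max(counts.values()), 0, -1):
        row = "".join((mark if counts[n] >= level else " ").rjust(cell)
                      for n in numbers)
        rows.append(row.rstrip())
    rows.append("".join(str(n).rjust(cell) for n in numbers))
    return "\n".join(rows)
```

Numbers with no marks above them show up as gaps, and every data value can still be read off exactly.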

Advantages
  • Quick analysis of data
  • Shows range, minimum & maximum, gaps & clusters, and outliers easily
  • Exact values retained
Disadvantages
  • Not as visually appealing
  • Best for under 50 data values
  • Needs small range of data

Line graph

A line graph plots continuous data as points and then joins them with a line. Multiple data sets can be graphed together, but a key must be used.
Advantages
  • Can compare multiple continuous data sets easily
  • Interim data can be inferred from graph line
Disadvantages
  • Use only with continuous data


 

Dot graphs

Scatterplot

An XY chart, or scatterplot, is used to display bivariate (i.e. two variable) numerical data, either discrete or continuous.  This is where pairs of data have been recorded on the same objects.

Example 1
Each pair of measurements e.g. (height, weight) is represented by a single point on the plot. The independent x-variable is plotted on the horizontal axis, while the dependent y-variable is plotted on the vertical axis.
Points are not joined by lines on an XY chart.

Example 2
Unlike other charts (e.g. bar charts) it is not necessary to include the zero mark on either axis. The purpose of the graph is not to show the absolute value of the measurements but rather to reveal any relationship that may exist between them. Including the zero marks may cause the data to be squashed into a corner of the plot area (see chart to left). Instead the scales on the axes should be chosen to allow the data to fill the plot area.
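
One simple way to let the data fill the plot area is to set each axis from the data range plus a small margin. This Python sketch is an assumption about how that might be done, not a rule from the source; the name `axis_limits` and the 5% default margin are illustrative.

```python
def axis_limits(values, padding=0.05):
    """Return (low, high) axis limits that hug the data with a small margin."""
    low, high = min(values), max(values)
    span = high - low
    # Fall back to a fixed margin if all the values are identical.
    margin = span * padding if span > 0 else 1.0
    return low - margin, high + margin
```

For heights between 150 and 200 with a 10% margin, the axis runs from 145 to 205 and the zero mark is left out.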

Example 3
A scientist or researcher may add a trendline (or regression line) to a scatterplot to show the underlying relationship.
The equation (or formula) used to generate the plotting points for the trendline may be shown.
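
The source does not say which kind of trendline is meant. A common choice is the ordinary least-squares straight line, sketched below in Python under that assumption; the function name `least_squares_line` is illustrative.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope*x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

The returned slope and intercept give the equation that may be printed next to the trendline.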

 

Advantages
  • Shows a trend in the data relationship
  • Retains exact data values and sample size
  • Shows minimum/maximum and outliers
Disadvantages
  • Hard to visualize results in large data sets
  • Flat trend line gives inconclusive results
  • Data on both axes should be continuous

Dotplot

A dotplot is an informal alternative to a histogram for displaying continuous data.  In a dotplot each data value is plotted as a dot on a horizontal axis.  Where two values are separated by less than a certain increment the dots are stacked in a column.  If the increment is made too small then it is impossible to see the shape of the distribution.  However, if the increment is made too large then a single column of dots is obtained.
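
The effect of the increment can be checked with a short calculation. This Python sketch is a hedged illustration: the function name `dotplot_columns` and the sample values are assumptions, and values are assigned to columns by rounding down to a multiple of the increment.

```python
import math
from collections import Counter


def dotplot_columns(values, increment):
    """Return sorted (column_start, number_of_dots) pairs."""
    if increment <= 0:
        raise ValueError("increment must be positive")
    # Values falling in the same increment-wide interval share a column.
    columns = Counter(math.floor(v / increment) for v in values)
    return [(index * increment, dots) for index, dots in sorted(columns.items())]
```

With the six values 1.1, 1.2, 1.4, 2.3, 2.35 and 3.9, an increment of 0.5 gives columns of 3, 2 and 1 dots. An increment of 0.01 gives six columns of one dot each, and an increment of 10 gives a single column of six dots.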

 

Advantages
Disadvantages

Stem and Leaf Plot

Stem and leaf plots record data values in rows, and can easily be made into a histogram. Large data sets can be accommodated by splitting stems.

A stem-and-leaf plot is essentially a dotplot in which the plotting symbol is replaced by the data value itself.  It provides an informal alternative to a histogram when carrying out exploratory data analysis.  In a traditional stem-and-leaf plot a data value (e.g. 27) is split into two components - the stem (i.e. 2, representing 20) and the leaf (i.e. 7).  The stems are then written down once, while the leaves are stacked up alongside the stem to which they are attached.
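
The splitting of each value into stem and leaf can be sketched in a few lines of Python. The function name `stem_and_leaf` is illustrative, and the sketch assumes non-negative whole numbers with tens as stems and units as leaves.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return a text stem-and-leaf plot for non-negative integers."""
    stems = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)  # 27 -> stem 2, leaf 7
        stems[stem].append(leaf)
    lines = []
    # Empty stems are kept so that gaps in the data stay visible.
    for stem in range(min(stems), max(stems) + 1):
        leaves = " ".join(str(leaf) for leaf in stems.get(stem, []))
        lines.append(f"{stem} | {leaves}".rstrip())
    return "\n".join(lines)
```

Each row lists the leaves in increasing order, so the length of a row plays the role of a histogram column.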

 

Advantages
  • Concise representation of data
  • Shows range, minimum & maximum, gaps & clusters, and outliers easily
  • Can handle extremely large data sets
Disadvantages
  • Not visually appealing
  • Does not easily indicate measures of centrality for large data sets


 

Box plot

A boxplot (or box-and-whisker plot) is a concise graph showing the five point summary. The five figures in question are: the minimum, lower quartile, median, upper quartile and maximum values. These five vital statistics are plotted on an axis (vertical or horizontal). Multiple boxplots can be drawn side by side to compare more than one data set.
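
The five figures can be computed with Python's standard `statistics` module. This is a sketch under one assumption worth stating: textbooks and software use several slightly different quartile conventions, and the "inclusive" method chosen here is only one of them. The function name `five_number_summary` is illustrative.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    data = sorted(values)
    # n=4 splits the data into quarters and returns the three cut points.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, median, q3, data[-1]
```

For the values 1 to 9 this gives a minimum of 1, quartiles of 3 and 7, a median of 5 and a maximum of 9.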
Advantages
  • Shows 5-point summary and outliers
  • Easily compares two or more data sets
  • Handles extremely large data sets easily
Disadvantages
  • Not as visually appealing as other graphs
  • Exact values not retained


 

Radar chart

A radar chart is the clockface form of a line chart. The category (x) variable is plotted at equally spaced points around the clock. The y-variable is plotted as a radius, so each category has its own y-axis radiating from the centre. Lines are used to connect points belonging to the same series. It is an ideal chart to use when the categories have a natural cyclical order, e.g. seasons of the year. However, it relies heavily on the use of colour to distinguish between series and is therefore of little value in monochrome media.
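
The clockface layout amounts to a conversion from (category, value) to ordinary x and y coordinates. The Python sketch below is illustrative; the function name `radar_vertices` and the convention of starting at 12 o'clock and moving clockwise are assumptions.

```python
import math


def radar_vertices(values):
    """Return (x, y) points, one per category, clockwise from 12 o'clock."""
    n = len(values)
    points = []
    for i, radius in enumerate(values):
        angle = 2 * math.pi * i / n  # clockwise angle from the vertical
        points.append((radius * math.sin(angle), radius * math.cos(angle)))
    return points
```

Joining the returned points in order, and closing the loop back to the first, draws one series.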
Advantages
Disadvantages

 

Bubble chart

A bubble chart is an extension of the XY-chart, where each data marker is drawn as a circular bubble and the area of the bubble is used to show the value of a third variable. Varying the colours of the bubbles can also be used to show a fourth categorical variable. Data labels can be used to identify particular points of interest. Unfortunately people find it very difficult to compare the areas of different sized circles.
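
Because it is the area, not the radius, that carries the third variable, the radius must grow with the square root of the value. A minimal Python sketch, with an illustrative function name and parameters:

```python
import math


def bubble_radius(value, max_value, max_radius=1.0):
    """Radius that makes bubble AREA proportional to value."""
    if value < 0 or max_value <= 0:
        raise ValueError("value must be non-negative and max_value positive")
    return max_radius * math.sqrt(value / max_value)
```

A value one quarter of the maximum gets a bubble with half the maximum radius, which is part of why readers tend to misjudge the sizes.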
Advantages
Disadvantages


Resources used:
http://www.leeds.ac.uk/languages/resource/english/graphs/tren1.htm
http://home.ched.coventry.ac.uk/Volume/vol0/chart.htm
http://mainland.cctt.org/mathsummer/JosephBond/StemAndPlots/stem-and-leaf_std.htm
http://math.youngzones.org/stat_graph.html
http://www.stats.govt.nz/about-us/policies-and-guidelines/data-use/graphic-guidelines-when+to+use.htm



 
