Reading And Writing About Statistics
Why do we need statistics? Statistics provides a standardized procedure for drawing conclusions from data. It covers the collection, analysis, interpretation, and presentation of data.
Distribution: When a characteristic varies among the individual subjects in a sample or population, the description of how that characteristic varies is its distribution.
Kinds of Data and Scales of Measurement
| Data Type | Explanation and Examples |
| Nominal Scale | Grouping is arbitrary. Observations can be separated according to category: Dodge/Chevy/Ford, Male/Female. Classifications are equally weighted, and data are sorted only into named groups. All nominal scales are made from discrete data. |
| Ordinal Scale | Items can be ranked from smallest to largest. You can guess the relative weights of animals and rank them from smallest to largest: Shrew-Cat-Dog-Horse-Elephant. All ordinal scales are composed of discrete data. |
| Interval Scale | Typically based on continuous data. The numerical value has meaning, and the size of the difference between two values is known. Example: The difference between 100° C and 200° C is exactly the same as the difference between 200° C and 300° C. In the ordinal scale, by contrast, the difference in weight among each of our animals is unknown. Interval-scale values may be either integer (discrete) or continuous (real). |
Descriptive Statistics
Arithmetic Mean (x̄): The average of the observations.
Median: The median is a statistic of location equal to the variate in a ranked array that has an equal number of items above and below it. The median divides the distribution in half. For the ranked distribution 14, 15, 16, 19, 23, the median is M = 16. When the sample has an even number of items, the median is the midpoint between the two central variates, the (n/2)th and the [(n/2)+1]th. For 14, 15, 16, 19, the median is M = 15.5. Use the median if the data are not normally distributed. An example might be the salaries in a company, where a few highly paid executives skew the mean salary.
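As a small illustration, the two ranked examples can be checked with Python's standard `statistics` module. The salary figures are hypothetical, invented here only to show how a single large value pulls the mean upward while leaving the median almost unchanged.

```python
import statistics

odd_sample = [14, 15, 16, 19, 23]
even_sample = [14, 15, 16, 19]

# Odd number of items: the median is the middle value of the ranked array.
median_odd = statistics.median(odd_sample)    # 16
# Even number of items: the median is midway between the two central values.
median_even = statistics.median(even_sample)  # 15.5

# Hypothetical salaries (in thousands): one executive skews the mean.
salaries = [40, 42, 45, 47, 50, 400]
mean_salary = statistics.mean(salaries)       # 104
median_salary = statistics.median(salaries)   # 46.0

print(median_odd, median_even, mean_salary, median_salary)
```

In the salary example the mean (104) is larger than every salary but one, whereas the median (46) still describes a typical employee.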
Statistics of Dispersion

Figure 1
Radically different distributions can have the same mean (Figure 1). Therefore, a measure of dispersion is needed.
Range: The difference between the highest and lowest values. For the distribution 14.9, 10.8, 12.3, 23.3, the range is 23.3 - 10.8 = 12.5.
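The range calculation is short enough to verify directly. This sketch uses the four values listed above; the variable names are illustrative only.

```python
values = [14.9, 10.8, 12.3, 23.3]

# Range = highest value minus lowest value.
value_range = max(values) - min(values)

# Round to one decimal place to hide floating-point noise.
print(round(value_range, 1))  # 12.5
```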

Figure 2
Standard Deviation (s, σ, SD): The standard deviation is the square root of the variance. Figure 1 depicts two distributions with equal means but low and high standard deviations. For normally distributed data, about 68% of the observations lie within one standard deviation of the mean and about 95% lie within two standard deviations of the mean (Figure 2).
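The text does not spell out the variance formula, so the sketch below assumes the usual sample form, in which the sum of squared deviations is divided by n - 1. It reuses the ranked values from the median example and then checks the 68% and 95% figures against a standard normal curve using `statistics.NormalDist` (Python 3.8 or later).

```python
import math
import statistics

data = [14, 15, 16, 19, 23]

n = len(data)
mean = sum(data) / n
# Assumption: sample variance, i.e. squared deviations divided by (n - 1).
variance = sum((x - mean) ** 2 for x in data) / (n - 1)
std_dev = math.sqrt(variance)

# Cross-check the hand calculation against the standard library.
assert math.isclose(std_dev, statistics.stdev(data))

# Share of a normal distribution within 1 and 2 standard deviations.
normal = statistics.NormalDist()  # mean 0, standard deviation 1
within_one_sd = normal.cdf(1) - normal.cdf(-1)
within_two_sd = normal.cdf(2) - normal.cdf(-2)

print(f"s2 = {variance:.2f}, s = {std_dev:.2f}")
print(f"within 1 SD: {within_one_sd:.3f}, within 2 SD: {within_two_sd:.3f}")
```

The two shares come out near 0.683 and 0.954, which is where the rounded 68% and 95% figures come from.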
Standard Error (SE, SEM): The standard deviation divided by the square root of the sample size, SD/sqrt(N).
95% Confidence Limits: If an experiment were run 100 times and the mean calculated each time, you would expect the mean to fall within the 95% confidence limits about 95 times. For samples larger than 15, the 95% confidence limits are approximately the mean plus or minus twice the SEM.
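A minimal sketch of both quantities follows. The 16 measurements are hypothetical and the function names are illustrative; the interval uses the "twice the SEM" rule of thumb described above rather than an exact method.

```python
import math
import statistics

# Hypothetical sample of 16 measurements (not from the example data).
measurements = [2, 4, 4, 4, 5, 5, 7, 9, 2, 4, 4, 4, 5, 5, 7, 9]

def standard_error(values):
    """SEM = sample standard deviation / sqrt(N)."""
    return statistics.stdev(values) / math.sqrt(len(values))

def approx_ci95(values):
    """Approximate 95% confidence limits as mean +/- 2 * SEM.

    This is the rule of thumb for samples larger than about 15; smaller
    samples would need a wider multiplier from the t distribution.
    """
    centre = statistics.mean(values)
    half_width = 2 * standard_error(values)
    return centre - half_width, centre + half_width

sem = standard_error(measurements)
lower, upper = approx_ci95(measurements)
print(f"SEM = {sem:.3f}, 95% limits = ({lower:.2f}, {upper:.2f})")
```

Because the SEM shrinks with the square root of N, quadrupling the sample size only halves the width of the confidence limits.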
| | Sample A | Sample B |
| | 7.2 | 3.6 |
| | 7.0 | 12.5 |
| | 6.8 | 3.3 |
| | 7.0 | 8.6 |
| N | 4 | 4 |
| Mean (x̄) | 7.0 | 7.0 |
| Median | 7.0 | 6.1 |
| s² | 0.027 | 19.35 |
| s | 0.16 | 4.40 |

Figure 3
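A summary table like the one for Sample A and Sample B can be recomputed from the four raw values in each column. The sketch below assumes the sample (n - 1) forms of the variance and standard deviation, which is what Python's `statistics.variance` and `statistics.stdev` compute; the `summarize` helper is a hypothetical name used only for this illustration.

```python
import statistics

samples = {
    "Sample A": [7.2, 7.0, 6.8, 7.0],
    "Sample B": [3.6, 12.5, 3.3, 8.6],
}

def summarize(values):
    """Return N, mean, median, sample variance (s2), and sample SD (s)."""
    return {
        "N": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "s2": statistics.variance(values),  # divides by n - 1
        "s": statistics.stdev(values),      # square root of s2
    }

summaries = {name: summarize(vals) for name, vals in samples.items()}
for name, stats in summaries.items():
    print(name, {key: round(value, 3) for key, value in stats.items()})
```

Both samples share a mean of 7.0, yet Sample B is far more spread out, which is the point the two columns are meant to illustrate.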
EXAMPLE DATA
Download Web Area data in Stata data format.
Download Excel data.
Note: If Excel’s Data Analysis tools are not available, activate them using this procedure: choose Tools, Add-Ins and check the box next to Analysis ToolPak.