Chi-Square Analysis - Introduction to Statistics
A simple Chi-square calculator for two samples is here (Tutorial)
A Chi-square calculator for 2-6 samples is here (Tutorial)
Some processes in biology are clear-cut and work out nicely. If you're sick with the flu, then a blood test will show that you have the flu virus. Other things are not so clear-cut, and understanding the data requires a statistical analysis to determine how probable your observations are. As an example, assume that you flip a coin 5 times and get the following results:
[Figure: coin images showing the results of Flips 1 through 5]
Hmmmmm... four heads and one tail. The question you might ask is "Is this a fair coin, or is it a trick coin that is weighted to come up mostly heads?" You might decide to cut the coin open to see if there's a weight in it that causes it to come up mostly heads, or you might want to run a statistical test first to determine whether four out of five heads is really that improbable. A simple test you could run is known as a Chi-square test. Before we get to that, though, you need a little background in statistics.
Hypothesis: A hypothesis is a statistical claim. In its simplest form, any statistical test will have two hypotheses: a null hypothesis (H0) and an alternative hypothesis (H1). The null hypothesis is the hypothesis of no difference, in this case that a 4:1 heads:tails ratio is no different from a 50:50 chance (the coin is fair), while the alternative hypothesis states that the observed 4:1 ratio is different enough that it can't be due to chance alone (the coin is not fair). In every statistical test you are always testing the null hypothesis, even if you don't believe it to be true. As an example, you might test whether the heights of adult males and females in your class are equal (the null hypothesis of no difference) versus the obviously true alternative hypothesis that there is a difference in the height of males and females. To show there is a difference in height, you must statistically reject the null hypothesis in order to accept the alternative hypothesis. A simple statistical test you could use is a Chi-square analysis.
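The formula for a Chi-square test is:

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

where O is the observed count in each category, E is the expected count under H0, and the sum runs over all categories.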
This isn't nearly as bad as it looks. The biggest problem is figuring out our expected values. Since H0 expects that there is no difference in the number of heads and tails, the expected value for each is 2.5 (half of five flips). Obviously we can't have 2 1/2 heads or tails in the real world, but statisticians and mathematicians have no problems with their reality. So, plugging the numbers (4 heads and 1 tail observed, 2.5 of each expected) into the formula we get:

$$\chi^2 = \frac{(4 - 2.5)^2}{2.5} + \frac{(1 - 2.5)^2}{2.5} = 0.9 + 0.9 = 1.8$$
The χ² value of 1.8 is compared to a table to determine whether we should accept or reject H0. Here's the table:
| df | P = 0.05 | P = 0.01 | P = 0.001 |
|---|---|---|---|
| 1 | 3.84 | 6.64 | 10.83 |
| 2 | 5.99 | 9.21 | 13.82 |
| 3 | 7.82 | 11.35 | 16.27 |
| 4 | 9.49 | 13.28 | 18.47 |
| 5 | 11.07 | 15.09 | 20.52 |
To enter the table, you need to know which row to use under "df" (degrees of freedom). The degrees of freedom is the number of independent pieces of information that are used to estimate a parameter. For most situations it is simply N-1, where N is the number of independent scores (or attributes). In this example, the number of attributes is 2 (heads or tails), so the degrees of freedom is 1 (df=N-1). So, you enter the table at df=1 and read across the row. To reject the null hypothesis, your Chi-square value must be more than 3.84, 6.64, or 10.83, with each increasing value representing a successively higher significance level (yes, I know the P values actually get smaller). The significance level is the probability of rejecting a null hypothesis when it is actually true. Thus, for P=0.05, there is a 5% chance that you'll accept the alternative hypothesis (that there is a difference in your groups) even though the null hypothesis is true. For this example, our Chi-square of 1.8 is smaller than all the values for one degree of freedom, so we accept (that is, fail to reject) the null hypothesis: the 4:1 head:tail result is not inconsistent with a 1:1 ratio.
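The calculation and table lookup described above can be sketched in a few lines of Python. This is only an illustration, not part of the original tutorial: the names `chi_square` and `significance` are made up for this example, and the critical values are simply copied from the table above.

```python
# Critical values copied from the table above: df -> {P level: critical value}.
CRITICAL_VALUES = {
    1: {0.05: 3.84, 0.01: 6.64, 0.001: 10.83},
    2: {0.05: 5.99, 0.01: 9.21, 0.001: 13.82},
    3: {0.05: 7.82, 0.01: 11.35, 0.001: 16.27},
    4: {0.05: 9.49, 0.01: 13.28, 0.001: 18.47},
    5: {0.05: 11.07, 0.01: 15.09, 0.001: 20.52},
}


def chi_square(observed, expected):
    """Sum (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def significance(statistic, df):
    """Return the smallest tabulated P level whose critical value is exceeded,
    or None if the statistic is not significant even at P = 0.05."""
    best = None
    for p_level, critical in CRITICAL_VALUES[df].items():
        if statistic > critical and (best is None or p_level < best):
            best = p_level
    return best


# Five flips: 4 heads, 1 tail; H0 expects 2.5 of each.
observed = [4, 1]
expected = [2.5, 2.5]
stat = chi_square(observed, expected)
df = len(observed) - 1  # df = N - 1 with N = 2 categories

print(f"chi-square = {stat:.2f}, df = {df}")
level = significance(stat, df)
if level is None:
    print("Not significant at P = 0.05: H0 is not rejected")
else:
    print(f"Significant at P < {level}: reject H0")
```

For the five-flip example this reports a Chi-square of 1.80 with df = 1 and does not reject H0, matching the hand calculation.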
What happens if we flip the coin more times and get the following results?
[Figure: coin images showing the results of Flips 1 through 20]
Sixteen heads to four tails. That still works out to a 4:1 ratio, but what happens if we run the Chi-square again? In this case, we would expect 10 heads and 10 tails. Plugging in the new values:

$$\chi^2 = \frac{(16 - 10)^2}{10} + \frac{(4 - 10)^2}{10} = 3.6 + 3.6 = 7.2$$
A quick check with our table shows that the results of the analysis are now significant at P<0.01, so we should reject the null hypothesis and accept the alternative hypothesis (i.e., the coin is NOT fair and the ratio of heads to tails is not 1:1). The result changed because we increased our sample size (the number of separate runs or experiments). Generally, the larger the sample size, the better your confidence in the analysis.
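To see the effect of sample size in numbers, the short sketch below keeps the 4:1 heads:tails ratio fixed and changes only the number of flips. The helper name `chi_square_fair_coin` is invented for this illustration, and the critical values are the df=1 row of the table above.

```python
def chi_square_fair_coin(heads, tails):
    """Chi-square statistic for a coin-flip tally against a 1:1 expectation."""
    expected = (heads + tails) / 2  # H0: half heads, half tails
    return (heads - expected) ** 2 / expected + (tails - expected) ** 2 / expected


# Critical values for df = 1, copied from the table above.
CRITICAL_DF1 = {0.05: 3.84, 0.01: 6.64, 0.001: 10.83}

results = {}
for heads, tails in [(4, 1), (16, 4)]:
    stat = chi_square_fair_coin(heads, tails)
    # Collect every tabulated P level whose critical value the statistic exceeds.
    exceeded = [p for p, crit in CRITICAL_DF1.items() if stat > crit]
    results[(heads, tails)] = (stat, exceeded)
    print(f"{heads} heads, {tails} tails: chi-square = {stat:.2f}, exceeds {exceeded}")
```

With the same 4:1 ratio, the statistic grows from 1.80 for 5 flips to 7.20 for 20 flips, which is why only the larger sample passes the P=0.05 and P=0.01 thresholds.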
Let's apply this to the results of a Punnett Square. We'll make this a dihybrid cross between two individuals, both of which are heterozygous for the brown eye color and vestigial wing and have the genotype BrbrVgvg. The brown eye color (br) is recessive to the normal wild-type eye which is reddish in color (wild-type=Br). The small vestigial wing allele (vg) is recessive to the normal wing allele (Vg). After mating your flies (BrbrVgvg X BrbrVgvg) you return after a few weeks and count the offspring. Here's what you got:

Checking your Punnett Square, you discover that you should have the familiar 9:3:3:1 ratio (9 wild-eyed, normal-winged: 3 wild-eyed, vestigial-winged: 3 brown-eyed, normal-winged: 1 brown-eyed, vestigial-winged). The Punnett Square is shown below.
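The 9:3:3:1 ratio can also be checked by brute force. The sketch below enumerates the 16 cells of the Punnett Square for BrbrVgvg X BrbrVgvg and tallies the phenotypes. The function and label names are invented for this illustration, and it assumes the two genes assort independently, as the Punnett Square itself does.

```python
from collections import Counter
from itertools import product

# Each heterozygous parent (BrbrVgvg) makes four gamete types:
# one eye-color allele paired with one wing allele.
gametes = list(product(["Br", "br"], ["Vg", "vg"]))


def phenotype(eye_alleles, wing_alleles):
    """A dominant allele (Br or Vg) masks the recessive one (br or vg)."""
    eye = "wild-type eye" if "Br" in eye_alleles else "brown eye"
    wing = "normal wing" if "Vg" in wing_alleles else "vestigial wing"
    return f"{eye}, {wing}"


# Cross every gamete from one parent with every gamete from the other
# (4 x 4 = 16 cells of the Punnett Square).
counts = Counter(
    phenotype((eye1, eye2), (wing1, wing2))
    for (eye1, wing1), (eye2, wing2) in product(gametes, repeat=2)
)

for name, n in counts.most_common():
    print(f"{n:2d}/16  {name}")
```

The tally comes out to 9, 3, 3, and 1 cells out of 16 for the four phenotype classes.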

With a total of 155 flies you would therefore expect 87.19 wild-eyed, normal-winged flies [(9/16)*155], 29.06 wild-eyed, vestigial wing flies [(3/16)*155], 29.06 brown-eyed, normal wing flies [(3/16)*155], and 9.69 double mutant flies [(1/16)*155]. Your null hypothesis is that your counted flies fit a 9:3:3:1 ratio. The Chi-square calculations follow:

With 3 degrees of freedom (four fly types and df=N-1), we consult our table and find that 21.99 exceeds all Chi-square table values for 3 df, so P<0.001 and we reject our null hypothesis. A quick eyeballing of the data suggests that we have too many of the brown-eyed, vestigial wing flies and too few of the double mutants. Perhaps this should be looked at in more detail to determine why we got our skewed ratios.
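The expected counts used in this example follow directly from the 9:3:3:1 ratio and the total of 155 flies. The sketch below reproduces them along with the degrees of freedom. The variable names are made up for this illustration, and it deliberately stops short of the Chi-square itself, since it does not assume any observed counts.

```python
TOTAL_FLIES = 155

# Phenotype classes and their parts in the 9:3:3:1 ratio.
RATIO = {
    "wild-type eye, normal wing": 9,
    "wild-type eye, vestigial wing": 3,
    "brown eye, normal wing": 3,
    "brown eye, vestigial wing": 1,
}

ratio_total = sum(RATIO.values())  # 16 parts in all

# Expected count for each class = (its share of the ratio) * (total flies).
expected = {
    name: parts / ratio_total * TOTAL_FLIES
    for name, parts in RATIO.items()
}
df = len(RATIO) - 1  # four classes, so df = 3

for name, value in expected.items():
    print(f"{name}: {value:.2f}")
print(f"degrees of freedom: {df}")
```

Rounded to two decimals, these are the 87.19, 29.06, 29.06, and 9.69 quoted in the text.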
Bush supporters vs. Kerry: Who's more confused and less informed?
| Question | Bush | Kerry | exp | Chi2 | p value |
|---|---|---|---|---|---|
| Iraq had WMDs | 72 | 26 | 49 | 21.59 | <0.001 |
| Duelfer Report concluded Iraq had WMD | 56 | 18 | 37 | 19.51 | <0.001 |
| Evidence of 9/11 support found | 63 | 30 | 46.5 | 11.71 | <0.001 |
| Most experts believe al Qaeda connection | 60 | 30 | 45 | 10.00 | <0.01 |
| Foreign countries support the war | 66 | 25 | 45.5 | 18.47 | <0.001 |
| Foreign countries want Bush re-elected | 57 | 33 | 45 | 6.40 | <0.05 |
| Admin claimed Iraq had WMD | 82 | 84 | 83 | 0.02 | ns |
| Admin claimed Iraq had ties with al Qaeda | 75 | 74 | 74.5 | 0.01 | ns |
| Admin supports... | | | | | |
| Comprehensive Test Ban Treaty | 69 | | | | |
| Land mine treaty | 72 | | | | |
| Kyoto Protocol | 51 | | | | |
| International Criminal Court | 66 | | | | |
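The Chi2 and p value columns in the table above can be reproduced with the same formula used for the coin flips. The sketch below assumes, judging from the numbers, that each exp value is the mean of the Bush and Kerry figures and that each comparison has 1 degree of freedom. The names are made up for this example, and only rows with both figures present are included.

```python
# (question, Bush, Kerry) rows copied from the table above.
rows = [
    ("Iraq had WMDs", 72, 26),
    ("Duelfer Report concluded Iraq had WMD", 56, 18),
    ("Evidence of 9/11 support found", 63, 30),
    ("Most experts believe al Qaeda connection", 60, 30),
    ("Foreign countries support the war", 66, 25),
    ("Foreign countries want Bush re-elected", 57, 33),
    ("Admin claimed Iraq had WMD", 82, 84),
    ("Admin claimed Iraq had ties with al Qaeda", 75, 74),
]

# Critical values for df = 1 from the Chi-square table earlier in this tutorial,
# ordered from the strictest level to the most lenient.
CRITICAL_DF1 = [(0.001, 10.83), (0.01, 6.64), (0.05, 3.84)]


def p_label(statistic):
    """Return '<0.001', '<0.01', '<0.05', or 'ns' (not significant)."""
    for p_level, critical in CRITICAL_DF1:
        if statistic > critical:
            return f"<{p_level}"
    return "ns"


results = []
for question, bush, kerry in rows:
    exp = (bush + kerry) / 2  # assumed: expected value is the mean of the two
    chi2 = (bush - exp) ** 2 / exp + (kerry - exp) ** 2 / exp
    results.append((question, exp, round(chi2, 2), p_label(chi2)))
    print(f"{question}: exp = {exp}, Chi2 = {chi2:.2f}, p {p_label(chi2)}")
```

Under those assumptions, the computed values agree with the exp, Chi2, and p value columns for all eight rows.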