HANDOUT: GENETIC DRIFT
Genetic drift is a change in gene frequency due to the indeterminacy of sampling in small populations. Although it is not possible to predict the magnitude or direction of change in gene frequency due to genetic drift with certainty in any particular generation, it is possible to place reasonable bounds on the magnitude of change one is likely to see. Those bounds are the subject of this handout.
Suppose one has an ensemble of urns, each holding a large number of white and black balls (the urns are analogous to populations and the balls are analogous to genes). We will assume that the total number of balls is the same in each urn and that in each urn half the balls are black and half are white (i.e. the frequency of each color is p = 0.5). Suppose now that we reach into each urn and draw 10 balls at random. Although the expected number of white balls drawn from each urn is 5 (p x 10 = 0.5 x 10 = 5), most samples will deviate by chance from this expectation. Most deviations will be small, but there is a finite probability that all 10 balls will be black, or that all 10 will be white. Small deviations will be common, large ones rare.
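The urn experiment can be mimicked with a short simulation. The sketch below is illustrative only: it assumes each urn is large enough that every draw is white with probability 0.5 regardless of earlier draws, and the function name `draw_white_counts`, the number of urns, and the fixed seed are choices made here, not part of the handout.

```python
import random

def draw_white_counts(num_urns, draws_per_urn=10, p_white=0.5, seed=0):
    """Return the number of white balls drawn from each simulated urn."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    counts = []
    for _ in range(num_urns):
        # Assumption: the urn is large, so each draw is white with
        # probability p_white independently of the other draws.
        white = sum(1 for _ in range(draws_per_urn) if rng.random() < p_white)
        counts.append(white)
    return counts

counts = draw_white_counts(num_urns=10_000)
mean_white = sum(counts) / len(counts)
share_exactly_five = counts.count(5) / len(counts)

print(f"mean white balls per sample: {mean_white:.2f}")
print(f"share of samples with exactly 5 white: {share_exactly_five:.2f}")
```

The mean should sit close to the expected 5, while only a minority of individual samples hit exactly 5, which is the sense in which most samples deviate from the expectation.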
If we were to make a frequency histogram of the proportion of samples with a given proportion of white balls (analogous to the gene frequency in the next generation), we would obtain something like the following:
Most samples would contain between 40% and 60% white balls; a significant, though smaller, proportion would contain between 30% and 40% or between 60% and 70%; and an even smaller fraction would contain more extreme proportions.
From this histogram, one can see that approximately 90-95% of the samples lie between 0.3 and 0.7. This is analogous to saying that in a population reduced to five individuals each generation (i.e. to 10 copies of the gene each generation, since each individual has two copies), the new gene frequency would lie between 0.3 and 0.7 with a probability of 90-95%. Another way of putting this is that the change in gene frequency is between -0.2 and +0.2 with a probability of 90-95%.
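The shape of such a histogram can be computed directly rather than read off a plot. The sketch below assumes the draws follow a binomial distribution with 10 draws and p = 0.5 (the large-urn assumption); `binomial_pmf` is a helper defined here for illustration. Under that assumption the exact probability of a proportion from 0.3 to 0.7 inclusive works out to about 0.89, close to the approximate range quoted above.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k white balls in n independent draws."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n_draws, p_white = 10, 0.5
pmf = [binomial_pmf(k, n_draws, p_white) for k in range(n_draws + 1)]

# Text histogram: proportion of white balls, probability, and a bar.
for k, prob in enumerate(pmf):
    print(f"{k / n_draws:.1f}  {prob:.3f}  {'#' * round(prob * 100)}")

# Probability that the sampled proportion lies between 0.3 and 0.7 inclusive.
prob_30_to_70 = sum(pmf[3:8])
# Probability that all 10 balls are the same color.
prob_all_one_color = pmf[0] + pmf[n_draws]

print(f"P(0.3 <= proportion <= 0.7) = {prob_30_to_70:.3f}")
print(f"P(all one color) = {prob_all_one_color:.4f}")
```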
If one were able to make these kinds of statements for any population size and for any initial gene frequency (initial proportion of white balls in the urn), then we would have some handle on predicting the magnitude of change in gene frequency due to genetic drift. In particular, if we know that 95% of the time the change due to drift will be between -0.15 and +0.15, and we actually observed a change over one generation of 0.3, then we would be forced to conclude one of two things: either (1) we happened to witness a rare event, which is unlikely, or (2) some process other than, or in addition to, genetic drift caused the change in gene frequency, which is more likely. In other words, by comparing the actual change in gene frequency with the largest change reasonably expected under genetic drift, one can determine whether genetic drift can account by itself for the observed change.
The quantity that permits one to make such a comparison is a statistical property of the histogram in the Figure. That property is known as the variance of the distribution; its square root is the standard deviation of the distribution. Statisticians long ago worked out several properties of the variance and standard deviation of distributions of the type in the Figure:
1. the variance = $s^2 = p(1-p)/(2N)$, where p is the gene frequency and N is the number of breeding adults.
2. the standard deviation = $s = [p(1-p)/(2N)]^{1/2}$
3. 67% of the time, the gene frequency in the next generation will lie within the interval $(p - s,\ p + s)$. This is equivalent to saying that 67% of the time $|\Delta p| < s$, where $\Delta p$ is the change in gene frequency.
4. 95% of the time, the gene frequency in the next generation will lie within the interval $(p - 2s,\ p + 2s)$. This is equivalent to saying that 95% of the time $|\Delta p| < 2s$.
Thus, if the observed change in gene frequency, $|\Delta p|$, is greater than $2s$, this event would be rare, i.e. it would be expected to occur only 5% of the time if genetic drift were the only process operating to change gene frequencies. Hence, if we saw a change in gene frequency that was greater than $2s$, we would be inclined to believe that the change was at least in part due to other processes (e.g. natural selection).
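The rule in items 1 through 4 can be written as a small helper. This is a minimal sketch rather than a standard routine: the function names `drift_sd` and `exceeds_drift_bound` are invented here, and the numbers reuse the urn setting from earlier in the handout (p = 0.5, five breeding individuals).

```python
from math import sqrt

def drift_sd(p, n_breeding):
    """Standard deviation of the next-generation gene frequency under
    drift alone: s = sqrt(p(1 - p) / (2N))."""
    return sqrt(p * (1 - p) / (2 * n_breeding))

def exceeds_drift_bound(p_before, p_after, n_breeding, num_sd=2):
    """True if |change in p| is larger than num_sd standard deviations,
    i.e. larger than drift alone would reasonably be expected to produce."""
    s = drift_sd(p_before, n_breeding)
    return abs(p_after - p_before) > num_sd * s

s = drift_sd(0.5, 5)
print(f"s = {s:.3f}, 2s = {2 * s:.3f}")

# A shift from 0.5 to 0.7 stays inside the 2s bound for N = 5 ...
print(exceeds_drift_bound(0.5, 0.7, 5))
# ... while a shift from 0.5 to 0.9 does not.
print(exceeds_drift_bound(0.5, 0.9, 5))
```

With only five breeding individuals the 2s bound is wide (roughly 0.32), so even a change of 0.2 in one generation is compatible with drift alone.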
EXAMPLE
Suppose a population with $p_A = 0.3$ at time t is reduced to 100 breeding individuals at time t+1. Suppose further that the gene frequency at time t+1 is p = 0.39. Can genetic drift explain this change?
First calculate s: $s = [p_A(1-p_A)/(2N)]^{1/2} = [(0.3)(0.7)/200]^{1/2} = 0.032$.
Then $2s = 2 \times 0.032 = 0.064$.
Next calculate $\Delta p$: $\Delta p = 0.39 - 0.3 = 0.09$.
Since $\Delta p > 2s$ (0.09 > 0.064), the observed change is larger than would reasonably be expected from genetic drift acting alone. One would therefore suspect that natural selection is acting in addition to drift to change gene frequencies in this case.
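As a cross-check on the 2s rule, one can ask how often drift alone would produce a change at least this large. The sketch below is an addition, not the handout's method: it assumes the 200 gene copies are sampled binomially with p = 0.3 and sums the exact two-sided tail. The helper names are hypothetical, and counts are passed as integers (60 expected, 78 observed) to avoid floating-point rounding.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k copies of allele A among n sampled copies."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def two_sided_tail(n_copies, p, expected_count, observed_count):
    """Probability, under binomial sampling alone, of a count at least as
    far from the expected count as the observed one (in either direction)."""
    shift = abs(observed_count - expected_count)
    return sum(
        binomial_pmf(k, n_copies, p)
        for k in range(n_copies + 1)
        if abs(k - expected_count) >= shift
    )

n_copies = 200       # 2N gene copies for N = 100 breeding individuals
p_before = 0.3       # gene frequency at time t
expected_count = 60  # 0.3 x 200
observed_count = 78  # 0.39 x 200

tail = two_sided_tail(n_copies, p_before, expected_count, observed_count)
print(f"P(|change| >= 0.09 under drift alone) = {tail:.4f}")
```

The resulting probability is well under 5%, which agrees with the conclusion reached from the 2s comparison.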