Genetic Drift

April 30, 2007

If you’ve managed to follow all the stuff so far about probability matrices and random numbers, you’ll want to know why we do simulations this way.

The answer is to represent genetic drift.

Genetic Drift is an evolutionary mechanism that doesn’t get as much press coverage as selection (survival of the fittest).

The reason Genetic Drift doesn’t get as much attention is that it’s a weak force.  It’s also random.  However, it’s still important.

To explain Genetic Drift I’m going to start with a ‘thought experiment’. So you’ll have to use your imagination.

Imagine you have a large bowl.  Into that bowl you tip 500 red marbles and 500 blue marbles.

Now put on a blindfold and mix up the marbles in the bowl with your hands, being careful not to spill any. Next, keeping the blindfold on, randomly take 100 marbles out of the bowl.  Don’t drop any, and don’t lose count.

Before you take the blindfold off, guess how many red marbles you took out of the bowl.  If you answered “50 red marbles” you might not be correct.

50 red marbles is the predicted number of red marbles you would expect;  half the marbles in the bowl (the total marble population) were red, so you’d expect a random selection of marbles (the sample) to also be half made up of red marbles.

But it’s not that simple.  Because you couldn’t see the marbles you were choosing, the marbles were chosen randomly.  Because the marbles were chosen randomly, there’s a chance that the selected marbles (the sample) were not split 50/50.  You might have actually selected 51 red marbles and 49 blue, or 43 red marbles and 57 blue.  There’s even a very tiny chance that you could have selected 100 blue marbles.
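If you want to put numbers on those chances, drawing marbles without putting them back follows the hypergeometric distribution. Here is a minimal Python sketch of that calculation. The function name prob_red and its parameters are mine, made up for illustration; they are not part of the simulation code from the earlier posts.

```python
from math import comb

def prob_red(k, red=500, blue=500, draws=100):
    """Chance of getting exactly k red marbles when drawing `draws`
    marbles, without replacement, from a bowl of `red` + `blue` marbles."""
    ways_to_pick_reds = comb(red, k)
    ways_to_pick_blues = comb(blue, draws - k)
    ways_to_pick_any = comb(red + blue, draws)
    return ways_to_pick_reds * ways_to_pick_blues / ways_to_pick_any

p_exactly_50 = prob_red(50)                                # the "expected" result
p_45_to_55 = sum(prob_red(k) for k in range(45, 56))       # close to 50/50
p_all_blue = prob_red(0)                                   # 100 blue marbles

print(f"exactly 50 red: {p_exactly_50:.4f}")
print(f"45 to 55 red:   {p_45_to_55:.4f}")
print(f"all 100 blue:   {p_all_blue:.3e}")
```

Run it and you should see that exactly 50 red is the single most likely result but still far from certain, and that 100 blue is possible but absurdly unlikely.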

Now take the 100 marbles you chose and multiply each one by ten, so you have 1000 marbles again.  This is your next generation.  Repeat the blindfold/bowl/random choice bit again.  Now how many red marbles did you select? It’s still probably not 50.  It might be even further away from 50 than last time.  Keep doing this a few hundred times and you could find that the number of red marbles in the bowl changes over time.  It drifts in numbers in a random fashion.
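For anyone who doesn’t own a thousand marbles, the bowl experiment can be sketched in a few lines of Python. This is only an illustration of the thought experiment, assuming a bowl of 1000 and a sample of 100; the names next_generation and simulate are hypothetical and this is not the simulation used for the charts in this post.

```python
import random

def next_generation(red, total=1000, sample_size=100, rng=random):
    """Draw a blind sample from the bowl, then 'multiply each marble by ten'
    to refill the bowl. Returns the number of red marbles in the new bowl."""
    bowl = [True] * red + [False] * (total - red)   # True = red, False = blue
    sample = rng.sample(bowl, sample_size)          # sampling without replacement
    red_in_sample = sum(sample)
    return red_in_sample * (total // sample_size)

def simulate(generations=200, start_red=500, seed=None):
    """Repeat the blindfold/bowl/random choice step and record the red count."""
    rng = random.Random(seed)
    red = start_red
    history = [red]
    for _ in range(generations):
        red = next_generation(red, rng=rng)
        history.append(red)
    return history

if __name__ == "__main__":
    for generation, red in enumerate(simulate(generations=20, seed=1)):
        print(generation, red)
```

Note that if the bowl ever reaches 0 or 1000 red marbles it stays there, because there is nothing left of the other colour to draw.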

That’s what genetic drift is in an evolving population.

Because there’s a random element in which individuals in a population get to mate together, there’s a random element in which genes get passed down to the next generation, which means there’s a random element to the genetic makeup of the population.  And as you should know if you read my first post, changes to the genetic makeup of a population are called evolution.

(EDIT:  I’ve just Googled and found other people are using marbles to explain genetic drift.  Dammit!  I thought I was being original!)

So, Genetic Drift (now I’ve defined what it is I’d better start capitalising the words) is evolution, but it is not Darwinian evolution.

What’s that? Not Darwinian evolution?  Nope.  Darwin didn’t know about genetics, so he could never have thought of Genetic Drift.  This is why current evolutionary science is described as Neo-Darwinian. So when idiots start spouting how Darwinian theory has holes in it, you could answer straight back with “that’s why we study neo-Darwinian theory!”; but that will probably make you sound like an idiot too, so I generally just keep quiet and ignore the lunatic fringe.

So now we have an idea of what Genetic Drift is, let’s look at an example.  I finished my previous post with a graph.  I’ll put another copy here:

Chart of randomly changing population genetics.

First thing to remember:  the Y-axis represents the average trait value. Possible trait values range from 0 to 20.  The X-axis represents time and goes for about 200 generations.

If you look at the very first (left-most) point on the graph, you’ll see that it’s close to a value of 10.  This is roughly what we would expect the average value to be when the population is randomly generated, although due to the randomness of the initialisation you can never be certain.

In another previous post, I showed a chart of the genetic distribution within a randomly generated population. Here it is again:

Distribution of trait within a population.

This is a normal distribution, so keep in mind that the top chart, showing average trait, is just the tip of the iceberg; each point is just the top of a normal distribution lurking 95% under the surface. When you look at the trait over time graph, think of it as being like this:

OK, so now how do we interpret the chart?  Well, it’s drifting around.  There’s no real direction to the changes except at the end where it seems to stop at a value of 8.


The reason it stops is the phenomenon of ‘fixation’.  Because the alleles in the individuals don’t ‘jump around’ in the gene (our array of Locus instances in the code), it’s possible to reach a state where every individual in the population has the same allele at the same location on the gene.  When this happens, there can be no more change at that location. My guess is that when the population reaches an unchanging value of 8, 4 loci in the gene are fixed for the ‘AA’ allele, while the rest are fixed for the ‘aa’ allele.
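To see fixation happen, here is a rough Python sketch of a population like the one described above. It rests on several assumptions of mine: 10 loci per individual, two alleles (‘A’ or ‘a’) per locus, a trait value equal to the number of ‘A’ alleles (giving the 0 to 20 range), and parents picked completely at random. All the names are hypothetical; this is not the Locus-based code from the earlier posts.

```python
import random

N_LOCI = 10  # assumption: 10 loci x 2 alleles each gives trait values 0..20

def random_individual(rng):
    """A gene is a list of loci; each locus holds a pair of alleles."""
    return [(rng.choice("Aa"), rng.choice("Aa")) for _ in range(N_LOCI)]

def offspring(mother, father, rng):
    """At every locus the child gets one allele from each parent.
    Alleles never move to a different locus."""
    return [(rng.choice(m), rng.choice(f)) for m, f in zip(mother, father)]

def trait(individual):
    """Trait value = number of 'A' alleles across all loci."""
    return sum(pair.count("A") for pair in individual)

def step(population, rng):
    """One generation of random mating at constant population size."""
    return [offspring(rng.choice(population), rng.choice(population), rng)
            for _ in population]

def fixed_loci(population):
    """Map locus index -> surviving allele, for loci with only one allele left."""
    fixed = {}
    for i in range(N_LOCI):
        alleles = {allele for individual in population for allele in individual[i]}
        if len(alleles) == 1:
            fixed[i] = alleles.pop()
    return fixed

def run(pop_size, generations, seed=None):
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    for _ in range(generations):
        population = step(population, rng)
    return population

if __name__ == "__main__":
    population = run(pop_size=10, generations=500, seed=42)
    fixed = fixed_loci(population)
    mean_trait = sum(trait(ind) for ind in population) / len(population)
    print("fixed loci:", fixed)
    print("mean trait:", mean_trait)
```

Once every locus is fixed, the mean trait value is simply twice the number of loci fixed for ‘A’, which is the reasoning behind the guess about the value of 8.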

Let’s get more data.  I’ll run another simulation, but this time I’ll also record the standard deviation as well as the mean for the trait…

Chart of Average Trait Value over time

As we can see from this chart, the standard deviation decreases slowly until there isn’t any left when the average trait value stops changing at a value of 6.

(Nice chart, huh?  Just got Office 2007 installed 🙂 )

Fixation can be an important mechanism in smaller populations, but when population size increases, it becomes less likely.  The last chart is from a population of 1000, this next one is from a population of 10000:

Drift in a large population.

As you can see, in this larger population (actually running the simulation for twice as long, too) there is little or no reduction in standard deviation, although you can still see it drifting around.
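The effect of population size can be checked with a stripped-down sketch that follows a single locus and simply counts copies of the ‘A’ allele. This is a simplified model of my own choosing (each gene copy in the next generation is drawn at random from the current allele frequency), with hypothetical function names and small population sizes so it runs quickly; it is not the simulation behind the charts.

```python
import random

def generations_to_fixation(pop_size, rng, max_generations=100_000):
    """Count generations until one allele is the only one left at a locus."""
    copies = 2 * pop_size      # two gene copies per individual
    count_a = pop_size         # start with half the copies being 'A'
    for generation in range(max_generations):
        if count_a == 0 or count_a == copies:
            return generation  # 'A' is either extinct or fixed
        freq_a = count_a / copies
        # Every copy in the next generation is 'A' with probability freq_a.
        count_a = sum(rng.random() < freq_a for _ in range(copies))
    raise RuntimeError("no fixation within max_generations")

def mean_fixation_time(pop_size, runs=20, seed=0):
    """Average the fixation time over several independent runs."""
    rng = random.Random(seed)
    times = [generations_to_fixation(pop_size, rng) for _ in range(runs)]
    return sum(times) / len(times)

if __name__ == "__main__":
    for size in (10, 100):
        print(size, mean_fixation_time(size))
```

The bigger population should take many more generations, on average, before one allele takes over, which matches the lack of any obvious drop in standard deviation in the large-population chart.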

Remember when I said drift was a weak force?  That doesn’t mean it can’t have a large effect.  The following chart came about by chance as I was preparing to generate the last chart:

This was in a large population (10000) and run over 400 generations.  As you can see, the trait value drops to almost zero.  This means that the ‘A’ allele has almost vanished from the population. Once an allele is removed from the population, that allele is extinct.  This can happen purely through genetic drift.


Genetic Drift is an effect of the randomness of mating.  Although it is a weak force compared to selective pressure, it can have a large effect on the evolution of a population.  Drift is a stronger force in smaller populations.

