OK, so we’ve got code for genes and code for individuals.  If we want to simulate evolution, we’re going to need a population.

Static Population Size

The first point to make is that we’re going to use a static population size. In reality populations are not necessarily static; populations grow and shrink.  However, in general populations follow a sigmoidal growth curve similar to this (idealised) one:

Population Growth Curve

Now, if we look at the right-hand end of the graph, we can see that the population stays at a constant size. So in our model we’re assuming that our population is mature and constant. This model doesn’t fit all population types (some just don’t follow it), but the majority of animal populations do.

Plus it’s much, much easier to write.
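
For reference, one common idealisation of a sigmoidal curve like this is the logistic growth model. The symbols here are assumptions for illustration, not names used in the code: N is the population size, r the growth rate, K the carrying capacity and N_0 the starting population size.

```latex
\frac{dN}{dt} = r N \left(1 - \frac{N}{K}\right),
\qquad
N(t) = \frac{K}{1 + \frac{K - N_0}{N_0}\, e^{-rt}}
```

As t grows, the exponential term vanishes and N(t) approaches K, which corresponds to the flat right-hand end of the curve. A static population size amounts to assuming the population already sits at K.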

Generational Overlap

Populations in simulations can be split into two distinct categories:  non-overlapping generations and overlapping generations.

Non-overlapping generations are the simplest model, so we’ll tackle those first.  In a non-overlapping generation model, each generation dies at the end of the ‘year’ to be replaced by a completely new population. In programming terms, we create a second list of individuals of the same length as the original and let individuals from the ‘parent’ population reproduce, placing their offspring into the ‘offspring’ population until the offspring population is full.  We then delete the parent population and replace it with the new one.

Overlapping populations are a little different; individuals in the population can live for more than one year. When individuals do die, they leave ‘gaps’ in the population that can be filled by offspring. In programming terms we perform a mortality operation that kills off some of the individuals (usually in a way related to a measure of their evolutionary fitness), and then fill the empty spaces in the population with offspring.  This is more complex than a non-overlapping model as we need to keep track of the ‘gaps’ that mortality opens up in the population as well as perform fitness-dependent mortality.
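
The mortality-then-refill step described above might be sketched roughly as follows. This is only an illustration under stated assumptions, not the overlapping population code for this series: `ToyIndividual`, `OverlappingSketch` and `DoOneYear` are hypothetical names, the survival rule (a chance rising linearly from 50% to 100% with the trait value) is made up, and the IIndividual interface is repeated from later in the post only so that the sketch compiles on its own.

```csharp
using System;
using System.Collections.Generic;

// Interface repeated from the article so the sketch compiles on its own.
public interface IIndividual
{
    int TraitValue { get; }
    IIndividual Cross(IIndividual mate);
}

// Hypothetical stand-in individual: the offspring trait is the parents' mean.
public class ToyIndividual : IIndividual
{
    private readonly int trait;

    public ToyIndividual(int trait) { this.trait = trait; }

    public int TraitValue { get { return trait; } }

    public IIndividual Cross(IIndividual mate)
    {
        return new ToyIndividual((trait + mate.TraitValue) / 2);
    }
}

public static class OverlappingSketch
{
    // One 'year': fitness-dependent mortality, then refill the gaps.
    // Assumption: survival chance is 0.5 + 0.5 * (TraitValue / maxTrait).
    // Returns the number of individuals that were replaced.
    public static int DoOneYear(IIndividual[] population, int maxTrait, Random rand)
    {
        List<int> gaps = new List<int>();       // indices opened up by mortality
        List<int> survivors = new List<int>();  // indices still alive

        for (int i = 0; i < population.Length; i++)
        {
            double survival = 0.5 + 0.5 * population[i].TraitValue / maxTrait;
            if (rand.NextDouble() < survival)
            {
                survivors.Add(i);
            }
            else
            {
                gaps.Add(i);
            }
        }

        // With no survivors there is nobody to breed from, so treat the year as a no-op.
        if (survivors.Count == 0)
        {
            return 0;
        }

        // Fill each gap with the offspring of two randomly chosen survivors.
        foreach (int gap in gaps)
        {
            IIndividual parentOne = population[survivors[rand.Next(survivors.Count)]];
            IIndividual parentTwo = population[survivors[rand.Next(survivors.Count)]];
            population[gap] = parentOne.Cross(parentTwo);
        }

        return gaps.Count;
    }
}
```

The two index lists are the extra bookkeeping mentioned above: the array never changes length, so the population size stays static while individual slots turn over.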

Writing Generic Population Code

I’ve decided to write population code for both overlapping and non-overlapping populations using generics in .NET 2.  We can then create a basic ‘individual’ class that we can modify for our own purposes depending on what we want to do with our simulation, while re-using the same population code over and over again.

Firstly we have to deal with the ‘individual’ class that we’ve already created.

Improving the Individual : Interactions

So far our individuals are isolated from one another. In a true population they interact to some degree; at the very least they mate. We have to make a design decision here:  the basic population code will remain the same for almost all simulations, but the code for the individuals in the population will change a lot.  What I’m going to do is use an interface for the individuals, and use generics for the population.

We currently need only two things from an individual:  a value for the genetic trait that the individual carries and the ability to cross with another individual. We’ll add more to this as we build up the simulation later.

public interface IIndividual
{
    int TraitValue { get; }
    IIndividual Cross(IIndividual mate);
}

In our code we just need to adapt our Individual class.

Firstly, deriving from the IIndividual interface:

public class Individual : IIndividual

Then, getting the trait value (in this case, just passing back the ‘height’ property):

///<summary>
/// Gets the trait value.
///</summary>
///<value>The trait value.</value>
public int TraitValue
{
    get { return this.Height; }
}

Finally, we need to add code to allow us to cross two individuals:

///<summary>
/// Crosses with the specified mate.
///</summary>
///<param name="mate">The mate.</param>
///<returns>A new <see cref="IIndividual"/> as offspring.</returns>
public IIndividual Cross(IIndividual mate)
{
    // Cast to specific type, error if not possible.
    Individual other = mate as Individual;
    if (other == null)
    {
        throw new System.ArgumentException("mate is not of same type.", "mate");
    }

    // Cross each locus of the gene
    Locus[] offspringGene = new Locus[arraySize];

    for (int i = 0; i < arraySize; i++)
    {
        offspringGene[i] = Locus.Cross(this.heightLoci[i], other.heightLoci[i]);
    }

    // Create new offspring with new genotype.
    return new Individual(offspringGene);
}

First things first.  We cast the IIndividual interface passed into our specific Individual type.  If this fails, we throw an exception. The types must be the same.

Next we create a new array of Locus objects.  These will be given to the offspring.

We loop through all the parental Loci, crossing each one and putting the result into the offspring Locus array.

We create a new Individual passing it the new Locus array.  We return the new Individual as the result (it will get cast back to an IIndividual automatically).

Look at how the Locus array gets crossed:  the first locus in parent 1 gets crossed with the first locus in parent 2 and the result becomes the first locus in the offspring.  This ordering is important.  Loci do not ‘jump about’ in the gene array, they stay at their original index.  We shall later add code to shuffle things around (called ‘cross-over’ or, more formally, ‘recombination’).

Notice also that crossing two individuals adds no randomisation of its own.  Only our probability rules are used to create locus types; at no point are they randomly assigned.  We shall later add some randomness to create ‘point mutations’ within the population.

A Non-overlapping Population

So now that we’ve got the individual interface sorted, let’s look at using it in a population.

First, the class declaration:

///<summary>
/// A Simple, non-overlapping population.
///</summary>
///<typeparam name="T">The <see cref="IIndividual"/> based class.</typeparam>
class SimplePopulation<T> where T : class, IIndividual, new()

The class SimplePopulation uses generics. This will allow us to use the same population class with different versions of the individual class without having to change the population code.

The SimplePopulation class has some constraints on the generic type.  The type ‘T’ must be a class (not a value type or struct), it must implement the IIndividual interface and it must have a default constructor (with no parameters).

The SimplePopulation has only a few internal data members:

// Private holder for the population size.
private int mPopulationSize = 100;
// Maximum possible value for trait (required for census histogram)
private int mMaxTraitValue = 20;

// The actual population.
private T[] population;

and the constructor  simply initialises them:

///<summary>
/// Initializes a new instance of the <see cref="T:SimplePopulation&lt;T&gt;"/> class.
///</summary>
///<param name="populationSize">Size of the population.</param>
///<param name="MaxTraitValue">Maximum possible value for the trait (required for the census histogram).</param>
public SimplePopulation(int populationSize, int MaxTraitValue)
{
    mPopulationSize = populationSize;
    mMaxTraitValue = MaxTraitValue;

    population = new T[mPopulationSize];

    for (int i = 0; i < mPopulationSize; i++)
    {
        population[i] = new T();
    }
}

Notice the line that uses new T(). This is allowed because of the new() constraint in the SimplePopulation generics constraint mentioned above.
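
To see the three constraints at work outside the population class, here is a minimal sketch with a second individual type plugged into a generic helper that carries the same constraints. `FixedIndividual` and `FounderFactory` are hypothetical names made up for illustration, and the IIndividual interface is repeated only so that the sketch compiles on its own.

```csharp
using System;

// Interface repeated from the article so the sketch compiles on its own.
public interface IIndividual
{
    int TraitValue { get; }
    IIndividual Cross(IIndividual mate);
}

// Hypothetical individual whose trait never changes. It is a class, it
// implements IIndividual and it has an implicit public parameterless
// constructor, so it satisfies all three constraints.
public class FixedIndividual : IIndividual
{
    public int TraitValue { get { return 7; } }

    public IIndividual Cross(IIndividual mate)
    {
        return new FixedIndividual();
    }
}

public static class FounderFactory
{
    // Same constraints as SimplePopulation<T>; new() is what permits 'new T()'.
    public static T[] Create<T>(int size) where T : class, IIndividual, new()
    {
        T[] founders = new T[size];
        for (int i = 0; i < size; i++)
        {
            founders[i] = new T();
        }
        return founders;
    }
}
```

Calling FounderFactory.Create<FixedIndividual>(5) gives five founders. If FixedIndividual were a struct, or had only constructors that take parameters, the same call would be rejected at compile time rather than failing at run time.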

The next feature of a population is that it must be able to allow breeding.  We will do this one season at a time, with a single method:

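// Shared random number generator. On older .NET versions, creating a new
// Random on every call can repeat sequences when calls come in quick succession.
private static readonly System.Random rand = new System.Random();

///<summary>
/// Does one season of reproduction.
///</summary>
///<remarks>This updates the Population with one season of reproduction.</remarks>
public void DoOneSeasonOfReproduction()
{
    T[] newGeneration = new T[mPopulationSize];

    for (int i = 0; i < mPopulationSize; i++)
    {
        // The upper bound of Random.Next is exclusive, so passing
        // mPopulationSize lets every individual be chosen as a parent.
        int idxParentOne = rand.Next(0, mPopulationSize);
        int idxParentTwo = rand.Next(0, mPopulationSize);

        newGeneration[i] = population[idxParentOne].Cross(population[idxParentTwo]) as T;
    }

    population = newGeneration;
}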

Here we simply create a new array of individuals and fill it by randomly selecting parents from the original population and mating them to create a single offspring.  We’re just doing random mating for now.  We’ll add trait-based selection later.

Once the new population is full, it replaces the old one.

Simple really.

I have also added a method to take a census of the population as a snapshot.  I won’t go over the code here (it’s in the download) as it’s just housekeeping.
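
For a feel of what that housekeeping involves, here is a rough stand-in, not the code from the download. The real TakeCensus returns a DescriptiveStatistics.Histogram, which is not reproduced here; `CensusSketch`, `CountTraits`, `Mean` and `ToyIndividual` are hypothetical names, and the IIndividual interface is repeated only so that the sketch compiles on its own. It also shows why the population needs to know the maximum trait value: that value fixes the number of histogram bins.

```csharp
using System;

// Interface repeated from the article so the sketch compiles on its own.
public interface IIndividual
{
    int TraitValue { get; }
    IIndividual Cross(IIndividual mate);
}

// Hypothetical stand-in individual with a fixed trait value.
public class ToyIndividual : IIndividual
{
    private readonly int trait;

    public ToyIndividual(int trait) { this.trait = trait; }

    public int TraitValue { get { return trait; } }

    public IIndividual Cross(IIndividual mate)
    {
        return new ToyIndividual((trait + mate.TraitValue) / 2);
    }
}

public static class CensusSketch
{
    // Counts how many individuals carry each trait value from 0 to maxTraitValue.
    public static int[] CountTraits(IIndividual[] population, int maxTraitValue)
    {
        int[] counts = new int[maxTraitValue + 1];
        foreach (IIndividual individual in population)
        {
            counts[individual.TraitValue]++;
        }
        return counts;
    }

    // Mean trait value recovered from the histogram (0 for an empty census).
    public static double Mean(int[] counts)
    {
        long total = 0;
        long weighted = 0;
        for (int value = 0; value < counts.Length; value++)
        {
            total += counts[value];
            weighted += (long)value * counts[value];
        }
        return total == 0 ? 0.0 : (double)weighted / total;
    }
}
```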

So how do we use our simple population?

static void Main(string[] args)
{
    SimplePopulation<Individual> SimplePop = new SimplePopulation<Individual>(1000, 20);

    // The using block closes and disposes the log file, even if an exception is thrown.
    using (System.IO.StreamWriter sw = System.IO.File.CreateText("Log.txt"))
    {
        for (int i = 0; i < 999; i++)
        {
            DescriptiveStatistics.Histogram histo = SimplePop.TakeCensus();

            if ((i % 10) == 0)
            {
                Console.WriteLine("{0}   {1}", i, histo.Statistics.Average);
            }

            sw.WriteLine(histo.Statistics.Average);

            SimplePop.DoOneSeasonOfReproduction();
        }
    }
}

We create our population using our Individual class as the template type, passing the population size (1000) and the maximum trait value (20).  We get the maximum trait value as twice the number of trait alleles in the Individual class; there are 10 alleles, therefore the maximum trait is 20.

Then we simply loop once for each generation (I chose 999 generations pretty arbitrarily). We start each generation with a census, which I write to the console and to a log file.  Then we perform one season of reproduction.

Not very complicated, I’m sure you’ll agree.

Running the simulation does not look very exciting.  Numbers appear on the screen and get saved to a text file. Whoopee, I hear you cry sarcastically.

I’ve taken the output of running the program and plotted it as a chart to get this:

So what are you looking at?  This is an evolving population.  It’s not Darwinian Selection, it’s  Genetic Drift. I’ll discuss Genetic Drift in the next post and we’ll explore it a little with the code we’ve written so far.

Summary

So what have we got? We have a population of individuals that reproduce randomly.  Genes are inherited by offspring using probability-based Mendelian genetics. When run over a period of time, we get genetic drift.

The code for this entry can be downloaded from Channel9’s sandbox.

 

If you’re interested in writing software, check out my other blog: Coding at The Coal Face
