Evolutionary Modeling

October 26, 2006

I thought I’d write a bit about how evolution is modeled, or at least how I model it.

Most models will start out with a verbal version; a plain English explanation of the mechanism.

Unfortunately, I found that many of the more mathematical models missed this bit out.  It seems that mathematical types understand a lot just by looking at equations.  I’m not a mathematical type (despite being able to write simulations). I need the verbal model.

So I’ll first make two definitions:

  1. A Model is a mathematical function.  Given the same parameters it will always give the same answers.
  2. A Simulation is an actual reproduction of a simplified system. Simulations normally use some random numbers to determine probabilities of what will happen next.  Given the same parameters a simulation may give wildly different answers each time.

So what’s the point of a simulation if it always gives different answers? What you do with a simulation is to run it a number of times. Each run is called a replicate. When you have enough replicates you will analyse the results as a group and look for trends and patterns.
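Here is a rough sketch of the difference, and of what replicates look like in practice. The toy problem (counting how often a 50/50 event happens in 100 tries) and all the function names are my own inventions for illustration, not part of any particular package.

```python
import random
import statistics


def expected_heads(n_tosses, p_heads=0.5):
    """The 'model': a plain function, same answer every time."""
    return n_tosses * p_heads


def simulate_heads(n_tosses, p_heads=0.5, rng=random):
    """The 'simulation': try the event n_tosses times and count the successes."""
    return sum(1 for _ in range(n_tosses) if rng.random() < p_heads)


def run_replicates(n_replicates, n_tosses, p_heads=0.5, seed=None):
    """Run the simulation several times; each run is one replicate."""
    rng = random.Random(seed)  # the seed is only there to make a run repeatable
    return [simulate_heads(n_tosses, p_heads, rng) for _ in range(n_replicates)]


results = run_replicates(n_replicates=200, n_tosses=100, seed=1)
print("model says:       ", expected_heads(100))
print("replicate mean:   ", statistics.mean(results))
print("replicate std dev:", statistics.stdev(results))
```

Any single replicate can land some way from 50, but the replicates taken as a group should cluster around what the model predicts.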

Simulations are often called Monte-Carlo models. This refers to the random number element — it’s supposed to be like gambling at a Monte-Carlo casino.  I’ve never been to Monte-Carlo, or gambled at a casino. This is a stupid name.  Before computers were available some simulations were done by throwing dice for randomness.  They should have called it D&D modeling.

Models are often built on the principles of the Normal Distribution and will deal with changes in mean and standard deviation over time, or in regard to some other variable.

I don’t do models.  Actually, I can’t do models.  Like I said, I’m not very mathematical (I’ve tried to learn calculus three times, and I still find it tough going).

What I understand and am going to write about are simulations.

So what’s the basis for the simulations I write?

Firstly there’s the probability stuff.  You decide on the probability of a particular event happening.  Then you generate a random number between 0 and 1, and if that number is less than your probability, the event happens.

Pretty simple really.

To simulate a coin toss, you would write something like this (in Python):

    import random

    if random.random() < 0.5:
        print("Heads")
    else:
        print("Tails")

You would then do this a few hundred (or even thousand) times and see that the coin comes down Heads half the time.
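As a minimal sketch of that repetition (the helper names toss_coin and fraction_heads are made up for this example), something like this should show the fraction of Heads settling towards one half as the number of tosses grows:

```python
import random


def toss_coin(rng):
    """Return 'Heads' or 'Tails' with equal probability."""
    return "Heads" if rng.random() < 0.5 else "Tails"


def fraction_heads(n_tosses, seed=None):
    """Toss the coin n_tosses times and return the fraction that came up Heads."""
    rng = random.Random(seed)
    heads = sum(1 for _ in range(n_tosses) if toss_coin(rng) == "Heads")
    return heads / n_tosses


for n in (100, 1_000, 10_000):
    print(n, "tosses:", fraction_heads(n))
```

The exact numbers will differ on every run unless you pass a seed.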

But we’re not simulating coin tossing, we’re simulating evolution.

Evolution is defined as ‘any net change in the genetic makeup of a population’. So we’re going to need a population and some genetics.

In the last post I showed the three possible genotypes for a gene with two alleles: aa, aA and AA.

So that’s a gene.  Put some of these into a structure and we have an individual.  Make an array of these and we have a population with some genetics.  All we need to evolve the population is some selection pressure and a big, repeating loop.
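One way this could look in code is sketched below. The Individual class, the helper names and the choice to store each gene as a two-letter string are all assumptions of mine; a real simulation might well pack things differently.

```python
import random
from dataclasses import dataclass

ALLELES = ("a", "A")


def random_gene(rng):
    """A gene is a pair of alleles, written with the recessive 'a' first."""
    pair = (rng.choice(ALLELES), rng.choice(ALLELES))
    # Reverse sort puts lowercase before uppercase, so 'Aa' becomes 'aA'.
    return "".join(sorted(pair, reverse=True))


@dataclass
class Individual:
    genes: list  # one genotype string per gene, e.g. ["aA", "aa", "AA"]


def make_population(size, genes_per_individual, seed=None):
    """Build a list (the 'array') of individuals with random genes."""
    rng = random.Random(seed)
    return [
        Individual([random_gene(rng) for _ in range(genes_per_individual)])
        for _ in range(size)
    ]


population = make_population(size=5, genes_per_individual=3, seed=1)
for individual in population:
    print(individual.genes)
```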

The flow of a simulation program is often like this:

  1. Initialise population with random genes.
  2. Kill off some of the individuals based on their genetics and a probability-based function.
  3. Allow some of the individuals to breed — the probability of breeding will also be a probability-based function.
  4. Record the genetics of the population
  5. Go to 2

This is a simulation with overlapping generations.  Some simpler simulations will simply allow the old population to reproduce into a new, empty population of the same size and then destroy the parent population.  It depends on the type of animal you are basing your simulations on.
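The five steps above, in the overlapping-generations flavour, can be sketched roughly like this. It tracks a single gene per individual, and every number in it (the survival chances, the breeding probability, the population cap) is a made-up placeholder so the loop has something to do; none of them come from a real study.

```python
import random
from collections import Counter

# Hypothetical values, chosen only for illustration.
SURVIVAL = {"aa": 0.6, "aA": 0.8, "AA": 0.8}  # chance of surviving one step
BREED_PROBABILITY = 0.5                       # chance a chosen pair has a child
MAX_POPULATION = 200                          # crude cap on population size


def make_genotype(allele_1, allele_2):
    """Join two alleles, writing the recessive 'a' first ('Aa' becomes 'aA')."""
    return "".join(sorted((allele_1, allele_2), reverse=True))


def breed(mother, father, rng):
    """The child gets one randomly chosen allele from each parent."""
    return make_genotype(rng.choice(mother), rng.choice(father))


def simulate(steps, start_size=100, seed=None):
    rng = random.Random(seed)
    # Step 1: initialise the population with random genes.
    population = [
        make_genotype(rng.choice("aA"), rng.choice("aA")) for _ in range(start_size)
    ]
    history = []
    for _ in range(steps):  # Step 5 ("go to 2") is this loop.
        # Step 2: each individual survives with a genotype-dependent probability.
        population = [g for g in population if rng.random() < SURVIVAL[g]]
        # Step 3: random pairs of survivors may breed, up to the population cap.
        children = []
        if len(population) >= 2:
            for _ in range(MAX_POPULATION - len(population)):
                mother, father = rng.sample(population, 2)
                if rng.random() < BREED_PROBABILITY:
                    children.append(breed(mother, father, rng))
        population.extend(children)  # parents stay alive: overlapping generations
        # Step 4: record the genetics of the population.
        history.append(Counter(population))
    return history


history = simulate(steps=50, seed=1)
print("first step:", dict(history[0]))
print("last step: ", dict(history[-1]))
```

For the simpler non-overlapping version you would replace the population with the children at each step instead of extending it.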

The probability-based functions that decide who dies and who breeds are the selection pressure in the model.  Selection pressure makes populations evolve as they try to maximise their fitness.

What’s fitness?  I’ll tell you next time, it’s getting late.

 

If you’re interested in writing software, check out my other blog: Coding at The Coal Face

A bit of Genetics

October 15, 2006

I’m going to take a short break from the normal distribution and talk about the basic mechanisms of genetic inheritance.

If you took biology in high-school, this will all sound familiar.

Genes and Alleles.

The genetics I’m going to talk about are for diploid species (such as ourselves). This basically means that an individual has two copies of each gene. Some species are haploid, with only one copy of each gene, but they’re mainly single-celled organisms and we all know how trashy they can get.

When a gene has alternative versions, they are called ‘alleles’. We are only going to look at the case where a gene has two alleles. I’m keeping it simple, stupid.

One of these alleles is dominant. That means if an individual has even one copy of that allele, then they will show the physical attributes coded by that allele.

The other allele is recessive. In order for an individual to display the characteristics of the recessive allele, both of the copies held by that individual must be the recessive allele.

The classic (if rather simplified) example of dominant and recessive alleles is eye colour. The allele for blue eyes is recessive. The allele for brown eyes is dominant. To have blue eyes, you need two copies of the blue eye allele. If you have one blue eye allele and one brown eye allele, then you’ll have brown eyes.

We will call the dominant allele ‘A’ and the recessive allele ‘a’. See? The lower case is recessive while the upper case is dominant.

OK, so there are three possible combinations of allele:

  • aa
  • aA
  • AA

Notice that aA and Aa are the same; the sequence does not matter.
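In code, one simple way to make sure Aa and aA are treated as the same thing is to always write the pair in a fixed order. This is just a sketch; the function name genotype is my own.

```python
def genotype(allele_1, allele_2):
    """Combine two alleles into a genotype, always writing the recessive 'a' first."""
    # Lowercase letters sort after uppercase, so a reverse sort puts 'a' in front.
    return "".join(sorted((allele_1, allele_2), reverse=True))


print(genotype("A", "a"))  # aA
print(genotype("a", "A"))  # aA
```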

Breeding.

Yeah. You’ve been waiting for this bit. I know.

So, we’ve got two potential parents. They breed. Give them some privacy, please. We’ll wait while they finish.

OK, they’re done. That was a bit quick, wasn’t it?

So what genes will the child carry? Basically the child will get one randomly selected allele from each parent.
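A minimal sketch of that rule, assuming each parent is stored as a two-letter string such as "aA" (the breed function is a name I made up for this example):

```python
import random
from collections import Counter


def breed(parent_1, parent_2, rng=random):
    """Return a child genotype: one randomly chosen allele from each parent."""
    alleles = (rng.choice(parent_1), rng.choice(parent_2))
    return "".join(sorted(alleles, reverse=True))  # 'Aa' is written as 'aA'


# Breed the same pair many times and tally what comes out.
offspring = Counter(breed("aA", "aA") for _ in range(10_000))
print(offspring)
```

The tally changes a little from run to run, since each child is a random draw.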

The following table shows the possible offspring types for each parent combination:

                       Parent 1
                 aa        aA            AA
Parent 2   aa    aa        aa, aA        aA
           aA    aa, aA    aa, aA, AA    aA, AA
           AA    aA        aA, AA        AA

For example, if both parents are ‘aa’ then all offspring will be ‘aa’ because they are the only alleles available.

If one parent is ‘aa’ and the other is ‘aA’, then offspring will either be ‘aa’ or ‘aA’; ‘AA’ offspring are not possible because only one parent has the dominant ‘A’ allele to pass down.

By going through the possibilities, we can come to the probabilities.

If both parents are ‘aa’ then the probability of offspring being ‘aa’ is 100%.

If both parents are ‘aA’ then the offspring probabilities are:

  • aa — 25%
  • aA — 50%
  • AA — 25%
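You can check these numbers by listing the four equally likely ways of picking one allele from each parent and counting how often each genotype turns up. The sketch below does exactly that; offspring_probabilities is a hypothetical helper name, and parents are assumed to be two-letter strings.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def offspring_probabilities(parent_1, parent_2):
    """Enumerate the four equally likely allele pairings and tally the genotypes."""
    counts = Counter(
        "".join(sorted(pair, reverse=True))  # write 'Aa' as 'aA'
        for pair in product(parent_1, parent_2)
    )
    return {g: Fraction(n, 4) for g, n in counts.items()}


print(offspring_probabilities("aA", "aA"))
print(offspring_probabilities("aa", "aA"))
```

Using Fraction keeps the answers exact (1/4, 1/2) rather than as rounded decimals.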

We can now build a probability matrix to determine the probabilities of offspring genotypes from any parental genotypes.

I’ll do that in the next installment and show how this probability matrix can be used in a computer simulation of genetics.

 


Written while listening to:

King Of The Mountain by Kate Bush from the album Aerial

She Sells Sanctuary (Long Version) by The Cult from the album Best of Rare Cult

That Lady/Part 1 & 2 by The Isley Brothers; O’Kelly Isley; Ronald Isley; Rudolph Isley; Marvin Isley; Ernie Isley; Chris Jasper from the album Summer Breeze Greatest Hits

Take Your Mama by Scissor Sisters from the album Scissor Sisters [UK Bonus Tracks]

Normal Distribution Part II

October 15, 2006

Ok, so in the last post I described what a Normal Distribution was:

It’s a graph of the frequency of occurrence of values of a specific trait.

So here are some more nuggets of information.

A Normal Distribution can be described using just two values: the mean and the ‘standard deviation’.

Here’s a picture showing what the mean and the standard deviation describe:

A normal distribution with mean and standard deviation marked.

So the mean describes the position of the normal distribution on the X-axis, while the standard deviation describes the width of the normal distribution.

Mathematicians and statisticians love the normal distribution because you can describe the whole thing with just two numbers.
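To see the "just two numbers" idea at work, here is a small sketch that draws random values from a normal distribution and then recovers the mean and standard deviation from the sample. The trait (a height in cm), the values 170 and 10, and the function name are all invented for the example.

```python
import random
import statistics


def sample_trait(mean, std_dev, n, seed=None):
    """Draw n values of a trait from a normal distribution."""
    rng = random.Random(seed)
    return [rng.gauss(mean, std_dev) for _ in range(n)]


heights = sample_trait(mean=170, std_dev=10, n=10_000, seed=42)
print("sample mean:   ", round(statistics.mean(heights), 2))
print("sample std dev:", round(statistics.stdev(heights), 2))
```

The sample values should come out close to the 170 and 10 you put in, and closer still as n gets bigger.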

So here are two normal distributions with the same mean, but different standard deviations:

Two normal distributions with same mean but different standard deviations.

... and here are two normal distributions with the same standard deviation but different means:

Two normal distributions with the same standard deviation, but with different means.

It is important to understand that evolution can affect both of these values either independently, or at the same time.

The normal distribution is a probability function.

There is a set of statistical tests that are known as the Parametric tests. These are all based on the assumption that the data being tested come from a normally distributed set. This basically allows the tests to make mathematical shortcuts by using the mean and standard deviation to produce probability curves.

There is a probability formula for the normal distribution, but I’m not going to show it, because we don’t need to memorise it to understand statistics.

 


If You Find Yourself Caught in Love by Belle & Sebastian from the album Dear Catastrophe Waitress

Plug In Baby by Muse from the album Origin Of Symmetry