6/26/2006

How much is too much

(Population Genetics)

Apropos of my earlier post about the problems that the basic neutral model hits wrt mutation, I rigged up a script[1] to figure out: exactly how much mutation is required to mess things about?

Here is the output data taken over 100000 generations[2] with a population size of 100. Anything longer or larger would have been computationally problematic.
Mutation rate | Fixations
0.0001 | 12
0.0002 | 23
0.0005 | 51
0.001 | 68
0.002 | 111
0.005 | 108
0.01 | 23
0.02 | 2
0.05 | 1
0.1 | 1
0.2 | 1
0.5 | 1
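The script at [1] isn't reproduced in this post, so the following is only a rough sketch of the kind of simulation that could produce a table like this, not the actual code. It assumes a haploid Wright-Fisher population in which every mutation creates a brand-new allele, and it counts a "fixation" whenever the whole population carries a single allele that differs from the last one seen at 100%. The function name, its parameters and that definition of fixation are all assumptions made for illustration.

```python
import random


def count_fixations(pop_size, mutation_rate, generations, seed=None):
    """Count fixations in a neutral haploid Wright-Fisher population.

    Assumptions (not taken from the original script): every mutation
    creates a new, never-before-seen allele, and a fixation is counted
    each time the population becomes homogeneous for an allele other
    than the one most recently fixed.
    """
    rng = random.Random(seed)
    population = [0] * pop_size  # start homogeneous: everyone carries allele 0
    fixed_allele = 0             # allele most recently seen at 100% frequency
    next_allele = 1              # label to give the next new mutation
    fixations = 0

    for _ in range(generations):
        # Reproduction: each child copies a parent chosen uniformly at random.
        population = [rng.choice(population) for _ in range(pop_size)]

        # Mutation: each child independently mutates with probability u.
        for i in range(pop_size):
            if rng.random() < mutation_rate:
                population[i] = next_allele
                next_allele += 1

        # Fixation check: whole population carries one allele, and it is new.
        first = population[0]
        if first != fixed_allele and all(a == first for a in population):
            fixations += 1
            fixed_allele = first

    return fixations


if __name__ == "__main__":
    # A deliberately short run; the table above used 100000 generations.
    print(count_fixations(pop_size=100, mutation_rate=0.002,
                          generations=2000, seed=1))
```

Because the result depends on the random seed, a single run of something like this will wobble around in the same way the numbers in the table do.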


I have of course graphed this out[3]:


And a close-up of the left hand side:


Now, from this graph it looks very much like the rate of fixation increases linearly then tails away. We've already explained the second phenomenon - it's because the high rate of mutation breaks the otherwise reasonable assumption that the system will tend to homogeneity. The first phenomenon, however, is very interesting. Let's act like proper scientists and see what's going on.

First we'll try to figure out what the actual rate of fixation is in these cases (note: I've trimmed the boring tail off):
Mutation rate | Fixation rate
0.0001 | 0.00012
0.0002 | 0.00023
0.0005 | 0.00051
0.001 | 0.00068
0.002 | 0.00111
0.005 | 0.00108
0.01 | 0.00023
0.02 | 0.00002
0.05 | 0.00001


Now, what's odd about those numbers? Oh, right, for the first few the mutation rate is almost identical to the fixation rate. Cool!
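To put a number on "almost identical", here is a small sketch that recomputes the fixation rate as fixations divided by generations and adds the ratio of fixation rate to mutation rate. The counts are copied from the first table; the helper name `fixation_rate` is made up for this example.

```python
GENERATIONS = 100_000

# Mutation rate -> number of fixations, copied from the first table.
fixation_counts = {
    0.0001: 12,
    0.0002: 23,
    0.0005: 51,
    0.001: 68,
    0.002: 111,
    0.005: 108,
}


def fixation_rate(fixations, generations):
    """Fixations per generation."""
    return fixations / generations


print("Mutation rate | Fixation rate | Ratio")
for u, count in fixation_counts.items():
    rate = fixation_rate(count, GENERATIONS)
    print(f"{u:g} | {rate:.5f} | {rate / u:.2f}")
```

The ratio sits between roughly 1.0 and 1.2 for the three lowest mutation rates, then falls to 0.68 at a mutation rate of 0.001 and keeps dropping from there.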

Warning: Empiricism at work

This equality would appear to hold very well for this case. But there's always the possibility that it's a trick of the light - there could be more factors that just happen to equal 1 in this case. So we're going to have to try more experiments.

In fact, due to time constraints, I'm only going to try three more. I'm going to change each of the two remaining variables (population size, number of generations - I've already covered mutation rate) in turn.

Population size

I'll raise the size by a factor of 10, to 1000 (setting mutation at 0.0002 and generations at 100000). If our hypothesis that the mutation rate = the fixation rate is true, I should get around 20 fixations.

The result: 11. Rate of fixation is therefore 0.00011.
Close, but no cigar.

On reflection, though, it occurs to me that this could be because, with a bigger population, the whack-a-mole factor of new mutations appearing is going to be more of a problem. As such, I repeated the experiment with a population size of just 10.

Result: 25. Rate of fixation is therefore 0.00025, which is acceptably close to the mutation rate of 0.0002. At some point I'll have to do a more detailed examination to determine whether all this is valid (at the moment I'm effectively reasoning from anecdotal evidence) but for the moment I'll say that population size apparently does not affect fixation rate and I'll move on.

Number of generations

I'll return the size to its original value of 100, and raise the number of generations to 1000000, leaving the mutation rate at 0.0002.

Result (after much waiting): 177. Should be 200. What the hell, close enough. (Like I say, when I have lots of computer time to play with I'll redo a lot of this)

Conclusion and explanation

OK, so the results, whilst decidedly flakey, appear to broadly support the conclusion that the rate of mutation equals the rate of fixation (for low rates of mutation). But why should this be so?

Turns out the reasoning is fairly obvious. Recall that, if N is the population size, the probability of any given allele in a generation eventually fixing is 1/N. Now note that, if the rate of mutation is u, there will be on average u.N newly mutated alleles per generation. If we assume that the average time taken for an allele to fix is not dependent on the generation it appears in[6], it follows directly that the rate of fixation will be u.N.(1/N) = u. QED.
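The 1/N step in that argument can be checked numerically. The sketch below is an illustration under assumptions rather than part of the script at [1]: it drops a single new mutant into a haploid Wright-Fisher population of size N, switches off all further mutation, and records how often the mutant's lineage takes over. The function name and parameters are invented for this example.

```python
import random


def fixation_probability(pop_size, trials, seed=None):
    """Estimate how often a single new neutral mutant eventually fixes.

    Only the number of mutant copies is tracked. Each generation, every
    child is a mutant with probability equal to the current mutant
    frequency (binomial sampling), and no further mutation occurs.
    """
    rng = random.Random(seed)
    fixed = 0
    for _ in range(trials):
        mutants = 1  # one freshly mutated allele in the population
        while 0 < mutants < pop_size:
            freq = mutants / pop_size
            mutants = sum(1 for _ in range(pop_size) if rng.random() < freq)
        if mutants == pop_size:
            fixed += 1
    return fixed / trials


if __name__ == "__main__":
    n = 10
    estimate = fixation_probability(n, trials=5000, seed=1)
    print(f"estimated: {estimate:.3f}, 1/N: {1 / n:.3f}")
```

Multiplying an estimate like this by the u.N new mutants arriving per generation is what gives back a fixation rate of about u.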

This post has raised three further questions which I'll have to explore at some point:

1) How does the fixation curve vary with rate of mutation? In particular, at what point does the negative whack-a-mole effect start to overwhelm the positive effect described in the last paragraph? (this will probably be a mostly theoretical discussion)

2) Apropos of the last paragraph, under what conditions does the assumption of constant average fixation time break down? Can we destroy it by, for example, messing around with the population size?

3) How exactly are we assessing whether an experimental result is "close enough" to the theoretical result? Here I'm going to need to discuss some undergrad-level statistics.
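As a first stab at question 3, one crude yardstick is to treat the number of fixations in a run as roughly Poisson-distributed, so that its standard deviation is the square root of the expected count. That Poisson approximation is an assumption made here for illustration and has not been checked against the simulation; the helper name `poisson_z` is likewise made up. The observed and expected counts are the ones from the three experiments above.

```python
import math


def poisson_z(observed, expected):
    """Distance from the expected count in standard deviations,
    assuming the count is roughly Poisson (sd = sqrt(expected))."""
    return (observed - expected) / math.sqrt(expected)


# (label, observed fixations, expected fixations = u * generations)
runs = [
    ("population 1000", 11, 20),
    ("population 10", 25, 20),
    ("1000000 generations", 177, 200),
]

for label, observed, expected in runs:
    z = poisson_z(observed, expected)
    print(f"{label}: observed {observed}, expected {expected}, z = {z:+.2f}")
```

On this rough measure the population-1000 run is about two standard deviations below expectation, while the other two are nearer one to one and a half, which fits the "close, but flakey" impression without settling anything.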

[1] http://coalescent.freewebpage.org/popgen/gendrift4.py
[2] This sort of thing is why population geneticists don't do much fieldwork...
[3] After two years of the scariest computer projects on God's green earth, it is actually psychologically impossible for me to see data like this without trying to graph it out. I hold out hope that some day the scars will fade[4]
[4] If the computer course supervisor happens to be reading this, please note that the above was strictly humorous - the projects were great fun[5]
[5] Apart from the extremely dodgy function libraries you gave us, which in two cases made experienced programmers burst into laughter. But let's not go there
[6] This assumption may break down if, for example, the size of the population is increasing. I'll have to do more experiments at some point.
