The merits of models

(Population Genetics)

I've already written a couple of posts on the relative merits of various models. For example, I talked about the assumptions of the Hardy-Weinberg model and what happens if they're broken. I also talked briefly about how Dawg models genetic mutation more realistically than most other programs.

The model I've been looking at most lately is the neutral model of genetic drift, so it seems useful to have a look at some of the major problems with it.


Firstly, the obvious: no selection. To introduce the concept of selection, we're going to need to:
a) simulate the generation of phenotypes from genotypes (in particular, we'll need to consider dominance issues relating to competing alleles)
b) assign fitness values to those phenotypes

There are some difficulties with both of these. The issue of dominance can be, and effectively has been, solved by real-world research, so I won't discuss it here. The assignment of fitness, however, has major conceptual problems. In the real world, organisms don't wander around with a big neon sign over their heads proclaiming "this organism has fitness value 98" or whatever. Fitness is usually an emergent property of the organism's interactions with its environment - if you drop a pale-coloured moth into the middle of an industrial revolution, it instantly becomes less fit.
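To make steps a) and b) concrete, here's a minimal sketch of the moth example. Everything in it is an assumption for illustration: a single locus with two made-up alleles ("D" for dark, dominant over "d" for pale), two invented environments, and fitness numbers pulled out of thin air. The only point is that fitness ends up as a function of phenotype and environment together, not a fixed label on the organism.

```python
# Hypothetical one-locus example: allele "D" (dark) is completely dominant over "d" (pale).
def phenotype(genotype):
    """Step a): map a diploid genotype such as ("D", "d") to a phenotype."""
    return "dark" if "D" in genotype else "pale"

# Step b): made-up relative fitness values, keyed by environment and then phenotype.
FITNESS = {
    "clean_woodland": {"dark": 0.7, "pale": 1.0},
    "sooty_woodland": {"dark": 1.0, "pale": 0.7},
}

def fitness(genotype, environment):
    """Fitness depends on the phenotype AND on where the organism finds itself."""
    return FITNESS[environment][phenotype(genotype)]

for environment in FITNESS:
    for genotype in [("D", "D"), ("D", "d"), ("d", "d")]:
        print(environment, "".join(genotype), phenotype(genotype),
              fitness(genotype, environment))
```

The same pale moth ("dd") scores differently in the two environments, which is exactly why a single hard-coded fitness value per genotype is so unsatisfying.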

I'll leave off discussion of this for the moment, because I intend to go on about it at great length when discussing genetic algorithms. Just be aware that assigning fitness in any meaningful sense is a major pain in the unmentionables.


There's a second flaw with many of the conclusions of the neutral model, and it lies in an assumption I briefly mentioned earlier - that systems tend towards fixation, with one allele type ruling the roost. This assumption works fine if you have a system that proceeds from the first generation by reproduction alone. However, it breaks horribly if random mutation is thrown into the mix.
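For reference, here's roughly what "reproduction alone" looks like in code. This is a sketch under my own assumptions (a fixed-size haploid population, non-overlapping generations, each new copy picking a parent uniformly at random); the population size and seed are arbitrary.

```python
import random

def drift_until_fixation(population, rng, max_generations=100_000):
    """Resample the population each generation until only one allele type is left.

    Returns (generations_elapsed, fixed_allele).
    """
    for generation in range(max_generations):
        if len(set(population)) == 1:
            return generation, population[0]
        # Every copy in the next generation picks its parent at random: no selection,
        # no mutation, just sampling noise.
        population = [rng.choice(population) for _ in population]
    raise RuntimeError("no fixation within max_generations")

rng = random.Random(1)  # arbitrary seed so the run is repeatable
start = ["A"] * 10 + ["B"] * 10
generations, winner = drift_until_fixation(start, rng)
print(f"allele {winner} fixed after {generations} generations")
```

Without mutation, a lost allele type can never come back, so sooner or later one type takes over completely.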

Just think about how the poor allele type must feel. It's finally approaching fixation when boom! a mutated variant of itself appears out of nowhere. Then, just when it's got that mutation beaten, another appears. And another. And another. It must be like playing whack-a-mole!

OK, so anthropomorphising allele types is silly. But the point remains. This assumption about fixation - which is so central to our conclusions - breaks down horribly when the mutation rate gets too high[1].
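Here's a rough sketch of the whack-a-mole effect. It isn't a careful answer to "how high is too high?", just an illustration under assumptions of my own: the same random-resampling scheme as plain drift, plus a chance `mutation_rate` that each new copy is a brand-new allele type never seen before. The function name, parameters and rates are all hypothetical.

```python
import random
from itertools import count

def fraction_of_time_fixed(pop_size, mutation_rate, generations, rng):
    """Return the fraction of generations in which a single allele type is fixed."""
    new_allele = count(1)        # each mutation creates a never-before-seen type
    population = [0] * pop_size  # start with allele 0 already fixed
    fixed = 0
    for _ in range(generations):
        population = [
            next(new_allele) if rng.random() < mutation_rate else rng.choice(population)
            for _ in range(pop_size)
        ]
        fixed += len(set(population)) == 1
    return fixed / generations

rng = random.Random(42)  # arbitrary seed
for rate in (0.0, 0.001, 0.01, 0.1):
    share = fraction_of_time_fixed(pop_size=50, mutation_rate=rate,
                                   generations=2000, rng=rng)
    print(f"mutation rate {rate}: fixed in {share:.0%} of generations")
```

With a mutation rate of zero the population stays fixed forever; as the rate climbs, the time spent with a single allele type shrinks towards nothing.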

A new model

But that can't be right, surely? After all, large alleles go to fixation all the time in the human population. Don't they?

No, they don't. Well, a few do, but they're usually the ones that, when tinkered with, cause people to do an award-winning impression of Senator Kelly from the first X-Men film. Selection is responsible there. However, what can and does happen under genetic drift is for specific DNA bases to go to fixation. Alleles mutate too fast to fix, but the DNA itself mutates fairly slowly. So a better model is actually to say that each allele contains a large number of bases, with a slow rate of mutation per base.
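As a sketch of what that model might look like, here an allele is just a string of bases, and every base has a small independent chance of changing when the allele is copied. The allele length and per-base rate below are invented for illustration, not real measurements.

```python
import random

BASES = "ACGT"

def mutate(allele, per_base_rate, rng):
    """Copy an allele (a string of bases), changing each base with probability per_base_rate."""
    copied = []
    for base in allele:
        if rng.random() < per_base_rate:
            # A mutation always switches to one of the three other bases.
            base = rng.choice([b for b in BASES if b != base])
        copied.append(base)
    return "".join(copied)

# Hypothetical numbers: a 1000-base allele with a one-in-100,000 chance per base per copy.
length, per_base_rate = 1000, 1e-5
chance_allele_changes = 1 - (1 - per_base_rate) ** length
print(f"chance a given base changes in one copy:  {per_base_rate}")
print(f"chance the allele as a whole changes:     {chance_allele_changes:.5f}")

rng = random.Random(7)  # arbitrary seed
parent = "".join(rng.choice(BASES) for _ in range(20))
print(parent)
print(mutate(parent, 0.1, rng))  # exaggerated rate so a change or two is likely to show
```

The arithmetic is the whole point: a long allele changes *somewhere* far more often than any one of its bases changes, so individual bases can drift to fixation even when whole alleles never get the chance.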

Now, this new model actually has very interesting consequences. See, before this, we had no way of determining an allele's history. Say we had a system with just two allele types, A and B, and one of them mutated into a C - we'd have no way of knowing which one it was. But that's just changed - since new alleles will generally be very similar to the allele from which they mutated (usually differing by only a single base), we can make inferences about the historical relationships of the various extant alleles - their family tree.
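Here's a toy version of the A, B and C puzzle. The sequences are made up, and the method (count the differing bases and pick the nearest allele) is only a crude heuristic of mine to show the idea, not a proper tree-building technique.

```python
def differences(a, b):
    """Number of positions at which two equal-length alleles differ."""
    return sum(x != y for x, y in zip(a, b))

def closest_relative(name, alleles):
    """Return the name of the other allele with the fewest differing bases."""
    others = [other for other in alleles if other != name]
    return min(others, key=lambda other: differences(alleles[name], alleles[other]))

# Made-up sequences: C differs from A at one base, and from B at four.
alleles = {
    "A": "ACGTACGTAC",
    "B": "TTGTACCTAG",
    "C": "ACGTACGTAT",
}

for other in ("A", "B"):
    print(f"C vs {other}: {differences(alleles['C'], alleles[other])} differing bases")
print("C most likely mutated from", closest_relative("C", alleles))
```

Once alleles carry sequence information, "which one did C come from?" stops being unanswerable and becomes a matter of weighing the evidence.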

This family tree is called the Coalescent, and really deserves a post of its own. See you next time for more :)

[1] At some point I'll write a script to determine: how high is too high?