Random Genetic Drift

IF THERE IS one single concept that is crucial to our understanding of the situation in which the world of purebred dog breeders finds itself as the millennium turns over, I would think that concept must be random genetic drift. One hears the concept mentioned occasionally, yet I suspect that very few breeders really understand it thoroughly and correctly. Let us then try to explore this notion of random genetic drift. It is one of the fundamental theories of population genetics. In its full expansion it becomes rather complex mathematically, yet when reduced to its most basic form it is extremely simple and straightforward.

Population Size

TO BEGIN WITH, keep in mind that random genetic drift has its most powerful effect in small populations. What do we mean by "small" in this context? We mean those populations that are the furthest removed from infinite number. Consider bacteria: microscopic in size, present everywhere that the correct conditions of temperature, humidity, and pH exist, their populations probably number untold trillions or quadrillions -- pretty close to infinite. On the other side of things, consider whooping cranes, condors and similar rare and endangered species of wild birds: the entire known populations of such species may number something like ten to fifty individuals. Purebred dog breeds, obviously, lie somewhere in between; but many of them are statistically much closer to the rare birds than they are to the bacteria. A glance at The Canadian Kennel Club's registration figures for a recently reported six-month period shows that, of the 160 breeds reported, 54 of those breeds registered 5 or fewer litters over the six-month reporting period. Even the most numerous breed, the Labrador Retriever, registered only 825 litters and 4,552 individual dogs over the same period. Obviously purebred dog populations are a far cry from infinite numbers. Genetic drift, then, will have its greatest influence on exactly such populations as those of our dogs.

The population size factor is more influential yet than it first appears, due to the way in which purebred dog populations are bred. In many breeds the effective breeding population is a great deal smaller than the actual population figures would suggest. Unless the number of males and females in the breeding population remains equal, the discrepancy between the number of males and females that actually contribute to the production of each generation will limit the effective breeding population. This limitation can be quite dramatic in breeds, such as the German Shepherd Dog, in which a limited number of very popular stud dogs account for a large proportion of the litters born to a large population of brood bitches. As a rule of thumb, the effective breeding population cannot exceed four times the number of sires in use. In the case of the GSD, a breed which surely must number in the hundreds of thousands of individual animals worldwide, the effective breeding population has been calculated to be something like six hundred, due to the persistent overuse of popular stud dogs generation after generation.
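The rule of thumb about sires can be checked with a few lines of arithmetic. The sketch below assumes the standard textbook approximation for effective population size with unequal numbers of breeding males and females, Ne = 4 x Nm x Nf / (Nm + Nf). That formula is not quoted in this article, and the function name and the head counts are hypothetical, chosen only for illustration.

```python
def effective_population_size(sires: int, dams: int) -> float:
    """Approximate effective breeding population for an unequal sex ratio.

    Assumes the textbook approximation Ne = 4 * Nm * Nf / (Nm + Nf).
    """
    if sires <= 0 or dams <= 0:
        raise ValueError("need at least one sire and one dam")
    return 4 * sires * dams / (sires + dams)


# Hypothetical numbers, for illustration only.
print(effective_population_size(50, 50))    # equal sexes: Ne equals the head count of 100
print(effective_population_size(10, 1000))  # ten popular sires: Ne stays just under 4 * 10
```

However many brood bitches are added in the second case, the result creeps toward forty and never reaches it, which is the "four times the number of sires" ceiling.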

The Nature of Mammalian Reproduction

OUR UNDERSTANDING of random genetic drift must begin with the nature of mammalian reproduction. We all know that it takes two mammals to make more mammals: specifically, a male (whom dog breeders call the sire) and a female (whom they call the dam). The sire contributes one sperm cell per puppy; the dam contributes one egg cell or ovum per puppy. On the level of individual genes, or equally on that of the chromosomes upon which genes are found, the basics remain the same: at each gene location or locus, an individual mammal has two copies of the gene belonging to that locus, one copy on each of two paired chromosomes -- one chromosome (and one allele) from the sire and one from the dam. There may be more possible alleles (alternative versions of a single gene) than just two at any given locus, but no normal individual has more than two, one from each of the two paired chromosomes that carry that locus in the hereditary material (the DNA) of every cell in the animal's body. The two alleles may be identical (making the animal homozygous for that trait) or non-identical (making the animal heterozygous for that trait). Of the two gene copies that the sire possesses, he can contribute only one copy to each of his progeny; the same is true of the dam. Each parent gets one chance only to influence each genetic trait in each individual progeny, by contributing one gene at each locus, and thus every progeny has half its genetic heritage coming from its sire and half from its dam. Perhaps all this sounds as though I am elucidating what is already abundantly obvious to all of us, but we need to remind ourselves of these very basic facts, because what we have just described is a process of binomial sampling. 
Since the chromosomes and genes in the sperm and ova are created afresh from cells already present in the parent body, in such a way that the originals are not lost in the process and the total stock of gene "possibilities" is not diminished, it is actually a process of binomial sampling with replacement. This means that it's not like dealing out a deck of cards, in which everyone knows that once three aces have been dealt there can be only one ace remaining in the deck, so that the probabilities of a player receiving an ace change even as the cards are dealt out. By contrast, in the reproductive process the stock of genes is virtually infinite; the probabilities are not altered by the genes contained in ova or sperm already produced.

This means that gametic sampling, which is the correct term for the distribution to the progeny of the genes held by male and female, is a process closely analogous to that of flipping a coin to see if it lands "tails" or "heads." That, too, is a process of binomial sampling with replacement, since there are only two possibilities and the probabilities at each toss remain unaltered by previous tosses. Let's emphasise that, because people sometimes assume intuitively that if you toss "heads" nine times running, you must have a 90 percent or better chance of tossing "tails" on the next throw. Wrong! No matter how many times you toss "heads" (providing the coin is a "fair coin" with no weighting to make it tend to fall one way more often than another), the probabilities on the next toss always remain fifty/fifty.
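The difference between the deck of cards and the gene pool can be put in numbers. This is only a small sketch: the card case assumes that exactly three cards have been dealt and all three were aces, and the variable names are mine.

```python
from fractions import Fraction

# Dealing cards is sampling WITHOUT replacement: the odds shift as cards leave the deck.
aces_in_deck, cards_in_deck = 4, 52
p_ace_first_card = Fraction(aces_in_deck, cards_in_deck)
# After three aces have been dealt, one ace remains among 49 cards.
p_ace_after_three_aces = Fraction(aces_in_deck - 3, cards_in_deck - 3)

# Gametic sampling is WITH replacement: a heterozygous (Bb) sire passes on "b"
# with the same probability to every puppy, whatever earlier puppies received.
p_b_each_puppy = Fraction(1, 2)
p_b_after_nine_B_puppies = p_b_each_puppy  # unchanged by the earlier draws

print(p_ace_first_card, p_ace_after_three_aces)   # 1/13 1/49
print(p_b_each_puppy, p_b_after_nine_B_puppies)   # 1/2 1/2
```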

If you toss a fair coin one thousand times, you have every reason to expect that the final tally will be rather close to 500 heads and 500 tails. All right, you wouldn't be too surprised to see 507 heads and 493 tails, and the actual result would still be a good fit to the theoretical expectation of 500/500. Instead of 50%/50%, the actual results would then be 50.7%/49.3% -- close enough to satisfy most of us.

But if you only tossed the coin ten times, the actual result could turn out to be 7 heads and 3 tails, which would not be a good fit at all to our theoretical expectation of 5/5, because percentage-wise it would be not 50%/50% but 70%/30%! The error involved in this example is called sampling error. It is a basic principle of random statistical sampling processes of this kind that the smaller the sample involved, the greater is the potential for deviation from the theoretically expected probabilities.
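How much more room for sampling error a small sample leaves can be worked out exactly. The sketch below uses nothing but the binomial arithmetic for a fair coin; the function names are hypothetical.

```python
from math import comb, sqrt


def prob_at_least(heads: int, tosses: int) -> float:
    """Exact probability of getting `heads` or more heads with a fair coin."""
    favourable = sum(comb(tosses, k) for k in range(heads, tosses + 1))
    return favourable / 2 ** tosses


def std_dev_of_proportion(tosses: int, p: float = 0.5) -> float:
    """Standard deviation of the observed proportion of heads."""
    return sqrt(p * (1 - p) / tosses)


# A 70% (or larger) share of heads: fairly common in 10 tosses...
print(prob_at_least(7, 10))          # 0.171875
# ...but vanishingly unlikely in 1000 tosses.
print(prob_at_least(700, 1000))
# Typical size of the deviation from 50% in each case.
print(std_dev_of_proportion(10))
print(std_dev_of_proportion(1000))
```

Seven or more heads in ten tosses turns up about one time in six, while the typical deviation in a thousand tosses is ten times smaller than in ten.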

Up to this point everything I've explained has probably been quite obvious and by now you may well think I'm insulting your intelligence, but at this point when we stop talking about coin tossing and go back to sampling genes, important things start to happen which, although equally logical, are by no means quite so obvious to everyone.

Allele Frequency and How It Changes

POPULATION GENETICISTS speak of the allele frequency of a particular version of a gene at a given locus within a finite population. Let's create an imaginary example in which we have a population of ten dogs, five males and five females. We'll call the population number "N" (N = 10), and consider the gene locus "B" which we'll say influences the pigment colouring the noses of the dogs in that population and has two competing versions, upper-case "B" representing dominant black pigment and lower-case "b" representing recessive liver pigment. Since every individual has two copies of the B-locus gene, there will be a total of twenty copies of the nose-color gene (2N) in the total population. If there are 14 copies of version "B" and 6 copies of version "b" then geneticists say that the allele frequency (denoted by the letter F) of B = 0.7 and the allele frequency of b = 0.3. (The total must always add up to unity -- 1.0.) Notice that this measurement of allele frequency says nothing about how many dogs actually have black noses and how many have liver noses. It's possible that all ten dogs might have black noses; it's also possible that as many as three of them could have liver noses. That would depend upon the distribution of the two competing versions among the individual animals. In either case the allele frequency remains the same, since we are only talking about the total number of copies of each competing version within the entire population, not about how they are distributed among the individual dogs.
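The counting involved is simple enough to write out. In this sketch each dog is represented by a two-letter genotype string; that representation, the function name, and the two particular ten-dog populations are my own assumptions, picked to match the 14/6 split in the example.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Count allele copies across all dogs and return each allele's share of the 2N total."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


# Two hypothetical ways of distributing 14 "B" and 6 "b" copies among ten dogs.
all_black_noses = ["BB"] * 4 + ["Bb"] * 6     # no liver noses at all
three_liver_noses = ["BB"] * 7 + ["bb"] * 3   # three liver-nosed dogs

print(allele_frequencies(all_black_noses))    # {'B': 0.7, 'b': 0.3}
print(allele_frequencies(three_liver_noses))  # {'B': 0.7, 'b': 0.3}
```

Both populations give the same allele frequencies even though one has no liver-nosed dogs and the other has three.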

If there are just so many copies of each allele within the population, 14 copies of B and 6 of b, then it's reasonable to ask whether these proportions (0.7/0.3, or 70%/30%) are likely to change, and if so, what might cause such a change. In order to answer that, we must look briefly at a fundamental principle of population genetics, called the Hardy-Weinberg Principle. Among other things, this principle states that allele frequencies will remain stable within a population over successive generations provided that certain conditions exist. Among these preconditions for stable allele frequencies are: non-overlapping generations, random mating, very large population size, no "migration" (movement of individuals into or out of the population), and no selection pressure. None of those preconditions are met in purebred dog populations! So we have at least five factors present that are potentially capable of changing allele frequencies in our imaginary example, as well as in our purebred dog breed populations in the real world.
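Why the frequencies stay put under those ideal conditions can be shown in two lines. This is the standard textbook derivation, not something worked out in this article; the symbols p and q (the frequencies of "B" and "b") and the prime marking the next generation are notation assumed here.

```latex
% Genotype proportions after one round of random mating, with p + q = 1:
\[ \text{BB}: p^2, \qquad \text{Bb}: 2pq, \qquad \text{bb}: q^2 \]
% Frequency of B in the next generation: all of BB's copies plus half of Bb's.
\[ p' = p^2 + \tfrac{1}{2}(2pq) = p(p + q) = p \]
```

With the example's p = 0.7 and q = 0.3, the expected genotype proportions are 0.49, 0.42 and 0.09, and the frequency of "B" in the next generation is again 0.7.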

Of those factors for change, the most potent are small population size, non-random mating, and extreme artificial selection pressure. Small population size is the most basic, because it readily generates the kind of sampling error that we saw in the ten-time coin-tossing exercise. We saw that it was easy to wind up with a 7/3 result even though the probabilities were 50/50. In our example, however, the probabilities at the outset are 70/30 due to the greater abundance of copies of "B". It is possible that the relative abundance of the two versions might remain the same from one generation to the next, but the same kind of random factors that influenced our small-sample coin toss could easily operate to make the relative allele frequencies of "B" and "b" change from 0.7/0.3 in the parental generation to 0.8/0.2 in the first filial generation, or they might operate in the other direction to change the frequencies to 0.6/0.4, too. Neither result would be difficult to imagine.
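Just how easily the frequencies can shift in one generation can be estimated. The sketch below assumes an idealised model in which each of the twenty gene copies in the next generation is an independent draw, with replacement, from the parental gene pool. Real matings are more constrained than that, so the figures are indicative only; the names are hypothetical.

```python
from math import comb


def prob_next_generation_count(copies: int, total: int, p: float) -> float:
    """Binomial probability that exactly `copies` of `total` gene copies are allele B."""
    return comb(total, copies) * p ** copies * (1 - p) ** (total - copies)


TOTAL_COPIES = 20   # 2N for the ten-dog example
P_B = 0.7           # frequency of "B" in the parental generation

# Chance of landing exactly on 0.6, 0.7 or 0.8 in the first filial generation.
for copies in (12, 14, 16):
    share = copies / TOTAL_COPIES
    chance = prob_next_generation_count(copies, TOTAL_COPIES, P_B)
    print(f"F(B) = {share:.1f}: probability {chance:.3f}")

# Chance that the frequency does not stay at exactly 0.7.
p_unchanged = prob_next_generation_count(14, TOTAL_COPIES, P_B)
print(f"chance the frequency changes at all: {1 - p_unchanged:.3f}")
```

Under this model an unchanged frequency is the single most likely outcome, yet it is still much less likely than some change or other.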

Then, when we bred the second filial generation, we would start from a new distribution of allele frequencies. The results, of course, would not be influenced by the previous draw in the genetic lottery any more than one toss of a coin is influenced by the previous one. One possibility is that the relative frequencies might move back to the original distribution. Another is that they would remain unchanged; or again, they might move further still in the same direction, say to 0.9/0.1 or to 0.5/0.5. Already we can see two alternative scenarios emerging from this random binomial sampling process. In the first scenario, an allele that was only relatively uncommon gradually becomes really rare. In the second, the same relatively uncommon allele becomes more widely distributed. Let us suppose that in the third filial generation, the relative allele frequencies change yet again. If they were 0.9/0.1 in the second generation, then it would be quite possible that in the third generation they might become 1.0/0.0, in which case the minority allele would have disappeared completely from the population! If they were 0.5/0.5, they might easily become 0.4/0.6, in which case the allele which began as a minority presence of 30% would have progressed to a majority of 60%.

In the real world, of course, it is not likely that such extreme changes of allele frequency would occur in just three generations, but in such a small population it is definitely quite possible for the minority "b" allele with its initial frequency of 0.3 to become either lost (frequency 0.0) or fixed (frequency 1.0) within four to eight generations, solely through the operation of sampling error. Let's notice two things about this possibility. The first is that the breeding history of most purebred dog populations in a closed stud book already involves twenty to thirty or more canine generations. Many breeds have existed within the closed registries of The Canadian Kennel Club and The American Kennel Club for sixty to one hundred years or more; average generation times vary from two to six years. So there has already been plenty of time for alleles present in the original founder population to become either fixed (allele frequency = 1.0) or lost (allele frequency = 0.0) solely through the operation of sampling error.
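Anyone who wants to watch drift run its course can simulate it. The sketch below again assumes the idealised model in which all twenty gene copies of each generation are drawn independently, with replacement, from the previous generation's gene pool, with no selection and no chosen matings. The function name, the seed and the number of runs are arbitrary choices of mine.

```python
import random


def generations_until_fixed_or_lost(total_copies: int, start_copies: int,
                                    rng: random.Random, max_generations: int = 10_000):
    """Resample a gene pool each generation until the tracked allele is lost or fixed.

    Returns (generations elapsed, final frequency of the tracked allele).
    """
    copies = start_copies
    for generation in range(1, max_generations + 1):
        frequency = copies / total_copies
        # Draw every gene copy of the next generation independently (with replacement).
        copies = sum(rng.random() < frequency for _ in range(total_copies))
        if copies == 0 or copies == total_copies:
            return generation, copies / total_copies
    return max_generations, copies / total_copies


rng = random.Random(1999)  # fixed seed so repeated runs give the same numbers
# Track the "b" allele: 6 copies out of 20, as in the ten-dog example.
runs = [generations_until_fixed_or_lost(20, 6, rng) for _ in range(2000)]

lost = sum(1 for _, freq in runs if freq == 0.0)
fixed = sum(1 for _, freq in runs if freq == 1.0)
within_eight = sum(1 for gens, _ in runs if gens <= 8)
print(f"'b' lost in {lost} runs, fixed in {fixed} runs, out of {len(runs)}")
print(f"resolved within eight generations: {within_eight} runs")
```

Every run ends with "b" either gone or universal; the only open questions are which way it goes and how many generations it takes.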

The other thing to notice is that sampling error is not the only factor that can influence allele frequency. The way purebred dogs are bred almost invariably involves non-random mating and artificial selection. Random mating would imply that every male and female that survive to adulthood have more or less an equal chance of mating and making their contribution to the next generation. In the purebred dog world this is far from being the case. Breeders consciously choose the sire and dam to produce each new litter. When a litter is born, the individuals from that litter that will ultimately produce the next generation are always carefully selected, and in their turn mates for them will be carefully chosen by the breeder. Thus each generation involves a fairly sharp cutback of the total available population to determine the actual breeding population.

Moreover, dog breeders vociferously insist that they select consistently for particular traits, generation after generation. In the case of show dog breeders, traits that are made the object of such selective breeding often turn out to be recessive traits genetically. In order for a fully recessive trait to be expressed in the phenotype, the actual physical expression of genes in an individual animal, that trait must be homozygous, that is to say, the individual must have two identical copies of the recessive gene. In the case of the nose-color gene we used for our example, only the dogs whose genotype is "bb" will have liver noses. The "Bb" heterozygotes will be black-nosed, as will the homozygous "BB" individuals. By this simple fact hangs a long and pregnant tale of dog-breeding!
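The relation between genotype and nose colour, and the way a recessive can hide in black-nosed parents, can be spelled out in a few lines. This sketch assumes the simple one-locus, fully dominant model used in the example; the function names are hypothetical.

```python
from collections import Counter
from itertools import product


def nose_colour(genotype: str) -> str:
    """Liver only when both copies are the recessive "b"; otherwise black."""
    return "liver" if genotype == "bb" else "black"


def cross(sire: str, dam: str) -> Counter:
    """Enumerate the equally likely offspring genotypes of one mating."""
    # Sort each pair so that "bB" and "Bb" are counted as the same genotype.
    return Counter("".join(sorted(pair)) for pair in product(sire, dam))


# Two black-nosed carriers mated together.
offspring = cross("Bb", "Bb")
print(sorted(offspring.items()))    # [('BB', 1), ('Bb', 2), ('bb', 1)]

# Tally the same four equally likely outcomes by visible nose colour.
phenotypes = Counter()
for genotype, n in offspring.items():
    phenotypes[nose_colour(genotype)] += n
print(sorted(phenotypes.items()))   # [('black', 3), ('liver', 1)]
```

On average one puppy in four from such a mating is liver-nosed, although both parents are black-nosed.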

Random Genetic Drift -- The Breeder's Hidden Enemy Copyright ©1999 J. Jeffrey Bragg
