Evolutionary Analysis

Chapter 5: Activity

The Evolution of Human Resistance to Malaria
by Juliette Winterer, Franklin and Marshall College

In this case study, you will evaluate data that reveal the relationship between the frequency of a rare hemoglobin allele in human populations and resistance to malaria, which today is the third most deadly infectious disease in the world. Several mechanisms of evolution come into play in this story: mutation, migration, and natural selection. All of these mechanisms shed light on the intricate interrelationship between humans and this disease. You will be given several datasets and several analytical techniques, and you will have the opportunity to evaluate the nature of the interaction between humans and malaria. By answering a series of questions, you will discover whether there is evidence for ongoing evolution in human populations.

Human variation in hemoglobin [1]
Hemoglobin, the molecule that carries oxygen in the red blood cells of human children and adults, is a protein made of two alpha subunits and two beta subunits. The genes that encode the alpha and beta subunits are both known to have several alleles. Many of these alleles derive from point mutations in the DNA sequence that lead to single amino-acid substitutions in the protein. Most of the mutant alleles are very rare, because they reduce the efficiency of oxygen transport by red blood cells.

In sub-Saharan Africa, there is a surprisingly high frequency of an allele that produces an abnormal form of the β-subunit of the hemoglobin molecule [2]. The allele is the result of a single point mutation in the coding region of the β-subunit gene. This mutation results in a substitution of valine for glutamic acid in position six of the β-subunit. This substitution alters the shape of the hemoglobin molecule. Hemoglobin molecules containing altered β-subunits will crystallize under conditions of low oxygen tension. When this happens, the red blood cells that carry the hemoglobin molecules change shape drastically. Instead of their normal disc shape, the red blood cells become sickle-shaped (see figure 4.1 on page 105 in your textbook).

We will use the symbol A to represent the allele for the normal form of the β-subunit, and the symbol S to represent the mutant form responsible for cell sickling. The three possible genotypes in a population containing both alleles are AA, AS, and SS. People with the genotype AA make normal hemoglobin and have a normal phenotype. People with genotype SS produce hemoglobin that is highly prone to sickling, and as a result suffer from anemia caused by poor oxygen delivery by the sickle-shaped cells in the blood. Furthermore, periodic and painful crises can lead to serious organ damage. This condition, known as sickle-cell anemia, is often fatal. Many SS homozygotes die before the age of reproduction [3]. People with the genotype AS produce a mixture of normal and abnormal hemoglobin. For the most part, the production of normal hemoglobin compensates for the production of mutant hemoglobin, and the heterozygotes do not suffer from sickle-cell anemia (although they are, of course, carriers of the sickle-cell trait). In other words, allele S is recessive to A.

Question: What are the expected frequencies of alleles A and S in a population containing both?

We said above that in sub-Saharan Africa the frequency of allele S, which is greater than 0.12 in some areas, is surprisingly high. The basis for this claim is a calculation of what the frequencies of S and A ought to be, given the information we have just reported about the phenotypes associated with each of the three possible genotypes. We can perform this calculation using tools developed in section 5.3 of the textbook.

We will use the variable p to represent the frequency of allele A, and q to represent the frequency of S. Box 5.3 in your textbook (page 131) derives the following expression for Δp, the change from one generation to the next in the frequency of allele A:

$$\Delta p = \frac{p}{\bar{w}}\left(p\,w_{AA} + q\,w_{AS} - \bar{w}\right) \qquad (1)$$

where $w_{AA}$ is the fitness of AA individuals, $w_{AS}$ is the fitness of AS individuals, and $\bar{w}$ is the average fitness for the whole population, equal to $p^2 w_{AA} + 2pq\,w_{AS} + q^2 w_{SS}$. The fitness of individuals with a particular genotype is their average lifetime reproductive success, which is a function of the probability that they will survive to adulthood and the number of offspring they will have if they do survive.

Based on the information provided above, we can assign fitnesses to the three hemoglobin genotypes as follows:

Genotype    AA         AS         SS
Fitness     wAA = 1    wAS = 1    wSS = 1 - s

Individuals with genotypes AA and AS have high and essentially equal fitness, whereas SS individuals have fitness that is lower by some increment s, mostly due to their reduced probability of surviving to adulthood.

Given these fitnesses, we are interested in whether the frequency of allele A will rise or fall from one generation to the next. Notice that the value of Δp, as calculated by equation 1, will be positive whenever

$$p\,w_{AA} + q\,w_{AS} - \bar{w}$$

is greater than zero. Substituting the full expression for $\bar{w}$ and the values $w_{AA} = 1$, $w_{AS} = 1$, and $w_{SS} = 1 - s$ from the table above gives:

$$p + q - \left[p^2 + 2pq + q^2(1 - s)\right] = 1 - \left(1 - q^2 s\right) = q^2 s$$

This last quantity, $q^2 s$, will be greater than zero anytime q is greater than zero (and s is greater than zero). Our calculation shows that whenever allele S is present in the population, the frequency of allele A should rise until the frequency of allele S is zero and the frequency of allele A is one. Another way to reach this conclusion is to set the expression for Δp equal to 0, and solve for p. This gives the equilibrium value of p, also known as $\hat{p}$. Note that Δp, as calculated in equation 1, is equal to zero whenever p is equal to zero or $p\,w_{AA} + q\,w_{AS} - \bar{w}$ is equal to zero. As an exercise, readers may want to solve the expression

$$p\,w_{AA} + q\,w_{AS} - \bar{w} = 0$$

for p, to show that $\hat{p} = 1$.

Thus, we expect the frequency of the A allele to be one, and the frequency of the S allele to be zero, regardless of s, the strength of selection against SS homozygotes.
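The same conclusion can be checked numerically. The following Python sketch iterates equation 1 with the fitnesses wAA = wAS = 1 and wSS = 1 - s. The function names, the starting frequency of 0.12 for S, and the choice s = 1 are illustrative assumptions, not values prescribed by the textbook.

```python
def delta_p(p, w_aa, w_as, w_ss):
    """Change in the frequency of allele A over one generation (equation 1).

    Undefined when mean fitness is zero (for example p = 0 with w_ss = 0).
    """
    q = 1.0 - p
    w_bar = p * p * w_aa + 2 * p * q * w_as + q * q * w_ss  # mean fitness
    return (p / w_bar) * (p * w_aa + q * w_as - w_bar)


def simulate(p0, s, generations):
    """Iterate equation 1 with w_AA = w_AS = 1 and w_SS = 1 - s."""
    trajectory = [p0]
    for _ in range(generations):
        p = trajectory[-1]
        trajectory.append(p + delta_p(p, 1.0, 1.0, 1.0 - s))
    return trajectory


if __name__ == "__main__":
    # Assumed starting point: S at 0.12, so p0 = 0.88; s = 1 means SS never reproduce.
    traj = simulate(p0=0.88, s=1.0, generations=1000)
    for gen in (0, 10, 100, 1000):
        print(f"generation {gen:4d}: frequency of A = {traj[gen]:.4f}")
```

In this sketch the frequency of A rises every generation, but the approach to one is slow once S is rare, because most remaining copies of S are then carried by AS heterozygotes, which selection does not act against.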

Because the frequency of the S allele in sub-Saharan Africa is much greater than zero, there must be some process competing with natural selection against the S allele that maintains the S allele in the population.

Question: Is that process mutation?

The rate of mutations converting allele A to allele S would have to be $10^{-4}$ or higher to suggest that a balance between selection and mutation is maintaining the S allele at a frequency of just 0.03 in a population (see textbook pages 145-147, and Hartl and Clark 1997 [4]). This mutation rate is extraordinarily high, and the frequency of the S allele is considerably higher than 3% in many parts of Africa.
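As a rough check on the scale of this number, the standard approximation for a fully recessive deleterious allele held in mutation-selection balance is q ≈ sqrt(μ/s), where μ is the mutation rate from A to S. The Python sketch below inverts that relationship. The approximation, the function names, and the assumption s = 1 are supplied here for illustration; the text itself only cites the textbook and Hartl and Clark for the result.

```python
import math


def mutation_rate_needed(q_hat, s):
    """Mutation rate mu that would hold a recessive allele at frequency q_hat.

    Rearranges the approximation q_hat = sqrt(mu / s) to mu = s * q_hat**2.
    """
    return s * q_hat ** 2


def equilibrium_frequency(mu, s):
    """Approximate equilibrium frequency of a recessive allele: sqrt(mu / s)."""
    return math.sqrt(mu / s)


if __name__ == "__main__":
    # Assumption: SS individuals never reproduce (s = 1).
    mu = mutation_rate_needed(q_hat=0.03, s=1.0)
    print(f"required mutation rate: {mu:.1e}")
```

Under these assumptions the required rate is about 9 × 10^-4 per generation, which falls within the "10^-4 or higher" range quoted above.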

Question: Why is S in high frequency in African populations when there is clear natural selection against SS individuals?

The biology of malaria [5]
Malaria, a disease that can cause debilitating anemia and potentially fatal brain blood clots in people, is caused by four species of protozoa in the genus Plasmodium. By far the most pernicious of the four is Plasmodium falciparum, which is common in sub-Saharan Africa. The plasmodium is transmitted from one person to another by way of a vector, the mosquito. Several tropical mosquito species are capable of transmitting the plasmodium, and as a result, the disease is endemic to many tropical areas. In Africa, where 90% of the world's malaria cases exist, the mosquito Anopheles gambiae is the main vector for the plasmodium. Mosquitoes transmit the disease by drawing blood from an infected person. With the blood, they also draw up plasmodia. These plasmodia live in the gut of the mosquito, where they reproduce sexually, and move to the mosquito's salivary gland. When the mosquito then draws blood from an uninfected person, the plasmodium is transferred to that person in the saliva of the mosquito. The plasmodia invade the human bloodstream and quickly move to the liver, where they multiply by asexual reproduction. The offspring of these invaders move back into the blood and inhabit red blood cells. Further asexual reproduction causes the red blood cells to burst, which causes the severe symptoms of the disease.

The Relationship between S and malaria
In 1949, J. B. S. Haldane [6] suggested that the deleterious S allele occurs in high frequency in some human populations because individuals who are heterozygous for the allele (genotype AS) not only escape the severe anemia caused by cell sickling but also enjoy some resistance to malaria. He based this proposition solely on the observation of higher than expected frequencies of allele S in regions where malaria is endemic.

We saw above that some evolutionary force must counteract natural selection against the deleterious recessive S allele. We also concluded that that force was not likely to be mutation balancing against selection. Haldane suggested that that force was natural selection for resistance to malaria. If individuals with genotype AA are less resistant to malaria, their fitness will be lowered by some fraction, t, relative to individuals with the genotype AS. Of course, individuals with genotype SS still suffer from sickle-cell anemia, so their fitness is still lowered by some fraction, s. Thus we can represent the fitnesses of the three genotypes in a population as follows:

Genotype    AA             AS         SS
Fitness     wAA = 1 - t    wAS = 1    wSS = 1 - s

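With these fitness values, is there now an equilibrium value for p (and for q) somewhere between zero and one? Looking back at equation 1, we know that Δp = 0 when

$$p\,w_{AA} + q\,w_{AS} - \bar{w} = 0$$

Substituting the full expression for $\bar{w}$ and the new fitnesses $w_{AA} = 1 - t$, $w_{AS} = 1$, and $w_{SS} = 1 - s$ gives

$$p(1 - t) + q - \left[p^2(1 - t) + 2pq + q^2(1 - s)\right] = 0$$

Simplifying this equation reduces it to

$$p^2 t - pt + q^2 s = 0$$

Rearranging terms and substituting (1 - p) for q gives

$$pt - p^2 t = (1 - p)^2 s$$

Factoring (pt) out of the left hand side gives

$$pt(1 - p) = (1 - p)^2 s$$

This equation can be solved for p, which gives

$$\hat{p} = \frac{s}{s + t}$$

Finally, we can substitute (1 - q) for p to get

$$\hat{q} = \frac{t}{s + t} \qquad (2)$$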

In other words, the equilibrium frequency of S in the population is a function of the competing strengths of the two opposing forces of natural selection: selection against SS individuals because of sickle-cell anemia, and selection against AA individuals because of susceptibility to malaria. Through this process, wherein the heterozygote has a fitness advantage over either homozygote, the frequency of S may remain at levels seen in sub-Saharan Africa, even if the fitness of SS homozygotes is extremely low.
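To see this balance at work, the Python sketch below applies the standard one-generation selection recursion for q (equivalent to equation 1, since Δq = -Δp) with fitnesses 1 - t, 1, and 1 - s, and compares the result with the closed-form equilibrium t/(s + t). The selection coefficients s = 0.8 and t = 0.2, the starting frequencies, and the function names are hypothetical choices made only for illustration.

```python
def next_q(q, s, t):
    """Frequency of S after one generation with fitnesses AA = 1 - t, AS = 1, SS = 1 - s."""
    p = 1.0 - q
    w_bar = p * p * (1.0 - t) + 2 * p * q + q * q * (1.0 - s)  # mean fitness
    # S alleles come from AS heterozygotes (half their alleles) and SS homozygotes.
    return (p * q + q * q * (1.0 - s)) / w_bar


def equilibrium_q(s, t):
    """Closed-form equilibrium frequency of S: t / (s + t)."""
    return t / (s + t)


def iterate_q(q0, s, t, generations):
    """Apply next_q repeatedly, starting from frequency q0."""
    q = q0
    for _ in range(generations):
        q = next_q(q, s, t)
    return q


if __name__ == "__main__":
    s, t = 0.8, 0.2  # hypothetical selection coefficients
    print(f"predicted equilibrium: {equilibrium_q(s, t):.4f}")
    for q0 in (0.01, 0.5, 0.9):
        q_final = iterate_q(q0, s, t, 500)
        print(f"start at q = {q0}: after 500 generations q = {q_final:.4f}")
```

Whether S starts rare or common, the simulated frequency settles at the same intermediate value, which is what distinguishes heterozygote advantage from the purely recessive case considered earlier.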

Question: What does the selective advantage of having genotype AS have to be in order to see the observed frequency of S (as high as 0.15) in sub-Saharan African populations? (How high does t have to be if s = 1, and the frequency of S is 0.15 in a population?)

Assume that s = 1 (that is, that all SS individuals die before reproducing). Use equation 2 to calculate the predicted value of t, the strength of selection for resistance to malaria, in a population in which the frequency of S is 0.15.

Question: Do the data support Haldane's prediction that individuals with the S allele have greater resistance to malaria than AA individuals?

It was not until the 1960s that data capable of addressing this prediction became available. Below you will find the original data [7] that were used to test Haldane's hypothesis. They are a compilation of 10 independent studies on the incidence of severe plasmodium infection among children with or without the sickle-cell allele.

Examine the data, and show whether the evidence supports or rejects the hypothesis that the S allele confers resistance to infection by the plasmodium that causes malaria. (Most of the individuals in the category 'with S allele' were heterozygotes, but a few were SS homozygotes.)

Method for Analysis:
For each of the ten populations, calculate the frequency of infection among people with the S allele, and among people without. If Haldane's hypothesis is correct, then there should be higher rates of infection among AA individuals than among AS or SS individuals. Does this appear to be true?

Could we have gotten this result by chance, even if the S allele has no effect on resistance to malaria? Answering this question requires a statistical test. Following the strategy outlined in Box 8.1 (pages 260-261 of your textbook), we can conduct a test as follows:

1) Specify a null hypothesis. In this case, our null hypothesis is that there is no association between genotype and resistance to malaria.

2) Calculate a test statistic. We will take as our test statistic the number of populations in our set of ten for which the rate of infection is higher among AA individuals than among AS or SS individuals.

3) Determine the probability that chance alone could have made the test statistic as high as it is. Under the null hypothesis that genotype does not affect resistance to malaria, determining whether the frequency of infection is higher in individuals with or without the S allele is like tossing a coin. The true frequency of infection should be the same in both groups, with the group that appears to have the higher frequency in any given study determined by chance. If we toss a coin ten times and count the number of heads, we could get a number anywhere from 0 to 10. The graph below shows the probability of each possible result. The table shows the total probability of getting 6 or more heads, 7 or more heads, and so on.

4) Decide whether the outcome is statistically significant. If the value of the test statistic is 9 or 10, then we can consider our result to be significant. In other words, we can conclude that the data demonstrate that having at least one copy of the S allele makes a person more resistant to malaria.
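The tail probabilities used in steps 3 and 4 can be computed directly from the binomial distribution. The Python sketch below does so for ten tosses of a fair coin; the function name is an illustrative choice, and the 0.05 significance threshold mentioned afterwards is the usual convention, assumed here rather than stated in the text.

```python
from math import comb


def prob_at_least(k, n=10):
    """Probability of k or more heads in n tosses of a fair coin."""
    favorable = sum(comb(n, i) for i in range(k, n + 1))
    return favorable / 2 ** n


if __name__ == "__main__":
    for k in range(6, 11):
        print(f"P({k} or more heads out of 10) = {prob_at_least(k):.4f}")
```

The probability of 9 or more heads is 11/1024, about 0.011, whereas the probability of 8 or more is 56/1024, about 0.055. Only a test statistic of 9 or 10 therefore falls below the conventional 0.05 threshold, which matches the cutoff in step 4.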

Question: What kind of data would show evidence that resistance to malaria is the mechanism by which natural selection maintains a high frequency of the S allele in human populations?

Now that we have compelling evidence that the sickle cell allele protects individuals from infection by the plasmodium that causes malaria, we can test the hypothesis that it is the process of natural selection that explains high frequencies of this allele in human populations from regions where malaria is endemic.

Approaches and Data:
Approach 1: Examine historical data showing an increase in the frequency of the allele with the appearance of endemic malaria.

We have no historical record of human hemoglobin allele frequencies prior to the emergence of malaria as an endemic tropical disease. In fact, data suggest that the plasmodium that causes malaria may have been around as long as the genus Homo (Cavalli-Sforza et al. 1994). However, research now in progress may address this question. In our DNA are the footprints of natural selection. When balancing selection occurs, it leaves characteristic traces in the variability of the DNA sequence surrounding the locus under selection (see textbook chapter 18, pages 612-617, and chapter 13, pages 446-450). Michael Nachman at the University of Arizona and his colleagues are now analyzing the sequence variation surrounding hemoglobin genes to look for evidence of a history of balancing selection on the S allele.

Approach 2: Examine correlative data showing an association of high frequencies of the allele in regions of endemic malaria.

We have maps of the distribution of the S allele in Africa, as well as maps of the distribution of malaria prior to the 1950s, when global intervention against the disease began. These maps [8] show that there is some correspondence between high frequency of S and endemic malaria. The maps, however, do not correspond perfectly, and we can only conclude that these patterns are not inconsistent with the hypothesis that malaria-resistance and the sickle-cell trait are causally related.

Approach 3: Evaluate experimental data [9] showing that when malaria is removed, the frequency of the S allele declines.

Several hundred years ago, African populations were forcibly removed from Africa, brought to the United States, and sold into slavery. Some of these people moved to the Caribbean, and their descendants have lived there ever since. Some Caribbean regions have endemic malaria and some do not. Thus, the setting for a "natural experiment" exists. We can evaluate the frequency of S in populations that originated in Africa, where malaria was endemic, and ask whether the frequency of S has remained high among people who were removed from Africa but relocated to malarial regions. We can also ask whether the frequency of S has declined among Caribbean peoples living in regions without malaria.

Analyze these data and draw your own conclusions. One method for analyzing the data is as follows:

First, calculate the frequency of the S allele in each of the 11 populations. For this calculation, assume that SS individuals are rare, and that to a first approximation all individuals in the category "SS or AS" are heterozygotes. Does it appear to be the case that the frequency of the S allele has fallen where malaria is absent?
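The allele-frequency calculation described above amounts to dividing the number of carriers by twice the number of people sampled. A minimal Python sketch follows; the function name and the counts in the usage example are hypothetical and are not taken from the data table.

```python
def s_allele_frequency(carriers, total):
    """Frequency of S, assuming every carrier is an AS heterozygote.

    Each person carries two copies of the gene, so a sample of `total`
    people holds 2 * total alleles, of which `carriers` are S.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    return carriers / (2 * total)


if __name__ == "__main__":
    # Hypothetical counts: 60 people in the "SS or AS" category out of 500 sampled.
    print(s_allele_frequency(carriers=60, total=500))
```

Because a few carriers are in fact SS homozygotes, this first approximation slightly underestimates the true frequency of S.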

Now perform a statistical test as follows (this test is Wilcoxon's rank sum test, described in many basic statistics books):

1) Take as the null hypothesis the claim that the frequency of the S allele is the same in regions where malaria is present as in regions where malaria is absent.

2) Calculate a test statistic as follows. Arrange all 11 populations in a list, in order of their frequency of the S allele. The population with the lowest frequency should be first; the population with the highest frequency should be last. The position of each population in the list is its rank. Take the sum of the ranks of the four populations from areas without malaria. For example (not the actual result), if Curaçao is first, St. Vincent seventh, Dominique fourth, and Barbados ninth, then the sum of their ranks is 21.

3) Under the null hypothesis, our procedure for calculating the test statistic is just like picking at random four numbers between 1 and 11 (such that no number is picked twice), then adding them up. The answer could be anything from 1 + 2 + 3 + 4 = 10 to 8 + 9 + 10 + 11 = 38. The graph below shows the probability of each possible result. The table shows the total probability of getting a sum of 16 or smaller, 15 or smaller, and so on.

4) If the value of the rank sum we calculate from the actual data is 14 or smaller, we can consider our result to be statistically significant. In other words, we can conclude that the frequency of the S allele is lower in African-Caribbean populations living in areas without malaria than it is in similar populations living in areas with malaria.
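The null distribution described in step 3 can be obtained by listing every way of choosing four ranks out of eleven. The Python sketch below does this by brute-force enumeration; the function names are illustrative, the enumeration assumes that no two populations have tied frequencies, and the 0.05 threshold mentioned afterwards is the usual convention rather than something stated in the text.

```python
from itertools import combinations


def rank_sum_distribution(n_total=11, n_group=4):
    """Count how many equally likely rank choices produce each rank sum."""
    counts = {}
    for ranks in combinations(range(1, n_total + 1), n_group):
        total = sum(ranks)
        counts[total] = counts.get(total, 0) + 1
    return counts


def prob_at_most(threshold, n_total=11, n_group=4):
    """Probability, under the null hypothesis, of a rank sum of `threshold` or smaller."""
    counts = rank_sum_distribution(n_total, n_group)
    n_outcomes = sum(counts.values())
    favorable = sum(c for total, c in counts.items() if total <= threshold)
    return favorable / n_outcomes


if __name__ == "__main__":
    for threshold in range(10, 17):
        print(f"P(rank sum <= {threshold}) = {prob_at_most(threshold):.4f}")
```

There are 330 equally likely ways to pick the four ranks. A rank sum of 14 or smaller arises in 12 of them (probability about 0.036), whereas 15 or smaller arises in 18 (about 0.055), which is consistent with placing the cutoff at 14 in step 4.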


1. Hemoglobin references:

Cavalli-Sforza, L. L. and W. F. Bodmer. 1971. The Genetics of Human Populations. W. H. Freeman and Company, San Francisco, CA.

Collins, F. H. and S. M. Paskewitz. 1995. Malaria: Current and future prospects for control. Annu. Rev. Entomol. 40: 195-219.

DNA Learning Center, Cold Spring Harbor Laboratory, One Bungtown Road, Cold Spring Harbor, New York 11724. http://www.dnalc.org/

2. See figure 2.14.1.D from: Cavalli-Sforza, L. L., P. Menozzi, and A. Piazza. 1994. The History and Geography of Human Genes. Princeton University Press, Princeton, NJ, for a map of the distribution of the allele in sub-Saharan Africa.

3. Medical researchers recently discovered that treatment with hydroxyurea dramatically decreases the frequency of crises. See Charache, S., et al. 1995. Effect of hydroxyurea on the frequency of painful crises in sickle cell anemia. New England Journal of Medicine 332: 1317-1322.

4. Hartl, D. L., and A. G. Clark 1997. Principles of Population Genetics. 3rd edition. Sinauer Associates, Inc. Sunderland, MA.

5. Malaria references:

Collins, F. H. and S. M. Paskewitz. 1995. Malaria: Current and future prospects for control. Annu. Rev. Entomol. 40: 195-219.

World Health Organization, Division of Control of Tropical Diseases. http://www.who.ch/programmmes/ctd/diseases/mala/malamain.htm

6. Haldane, J. B. S. 1949. Disease and evolution. Ricerca Sci. 19:(Suppl. 1):3-10.

7. Data from: Allison, A. C. 1965. Polymorphism and natural selection in human populations. Cold Spring Harbor Symposium in Quantitative Biology. 29:137-149.

8. See figures 4.8 and 4.9 on p. 149 from: Cavalli-Sforza, L. L. and W. F. Bodmer. 1971. The Genetics of Human Populations. W. H. Freeman and Company, San Francisco, CA, for maps of the distribution of HbS in Africa, and of the distribution of malaria prior to 1950.

9. Data from: Allison, A. C. 1965. Polymorphism and natural selection in human populations. Cold Spring Harbor Symposium in Quantitative Biology. 29:137-149 (table 4).

Copyright © 2003 by Prentice Hall, Inc. A Pearson Company