25.7 Proteins

Proteins are macromolecular substances present in all living cells. About 50 percent of your body's dry weight is protein. Proteins serve as the major structural components in animal tissues; they are a key part of skin, nails, cartilage, and muscles. Other proteins catalyze reactions, transport oxygen, serve as hormones to regulate specific body processes, and perform other tasks. Whatever their function, all proteins are chemically similar, being composed of the same basic building blocks, called amino acids.

Amino Acids

The building blocks of all proteins are α-amino acids, which are substances in which the amino group is located on the carbon atom immediately adjacent to the carboxylic acid group. The general formula for an α-amino acid can be written either as H₂N-CH(R)-COOH or, in the doubly ionized form, as ⁺H₃N-CH(R)-COO⁻.


The doubly ionized form usually predominates at near neutral values of pH. This form results from the transfer of a proton from the carboxylic acid group to the amine group.
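As a rough numerical check on why the doubly ionized form dominates near neutral pH, the Henderson-Hasselbalch relationship can be applied to each ionizable group separately. The sketch below assumes approximate pKa values for glycine (about 2.3 for the carboxylic acid group and about 9.6 for the protonated amine group), which are not given in this section, and it treats the two groups as independent. The function names are illustrative only.

```python
# Approximate pKa values for glycine (assumed; not taken from this section).
PKA_CARBOXYL = 2.34   # -COOH  <->  -COO(-) + H(+)
PKA_AMMONIUM = 9.60   # -NH3(+)  <->  -NH2 + H(+)


def fraction_deprotonated(pka: float, ph: float) -> float:
    """Fraction of an acidic group that has lost its proton at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


def zwitterion_fraction(ph: float) -> float:
    """Estimate the fraction of molecules carrying both -COO(-) and -NH3(+).

    Treating the two groups as independent is a simplification.
    """
    carboxylate = fraction_deprotonated(PKA_CARBOXYL, ph)
    ammonium = 1.0 - fraction_deprotonated(PKA_AMMONIUM, ph)
    return carboxylate * ammonium


if __name__ == "__main__":
    for ph in (1.0, 7.0, 12.0):
        print(f"pH {ph:4.1f}: doubly ionized fraction ~ {zwitterion_fraction(ph):.3f}")
```

Under these assumptions the doubly ionized form accounts for nearly all molecules at pH 7 and only a small fraction in strongly acidic or strongly basic solution.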

Amino acids differ from one another in the nature of their R groups. Figure 25.16 shows the structural formulas of several of the 20 amino acids found in most proteins. Our bodies can synthesize ten of these 20 in sufficient amounts for our needs. The other ten must be ingested and are called essential amino acids because they are necessary components of our diet.

FIGURE 25.16 Condensed structural formulas for several amino acids, with the three-letter abbreviation for each acid.

You can see from the structural formulas of amino acids that the α-carbon atom, which bears both the amine and carboxylic acid groups, has four different groups attached to it. (The sole exception is glycine, for which R = H. For this amino acid, there are two H atoms on the α-carbon atom.) Any molecule containing a carbon with four different attached groups is chiral. A chiral molecule is capable of existing as isomers that are nonsuperimposable mirror images of each other. (For more information, see Section 24.4.)
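The four-different-groups test can be expressed as a short check. This is only a sketch: groups are compared by their text labels, which works for the simple side chains used here but is not a general method for analyzing structures. The function name and the dictionary are hypothetical.

```python
def is_stereocenter(substituents):
    """Return True if a carbon has four attached groups that are all different.

    Groups are compared by label only; real structure analysis would have
    to compare the complete substituents, not just their names.
    """
    return len(substituents) == 4 and len(set(substituents)) == 4


# Groups on the alpha-carbon: amine, carboxylic acid, hydrogen, and R.
alpha_carbon_groups = {
    "glycine": ["NH2", "COOH", "H", "H"],      # R = H
    "alanine": ["NH2", "COOH", "H", "CH3"],    # R = CH3
    "serine": ["NH2", "COOH", "H", "CH2OH"],   # R = CH2OH
}

if __name__ == "__main__":
    for name, groups in alpha_carbon_groups.items():
        label = "chiral" if is_stereocenter(groups) else "not chiral"
        print(f"{name}: {label}")
```

Glycine fails the test because two of the groups on its central carbon are identical H atoms.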

The mirror-image isomers of a chiral substance are called enantiomers. The two enantiomers of a mirror-image pair are sometimes distinguished by using the labels D- (from the Latin dexter, "right") and L- (from the Latin laevus, "left"). The enantiomers of a chiral substance possess the same physical properties, such as solubility, melting point, and so forth. Their chemical behavior toward ordinary chemical reagents is also the same. However, they differ in chemical reactivity toward other chiral molecules. It is a striking fact that all the amino acids normally found in proteins are of the L-configuration at the carbon center (except glycine, which is not chiral). Only amino acids with this specific configuration at the chiral carbon center are biologically effective in forming proteins in most organisms. Figure 25.17 shows the two enantiomers of the amino acid alanine.

FIGURE 25.17 Alanine is chiral and therefore has two enantiomers, which are nonsuperimposable mirror images of each other.

Polypeptides and Proteins

In proteins, amino acids are linked together by amide groups, one of the functional groups introduced in Section 25.5. In a protein chain the amide group has the form -C(=O)-N(H)-.

Each of these amide groups is called a peptide bond when it is formed by amino acids. A peptide bond is formed by a condensation reaction between the carboxyl group of one amino acid and the amino group of another amino acid. For example, alanine and glycine can react to form the dipeptide glycylalanine:

H₂N-CH₂-COOH + H₂N-CH(CH₃)-COOH → H₂N-CH₂-C(=O)-NH-CH(CH₃)-COOH + H₂O
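The atom bookkeeping for this condensation can be sketched in a few lines: the formulas of the amino acids are added, and one H₂O is removed for every peptide bond formed. The atomic masses are rounded values, and the helper names are illustrative rather than part of any chemistry library.

```python
from collections import Counter

# Molecular formulas as atom counts.
GLYCINE = Counter({"C": 2, "H": 5, "N": 1, "O": 2})   # C2H5NO2
ALANINE = Counter({"C": 3, "H": 7, "N": 1, "O": 2})   # C3H7NO2
WATER = Counter({"H": 2, "O": 1})

# Rounded atomic masses in amu.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def condense(*amino_acids):
    """Join amino acids into one chain, releasing one water per peptide bond."""
    total = Counter()
    for formula in amino_acids:
        total += formula
    for _ in range(len(amino_acids) - 1):
        total -= WATER
    return total


def molecular_mass(formula):
    """Sum of atomic masses for a formula given as atom counts."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


if __name__ == "__main__":
    dipeptide = condense(GLYCINE, ALANINE)
    print(dict(dipeptide))
    print(f"{molecular_mass(dipeptide):.2f} amu")
```

For glycine plus alanine the result is C₅H₁₀N₂O₃, two hydrogen atoms and one oxygen atom fewer than the sum of the two amino acids.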


Notice that the acid that furnishes the carboxyl group for peptide-bond formation is named first, with a -yl ending; then the amino acid furnishing the amino group is named. Based on the three-letter codes for the amino acids from Figure 25.16, glycylalanine can be abbreviated gly-ala. In this notation it is understood that the unreacted amino group is on the left and the unreacted carboxyl group on the right. The artificial sweetener aspartame (Figure 25.18) is the methyl ester of the dipeptide of aspartic acid and phenylalanine.

Sample Exercise 25.7

Draw the structural formula for alanylglycylserine.

SOLUTION The name of this substance suggests that three amino acids (alanine, glycine, and serine) have been linked together, forming a tripeptide. Note that the ending -yl has been added to each amino acid except for the last one, serine. We can view this tripeptide as three "building blocks" connected by peptide bonds:

H₂N-CH(CH₃)-C(=O)-NH-CH₂-C(=O)-NH-CH(CH₂OH)-COOH

By convention, the first-named amino acid (alanine in this case) has a free amino group, and the last-named one (serine) has a free carboxyl group. We can abbreviate this tripeptide as ala-gly-ser.

Practice Exercise

Name the dipeptide that has this structure and give its abbreviation:

H₂N-CH(CH₂OH)-C(=O)-NH-CH(CH₂COOH)-COOH

Answer: serylaspartic acid; ser-asp
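The naming convention used in these exercises can be sketched as a small lookup. Only a few amino acids are included, and the table and function names are hypothetical; the -yl forms are entered by hand rather than derived by a rule, since the ending is not formed the same way for every amino acid.

```python
# Amino acid name -> ("-yl" form used for all but the last residue, three-letter code).
# Only a handful of amino acids are listed, for illustration.
AMINO_ACIDS = {
    "glycine": ("glycyl", "gly"),
    "alanine": ("alanyl", "ala"),
    "serine": ("seryl", "ser"),
    "phenylalanine": ("phenylalanyl", "phe"),
    "aspartic acid": ("aspartyl", "asp"),
}


def peptide_name(residues):
    """Name a peptide, reading from the free-amino end to the free-carboxyl end."""
    *leading, last = residues
    return "".join(AMINO_ACIDS[name][0] for name in leading) + last


def peptide_abbreviation(residues):
    """Join the three-letter codes with hyphens, free amino group on the left."""
    return "-".join(AMINO_ACIDS[name][1] for name in residues)


if __name__ == "__main__":
    tripeptide = ["alanine", "glycine", "serine"]
    print(peptide_name(tripeptide), peptide_abbreviation(tripeptide))

    dipeptide = ["serine", "aspartic acid"]
    print(peptide_name(dipeptide), peptide_abbreviation(dipeptide))
```

The order of the list matters: reversing it describes a different compound, just as gly-ala and ala-gly are different dipeptides.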

Polypeptides are formed when a large number of amino acids are linked together by peptide bonds. Proteins are polypeptide molecules with molecular weights ranging from about 6000 to over 50 million amu. Because 20 different amino acids are linked together in proteins and because proteins consist of hundreds of amino acids, the number of possible arrangements of amino acids within proteins is virtually limitless.
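To see how quickly the number of arrangements grows, note that a chain of n residues drawn from 20 amino acids can be ordered in 20ⁿ ways. The short sketch below simply evaluates that expression; the function name is illustrative.

```python
NUM_AMINO_ACIDS = 20


def count_sequences(chain_length: int) -> int:
    """Number of distinct amino acid sequences of the given length."""
    return NUM_AMINO_ACIDS ** chain_length


if __name__ == "__main__":
    for n in (2, 3, 10, 100):
        total = count_sequences(n)
        # Number of digits minus one gives the order of magnitude.
        print(f"{n:>3} residues: on the order of 10^{len(str(total)) - 1} sequences")
```

Even a modest chain of 100 residues allows far more sequences than could ever be made, which is why the text describes the possibilities as virtually limitless.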

Protein Structure

The arrangement, or sequence, of amino acids along a protein chain is called its primary structure. The primary structure gives the protein its unique identity. A change in even one amino acid can alter the biochemical characteristics of the protein. For example, sickle-cell anemia is a genetic disorder resulting from a single misplacement in a protein chain in hemoglobin. The chain that is affected contains 146 amino acids. The substitution of a single amino acid with a hydrocarbon side chain for one that has an acidic functional group in the side chain alters the solubility properties of the hemoglobin, and normal blood flow is impeded.
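A single substitution of this kind can be located by comparing two sequences position by position. The sketch below uses one-letter amino acid codes and the first eight residues of the affected hemoglobin chain as usually given in biochemistry references; that sequence is an assumption here, since this section does not list it. In the sickle-cell variant, glutamic acid (E, acidic side chain) at position 6 is replaced by valine (V, hydrocarbon side chain).

```python
def find_substitutions(normal: str, variant: str):
    """Return (position, normal residue, variant residue) for each difference.

    Positions are 1-based. Both sequences must have the same length.
    """
    if len(normal) != len(variant):
        raise ValueError("sequences must have the same length")
    return [
        (position, a, b)
        for position, (a, b) in enumerate(zip(normal, variant), start=1)
        if a != b
    ]


# First eight residues of the affected chain (assumed from general references).
NORMAL_START = "VHLTPEEK"
SICKLE_START = "VHLTPVEK"

if __name__ == "__main__":
    for position, normal_residue, variant_residue in find_substitutions(
        NORMAL_START, SICKLE_START
    ):
        print(f"position {position}: {normal_residue} -> {variant_residue}")
```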

Proteins in living organisms are not simply long, flexible chains with random shapes. Rather, the chains coil or stretch in particular ways. The secondary structure of a protein refers to how segments of the protein chain are oriented in a regular pattern.

One of the most important and common secondary structure arrangements is the α-helix, first proposed by Linus Pauling and R. B. Corey. The helix arrangement is shown in schematic form in Figure 25.19. Imagine winding a long protein chain in a helical fashion around a long cylinder. The helix is held in position by hydrogen-bond interactions between NH bonds and the oxygens of nearby carbonyl groups. The pitch of the helix and the diameter of the cylinder must be such that (1) no bond angles are strained and (2) the NH and CO functional groups on adjacent turns are in proper position for hydrogen bonding. An arrangement of this kind is possible for some amino acids along the chain, but not for others. Large protein molecules may contain segments of the chain that have the α-helical arrangement interspersed with sections in which the chain is in a random coil.
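The geometric constraints on the helix can be made concrete with a small sketch. It assumes the commonly quoted values for an α-helix, which do not appear in this section: about 3.6 residues per turn, a rise of about 0.15 nm per residue, and a hydrogen bond from the C=O of residue i to the N-H of residue i + 4. The function names are illustrative.

```python
# Commonly quoted alpha-helix parameters (assumed; not stated in this section).
RESIDUES_PER_TURN = 3.6
RISE_PER_RESIDUE_NM = 0.15


def helix_hydrogen_bonds(n_residues: int):
    """List (C=O residue, N-H residue) pairs, 1-based, using the i -> i + 4 pattern."""
    return [(i, i + 4) for i in range(1, n_residues - 3)]


def helix_length_nm(n_residues: int) -> float:
    """Approximate length along the helix axis for a fully helical segment."""
    return n_residues * RISE_PER_RESIDUE_NM


def helix_pitch_nm() -> float:
    """Distance along the axis covered by one full turn."""
    return RESIDUES_PER_TURN * RISE_PER_RESIDUE_NM


if __name__ == "__main__":
    print(helix_hydrogen_bonds(10))
    print(f"length of 10 residues: {helix_length_nm(10):.2f} nm")
    print(f"pitch: {helix_pitch_nm():.2f} nm")
```

Because each hydrogen bond spans four residues, it links groups that sit roughly one turn apart, which is the "adjacent turns" condition described above.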

FIGURE 25.19 α-helix structure for a protein. The symbol R represents any one of the several side chains shown in Figure 25.16.

The overall shape of a protein, determined by all the bends, kinks, and sections of rodlike α-helical structure, is called the tertiary structure. Figure 24.8 shows the tertiary structure of myoglobin, a protein with a molecular weight of about 18,000 amu and containing one heme group. (For more information, see Section 24.2.) Notice the helical sections of the protein.

Myoglobin is an example of a globular protein, one that folds into a compact, roughly spherical shape. Globular proteins are generally soluble in water and are mobile within cells. They have nonstructural functions, such as combating the invasion of foreign objects, transporting and storing oxygen, and acting as catalysts. In fibrous proteins the long coils align themselves in a more or less parallel fashion to form long, water-insoluble fibers. Fibrous proteins provide structural integrity and strength to many kinds of tissue and are the main components of muscle, tendons, and hair.

The tertiary structure of a protein is maintained by many different interactions. Certain foldings of the protein chain lead to lower-energy (more stable) arrangements than do other folding patterns. For example, a globular protein dissolved in aqueous solution folds in such a way that the nonpolar hydrocarbon portions are tucked within the molecule, away from the polar water molecules. The more polar acidic and basic side chains, however, project into the solution where they can interact with water molecules through ion-dipole, dipole-dipole, or hydrogen-bonding interactions.

One of the most important classes of proteins is the enzymes, large protein molecules that serve as catalysts. (For more information, see Section 14.6.) Enzymes are usually very specific with respect to the reactions they catalyze. Their tertiary structure generally dictates that only a certain substrate molecule can interact with the active site of the enzyme (Figure 25.20).