Underrated Origins of the Protein Folding Problem
Students of biology take much for granted. Seemingly simple ideas (that DNA is the genetic material, say, or that a protein folds according to the order of its amino acids) are taken as gospel or undeniable “truth,” even though such ideas once bordered on fringe conspiracies. I’m intrigued by the stories of how such ideas went mainstream, so to speak; doubly so if the discoveries were made by people I’ve never heard of (not the Darwins or Mendels or Cricks).
Christian Anfinsen is one such person. Until last week, I had never heard of him. He shared the 1972 Nobel Prize in Chemistry for his studies of a particular enzyme, called ribonuclease. Specifically, he did a clever experiment in which he denatured this enzyme (meaning he destroyed its 3D shape) and then showed the enzyme could refold, and regain its activity, autonomously. This experiment was the first to suggest a protein’s form is encoded by its amino acid sequence, and it heralded the mad rush, by computational biologists, to solve the “protein folding problem.” (Anfinsen is also interesting, on a more personal level, because he apparently did his entire PhD at Harvard in just two years.)
Before I describe the experiment, it’s important to put everything (briefly) in historical context. The experiment itself was published in 1961, but Anfinsen had been working on these ideas since at least 1958. (I say this because he published another paper, also involving ribonuclease and its 3D form, in 1959; and running the experiments, getting a paper published, and so on all take time.) Even if one uses 1961, the year Anfinsen published his experiment demonstrating that a protein’s structure is encoded by its amino acids, as the reference point, just consider all the things that were not yet understood:
The structure of DNA had only been solved eight years before; the first protein structure (myoglobin, cracked by John Kendrew) and Crick’s initial thoughts on the Central Dogma and information flow were only three years old; the first codon (UUU, corresponding to phenylalanine) was not yet solved. Said more forcefully, Anfinsen connected protein sequences to structures before the genetic code was mapped. And he figured out that the linear sequence of amino acids contained the instructions for a 3D structure, at a time when nobody really knew how DNA molecules even encoded amino acids, or how to sequence them. It’s really quite revolutionary.
Anfinsen’s experiment was quite simple, but provided a large amount of information. (In this sense, I’d absolutely classify it as a “beautiful experiment” which, as defined by Nobel Prize laureate Frank Wilczek, is an experiment wherein “you get out more than you put in.”) He was working with a small protein, called ribonuclease, which cuts up RNA molecules. Ribonuclease has four strong disulfide bridges which hold its 3D form. Each bridge connects two cysteines. The protein has eight cysteine amino acids, and so these disulfide bridges could theoretically form in 105 different combinations; but Anfinsen found that the bridges always form between the same pairs. (There is a bridge connecting the cysteine at position 26 with a cysteine at position 84, for example.)
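The figure of 105 comes from counting the ways to split eight cysteines into four unordered pairs: 7 × 5 × 3 × 1. Here is a minimal Python sketch that checks this both with a formula and by brute-force enumeration. The function names are my own, and the labels 1 through 8 are placeholders for the eight cysteines, not real residue positions.

```python
from math import factorial


def count_pairings(n_sites: int) -> int:
    """Count the ways to split n_sites items into unordered pairs."""
    if n_sites % 2:
        raise ValueError("need an even number of sites to pair them all")
    n_pairs = n_sites // 2
    # n! orderings, divided by 2 per pair (order within a pair is irrelevant)
    # and by n_pairs! (order of the pairs themselves is irrelevant).
    return factorial(n_sites) // (2 ** n_pairs * factorial(n_pairs))


def enumerate_pairings(sites):
    """Yield every complete pairing of the given sites, each exactly once."""
    if not sites:
        yield []
        return
    first, rest = sites[0], sites[1:]
    # The first site must pair with someone; try each candidate partner,
    # then recursively pair up whatever is left.
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in enumerate_pairings(remaining):
            yield [(first, partner)] + tail


# Placeholder labels for the eight cysteines (assumption: not real positions).
cysteines = list(range(1, 9))
all_pairings = list(enumerate_pairings(cysteines))

print(count_pairings(8))     # formula
print(len(all_pairings))     # brute-force enumeration
print(1 / len(all_pairings)) # chance of hitting one specific pairing at random
```

If the cysteines paired up purely at random, with every pairing equally likely (a simplifying assumption), fewer than one molecule in a hundred would land on the native set of bridges.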
To begin, Anfinsen purified ribonuclease enzymes from cow pancreases and tested their activities and forms. He shined polarized light through his protein solution, for example, to measure how much the proteins rotated the plane of polarization. (More “ordered” proteins twist light more severely.) He also tested whether the ribonucleases could cut up RNA molecules and, indeed, they did.
Next, Anfinsen dropped these enzymes in an 8 M urea solution with mercaptoethanol. The mercaptoethanol destroys disulfide bridges, while urea destroys everything else, and especially the hydrogen bonds. The end result is a bunch of uncoiled, loose, inactive ribonucleases (as measured, again, according to their optical rotation and enzymatic activity).
But then, in the final part of his experiment, Anfinsen filtered out the urea and mercaptoethanol, placed the enzymes in a “clean” liquid with pH 8.2, oxygenated them (so the disulfide bridges could reform) and waited 24 hours. The next day, enzymes which had been completely obliterated, in terms of 3D structure, suddenly had the same optical rotations and activities as when Anfinsen had first purified them from the cow pancreas. These “recovered” enzymes regained nearly 100 percent activity, and the final concentration was about 95 percent of the original batch.
In order for these enzymes to recover their activity, the eight cysteines must “re-find” their partners, forming the one correct set of four disulfide bridges out of the 105 possible combinations I mentioned earlier. Anfinsen’s experiment showed that this pairing is not random; the same couplets find each other every time. More broadly, this experiment suggested that protein refolding itself is not random, but rather thermodynamically driven. The native structure of a protein is stable, in other words, and proteins will autonomously “search” until they find a stable form.
Anfinsen called this the “thermodynamic hypothesis of protein folding,” and concluded that in order to determine a protein’s structure, one could presumably calculate the sum of all interactions between its atoms, in all possible configurations, and then find the solution with the lowest free energy.
In 1968, a full seven years after Anfinsen’s work was published, a protein biochemist named Cyrus Levinthal published a paper titled “Are there pathways for protein folding?” He begins the paper with an overt nod toward Anfinsen’s experiment, writing: “Denatured proteins, which have had essentially all of their native three-dimensional structure disrupted, can refold from their random disordered state into a well-defined unique structure, in which the biological activity is virtually completely restored.” Levinthal’s paper goes on to present a paradox. Consider a simple scenario in which each amino acid in a protein can adopt three configurations. If this protein has 100 amino acids, then it could theoretically adopt 3^100 forms. Even if this protein were able to sample configurations 10^13 times per second, it would take roughly 10^27 years to try all of them. And yet, somehow, experimental evidence from the time demonstrated that proteins can fold quite quickly; often in a few seconds or minutes. This discrepancy became known as Levinthal’s paradox. Even today, it seems that more protein biochemists are familiar with Levinthal than Anfinsen, because the former’s paradox motivated the “protein-folding problem” and suggested there must be some viable way to predict a protein’s structure in a computationally efficient way, since biology had evolved a mechanism to do precisely that.
And yet, the data Anfinsen collected in 1961 already showed that there is a lag phase as ribonuclease enzymes regain their activity. This lag phase suggests that these enzymes “search through space” back to their 3D form and, most importantly, that this happens quite quickly. Anfinsen also showed that wrong intermediates (the disulfide shuffling) can correct themselves, presumably due to thermodynamics. Or, said in simpler terms, Anfinsen’s experiment offered a glimpse of the solution to a paradox that was only formulated seven years later.