Tag: basic-science

  • One Equation for Faster-Growing Cells

    Biologists are obsessed with records.

    We like to learn about the smallest and biggest cells, the animals that live longest, and the birds which migrate furthest. Perhaps this is an intrinsic part of Human Nature; but a part of me — deep down — wants to resist it. I’ll not be a stamp collector, I think, or mere record keeper! No; I shall study the mundane and the average, such that I can understand life as it really is, or at least usually is, on this beautiful Earth.

    And yet, what’s the fun in averages? I think there is something about “records,” and our hunt for them, that serves a valuable purpose. Indeed, records are often a starting point for a deeper curiosity.

    When we learn of an organism that lives for hundreds of years, or first hear that elephants do not get cancer despite the abundance of cells in their bodies, it is only natural to think, “Wait, then why do humans get cancer? We have way fewer cells than elephants!” In this way, records become a starting point toward rich questions.

    But the record I think about most is cell division; specifically, why an obscure microbe — called Vibrio natriegens — is able to divide every 9.8 minutes and not a moment sooner.

    Dividing V. natriegens cells. Credit: Max-Planck-Institut for Terrestrial Microbiology

    V. natriegens was first isolated by William Payne, a professor at the University of Georgia, from a glob of mud on Sapelo Island in 1958. Four years later, a man named R.G. Eagon incubated these cells at 37°C, shaking them vigorously in a liquid broth containing blended bits of brains and hearts. It was Eagon who found, in this experiment, that the cells divided every 9.8 minutes. This must have been a startling discovery, because the average microbe divides every three hours or so. Some microbes, living deep in the Earth’s crust, divide once every few years.

    It has been more than 60 years since Eagon made his discovery, and yet nobody has found a microbe which grows faster than V. natriegens. Is 9.8 minutes some kind of magical threshold; a speed limit to life’s replication? I don’t think so, and the reason is a single equation, the parameters of which may actually reveal how to make cells grow faster.

    False Assumption

    My first assumption was that a cell’s division time is limited by DNA replication. For one cell to become two, the cell must copy its genome and pass one copy to each offspring. The bigger the genome, the longer it takes to make a copy, and the slower a cell divides. Right?

    Not quite. The enzyme responsible for copying the genome, called DNA polymerase, moves at roughly 1,000 bases per second. V. natriegens has about 5.17 million bases in its genome, split across two chromosomes. The first chromosome has 3.25 million bases, and the second has 1.93 million bases. At normal speed, one polymerase would need 54 minutes to copy the first chromosome and 32 minutes to copy the second.
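
    Those replication times are easy to check. Here is a minimal Python sketch that uses only the numbers quoted above; the function name and the `polymerases` parameter are my own illustrative choices, and the even split of work between polymerases is an assumption.

```python
# Back-of-the-envelope replication times from the figures quoted in the text.
POLYMERASE_RATE = 1_000  # bases per second (approximate)

CHROMOSOME_SIZES = {
    "chromosome 1": 3_250_000,  # bases
    "chromosome 2": 1_930_000,  # bases
}


def replication_minutes(bases: int, polymerases: int = 1) -> float:
    """Minutes needed to copy `bases`, assuming the work is split evenly
    across `polymerases` enzymes that all move at POLYMERASE_RATE."""
    seconds = bases / (POLYMERASE_RATE * polymerases)
    return seconds / 60


for name, size in CHROMOSOME_SIZES.items():
    one = replication_minutes(size)
    two = replication_minutes(size, polymerases=2)
    print(f"{name}: {one:.0f} min with one polymerase, {two:.0f} min with two")
```

    Either way, a single pass over the larger chromosome takes far longer than 9.8 minutes, which is the puzzle.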

    Doubling time for 214 microbes, organized by their optimal growth temperature.

    For years, many researchers thought that splitting the genome across two chromosomes was what enables V. natriegens to grow fast. With two chromosomes (so their thinking went), two polymerases can copy the genome in parallel, thus cutting division times in half! But then a 2024 paper came out, explaining how researchers had fused both chromosomes into a single genome, and the cells still divided every nine minutes. So clearly that’s not the bottleneck here.

    The truth is that cells don’t use a single DNA polymerase to copy their genomes. Instead, two polymerases copy the genome at the same time, albeit in opposite directions. This bidirectional copying also happens many times simultaneously. As soon as one set of DNA polymerases begins copying the chromosome, another set latches on and starts copying it, too. Multiple copies of the genome are thus in the act of being made at any given moment. When one cell becomes two, each “daughter” not only inherits a genome, but also inherits the copies of that genome that are in the act of being made.

    DNA replication is not the bottleneck to cell division. In theory, a cell could initiate dozens of rounds of DNA replication all at once, provided it has enough energy and nucleotides to do so.

    The true bottleneck, it turns out, is actually the ribosomes, or big “machines” (a tired metaphor, I know) that build proteins. Before a cell can split in two, it must double its pool of ribosomes so that each daughter cell has enough to survive. And as we’ll see, this is really slow.

    Many students are taught to think of ribosomes as “proteins that build other proteins.” But two-thirds of a ribosome’s mass is RNA; not amino acids. Each ribosome is also built from two pieces, called the large and small subunits. These two pieces glom onto a strand of messenger RNA and “read” its code to build proteins. After a ribosome has finished making a protein, it falls off the messenger RNA, searches for a new strand, and begins building the next one.

    E. coli and V. natriegens have nearly identical ribosomes. In both, the small subunit contains a long strand of RNA, called ribosomal RNA, packed inside of 21 proteins. The large subunit has two strands of RNA (one short and another long) stuffed inside of 33 proteins. Of all the RNA molecules floating around a cell, about 80 percent are ribosomal. (Messenger RNAs account for only a tiny fraction.) In total, each ribosome contains 4,566 nucleotides of RNA and 54 separate proteins, totaling 7,500 amino acids. This is enormous; an average protein has about 300 amino acids. Once built, each ribosome can “stitch together” about 16 amino acids per second.

    Now, I know there are a lot of numbers here. But recall that V. natriegens divides every 9.8 minutes, and consider what happens when we crunch the numbers on how long it takes a ribosome to build a copy of itself:

    There are 7,500 amino acids in a ribosome, and each ribosome stitches 16 amino acids together each second. Therefore, it takes one ribosome about 7 minutes and 50 seconds to build another ribosome; and V. natriegens divides every 9.8 minutes! That gap of about two minutes is all the time the cell has to make everything else: copying DNA, growing its lipid membrane, and building all the other proteins it needs to survive. Ribosome biosynthesis is the true bottleneck on cell division. No organism can divide faster than the time it takes to make its own ribosomes.
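
    The same arithmetic as a small Python sketch, in case you want to swap in different numbers. The constants come from the text; the variable and function names are mine.

```python
# How long does one ribosome need to build a copy of its own protein content?
AMINO_ACIDS_PER_RIBOSOME = 7_500   # from the text
TRANSLATION_RATE = 16              # amino acids per second, from the text
DIVISION_TIME_MIN = 9.8            # Eagon's measured division time


def self_copy_seconds(length_aa: float, rate_aa_per_s: float) -> float:
    """Seconds for a single ribosome to polymerize `length_aa` amino acids."""
    return length_aa / rate_aa_per_s


build_s = self_copy_seconds(AMINO_ACIDS_PER_RIBOSOME, TRANSLATION_RATE)
slack_s = DIVISION_TIME_MIN * 60 - build_s  # time left for everything else

minutes, seconds = divmod(build_s, 60)
print(f"one ribosome copies itself in {int(minutes)} min {seconds:.0f} s")
print(f"time left before division: {slack_s / 60:.1f} min")
```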

    If this explanation strikes you as too tidy, though, you are certainly not alone. I had the same reaction at first. And one question I began thinking about is this: Sure, it takes one ribosome about eight minutes to make one ribosome. But each cell has tens of thousands of ribosomes. Those ribosomes all work together, in parallel, to make more ribosomes. Because the 7,500 amino acids required to build each ribosome are split across 54 different proteins, 54 ribosomes could (in theory) work together to build each new ribosome.

    But this is only true at the level of one ribosome. If we zoom out to the whole cell, the math doesn’t work out quite this cleanly. For a cell to go from R ribosomes to 2R, it must build R ribosomes, and it only has R ribosomes to do this. Each ribosome, on average, must make one other ribosome; and that takes about eight minutes.

    (Parallelization works when you can add more “machines” independent of output but, in this case, the “machines” are also the output.)
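
    To convince myself of this, I find it helpful to write the argument out. The sketch below assumes, as the paragraph above does, that only the R ribosomes present at the start do the work and that every one of them makes ribosomal protein the whole time. The function name is hypothetical.

```python
# Doubling a pool of R ribosomes: the pool size cancels out.
def pool_doubling_seconds(n_ribosomes: int,
                          aa_per_ribosome: float = 7_500,
                          rate_aa_per_s: float = 16.0) -> float:
    """Seconds for `n_ribosomes` to build `n_ribosomes` new ribosomes,
    assuming all of them work in parallel on ribosomal protein only."""
    work = n_ribosomes * aa_per_ribosome    # amino acids that must be polymerized
    capacity = n_ribosomes * rate_aa_per_s  # amino acids per second the pool can make
    return work / capacity


for r in (1, 54, 10_000, 115_000):
    print(f"R = {r:>7,}: {pool_doubling_seconds(r) / 60:.2f} min")
```

    Every row prints the same time, because R appears in both the numerator and the denominator.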

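    | Category                             | Metric                                      | Value                  |
    |--------------------------------------|---------------------------------------------|------------------------|
    | DNA Polymerase                       | Speed                                       | ~1,000 bases/second    |
    | V. natriegens Genome                 | Total size                                  | 5.17 million bases     |
    | V. natriegens Genome                 | Number of chromosomes                       | 2                      |
    | V. natriegens Genome                 | Chromosome 1                                | 3.25 million bases     |
    | V. natriegens Genome                 | Chromosome 2                                | 1.93 million bases     |
    | Replication Time (single polymerase) | Chromosome 1                                | 54 minutes             |
    | Replication Time (single polymerase) | Chromosome 2                                | 32 minutes             |
    | Cell Division                        | V. natriegens division time                 | 9.8 minutes            |
    | Ribosome Composition                 | Total ribosomal RNA                         | 4,566 nucleotides      |
    | Ribosome Composition                 | Total proteins                              | 54                     |
    | Ribosome Composition                 | Total amino acids                           | 7,500                  |
    | Ribosome Speed                       | Translation rate                            | ~16 amino acids/second |
    | Ribosome Speed                       | Time for one ribosome to build one ribosome | 7 min 50 sec           |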

    This raises other questions, too. For instance, rather than fully doubling its ribosome pool, why doesn’t a dividing cell simply give fewer ribosomes to each daughter?

    A cell could do this. But doing so would mean each daughter cell then needs to “catch up” and make more ribosomes so it can grow at its maximum capacity again. Cells must devote about half of their ribosomes toward making the various proteins needed to sustain life (not ribosomes). If a cell devotes too many ribosomes toward making other ribosomes, it will not be able to sustain its metabolism, or make energy, or copy its genome, or all that other stuff. Shortchanging daughter cells, then, is just passing a problem down to future generations.

    So the ribosome bottleneck holds, no matter how we come at it. But this makes V. natriegens’ growth rate even more impressive. This microbe, pulled from a glob of mud in Georgia, has evolved a way to divide quite close to its theoretical, biophysical limit; mostly by optimizing for ribosome biosynthesis.

    First, V. natriegens has at least a dozen ribosomal RNA operons, or gene clusters encoding ribosomal RNA molecules, in its genome. E. coli, for comparison, has seven. And second, these ribosome genes are located next to “strong” promoters, or genetic sequences that recruit RNA polymerase enzymes. In other words, Vibrio devotes more of its genome to ribosomal genes, and has also evolved a stronger “start” signal for those genes, meaning the cell makes ribosomal RNA much more frequently, and in higher numbers, than other microbes.

    Scientists don’t fully understand why V. natriegens evolved to grow quickly, though. But remember that these cells were first discovered in nutrient-rich mud, on an obscure island off the coast of Georgia, where lots of organic matter washes up with the tide. As this tide flushes out, nutrients go with it. In their natural environment, then, these cells are exposed to ebbs and flows of nutrient-rich soup; cells that divide faster are able to “scoop up” more nutrients before they disappear. The end result, over millions of years, is that cells evolve to grow and consume as quickly as possible.

    I can’t help but wonder why evolution “stopped” at 9.8 minutes, though, rather than the eight minutes it takes to theoretically double the ribosome pool. Those extra two minutes, it turns out, come from the fact that a dividing cell must make not only ribosomes, but also many other proteins, before it divides. A cell needs to make all the enzymes required for DNA replication, proteins to “pull apart” the chromosomes for each daughter cell, lipid molecules to grow the cell membrane, and so on. All of these things require proteins, which are made by ribosomes. And that’s why ribosomes can’t spend all their time making other ribosomes! (Even at maximum growth rates, most microbes only devote about one-third of their ribosomes toward making more ribosomes. The rest are used to build other things.)

    Still, I wonder if cells could grow even faster.

    Math “Knobs”

    The interesting thing about essays is that they describe phenomena in the English language, and thus are imprecise by their nature. I can work really hard to edit my sentences and make my words as clear as possible, but there will always be a chance that you, my reader, will be confused. Or, I could just simplify everything by giving a single equation which captures and explains the whole phenomenon. It turns out that this works remarkably well for cell division.

    A few years ago, researchers at Caltech published a paper, titled “Fundamental limits on the rate of bacterial growth and their influence on proteomic composition.” In it, they write down two simple, mathematical relationships. First, they note that the fraction of a cell’s mass devoted to ribosomes depends on how many ribosomes it has (of course) and how big those ribosomes are, relative to all the proteins in the cell. And second, for a cell to double in size, it must synthesize a cell’s worth of new protein, and the rate at which ribosomes do this determines how fast the cell grows.

    By smashing these two relationships together, they arrived at a single equation (with just four parameters¹) that describes how quickly a cell will divide:

    $$\lambda = \frac{r_t \cdot f_a \cdot \Phi_R}{L_R}$$

    The left side, λ, is the cell’s growth rate, in units of inverse hours (h⁻¹). On the right, there are four terms. r_t is the translation elongation rate, or the speed at which a ribosome puts amino acids together; in most microbes, this is 15-30 amino acids per second. (Because r_t is measured per second, the result is multiplied by 3,600 to express λ per hour.) f_a is the fraction of ribosomes actively making proteins at any given moment. In a normal cell, at a narrow slice of time, about 15 percent of all ribosomes are idle. Φ_R is the ribosomal mass fraction, or the fraction of the cell’s total protein mass that is ribosomal. And L_R, on the bottom, is the total number of amino acids in each ribosome.

    The beauty of this equation, and the reason it nearly brings a tear to my eye, is that it immediately explains both the biophysical limits of cell division and the knobs, or “dials,” by which we can change it. We can intuit, for example, that f_a must always be less than 1.0, because some ribosomes will always be between jobs, searching for their next strand of messenger RNA. And Φ_R must be less than 1.0, too, because a cell made entirely of ribosomes is a cell without a metabolism, membrane, and so on. Both of these parameters have hard ceilings.

    To get a feel for what’s biologically plausible, let’s plug in some back-of-the-envelope numbers for V. natriegens:

    r_t = 20 amino acids per second
    f_a = 0.85 active ribosomes
    Φ_R = 0.50 of protein mass is ribosomes
    L_R = 7,500 amino acids per ribosome

    Crunching these numbers, we get λ = 4.08 h⁻¹, or a doubling time of 10.2 minutes; remarkably close to what Eagon measured in 1962!²
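
    For anyone who wants to crunch the numbers themselves, here is a minimal Python sketch of the equation with the back-of-the-envelope values above. The function names are hypothetical, and the factor of 3,600 is my own unit conversion from per-second to per-hour.

```python
import math


def growth_rate_per_hour(r_t: float, f_a: float, phi_r: float, l_r: float) -> float:
    """lambda = r_t * f_a * Phi_R / L_R, converted from s^-1 to h^-1."""
    return r_t * f_a * phi_r / l_r * 3600


def doubling_time_minutes(lam_per_hour: float) -> float:
    """Doubling time for exponential growth: ln(2) / lambda, in minutes."""
    return math.log(2) / lam_per_hour * 60


lam = growth_rate_per_hour(r_t=20, f_a=0.85, phi_r=0.50, l_r=7_500)
print(f"lambda = {lam:.2f} per hour")
print(f"doubling time = {doubling_time_minutes(lam):.1f} min")
```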

    The nice thing about mathematical equations, like this, is that they not only point at biophysical limits, but also reveal which parameters can be tweaked to change the results. Now that we know the four parameters which set growth rate, in other words, we can begin to dream up clever ways to tune each “knob” to make cells grow faster or slower.

    One option is to engineer ribosomes such that they literally build proteins faster. If we could raise the r_t parameter to 30 or more (as some other microbes have), then division time goes down. Or, alternatively, we could try to make ribosomes smaller. Researchers have already explored this for E. coli. In 2002, researchers studied which proteins, of the 54 found in the E. coli ribosome, were “conserved” across other bacteria, archaea, and eukaryotes. In other words, they wanted to figure out which proteins show up again and again across species, and which proteins were only found in a few species (and, thus, might be disposable).

    They found that about 21 of E. coli’s ribosomal proteins show up in bacteria, but not archaea or eukaryotes, and some could plausibly be trimmed. I’m not aware of anyone who has actually tried this, but I wouldn’t be surprised if we could cut out, say, 20 percent of the ribosome without impacting its function too much, and thus shave a couple minutes from the theoretical cell division time. Somebody should try this!
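
    To see how far each of these two knobs might move the doubling time, here is a rough Python sketch. It reuses the back-of-the-envelope baseline (r_t = 20, f_a = 0.85, Φ_R = 0.50, L_R = 7,500) and changes one parameter at a time. The scenario labels and names are my own, and the model assumes the other parameters stay fixed, which a real cell may not oblige.

```python
import math

BASELINE = {"r_t": 20.0, "f_a": 0.85, "phi_r": 0.50, "l_r": 7_500.0}


def doubling_minutes(r_t: float, f_a: float, phi_r: float, l_r: float) -> float:
    """Doubling time in minutes, from lambda = r_t * f_a * Phi_R / L_R (per second)."""
    lam_per_second = r_t * f_a * phi_r / l_r
    return math.log(2) / lam_per_second / 60


SCENARIOS = {
    "baseline": {},
    "faster ribosomes (r_t = 30)": {"r_t": 30.0},
    "ribosome trimmed by 20% (L_R = 6,000)": {"l_r": 6_000.0},
}

for label, change in SCENARIOS.items():
    params = {**BASELINE, **change}  # override one knob at a time
    print(f"{label}: {doubling_minutes(**params):.1f} min")
```

    Under these assumptions, trimming the ribosome by a fifth takes the doubling time from roughly ten minutes to roughly eight, and a faster elongation rate does even more.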

    Another option is to raise f_a by boosting the fraction of “active” ribosomes within the cell. Protein synthesis is the most energetically expensive thing a cell does, so many organisms have evolved mechanisms to shut ribosomes down when they are not needed, thus conserving energy. E. coli, for example, carry “hibernation factors,” proteins that grab onto ribosomes and push them into an inactive form when they are not needed. It’s not known if V. natriegens encode the same proteins, but we could search through their genome and delete similar genes to test this theory.

    Or, perhaps, we could take a more agnostic approach and just let evolution take its course, albeit in an accelerated way. If Vibrio evolved with slow ocean tides, maybe we could make them evolve even faster in the laboratory. Perhaps we could run a Richard Lenski-esque experiment, in which V. natriegens’ cells are grown in a robotic bioreactor and flooded with glucose every few hours, followed by stretches of nutrient starvation. If we repeat this lots of times, some microbes may evolve to grow even faster during those periods of high glucose. Or maybe not; V. natriegens may already be quite close to the theoretical cell division time limit.

    These experiments haven’t been done yet. But that, in a way, is the whole point. 

    I never planned to write this essay, which emerged entirely by accident, with one question leading to another, until I found myself deep in the weeds of ribosomes and biophysics and growth rates. What surprised me most, in the end, was that the clearest answer to my question was found not in words, but rather in a single equation with just four parameters. Biology, at its limits, can often be described best with mathematics.

    This equation only exists because generations of biophysicists heard about a record set by a microbe pulled from Georgia mud in 1958 and couldn’t let it go. They spent decades modeling ribosome fractions and translation rates; not because anyone asked them to, but because the record raised questions which bothered them and they wanted desperately to answer. Eventually, they wrote down an equation that not only explains why V. natriegens divides as fast as it does, but points toward how we might push it further still.

    Records, it turns out, are not merely trivia, but rather a map toward the loose threads that, when pulled, unravel something remarkable about the world. In glorifying the exceptional, we can find answers to the mundane.

    1. A typical E. coli has 75,000 ribosomes and V. natriegens has 115,000 ribosomes. The equation helps explain why. The gist is that cells can’t just crank up ribosome speed indefinitely, because there is a maximum rate of protein biosynthesis. The only way to grow faster, then, is to make more ribosomes. But the downside is that, by making too many ribosomes (especially when nutrients are scarce), the cell depletes its amino acids and slows down. Therefore, cells must carefully balance their ribosome numbers to match their available nutrients. This also explains, somewhat, why larger cells (even of the same species) divide more quickly; they are using the extra space to house more ribosomes.
    2. Cells grow exponentially; each division yields two cells, each of which divides again. Therefore, the actual doubling time is not 60/λ, or roughly 15 minutes, but rather ln(2)/λ, or about 0.693/λ. Hence the 10.2 minute figure.
  • Estimating the Size of a Single Molecule

    Many decades before the discovery of x-rays and the invention of powerful microscopes, Lord Rayleigh calculated the size of a single molecule. And he did it, remarkably, using little more than oil, water, and a pen. His inspiration was none other than Benjamin Franklin.

    Sometime around 1770, while visiting London, Franklin became intrigued by a phenomenon he had observed during his transatlantic voyage. Specifically, he noticed that when ships discarded greasy slops into the ocean, the surrounding waves would calm. This ancient practice of oiling the seas to pacify turbulent waters was known to the Babylonians and Romans, but Franklin decided to investigate further.

    On a windy day in London, he walked to a pond on Clapham Common. Carrying a small quantity of oil — “not more than a Tea Spoonful,” according to his diary — Franklin poured it onto the agitated water. The oil spread rapidly across the surface, covering “perhaps half an Acre” of the pond and rendering its waters “as smooth as a Looking Glass.” Franklin documented his observations in detail; they can be read today on the Clapham Society’s website.

    Franklin’s oil drop experiment, of course, was just one in a long line of his “amateur” science experiments. He was also the first to demonstrate that lightning is electrical in nature (via his famous kite experiments), and he charted the Gulf Stream’s course across the Atlantic Ocean, noting that ships traveling from America to England sailed more quickly than those going the opposite direction. His experiments at Clapham Common are not nearly as well-known.

    But Franklin was a careful experimenter, repeating his oil drop multiple times and taking notes each time. In his journal, he opined on how much oil might be needed to calm various areas of ocean (he was thinking specifically about applications for the Royal Navy) but never grasped the molecular implications of his experiments. It wasn’t until more than a century later that Lord Rayleigh, whose real name was John William Strutt, revisited Franklin’s experiment with a brilliant new perspective.

    An academic at the University of Cambridge and a baron by title, Rayleigh was renowned for his work in physics. The Rayleigh number, a common parameter used to describe convection in fluids, is named for him; as is Rayleigh scattering, which explains how sunlight scatters in the atmosphere and colors the sky blue. Rayleigh also discovered the noble gas argon, earning a Nobel Prize for it in 1904.

    But a little experiment that Rayleigh performed in 1890, inspired directly by Franklin’s observations, is not nearly as well-known.

    Rayleigh carefully measured a tiny quantity of olive oil (0.81 milligrams, to be exact) and placed it onto a known area of water. The oil quickly spread out and covered an area, which Rayleigh precisely measured. And then he did something that Franklin never thought of: Rayleigh divided the volume of the oil (its mass divided by its density) by the area it covered, thus estimating the thickness of the oil film. Assuming that the oil formed a single layer of molecules (a monolayer), the thickness of the oil film is the same thing as the length of one oil molecule.

    This is how Lord Rayleigh became the first person to figure out a single molecule’s dimensions, many years before anyone could see such molecules.

    Rayleigh’s final result was 1.63 nanometers. Olive oil is mainly composed of fat molecules called triacylglycerols, and modern measurements show that they measure about 1.67 nanometers in length, thus implying that Rayleigh’s “primitive” estimates were off by just 2 percent. His original paper detailing the experiment can be found here.
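
    The calculation is simple enough to redo at home. The Python sketch below applies the volume-over-area trick to Franklin’s teaspoon and half acre, then works backward from Rayleigh’s numbers. Several values are my assumptions and not from either experimenter: a teaspoon of about 5 milliliters, an olive oil density of about 0.91 grams per milliliter, and all of the function names.

```python
ACRE_M2 = 4046.86             # square meters in one acre
TEASPOON_ML = 5.0             # assumption: a roughly modern teaspoon
OLIVE_OIL_G_PER_ML = 0.91     # assumption: approximate density of olive oil


def film_thickness_nm(volume_ml: float, area_m2: float) -> float:
    """Thickness of a uniform film, in nanometers: volume divided by area."""
    volume_m3 = volume_ml * 1e-6
    return volume_m3 / area_m2 * 1e9


# Franklin: a teaspoon of oil over "perhaps half an Acre".
franklin_nm = film_thickness_nm(TEASPOON_ML, 0.5 * ACRE_M2)
print(f"Franklin's film: about {franklin_nm:.1f} nm thick")

# Rayleigh: 0.81 mg of oil and a 1.63 nm film imply a covered area.
rayleigh_volume_ml = 0.81e-3 / OLIVE_OIL_G_PER_ML      # grams / (grams per mL)
rayleigh_area_m2 = rayleigh_volume_ml * 1e-6 / 1.63e-9
print(f"Rayleigh's implied area: about {rayleigh_area_m2:.2f} square meters")

# How far off was Rayleigh from the modern 1.67 nm figure?
error_pct = (1.67 - 1.63) / 1.67 * 100
print(f"Rayleigh's error: about {error_pct:.1f} percent")
```

    Even Franklin’s rough numbers land within the same order of magnitude as Rayleigh’s answer; he simply never did the division.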

    I love this story because it shows, at least anecdotally, how deep scientific insights can emerge from the simplest of experiments. It’s a testament to the idea that you don’t always need sophisticated equipment to unlock the secrets of nature — sometimes, all it takes is a drop of oil and a bit of ingenuity.

    For those interested in delving deeper into the history of these oil drop experiments, Charles Tanford’s book, Ben Franklin Stilled the Waves, offers a much deeper exploration.

  • The Most Abundant Protein

    One reason David Goodsell’s paintings attract biologists, I think, is because they are unapologetically realistic. His paintings depict seas of macromolecules splayed out in pastel shades. A Goodsell painting looks nothing like the spacious diagrams one finds in high school biology textbooks, and that’s exactly why they linger in the mind: they show, visually, how crowded cells really are.

    But crowded with what, exactly?

    Well, an E. coli cell has an internal volume of just one femtoliter (or one cubic micron) and a total mass of 1 picogram. These are handy numbers to remember. About 70 percent of that mass is water, and the other 30 percent is mostly proteins, RNA, DNA, lipids, and smaller molecules like metabolites. Proteins alone make up 55 percent of the cell’s dry mass, which made me wonder: Which protein is the most abundant?

    If I sat down at my computer without looking up the answer, I’d guess it has something to do with translation. After all, proteins account for most of the cell’s dry mass, and other proteins are needed to build all those proteins! So maybe the most abundant protein is one of the ribosomal subunits, or something involved in transcribing DNA to RNA. Another possibility is that the most abundant protein is involved in energy production or some other critical process.

    But then I started digging. And here’s what I found.

    In 1978, researchers believed that elongation factor EF-Tu was the most abundant protein, with around 60,000 copies per cell. EF-Tu helps the ribosome grab the correct amino acid during translation. Around that same time, scientists also identified acyl carrier protein and RplL (involved in fatty acid biosynthesis and protein translation, respectively) as top contenders. They estimated that each E. coli cell has something like 60,000 to 110,000 copies of these proteins.

    Then, in 1979, a paper in Cell argued that those weren’t actually the most abundant proteins. Instead, the authors claimed that E. coli contains a protein with an order-of-magnitude more copies than either EF-Tu or RplL or anything else. They reported more than 700,000 copies of this protein inside each cell, an astounding figure given that E. coli typically holds only 3–4 million total proteins.
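
    It’s worth checking that the “3–4 million total proteins” figure squares with the cell’s mass. The Python sketch below does so using the handy numbers from earlier (1 picogram per cell, 30 percent dry mass, 55 percent of dry mass as protein). The average protein length of 300 amino acids and the average residue mass of 110 daltons are my assumptions, as are the variable names.

```python
AVOGADRO = 6.022e23             # molecules per mole

CELL_MASS_G = 1e-12             # 1 picogram, from the text
DRY_FRACTION = 0.30             # share of cell mass that is not water
PROTEIN_FRACTION_OF_DRY = 0.55  # share of dry mass that is protein

AVG_PROTEIN_LENGTH_AA = 300     # assumption: a typical protein length
AVG_RESIDUE_MASS_DA = 110       # assumption: average mass of one amino acid residue

protein_mass_g = CELL_MASS_G * DRY_FRACTION * PROTEIN_FRACTION_OF_DRY
avg_protein_g_per_mol = AVG_PROTEIN_LENGTH_AA * AVG_RESIDUE_MASS_DA
proteins_per_cell = protein_mass_g / avg_protein_g_per_mol * AVOGADRO
print(f"estimated proteins per cell: {proteins_per_cell / 1e6:.1f} million")

# Share of all protein molecules that would be the 700,000-copy protein.
for total in (3e6, 4e6):
    share = 700_000 / total
    print(f"share of {total / 1e6:.0f} million proteins: {share:.0%}")
```

    The estimate comes out at about three million molecules, so a single protein with 700,000 copies would account for roughly one in five of them by count.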

    That protein is called Lpp, and it basically maintains the structural integrity of the cell envelope by anchoring the outer membrane to the peptidoglycan layer. Lpp exists in two forms: one-third of the molecules are covalently bound to the peptidoglycan, and the remaining two-thirds float freely in the membrane. Together, these molecules create a network that stabilizes the cell envelope. They’re what keep cells “roughly” spherical and prevent them from collapsing. Without Lpp, the outer membrane would detach from the peptidoglycan layer and cells would get wrecked by various environmental stressors.

    Decades of experimental evidence now support the high copy number of Lpp. Way back in 1969, a duo named Braun and Rehn treated E. coli cell walls with trypsin (an enzyme that cleaves proteins) and observed a rapid decrease in light absorbance. This suggested that about 40 percent of the rigid cell wall is protein. Subsequent experiments identified Lpp as that protein.

    A follow-up study in 1972 used lysozyme and SDS-PAGE to separate the bound and unbound forms of Lpp. By tracking radiolabeled arginine incorporation, the researchers discovered that free Lpp is synthesized first and then converted to the bound form. Combining those findings with earlier data, they estimated that each cell contains around 300,000 total Lpp molecules. Later studies, including the 1979 Cell paper, refined this estimate to 720,000 copies (I don’t entirely understand how; the authors cite those earlier experiments).

    [Despite some of this shaky evidence, I do believe that Lpp is the most abundant protein by far. A 2023 paper in Science Advances visualized this protein in individual cells using atomic force microscopy, and again concluded that each E. coli cell contains hundreds of thousands to about one million copies.]

    Despite the evidence for Lpp’s abundance, some discrepancies remain. For example, the PaxDB database, which compiles protein abundance data from various studies, lists UspA (a stress response protein) as the most abundant E. coli protein. That is almost certainly not correct; many studies isolate and measure cytoplasmic proteins but lose the cell membrane in the process, which can bias results. Protein abundance also depends heavily on the E. coli strain and its growth conditions. Rapidly dividing cells ramp up Lpp to expand their membranes—but they also churn out more ribosomes to handle higher translation demands. Conversely, cells in nutrient-limited conditions might boost stress response proteins like UspA.

    So what are the lessons in all this? A few things:

    1. Few questions in biology have simple answers, and my initial guesses are often wrong.
    2. Don’t just trust a database. Instead, figure out how its data were actually collected before drawing conclusions.
    3. Cells change a lot from one moment to the next, and also between strains. Answers depend on these variables!

    1. Here’s how it works, briefly: Researchers feed cells a radioactive form of the amino acid arginine. When the cells make proteins, they incorporate this tagged arginine, and scientists can measure it using radiation detectors. For this particular Lpp experiment, researchers grew E. coli in the presence of radiolabeled arginine and then separated the different forms of Lpp (bound vs. free) on SDS-PAGE gels. By looking at how much radioactivity appeared in each band over time, they could see when and where new Lpp was being produced.

  • Underrated Origins of the Protein Folding Problem

    Students of biology take much for granted. Seemingly simple ideas, like how DNA is the genetic material, or how a protein folds according to the order of its amino acids, are taken as gospel or undeniable “truth,” even though such ideas once bordered on fringe conspiracies. I’m intrigued by the stories of how such ideas went mainstream, so to speak; doubly so if the discoveries were made by people whom I’ve never heard of (not the Darwins or Mendels or Cricks).

    Christian Anfinsen is one such person. Until last week, I had never heard of him. He shared the 1972 Nobel Prize in Chemistry for his studies of a particular enzyme, called ribonuclease. Specifically, he did a clever experiment in which he denatured this enzyme (meaning he destroyed its 3D shape) and then showed the enzyme could refold, and regain its activity, autonomously. This experiment was the first to suggest a protein’s form is encoded by its amino acid sequence, and it heralded the mad rush, by computational biologists, to solve the “protein folding problem.” (Anfinsen is also interesting, on a more personal level, because he apparently did his entire PhD at Harvard in just two years.)

    Before I describe the experiment, it’s important to put everything (briefly) in historical context. The experiment itself was published in 1961, but Anfinsen had been working on these ideas since at least 1958. (I say this because he published another paper, also involving ribonuclease and its 3D form, in 1959; and the experiments, getting a paper published, and so on all take time.) Even if one uses 1961, the year Anfinsen published his experiment demonstrating that a protein’s structure is encoded by its amino acids, as reference, just consider all the things that were not yet understood:

    The structure of DNA had only been solved eight years before; the first protein structure (myoglobin, cracked by John Kendrew) and Crick’s initial thoughts on the Central Dogma and information flow were only three years old; the first codon (UUU, corresponding to phenylalanine) was not yet solved. Said more forcefully, Anfinsen connected protein sequences to structures before the genetic code was mapped. And he figured out that the linear sequence of amino acids contained the instructions for a 3D structure, at a time when nobody really knew how DNA molecules even encoded amino acids, or how to sequence them. It’s really quite revolutionary.

    Anfinsen’s experiment was quite simple, but provided a large amount of information. (In this sense, I’d absolutely classify it as a “beautiful experiment” which, as defined by Nobel Prize laureate Frank Wilczek, is an experiment wherein “you get out more than you put in.”) He was working with a small protein, called ribonuclease, which cuts up RNA molecules. Ribonuclease has four strong disulfide bridges which hold its 3D form. These bridges connect cysteines together, and always in the same pairs. The protein has eight cysteine amino acids, and so these disulfide bridges could theoretically take 105 different combinations; but Anfinsen found that the bridges always form between the same pairs. (There is a bridge connecting the cysteine at position 26 with a cysteine at position 84, for example.)
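
    The figure of 105 is the number of ways to split eight cysteines into four pairs: the first cysteine has 7 possible partners, the next unpaired one has 5, then 3, then 1, and 7 × 5 × 3 × 1 = 105. Here is a short Python sketch that counts this two ways, by formula and by brute-force enumeration. The cysteine labels are placeholders, not real residue positions, and the function names are mine.

```python
from typing import Iterator, List, Tuple


def count_pairings(n_cysteines: int) -> int:
    """Number of ways to pair up an even number of items: (n-1)(n-3)...(1)."""
    if n_cysteines % 2:
        raise ValueError("need an even number of cysteines to pair them all")
    count = 1
    for k in range(n_cysteines - 1, 0, -2):
        count *= k
    return count


def enumerate_pairings(items: List[str]) -> Iterator[List[Tuple[str, str]]]:
    """Yield every way to split `items` into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in enumerate_pairings(remaining):
            yield [(first, partner)] + tail


cysteines = [f"Cys-{i}" for i in range(1, 9)]  # placeholder labels
total = count_pairings(len(cysteines))
listed = sum(1 for _ in enumerate_pairings(cysteines))
print(total, listed)
print(f"chance of hitting the native pairing at random: {1 / total:.2%}")
```

    If the bridges formed at random, fewer than one molecule in a hundred would end up with the native set of pairs.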

    To begin, Anfinsen purified ribonuclease enzymes from cow pancreases and tested their activities and forms. He shined polarized light through his protein solution, for example, to measure how much the proteins rotate the light’s plane of polarization. (More “ordered” proteins twist light more severely.) He also tested whether the ribonucleases could cut up RNA molecules and, indeed, they did.

    Next, Anfinsen dropped these enzymes in an 8 M urea solution with mercaptoethanol. The mercaptoethanol breaks the disulfide bridges, while urea disrupts everything else, especially the hydrogen bonds. The end result is a bunch of uncoiled, loose, inactive ribonucleases (as measured, again, by their optical rotation and enzymatic activity).

    But then, in the final part of his experiment, Anfinsen filtered out the urea and mercaptoethanol, placed the enzymes in a “clean” liquid with pH 8.2, oxygenated them (so the disulfide bridges could reform) and waited 24 hours. The next day, enzymes which had been completely obliterated, in terms of 3D structure, suddenly had the same optical rotations and activities as when Anfinsen had first purified them from the cow pancreas. These “recovered” enzymes regained nearly 100 percent activity, and the final concentration was about 95 percent of the original batch.

    In order for these enzymes to recover their activity, the eight cysteines must “re-find” their partners, forming the correct four bridges out of the 105 possible combinations I mentioned earlier. Anfinsen’s experiment showed that this pairing is not random; the same couplets find each other every time. More broadly, this experiment suggested that protein refolding itself is not random, but rather thermodynamically driven. The native structure of a protein is stable, in other words, and proteins will autonomously “search” until they find a stable form.

    Anfinsen called this the “thermodynamic hypothesis of protein folding,” and concluded that in order to determine a protein’s structure, one could presumably calculate the sum of all interactions between its atoms, in all possible configurations, and then find the solution with the lowest free energy.

    In 1968, a full seven years after Anfinsen’s work was published, a protein biochemist named Cyrus Levinthal published a paper titled, “Are there pathways for protein folding?” He begins the paper with an overt nod toward Anfinsen’s experiment, writing: “Denatured proteins, which have had essentially all of their native three-dimensional structure disrupted, can refold from their random disordered state into a well-defined unique structure, in which the biological activity is virtually completely restored.” Levinthal’s paper goes on to present a paradox. Consider a simple scenario in which each amino acid in a protein can adopt three configurations. Assuming this protein has 100 amino acids, it could theoretically adopt 3^100 forms. Even if this protein were able to sample configurations 10^13 times per second, it would take 10^27 years to try all of them. And yet, somehow, experimental evidence from the time demonstrated that proteins can fold quite quickly; often in a few seconds or minutes. This discrepancy became known as Levinthal’s paradox. Even today, it seems that more protein biochemists are familiar with Levinthal than Anfinsen, because the former’s paradox motivated the “protein-folding problem” and suggested there must be some viable way to predict a protein’s structure in a computationally efficient way, since biology had evolved a mechanism to do precisely that.
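
    The arithmetic behind the paradox is easy to reproduce. Below is a rough sketch using only the numbers quoted above (three configurations per amino acid, 100 amino acids, 10^13 samples per second); the constant and function names are my own, and the 365.25-day year is an assumption made just to convert seconds into years.

```python
STATES_PER_RESIDUE = 3        # configurations per amino acid (from the text)
N_RESIDUES = 100              # length of the hypothetical protein (from the text)
SAMPLES_PER_SECOND = 10**13   # sampling rate (from the text)
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumed year length for the conversion


def exhaustive_search_years(states: int, residues: int, rate: int) -> float:
    """Years needed to visit every configuration once at a fixed sampling rate."""
    conformations = states ** residues  # exact integer; Python ints do not overflow
    seconds = conformations / rate
    return seconds / SECONDS_PER_YEAR


conformations = STATES_PER_RESIDUE ** N_RESIDUES
years = exhaustive_search_years(STATES_PER_RESIDUE, N_RESIDUES, SAMPLES_PER_SECOND)

print(f"{conformations:.2e} configurations")
print(f"{years:.1e} years to sample them all")
```

    This gives roughly 5 × 10^47 configurations and a search time on the order of 10^27 years, which is vastly longer than the age of the universe (about 1.4 × 10^10 years). An exhaustive search cannot be how proteins actually fold.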

    And yet, the data Anfinsen collected in 1961 already showed that there is a lag phase as ribonuclease enzymes regain their activity. This lag phase suggests that these enzymes “search through space” back to their 3D form and, most importantly, that this happens quite quickly. Anfinsen also showed that wrong intermediates (the disulfide shuffling) can correct themselves, presumably due to thermodynamics. Or, said in simpler terms, Anfinsen’s experiment offered a glimpse of the solution to a paradox that was only formulated seven years later.