Tag: biology

  • 30 Great Essays About Biology

    The world needs more essays about biology. So last month, I tweeted a link to one of my favorite essays (#1 below) and promised that I would continue to share an additional essay every day for the next 29 days. I titled the series, “30 Essays to Make You Love Biology.”

    I’ve now assembled all 30 essays in this article. I hope you’ll read them and emerge with a deeper appreciation for cells, atoms, and their confluence with physics and math.

    I scoured the internet for non-paywalled versions of each article, so all links go to open-access versions. This effort was inspired by the website “Read Something Wonderful.” Enjoy!

    1. “I should have loved biology” by James Somers. An easy-to-read essay about how biology is poorly taught in schools, and how this poor teaching masks its most intriguing bits. Students are typically told to read textbooks and memorize facts about the cell (Mitochondria are the powerhouse of the cell!) without ever appreciating its miraculous complexity. Tests are often given as multiple choice, with little to no problem-solving involved. As Somers writes: “It was only in college, when I read Douglas Hofstadter’s Gödel, Escher, Bach, that I came to understand cells as recursively self-modifying programs.” Link
    2. “Cells are very fast and crowded places” by Ken Shirriff. A short essay about some awe-inspiring numbers in cell biology. My two favorite lines are: “A small molecule such as glucose is cruising around a cell at about 250 miles per hour” and “a typical enzyme can collide with something to react with 500,000 times every second.” Link
    3. “Seven Wonders,” by Lewis Thomas. When Thomas was asked by a magazine editor “to join six other people at dinner to make a list of the Seven Wonders of the Modern World,” he declined and instead drafted this article about the seven wonders of biology. Number 2 on the list: Bacteria that survive in 250°C waters. Link
    4. “Life at low Reynolds number,” by E.M. Purcell. An all-time classic. One of the best biology lectures of all time. This essay opened my eyes to the weirdness of life at the microscale, where “inertia plays no role whatsoever.” Or, as Purcell says, “We know that F = ma, but [microbes] could scarcely care less.” Link
    5. “The Baffling Intelligence of a Single Cell,” by James Somers & Edwin Morris. This interactive article, about chemotaxis and flagella, gives “an intuition for how a bag of unthinking chemicals could possibly give rise to a being.” It’s stunning, and reminiscent of the great Bartosz Ciechanowski’s blog. Link
    6. “Thoughts About Biology,” by James Bonner. A little-read essay, I think, that deserves more attention. Published in 1960, Bonner argues that biology is ever-changing and progress, often, comes from those outside the field. Part of biology’s beauty is that you can push it forward regardless of background. Link
    7. “Biology is more theoretical than physics,” by Jeremy Gunawardena. It is often said “that biology is not theoretical,” writes Gunawardena, but that’s not true. This essay gives examples where theory preceded and informed major discoveries in biology. It’s a must-read, especially for those who want to work on biology but don’t feel compelled to work at the bench with a pipette in hand. Link
    8. “Can a biologist fix a radio?” by Yuri Lazebnik. One of my favorites. Biologists tend to catalog things by breaking them apart. But without quantitative insights, it is difficult to piece them back together into a holistic understanding. Even if you think a line of inquiry in biology has been exhausted, there is always room to go deeper. Link
    9. “Schrodinger’s What Is Life? at 75” by Rob Phillips. In 1944, physicist Erwin Schrödinger wrote a book, called “What is Life?” that pondered a single question: “How can the events in space and time which take place within the spatial boundary of a living organism be accounted for by physics and chemistry?” This essay is an ode, synopsis, and expansion of that classic book. “Names such as physics and biology are a strictly human conceit,” writes Phillips, “and the understanding of the phenomenon of life might require us to blur the boundaries between these fields.” Link
    10. “Molecular ‘Vitalism’” by Marc Kirschner, John Gerhart & Tim Mitchison. Students are often taught that genes are the bedrock, or blueprint, for biology. But this picture is quickly changing, unraveling, fading. “Although…proteins, cells, and embryos are…the products of genes, the mechanisms that promote their function are often far removed from sequence information.” Link
    11. “Escherichia coli,” by David Goodsell. Goodsell is a computational biologist who also makes brilliant watercolor paintings of living cells. His paintings are based on atomic truth—that is, the ribosomes, mRNAs, and DNA molecules are all painted to scale. This short essay explains how he does it. Link
    12. “How Life Really Works,” by Philip Ball. This essay challenges much that students are taught about how cells actually work. DNA is not some all-powerful blueprint of the cell, as textbooks often suggest. To truly understand life, argues Ball, one must first realize that cells are far more complex than that. They are, in fact, intelligent agents that change their surroundings to their own benefit. Link
    13. “A Long Line of Cells,” by Lewis Thomas. Another masterful essay that traces one man’s life, and mankind’s progress, through the lens of evolutionary biology. It helped me appreciate how my own life is deeply intertwined with the lives of organisms all around me. Link
    14. “AlphaFold2 @ CASP14,” by Mohammed AlQuraishi. Biological progress is swift, and that is one reason it is so exciting. In this first-person essay, a computational biologist marvels at a scientific breakthrough in predicting protein structures from their amino acid sequences. Link
    15. “Theory in Biology: Figure 1 or Figure 7?,” by Rob Phillips. Another great essay about theory—and not just wet-lab experiments—as a key driver of scientific progress. “Most of the time, if cell biologists use theory at all, it appears at the end of their paper, a parting shot from figure 7. A model is proposed after the experiments are done, and victory is declared if the model ‘fits’ the data.” But such an approach is misguided, writes Phillips. As Henri Poincaré once said: “A science is built up of facts as a house is built up of bricks. But a mere accumulation of facts is no more a science than a pile of bricks is a house.” Link
    16. “On Being the Right Size,” by J.B.S. Haldane. Published in 1926, this essay made me appreciate the myriad forms and functions of lifeforms all around me. I learned why an insect is not afraid of gravity; why a flea as large as a human couldn’t jump as high as that human; why a tree spreads its branches, and much more. Simple, beautiful. Link
    17. “I Have Landed,” by Stephen Jay Gould. In the final essay of a 300-essay series, Gould writes about how he often lies awake at night, pondering his purpose in the Universe and his fear of death. And how, upon deep reflection, he is most stunned by the fact that life—after more than 3.5 billion years of evolution—continues to exist at all “without a single microsecond of disruption.” Link
    18. “A Life of Its Own,” by Michael Specter. Published in The New Yorker in 2009, this piece explores the then-nascent field of synthetic biology. It opens by telling the story of Jay Keasling, a professor at UC Berkeley, who engineered yeast to make an antimalarial drug called artemisinin, which has been used to save at least 7.6 million lives. Artemisinin was historically extracted from the sweet wormwood plant in a painstaking and low-efficiency process. Link
    19. “Slaying the Speckled Monster,” by Jason Crawford. Smallpox killed an estimated 300 million people in the 20th century alone. This essay explains how a long line of brilliant scientists—from John Fewster and Edward Jenner to D.A. Henderson—invented the first vaccines against the disease and then, in the 1960s, launched campaigns to eradicate smallpox entirely. An inspiring story about how biological discoveries can save lives. I also learned this: “The origin story [about smallpox vaccines] that is usually told, where Jenner learns of cowpox’s protective properties from local dairy worker lore or his own observations of the beauty of the milkmaids, turns out to be false—a fabrication by Jenner’s first biographer, possibly an attempt to bolster his reputation by erasing any prior art.” Link
    20. “Why we didn’t get a malaria vaccine sooner,” by Saloni Dattani, Rachel Glennerster & Siddhartha Haria. Malaria has killed billions of humans in the last few centuries and continues to kill 600,000+ each year. This is, simply put, the best essay ever written on the history of malaria and the invention of vaccines to prevent it. We are living through a revolutionary time, considering these vaccines were only approved for the first time in 2021. Link
    21. “Biology is a Burrito” and “Fast Biology,” by Niko McCarty. Cells are often envisioned as wide-open spaces, where molecules diffuse freely. But this isn’t true. In reality, cells are so crowded, it’s a wonder they work at all. Every protein in the cell collides with about 10 billion water molecules per second. Protein ‘motors’ make energy-storing molecules by spinning around thousands of times a minute. Sugar molecules fly by at 250 miles per hour, nearly double the speed of a Cessna 172 airplane at cruising speed. When I first heard these numbers, I thought they were made up. After all, how is it even possible to measure such things? The world’s most powerful microscope cannot necessarily “see” a protein motor spinning, or watch a sugar molecule move through a cell. As a PhD student, I jumped head-first into the world of biological speed. My goal was to collect some “remarkable” numbers in biology and understand the experiments that brought them to light. My search made me appreciate how remarkable it is that life functions at all, considering the chaotic conditions in which cells exist. It also gave me a new appreciation for biology, and the incredible exactitude that one must have to engineer it — let alone engineer it successfully. Link, Link
    22. “Jonas Salk, the People’s Scientist,” by Algis Valiunas. Salk made one of the first successful polio vaccines. A double-blind clinical trial, launched in 1954, showed that patients who received his vaccine “developed paralytic polio at about one-third the rate of the control groups. On average across the different types…the vaccine was eighty to ninety percent effective.” Shortly after the trial’s results were made public, journalist Edward R. Murrow interviewed Salk. When Murrow asked Salk who held the patent on the vaccine, Salk replied: “Well, the people, I would say. There is no patent. Could you patent the sun?” Reading this essay helped me to appreciate the struggle and strife of biological research, the fickleness of fame, and the positive impact that a small group of scientists can have on the world. Link
    23. “On Protein Synthesis,” by Francis Crick. Arguably the most important essay in biology’s history, this was adapted from a lecture that Crick gave in 1957 during which the famed geneticist made several accurate predictions about how cells work well before experimental evidence existed to support them. “I shall…argue that the main function of the genetic material is to control (not necessarily directly) the synthesis of proteins,” wrote Crick. “There is a little direct evidence to support this, but to my mind the psychological drive behind this hypothesis is at the moment independent of such evidence.” At the time, scientists weren’t sure DNA had anything to do with proteins. In this essay, Crick also predicted the existence of a small ‘adaptor’ molecule that brings amino acids to the ribosome for protein synthesis (now known as tRNAs) and that future scientists would chart evolutionary lineages by comparing DNA sequences between organisms. Crick was years ahead of his time. This essay is a masterclass in scientific thinking. Link
    24. “The People Who Saw Evolution,” by Joel Achenbach. My favorite article on this list. Every year, for 40 years, Peter and Rosemary Grant traveled to Daphne Major, a volcanic island in the Galápagos, to study Charles Darwin’s finches. During that time, they watched “evolution happen right before their eyes.” In 1977, for example, just 24 millimeters of rain fell on Daphne Major, causing major food sources—including small, soft seeds—to become scarce. When the Grants returned to the island in 1978, they found that finches with smaller beaks had died off, whereas “finches with larger beaks were able to eat the seeds and reproduce. The population in the years following the drought in 1977 had ‘measurably larger’ beaks than had the previous birds.” I also strongly recommend the book, “40 Years of Evolution,” from Princeton University Press. Link
    25. “Is the cell really a machine?” by Daniel J. Nicholson. Living cells are far more complex—and beautiful—than any machines made by human hands. In this essay, a philosopher points to four areas of current research where the metaphor of “cells as machines” breaks down. For example: Even though proteins are depicted as static or unmoving molecules, they actually “behave more like liquids than like solids.” Link
    26. “Biological Technology in 2050” by Rob Carlson. “In fifty years,” writes Carlson, “you may be reading The Economist on a leaf. The page will not look like a leaf, but it will be grown like a leaf. It will be designed for its function, and it will be alive. The leaf will be the product of intentional biological design and manufacturing.” This is a futuristic essay about the potential of manipulating atoms via living cells. Link
    27. “Research Papers Used to Have Style. What Happened?” by Roger’s Bacon. This is an ode to beautiful scientific writing. The essay draws from classic biology research papers to make its case. Link
    28. “Night Science,” by Itai Yanai & Martin Lercher. A personal essay about scientific discoveries that do not emerge from the scientific method as it’s taught in school, as told by two biologists. Perhaps it will inspire you to take up night science experiments of your own. Link
    29. “Atoms Are Local,” by Elliot Hershberg. Biology is the ultimate distributed manufacturing platform. Cells harvest atoms from their environments—air and soil—and rearrange them to build materials, medicines, and everything we need to live. Link
    30. “The Mechanistic Conception of Life,” by Jacques Loeb. This is the article that got me hooked on biology a decade ago. Written by one of history’s greatest biologists, it poses a number of questions that I suspect will keep scientists busy for many decades to come. “We must either succeed in producing living matter artificially,” writes Loeb, “or we must find the reasons why this is impossible.” Link

    What essays did I miss? Let me know in the comments and I’ll expand the list 🙂

  • A Christmas Story

    I.

    For centuries, physicians noticed an unsettling pattern: a string of young boys who seemed doomed to bleed. Every scrape or cut on their bodies oozed blood long after other boys had scabbed and healed. A slight bump against a doorframe might cause a bruise that blackened and spread beneath the skin. Even a bending of the knees could cause joints to fill with blood! Worse still, internal organs would rupture and hemorrhage, causing the lungs or brain to fill with blood. Doctors didn’t know the cause; some speculated that these boys merely had “fragile blood vessels.” Others suspected that platelets — the small, disc-like cell fragments that help form clots — were defective.

    Whatever the cause, families watched their children die before reaching adulthood. In the 1960s, prospects for people with this disease were grim. A 1967 study noted that of 113 patients who went untreated, most died in childhood or early adulthood, often from minor injuries that triggered uncontrolled bleeding. Only eight of these patients survived beyond 40 years.

    It wasn’t until the mid-20th century that researchers began to unravel, slowly, a mechanism for the disease. In 1952, researchers at Oxford University figured out that hemophilia — as the disease came to be called — was not one condition, but at least two. They reached their conclusion while studying a young boy named Stephen Christmas, and even published their findings in the Christmas issue of the British Medical Journal.

    II.

    Scientists have known about hemophilia since ancient times. The Babylonian Talmud forbade the circumcision of a male child if two of his brothers had already died from bleeding after the same procedure. In the 10th century, an Arab physician named Albucasis described a family in which multiple male relatives bled to death after minor injuries, according to an academic review titled The History of Hemophilia. These early authors had no way of understanding genetics, but they did suspect some kind of inherited pattern — a familial “curse,” so to speak.

    The first clinical documentation of hemophilia in the “modern literature” appeared in 1803. John Conrad Otto, a Philadelphia physician, noted the disorder among his patients; he painstakingly traced pedigrees, mapping who bled and who carried the condition. This analysis laid the foundation for understanding hemophilia as an inherited disease linked to the X chromosome, although the exact genetic mechanism was not yet known. Otto published his findings in an article entitled, “An Account of a Hemorrhagic Disposition Existing in Certain Families.”

    Two decades later, Friedrich Hopff at the University of Zurich dug more deeply into the disease by studying families with recurring bleeding disorders and tracking which males were affected. Hopff wrote detailed case histories of men who bled spontaneously or who bled for days after a minor trauma. It was Hopff who first coined the term “hemophilia” by combining the Greek “Hemo-” (blood) and “-philia” (love or affinity).

    In the 19th century, hemophilia gained widespread attention when doctors realized that Queen Victoria of England — who reigned for 63 years, from 1837 until 1901 — carried the disease. Victoria passed it to her youngest son, Leopold, who died of a brain hemorrhage at 30 in Cannes. (At the end of his life, Leopold had retreated to the south of France in search of refuge from the harsh British winters.)

    Two of Queen Victoria’s daughters, Alice and Beatrice, were also carriers. After marrying into other royal houses, they spread hemophilia into Spain, Germany, and Russia. Tsar Nicholas II’s son, Alexei, also inherited the gene through his mother, Alexandra (a granddaughter of Queen Victoria). It was Alexei’s frequent bleeding episodes that first drew Grigori Rasputin, a peasant faith healer, into the Romanov court. When Alexei was killed in 1918, Russia lost its last tsesarevich, or heir apparent.

    And still, nobody knew what actually caused hemophilia. But slowly, over time, researchers discovered much more. Like how a defective segment on the X chromosome prevents the body from producing a functional clotting factor. Or that the process of clot formation in healthy people involves a series of proteins called “factors,” each activating the next in a cascading sequence.

    Factor IX, for example, activates factor X, which converts prothrombin into thrombin; thrombin in turn converts fibrinogen into fibrin to form a stable clot. When mutations disable factor IX, the clotting cascade stalls. Patients then bleed more, sometimes dramatically. This defect became known as hemophilia B. If the mutation instead affects another factor, called VIII, then patients are said to have hemophilia A. (There is also a third form of hemophilia, type C, that affects factor XI.)

    In the early 20th century, physicians did not know there were different subtypes. They had assumed that all these bleeding problems stemmed from the same root cause: “weak blood vessels.” This assumption persisted into the 1930s.

    But then, in 1936, two Harvard doctors isolated a substance from plasma that could fix the clotting defect in some people with hemophilia. They named this substance “antihemophilic globulin,” but did not know why the substance helped blood clot in some cases, yet not in others.

    An answer to their question would not appear until 1947, when an Argentinian physician named Alfredo Pavlovsky mixed blood from two separate hemophilia patients and found, oddly, that the mixture clotted quickly. One of those patients had hemophilia A, and the other had hemophilia B. Nobody realized it at the time, but this observation showed that each patient lacked a distinct factor: one patient’s factor VIII could complement the other patient’s factor IX deficiency, and vice versa. Researchers slowly began to recognize that they were dealing with separate disorders.

    The defining moment in hemophilia B’s story, though, came in 1952. That’s the year Oxford scientists first described a five-year-old boy, named Stephen Christmas, who had suffered frequent, uncontrollable bleeding since he was 20 months old. When the doctors mixed Christmas’ blood with blood from hemophilia A patients, they noted normal clotting, much as Pavlovsky had five years earlier. The scientists therefore concluded that Stephen Christmas did not lack “antihemophilic globulin” (now called factor VIII).

    Unlike Pavlovsky, though, the Oxford team took their experiments much further and showed, for the first time, that Christmas was missing a different protein, which they dubbed the Christmas Factor (factor IX). Their findings were published in the British Medical Journal’s Christmas issue, and the disease was named, fittingly, “Christmas Disease.”

    The Oxford team used clever blood-mixing experiments to make their discovery. They took small samples of blood plasma from different patients and observed what happened when they were combined. If two samples improved clotting times when mixed, it suggested that each plasma had at least some clotting element the other lacked. For instance, mixing patient #2’s blood with patient #4’s blood yielded faster clotting times, indicating that each person was deficient in a different factor. In contrast, mixing certain pairs that both lacked the same factor did not produce any improvement in clotting. This approach ruled out the possibility that Christmas Disease was simply hemophilia A by another name. Over time, the disease was renamed to hemophilia B.

    III.

    Efficacious treatments for hemophilia did not appear for another decade after the Christmas paper. In 1964, a Stanford scientist named Judith Graham Pool discovered that the slushy precipitate left after partially thawing plasma (called the cryoprecipitate) contained a high concentration of factor VIII. This discovery meant that blood banks could collect and store large amounts of clotting factors in relatively small volumes.

    Patients with hemophilia A — a factor VIII deficiency — could now receive fewer, more potent infusions to control or even preempt bleeding episodes. This was great news, of course, but it did not help hemophilia B directly because factor IX was still missing from the cryoprecipitates. Still, hemophilia A affects about six times more people than hemophilia B (1 in 5,000 births, compared to about 1 in 30,000 births), and the idea of separating and concentrating specific clotting factors set the stage for future treatments.

    The next leap came in the 1970s, when researchers developed freeze-dried concentrates containing both factor VIII and factor IX. These concentrates could be stored easily and administered at home, which allowed patients to treat themselves as soon as bleeding began. Orthopedic surgeries also became much safer, giving patients a chance to correct damage that had already accumulated.

    In Sweden, doctors like Inge Marie Nilsson and Ake Ahlberg went even further: they pioneered prophylactic treatment, giving factor VIII to hemophiliacs on a regular schedule rather than waiting for bleeds. The same principle applied to factor IX for hemophilia B patients. This approach transformed hemophilia from a life-threatening disorder into a manageable, yet chronic, condition.

    There is a tragic sidenote in this tale, though. Before 1985, many plasma-derived concentrates were unknowingly contaminated with human immunodeficiency virus (HIV) and hepatitis viruses. A devastating number of hemophilia patients contracted these infections. It is estimated that 4,000 of the 10,000 hemophiliacs then thought to be living in the U.S. died from AIDS.

    Today, hemophilia has morphed from a chronic condition into a curable one. Lasting genetic fixes are now available. Rather than requiring frequent or even weekly infusions of factor IX, patients can get a one-time dose of a gene therapy — such as Hemgenix, a gene therapy approved by the FDA in 2022 — that prompts their own cells to make factor IX.

    Hemgenix is a one-time infusion for hemophilia B. It works like this: First, a healthy copy of the factor IX gene is packaged into an adeno-associated virus, or AAV. This virus is then infused into the bloodstream (the infusion takes an hour or two), where it travels to the liver, gloms onto cells, and delivers its genetic payload. The healthy gene persists inside liver cells (mostly as a separate loop of DNA, rather than integrating into the chromosomes) and instructs them to make functional factor IX. In the pivotal clinical trial, 96% of participants stopped their regular prophylactic infusions after getting Hemgenix.

    The Hemgenix clinical trials measured the annualized bleeding rate before and after gene therapy. During the lead-in period, patients had about 4.1 bleeds per year. In months 7 to 18 after treatment, that average dropped to 1.9 bleeds per year. In other words, patients bled less than half as often after receiving the gene therapy as before. The researchers also measured how much functional factor IX the patients’ blood contained over 24 months: levels hovered around 36–41% of normal. That range is typically enough to support normal clotting, making severe bleeds much less likely.
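    The “less than half as often” claim follows directly from the two trial numbers quoted above. As a quick back-of-the-envelope check (using only the figures in the text, not the underlying trial data):

    ```python
    # Annualized bleeding rates from the Hemgenix trial, as quoted above.
    bleeds_before = 4.1  # bleeds per year during the lead-in period
    bleeds_after = 1.9   # bleeds per year in months 7-18 after treatment

    ratio = bleeds_after / bleeds_before
    print(f"patients bled {ratio:.0%} as often as before treatment")  # ~46%
    ```

    A ratio of roughly 0.46 is indeed “less than half,” which is the kind of one-line sanity check worth doing whenever a summary statistic is quoted.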

    In the United Kingdom, the National Health Service will pay about 2.6 million pounds per patient for Hemgenix. This price may seem high, but it’s likely far lower than the cost of giving those patients factor replacement medicines over several decades of their lives.

    It’s incredible to me that only one hundred years ago, families watched helplessly as children with “weak blood vessels” bled and died from small bumps. And that now, we have a gene therapy that corrects the disorder and makes hemophilia liveable for the first time in human history.

    So this Christmas, I’m grateful for biotechnology. Although often tied to scary things like “bioweapons” — especially by those outside of biology — my experience is that biotechnology is far more often used as a force for good. Christmas disease is just one example of that. In 2025, I’m hopeful that we’ll see much more progress on AAV engineering (using AI and other tools!) to make gene therapies safer and more precise, and less likely to cause severe immune reactions. If we figure this out, gene therapies could be used to cure many diseases that were once considered little more than death sentences.

  • How to Calculate BioNumbers

    Arithmetic is a superpower. Or, as Dynomight has written, a “world-modeling technology.” It is one of the first things we learn in school, and yet few seem to use it in everyday life to make predictions about the world.

    Physicists use back-of-the-envelope arithmetic all the time, though. Enrico Fermi famously used it to estimate the energy released during the Trinity atomic bomb test. Standing ten miles away, he wrote:

    About forty seconds after the explosion the air blast reached me, I tried to estimate its strength by dropping from about six feet small pieces of paper before, during and after the passage of the blast wave. Since, at the time, there was no wind, I could observe very distinctly and actually measure the displacement of the pieces of paper that were in the process of falling while the blast was passing. The shift was about 2.5 metres, which, at the time, I estimated to correspond to a blast that would be produced by ten thousand tons of TNT.

    I don’t often meet biologists who use similar estimates to test their assumptions, even though they stand to benefit just as much as physicists. In Fast Biology, I gave an anecdote about some Caltech researchers who were trying to figure out the rate-limiting factor for bacterial growth — specifically, the “thing” that limits a cell’s division rate. They found the answer (ribosome biosynthesis) using simple arithmetic, scribbled on a sheet of paper. No complicated experiments were required.

    I’d like more biologists to use simple arithmetic to check their ideas prior to running experiments. Similarly, I hope more people outside biology will enter the field and contribute. To encourage this, I’m launching a new blog series called Order-of-Magnitude Thinking. Every few weeks, I’ll pose a question and walk through the steps I take to arrive at an answer using arithmetic. I hope you’ll follow along and try these calculations yourself. Over time, I think you’ll become adept at developing biological intuitions, doing sanity checks on experiments, and so on.

    Let’s start with a basic question: How long does it take E. coli to turn one average-sized gene into one protein?

    Before answering, let’s review some molecular biology. When I say “turn,” I really mean transcribe DNA into messenger RNA (mRNA), and then translate that mRNA into protein. We can think of a gene, in this case, as a stretch of DNA that contains all the instructions needed to build that protein. Three mRNA letters form a “codon” and encode one amino acid — the building blocks of proteins. Thus, a gene’s length in nucleotides is at least three times its protein’s length in amino acids.

    Now we’re ready to move forward. The first step is to break down the question and collect our variables. We’ll need to know the size of an average gene in E. coli, the size of a protein encoded by that gene, the transcription rate (the number of DNA “letters” converted to mRNA per second) and the translation rate (how many amino acids are added to a protein per second).

    If this question were about mammalian cells, we’d also need to account for the time it takes mRNA to move from the nucleus to the cytoplasm. But E. coli cells lack a nucleus, so we can ignore this step; their genome is mixed in with everything else, meaning that ribosomes can kick off translation as soon as an mRNA appears.

    I use the BioNumbers database to look up variables. Searching for the average gene length in E. coli tells us that a typical gene encodes a protein of about 330 amino acids. Recall that each amino acid is encoded by three nucleotides, so let’s assume that an average E. coli gene has about 1,000 nucleotides.

    What about transcription and translation rates? At 37°C (a standard temperature for E. coli growth), the transcription rate is about 40 nucleotides per second. A typical translation rate is 8 amino acids per second.

    Great. Now that we’ve got our numbers, we can carry on with the calculation.

    First, we calculate the transcription time — the number of seconds it takes to convert our average-sized gene into mRNA. This is 1,000 nucleotides divided by 40 nucleotides per second, or 25 seconds.

    Next, we calculate the translation time — the time required for ribosomes to convert the mRNA into a protein. This is 330 amino acids divided by 8 amino acids per second, or about 41 seconds.

    At first glance, we might assume that the total time to make a protein is the sum of these two values: 25 seconds + 41 seconds = 66 seconds, or 1 minute and 6 seconds. But because E. coli lacks a nucleus, transcription and translation happen at the same time. Translation kicks off as soon as the mRNA starts forming. In other words, the creation of proteins in E. coli is bottlenecked by the speed of translation. Therefore, I’d estimate that it only takes about 40 seconds to make one protein from a gene.

    Keep in mind that this estimate involves several assumptions! For instance, we’re assuming that proteins fold immediately, even though some take several minutes to adopt their final structure. We’re also assuming that transcription begins immediately, even though the cell may have to wait several seconds for the correct enzyme to latch onto the correct gene. Many biological processes are limited by diffusion — the time it takes for molecules to encounter each other — and this is an issue I’ll return to in future estimates.

    In any case, the goal of order-of-magnitude estimates is to get within a factor of ten of the underlying reality. It’s okay to round numbers up or down, or to factor in some of your assumptions, to make your final estimate. You’ll develop an intuition for how to do this effectively over time. But for this question, I think it’s safe to say that it takes “a minute or so” to make a protein from a gene in E. coli.
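
    The arithmetic above can be collected into a short script. Below is a minimal sketch in Python that uses the essay’s numbers; the variable names are my own:

```python
# Order-of-magnitude estimate: time to make one protein in E. coli.
# Inputs follow the essay's BioNumbers lookups; treat them as rough values.

gene_length_nt = 1_000        # average E. coli gene, in nucleotides
protein_length_aa = 330       # the encoded protein, in amino acids

transcription_rate = 40       # nucleotides per second, at 37 degrees C
translation_rate = 8          # amino acids per second

transcription_time = gene_length_nt / transcription_rate   # 25.0 seconds
translation_time = protein_length_aa / translation_rate    # 41.25 seconds

# If the steps ran in sequence (as in a cell with a nucleus):
sequential = transcription_time + translation_time

# In E. coli, translation begins while transcription is still underway,
# so the slower step (translation) sets the overall pace.
coupled = max(transcription_time, translation_time)

print(f"sequential: {sequential:.0f} s, coupled: about {coupled:.0f} s")
```

    Swapping in different inputs (reported translation rates for fast-growing E. coli can run closer to 20 amino acids per second) shifts the answer, but keeps it within the same order of magnitude.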

  • We Need Biotech Data

    In 2011, while working in Brazil, Max Roser began formulating the idea for Our World in Data. He initially planned to publish “data and research on global change,” possibly as a book. Before long, that modest blueprint morphed into something far more ambitious.

    Our World in Data went live in May 2014 and, according to Roser, attracted an average of 20,000 visitors per month in its first six months. Today, the website has a worldwide audience. It’s difficult to get exact metrics, but they have more than 300,000 followers on Twitter alone. I’d argue that their true value, though, is not in their “audience reach,” but rather in their global impact.

    By publishing numbers and charts about global change on the internet, Our World in Data plays a key role in finding aspects of global development — malaria cases over time, for example — that are particularly stubborn and, therefore, ripe for philanthropic or government interventions. In essence, they have shown how numbers, displayed in accessible forms, can illuminate which issues deserve urgent attention and where efforts can accelerate progress.

    We should build a similar initiative for biotechnology. The Schmidt Foundation has forecasted that the bioeconomy (encompassing everything from medicines to microbe-made materials) “could be a $30 trillion global industry.” If we intend to realize that potential, we first need to benchmark where biotechnology has been, assess where it stands now, and identify the most pressing challenges ahead.

    “If I’m just watching the news, I’m going to find it very difficult to get an all-things-considered sense of how humanity is doing,” researcher Fin Moorhouse has written. “I’d love to be able to visit a single site which shows me — in as close as possible to a single glance — some key overall indicators of how the world’s holding up.” Biotechnology deserves precisely this kind of concentrated, data-driven resource.

    More specifically, I’m imagining a website that aggregates information on everything from the computational costs of protein design to the efficiency of gene-editing tools across cell lines. Such a resource would help researchers, investors, and policymakers figure out which areas demand attention and which breakthroughs are worth scaling, all while helping prevent misuse.

    Pieces of this puzzle already exist, but only in scattered or ad-hoc formats. Rob Carlson, managing director of Planetary Technologies, has famously published data on DNA sequencing and synthesis costs. His charts became so popular that people eventually dubbed them “Carlson Curves.” Meanwhile, Epoch AI, a research institute that monitors the computational demands and scaling of AI models, is building the benchmarks and datasets needed to track the AI field’s progress. They could serve as a model for this biotechnology effort.

    A dedicated nonprofit research institute for “Biotech Data” could systematically track metrics such as:

    • Cloning times over the last several decades. How long does it take to synthesize DNA, stitch it together, and make sure everything works as intended? Bottlenecks in cloning slow science as a whole, because the speed of experiments is a key driver of overall progress.
    • CRISPR off-target scores over time. How frequently do gene-editing tools make unintended cuts in the genome, and how can we standardize measurements across studies? We’ll need to make some benchmarks.
    • Resolution and speed of cryo-EM. How rapidly has cryo-electron microscopy improved, both in terms of resolution and throughput?
    • Antibody manufacturing titers over time. Using a single antibody as reference, what titers are companies achieving in CHO (in g/L) or other cell types over time?
    • Bioscience PhDs awarded per year. How many new doctorates emerge from academia, and where do they end up across industry, startups, and research labs?

    Note that these datasets span both technical and societal issues. This is deliberate; to scale biotechnology, we have to understand both scientific breakthroughs and the workforce dynamics behind them. Tools are useless without a workforce to wield them. Many of these numbers already exist on the internet, but are buried in unwieldy government PDFs or tucked away in a patchwork of scientific articles. Others may require painstaking curation by combing through decades of research articles.

    Starting this nonprofit wouldn’t be too difficult. You could begin by collecting one dataset, transforming it into a chart, and posting it online. People on Twitter and LinkedIn seem to really love data visualizations, so you could probably grow an audience quickly. Over time, you might build automated scraping tools for government websites, create reusable templates to make charts quickly, and even publish short blog posts about various charts (like why, exactly, cryo-EM resolution got so good, and what the key innovations were).

    If this vision appeals to you, send me an email (niko@asimov.com), and I’ll help you get started. We briefly considered launching this venture at Asimov Press, but we only have two full-time employees and so don’t have the bandwidth. We might be keen to fund this project.

  • Biotech Needs a Hydrogen Atom

    The hydrogen atom revolutionized physics.

    Throughout the 20th century, physicists used this atom to develop a quantum theory of matter. By using the same atom from one experiment to the next, physicists were able to compare results and reconcile their findings. Hydrogen is the foundation on which physics built its cathedral.

    Biotechnology needs its own hydrogen atom.

    A zoologist and a protein engineer both call themselves biologists, but otherwise share little in common. Biology is broad and multi-faceted. Even in narrowly focused fields—such as Alzheimer’s or cell death—disagreements abound. Every scientist pursues their own ideas using slightly different methods and cell strains. Papers are promptly locked behind paywalls, and negative findings are rarely published at all.

    This is not a good way to build scientific cathedrals. Biotechnology promises to do so much for our world, and yet I fear I’ll never see many of its goods in my lifetime, simply because of the scattershot way in which we work. Biotechnology can learn from physics and build its own cathedral.

    Imagine a biological singularity, of sorts, in which one could design any molecule, or any cell, for any purpose. If biotechnology transcended its era of trial-and-error and billion-dollar development timelines, and instead could be used to design safe solutions to problems at will, most diseases would have a cure. Materials would be grown from layers of engineered cells and plants would fix their own nitrogen. Abundance.

    If this sounds like overzealous optimism, well, that’s because it is. But these achievements are not impossible. Cells are made from molecules, which are made from atoms, which can be understood. Nothing in this quest flies against the laws of physics. This century should be devoted to the mapping, quantification, and deep understanding of how life works, such that we can begin to reliably design living organisms to do more good in the world. We’re already seeing this with protein design; in the future, we may see it with cell design.

    But first, biotechnology will need to find its own hydrogen atom, a foundation on which to build tools and knowledge that can later be applied more broadly. I’d like to propose Mycoplasma genitalium, an organism with perhaps the smallest genome of any free-living thing. We’ve already made great progress in understanding this “simple” cell, but there is more to be done.

    In 2006, the J. Craig Venter Institute reported that only 382 genes in M. genitalium are essential. A whole-cell model of this organism’s life cycle followed in 2012. But even now, dozens of genes in M. genitalium have unknown functions. We don’t fully understand how its molecules interact to carry out behaviors, and most of its proteins have unknown structures. There are also mysteries in the ways that these cells communicate and draw resources from their environment.

    We should build an institute that is wholly devoted to understanding a single type of cell, be it M. genitalium or another, at a depth complete enough that its entire life and all its functions can be simulated on a computer. Achieving that simulation would first require that we build technologies to study life at high spatial and temporal resolutions, for one cell or populations of interacting cells, and then feed the collected data into predictive models that can later be applied more broadly. This institute would ideally operate as a non-profit and make all of these tools and models open-source.

    In this way, a single cell could provide a foundation for biotechnology’s future.

  • C57Bl6/J

    The first mouse emulation appeared in 2032; a rodent’s entire anatomy, and all of its cells—including the brain—perfectly recapitulated using computer hardware. In those early days, only a few organizations had sufficient computing power (and the necessary data files) to run the emulations, which depended on custom-designed NVIDIA chips. The military had enough compute to run seven emulations and the National Institutes of Health, or NIH, enough to run five. A thirteenth emulator was thought to exist, but no one knew for sure.

    The military’s emulators were commandeered by high-ranking officers at the Pentagon, Office of Naval Research, and CIA. The Pentagon sent a handful of chips to leading materials science laboratories, who worked tirelessly to dissect their atomic properties. The remaining emulators were mainly used to screen drugs that could make mice do various things of military interest—stay awake longer, move faster, grow larger muscles, etc. In 2035, a ProPublica investigation revealed that many of the in silico results had secretly been tested on prisoners in Guantanamo. Once the military felt that it had exhausted the emulators’ potential, they stored the chips and files somewhere in Fort Detrick. Not even the President knew exactly where.

    The NIH gave grant-making committees authority to dole out access to the mouse emulators. The committees announced a series of grants, but ultimately awarded them to close friends at various academic institutions. One emulator went to a consortium at Harvard, a second to MIT, and the others to academics at Stanford, Johns Hopkins, and the University of Utah. In exchange, the academics agreed to list all the NIH committee members as authors on all future papers in perpetuity. As h-indexes swelled to the hundreds, then thousands, they soon ceased to be relevant at all.

    These academic emulators were used to churn out biomedical research papers: about 50 per day. Every experiment that could possibly be run on mice—every possible gene deletion, or even combinations of deletions, and every battery of physiological tests—was modeled and executed in silico. Soon, every problem under the sun had been solved in mice: aging, eyesight, diabetes, cancer, you name it. The researchers spun up companies and chaired important committees. They sat on the boards of pharmaceutical companies and began to apply their findings to people. The F.D.A. agreed to remove some pre-clinical testing requirements, such that 11 academics were soon involved in 7,100 clinical trials that had collectively enrolled 2.3 million people.

    Rumors of the thirteenth emulator percolated around the Internet, but nobody knew for sure whether it was real. People in the r/biotech subreddit speculated that a disgruntled NVIDIA employee had quietly slipped away with a few chips and the data files, and was planning to sell them to a wealthy individual—perhaps Musk or Altman. So everyone was surprised when, in late 2032, a Reddit user by the name of Hitchhiker42 (later revealed to be a student living in Berkeley, California) uploaded all of the files and chip designs, for free, onto a public server. Hitchhiker42’s post began: “I think I found a bug in this emulator…”

  • Central Dogma in 7 Experiments

    Introduction

    In the days before DNA sequencing, high-powered microscopes, and molecular biology textbooks, decoding the finer workings of a living cell often required arduous experiments and clever speculation.

    ‍The history of molecular biology is rife with eccentric scientists who drummed up creative experiments to study unseen molecules, and then used deductive reasoning to piece a larger puzzle together. Mapping the Central Dogma is their crowning achievement.

    The Central Dogma was first described by Francis Crick, the Cambridge scientist who solved DNA’s structure with James Watson, based on x-ray images obtained by Rosalind Franklin. In 1958, Crick wrote that once genetic information has passed into protein, “it cannot get out again.”

    Although students typically learn the Central Dogma as something like DNA → RNA → protein, or “DNA is transcribed to RNA which is translated to protein,” this is not what Crick originally said. There are also exceptions to the oft-mentioned DNA→RNA→protein depiction; RNA is reverse transcribed into DNA, for example, and prions are protein aggregates that replicate themselves. Crick later wrote in Nature that he regretted calling his idea the ‘Central Dogma,’ both because the idea itself was speculative and because he had misunderstood the definition of the word “dogma.”

    ‍Still, the way that cells read instructions encoded in DNA to create all the proteins necessary for life is the cornerstone of modern molecular biology. The scientists who cracked this code were often brilliant thinkers, and their experiments ought to be an inspiration for future genetic designers hoping to make discoveries in areas where we are currently most blind.

    ‍In this essay, we highlight 7 experiments that elucidated the Central Dogma and information processing in cells. These experiments include those that first isolated the intermediate molecule between DNA and proteins, called messenger RNA, cracked the genetic code, and solved the basic mechanism for DNA replication in living cells.

    ‍Experiments described in this essay are important, but not exhaustive. Biological knowledge is built up, slowly, by the collective efforts of hundreds of scientists. Only a book like The Eighth Day of Creation, by Horace Judson, could even begin to do justice to the rich and beautiful history of molecular biology. This essay focuses on a few important years, and is inspired by The Generalist’s article on the history of AI.

    THE STRUCTURE OF DNA (1953)

    Friedrich Miescher, a Swiss chemist, was the first person to isolate DNA. In 1869, he collected pus-covered bandages from patients at a university hospital and extracted a sticky substance from them. Miescher called this substance nuclein.

    For decades after, most biologists believed that Miescher’s discovery was little more than a quaint curiosity. Early molecular biologists (the term was coined in 1938) thought that proteins, rather than DNA, were the genetic material of living cells. Proteins are built from many different amino acids, and appear in all kinds of different shapes and sizes. This made them seem like the likelier option for genetic material.

    ‍By 1944, though, this view began to crumble when three scientists at the Rockefeller Institute in New York City, named Oswald Avery, Colin MacLeod, and Maclyn McCarty did an experiment to identify the molecule responsible for carrying genetic information. Their results pointed to DNA.

    The trio isolated an extract from a virulent strain of Streptococcus pneumoniae and treated portions of it with enzymes that destroy either proteins or DNA. Each treated extract was then added to a harmless strain of bacteria, and the scientists waited to see whether the cells would turn virulent.

    When the protein-destroying enzymes were used, the harmless bacteria still adopted the traits of the virulent strain; but when the DNA was destroyed, transformation failed. These results suggested that DNA, and not proteins, was the carrier of hereditary information.

    A few miles north, at Columbia University, a biochemist named Erwin Chargaff read the Avery-MacLeod-McCarty paper and was “deeply moved by the sudden appearance of a giant bridge between chemistry and genetics,” as he later wrote. Chargaff had an academic background in molecular chemistry. He realized that, if DNA was indeed the genetic material, then perhaps a chemist could dissect how it differs across organisms and thus explain the rich diversity of the natural world.

    Chargaff’s team spent several years chewing up DNA molecules, separating the individual nucleotides by paper chromatography, and measuring each one with a UV spectrophotometer. They repeated this for DNA harvested from yeast, bacteria, beef spleens, and calf thymus. By 1949, Chargaff had cracked a basic principle of the DNA code:

    “The desoxypentose nucleic acids from animal and microbial cells contain varying proportions of the same four nitrogenous constituents, namely adenine, guanine, cytosine, thymine…Their composition appears to be characteristic of the species, but not of the tissue, from which they are derived.”

    ‍In other words, Chargaff correctly determined that every organism on Earth uses DNA molecules that are made from the same four letters. Genetic material only differs, from one species to the next, by the order in which the four nucleotides appear. Chargaff also noted that “the molar ratios of total purines to total pyrimidines, and also of adenine to thymine and of guanine to cytosine, were not far from 1.” Said another way, the amount of ‘A’ in DNA is always equal to the total amount of ‘T’. Ditto for ‘G’ and ‘C’ nucleotides.

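    Chargaff’s parity rule is simple enough to state in code. The sketch below checks it against an invented base composition; the counts are illustrative, not Chargaff’s actual measurements:

```python
# Check Chargaff's parity rule: in double-stranded DNA, the amount of A
# roughly equals T, and G roughly equals C. The counts below are invented
# for illustration; they are not Chargaff's data.

def chargaff_ratios(counts):
    """Return the A/T and G/C ratios for a dict of base counts."""
    return counts["A"] / counts["T"], counts["G"] / counts["C"]

composition = {"A": 3020, "T": 2980, "G": 1990, "C": 2010}

at_ratio, gc_ratio = chargaff_ratios(composition)
print(f"A/T = {at_ratio:.2f}, G/C = {gc_ratio:.2f}")  # both "not far from 1"
```

    A composition that badly violates either ratio would suggest a single-stranded nucleic acid, which is one way the rule is still put to use today.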
    Chargaff shared his results in a lecture at Cambridge University in 1952. Watson and Crick were in attendance. The following year, using x-ray diffraction images first obtained by Rosalind Franklin at King’s College London, and perhaps also Chargaff’s observations, Watson and Crick assembled a biophysically accurate model of DNA. Their model was made from crude metal sheets, but clearly depicted a right-handed double helix in which ‘A’ connects to ‘T’, and ‘G’ connects to ‘C’. The model was published in Nature on 25 April 1953.

    Recent revelations have revised Rosalind Franklin’s role in solving DNA’s structure. In the classic telling of this tale, Franklin is “portrayed as a brilliant scientist, but one who was ultimately unable to decipher what her own data were telling her about DNA,” according to an article by Matthew Cobb & Nathaniel Comfort in Nature. “She supposedly sat on the image for months without realizing its significance, only for Watson to understand it at a glance.”

    ‍But this tale is not accurate. Newly unearthed documents, including a shelved article that Franklin wrote with Crick and Watson for Time magazine in 1953, now suggest that “Franklin did not fail to grasp the structure of DNA. She was an equal contributor to solving it.”

    DNA REPLICATION (1958)

    Watson and Crick’s 1953 Nature paper concludes with one of the most famous passages in biology’s history:

    “It has not escaped our notice that the specific pairing we have postulated immediately suggests a possible copying mechanism for the genetic material.”

    ‍The Cambridge duo’s model correctly depicted DNA as a molecule composed from two interlocking strands, wherein ‘A’ always connects to ‘T’ and ‘G’ always connects to ‘C’. If the two strands were to unwind and detach from each other, Watson and Crick noted, it should be possible to recreate the original strand merely by pairing up each base in the separated strand with its appropriate nucleotide. This idea was called the semi-conservative model of replication.

    ‍Other eminent scientists attacked this idea. Max Delbrück was a renowned physicist at the California Institute of Technology who, together with Salvador Luria, had discovered that bacteria resist phage attacks via random mutations. He penned an article arguing that the semi-conservative model could not be correct because too much energy would be required to unwind the two DNA strands.

    ‍Delbrück favored a different model, called dispersive replication, in which small chunks of a DNA molecule are broken up, and then matching DNA sequences are synthesized directly in the broken regions to create an intact, double-stranded helix. A third group of scientists favored a conservative replication model, which theorized that the entire DNA molecule is somehow copied without unwinding whatsoever.

    ‍Thanks to a particularly innovative experiment devised by two young scientists at Caltech, named Matthew Meselson and Franklin Stahl, Watson and Crick’s semi-conservative model was ultimately vindicated.

    ‍It would be relatively simple to figure out how DNA replicates if one could directly observe these molecules. But that was not possible in 1958. Instead, Meselson and Stahl devised a clever experiment, based on spinning molecules quickly in a centrifuge, to test the three models.

    Meselson and Stahl’s key insight was to tag DNA strands undergoing replication with heavy atoms, such as nitrogen-15 (N15), which carries an extra neutron. The scientists grew bacterial cells in a growth medium containing this heavy nitrogen, waited for the N15 to incorporate into all of the cells’ molecules, and then quickly transferred the ‘heavy’ microbes into growth media with normal nitrogen.

    As the DNA molecules replicated, Meselson and Stahl killed the cells and used a centrifuge to spin down the molecules. As the tubes spin, heavier DNA moves toward the bottom and lighter DNA stays closer to the top. Before the cells replicated their DNA, all of the DNA molecules contained heavy nitrogen. After one round of DNA replication, every DNA molecule was ‘hybrid,’ containing one heavy strand and one light strand (Meselson and Stahl saw a new, lighter ‘band’ appear in their centrifuged tubes). And after two rounds of DNA replication, half of the molecules were hybrid and half were fully light, exactly as the semi-conservative model predicted.

    ‍This experiment is renowned for its simplicity and clever approach – it is now called “the most beautiful experiment.” Delbrück was wrong; DNA replication occurs when the two interlocking strands unwind, and each strand is then used as a ‘template’ to remake a double helix.

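    A few lines of code can show why the banding pattern singles out the semi-conservative model. The simulation below is my own construction, not the authors’; it simply tracks strand labels across two rounds of replication:

```python
# Simulate semi-conservative replication, as in the Meselson-Stahl experiment.
# Each duplex unwinds, and every old strand is paired with a newly made
# "light" (N14) partner strand.

from collections import Counter

def replicate(population):
    """One round of semi-conservative replication for a list of duplexes."""
    offspring = []
    for strand_a, strand_b in population:
        offspring.append((strand_a, "light"))
        offspring.append((strand_b, "light"))
    return offspring

def band(duplex):
    """Classify a duplex by the density band it would occupy in the tube."""
    if duplex == ("heavy", "heavy"):
        return "heavy"
    if "heavy" in duplex:
        return "hybrid"
    return "light"

population = [("heavy", "heavy")]  # generation 0: fully heavy (N15) DNA

for generation in (1, 2):
    population = replicate(population)
    counts = Counter(band(duplex) for duplex in population)
    print(f"after round {generation}: {dict(counts)}")
```

    After round one, every molecule is hybrid; after round two, half are hybrid and half are light. The conservative model would instead predict a fully heavy band persisting alongside light DNA, and the dispersive model a single band growing gradually lighter.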
    THE CENTRAL DOGMA (1958)

    After publishing his 1953 Nature paper about DNA’s structure, Francis Crick toured the world to lecture on an idea that “permanently altered the logic of biology,” according to Horace Judson, author of The Eighth Day of Creation.

    ‍During his lectures, Crick would often draw a diagram on the auditorium’s blackboard. His diagram depicted how information flows through living cells; DNA is somehow converted into an intermediate molecule, which Crick called ‘template RNA’, that somehow encoded the amino acids in a protein molecule. Crick correctly predicted the basic details of protein synthesis years before direct experimental evidence had confirmed the existence of mRNA or tRNA.

    ‍In 1958, Crick adapted his lecture into a published article, called On Protein Synthesis. His target audience was “a general reader rather than the specialist.” The article gave two hypotheses to explain the relationship between DNA and proteins, called the Sequence Hypothesis and the Central Dogma.

    ‍“The direct evidence for both of them is negligible,” Crick wrote, “but I have found them to be of great help in getting to grips with these very complex problems.”

    The sequence hypothesis, in its simplest form, “assumes that the specificity of a piece of nucleic acid is expressed solely by the sequence of its bases, and that this sequence is a (simple) code for the amino acid sequence of a particular protein.” In other words, the bases in a strand of DNA or RNA correspond to the amino acids in a protein.

    The Central Dogma, Crick wrote, “states that once ‘information’ has passed into protein it cannot get out again.” Stated another way, “the transfer of information from nucleic acid to nucleic acid, or from nucleic acid to protein may be possible, but transfer from protein to protein, or from protein to nucleic acid is impossible.”

    ‍This passage marked the first time that the Central Dogma, the defining idea of molecular biology, had been published. But this is not why Crick’s article was so prescient.

    ‍In the article, Crick used scattered experimental evidence and anecdotal observations, including the fact that “spermatozoa contain no RNA,” to correctly predict that there must be a messenger RNA molecule in the cytoplasm that is produced by “the DNA of the nucleus.”

    ‍Crick’s astounding ability to theorize was most prominently displayed, though, when he correctly inferred the existence of tRNAs, predicted what they were made of, and explained how they likely became ‘charged’ with amino acids for protein synthesis.

    ‍Molecular biologists knew that proteins were made from 20 amino acids, but most other details of protein synthesis were a mystery. Today, we know that tRNA molecules get ‘loaded’ with the correct amino acid via the action of specific enzymes, and that this is how a message encoded in a strand of RNA is used by the ribosome to build a protein. But Crick had little evidence for any of this. And yet, in his 1958 paper, he wrote:

    “Granted that…[mRNA]…is the template, how does it direct the amino acids into the correct order? One’s first naive idea is that the RNA will take up a configuration capable of forming twenty different ‘cavities’, one for the side-chain of each of the twenty amino acids. If this were so, one might expect to be able to play the problem backwards – that is, to find the configuration of RNA by trying to form such cavities. All attempts to do this have failed, and on physical-chemical grounds the idea does not seem in the least plausible…

    Apart from the phosphate-sugar backbone, which we have assumed to be regular and perhaps linked to the structural protein of the particles, RNA presents mainly a sequence of sites where hydrogen bonding could occur. One would expect, therefore, that whatever went on to the template in a specific way did so by forming hydrogen bonds. It is therefore a natural hypothesis that the amino acid is carried to the template by an ‘adaptor’ molecule, and that the adaptor is the part which actually fits on to the RNA. In its simplest form one would require twenty adaptors, one for each amino acid.

    What sort of molecules such adaptors might be is anybody’s guess. They might, for example, be proteins…though personally I think that proteins, being rather large molecules, would take up too much space. They might be quite unsuspected molecules, such as amino sugars. But there is one possibility which seems inherently more likely than any other – that they contain nucleotides. This would enable them to join on to the RNA template by the same ‘pairing’ of bases as is found in DNA, or in polynucleotides.

    If the adaptors were small molecules one would imagine that a separate enzyme would be required to join each adaptor to its own amino acid and that the specificity required to distinguish between, say, leucine, isoleucine and valine would be provided by these enzyme molecules instead of by cavities in the RNA. Enzymes, being made of protein, can probably make such distinctions more easily than can nucleic acid.”

    ‍This paper is a tour-de-force of logical reasoning. It became the focal point, a rallying cry, for molecular biologists seeking to crack the genetic code and resolve the cell’s mysteries. Crick, fortunately, would not have to wait long for his ideas to be vindicated. A ‘template RNA,’ or messenger RNA as it’s now called, was discovered just three years later.

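    Crick’s two-part rule can itself be captured in a few lines. This sketch encodes only the 1958 statement quoted above; the function name and labels are my own:

```python
# The Central Dogma as a rule: information may pass between nucleic acids,
# or from nucleic acid to protein, but never back out of protein.

NUCLEIC_ACIDS = {"DNA", "RNA"}

def transfer_possible(source, target):
    """True if Crick's 1958 statement permits information flow source -> target."""
    if source == "protein":
        return False  # "once information has passed into protein it cannot get out"
    return source in NUCLEIC_ACIDS and (target in NUCLEIC_ACIDS or target == "protein")

print(transfer_possible("DNA", "RNA"))      # True: transcription
print(transfer_possible("RNA", "DNA"))      # True: reverse transcription
print(transfer_possible("protein", "RNA"))  # False: ruled out by the Dogma
```

    Note that the rule happily permits reverse transcription, one of the “exceptions” to the textbook DNA→RNA→protein arrow; what it forbids is only transfer out of protein.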
    ISOLATION OF MESSENGER RNA (1961)

    Messenger RNA was first isolated by two separate research groups in 1961. Their results appeared back-to-back in the 13 May issue of Nature.

    ‍At the Institut Pasteur in Paris, the French scientists François Jacob and Jacques Monod had discovered that the enzymes required to break down a sugar in bacterial cells were only made after cells were exposed to that sugar. In other words, cells somehow “process” an external cue and make proteins in response. This marked the discovery of genetic regulation, but also raised a slew of questions.

    ‍Among them: How does a cell know which genes to turn on at any given time? Why doesn’t the whole genome “turn on” at the same time? Several answers were proposed. Maybe there is a custom ribosome corresponding to each gene, some said. Or maybe, as Crick had proposed in 1958, there is an intermediate molecule – a “template RNA” – that transmits messages between DNA and proteins.

    ‍In 1960, two groups set out to isolate this mystery molecule. The first group rallied around Matthew Meselson’s laboratory at the California Institute of Technology, and included Sydney Brenner and François Jacob. A second group rallied around Wally Gilbert’s group at Harvard, and included James Watson and François Gros, a French biologist who had worked with Jacob.

    Both groups turned to a compelling experimental model: bacteriophages. When E. coli bacteria are infected with a phage, many scientists had noted, the cells stop making their own proteins and quickly switch over to making phage proteins. “This system thus provides an ideal model for observing the synthesis of new proteins following the introduction of specific DNA,” Gilbert’s team noted in their 1961 paper.

    ‍To isolate messenger RNA, the Caltech scientists grew bacteria in a growth medium with heavy isotopes, much like Meselson had done with Stahl several years earlier to validate the semi-conservative model of replication. These ‘heavy’ bacteria were then infected with phage and immediately transferred into a growth medium with light isotopes. Infected cells were finally lysed open at regular time points and spun down in Meselson’s ultracentrifuges.

    The bands that emerged from Meselson’s centrifuges confirmed a few things. First, bacterial cells did not make new ribosomes after they were infected. This observation was evidence against the idea that there is a unique ribosome for each gene. Second, the results confirmed that a new type of RNA molecule was swiftly made after phage infection, and that this new RNA quickly attached to existing ribosomes in the cell. This suggested that the DNA in phages was being quickly transcribed into messenger RNA. And third, the bacterial cells began to make phage proteins using their existing ribosomes.

    ‍They had discovered messenger RNA. There is an excellent, and much richer, account of this history by the scientific historian, Matthew Cobb.

    MAPPING A CODON (1961) #

    Crick’s 1958 paper made a series of predictions about messenger RNA, transfer RNAs, and how a code embedded in a DNA molecule could possibly encode a protein. But one longstanding question in molecular biology had to do with the nature of the genetic code itself. Namely, how do the nucleotides in a strand of RNA encode the amino acids in a protein? What does UAG mean, or GAA, or UUU, or any other codon, for that matter?

     *Nirenberg and Matthaei in the laboratory. Credit: NIH/Marshall W. Nirenberg.*

    ‍The first triplet codon to be mapped to an amino acid was ‘UUU’ to phenylalanine. This connection was made by two young researchers at the National Institutes of Health (NIH) in Bethesda, Maryland.

    ‍Heinrich Matthaei was a post-doctoral fellow working in the laboratory of Marshall Nirenberg, a new researcher at the Institutes. The two scientists were interested in the Central Dogma – they had read Crick’s paper – and aimed to understand the connection between RNA and proteins, often by running experiments on cell-free extracts, a liquid made by grinding up living cells in a mortar and pestle. This enabled the two scientists to study cell biochemistry without having to deal with living organisms.

    At 3 o’clock in the morning of May 27th, 1961, the two scientists took some of these ‘cell guts’ and added a few drops of synthetic RNA with the sequence:

    UUUUUUUUUUUUUUUUU

    Their concoction was next added to 20 different tubes, each of which held a different amino acid: valine, alanine, glutamine, and so on. One of the tubes contained phenylalanine amino acids that had been labeled with a radioactive isotope.

    ‍“The results were spectacular and simple at the same time,” according to a brief history from the NIH. “After an hour, the control tubes showed a background level of 70 counts, whereas the hot tube” – with the radioactive phenylalanine – “showed 38,000 counts per milligram of protein.”

    ‍In other words, when the synthetic RNA molecule was added to a tube of phenylalanine amino acids, the cell-free extract began to churn out radioactive peptides. This singular experiment suggested that the nucleotides UUU somehow encode phenylalanine during protein synthesis.

    ‍Over the next several years, Nirenberg and other researchers would go on to map all 64 codons, including the codon that signals the start of translation, AUG. Nirenberg shared the 1968 Nobel Prize in Physiology or Medicine.

    CRACKING THE GENETIC CODE (1961) #

    The year 1961 was molecular biology’s annus mirabilis. Messenger RNA was isolated for the first time and Nirenberg and Matthaei decoded the ‘meaning’ of the first codon – UUU. Even after those papers were published, though, mysteries remained. Among them: Is the genetic code overlapping or non-overlapping? And is it actually made from doublet, triplet, or quadruplet codons?

    ‍A messenger RNA sequence that reads ‘AUGACC’ could be read by the ribosome as ‘AUG’ and then ‘ACC,’ or it could be read by the ribosome as ‘AUG’, ‘UGA,’ ‘GAC’, ‘ACC’. The former is a non-overlapping code, and the latter is an overlapping code. Similarly, the code could be read as ‘AU’ and then ‘GA’ and then ‘CC’ if codons were doublets, or ‘AUGA’ and then ‘UGAC’ if they were quadruplets, and so on. Nirenberg and Matthaei’s experiment did not help to answer either of these questions, because their synthetic RNA had a repetitive sequence: UUUUUUUU.
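The two reading schemes are easy to make concrete in code. This is my own illustration, not something from the original papers; the helpers below simply slice a sequence in the ways described above.

```python
# How one RNA sequence yields different codon lists under different
# reading schemes (illustrative only; functions are my own).

def nonoverlapping(seq, k=3):
    """Consecutive, non-overlapping k-mers: the scheme cells actually use."""
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, k)]

def overlapping(seq, k=3):
    """Every k-mer, advancing one base at a time."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

print(nonoverlapping("AUGACC"))       # ['AUG', 'ACC']
print(overlapping("AUGACC"))          # ['AUG', 'UGA', 'GAC', 'ACC']
print(nonoverlapping("AUGACC", k=2))  # doublet reading: ['AU', 'GA', 'CC']
```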

    In the waning weeks of 1961, Sydney Brenner, Leslie Barnett, Francis Crick, and R.J. Watts-Tobin used fragmentary experimental evidence and thought experiments to conclude that each amino acid in a protein is encoded by a triplet code, and that the letters in this code do not overlap. Their ideas were published in a paper entitled, “General nature of the genetic code for proteins.”

    ‍Their experiments hinged on two things: A bacteriophage, called T4, that infects bacteria, and a particular type of dye, an acridine called proflavin, that precisely mutates DNA by adding or removing a single nucleotide.

    Crick, the ever-careful thinker, had a beautiful idea. He took some T4 bacteriophage and exposed it to proflavin, such that the phage lost its ability to make a particular protein. If Crick added one base and then removed one base, using the acridine, the phage was again able to make the protein. But if he used the acridine to add two bases, the phage did not make the protein. When three bases were added, the phage made the protein once more. From these observations, the scientists argued that the genetic code must use triplets to encode each amino acid.
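Crick's insertion logic can be sketched in a few lines. This is a toy model of my own (the sequence and the check are invented, not the actual T4 data): downstream codons are thrown out of frame unless the number of inserted bases is a multiple of three.

```python
# Toy frameshift demo (my construction, not Crick's actual data):
# insertions scramble downstream codons unless they come in threes.

def codons(seq):
    """Read a sequence as consecutive, non-overlapping triplets."""
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]

gene = "CATCATCATCATCAT"  # a repetitive stand-in for a real gene

# After n inserted bases, is the final codon still read as 'CAT'?
frame_ok = {n: codons("G" * n + gene)[-1] == "CAT" for n in range(4)}
print(frame_ok)  # {0: True, 1: False, 2: False, 3: True}
```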

    ‍Even though the “combination of mutations strongly suggested that the code was based on units of three bases, the experiments could not prove that to be the case – a code using groups of six bases was consistent with the results,” wrote Matthew Cobb in a 2021 history of this paper.

    ‍Today, we know that there are 64 codons in total, and that codons appear as ‘triplets’ to encode amino acids in a final protein chain. Codons made of six bases “would raise all sorts of problems,” as Cobb notes, “by massively increasing the number of either meaningless or degenerate sequences (there would be 4096 possible combinations of bases, rather than a mere 64).”
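Cobb's combinatorial point is simple arithmetic: with four bases, a triplet code allows 4³ combinations, while a six-base code would allow 4⁶.

```python
# Codon counting: 4 possible bases at each of k positions gives 4**k codons.
triplet_codons = 4 ** 3   # 64
sextet_codons = 4 ** 6    # 4096
print(triplet_codons, sextet_codons)  # 64 4096
```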

    ‍As Crick later said: This was “hardly likely to be taken seriously.”

    TRANSLATION VIA A SINGLE RIBOSOME (2008) #

    By 1961, the basic contours of the Central Dogma had been resolved. But that doesn’t mean work has since abated, nor that the years from 1953 to 1961 tell the whole story. Linus Pauling at Caltech predicted the main structural motifs of proteins as early as 1951. A ‘stop’ codon that halts protein synthesis was identified in 1965. The ribosome’s structure was solved in 2000, after decades of work that culminated in the 2009 Nobel Prize in Chemistry.

    ‍Today, synthetic biologists continue to expand the Central Dogma using technologies that Francis Crick, in 1958, could only have dreamed of. And yet, the molecular choreography that underlies the Central Dogma continues to surprise. There are far more enzymes and components involved than early molecular biologists ever could have realized. Transfer RNAs carry amino acids to the ribosome, proteins interact with the ribosome to push it off the RNA strand, and dozens of proteins are involved in transcription initiation, elongation, and termination in human cells.

    ‍Molecular biologists continue to resolve this complexity today. In a 2008 study, called “Following translation by single ribosomes one codon at a time,” chemists at the University of California, Berkeley studied individual ribosomes as they moved along a single messenger RNA molecule. Their experiment revealed the stochastic starts and stops of a ribosome during translation.

    For this experiment, each end of an mRNA molecule was attached to a polystyrene bead. One of the beads was then placed in a laser trap, holding it in place. The middle of the mRNA molecule contained a long loop, which slowly unwound as the ribosome traversed its length. As the mRNA molecule stretched out, this elongation could be tracked directly by measuring the distance between the two beads.

    ‍The chemists repeated this experiment several times, and measured the rate at which the mRNA molecule stretched out each time. Their key result was this: Ribosomes do not glide along the mRNA at a steady pace (which would stretch out the molecule in a linear fashion), but rather jump from one codon to the next in time steps of around 0.1 seconds. The ribosome occasionally pauses between jumps. Each ribosome, then, translates a strand of mRNA in a slightly different amount of time.
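A quick way to build intuition for this result is to simulate it. The sketch below is my own cartoon with made-up parameters (not the paper's data): each codon hop takes a random wait averaging ~0.1 seconds, with occasional longer pauses, so every simulated ribosome finishes in a different total time.

```python
# Cartoon of stochastic, codon-by-codon translation (parameters invented).
import random

def translation_time(n_codons=100, mean_step=0.1, pause_prob=0.05,
                     mean_pause=2.0, rng=None):
    """Total seconds for one simulated ribosome to traverse n_codons."""
    rng = rng or random.Random()
    t = 0.0
    for _ in range(n_codons):
        t += rng.expovariate(1.0 / mean_step)   # random wait before each hop
        if rng.random() < pause_prob:           # occasional long pause
            t += rng.expovariate(1.0 / mean_pause)
    return t

# Five "ribosomes" (seeds) traverse the same message in different times:
times = [translation_time(rng=random.Random(seed)) for seed in range(5)]
print([round(t, 1) for t in times])
```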

    ‍This experiment is one of thousands that have been applied to study the Central Dogma in the last two decades. Crick’s 1958 article continues to inspire generations of molecular biologists, who have found his ideas to be rich fodder for a lifetime of scientific work. We now know a shocking amount about transcription, translation, and the genetic code; bacteria add about eight amino acids to a protein each second, human cells add about five amino acids in the same length of time, and DNA is transcribed to RNA at a rate of about 40 nucleotides per second.
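Those rates make for easy back-of-envelope numbers. Using a 300-amino-acid protein as an example (the length is my choice, not from the essay):

```python
# Back-of-envelope timing from the rates quoted above.
protein_len = 300      # amino acids (example length, my assumption)
bact_rate = 8          # aa/s added by bacterial ribosomes
human_rate = 5         # aa/s added by human ribosomes
txn_rate = 40          # nucleotides/s for transcription

t_translate_bact = protein_len / bact_rate     # 300 / 8 = 37.5 s
t_translate_human = protein_len / human_rate   # 300 / 5 = 60.0 s
t_transcribe = protein_len * 3 / txn_rate      # 900 nt / 40 nt/s = 22.5 s
print(t_translate_bact, t_translate_human, t_transcribe)
```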

    CONCLUSION #

    The way in which cells process information is a biophysical marvel, one that scientists have slowly unraveled over the last 70 years. The Central Dogma, and the seven seminal experiments described in this essay, are the basis for most everything we do in genetic engineering. But there are still many instances in which we, as genetic designers, place a gene into a cell expecting one thing to happen, but observe something entirely unexpected instead. In other words, biology does not always behave as we expect.

    Though useful, the Central Dogma is an incomplete way to think of a living cell. DNA is not always transcribed to RNA, and RNA is not always translated into protein. Sometimes RNA goes back to DNA. The only rule in biology is that there are exceptions to every rule. In future posts, we’ll continue our exploration of the Central Dogma and explain many of these exceptions.

    ***

    Contributors: Ben Gordon and Alec Nielsen. Words by Niko McCarty.

  • Think of the Eggs

    When people think of “biotech” — myself included — they tend to picture GLP-1s and gene therapies. But biotech is much broader than just medicine; it’s also pushing forward a renaissance in the egg industry.

    Eggs aren’t usually top of mind for me. I toss a carton in my grocery cart now and then, but rarely think about how those eggs landed on the shelf in the first place. Perhaps I should. Every year, the global egg industry kills around six billion male chicks shortly after they hatch. Why? Because male birds, bred from “layer” lines, don’t make eggs and don’t pack on enough meat to be profitable. Hence, they’re thrown into a blender.

    Fortunately, scientists have figured out how to determine a chicken’s sex before it hatches. These technologies are called in ovo sexing. Using hyperspectral cameras or PCR, they can be used to figure out which eggs will hatch male vs. female. With widespread adoption, in ovo sexing could spare billions of chicks from the blender. Alas, these technologies weren’t available at all in the U.S. … until last month. Hardly anyone in the mainstream biotech community seems to know about what’s going on in this sector but, in my view, it’s among the most underrated and important stories of today.

    In ovo sexing has been available in Europe for years. Germany banned chick culling in 2022. In response, hatcheries were initially forced to keep male chicks alive and raise them for meat — “a practice that was costly and unsustainable,” according to Innovate Animal Ag. (Again, so-called “layer” chickens just don’t produce much meat. Broiler chickens, on the other hand, are specially bred to grow quickly; they “can grow to be over four times the weight of a natural chicken in only 6-7 weeks,” according to an article in Asimov Press.)

    Sensing an opportunity, companies launched in ovo sexing technologies in Europe so hatcheries could screen out male eggs before they hatched. If eggs are destroyed by day 12 of development, the embryo feels no pain. Thanks to this shift, about 78.4 million of Europe’s 389 million hens — or about 20 percent — came from in ovo sexed eggs last year, according to data from Innovate Animal Ag.

    But only two in ovo sexing methods have reached commercial scale so far. As Robert Yaman, CEO of Innovate Animal Ag, previously wrote for Asimov Press:

    The first of these approaches utilizes imaging technologies like MRI or hyperspectral imaging to look “through” the shell of the egg to determine the sex of the embryo inside. The second approach involves taking a small fluid sample from inside the egg, and then running PCR to identify the sex chromosomes, or using mass spectrometry to locate a sex-specific hormone…

    …Other approaches are in development and have not yet been commercially deployed. Some technologies can “smell” a chick’s sex by analyzing volatile compounds excreted through the eggshell. Another approach uses gene editing so that male eggs have a genetic marker that allows their development to be halted by a simple trigger, such as a blue light. Unlike humans, the sex of a chicken is determined by the chromosomal contribution of its mother. By only modifying the sex chromosome of the female parent line that yields male chicks, the female chicks end up without the gene edit. This means that the eggs they lay do not need to be labeled as “gene-edited” for consumers.

    As Europe rolls out these technologies, most American consumers still have no idea that chick culling is even a thing. In one poll, only 11 percent of Americans knew about chick culling; once informed, a majority opposed it. Fortunately, in ovo sexing technologies have finally arrived in the U.S.

    Three U.S. egg companies — Egg Innovations, Kipster, and NestFresh — have announced plans to adopt in ovo sexing technology. In late 2024, Agri-Advanced Technologies also rolled out a machine called “Cheggy” to hatcheries in Iowa and Texas. Cheggy can scan 25,000 eggs per hour and figure out the sex of embryos inside using hyperspectral imaging. The machine is able to “see” the color of down feathers forming beneath the shell. (Brown-egg chicken breeds typically have differently colored feathers for males and females, but this doesn’t work on white eggs.) Hyperspectral imaging is great because it’s non-invasive; the eggs don’t need to be cracked or poked at all. If the machine detects a female embryo, it sends it back to the incubator. Male eggs are destroyed and turned into protein for pet food.

    Also, in December, Respeggt announced that by February 2025, it will roll out its own in ovo sexing tech at a massive Nebraska hatchery, with a capacity to serve 10 percent of the entire U.S. layer market. Respeggt’s technology relies on PCR, so it works for both white and brown eggs.

     *Respeggt’s technology uses a laser to puncture eggs and retrieve a small amount of liquid to run PCR.*

    In Europe, in-ovo-sexed eggs cost only about one to three euro cents more each. That’s a tiny bump, and I’d gladly pay extra just for the peace of mind of knowing that farmers didn’t have to kill any male chicks to produce them. But I am not most consumers; eggs are one of the most price-sensitive grocery items. When people talk about inflation, they usually talk about the price of bread, milk, and eggs!

    Fortunately, a Nielsen survey found that 71 percent of American egg buyers say they’d pay more for in-ovo-sexed eggs. We’ll see what happens, though, as these eggs get rolled out to grocery stores (likely by mid-2025). Consumer reactions will be super important here because the U.S. government doesn’t mandate whether or not hatcheries kill baby chicks. The survival of these technologies will literally be determined by whether or not people buy the eggs.

    Finally, I just want to say that few (if any) people have been pushing for this harder than Innovate Animal Ag. They didn’t pay me to say that, either; they don’t even know I’m writing this article! But they’re the ones dropping all these reports and data about chick culling, commissioning surveys to figure out price points, and pushing for new certifications to coax consumer buy-in.

    So yeah, we often celebrate biotech’s potential — gene editing, advanced vaccines, cultivated meat — but in ovo sexing is already improving the egg industry at scale. It flies under the radar, but at least now you know the story.

  • Estimating the Size of a Single Molecule

    Many decades before the discovery of x-rays and the invention of powerful microscopes, Lord Rayleigh calculated the size of a single molecule. And he did it, remarkably, using little more than oil, water, and a pen. His inspiration was none other than Benjamin Franklin.

    Sometime around 1770, while visiting London, Franklin became intrigued by a phenomenon he had observed during his transatlantic voyage. Specifically, he noticed that when ships discarded greasy slops into the ocean, the surrounding waves would calm. This ancient practice of oiling the seas to pacify turbulent waters was known to the Babylonians and Romans, but Franklin decided to investigate further.

    On a windy day in London, he walked to a pond on Clapham Common. Carrying a small quantity of oil — “not more than a Tea Spoonful,” according to his diary — Franklin poured it onto the agitated water. The oil spread rapidly across the surface, covering “perhaps half an Acre” of the pond and rendering its waters “as smooth as a Looking Glass.” Franklin documented his observations in detail; they can be read today on the Clapham Society’s website.

    Franklin’s oil drop experiment, of course, was just one in a long line of his “amateur” science experiments. He was also the first to demonstrate that lightning is electrical in nature (via his famous kite experiments), and he charted the Gulf Stream’s course across the Atlantic ocean, noting that ships traveling from America to England sailed quicker than those going the opposite direction. His experiments at Clapham Common are not nearly as well-known.

    But Franklin was a careful experimenter, repeating his oil drop experiment multiple times and taking notes each time. In his journal, he opined on how much oil might be needed to calm various areas of ocean (he was thinking specifically about applications for the Royal Navy) but never grasped the molecular implications of his experiments. It wasn’t until more than a century later that Lord Rayleigh, whose real name was John William Strutt, revisited Franklin’s experiment with a brilliant new perspective.

    An academic at the University of Cambridge and a baron by title, Rayleigh was renowned for his work in physics. The Rayleigh number, a dimensionless parameter that predicts the onset of convection in fluids, is named for him; as is Rayleigh scattering, which explains how photons scatter in the atmosphere and color the sky blue. Rayleigh also discovered the noble gas argon, earning a Nobel Prize for it in 1904.

    But a little experiment that Rayleigh performed in 1890, inspired directly by Franklin’s observations, is not nearly as well-known.

    Rayleigh carefully measured a tiny volume of olive oil — 0.81 milligrams, to be exact — and placed it onto the surface of water. The oil quickly spread out, and Rayleigh precisely measured the area it covered. And then he did something that Franklin never thought of: Rayleigh divided the volume of the oil by the area it covered, thus estimating the thickness of the oil film. Assuming that the oil formed a single layer of molecules — a monolayer — the thickness of the film is the same as the length of one oil molecule.

    This is how Lord Rayleigh became the first person to figure out a single molecule’s dimensions, many years before anyone could see such molecules.

    Rayleigh’s final result was 1.63 nanometers. Olive oil is mainly composed of fat molecules called triacylglycerols, and modern measurements show that they measure about 1.67 nanometers in length, thus implying that Rayleigh’s “primitive” estimates were off by just 2 percent. His original paper detailing the experiment can be found here.
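It's worth redoing Rayleigh's division. The essay gives the mass (0.81 mg) and the answer (1.63 nm); the oil density and spread area below are my assumptions, chosen to be historically plausible.

```python
# Rayleigh's estimate: film thickness = oil volume / covered area.
mass_g = 0.81e-3     # grams of olive oil (from the essay)
density = 0.9        # g/cm^3, typical for olive oil (my assumption)
area_cm2 = 0.55e4    # ~0.55 m^2 of water surface (my assumption)

volume_cm3 = mass_g / density
thickness_nm = (volume_cm3 / area_cm2) * 1e7   # 1 cm = 1e7 nm
print(round(thickness_nm, 2))  # ~1.64 nm, close to Rayleigh's 1.63 nm
```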

    I love this story because it shows, at least anecdotally, how deep scientific insights can emerge from the simplest of experiments. It’s a testament to the idea that you don’t always need sophisticated equipment to unlock the secrets of nature — sometimes, all it takes is a drop of oil and a bit of ingenuity.

    For those interested in delving deeper into the history of these oil drop experiments, Charles Tanford’s book, Ben Franklin Stilled the Waves, offers a much deeper exploration.

  • Microbial Lenses

    There’s a new paper out in PNAS that hints at some intriguing synthetic biology applications. Researchers at the University of Rochester introduced a sea sponge gene into Escherichia coli, giving the bacteria a translucent, silica-based coating. This biosilica shell transforms the cells into tiny microlenses that focus beams of light.

    Here’s an excerpt from the paper (paywalled):

    Remarkably, the polysilicate-encapsulated bacteria focus light into intense nanojets that shine nearly an order of magnitude brighter than unmodified bacteria. Polysilicate-encapsulated bacteria remain metabolically active for up to four months, potentially enabling them to sense and respond to stimuli over time. Our data show that synthetic biology can produce inexpensive and durable photonic components with unique optical properties.

    Typically, microlenses are just tiny spheres, a few micrometers across, fabricated in cleanrooms with harsh chemicals. They appear in photodetectors and camera sensor arrays. Engineered microbes can’t match the precision of these fabricated microlenses, but they offer a major advantage: you can make them at room temperature and neutral pH in a flask of liquid. (And the cells reproduce themselves “for free”!)

    Notably, lifeforms evolved primitive microlenses long before this paper. Cyanobacteria focus incoming light on their cell membranes to locate the sun’s position; they’re probably the world’s smallest and oldest camera eyes. Other cells, like yeast and red blood cells, also naturally behave as microlenses.

    What’s new about this paper is that the silica coating majorly improves the cells’ ability to focus light. More importantly, the work shows that we can tune a living organism’s optical properties through genetic engineering.

    The researchers took silicatein, an enzyme from sea sponges, and fused it to OmpA, an outer-membrane protein that allows molecules to flow in and out of the cell. Silicatein grabs silicon-containing molecules from the environment and stitches them into silica polymers; sea sponges use it to build “bioglass” structures. When fused, OmpA embeds into the cell membrane and holds silicatein outward, like a fishing hook.

    By flooding the engineered cells with orthosilicate (a silicon-containing molecule), the silicatein “hooks” grab it and stitch together a silica shell around the entire cell. The researchers confirmed this with confocal imaging and a dye that binds specifically to silica. The engineered cells ended up surrounded by dye, while normal cells remained unstained.

     *Rho123, a dye, stains silica. Cells were engineered to express silicatein enzyme from two different microbes (hence column A and B), and were compared to wildtype. From Sidor et al.*

    This silica shell significantly changes the cells’ optical properties. To visualize this, the researchers built a custom microscope that can shine light on cells from any imaginable angle relative to the vertical axis. Uncoated cells scattered some light but didn’t create a distinct focal spot beyond their surface. In contrast, silica-encapsulated microbes produced light beams that stretched for several microns, with peak intensities nearly an order of magnitude higher than wildtype cells.

    I would have guessed this treatment might kill the cells — either because the silica shell blocks nutrients or because photons would roast them — but it doesn’t. Engineered cells continued scattering and focusing light even months after switching on the fusion protein. The only downside is that the cells grow more slowly, if at all.

    What could we do with these living lenses?

    My first step would be to engineer cells of different shapes and dimensions. A typical E. coli measures about two microns long and one micron wide. What if we engineered more spherical cells? Or longer cells? We could create a series of living microlenses, each with unique optical properties, by tuning the silicatein protein and adjusting the cells’ physical dimensions.

    (In the video below, researchers are blasting a stationary cell with light at angles ranging from -90° to 90°. There are some orientations where a nanojet appears, but it happens quickly.)

    From there, the applications depend on our imaginations. We might wire living bacteria into optical devices that don’t need batteries and last for months without a power supply. Or we could build medical devices. Instead of swallowing a pill camera powered by toxic batteries, perhaps we could engineer E. coli into a camera. I’m not sure. At this stage, it’s speculation.

    Practical limitations exist with current microlenses. As pixel sizes in camera sensor arrays shrink below two micrometers, placing microlenses becomes difficult. However, cells can “swim” to a specific destination and arrange themselves autonomously. In other words, arrays of bacteria could line up over a sensor — maybe using microfluidic channels — to focus and direct light into tiny pixels.

    Will any of these ideas actually happen? Probably not soon. Still, when a paper broadens our “design space” in biological engineering, it’s worth paying attention. One of my first questions, upon reading something like this, is usually: “Where else could this be applied, especially in unexpected ways?”

    Consider optogenetics: Ed Boyden and Karl Deisseroth discovered channelrhodopsins—light-responsive proteins—and imagined splicing them into neurons to control action potentials. That mental leap doesn’t seem so large in hindsight.

    Engineered gas vesicles, similarly, are being used to improve ultrasound resolution within the body, enabling scientists to image individual cells moving through the bloodstream. I’ve written about these structures before for Asimov Press. Mikhail Shapiro got the idea for engineering gas vesicles after reading “two short paragraphs” about photosynthetic algae!

    In other words, pay attention when a paper like this appears. It might plant the seeds for something exciting, even if we don’t recognize it immediately.

  • How to Minimize Cell Burden

    I. Molecular Burden

    Biochemistry textbooks often depict cells as spacious places, where molecules float in secluded harmony. But cells are dense and crowded, a bit like molecular burritos, according to Michael Elowitz, a biologist at Caltech.

    Roughly three to four million proteins jostle around inside a single E. coli bacterium, which has an internal volume 50 billion times smaller than a drop of water. A typical enzyme within this crowded cell collides with its substrate 500,000 times each second. When bioengineers manipulate life, they must also consider how their modifications will impact everything else within the cell, too—for everything in the cell is connected to everything else.
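These numbers are fun to sanity-check. Assuming an E. coli internal volume of about one cubic micron and a ~50-microliter water drop (both volumes are my assumptions, not from the text), the quoted figures hang together:

```python
# Sanity-checking the crowding figures (cell and drop volumes assumed).
N_A = 6.022e23        # Avogadro's number, molecules per mole
proteins = 3.5e6      # per cell, midpoint of the 3-4 million range
cell_vol_L = 1e-15    # ~1 cubic micron, in liters (my assumption)
drop_vol_L = 5e-5     # ~50 microliter drop of water (my assumption)

protein_mM = proteins / N_A / cell_vol_L * 1e3
vol_ratio = drop_vol_L / cell_vol_L
print(round(protein_mM, 1))   # ~5.8 mM of protein inside the cell
print(vol_ratio)              # 5e10, i.e. ~50 billion cell volumes per drop
```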

    In 2000, Elowitz published one of the first synthetic gene circuits—called the “repressilator”—with his mentor, Stanislas Leibler. A gene circuit is made from RNA or proteins that interact with one another, enabling cells to perform logical functions. The repressilator was crafted from just three genes, each of which encoded a protein that repressed another protein to form an inhibitory loop. One of these proteins was fused to a green fluorescent protein so that, as the protein levels rose and fell, the cells flashed green—on and off—in 150 minute intervals.
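The repressilator's logic is compact enough to sketch as three coupled differential equations. The parameters below are illustrative choices of mine, not Elowitz and Leibler's fitted values; a simple Euler integration of the three-repressor ring is enough to produce sustained oscillations.

```python
# Minimal repressilator sketch: three proteins, each repressing the next.
# (Illustrative parameters, my own; not the published model's values.)

def repressilator(steps=60000, dt=0.01, alpha=50.0, n=4.0, decay=1.0):
    x, y, z = 1.0, 0.5, 0.25        # asymmetric start kicks off oscillation
    trace = []
    for _ in range(steps):
        dx = alpha / (1 + z ** n) - decay * x   # z represses x
        dy = alpha / (1 + x ** n) - decay * y   # x represses y
        dz = alpha / (1 + y ** n) - decay * z   # y represses z
        x, y, z = x + dx * dt, y + dy * dt, z + dz * dt
        trace.append(x)
    return trace

trace = repressilator()
# Count local maxima of the first protein: repeated peaks mean oscillation.
peaks = sum(1 for i in range(1, len(trace) - 1)
            if trace[i - 1] < trace[i] > trace[i + 1])
print(peaks)
```

With only three repressors and a Hill coefficient of 4, the symmetric fixed point is unstable and the levels cycle indefinitely, just as the fluorescence did in the original cells.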

    As synthetic biology advanced, and its tools grew sharper, synthetic gene circuits swelled in size. In 2016, a paper in Science reported an engineered circuit made from 55 different sequences assembled into 11 genes; among the largest gene circuits yet assembled in a single cell. Building significantly larger synthetic gene circuits will require careful consideration of the finite resources available to cells.

    After all, cells are not empty vessels that have evolved to do our bidding. When we engineer an organism, coaxing it to make new proteins or molecules, we are imposing a molecular burden upon it. Typically, burdensome genes are defined as those that “impose a high enough energetic burden to be opposed by selection if they do not confer sufficient added benefits.” Any genes added to a cell must compete for cellular resources—energy, ribosomes, and RNA polymerases—that may diminish the cell’s ability to carry out other functions; to grow, metabolize, and divide.

    A recent study in Nature Communications measured the molecular burden imposed by 301 different plasmids; fewer than 20 percent of them caused E. coli cells to grow more slowly. But surprisingly, some of the most burdensome plasmids were also among the simplest: a plasmid encoding red fluorescent protein, and nothing more, caused a 44% reduction in growth rate.

    The study is intriguing, in part, because its dataset could provide insights into why some genes, once expressed, cause cells to grow more slowly. More importantly, though, this study reveals that there is still so much we don’t understand about biology, or toxicity, or how to ease molecular loads as we strive to engineer life in increasingly sophisticated ways.

    II. Competition

    Cells have finite resources. Insert a synthetic gene into a cell, and several things quickly happen.

    First, the gene is transcribed into RNA by an enzyme called RNA polymerase. Then, the RNA molecules are translated into protein via ribosomes, large protein-RNA complexes made from dozens of interlocking components. A typical E. coli cell contains about 3,000 RNA polymerase molecules and 30,000 ribosomes. Exogenous genes pull some of these enzymes away from other parts of the cell. And, for reasons that are not fully understood, cells burdened with recombinant DNA do not upregulate their production of RNA polymerase or ribosomes to compensate for the increased load, according to a 2020 study.

    Although the term “burden” typically refers to resource limitations—be they metabolic, transcriptional, or translational—it is often experimentally difficult to untangle from toxicity. A thorough investigation is often needed to tell whether a cell is growing slowly due to burden or toxicity, because the outcome—slow growth—is the same.

    Some proteins that are normally non-toxic also become toxic when expressed above a certain threshold. For a 2018 study, researchers expressed 29 different enzymes in yeast. All of the enzymes have well-known mechanisms and are non-toxic at normal levels. Some of the enzymes became toxic in the yeast, however, because they “aggregated together, they overloaded a transport system that [took] them to a specific cell compartment, or [they] produced too much catalytic activity.”

    A cell faced with excess burden or toxicity really only has one way out: to mutate and break the troublesome genes. A single milliliter of liquid culture holds as many as one billion E. coli cells. If just one of those cells mutates the burdensome genes and breaks their function, then that cell will grow more quickly than its neighbors. The mutated cell’s progeny will eventually take over the entire population. The more burdensome a genetic sequence, the more likely a mutant will appear and take over.

    Remember that Nature Communications study that I mentioned earlier? Well, the authors built a simple mathematical model to predict the correlation between different levels of burden and “population takeovers” when cells are grown in different sized containers. A plasmid causing more than a 30% reduction in growth rate, for example, is likely to result in a “mutant takeover” when the cells are grown in even a small container, such as a flask.
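To see why burden predicts takeover, here is a deliberately simple competition model of my own (not the authors' actual model): one unburdened mutant growing exponentially against a burdened population, with takeover defined as the mutant lineage outnumbering the rest.

```python
# Toy takeover model (my own construction, not the paper's model):
# exponential competition between one unburdened mutant and a
# burdened population.
import math

def takeover_hours(pop=1e9, burden=0.30, doubling_h=0.5):
    """Hours until one mutant's lineage outnumbers `pop` burdened cells.

    `burden` is the fractional growth-rate reduction the plasmid imposes;
    `doubling_h` is the unburdened doubling time in hours (assumed).
    """
    r_mut = math.log(2) / doubling_h        # mutant growth rate, per hour
    r_wt = r_mut * (1 - burden)             # burdened growth rate
    # Solve exp(r_mut * t) = pop * exp(r_wt * t) for t:
    return math.log(pop) / (r_mut - r_wt)

print(round(takeover_hours(burden=0.30), 1))  # ~50 hours at 30% burden
print(round(takeover_hours(burden=0.05), 1))  # ~6x longer at 5% burden
```

The model ignores when the mutant actually arises and treats growth as purely exponential, but it captures the qualitative point: heavier burden means a far faster sweep.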

    Collecting the data to build this model was straightforward. The authors placed each of the 301 different plasmids into E. coli cells, and then measured how much each plasmid slowed their growth rates. A plate reader measured the cloudiness of each population over time, a proxy for cell growth. The authors also measured growth rates for E. coli carrying one of five different plasmids that imposed known levels of burden. These controls were used to normalize growth rates between experiments.

    Of the 301 plasmids tested, just six caused cells to grow more than 30% slower than unaltered cells. A further 19 plasmids caused cells to grow more than 20% slower. In total, the authors found 59 plasmids that caused measurable changes to bacterial growth rates.

    Genes expressed from constitutive promoters (meaning they are always “on”) were 2.9 times more likely to be in the burdensome set of 59 plasmids. And plasmids containing a strong ribosome binding site (the part of an mRNA strand where ribosomes bind and kickstart translation) were 2.1 times as likely to slow E. coli growth, compared to plasmids that include weaker RBS variants.

    III. Build Bigger

    If this study’s results were distilled into a single sentence, I think it would be this:

    Genetic sequences inserted into a cell do not usually cause excess burden, but when they do, it is often for reasons we don’t fully understand.

    Why, for example, is a plasmid encoding red fluorescent protein so burdensome? Plasmids encoding YFP and GFP also caused 29.5% and 27.1% reductions in growth rate, respectively. A plasmid encoding a chloramphenicol antibiotic resistance gene—and nothing more—caused cells to grow 33.4% slower. Molecular mechanisms explaining these growth defects are often unclear, or completely absent.

    At Asimov, one of our primary applications involves engineering Chinese Hamster Ovary (CHO) cells to make therapeutic proteins, like monoclonal antibodies. These cells, originally derived from animals smuggled out of China in 1948, are used to make nearly 90% of all therapeutic proteins.

    In our hands, most therapeutic antibodies can be expressed well by optimizing the genetic design or bioreactor process. In many cases, we’ve engineered CHO cells to make more than 10 grams per liter of antibodies without causing any noticeable growth defects on the cells. But other times—and for reasons we don’t fully understand—engineering CHO cells to make certain therapeutic antibodies imposes huge burdens or toxicity. Debugging these cases is an interesting exercise on its own. The root cause is often mysterious, but in some cases we can detect hallmarks of endoplasmic reticulum (ER) stress, which suggests protein misfolding or aggregation in the cell.

    Fortunately, there are steps we can take to reduce molecular burden or toxicity.

    Codon optimization is one option. This is when scientists convert the DNA sequence from one organism into codons “preferred” by another organism, without altering the order of amino acids in the final protein. In the lab, we have tested various codon configurations to find those that slow down the ribosome’s movement, thus giving proteins more time to fold and reducing toxicity.
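The core operation is a straightforward substitution: translate each codon to its amino acid, then emit the host's preferred synonymous codon. Here is a minimal sketch with a hypothetical preferred-codon table covering only the codons in the toy sequence (real codon optimization draws on genome-wide codon-usage frequencies and many other constraints):

```python
# Hypothetical preferred-codon table for a host organism; real tables
# are built from genome-wide codon-usage statistics.
PREFERRED = {
    "M": "ATG", "K": "AAA", "F": "TTT", "L": "CTG", "S": "AGC", "*": "TAA",
}

# Standard genetic code, restricted to the codons in this toy example.
CODON_TO_AA = {
    "ATG": "M", "AAA": "K", "AAG": "K", "TTT": "F", "TTC": "F",
    "CTG": "L", "TTA": "L", "AGC": "S", "TCA": "S", "TAA": "*", "TGA": "*",
}

def codon_optimize(dna):
    """Swap each codon for the host-preferred synonym, leaving the
    encoded amino-acid sequence unchanged."""
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    return "".join(PREFERRED[CODON_TO_AA[c]] for c in codons)

original = "ATGAAGTTCTTACTGTCATGA"   # encodes M K F L L S *
optimized = codon_optimize(original)
print(optimized)
```

The optimized DNA sequence differs at several positions, yet translates to exactly the same protein; only the "spelling" changes, which is what lets scientists tune expression speed without touching the protein itself.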

    Another way we solve this problem is by balancing the expression of genes. Antibodies are made from two proteins—called the heavy chain and light chain—that come together to make a Y-shaped molecule. If one of these chains is expressed at a much lower level than the other, it can become rate-limiting in the formation of antibodies. At the same time, if the “excess” chain is the heavy chain, it can float around the cell and cause toxicity. Yet another way to reduce burden is to integrate genes directly into the host genome, rather than using multi-copy plasmids, such that only one copy of each gene exists and the genes don’t consume too many cellular resources.

    A more complicated approach is to engineer cells with incoherent feedforward loops, or IFFLs, to mitigate burden caused by gene expression. Such gene circuits are designed to dampen mRNA levels when a gene’s expression diminishes the cell’s ability to carry out other functions.
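To see why an IFFL dampens expression, consider a toy version in which the input (say, gene copy number) drives both the mRNA and a repressor that degrades that mRNA. At steady state, more input also means more repressor, so the output barely rises. A minimal sketch with illustrative, made-up parameters:

```python
def iffl_steady_state(u, a=1.0, b=1.0, d=0.1, k=0.05):
    """Steady-state mRNA level of a toy incoherent feedforward loop.

    The input u (e.g., gene copy number) drives both the mRNA
    (production rate a*u) and a repressor (production rate b*u) that
    degrades the mRNA at rate k*m*r; both species decay at rate d.
    All parameter values are illustrative assumptions.
    """
    r = b * u / d               # repressor steady state: production / decay
    return a * u / (d + k * r)  # mRNA steady state under repressor-driven decay

for copies in (1, 10, 100):
    print(f"input {copies:>3}: output = {iffl_steady_state(copies):.2f}")
```

In this toy model, a 100-fold increase in input raises the output by only about 20%; that insensitivity to dose is what lets IFFL circuits keep expression, and hence burden, in check when copy number or promoter strength drifts upward.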

    A balance must be struck, however. It is good to reduce burden, but not at the cost of antibody production. Molecular burden, toxicity, and economics are all valid things to consider.

    Most of these strategies are also akin to using Tylenol to treat a cold—we may get the outcome we’re after (less burden), but only because we don’t understand how to solve the problem at its core. It is only by peering deeper into living cells, and untangling their intricate complexities, that we begin to understand what goes wrong when we manipulate them.

    In this case, as in many others, greater basic science research will enable more sophisticated engineering.


    By Niko McCarty

    Thanks to Rachel Kelemen, Alec Nielsen, Ben Gordon, Kate Dray, Kevin Smith, Chris Voigt, and Arturo Casini for help with this essay.

  • Tardigrades Can Live for 30 Years

    A few months ago, I saw some claims online that tardigrades—also called water bears—can survive for up to 30 years without food or water.

    Naturally, I was curious about the veracity of this statement, so I did a few Google searches and followed breadcrumbs back to the original reports. A Quora poster had previously written that they first heard this claim from a National Geographic article. But that Nat Geo article had a paywall, and when I finally got around it, all it said was:

    Tardigrades belong to an elite category of animals known as extremophiles, or critters that can survive environments that most others can’t. For instance, tardigrades can go up to 30 years without food or water. They can also live at temperatures as cold as absolute zero or above boiling, at pressures six times that of the ocean’s deepest trenches, and in the vacuum of space.

    The article didn’t include a hyperlink or any other citation for the “30 years” claim. (Also, they didn’t mention the coolest tardigrade feat: we can shoot them out of a high-speed gun at around 3,000 feet per second, and they survive the impact!)

     *A light gas gun was used to launch tardigrades at speeds of 900 meters per second. The animals survived the impact. Credit: NASA*

    After a bit more digging, I found the original source: a 2016 research article published in an obscure journal called Cryobiology. In this study, Japanese scientists found living tardigrades, without food or water, clinging to frozen moss that had been placed in a freezer 30 years prior.

    Back in 1983, a scientist at the National Institute of Polar Research in Tokyo, named Hiroshi Kanda, traveled to the Yukidori Valley in eastern Antarctica and collected moss samples there. Kanda wrapped the moss in paper, sealed the samples in plastic bags, and then chucked them into a -20°C freezer. And there they sat, waiting, for the next 30 years.

    In 2014, Kanda’s successors in Tokyo found these moss samples and removed them from the freezer. After thawing the moss for 24 hours, the scientists added some water to the moss and picked the samples apart with tweezers to search for living organisms. They found a tardigrade egg, in addition to two living tardigrades clinging to the moss, which they named Sleeping Beauty 1 and 2. There were also dead tardigrades, but the researchers did not report their numbers in this study. It’s therefore difficult to know what fraction of these animals actually survived in the freezer for 30 years. (Is it a rare or common occurrence?)

    During the first few days, the living tardigrades moved around very slowly, if at all. But after a few days, both of the animals started moving and feeding on algae that the researchers fed to them. Sleeping Beauty 1 later laid 19 eggs, 14 of which hatched. The egg clinging to the moss also hatched, and the tardigrade that emerged later had babies of its own.

    In other words, frozen tardigrades can actually survive for at least 30 years without eating or drinking anything — but only if they’re frozen first! This is one instance, it seems, where ridiculous-sounding claims on the Internet ended up being true. Most comments about this that I found on the internet, however, failed to mention the “frozen” part. It’s likely that (unfrozen) tardigrades can only survive a few weeks without food.

    Tardigrades are not the only organisms that can survive for decades — or even thousands of years — in a frozen state. In 2021, scientists drilling in a remote Arctic location collected some permafrost and thawed it. A living rotifer emerged; it had been encased in the ice for at least 24,000 years, according to radiocarbon dating experiments. Other scientists in Siberia and Antarctica have also thawed out 400-year-old moss and a 32,000-year-old seed, both of which were viable. The seed regenerated and grew into a plant.

     *This is the plant that scientists regenerated from a 32,000-year-old seed. It’s cute!*

    Tardigrades are able to survive for decades in ice because they enter a state called cryobiosis. When the animals sense that they’re surrounded by frozen water, they begin to shut down their metabolism and make cryoprotectants that change the freezing point of their internal tissue. Nobody knows exactly how, but tardigrades seem to be able to control ice formation so that their cells don’t get destroyed by crystals during freezing. Oddly enough, though, the tardigrades make all these changes without significant changes in their gene expression, suggesting that their “freeze-tolerance genes” are always switched on. It’s weird.

    German naturalist Johann August Ephraim Goeze was the first to see tardigrades — or “little water bears,” as he called them — crawling upon a bit of moss, in 1773. A few years later, Lazzaro Spallanzani named them “tardigrada,” which means “slow steppers” in Italian. Scientists have been studying these little animals for several centuries, then, and it’s clear there’s so much we don’t understand about them, or how they survive and adapt to extreme conditions.

    It also shows, at least anecdotally, that scientific discoveries can come from innocuous actions, like thawing out some samples stuffed in the back of a freezer.