• We Need Better “Cloning Design” Tools

    Why do cloning tools still suck? This problem seems like a low-hanging fruit for AI to solve.

    Today, if a scientist wants to make a new plasmid or DNA sequence, they often go into their freezer, check which DNA sequences they have, upload those sequences to Benchling (or another platform), and then must figure out how to “convert” those sequences into what they want. Should I do Golden Gate or Gibson Assembly? What annealing temperature should I use for my primers? And so on.

    There are already tools that help with each of these steps, but has anybody “automated” this decision-making? If so, I’m not familiar with them. (A tool called J5 is probably the closest thing, but it won’t recommend the optimal method given a scientist’s existing sequences and primers.) And if the scientist makes even one error in this multi-step design process — like forgetting about an internal restriction site in a gene — they basically waste an entire week of work.

    (You might object to this and say, “But DNA synthesis solves this problem; just synthesize the full plasmid directly!” But people have been saying that for decades at this point, and DNA synthesis costs have not fallen in several years. Cloning DNA remains essential.)

    What we need is a fully automated, end-to-end cloning design tool that selects the best method based on a library of existing sequences and primers; a tool that recommends the optimal approach based on cost, speed, and so on. “Design tools” for cloning may not seem like a sexy thing to work on, but whoever solves it will marginally improve the lives of many scientists.

    With this in mind, I’ve given $1,500 in microgrants, courtesy of Astera Institute, to two people — Jai Padmakumar and Xavier Bower — who have been thinking about this problem. Bower has already built an open-source prototype, called IceCreamClone.

    Screenshot of the IceCreamClone platform, featuring options for various DNA cloning methods: Simple Single Insert, 3-Fragment Assembly, Modular 4-Part Assembly, and Point Mutation, with descriptions of each method and a button to build a custom construct.

    Here’s how a tool like this should work:

    First, you specify the plasmid you want to build. Then, you upload your current plasmid library, a collection of DNA sequences already in your inventory, and existing primers. The tool takes these data and outputs multiple cloning protocols based on different metrics, such as lowest cost, fastest speed, or the protocol most likely to be successful. The tool also runs a series of checks on all the sequences to make sure they don’t have internal restriction sites, for example, or weird secondary structures.
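
    To make this concrete, here is a minimal sketch of two checks such a tool might run, written in plain Python. The enzyme recognition sequences are real, but the fragment-length cutoff and the ranking heuristic are placeholder assumptions for illustration — not how IceCreamClone (or any shipping tool) actually works.

    ```python
    # Minimal sketch of two checks a cloning-design tool might run.
    # The recognition sequences are real, but the length cutoff and the
    # ranking heuristic are illustrative placeholders.

    GOLDEN_GATE_SITES = {"BsaI": "GGTCTC", "BsmBI": "CGTCTC"}

    def revcomp(seq: str) -> str:
        """Reverse-complement an uppercase DNA string."""
        return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

    def internal_sites(fragment: str, site: str) -> int:
        """Count occurrences of a recognition site on either strand."""
        frag = fragment.upper()
        return frag.count(site) + frag.count(revcomp(site))

    def rank_methods(fragments: list[str]) -> list[tuple[str, str]]:
        """Order candidate assembly methods, viable options first."""
        notes = []

        # Golden Gate only works cleanly if no fragment carries an internal BsaI site.
        bsai_hits = sum(internal_sites(f, GOLDEN_GATE_SITES["BsaI"]) for f in fragments)
        if bsai_hits == 0:
            notes.append(("Golden Gate (BsaI)", "ok: no internal sites; one-pot, cheap"))
        else:
            notes.append(("Golden Gate (BsaI)", f"blocked: {bsai_hits} internal site(s)"))

        # Gibson tolerates internal sites but struggles with very short fragments.
        if all(len(f) >= 200 for f in fragments):
            notes.append(("Gibson Assembly", "ok: fragments long enough for 20-40 bp overlaps"))
        else:
            notes.append(("Gibson Assembly", "warning: fragments shorter than ~200 bp"))

        return sorted(notes, key=lambda x: not x[1].startswith("ok"))

    if __name__ == "__main__":
        parts = ["ATGGCAGGTCTCATTT" * 20, "ATGAAACCC" * 40]
        for method, note in rank_methods(parts):
            print(f"{method}: {note}")
    ```

    A real tool would layer on primer design, junction scarring, cost, and turnaround time, but the skeleton — check every part, then rank the viable strategies — is the same.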

    It would be particularly cool if scientists using this tool could opt-in to sharing their data. The tool could then prompt them afterwards: How did the cloning go? Can you upload the results? Over time, this feedback data could be used to train predictive models that make cloning far more likely to be successful.

    Of course, there are issues with this idea. For one, it requires that people upload their entire catalog of existing sequences and primers, which is quite tedious for some laboratories, especially those with decades of cloning experience. Ideally, these tools would directly integrate with Benchling and Addgene.

    Anyway, I continue to think this is a “low-hanging” problem worth working on. Whoever makes an easy-to-use, end-to-end cloning design tool with really good predictive accuracy could presumably make a small business out of it. And, in doing so, you’d make many people happy!

  • A $1,000 Microgrant for “Bubble Algae”

    Valonia ventricosa, or “bubble algae,” is the largest single-celled organism on Earth. It’s three to four orders of magnitude larger than most “normal” microbes. (Yes, the green ball in the image is a single cell.)

    A single Valonia ventricosa cell, resembling a large, smooth, green stone, held in a person’s hand.

    More than 95% of the cell’s volume is taken up by a vacuole. This vacuole is not like cytoplasm; it’s acidic and packed with ions. It’s also filled with big sugar chains, which the cell uses to repair damage at its periphery. Each cell has dozens or hundreds of nuclei, pressed tightly against the cell wall.

    One cool thing about sea grapes is that (because they are so huge, and visible to the naked eye) you can directly inject them with chemicals. I gave a $1,000 microgrant (courtesy of Astera Institute) to a team of six working on scaling up Valonia cells as “living bioreactors.” They are based out of Splat Space, a community lab in Durham, North Carolina.

    Today, when scientists want to “design” a new metabolic pathway, they often use E. coli. The microbes are engineered to express each gene in the pathway, and then these genes are swapped and replaced until the metabolic network starts working. This takes a lot of work and usually requires months of genetic engineering. But with Valonia, what if we could instead test out metabolic pathways by injecting molecules straight into their vacuole? Perhaps sea grapes could become a sort of self-replicating, cell-free system for prototyping ideas.

    That’s the Splat team’s idea. I think it’s really cool, even though I’m skeptical that Valonia will become a model organism anytime soon. Nobody has ever engineered these cells, for a few reasons:

    1. They have hundreds of nuclei, each sharing a common cytoplasm pool extending as a thin layer around the whole cell. (Perhaps an extrachromosomal plasmid would work?)
    2. Their cell walls are thick, so electroporation is unlikely to work.
    3. Nobody knows their genome sequence!
    4. Even if you get DNA inside, you need to make sure that DNA matches the cells’ abnormal genetic code. After millions of years of evolution, Valonia has reassigned its TAG and TAA stop codons to glutamine.

    There are also issues with injecting chemicals into the vacuole. The vacuole is not like cytoplasm; it is acidic and packed with molecules that would likely interfere with proteins. So you’d need to find some way to engineer the vacuole environment, and maintain that environment, before you turn Valonia into a prototyping platform. Not an easy problem to solve!

    If this idea did pan out, though, the Splat team told me that they could transform sea grapes in bulk (“potentially hundreds of thousands at once”) and then grow them to the size of cherries. It’d be a totally new way to prototype metabolic pathways and manufacture chemicals.

    And despite the barriers, I’m surprised more people are not studying these organisms. At the very least, we should have better microscopy images and a complete genome sequence. So best of luck to the Splat team. I’m glad they are working on hard problems!

    A hand holding a cluster of green sea grapes, with a blurred background of a water tank and boats.
  • Failed Attempts to Make V. Natriegens Grow Faster

    In a recent blog post, I claimed that nobody has yet tried to “evolve” Vibrio natriegens in the laboratory to make them grow even faster. But I was totally wrong, and it warrants a correction.

    For context, V. natriegens is a microbe that doubles every 9.8 minutes in highly enriched growth media, or every ~30 minutes in more “minimal” media, with just some sugar and salt. In my recent blog, I explained that these cells divide faster than any other known organism because they make a large number of ribosomes quickly (it has nothing to do with the time required to copy a genome):

    “V. natriegens has at least a dozen ribosomal RNA operons, or gene clusters encoding ribosomal RNA molecules, in its genome… [Also,] these ribosome genes are located next to strong promoters, or genetic sequences that recruit RNA polymerase enzymes. In other words, Vibrio devotes more of its genome to ribosomal genes, and has also evolved a stronger start signal for those genes, meaning the cell makes ribosomal RNA much more frequently, and in higher numbers, than other microbes.”

    At the end of the blog post, I proposed some experiments to make Vibrio natriegens grow even faster. Perhaps we could make its ribosomes smaller, such that each one takes less time to create. Or, alternatively:

    “…we could take a more agnostic approach and just let evolution take its course, albeit in an accelerated way… Perhaps we could run a Richard Lenski-esque experiment, in which V. natriegens’ cells are grown in a robotic bioreactor and flooded with glucose every few hours… If we repeat this lots of times, some microbes may evolve to grow even faster… Or maybe not; V. natriegens may already be quite close to the theoretical cell division time limit. These experiments haven’t been done yet.”

    This last claim — that nobody has yet done these experiments — was totally wrong, as Adam Feist, a bioengineer at UC San Diego, explained to me by email. Feist is a co-author on two studies that have done basically this exact experiment!

    In one study, they used an automated laboratory evolution (ALE) robot to evolve E. coli cells to grow faster. Their goal was to see whether E. coli could grow as fast as V. natriegens. (Answer: No.) In a second study, they did much the same with V. natriegens to see if they could “break” its speed limit. (Answer: No.)

    For the first study, the researchers took six different E. coli strains, including BL21, K-12 MG1655, Crooks, and others. They grew each strain in triplicate (three flasks apiece), filled with growth media containing just a minimal amount of sugar and salts, and incubated them at 37°C. Every time the cells reached a certain density, they transferred a tiny amount of cells into a fresh flask with the exact same conditions. And they repeated this again and again for 900 generations of cell growth.

    The beauty of this experiment is two-fold: First, it’s super simple and easy to replicate. Every time cells are transferred into a fresh flask, microbes evolve to grow faster such that they can take over the nutrients in that flask. So this is artificial selection in a tube. And second, they froze the cells at regular intervals so they could sequence them, measure their gene expression, and so on to really figure out how the cells were evolving over time.

    All six strains started with different growth rates. MG1655 doubled every hour, whereas the Crooks strain doubled every 47 minutes. After hundreds of generations, though, all the strains converged to a similar doubling time: About 40 minutes. All six strains also took similar paths toward the faster growth rate; the same genes kept getting mutated again and again across the flasks.

    Genes involved in eating and breaking down glucose were mutated to become more efficient. Ribosomal genes were also massively upregulated across strains, as were genes involved in making amino acids and nucleotides. This all makes sense.

    To compensate for this faster growth, the cells also “broke” a handful of genes. They mutated stress response genes, for example, because they no longer needed them in their stable, consistent environment. And they shut down motility genes (it’s super “costly” for a cell to make a flagellum, and why would they need one in a shaking flask anyway?).

    Even after all this convergent evolution, though, the key thing to remember is that none of the E. coli strains approached the division time of V. natriegens. All of them maxed out around 40 minutes per cell division in this “minimal” growth media.

    In the second study, then, Feist & co. wanted to see if they could use the same principles to make V. natriegens grow faster. So they set up 10 identical flasks, each with some M9 growth media and a bit of salt and sugar, and inoculated the cells. Every time the V. natriegens cells hit a certain density, a small amount were transferred to a fresh tube. This was repeated for 1,000 generations, but there was no increase in cell division rates.

    The cells did acquire mutations, though. Nine of ten flasks mutated a stress response gene (much like the E. coli did), and most flasks acquired mutations in an ion transporter gene. But despite those changes, the cells had a median division time of 28 minutes at both the start and end of the experiment.

    So what happens now? Well, for starters, I think we need to run more of these experiments, albeit with some different starting parameters, to see if we can make V. natriegens divide faster after all. Feist proposed, in his email, that we should insert more copies of ribosomal genes into V. natriegens and E. coli, say, and then repeat the evolution experiments with both of them. Perhaps the cells just didn’t have enough time to duplicate these genes on their own; if we added them artificially, their growth rates might climb higher.

    This seems like quite an easy experiment to do. But without running the experiment, we won’t actually know if the abundance of ribosomal genes alone is sufficient to speed up division times, or if we also have to cut down the size of ribosomes themselves to make cells grow faster. Let’s get busy!

  • Many Great Inventions Weren’t Made by “Serendipity”

    Important inventions are often framed as happening by “serendipity,” where a scientist with a prepared mind saw something, or dreamed something, or took LSD, and had this big breakthrough that ended up changing the world. Serendipity literally means “the occurrence and development of events by chance in a happy or beneficial way.” And yet, the more I read about famous inventions in molecular biology, the more I’m struck by how many (not all, but many) of them were deliberate creations with no “chance” involved. Indeed, the more you read about these histories, the more you begin to see a pattern in which great inventors “engineered their own serendipity,” a point that Ed Boyden has written about at length.

    The discovery of GFP, for example, was no accident. After nearly being blinded by the atomic bomb dropped on Nagasaki, Osamu Shimomura moved to the United States to study with Frank Johnson at Princeton. Johnson was already deeply fascinated by fluorescent molecules and knew about the jellyfish swimming off the coast of Friday Harbor, Washington, from which GFP would later be found. And Shimomura himself had already studied sea fireflies (crustaceans that emit light, widespread on the coast of Japan) in Nagasaki, so he came into the GFP project with lots of experience with fluorescence and a desire to isolate the molecules responsible.

    Nanopore sequencers, too, are often described as being invented by “serendipity,” because David Deamer was driving in his car on a California highway when, all of a sudden, he had this bolt of insight where he realized that nucleotides, moving through a protein pore, might be able to disrupt an electrical current in unique ways and thus be deciphered according to their “fingerprints.” Deamer pulled over, sketched out the idea in a notebook, and then spent the next seven years building it. But this story largely obscures the fact that Deamer had already spent more than a decade studying artificial cells and the ways that molecules move through cell membranes at UC Santa Cruz! He had been priming his mind to invent nanopore sequencers for many years.

    Deamer’s sketch in his 1989 lab notebook. Credit: Oxford Nanopore

    The list goes on and on. (The micropipette, too, was invented in just three days by a frustrated postdoctoral researcher named Heinrich Schnitger, who got sick of using his mouth to move toxic liquids around. There was no “eureka” moment, really.)

    But anyway, the reason I started writing this short essay was not to list off a bunch of inventions. My actual intention was to talk about optogenetics and how it came to pass, and how it serves as a powerful example of this “engineered serendipity” idea.

    For context, optogenetics is a technology that uses light to control cells, often neurons. You first insert a gene encoding a channelrhodopsin into cells. The cells then make the protein, which naturally embeds itself into the cell membrane. Then, when you shine a light at the cell, some photons will strike this protein and force it open, allowing ions to rush inside and trigger an action potential. After optogenetics was invented in 2005, neuroscientists adapted it in many obvious ways; they found light-sensitive channels that could be used to silence neurons instead, and also engineered light-sensitive proteins that respond to different wavelengths. The end result is that now we have this toolbox whereby neuroscientists can “play the brain like a piano,” as Rafael Yuste, a neuroscientist at Columbia University, likes to say.

    The invention of optogenetics was no accident, either. It was an act of “engineered serendipity.” Indeed, Francis Crick (who was ahead of his time on so many ideas) described ideas that seem similar to optogenetics, albeit without mentioning light, way back in 1979. Writing in Scientific American, Crick called for a technology which could “record from many neurons independently and simultaneously,” thus predating calcium sensors — tools that can do exactly this — by a few decades. He also wrote about his desire to create “a method by which all neurons of just one type could be inactivated, leaving the others more or less unaltered,” which is also possible today using a method called holographic optogenetics (in which a laser beam is split and redirected to different neurons simultaneously).

    There is also evidence that optogenetics was invented twice, independently, by researchers who had no knowledge of each other. Zhuo-Hua Pan expressed channelrhodopsins in retinal ganglion cells, growing in a dish, in February 2004 and also in rats that summer. He submitted his paper to Nature in November 2004, a few months before Boyden & Deisseroth, but it was rejected because reviewers thought his technology was a way to restore vision, rather than a more broadly useful tool for neuroscience.

    In any case, I’m struck by Boyden’s story of optogenetics because it so clearly conveys the act of discovery, as described by an engineer. The key breakthrough came about because the inventors wrote down the exact requirements they were looking for (specifically, a technology which could activate individual neurons with high spatiotemporal resolution) and then enumerated, or wrote down, all the ways they could possibly imagine to make this happen. Small molecules, magnets, electricity, light….

    Here is how Boyden told this story in a speech to high school students last year:

    “How did we make ourselves so lucky? … Simply put, try to think of every way of solving a problem… In this case, we made a list of all the forms of energy you can deliver to the brain – there’s light, sound, radio waves, a few other things. You can write the whole list down in a couple minutes. I liked light because it’s faster than anything else, and you can aim it precisely. Next question: how do you make brain cells sense light? Well, you can either design a tiny solar panel, or you can try to find one. That’s the whole list, just two cases. Finding one sounded easier. So we started emailing people, asking anyone who would listen – could you send us the light-driven molecule that you are studying, so we could put it into brain cells? And some people replied. We were in business! We took one molecule, put it in a brain cell, and as I told you, we could activate it with light. By writing down every way of solving a problem, in a systematic way, you can hone in on the best path. You may even find ideas you wouldn’t ordinarily think about. It helps you make a map of your own, when none is given to you.”

    Boyden tells this story as if it was a “simple” thing; a process which could be taught and then executed again and again. I’d normally be skeptical of anybody trying to convince me that they can invent an incredibly useful technology on-demand, but I trust Boyden on this point because he has actually invented a few important technologies, including expansion microscopy and new methods for connectomics. Naturally, I wonder where else we could “engineer serendipity” in biology, and also how we might teach this, but don’t have satisfying answers. (If I did, I’d probably be a multi-millionaire, or at least have some patents by now!)

    At the very least, it’s worth trying this approach in your own work. Start by precisely defining your needs, like “I want to turn neurons on and off at the millisecond timescale.” Then, write down a list of possibilities to do this. Weigh the pros and cons of each. Drugs are leaky and diffuse everywhere … electricity is too hard to control … light is cheap and abundant, and can be turned on or off quickly, and we can shine it at a narrow point …

    Then, start emailing people. Ask for their advice. Perhaps you’ll quickly figure out that there is an algae, or some other photosynthetic microbe, that has already evolved proteins to sense light! Maybe you could use one of those. And even if you test one of these ideas and it doesn’t pan out, at least you’ll have a starting point to engineer the tool and troubleshoot your strategy.

    In biology, we often hear about how the “SEARCH SPACE IS INFINITE” and, in order to find a solution, we need to navigate through this combinatorial explosion of possibilities. The cell just has too many parameters! It’s a Black Box! Yada yada yada. But clearly there are ways to train your mind to precisely define what it is you’re trying to do, and what the specifications of a viable solution must look like, and then narrow down your search space — based on priors — to identify likely solutions that have already evolved in the infinite wisdom of Nature.

    We ought to consider, therefore, not only how we can teach this process to students more systematically, but also which technologies we might build using this approach. Can we convene a workshop, or series of workshops, where we get lots of interesting people together in a room to discuss a technological need, and then we enumerate the options, weigh the pros and cons, and come up with a plan to find solutions? Let’s test it out.

  • Where is my Mother Machine?

    There is a tiny device (first invented almost two decades ago) that lets you watch a single cell divide hundreds of times, under tightly-controlled conditions, and yet I almost never meet people who actually use it. This is a shame because there are many interesting ideas that I think could be uniquely tested in such a device.

    Most experiments in biology are instead done in “bulk,” which is not a good way to deeply understand an organism. RNA-seq experiments, for example, are often done by growing lots of cells in a flask, exposing them to some chemical, and then killing all the cells, extracting their RNAs, and sequencing all of them together. The end result of this is just an average, and it completely obscures the messier (and more truthful) stochastic nature of life.

    But a Mother Machine, and other microfluidics tools like it, helps to solve this problem. It’s just a tiny device with a long trench through which nutrients flow. Cells travel down this trench and fall into little wells, etched perpendicularly to the main trench. Each of these wells is barely wide enough for a bacterium to fall inside. When a cell falls into a well, it keeps dividing and also has access to fresh nutrients, which are constantly pumped through. Waste molecules are continuously flushed out. As one cell divides into two, then four, then eight, and so on, some cells eventually extend out of the well entirely and get swept away with the current. The cell at the bottom, though, stays put and will keep dividing.

    Like many other “great” inventions, the Mother Machine was designed to answer a specific question. In a 2005 paper, some scientists claimed that, when a cell divides, whichever offspring inherits the “old pole” from the mother divides about 2 percent slower with each passing generation. (Said another way: When an E. coli cell divides, it builds a wall down the middle and cuts itself in two at that point. Each “daughter” cell has two ends; one end is made from the wall, and the other end is “old.” This old end gets passed down through the generations, again and again. The authors of the 2005 paper claimed that this constitutes a form of cellular aging, and that cells which inherit the old end are basically less fit than the other daughter cell. Provocative claim!)

    The Mother Machine was invented by a small team at Harvard to disprove this hypothesis. In the 2010 paper describing the device, they basically just strapped a camera to the microfluidics chip and recorded the growth rate for tens of thousands of cells, under constant nutrient conditions, for “hundreds of generations.” After tallying all the data, they concluded that “E. coli, unlike all other aging model systems studied to date, has a robust mechanism of growth that is decoupled from cell death.” In other words, growth does not slow down with age, and the 2005 claims were wrong.

    Other groups have since made modifications to the original device, using it to revisit classic experiments in molecular biology. In 2018, for example, a Swiss team modified the microfluidic chip to have two input channels, rather than just the one. With two ports, they could expose cells to different growth media at the same time. Or they could switch back-and-forth between the two conditions, or even expose cells to gradients of those conditions.

    Now, it has been known since the 1960s that E. coli cells prefer to eat glucose over lactose. When glucose runs out and only lactose is around, the cells activate their lac operon and begin making enzymes to digest it. Jacques Monod and François Jacob shared a Nobel Prize, in 1965, for figuring this out. But nobody had ever actually watched this “switch” at the level of single cells, under tightly-controlled conditions.

    But then the Swiss team made their modified Mother Machine. They flooded E. coli cells into the device, trapped them in wells, and switched the inputs between glucose and lactose every four hours. And what they found is that, when lactose comes in to replace glucose, every cell stops growing within three minutes. This outcome is extremely uniform! But the reverse — or the time it takes each cell to switch on its lac operon — is extremely variable. About one-fourth of cells start growing within 25 to 45 minutes, two-thirds start growing in one to three hours, and five percent of cells never grow again at all. By accounting for cells individually, in other words, the Mother Machine enabled these researchers to make observations which could never be made at the population scale.

    And yet, Mother Machines still seem relatively rare! The blueprints are freely available online, but making these devices still requires an understanding of photolithography. The wells are only a micron wide, so they can’t be 3D-printed; one has to make a master mold using photomasks, cast PDMS in that mold, and then cure the polymer into that shape. The original specs only work for E. coli, too. If you wanted to study Bacillus or Caulobacter or yeast, you’d have to redesign the channels with different dimensions. A few companies sell Mother Machines, but those businesses seem to be quite small.

    If Mother Machines did become widespread, though (maybe even cheap enough to ship in, say, a $100 kit for students), they could be used to run all kinds of interesting experiments.

    One idea is to combine a Mother Machine with a hypermutation tool, such that we can watch cells evolve in real-time. In a recent study, British scientists reported a way to do “highly mutagenic continuous evolution” in E. coli. The beauty of their tool is that it only requires two components: an error-prone DNA polymerase, and a replicon carrying a gene of interest. The error-prone polymerase, which introduces about one mutation per 1,000 bases every ten generations, only copies the DNA on the replicon; it doesn’t touch the host genome. One could take a gene encoding antibiotic resistance (against molecule X) and clone it onto the replicon, transform the whole thing into E. coli, and trap the cells inside a Mother Machine. Then, by exposing the cells to increasing levels of antibiotic Y, one could watch in real time as cells mutate their resistance gene and, perhaps, hit upon a solution that confers resistance against both molecules. This would be a way to study how cells evolve resistance autonomously, at the single-cell level.
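
    As a rough sanity check on that mutation rate, here’s the back-of-the-envelope arithmetic for a hypothetical 1,000-base resistance gene; the gene length and generation counts are illustrative assumptions, not numbers from the paper.

    ```python
    # Expected mutations in a gene carried on the mutagenic replicon, using
    # the rate quoted above (~1 mutation per 1,000 bases every 10 generations).
    # The gene length and generation counts are illustrative assumptions.
    gene_length_bp = 1_000
    mutations_per_kb_per_10_gen = 1.0

    for generations in (50, 100, 500):
        expected = (gene_length_bp / 1_000) * mutations_per_kb_per_10_gen * (generations / 10)
        print(f"{generations:>3} generations -> ~{expected:.0f} expected mutations in the gene")
    ```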

    Another idea is to use Mother Machines to study how perturbations change a cell’s transcriptome in real-time. Felix Horns (previously in Michael Elowitz’s group at Caltech, now at Arc Institute) created an RNA Exporter tool. The gist is that genes encoding virus-like particles are placed into cells and, when these particles get made, they latch onto RNA molecules and physically carry them out of the cell. Cells are effectively engineered to export their own RNA.

    My understanding is that RNA Exporters are relatively unbiased, meaning they have a roughly equal chance of grabbing onto any RNA molecule. The molecules that get carried from the cell, then, are representative of the transcriptome as a whole. If cells carrying RNA Exporters were studied in a Mother Machine, it might be possible to perturb them and measure their transcriptional responses in real time — rather than the classical approach of perturbing millions of cells at once in a flask and doing RNA-seq on the entire population to collect average results.

    A third idea is to collect single-cell observations to train a predictive model for molecular burden. Any time we engineer an organism to carry new genes, we are forcing it to execute a function it wouldn’t normally do, thus draining resources that would otherwise go toward growth, DNA repair, and so on. Perhaps we could take 100+ plasmids, each carrying a fluorescent protein, and transform each one into the same background strain of E. coli. Then we could study each strain inside a Mother Machine, carefully quantifying growth rates and fluorescence levels, to map out the full distribution of outcomes for a given plasmid. If we did this enough times (hopefully with some kind of automated data pipeline), we could collect a huge dataset and use it to train a predictive model of burden. That model could then help bioengineers design constructs that impose less of a burden on living cells.
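
    Here is a toy sketch of what fitting such a burden model might look like. The data are simulated stand-ins for real Mother Machine measurements, and the linear growth-versus-expression form is an assumption; the point is only to show the shape of the analysis.

    ```python
    # Toy burden-model fit. The numbers are simulated stand-ins for real
    # per-cell Mother Machine measurements, and the linear form is an assumption.
    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical per-cell data for one plasmid-bearing strain:
    # fluorescence (a proxy for expression load) and growth rate (per hour).
    fluorescence = rng.uniform(0, 100, size=500)
    growth_rate = 2.0 - 0.008 * fluorescence + rng.normal(0, 0.05, size=500)

    # Fit growth_rate = intercept + slope * fluorescence.
    # A negative slope is the per-unit "burden" of expressing the construct.
    slope, intercept = np.polyfit(fluorescence, growth_rate, 1)
    print(f"unburdened growth rate ~ {intercept:.2f} per hour")
    print(f"burden slope ~ {slope:.4f} growth-rate units per fluorescence unit")
    ```

    With 100+ strains, those fitted slopes become the training labels for a model that maps construct features (promoter strength, gene length, copy number) to expected burden.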

    I’m not entirely sure why we’re not seeing more of these ideas implemented, or why bioengineers still haven’t fully embraced single-cell experiments. Every university should have a microfluidics facility making custom devices, but I’ve only visited a few of them. Most experiments are still done in bulk, using orders-of-magnitude more cells and reagents than microfluidics would require; and usually the results are less representative of ground truth, too!

    It’s a shame, because one of the beautiful things about biology is that each cell is unique and lots of molecular phenomena are highly stochastic, following a distribution of outcomes. Biology is fun because it is not deterministic; and that makes it both richer as a field of study and more complicated as an engineering medium. A Mother Machine, and other tools like it, help us to actually see these distributions, and we ought to embrace them at scale.

  • A Small Amount of Money Can Surface Many Good Ideas

    The “Fast Biology Bounties” went surprisingly well. My goal was to spend $10,000 to surface a few good ideas to “speed up or reduce costs for wet-lab experiments,” but so many good submissions came in that I ended up giving away $15,000 to 20 different projects instead.

    This surprised me, partly because I shared the bounty idea with two reviewers before announcing it — one a VC and the other a biology writer — and got mixed feedback. The VC said I wouldn’t get many good submissions, because the best people were already building companies or running labs and wouldn’t want to give away their ideas for “free.” The writer said I might get some decent ideas, but maybe not many. I decided to publish it anyway, on a bit of a whim, and waited to see what came in.

    After closing submissions on March 15th at midnight, I tallied the emails and began scoring the results. I received about 430 submissions from 335 individuals. Together, they totaled 155,115 words of text (roughly the length of Harry Potter and the Prisoner of Azkaban and The Great Gatsby put together), all of which I read in my spare time. None of the ~20 great ideas were clearly better than the others, so I didn’t award the $5,000 prize and instead distributed mostly $1,000 and $500 checks. If you haven’t heard from me about a bounty, that unfortunately means I didn’t select your idea. Many good ideas didn’t win a prize.

    My main lesson from this experiment is that good ideas are cheap and can be surfaced for a small amount of money. People with great ideas willingly share them! Many of the winners were highly generative; they sent me several ideas, all excellent. And when I asked whether I could share their ideas publicly, every one said “yes.” People with many good ideas tend to be bottlenecked by time and resources, and know they will have more ideas in the future. (Some submissions were funny, too. One person proposed strapping scientists to roller skates to help them move around the lab faster.)

    Some ideas surfaced repeatedly. Bounties, then, seem like an effective way not only to surface ideas broadly, but also to connect people around them. For future bounties, I’ll use a software tool that manages submissions and automatically connects people with overlapping ideas.

    (For example, I received three pitches on speeding up the heating and cooling times of PCR thermocyclers, several pitches for autonomous devices to monitor contamination in cell culture plates, at least five pitches to embed cameras in laboratories and use AI to automatically record scientific protocols, and more than fifty submissions about ways to skip the overnight growth and miniprep phases of DNA cloning. These people should probably all work together!)

    During the competition, I cataloged the email address, date and time, length, perceived quality (based on originality and tractability), and likelihood of AI use for each submission. I also recorded whether each person asked me to keep their idea private, then used these data to check for correlations between the length, timing, privacy, and quality of proposals.
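
    For what it’s worth, this kind of tally is only a few lines of code once the data live in a spreadsheet. The sketch below is hypothetical: the file name and column names (word_count, quality_score, submitted_date) are placeholders, not my actual records.

    ```python
    # Hypothetical sketch of the correlation checks described above.
    # The file and column names are placeholders, not my actual records.
    import pandas as pd

    df = pd.read_csv("bounty_submissions.csv", parse_dates=["submitted_date"])

    # Does submission length track perceived quality?
    print(df["word_count"].corr(df["quality_score"]))

    # Did early, middle, and late submissions score differently?
    days_open = (df["submitted_date"] - df["submitted_date"].min()).dt.days
    period = pd.cut(days_open, bins=[-1, 1, 11, 13],
                    labels=["first two days", "middle stretch", "final two days"])
    print(df.groupby(period)["quality_score"].mean())
    ```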

    First, the timing of submissions. I received 94 emails in the first two days (March 2–3), 138 emails from March 4–13, and 103 emails in the final two days. Perceived quality was evenly spread across all three periods, meaning people who waited longer did not submit better ideas. I don’t know for sure, but this implies that great ideas were already sitting inside people’s heads, and the public competition didn’t induce many people to generate new ideas. But maybe I’m wrong about this.

    A vertical bar chart illustrating the number of submissions over a span of days from March 2 to March 15, showing a significant peak on March 15.

    Next, the length of submissions. I specifically said in the public call that “a few paragraphs will suffice.” But many people sent me thousands and thousands of words — or sometimes entire 20-page PDFs — for their ideas. Many academics rehashed their existing papers or PhD theses and sent me the entire document, which I did not enjoy. Many people scraped my blog, found ideas I had already written about (like cell division times), and then used AI tools to rehash my ideas and make them worse. I also did not enjoy this.

    Overall, the average submission ran to 548 words. The median was 384 words. Many exceeded 4,000 words! My takeaway is that many people use AI to “expand” their ideas and inflate their word counts, at the cost of clarity and economy of thought. Unsurprisingly, submissions that seemed obviously AI-generated scored much lower than those that felt human.

    I scored each submission on originality and tractability. When I couldn’t evaluate an idea myself, I shared it (with permission) with some friends. Ideas often scored poorly because they were obviously derivative — or, in many cases, just verbatim descriptions of tools that already exist — or too vague and speculative to be technically feasible. Of all the ideas I received, I scored 60% as “Poor,” 25% as “Medium,” 10% as “Good,” and 5% as “Great.” There is an element of arbitrariness in these scores (as in all things in life), reflecting my own biases.

    I hope to do another round of bounties, albeit on a different topic. If I do, I’d also like to ask submitters directly whether they used AI or agents, and about their affiliation — industry, independent, academic, or something else. I have a hunch that many of the most original ideas came from people in industry, but it’s hard to tell because most people used personal email accounts.

    Bar graph showing the number of submissions categorized by word count ranges from 0-100 to 2000+. The highest bar is in the 300-400 and 1000-1500 word count ranges.

    I’d change many other things next time. Instead of using my email address, I’d set up a form that collects more information and automatically curates the data in a spreadsheet. My biggest problem, though, was responding to emails. I wrote a short reply for every submission, and I wish I had an automated platform that just said “Submission received” and, at the end, notified everyone who hadn’t been selected. As it stood, I only had the bandwidth to notify winners, which meant hundreds of other people never got a direct yes or no from me. This is deeply sad to me, because it means many good ideas will never reach the public, and many interesting people who ought to meet each other will not. I regret this and am working on solving these problems.

    (Zoe Senón and David Lang, who runs Experiment.com, have good thoughts on this problem. Lang once told me that, when you put out a call for funding and hear from hundreds of people, the good news is that you’ve reached your target audience! Everyone who writes back and shares an idea is your “ilk;” the kinds of people who share your interests. And yet only about 5 percent of them will receive money. The rest may leave feeling sad, or jaded, or nothing at all. Sending thoughtful rejections — which I didn’t do because of the sheer volume of emails — builds goodwill and camaraderie within the community.)

    Despite the flaws in this competition, though, I think it was a successful experiment. I’m sharing a few of the winners below, along with a sentence or two on what they proposed. I’ll expand on more of the winning ideas on this blog over the coming weeks.

    • Sebastian Cocioba for a laser-based PCR thermocycler, in which infrared heating replaces aluminum blocks. (Public notebook here.)
    • Lou Hom for an idea to run cell-free protein synthesis reactions on a gel filtration column, such that ribosomes continuously encounter fresh substrates while physically migrating past their waste and spent nutrients.
    • Bryan Duoto for a colony-to-sequence cloning workflow that uses magnetic beads and Nanopore sequencers. Scientists can verify clones in 1–3 hours instead of waiting overnight. (Public protocol here.)
    • Jeff Nivala for an idea to synthesize proteins directly from DNA, without relying on any RNA intermediates.
    • Sierra Bedwell for a clever automation system that uses off-the-shelf parts to combine a thermocycler, gel imager, and liquid handler to screen thousands of environmental DNA samples in parallel.
    • Xavier Bower for “IceCreamClone,” an interactive cloning strategy ranker that looks at a scientist’s available “parts,” or sequences, and then determines whether they ought to use Gibson, Golden Gate, restriction digest, or another strategy to assemble them together. The software also catches likely cloning errors and estimates the cost and time required for each option. (Software demo here.)
    • Andres Arango for two ideas: using antifreeze to accelerate DNA ligation by 2-3 orders of magnitude, and computationally designed protein cradles for expressing membrane proteins in E. coli.
    • Corey Howe for ideas to speed up Vibrio natriegens growth using 3D-printed mini-bioreactors and continuous whole-genome sequencing.
    • Alexander Vawter, from Heterodox Labs, for the “Experiment Engine,” a series of cameras and sensors that help debug experiments at the bench.
    • Michael Darcy for three highly original ideas, including a protein printer fabricated using DNA origami and a “GPU for liquid handling” device which uses a centrifuge, instead of pipettes or sonication, to move droplets.

    I’m under no illusions that these tiny grants will meaningfully push these projects forward. A $500 or $1,000 check often isn’t even enough to buy reagents for experiments. But these microgrants act as a vote of confidence. They tell people that I believe in them and that I think their ideas are good. In that way, a small amount of money can subtly shift the trajectory of what people choose to work on.

    More importantly, I’m encouraged that this bounty incentivized hundreds of people to think about the methods they use every day and consider ways to improve them. This is good for science, because good things happen when scientists think explicitly about the limits of their tools. (That is why a frustrated postdoc invented the micropipette; he got annoyed by using his mouth to move liquids around.)

    As always, please write to me if you’d like to talk more. I’m at nsmccarty3@gmail.com.

  • One Equation for Faster-Growing Cells

    Biologists are obsessed with records.

    We like to learn about the smallest and biggest cells, the animals that live longest, and the birds which migrate furthest. Perhaps this is an intrinsic part of Human Nature; but a part of me — deep down — wants to resist it. I’ll not be a stamp collector, I think, or mere record keeper! No; I shall study the mundane and the average, such that I can understand life as it really is, or at least usually is, on this beautiful Earth.

    And yet, what’s the fun in averages? I think there is something about “records,” and our hunt for them, that serves a valuable purpose. Indeed, records are often a starting point for a deeper curiosity.

    When we learn of an organism that lives for hundreds of years, or first hear that elephants do not get cancer despite the abundance of cells in their bodies, it is only natural to think, “Wait, then why do humans get cancer? We have way fewer cells than elephants!” In this way, records become a starting point toward rich questions.

    But the record I think about most is cell division; specifically, why an obscure microbe — called Vibrio natriegens — is able to divide every 9.8 minutes and not a moment sooner.

    Dividing V. natriegens cells. Credit: Max Planck Institute for Terrestrial Microbiology

    V. natriegens was first isolated by William Payne, a professor at the University of Georgia, from a glob of mud on Sapelo Island in 1958. Four years later, a man named R.G. Eagon incubated these cells at 37°C, shaking them vigorously in a liquid broth containing blended bits of brains and hearts. It was Eagon who found, in this experiment, that the cells divided every 9.8 minutes. This must have been a startling discovery, because the average microbe divides every three hours or so. Some microbes, living deep in the Earth’s crust, divide once every few years.

    It has been more than 60 years since Eagon made his discovery, and yet nobody has found a microbe which grows faster than V. natriegens. Is 9.8 minutes some kind of magical threshold; a speed limit to life’s replication? I don’t think so. And the reason I say so is because of a single equation, the parameters of which may actually reveal how to make cells grow faster.

    False Assumption

    My first assumption was that a cell’s division time is limited by DNA replication. For one cell to become two, the cell must copy its genome and pass one copy to each offspring. The bigger the genome, the longer it takes to make a copy, and the slower a cell divides. Right?

    Not quite. The enzyme responsible for copying the genome, called DNA polymerase, moves at roughly 1,000 bases per second. V. natriegens has about 5.17 million bases in its genome, split across two chromosomes. The first chromosome has 3.25 million bases, and the second has 1.93 million bases. At normal speed, one polymerase would need 54 minutes to copy the first chromosome and 32 minutes to copy the second.
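
    Spelled out, the arithmetic behind those two numbers looks like this:

    \frac{3.25 \times 10^{6}\ \text{bases}}{1{,}000\ \text{bases/s}} \approx 3{,}250\ \text{s} \approx 54\ \text{min}, \qquad \frac{1.93 \times 10^{6}\ \text{bases}}{1{,}000\ \text{bases/s}} \approx 1{,}930\ \text{s} \approx 32\ \text{min}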

    Doubling time for 214 microbes, organized by their optimal growth temperature.

    For years, many researchers thought that splitting the genome across two chromosomes was what enables V. natriegens to grow fast. With two chromosomes (so their thinking went), two polymerases can copy the genome in parallel, thus cutting division times in half! But then a 2024 paper came out, explaining how researchers had fused both chromosomes into a single genome, and the cells still divided every nine minutes. So clearly that’s not the bottleneck here.

    The truth is that cells don’t use a single DNA polymerase to copy their genomes. Instead, two polymerases copy the genome at the same time, albeit in opposite directions. This bidirectional copying also happens many times simultaneously. As soon as one set of DNA polymerases begins copying the chromosome, another set latches on and starts copying it, too. Multiple copies of the genome are thus in the act of being made at any given moment. When one cell becomes two, each “daughter” not only inherits a genome, but also inherits the copies of that genome that are in the act of being made.

    DNA replication is not the bottleneck to cell division. In theory, a cell could initiate dozens of rounds of DNA replication all at once, provided it has enough energy and nucleotides to do so.

    The true bottleneck, it turns out, is actually the ribosomes, or the big “machines” (a tired metaphor, I know) that build proteins. Before a cell can split in two, it must double its pool of ribosomes so that each daughter cell has enough to survive. And as we’ll see, this is really slow.

    Many students are taught to think of ribosomes as “proteins that build other proteins.” But two-thirds of a ribosome’s mass is RNA; not amino acids. Each ribosome is also built from two pieces, called the large and small subunits. These two pieces glom onto a strand of messenger RNA and “read” its code to build proteins. After a ribosome has finished making a protein, it falls off the messenger RNA, searches for a new strand, and begins building the next one.

    E. coli and V. natriegens have nearly identical ribosomes. In both, the small subunit contains a long strand of RNA, called ribosomal RNA, packed inside of 21 proteins. The large subunit has two strands of RNA (one short and another long) stuffed inside of 33 proteins. Of all the RNA molecules floating around a cell, about 80 percent are ribosomal. (Messenger RNAs account for only a tiny fraction.) In total, each ribosome contains 4,566 nucleotides of RNA and 54 separate proteins, totaling 7,500 amino acids. This is enormous; an average protein has about 300 amino acids. Once built, each ribosome can “stitch together” about 16 amino acids per second.

    Now, I know there are a lot of numbers here. But recall that V. natriegens divides every 9.8 minutes, and consider what happens when we crunch the numbers on how long it takes a ribosome to build a copy of itself:

    There are 7,500 amino acids in a ribosome, and each ribosome stitches 16 amino acids together each second. Therefore, it takes one ribosome about 7 minutes and 50 seconds to build another ribosome; and V. natriegens divides every 9.8 minutes! That gap of about two minutes is all the time the cell has to make everything else: copying DNA, growing its lipid membrane, and building all the other proteins it needs to survive. Ribosome biosynthesis is the true bottleneck on cell division. No organism can divide faster than the time it takes to make its own ribosomes.
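
    The arithmetic, spelled out:

    \frac{7{,}500\ \text{amino acids}}{16\ \text{amino acids/s}} \approx 469\ \text{s} \approx 7\ \text{min}\ 49\ \text{s}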

    If this explanation strikes you as too tidy, though, you are certainly not alone. I had the same reaction at first. And one question I began thinking about is this: Sure, it takes one ribosome about eight minutes to make one ribosome. But each cell has tens of thousands of ribosomes. Those ribosomes all work together, in parallel, to make more ribosomes. Because the 7,500 amino acids required to build each ribosome are split across 54 different proteins, 54 ribosomes could (in theory) work together to build each new ribosome.

    But this is only true at the level of one ribosome. If we zoom out to the whole cell, the math doesn’t work out quite this cleanly. For a cell to go from R ribosomes to 2R, it must build R ribosomes, and it only has R ribosomes to do this. Each ribosome, on average, must make one other ribosome; and that takes about eight minutes.

    (Parallelization works when you can add more “machines” independent of output but, in this case, the “machines” are also the output.)

    | Category | Metric | Value |
    | --- | --- | --- |
    | DNA Polymerase | Speed | ~1,000 bases/second |
    | V. natriegens Genome | Total size | 5.17 million bases |
    | | Number of chromosomes | 2 |
    | | Chromosome 1 | 3.25 million bases |
    | | Chromosome 2 | 1.93 million bases |
    | Replication Time (single polymerase) | Chromosome 1 | 54 minutes |
    | | Chromosome 2 | 32 minutes |
    | Cell Division | V. natriegens division time | 9.8 minutes |
    | Ribosome Composition | Total ribosomal RNA | 4,566 nucleotides |
    | | Total proteins | 54 |
    | | Total amino acids | 7,500 |
    | Ribosome Speed | Translation rate | ~16 amino acids/second |
    | | Time for one ribosome to build one ribosome | 7 min 50 sec |

    This raises other questions, too. For instance: rather than fully doubling its ribosome pool, why doesn’t a dividing cell simply give fewer ribosomes to each daughter?

    A cell could do this. But doing so would mean each daughter cell then needs to “catch up” and make more ribosomes so it can grow at its maximum capacity again. Cells must devote about half their ribosomes toward making the various proteins needed to sustain life (not ribosomes). If a cell devotes too many ribosomes toward making other ribosomes, it will not be able to sustain its metabolism, or make energy, or copy its genome, or do all that other stuff. Shortchanging daughter cells, then, is just passing a problem down to future generations.

    So the ribosome bottleneck holds, no matter how we come at it. But this makes V. natriegens’ growth rate even more impressive. This microbe, pulled from a glob of mud in Georgia, has evolved a way to divide quite close to its theoretical, biophysical limit; mostly by optimizing for ribosome biosynthesis.

    First, V. natriegens has at least a dozen ribosomal RNA operons, or gene clusters encoding ribosomal RNA molecules, in its genome. E. coli, for comparison, has seven. And second, these ribosome genes are located next to “strong” promoters, or genetic sequences that recruit RNA polymerase enzymes. In other words, Vibrio devotes more of its genome to ribosomal genes, and has also evolved a stronger “start” signal for those genes, meaning the cell makes ribosomal RNA much more frequently, and in higher numbers, than other microbes.

    Scientists don’t fully understand why V. natriegens evolved to grow quickly, though. But remember that these cells were first discovered in nutrient-rich mud, on an obscure island off the coast of Georgia, where lots of organic matter washes up with the tide. As this tide flushes out, nutrients go with it. In their natural environment, then, these cells are exposed to ebbs and flows of nutrient-rich soup; cells that divide faster are able to “scoop up” more nutrients before it disappears. The end result, over millions of years, is that cells evolve to grow and consume as quickly as possible.

    I can’t help but wonder why evolution “stopped” at 9.8 minutes, though, rather than the eight minutes it takes to theoretically double the ribosome pool. Those extra two minutes, it turns out, come from the fact that a dividing cell must make not only ribosomes, but also many other proteins, before it divides. A cell needs to make all the enzymes required for DNA replication, proteins to “pull apart” the chromosomes for each daughter cell, lipid molecules to grow the cell membrane, and so on. All of these things require proteins, which are made by ribosomes. And that’s why ribosomes can’t spend all their time making other ribosomes! (Even at maximum growth rates, most microbes only devote about one-third of their ribosomes toward making more ribosomes. The rest are used to build other things.)

    Still, I wonder if cells could grow even faster.

    Math “Knobs”

    The interesting thing about essays is that they describe phenomena in the English language, and thus are imprecise by their nature. I can work really hard to edit my sentences and make my words as clear as possible, but there will always be a chance that you, my reader, will be confused. Or, I could just simplify everything by giving a single equation which captures and explains the whole phenomenon. It turns out that this works remarkably well for cell division.

    A few years ago, researchers at Caltech published a paper, titled “Fundamental limits on the rate of bacterial growth and their influence on proteomic composition.” In it, they write down two simple, mathematical relationships. First, they note that the fraction of a cell’s mass devoted to ribosomes depends on how many ribosomes it has (of course) and how big those ribosomes are, relative to all the proteins in the cell. And second, for a cell to double in size, it must synthesize a cell’s worth of new protein, and the rate at which ribosomes do this determines how fast the cell grows.

    By smashing these two relationships together, they arrived at a single equation — with just four parameters[1] — that describes how quickly a cell will divide:

    \lambda = \frac{r_t \cdot f_a \cdot \Phi_R}{L_R}

    The left side, λ, is the cell’s exponential growth rate; the doubling time is ln(2)/λ, so a bigger λ means faster division. On the right, there are four terms. rt is the translation elongation rate, or the speed at which a ribosome puts amino acids together; in most microbes, this is 15-30 amino acids per second. fa is the fraction of ribosomes actively making proteins at any given moment. In a normal cell, at any narrow slice of time, about 15 percent of all ribosomes are idle. ΦR is the ribosomal mass fraction, or the percentage of all protein in the cell that is ribosomal. And LR, on the bottom, is the total number of amino acids in each ribosome.

    The beauty of this equation — the reason it nearly brings a tear to my eye — is that it immediately explains both the biophysical limits of cell division and the knobs, or “dials,” by which we can change it. We can intuit, for example, that fa must always be less than 1.0, because some ribosomes will always be between jobs, searching for their next strand of messenger RNA. And ΦR must be less than 1.0, too, because a cell made entirely of ribosomes is a cell without a metabolism, membrane, and so on. Both of these parameters have hard ceilings.

    To get a feel for what’s biologically plausible, let’s plug in some back-of-the-envelope numbers for V. natriegens:

    rt = 20 amino acids per second (translation rate)
    fa = 0.85 (fraction of ribosomes actively translating)
    ΦR = 0.50 (fraction of protein mass in ribosomes)
    LR = 7,500 amino acids per ribosome

    Crunching these numbers, we get λ = 4.08 h⁻¹, or a doubling time of 10.2 minutes, remarkably close to what Eagon measured in 1962![2]
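
    If you want to redo this arithmetic yourself, here is a small sketch; the 20 percent smaller ribosome at the end is a hypothetical scenario (discussed below), not something anyone has built.

    ```python
    # Back-of-the-envelope growth rate from the four parameters above. The
    # "smaller ribosome" case is a hypothetical scenario, not an engineered strain.
    import math

    def growth(r_t, f_a, phi_R, L_R):
        """Return (lambda in 1/hour, doubling time in minutes)."""
        lam_per_hour = (r_t * f_a * phi_R / L_R) * 3600
        return lam_per_hour, 60 * math.log(2) / lam_per_hour

    for label, L_R in [("baseline ribosome (7,500 aa)", 7500),
                       ("hypothetical 20% smaller ribosome (6,000 aa)", 6000)]:
        lam, t_double = growth(r_t=20, f_a=0.85, phi_R=0.50, L_R=L_R)
        print(f"{label}: lambda = {lam:.2f} per hour, doubling time = {t_double:.1f} min")
    ```

    The baseline case lands at λ = 4.08 per hour and a 10.2-minute doubling time; trimming the ribosome shaves roughly two minutes, which is the intuition behind the “smaller ribosome” idea below.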

    The nice thing about mathematical equations, like this, is that they not only point at biophysical limits, but also reveal which parameters can be tweaked to change the results. Now that we know the four parameters which set growth rate, in other words, we can begin to dream up clever ways to tune each “knob” to make cells grow faster or slower.

    One option is to engineer ribosomes such that they literally build proteins faster. If we could raise the rt parameter to 30 or more (as some other microbes have), then division time goes down. Or, alternatively, we could try and make ribosomes smaller. Researchers have already explored this for E. coli. In 2002, researchers studied which proteins — of the 54 found in the E. coli ribosome — were “conserved” across other bacteria, archaea, and eukaryotes. In other words, they wanted to figure out which proteins show up again and again across species, and which proteins were only found in a few species (and, thus, might be disposable).

    They found that about 21 of E. coli‘s ribosomal proteins show up in bacteria, but not archaea or eukaryotes, and some could plausibly be trimmed. I’m not aware of anyone who has actually tried this, but I wouldn’t be surprised if we could cut out, say, 20 percent of the ribosome without impacting its function too much, and thus shave a couple minutes from the theoretical cell division time. Somebody should try this!
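
    To get a rough, illustrative feel for how much these knobs matter, we can plug tweaked values into the same equation while holding everything else fixed (all numbers here are assumptions, not measurements):

    ```python
    import math

    def doubling_time_minutes(r_t, f_a, phi_R, L_R):
        """Doubling time in minutes from the four-parameter growth equation."""
        growth_rate_per_hour = (r_t * f_a * phi_R) / L_R * 3600
        return math.log(2) / growth_rate_per_hour * 60

    baseline = doubling_time_minutes(20, 0.85, 0.50, 7500)  # ~10.2 minutes
    faster   = doubling_time_minutes(30, 0.85, 0.50, 7500)  # elongation at 30 aa/s: ~6.8 minutes
    trimmed  = doubling_time_minutes(20, 0.85, 0.50, 6000)  # ribosome 20% smaller: ~8.2 minutes

    print(baseline, faster, trimmed)
    ```

    In this toy calculation, a 20 percent smaller ribosome shaves roughly two minutes off the doubling time, in line with the guess above.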

    Another option is to raise fa by boosting the fraction of “active” ribosomes within the cell. Protein synthesis is the most energetically expensive thing a cell does, so many organisms have evolved mechanisms to shut ribosomes down when they are not needed, thus conserving energy. E. coli, for example, carry “hibernation factors,” proteins that grab onto ribosomes and push them into an inactive form when they are not needed. It’s not known if V. natriegens encode the same proteins, but we could search through their genome and delete similar genes to test this theory.

    Or, perhaps, we could take a more agnostic approach and just let evolution take its course, albeit in an accelerated way. If Vibrio evolved with slow ocean tides, maybe we could make them evolve even faster in the laboratory. Perhaps we could run a Richard Lenski-esque experiment, in which V. natriegens’ cells are grown in a robotic bioreactor and flooded with glucose every few hours, followed by stretches of nutrient starvation. If we repeat this lots of times, some microbes may evolve to grow even faster during those periods of high glucose. Or maybe not; V. natriegens may already be quite close to the theoretical cell division time limit.

    These experiments haven’t been done yet. But that, in a way, is the whole point. 

    I never planned to write this essay, which emerged entirely by accident, with one question leading to another, until I found myself deep in the weeds of ribosomes and biophysics and growth rates. What surprised me most, in the end, was that the clearest answer to my question was found not in words, but rather in a single equation with just four parameters. Biology, at its limits, can often be described best with mathematics.

    This equation only exists because generations of biophysicists heard about a record set by a microbe pulled from Georgia mud in 1958 and couldn’t let it go. They spent decades modeling ribosome fractions and translation rates; not because anyone asked them to, but because the record raised questions which bothered them and they wanted desperately to answer. Eventually, they wrote down an equation that not only explains why V. natriegens divides as fast as it does, but points toward how we might push it further still.

    Records, it turns out, are not merely trivia, but rather a map toward the loose threads that, when pulled, unravel something remarkable about the world. In glorifying the exceptional, we can find answers to the mundane.

    1. A typical E. coli cell has 75,000 ribosomes, whereas V. natriegens has 115,000. The equation helps explain why. Cells can’t just crank up ribosome speed indefinitely, because there is a maximum rate of protein biosynthesis; the only way to grow faster, then, is to make more ribosomes. But making too many ribosomes (especially when nutrients are scarce) depletes the cell’s amino acids and slows growth down. Cells must therefore balance their ribosome numbers against the available nutrients. This also explains, somewhat, why larger cells — even of the same species — divide more quickly; they use the extra space to house more ribosomes. ↩︎
    2. Cells grow exponentially; each division yields two cells, each of which divides again. Therefore, the actual doubling time is not 60/λ, or roughly 15 minutes, but rather ln(2)/λ, or about 0.693/λ. Hence the 10.2 minute figure. ↩︎
  • eLife Fallout

    In October 2023, shortly after the war in Gaza began, scientist Michael Eisen shared an article from the satirical news website The Onion on Twitter. The piece was titled “Dying Gazans Criticized For Not Using Last Words To Condemn Hamas,” and Eisen added: “The Onion speaks with more courage, insight and moral clarity than the leaders of every academic institution put together.”

    At the time, Eisen was editor-in-chief of an open-access science journal called eLife. Ten days later, he was fired. After five of the journal’s editors resigned in protest, eLife’s board of directors released a statement: 

    Mike has been given clear feedback from the board that his approach to leadership, communication and social media has at key times been detrimental to the cohesion of the community we are trying to build … It is against this background that a further incidence of this behavior has contributed to the board’s decision.

    This happened more than two years ago. At the time, it received extensive media coverage. Some news articles celebrated the decision, while others quoted sources who criticized the journal for its “irrational attack” on Eisen’s freedom of speech. But the mainstream press never explained the precise events and academic in-fighting that preceded Eisen’s ousting.

    Eisen was not fired because of a tweet, says Prachee Avasthi, who served on eLife’s board of directors. Rather, tensions had been mounting for months between eLife’s leadership team and its editors and readers. The journal had spent years pushing the boundaries of both publishing and peer review. eLife first required authors to publish preprints before submitting to the journal, and then they got rid of accept-reject decisions entirely. Eisen increasingly found his decisions at odds with the norms of the scientific community he was trying to reform. So when Eisen sent out his tweet, says Avasthi, the board just had a convenient excuse to get rid of him.

    The whole story is quite strange, especially given that the people involved — not only Eisen and Avasthi, but also former editors at the journal — still regularly cross paths in San Francisco’s open-science community. Even as eLife fractured, the same ideas and people reassembled elsewhere, carrying forward pieces of the original reform efforts. Conversely, the journal has quietly retreated from parts of the vision Eisen laid out for it.

    Origins of eLife

    eLife was designed as an experiment in removing gatekeepers from scientific publishing.

    Founded in 2012, eLife quickly joined the ranks of “prestige” journals because three famous research charities — the Howard Hughes Medical Institute, the Max Planck Society, and the Wellcome Trust — agreed to fund a large portion of its operations. From the start, this support meant that eLife didn’t have to worry about the financial pressures that often plague academic journals, which otherwise rely on large subscription and article-processing fees for survival.

    The journal’s first editor-in-chief was Nobel Laureate Randy Schekman. In 2013, Schekman criticized Nature, Science, and Cell as “luxury journals,” comparing their low acceptance rates to high-end “fashion designers” who deliberately inflate demand by restricting supply. Just 8 percent of papers submitted to Nature are eventually accepted.1 (eLife’s acceptance rate in 2025, for comparison, was 15.4 percent.)

    Under Schekman, the journal began implementing reforms to remove gatekeepers and give authors more control over decisions. In mid-2018, eLife began requiring that authors post their manuscripts on preprint servers, such as bioRxiv, before submitting to the journal. The goal, as Eisen explained, was to show that journals could act as reviewers rather than as judges. “The main reason for doing that,” Eisen says, “was to show that publishing wasn’t our job. We were reviewing papers that authors had already published themselves.”

    The second major reform at eLife was to do away with accept-reject decisions entirely, thus making editors more like academic collaborators than gatekeepers.

    This was a big change, at least compared to the conventional publishing model. When scientists submit a paper to Nature, say, it is first assigned to an in-house editor, who decides whether a submission meets the journal’s standards. According to Nature’s website:

    The criteria for a paper to be sent for peer-review are that the results seem novel, arresting (illuminating, unexpected or surprising), and that the work described has both immediate and far-reaching implications.

    These criteria don’t leave room for null results, incentivizing authors to overstate the merits of their work. If the editor thinks that a paper fails to meet this bar, they can unilaterally reject it. If the editor thinks that it does meet this standard — and there is usually a bit of politicking involved — then the paper is sent to two or three peer reviewers.

    The reviewers usually take 2-3 weeks to read the paper and write feedback. Their reports return to the editor, who decides whether the manuscript should be rejected, revised, or accepted. Most papers go through at least two rounds of review. The full process can take months or years. 

    Reviewers may give differing opinions about a paper, too. Reviewer #1 might ask the scientists to repeat an experiment, while Reviewer #2 commends it and tells the editor to accept the paper as-is. This can get quite confusing, of course. But at most journals, the editor compiles and returns all of the reviewers’ feedback — conflicting or not — to the authors, who must then decide whether to revise their manuscript or send it elsewhere. The editors usually wield final authority over which studies appear in the journal.

    But since its inception, eLife — seeking to improve this peer review process — had adopted something called “consultative peer review,” meaning reviewers and editors talked to each other before sending back a single set of non-conflicting comments to the authors. Unlike most other journals, in a show of transparency, eLife also openly published the decision letters and reviewer reports for all accepted articles. (In a 2016 survey, 95 percent of reviewers said that “the consultation process at eLife adds value for authors.”)

    In mid-2018, under Schekman’s tenure, eLife launched the Triage Trial, an experiment that removed accept-reject decisions for roughly 300 papers. Editors and reviewers still gave authors feedback, but eLife then published those comments whether or not the authors revised the manuscript or resubmitted it to a different journal. In other words, eLife became a peer review platform, rather than a typical publisher with yes-no decisions. “If we publish reviews for all papers, then why do we need an accept or reject decision at all?” Eisen says. “There was no good argument for only publishing the reviews of accepted papers.”

    Each of these decisions — from the shift in peer review, to requiring that authors post their studies on preprint servers, to eliminating accept-reject decisions — moved eLife closer to its ultimate aim of putting authors, rather than editors, in control of publishing. 

    Schekman resigned from his position in early 2019, and was replaced that February by Michael Eisen, a geneticist at UC Berkeley. Eisen had previously co-founded the Public Library of Science (PLOS), a pioneering open-access publisher, in 2001. The board interviewed five candidates, says Avasthi, and the finalists were Eisen and a “more moderate” choice. But Eisen received unanimous support from the board’s selection committee.

    At the same time, everybody at eLife knew that Eisen was not a politically neutral choice. Eisen has urged scientists to access papers via SciHub, “a shadow library that provides free access to millions of research papers, regardless of copyright,” according to Wikipedia. (Many scientists do this anyway, but it’s kind of a “you’re not supposed to talk about it” situation within academic circles). In 2018, Eisen also ran for a U.S. Senate seat in California, tweeting an image of Donald Trump with the caption, “What a fucking asshole.”

    In October 2022, based on results from the Triage Trial, eLife — now under Eisen — announced that they would scrap accept-reject decisions across the entire journal. From now on, editors would only screen submissions for serious flaws, and then pass them to outside experts for review. These reviewers would write a public report and, with the editor, an accessible description of what the study showed. eLife would then post both the paper and these reviews — even scathing ones — online as a “Reviewed Preprint.” Authors could then revise and repost their paper through eLife or resubmit elsewhere.

    “By relinquishing the traditional journal role of gatekeeper and focusing instead on producing public peer reviews and assessments, eLife is restoring control of publishing to authors,” Eisen wrote in a public letter. Avasthi, a member of eLife’s board of directors, was fully supportive of Eisen’s vision. Rejecting papers often felt arbitrary, she says, since those same studies would later appear in other journals anyway.2 “So why reject it? We realized these processes just slow things down and remove author agency.”3 (For a 2012 paper, researchers found that about 75 percent of all submitted papers are published in their first-choice journal. Of those papers that do get rejected, a majority are later published elsewhere.) 

    At the same time, eLife changed its revenue model. Previously, authors paid $3,000 only if their paper was accepted. Most other journals also charge authors only upon acceptance, meaning that the more selective they are, the more money they lose reviewing papers. Instead, eLife began charging a flat fee of $2,000 for all submissions. Critics saw this as opportunistic, but it directly tied the fee to a service: namely, peer review.

    With each change, eLife moved closer to Eisen and Avasthi’s ultimate goal of “a world where journals might not even need to exist.” Scientists would get to choose when and how to share their work, but without being able to hide criticisms by quietly re-submitting rejected work to other journals. The scientific community would then organize around studies or reject them conceptually, based on these public reviews, rather than promoting studies merely because they appeared in a “prestigious” journal.

    But then tensions started mounting.

    The Gatekeepers’ Demise

    For a while, eLife appeared to navigate this publishing shift successfully. Initial reactions were positive, submissions were steady, and the leadership team felt optimistic that other journals might follow their example.

    But then, in early 2023, with eLife poised to fully implement its new policies, a group of prominent editors — including Schekman — began voicing concerns, according to reporting in Nature. In private letters, nearly 30 senior editors threatened resignation, arguing that removing the accept-reject decision would undermine the journal’s prestige and compromise peer review standards. The board, too, began to question the reforms, and quietly postponed the scrapping of accept-reject decisions.

    Stephen Heard, an evolutionary biologist and writer at the University of New Brunswick, was one of eLife’s most vocal critics. Heard argued that eLife’s policy wasn’t even particularly radical; the journal still charges fees, screens submissions, and “is mostly a journal — just one with a 0 percent rejection rate for manuscripts that make it to the peer-review stage.” 

    Heard also suggested that publishing “unrevised” papers would shift the function of scientific quality-control onto readers. By removing power from editors, readers would need to evaluate the merits of papers themselves. This could be problematic, he claimed, because not all readers have the time or expertise needed to judge a paper’s quality.

    Mark Hanson, a professor at the University of Exeter’s Penryn campus, applauded eLife’s courage but thought the move was “bad for the health of science … a push towards the death of expertise.” In his view, eLife hadn’t killed gatekeeping, but had instead swapped “hard” editorial power for “soft” influence: 

    Before, editors gave a binary accept/reject. Now they give an implicit accept/reject. I doubt authors will actually publish articles as final version of record [sic] if they’re totally trashed in the editor statement. But it frees up authors in that grey area to publish anyways, even if the editor and reviewers aren’t fully endorsing the article … 

    eLife remained publicly supportive of Eisen. In an editorial, the board and a few editors urged researchers to give the model a chance. “It is a very exciting time at eLife as we try to push the frontiers of publishing and navigate the challenges along the way,” wrote deputy editor Tim Behrens.

    But scientists continued to worry about submitting manuscripts to the journal. They wondered if papers published there would “count” for their career. If eLife was no longer accepting or rejecting articles, would universities and hiring committees treat them in the same way as other published articles? There was concern that papers in eLife would be viewed as “lesser than” articles published in standard journals.

    In March 2023, twenty-nine eLife editors and Schekman, the former editor-in-chief, wrote a letter to Damian Pattinson, the executive director of eLife’s non-profit owner, urging him to replace Eisen “immediately,” according to reporting in Nature.

    They added that they had no confidence in Eisen’s leadership because he had dismissed their concerns and had not considered compromise positions. One of the journal’s five deputy editors had already stepped down from that leadership position, they wrote, and ‘significant numbers’ of reviewers and senior editors were ‘standing ready to resign.’

    The eLife board published an open letter in response, reiterating support for the new model. But despite this public show of support, members of the board privately worried. 

    In group chats and emails, Avasthi said, several board members were discussing the blowback from scientists “on a daily basis” and fielding “daily complaints from powerful scientists.” Avasthi, who was then chair of the board, felt that her colleagues were not doing enough to support Eisen, especially given that he was merely championing decisions the board itself had already made. In late March, Avasthi resigned from her position. A few months later, Eisen wrote the infamous tweet that got him sacked. “I think the truth is that they had grown sick of me,” he says:

    [The board] didn’t want to deal with the reality of what [reform] actually looks like, which is that people were going to get upset. People were going to say, ‘I am never publishing in eLife anymore.’ People were going to accuse us of ruining science. People were going to attack us. So the practical consequences of trying to actually do something, as opposed to pretending to do something, is not easy and it’s not pleasant.

    End of the Impact Factor

    After Eisen’s firing, eLife tapped its two deputy editors, plant biologist Detlef Weigel and neuroscientist Tim Behrens, to run the journal through 2024 while the board looked for a permanent leader. Weigel and Behrens’ first task was crisis control: reassuring skittish editors, calming authors, and keeping the new publishing model alive.

    In October 2024, however, a company called Clarivate announced that eLife would lose its Impact Factor, a metric invented in the 1960s to help librarians select subscriptions from a growing number of scientific journals. In the last couple decades, universities have increasingly treated Impact Factors as a shorthand for prestige. Most scientists call the metric “stupid” or “arbitrary,” yet still try to publish in journals with high numbers because university hiring committees — as well as grant reviewers at major institutes — care about it.

    An Impact Factor is calculated by taking the number of citations a journal’s recent papers receive in a given year and dividing it by the number of papers the journal published over the previous two years. If Nature published 100 papers over two years, for example, and those papers collectively netted 2,000 citations the following year, then its Impact Factor would be 20. In 2023, eLife’s Impact Factor was 6.4.
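
    In code, the arithmetic is just a division. The numbers below are the hypothetical Nature figures from above, not real journal data:

    ```python
    # Simplified Impact Factor arithmetic (hypothetical numbers)
    citations_this_year = 2000      # citations received this year by papers from the prior two years
    papers_prior_two_years = 100    # papers the journal published over those two years

    impact_factor = citations_this_year / papers_prior_two_years
    print(impact_factor)  # 20.0
    ```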

    Clarivate’s decision to remove the journal’s Impact Factor hinged on a simple concept: If the journal was not making accept-reject decisions, then there was no way to tag a paper as officially being “in” the journal, and so Clarivate could not fairly calculate the metric. Even before Clarivate made this decision public, the company had flagged eLife’s listing as “On Hold;” university librarians also noticed that some eLife papers had quietly disappeared from Web of Science, a paper indexing platform created by Clarivate. Authors began to worry that their eLife papers would become hard to find and, therefore, difficult to cite.

    In response, the board made a partial compromise: The journal would send papers above a certain quality threshold (submissions designated as “solid” or higher in the eLife editorial assessments) to be indexed in Clarivate’s Web of Science. But publicly, the journal protested, arguing that Clarivate’s move would stifle “attempts to show how publishing and peer review can be improved using open-science principles.” Critics, including Avasthi, viewed eLife’s decision as a capitulation; as a way of bringing back accept-reject decisions, albeit in a different form. 

    By early 2025, Behrens — now permanent Editor-in-Chief — insisted eLife would press on even if it lost its Impact Factor entirely. “We want to prove you can succeed without that number,” Behrens wrote to staff. But submissions to the journal have dipped significantly, especially from scientists in geographic regions where metrics are a deeply entrenched part of academic evaluation, such as in China and the United States. (Submissions from Europe dipped only slightly, according to Behrens.) eLife fully lost its impact factor in June 2025.

    Even so, eLife refused to return to the old system. “Our job is to take the risks now so other journals can copy what works, with much lower risk,” says Behrens. “Any experiment that plays by different rules from Clarivate will hit the same breakpoint” as his journal experienced. 

    Instead, the journal would lobby universities, funders, and scientific societies to state publicly that eLife papers (and papers published in other journals that follow its model) will count just as much as papers published in journals with Impact Factors. On 8 May 2025, eLife stated publicly on its website:

    We’ve spoken to funders and institutions around the world and found that more than 100 (over 95% of respondents) still consider eLife papers in research evaluation despite eLife’s exclusion from the journal Impact Factor.

    “We’ve conflated sharing science with judging scientists,” says Behrens. “Decoupling them is the whole point.”

    Conclusion

    Despite the upheaval, eLife remains a great journal. Its decision to get rid of accept-reject decisions did not condemn it to obscurity. The journal still publishes thousands of articles yearly and is widely respected amongst scientists. Articles average a speedy 95 days between submission and publication.

    But still, as the journal decided first to require preprints and then to remove power from its editors, scientists wondered: Why not just do these experiments at another journal? Why “tank” the prestige of eLife itself?

    The reason, says Eisen, is that the journal carried weight; it was considered prestigious. “The only way to address the reliance on a journal’s brand as a proxy for quality was to take a recognized signifier and destroy that signification,” he says. “We wanted to remove reliance on that brand.” 

    A spin-off journal, lacking the same measure of perceived prestige, would be ignored by the scientific community. The whole point was to run this experiment in a big journal, in other words, so that people would pay attention.

    eLife’s decision to remove accept-reject decisions, and the backlash that ensued, also reveals the difficulties with metascience reforms as a whole. It shows how everything in academia is intertwined. When one pulls a single thread, like accept-reject decisions, the whole fabric of how we do science begins unraveling, and one sees how tightly a journal’s “prestige” is linked to hiring decisions and grant funding. Scientists may cheer reform in principle, but big actions usually fail because nobody in the collective wants to go first. 

    The story of eLife, then, is a test case for the entrenched incentives in science, many of which are governed by anointed “overseers.” Isn’t it bizarre that a for-profit company, Clarivate, is able to set and control a metric that has become so critical for tenure, grants, and publishing? Isn’t it odd that editors wield such extraordinary control over which papers get accepted or rejected in their journal? 

    Scientists who publish in Nature, Cell, or Science are far more likely to win big grants, given the perceived prestige of those journals. In many ways, then, a scientist’s success stems from their relationships with editors at those journals. “The editors of Nature have more influence over funding than the head of the NIH,” says Behrens. “That’s absurd.”

    And then there are the papers themselves, which aren’t a logical way to convey science in the first place. The work of a laboratory isn’t “done” when a paper goes out and gets published. Academic papers present experimental results as a linear, neatly-packaged “story,” when in reality, science does not resemble this in practice. A better way to publish would be to keep notebooks which get updated in real-time and clearly explain both the successes and failures of a given project.

    “If you ask me what’s the absolute worst thing about the current system of publication,” says Eisen, “it’s that we decide, at some point, that we have all the information that we need in order to show others that [our science is] valid and important.”

    Still, the culture of scientific publishing is slowly shifting. Thousands of researchers have posted preprints and written public reviews for eLife. The journal is working hard to scale up its efforts. Behrens plans to give away eLife’s software, data pipelines, and even financial information so other journals can try out the same model.

    At the end of the day, the story of eLife is not the story of Eisen, or of his firing, or of free speech. It’s about what happens to those who try to change the incentive structures of science. eLife itself is just a journal — “one journal of thousands,” as Avasthi says — in a sea of other journals. Its rise, fall, and continued existence is arbitrary, as is so much else about how we do science.

    1. These numbers are misleading. Authors who “know” their paper won’t make it into Nature usually don’t submit it at all, because it wastes time. Instead, they often send it to a journal with a higher acceptance rate or a journal more suited to their particular field, like Nature Microbiology or Nature Biotechnology. ↩︎
    2. eLife actually collected data on this to verify the claim; so it’s neither speculative nor anecdotal. ↩︎
    3. This often has the effect of hiding the feedback that resulted in the rejection anyway, since that may or may not surface upon review elsewhere. Authors could ignore that feedback and publish the work as-is elsewhere, but readers suffer by not having reviewer concerns revealed. ↩︎
  • Fast Biology Bounties

    TL;DR: $10,000 in prizes for ideas on how to speed up wet-lab experiments. Prizes will be given for ideas that are highly original and technically tractable. A few paragraphs will suffice. Please send ideas to nsmccarty3@gmail.com with the words “Fast Biology” in the title by March 15th, 11:59pm Pacific.

    Wet-lab biology is a major bottleneck for scientific progress. Even in a scenario where AI models come up with useful ideas, those hypotheses must still be tested in the real world.

    And yet, the world is slow. Atoms are harder and more expensive to manipulate than bits. A kid with a $1,000 laptop can make a nearly infinite number of digital things using solely the electricity from a wall socket. But cloning even a single gene — something biologists do for just about any experiment — takes several days, hundreds of dollars in reagents, and tens of thousands in equipment. This is unacceptable.

    Fortunately, history shows that technical advances can drastically reduce the time and cost of common methods. In 1965, it took Robert Holley several years to sequence a single alanine tRNA, consisting of just 77 nucleotides. Today, an entire human genome, with billions of nucleotides, can be sequenced in a few hours, thanks mostly to innovations in chemistry and high-resolution microscopy. Similar improvements are long overdue for other biology methods.

    Therefore, I’m offering $10,000 in prizes for ideas to speed up or reduce costs for wet-lab experiments. Prizes will be awarded for ideas that are both highly original and technically tractable. These bounties are supported by Astera Institute.

    One idea might be to create a protein printer that enables scientists to fabricate any amino acid sequence without needing to order a DNA template. Or, perhaps you have an idea to use Vibrio natriegens (an organism that divides every nine minutes) with some kind of in situ mutagenesis system to speed up mutational scanning studies. The sky is the limit, and these are only tiny examples.

    You may submit ideas about anything; narrow or broad. Just don’t forget about originality and tractability. I’m skeptical that winning ideas will propose general solutions, or make hand-wavey statements about wet-lab automation, cloud labs, and “physical intelligence.” My strong suspicion is that the best ideas will propose concrete, technical approaches to speed up widespread methods, like PCR or cloning or protein synthesis.

    This competition is open to anybody, regardless of background or experience. You may remain anonymous. I’m also open, in principle, to keeping your idea private for a period of up to three months. There is no limit on submissions, either in quantity or length. I plan to award four prizes of $5,000, $3,000, $1,000, and $1,000. Winners will be notified one week after submissions close, on March 22nd.

    Send submissions to nsmccarty3@gmail.com with “Fast Biology” in the subject line by March 15th, 11:59pm PT. (If I receive just one great idea, I will consider this a smashing success.)

    Terms: These bounties are a one-time award and not a grant to support future research, study, or services. Acceptance of a bounty does not create any employment or contractor relationship with Astera Institute. I cannot send money to people living in countries “blacklisted” by the United States. (See this website for more details.) Anybody who is currently affiliated with Astera Institute is not eligible. Ideas must be submitted via email and before the deadline to be considered. All decisions are final and made according to the perceived originality, clarity, and feasibility of the idea. I may not award bounties if no submission meets these standards. Authors retain full ownership over their ideas, but grant me a non-exclusive, royalty-free license to publish or reference the idea with appropriate attribution. Paid bounties are taxable income.

  • 30 Great Essays About Biology

    The world needs more essays about biology. So last month, I tweeted a link to one of my favorite essays (#1 below) and promised that I would continue to share an additional essay every day for the next 29 days. I titled the series, “30 Essays to Make You Love Biology.”

    I’ve now assembled all 30 essays in this article. I hope you’ll read them and emerge with a deeper appreciation for the cell, atoms and their confluence with physics and math.

    I scoured the internet for non-paywalled versions of each article, so all links go to freely accessible versions. This effort was inspired by the website “Read Something Wonderful.” Enjoy!

    1. “I should have loved biology” by James Somers. An easy-to-read essay about how biology is poorly taught in schools, and how this poor teaching masks its most intriguing bits. Students are typically told to read textbooks and memorize facts about the cell (Mitochondria are the powerhouse of the cell!) without ever appreciating its miraculous complexity. Tests are often given as multiple choice, with little to no problem-solving involved. As Somers writes: “It was only in college, when I read Douglas Hofstadter’s Gödel, Escher, Bach, that I came to understand cells as recursively self-modifying programs.” Link
    2. “Cells are very fast and crowded places” by Ken Shirriff. A short essay about some awe-inspiring numbers in cell biology. My two favorite lines are: “A small molecule such as glucose is cruising around a cell at about 250 miles per hour” and “a typical enzyme can collide with something to react with 500,000 times every second.” Link
    3. “Seven Wonders,” by Lewis Thomas. When Thomas was asked by a magazine editor “to join six other people at dinner to make a list of the Seven Wonders of the Modern World,” he declined and instead drafted this article about the seven wonders of biology. Number 2 on the list: Bacteria that survive in 250°C waters. Link
    4. “Life at low Reynolds number,” by E.M. Purcell. An all-time classic. One of the best biology lectures of all time. This essay opened my eyes to the weirdness of life at the microscale, where “inertia plays no role whatsoever.” Or, as Purcell says, “We know that F = ma, but [microbes] could scarcely care less.” Link
    5. “The Baffling Intelligence of a Single Cell,” by James Somers & Edwin Morris. This interactive article, about chemotaxis and flagella, gives “an intuition for how a bag of unthinking chemicals could possibly give rise to a being.” It’s stunning and slightly reminiscent of the great Bartosz Ciechanowski’s blog. Link
    6. “Thoughts About Biology,” by James Bonner. A little-read essay, I think, that deserves more attention. Published in 1960, Bonner argues that biology is ever-changing and progress, often, comes from those outside the field. Part of biology’s beauty is that you can push it forward regardless of background. Link
    7. “Biology is more theoretical than physics,” by Jeremy Gunawardena. It is often said “that biology is not theoretical,” writes Gunawardena, but that’s not true. This essay gives examples where theory preceded and informed major discoveries in biology. It’s a must-read, especially for those who want to work on biology but don’t feel compelled to work at the bench with a pipette in hand. Link
    8. “Can a biologist fix a radio?” by Yuri Lazebnik. One of my favorites. Biologists tend to catalog things by breaking them apart. But without quantitative insights, it is difficult to piece them back together into a holistic understanding. Even if you think a line of inquiry in biology has been exhausted, there is always room to go deeper. Link
    9. “Schrodinger’s What Is Life? at 75” by Rob Phillips. In 1944, physicist Erwin Schrödinger wrote a book, called “What is Life?” that pondered a single question: “How can the events in space and time which take place within the spatial boundary of a living organism be accounted for by physics and chemistry?” This essay is an ode, synopsis, and expansion of that classic book. “Names such as physics and biology are a strictly human conceit,” writes Phillips, “and the understanding of the phenomenon of life might require us to blur the boundaries between these fields.” Link
    10. “Molecular ‘Vitalism’” by Marc Kirschner, John Gerhart & Tim Mitchison. Students are often taught that genes are the bedrock, or blueprint, for biology. But this picture is quickly changing, unraveling, fading. “Although…proteins, cells, and embryos are…the products of genes, the mechanisms that promote their function are often far removed from sequence information.” Link
    11. “Escherichia coli,” by David Goodsell. Goodsell is a computational biologist who also makes brilliant watercolor paintings of living cells. His paintings are based on atomic truth—that is, the ribosomes, mRNAs, and DNA molecules are all painted to scale. This short essay explains how he does it. Link
    12. “How Life Really Works,” by Philip Ball. This essay challenges much that students are taught about how cells actually work. DNA is not some all-powerful blueprint of the cell, as textbooks often suggest. To truly understand life, argues Ball, one must first realize that cells are far more complex than that. They are, in fact, intelligent agents that change their surroundings to their own benefit. Link
    13. “A Long Line of Cells,” by Lewis Thomas. Another masterful essay that traces one man’s life, and mankind’s progress, through the lens of evolutionary biology. It helped me appreciate how my own life is deeply intertwined with the lives of organisms all around me. Link
    14. “AlphaFold2 @ CASP14,” by Mohammed AlQuraishi. Biological progress is swift, and that is one reason it is so exciting. In this first-person essay, a computational biologist marvels at a scientific breakthrough in predicting protein structures from their amino acid sequences. Link
    15. “Theory in Biology: Figure 1 or Figure 7?,” by Rob Phillips. Another great essay about theory—and not just wet-lab experiments—as a key driver of scientific progress. “Most of the time, if cell biologists use theory at all, it appears at the end of their paper, a parting shot from figure 7. A model is proposed after the experiments are done, and victory is declared if the model ‘fits’ the data.” But such an approach is misguided, writes Phillips. As Henri Poincaré once said: “A science is built up of facts as a house is built up of bricks. But a mere accumulation of facts is no more a science than a pile of bricks is a house.” Link
    16. “On Being the Right Size,” by J.B.S. Haldane. Published in 1926, this essay made me appreciate the myriad forms and functions of lifeforms all around me. I learned why an insect is not afraid of gravity; why a flea as large as a human couldn’t jump as high as that human; why a tree spreads its branches, and much more. Simple, beautiful. Link
    17. “I Have Landed,” by Stephen Jay Gould. The final essay in a 300-essay series, Gould  writes about how he often lies awake at night, pondering his purpose in the Universe and his fear of death. And how, upon deep reflection, he is most stunned by the fact that life—after more than 3.5 billion years of evolution—continues to exist at all “without a single microsecond of disruption.” Link
    18. “A Life of Its Own,” by Michael Specter. Published in The New Yorker in 2009, this piece explores the then-nascent field of synthetic biology. It opens by telling the story of Jay Keasling, a professor at UC Berkeley, who engineered yeast to make an antimalarial drug called artemisinin, which has been used to save at least 7.6 million lives. Artemisinin was historically extracted from the sweet wormwood plant in a painstaking and low-efficiency process. Link
    19. “Slaying the Speckled Monster,” by Jason Crawford. Smallpox killed an estimated 300 million people in the 20th century alone. This essay explains how a long line of brilliant scientists—from John Fewster and Edward Jenner to D.A. Henderson—invented the first vaccines against the disease and then, in the 1960s, launched campaigns to eradicate smallpox entirely. An inspiring story about how biological discoveries can save lives. I also learned this: “The origin story [about smallpox vaccines] that is usually told, where Jenner learns of cowpox’s protective properties from local dairy worker lore or his own observations of the beauty of the milkmaids, turns out to be false—a fabrication by Jenner’s first biographer, possibly an attempt to bolster his reputation by erasing any prior art.” Link
    20. “Why we didn’t get a malaria vaccine sooner,” by Saloni Dattani, Rachel Glennerster & Siddhartha Haria. Malaria has killed billions of humans in the last few centuries and continues to kill 600,000+ each year. This is, simply put, the best essay ever written on the history of malaria and the invention of vaccines to prevent it. We are living through a revolutionary time, considering these vaccines were only approved for the first time in 2021. Link
    21. “Biology is a Burrito” and “Fast Biology,” by Niko McCarty. Cells are often envisioned as wide-open spaces, where molecules diffuse freely. But this isn’t true. In reality, cells are so crowded, it’s a wonder they work at all. Every protein in the cell collides with about 10 billion water molecules per second. Protein ‘motors’ make energy-storing molecules by spinning around thousands of times a minute. Sugar molecules fly by at 250 miles per hour, nearly double the speed of a Cessna 172 airplane at cruising speed. When I first heard these numbers, I thought they were made up. After all, how is it even possible to measure such things? The world’s most powerful microscope cannot necessarily “see” a protein motor spinning, or watch a sugar molecule move through a cell. As a PhD student, I jumped head-first into the world of biological speed. My goal was to collect some “remarkable” numbers in biology and understand the experiments that brought them to light. My search made me appreciate how remarkable it is that life functions at all, considering the chaotic conditions in which cells exist. It also gave me a new appreciation for biology, and the incredible exactitude that one must have to engineer it — let alone engineer it successfully. Link | Link
    22. “Jonas Salk, the People’s Scientist,” by Algis Valiunas. Salk made one of the first successful polio vaccines. A double-blind clinical trial, launched in 1954, showed that patients who received his vaccine “developed paralytic polio at about one-third the rate of the control groups. On average across the different types…the vaccine was eighty to ninety percent effective.” Shortly after the trial’s results were made public, journalist Edward R. Murrow interviewed Salk. When Murrow asked Salk who held the patent on the vaccine, Salk replied: “Well, the people, I would say. There is no patent. Could you patent the sun?” Reading this essay helped me to appreciate the struggle and strife of biological research, the fickleness of fame, and the positive impact that a small group of scientists can have on the world. Link
    23. “On Protein Synthesis,” by Francis Crick. Arguably the most important essay in biology’s history, this was adapted from a lecture that Crick gave in 1957 during which the famed geneticist made several accurate predictions about how cells work well before experimental evidence existed to support them. “I shall…argue that the main function of the genetic material is to control (not necessarily directly) the synthesis of proteins,” wrote Crick. “There is a little direct evidence to support this, but to my mind the psychological drive behind this hypothesis is at the moment independent of such evidence.” At the time, scientists weren’t sure DNA had anything to do with proteins. In this essay, Crick also predicted the existence of a small ‘adaptor’ molecule that brings amino acids to the ribosome for protein synthesis (now known as tRNAs) and that future scientists would chart evolutionary lineages by comparing DNA sequences between organisms. Crick was years ahead of his time. This essay is a masterclass in scientific thinking. Link
    24. “The People Who Saw Evolution,” by Joel Achenbach. My favorite article on this list. Every year, for 40 years, Peter and Rosemary Grant traveled to Daphne Major, a volcanic island in the Galápagos, to study Charles Darwin’s finches. During that time, they watched “evolution happen right before their eyes.” In 1977, for example, just 24 millimeters of rain fell on Daphne Major, causing major food sources—including small, soft seeds—to become scarce. When the Grants returned to the island in 1978, they found that finches with smaller beaks had died off, whereas “finches with larger beaks were able to eat the seeds and reproduce. The population in the years following the drought in 1977 had ‘measurably larger’ beaks than had the previous birds.” I also strongly recommend the book, “40 Years of Evolution,” from Princeton University Press. Link
    25. “Is the cell really a machine?” by Daniel J. Nicholson. Living cells are far more complex—and beautiful—than any machines made by human hands. In this essay, a philosopher points to four areas of current research where the metaphor of “cells as machines” breaks down. For example: Even though proteins are depicted as static or unmoving molecules, they actually “behave more like liquids than like solids.” Link
    26. “Biological Technology in 2050” by Rob Carlson. “In fifty years,” writes Carlson, “you may be reading The Economist on a leaf. The page will not look like a leaf, but it will be grown like a leaf. It will be designed for its function, and it will be alive. The leaf will be the product of intentional biological design and manufacturing.” This is a futuristic essay about the potential of manipulating atoms via living cells. Link
    27. “Research Papers Used to Have Style. What Happened?” by Roger’s Bacon. This is an ode to beautiful scientific writing. The essay draws from classic biology research papers to make its case. Link
    28. “Night Science,” by Itai Yanai & Martin Lercher. A personal essay about scientific discoveries that do not emerge from the scientific method as it’s taught in school, as told by two biologists. Perhaps it will inspire you to take up night science experiments of your own. Link
    29. “Atoms Are Local,” by Elliot Hershberg. Biology is the ultimate distributed manufacturing platform. Cells harvest atoms from their environments—air and soil—and rearrange them to build materials, medicines, and everything we need to live. Link
    30. “The Mechanistic Conception of Life,” by Jacques Loeb. This is the article that got me hooked on biology a decade ago. Written by one of history’s greatest biologists, it poses a number of questions that I suspect will keep scientists busy for many decades to come. “We must either succeed in producing living matter artificially,” writes Loeb, “or we must find the reasons why this is impossible.” Link

    What essays did I miss? Let me know in the comments and I’ll expand the list 🙂

  • A Christmas Story

    I.

    For centuries, physicians noticed an unsettling pattern: a string of young boys who seemed doomed to bleed. Every scrape or cut on their bodies oozed blood long after other boys had scabbed and healed. Doctors didn’t know the cause; some speculated that the bloody kids merely had “fragile blood vessels.” Others suspected that platelets — the small, disc-like cells that help form clots — were defective. A slight bump against a doorframe might cause a bruise that blackened and spread beneath the skin. Even bending the knees could cause joints to fill with blood! Worse still, internal organs would rupture and hemorrhage, causing the lungs or brain to fill with blood.

    Whatever the cause, families watched their children die before reaching adulthood. In the 1960s, prospects for people with this disease were still grim. A 1967 study of 113 untreated patients found that most died in childhood or early adulthood, often from minor injuries that triggered uncontrolled bleeding. Only eight of these patients survived beyond 40 years of age.

    It wasn’t until the mid-20th century that researchers began to unravel, slowly, a mechanism for the disease. In 1952, researchers at Oxford University figured out that hemophilia — as the disease came to be called — was not one condition, but at least two. They reached their conclusion while studying a young boy named Stephen Christmas, and even published their findings in the Christmas issue of the British Medical Journal.

    II.

    Physicians have recognized hemophilia since ancient times. The Babylonian Talmud forbade the circumcision of a male child if two of his brothers had already died of bleeding after the same procedure. In the 10th century, an Arab physician named Albucasis described a family in which multiple male relatives bled to death after minor injuries, according to an academic review titled The History of Hemophilia. These early authors had no way of understanding genetics, but they did suspect some kind of inherited pattern — a familial “curse,” so to speak.

    The first clinical documentation of hemophilia in the “modern literature” appeared in 1803. John Conrad Otto, a Philadelphia physician, noted the disorder among his patients; he painstakingly traced pedigrees, mapping who bled and who carried the condition. This analysis laid the foundation for understanding hemophilia as an inherited disease passed from unaffected mothers to their sons (what we now know as X-linked inheritance), although the genetic mechanism was not yet known. Otto published his findings in an article entitled “An Account of a Hemorrhagic Disposition Existing in Certain Families.”

    Two decades later, Friedrich Hopff at the University of Zurich dug more deeply into the disease by studying families with recurring bleeding disorders and tracking which males were affected. Hopff wrote detailed case histories of men who bled spontaneously or who bled for days after a minor trauma. It was Hopff who first coined the term “hemophilia” by combining the Greek “Hemo-” (blood) and “-philia” (love or affinity).

    In the 19th century, hemophilia gained widespread attention when doctors realized that Queen Victoria of England — who reigned for 63 years, from 1837 until 1901 — carried the disease. Victoria passed it to her youngest son, Leopold, who died of a brain hemorrhage at 31 in Cannes. (At the end of his life, Leopold retreated to the south of France in search of refuge from the harsh British winters.)

    Two of Queen Victoria’s daughters, Alice and Beatrice, were also carriers. After marrying into other royal houses, they spread hemophilia into Spain, Germany, and Russia. Tsar Nicholas II’s son, Alexei, also inherited the gene through his mother, Alexandra (a granddaughter of Queen Victoria). It was Alexei’s frequent bleeding episodes that first drew Grigori Rasputin, a peasant faith healer, into the Romanov court. When Alexei was killed in 1918, Russia lost its last tsesarevich, or heir apparent.

    And still, nobody knew what actually caused hemophilia. But slowly, over time, researchers discovered much more. Like how a defective segment on the X chromosome prevents the body from producing a functional clotting factor. Or that the process of clot formation in healthy people involves a series of proteins called “factors,” each activating the next in a cascading sequence.

    Factor IX, for example, activates factor X, which in turn helps convert prothrombin into thrombin; thrombin then converts fibrinogen into fibrin to form a stable clot. When mutations disable factor IX, the clotting cascade stalls. Patients then bleed more, sometimes dramatically. This defect became known as hemophilia B. If the mutation affects another factor instead, called factor VIII, then patients are said to have hemophilia A. (There is also a third form of hemophilia, type C, that affects factor XI.)

    In the early 20th century, physicians did not know there were different subtypes. They had assumed that all these bleeding problems stemmed from the same root cause: “weak blood vessels.” This assumption persisted into the 1930s.

    But then, in 1936, two Harvard doctors isolated a substance from plasma that could fix the clotting defect in some people with hemophilia. They named this substance “antihemophilic globulin,” but did not know why the substance helped blood clot in some cases, yet not in others.

    An answer to their question would not appear until 1947, when an Argentinian physician named Alfredo Pavlovsky mixed blood from two separate hemophilia patients and found, oddly, that the blood clotted quickly. One of those patients had hemophilia A, and the other had hemophilia B. Pavlovsky did not realize it at the time, but this observation showed that each patient lacked a distinct factor. One patient’s factor VIII could complement the other patient’s factor IX deficiency, and vice versa. Researchers slowly began to recognize that they were dealing with separate disorders.

    The defining moment in hemophilia B’s story, though, came in 1952. That’s the year Oxford scientists first described a five-year-old boy, named Stephen Christmas, who had suffered frequent, uncontrollable bleeding since he was 20 months old. When the doctors mixed Christmas’ blood with blood from hemophilia A patients, they noted normal clotting, much like Pavlovsky had five years earlier. The scientists therefore concluded that Stephen Christmas did not lack “antihemophilic globulin” (now called factor VIII).

    Unlike Pavlovsky, though, the Oxford team took their experiments much further and showed, for the first time, that Christmas was missing a different protein, which they dubbed the Christmas Factor (factor IX). Their findings were published in the British Medical Journal’s Christmas issue and the disease was named, fittingly, “Christmas Disease.”

    The Oxford team used clever blood-mixing experiments to make their discovery. They took small samples of blood plasma from different patients and observed what happened when they were combined. If two samples improved clotting times when mixed, it suggested that each plasma had at least some clotting element the other lacked. For instance, mixing patient #2’s blood with patient #4’s blood yielded faster clotting times, indicating that each person was deficient in a different factor. In contrast, mixing certain pairs that both lacked the same factor did not produce any improvement in clotting. This approach ruled out the possibility that Christmas Disease was simply hemophilia A by another name. Over time, the disease was renamed to hemophilia B.

    III.

    Efficacious treatments for hemophilia did not appear for another decade after the Christmas paper. In 1964, a Stanford scientist named Judith Graham Pool discovered that the slushy precipitate left after partially thawing plasma (called the cryoprecipitate) contained a high concentration of factor VIII. This discovery meant that blood banks could collect and store large amounts of clotting factors in relatively small volumes.

    Patients with hemophilia A — a factor VIII deficiency — could now receive fewer, more potent infusions to control or even preempt bleeding episodes. This was great news, of course, but it did not help hemophilia B directly because factor IX was still missing from the cryoprecipitates. Still, hemophilia A affects about six times as many people as hemophilia B (roughly 1 in 5,000 births, compared to about 1 in 30,000), and the idea of separating and concentrating specific clotting factors set the stage for future treatments.

    The next leap came in the 1970s, when researchers developed freeze-dried concentrates containing both factor VIII and factor IX. These concentrates could be stored easily and administered at home, which allowed patients to treat themselves as soon as bleeding began. Orthopedic surgeries also became much safer, giving patients a chance to correct damage that had already accumulated.

    In Sweden, doctors like Inge Marie Nilsson and Ake Ahlberg went even further: they pioneered prophylactic treatment, giving factor VIII to hemophiliacs on a regular schedule rather than waiting for bleeds. The same principle applied to factor IX for hemophilia B patients. This approach transformed hemophilia from a life-threatening disorder into a manageable, yet chronic, condition.

    There is a tragic sidenote in this tale, though. Before 1985, many plasma-derived concentrates were unknowingly contaminated with human immunodeficiency virus (HIV) and hepatitis viruses. A devastating number of hemophilia patients contracted these conditions. It is estimated that 4,000 of the 10,000 hemophiliacs then thought to be living in the U.S. died from AIDS.

    Today, hemophilia has morphed from a chronic condition into a curable one. Lasting genetic fixes are now available. Rather than requiring frequent or even weekly infusions of factor IX, patients can get a one-time dose of a gene therapy — such as Hemgenix, approved by the FDA in 2022 — that prompts their own cells to make factor IX.

    Hemgenix is a one-time infusion for hemophilia B. It works like this: First, a healthy copy of the factor IX gene is packaged into an adeno-associated virus, or AAV. The virus is then infused into the bloodstream (this takes an hour or two), where it travels to the liver, gloms onto cells, and delivers its genetic payload. The delivered gene persists inside liver cells, largely as a separate, stable piece of DNA rather than by integrating into the genome, and instructs those cells to make functional factor IX. After getting Hemgenix, 96% of participants stopped using their normal prophylactic medication.

    The Hemgenix clinical trials measured the annualized bleeding rate before and after gene therapy. During the lead-in period, patients had about 4.1 bleeds per year. In months 7 to 18 after treatment, that average dropped to 1.9 bleeds per year. In other words, patients bled less than half as often after the gene therapy as they did before. The researchers also measured how much functional factor IX the patients’ blood contained over 24 months. Their factor IX levels hovered around 36–41% of normal. That range is typically enough to support normal clotting, making severe bleeds far less likely.

    In the United Kingdom, the National Health Service will pay about 2.6 million pounds per patient for Hemgenix. This price may seem high, but it’s likely far lower than the cost of giving those patients factor replacement medicines over several decades of their lives.

    It’s incredible to me that only one hundred years ago, families watched helplessly as children with “weak blood vessels” bled and died from small bumps. And that now, we have a gene therapy that corrects the disorder and makes hemophilia liveable for the first time in human history.

    So this Christmas, I’m grateful for biotechnology. Although the field is often tied to scary things like “bioweapons” — especially by those outside of biology — my experience is that biotechnology is far more often used as a force for good. Christmas Disease is just one example of that. In 2025, I’m hopeful that we’ll see much more progress on AAV engineering (using AI and other tools!) to make gene therapies safer, more precise, and less likely to cause severe immune reactions. If we figure this out, gene therapies could be used to cure many diseases that were once considered little more than death sentences.

  • How to Calculate BioNumbers

    Arithmetic is a superpower. Or, as Dynomight has written, a “world-modeling technology.” It is one of the first things we learn in school, and yet few seem to use it in everyday life to make predictions about the world.

    Physicists use back-of-the-envelope arithmetic all the time, though. Enrico Fermi famously used it to estimate the energy released during the Trinity atomic bomb test. Standing ten miles away, he wrote that:

    About forty seconds after the explosion the air blast reached me, I tried to estimate its strength by dropping from about six feet small pieces of paper before, during and after the passage of the blast wave. Since, at the time, there was no wind, I could observe very distinctly and actually measure the displacement of the pieces of paper that were in the process of falling while the blast was passing. The shift was about 2.5 metres, which, at the time, I estimated to correspond to a blast that would be produced by ten thousand tons of TNT.

    I don’t often meet biologists who use similar estimates to test their assumptions, even though they stand to benefit just as much as physicists. In Fast Biology, I gave an anecdote about some Caltech researchers who were trying to figure out the rate-limiting factor for bacterial growth — specifically, the “thing” that limits a cell’s division rate. They found the answer (ribosome biosynthesis) using simple arithmetic, scribbled on a sheet of paper. No complicated experiments were required.

    I’d like more biologists to use simple arithmetic to check their ideas prior to running experiments. Similarly, I hope more people outside biology will enter the field and contribute. To encourage this, I’m launching a new blog series called Order-of-Magnitude Thinking. Every few weeks, I’ll pose a question and walk through the steps I take to arrive at an answer using arithmetic. I hope you’ll follow along and try these calculations yourself. Over time, I think you’ll become adept at developing biological intuitions, doing sanity checks on experiments, and so on.

    Let’s start with a basic question: How long does it take E. coli to turn one average-sized gene into one protein?

    Before answering, let’s review some molecular biology. When I say “turn,” I really mean transcribe DNA into messenger RNA (mRNA), and then translate that mRNA into protein. We can think of a gene, in this case, as a stretch of DNA that contains all the instructions needed to build that protein. Three mRNA letters are called a “codon” and encode one amino acid — the building blocks of proteins. Thus, a gene is at least three times as long, in nucleotides, as the protein it encodes is in amino acids.

    Now we’re ready to move forward. The first step is to break down the question and collect our variables. We’ll need to know the size of an average gene in E. coli, the size of a protein encoded by that gene, the transcription rate (the number of DNA “letters” converted to mRNA per second) and the translation rate (how many amino acids are added to a protein per second).

    If this question were about mammalian cells, we’d also need to account for the time it takes mRNA to move from the nucleus to the cytoplasm. But E. coli cells lack a nucleus, so we can ignore this step; their genome is mixed in with everything else, meaning that ribosomes can kick off translation as soon as an mRNA appears.

    I use the BioNumbers database to look up variables. Searching “average gene length E. coli” indicates that the average gene encodes a protein about 330 amino acids long. Recall that each amino acid is encoded by three nucleotides, so let’s assume that an average E. coli gene has about 1,000 nucleotides.

    What about transcription and translation rates? At 37°C (a standard temperature for E. coli growth), the transcription rate is about 40 nucleotides per second. A typical translation rate is 8 amino acids per second.

    Great. Now that we’ve got our numbers, we can carry on with the calculation.

    First, we calculate the transcription time — the number of seconds it takes to convert our average-sized gene into mRNA. This is 1,000 nucleotides divided by 40 nucleotides per second, or 25 seconds.

    Next, we calculate the translation time — the time required for ribosomes to convert the mRNA into a protein. This is 330 amino acids divided by 8 amino acids per second, or about 41 seconds.

    At first glance, we might assume that the total time to make a protein is the sum of these two values: 25 seconds + 41 seconds = 66 seconds, or 1 minute and 6 seconds. But because E. coli lacks a nucleus, transcription and translation happen at the same time. Translation kicks off as soon as the mRNA starts forming. In other words, the creation of proteins in E. coli is bottlenecked by the speed of translation. Therefore, I’d estimate that it only takes about 40 seconds to make one protein from a gene.
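    If you’d like to check this arithmetic yourself, here’s a short Python sketch using the same numbers quoted above; only the variable names are mine.

    # Back-of-the-envelope estimate: time for E. coli to make one protein
    # from an average-sized gene, using the numbers quoted above.
    gene_length_nt = 1_000       # ~330 codons, so roughly 1,000 nucleotides
    protein_length_aa = 330      # average E. coli protein
    transcription_rate = 40      # nucleotides per second at 37 °C
    translation_rate = 8         # amino acids per second

    transcription_time = gene_length_nt / transcription_rate    # 25 seconds
    translation_time = protein_length_aa / translation_rate     # ~41 seconds

    # Transcription and translation are coupled in bacteria, so the slower step
    # (translation) sets the pace rather than the two times simply adding up.
    naive_total = transcription_time + translation_time
    coupled_estimate = max(transcription_time, translation_time)

    print(f"Naive, sequential estimate: {naive_total:.0f} seconds")
    print(f"Coupled estimate: about {coupled_estimate:.0f} seconds")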

    Keep in mind that this estimate involves several assumptions! For instance, we’re assuming that proteins fold immediately, even though some take several minutes to adopt their final structure. We’re also assuming that transcription begins immediately, even though the cell may have to wait several seconds for the correct enzyme to latch onto the correct gene. Many biological processes are limited by diffusion — the time it takes for molecules to encounter each other — and this is an issue I’ll return to in future estimates.

    In any case, the goal of order-of-magnitude estimates is to get within a factor of ten of the underlying reality. It’s okay to round numbers up or down, or to factor in some of your assumptions, to make your final estimate. You’ll develop an intuition for how to do this effectively over time. But for this question, I think it’s safe to say that it takes “a minute or so” to make a protein from a gene in E. coli.

  • We Need Biotech Data

    In 2011, while working in Brazil, Max Roser began formulating the idea for Our World in Data. He initially planned to publish “data and research on global change,” possibly as a book. Before long, that modest blueprint morphed into something far more ambitious.

    Our World in Data went live in May 2014 and, according to Roser, attracted an average of 20,000 visitors per month in its first six months. Today, the website has a worldwide audience. It’s difficult to get exact metrics, but they have more than 300,000 followers on Twitter alone. I’d argue that their true value, though, is not in their “audience reach,” but rather in their global impact.

    By publishing numbers and charts about global change on the internet, Our World in Data plays a key role in finding aspects of global development — like malaria cases over time — that are particularly stubborn and, therefore, ripe for philanthropic or government interventions. In essence, they have shown how numbers, displayed in accessible forms, can illuminate which issues deserve urgent attention and where efforts can accelerate progress.

    We should build a similar initiative for biotechnology. The Schmidt Foundation has forecasted that the bioeconomy (encompassing everything from medicines to microbe-made materials) “could be a $30 trillion global industry.” If we intend to realize that potential, we first need to benchmark where biotechnology has been, assess where it stands now, and identify the most pressing challenges ahead.

    “If I’m just watching the news, I’m going to find it very difficult to get an all-things-considered sense of how humanity is doing,” researcher Fin Moorhouse has written. “I’d love to be able to visit a single site which shows me — in as close as possible to a single glance — some key overall indicators of how the world’s holding up.” Biotechnology deserves precisely this kind of concentrated, data-driven resource.

    More specifically, I’m imagining a website that aggregates information on everything from the computational costs of protein design to the efficiency of gene-editing tools across cell lines. Such a resource would help researchers, investors, and policymakers figure out which areas demand attention and which breakthroughs are worth scaling, all while helping prevent misuse.

    Pieces of this puzzle already exist, but, it seems, only in scattered or ad-hoc formats. Rob Carlson, managing director of Planetary Technologies, has famously published data on DNA sequencing and synthesis costs. His charts became so popular that people eventually dubbed them “Carlson Curves.” Meanwhile, Epoch AI, a research institute that monitors the computational demands and scaling of AI models, is building the benchmarks and datasets needed to track the AI field’s progress. They could serve as a model for this biotechnology effort.

    A dedicated nonprofit research institute for “Biotech Data” could systematically track metrics such as:

    • Cloning times over the last several decades. How long does it take to synthesize DNA, stitch it together, and make sure everything works as intended? Bottlenecks in cloning slow scientific progress as a whole, because the speed of experiments is a key driver of how quickly science moves.
    • CRISPR off-target scores over time. How frequently do gene-editing tools make unintended cuts in the genome, and how can we standardize measurements across studies? We’ll need to make some benchmarks.
    • Resolution and speed of cryo-EM. How rapidly have improvements in cryo-electron microscopy accelerated, both in terms of resolution and throughput?
    • Antibody manufacturing titers over time. Using a single antibody as reference, what titers are companies achieving in CHO (in g/L) or other cell types over time?
    • Bioscience PhDs awarded per year. How many new doctorates emerge from academia, and where do they end up across industry, startups, and research labs?

    Note that these datasets span both technical and societal issues. This is deliberate; to scale biotechnology, we have to understand both scientific breakthroughs and the workforce dynamics behind them. Tools are useless without a workforce to wield them. Many of these numbers already exist on the internet, but are buried in unwieldy government PDFs or tucked away in a patchwork of scientific articles. Others may require painstaking curation by combing through decades of research articles.

    Starting this nonprofit wouldn’t be too difficult. You could begin by collecting one dataset, transforming it into a chart, and posting it online. People on Twitter and LinkedIn seem to really love data visualizations, so you could probably grow an audience quickly. Over time, you might build automated scraping tools for government websites, create reusable templates to make charts quickly, and even publish short blog posts about various charts (like why, exactly, cryo-EM resolution got so good; what were the key innovations?)
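    To give a sense of how little code this takes to prototype, here’s a minimal sketch of a reusable chart template. The CSV filename and column names are placeholders I made up, not a real dataset.

    # Minimal sketch of a reusable chart template for a "Biotech Data" site.
    # The CSV path and column names below are placeholders, not a real dataset.
    import pandas as pd
    import matplotlib.pyplot as plt

    def plot_metric(csv_path: str, x: str, y: str, title: str, log_y: bool = False) -> None:
        """Load a two-column time series and render it in a consistent house style."""
        df = pd.read_csv(csv_path)
        fig, ax = plt.subplots(figsize=(7, 4))
        ax.plot(df[x], df[y], marker="o")
        if log_y:
            ax.set_yscale("log")  # cost curves, like DNA synthesis, usually need a log axis
        ax.set_xlabel(x)
        ax.set_ylabel(y)
        ax.set_title(title)
        fig.tight_layout()
        fig.savefig(title.lower().replace(" ", "_") + ".png", dpi=200)

    # Example call, with a hypothetical file:
    # plot_metric("dna_synthesis_cost.csv", "year", "usd_per_base", "DNA synthesis cost", log_y=True)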

    If this vision appeals to you, send me an email (niko@asimov.com), and I’ll help you get started. We briefly considered launching this venture at Asimov Press but, with only two full-time employees, we don’t have the bandwidth. We might, however, be keen to fund the project.

  • Biotech Needs a Hydrogen Atom

    The hydrogen atom revolutionized physics.

    Throughout the 20th century, physicists used this atom to develop a quantum theory of matter. By using the same atom from one experiment to the next, physicists were able to compare results and reconcile their findings. Hydrogen is the foundation on which physics built its cathedral.

    Biotechnology needs its own hydrogen atom.

    A zoologist and a protein engineer both call themselves biologists, but otherwise share little in common. Biology is broad and multi-faceted. Even in narrowly focused fields—such as Alzheimer’s disease or cell death—disagreements abound. Every scientist pursues their own ideas using slightly different methods and cell strains. Papers are promptly locked behind paywalls and negative findings are rarely published at all.

    This is not a good way to build scientific cathedrals. Biotechnology promises to do so much for our world, and yet I fear I’ll never see many of its goods in my lifetime, simply because of the scattershot way in which we work. Biotechnology can learn from physics and build its own cathedral.

    Imagine a biological singularity, of sorts, in which one could design any molecule, or any cell, for any purpose. If biotechnology moved beyond an era of trial-and-error and billion-dollar development timelines, and could instead be used to design safe solutions to problems at will, most diseases would have a cure. Materials would be grown from layers of engineered cells and plants would fix their own nitrogen. Abundance.

    If this sounds like overzealous optimism, well, that’s because it is. But these achievements are not impossible. Cells are made from molecules, which are made from atoms, which can be understood. Nothing in this quest flies against the laws of physics. This century should be devoted to the mapping, quantification, and deep understanding of how life works, such that we can begin to reliably design living organisms to do more good in the world. We’re already seeing this with protein design; in the future, we may see it with cell design.

    But first, biotechnology will need to find its own hydrogen atom, a foundation on which to build tools and knowledge that can later be applied more broadly. I’d like to propose Mycoplasma genitalium, an organism with perhaps the smallest genome of any free-living thing. We’ve already made great progress in understanding this “simple” cell, but there is more to be done.

    In 2006, the J. Craig Venter Institute reported that only 382 genes in M. genitalium are essential. A whole-cell model of this organism’s life cycle followed in 2012. But even now, dozens of genes in M. genitalium have unknown functions. We don’t fully understand how its molecules interact to carry out behaviors, and most of its proteins have unknown structures. There are also mysteries in the ways that these cells communicate and draw resources from their environment.

    We should build an institute that is wholly devoted to understanding a single type of cell, be it M. genitalium or another, at a depth that is complete enough such that its entire life and all its functions can be simulated on a computer. Achieving that simulation would require first that we build technologies to study life at high spatial and temporal resolutions, for one cell or populations of interacting cells, and then feed the collected data into predictive models that can later be applied more broadly. This institute would ideally operate as a non-profit and make all of these tools and models open-source.

    In this way, a single cell could provide a foundation for biotechnology’s future.

  • The Case for Bridge Editors

    Arc Institute researchers recently published a preprint showing that their gene-editing technology, called Bridge recombinases, works in human cells. Many people applauded the paper on social media, while others asked, “Wait, how does this tool even work? And why does it matter?”

    Fair questions. The preprint is not easy to understand, and the reasons for inventing a new type of gene-editing tool in 2025 are even less obvious. After all, there are already dozens of CRISPR gene-editing tools to swap ‘letters’ in the genome, delete stretches of DNA, or replace one sequence with another. What makes these recombinases any better?

    A few things. But if you don’t care to read on, and just want to hear my quick argument in about 65 words, then here it is:

    Bridge recombinases can make large-scale changes to the human genome that other gene-editing tools cannot. Therefore, scientists can use them to answer basic research questions that they couldn’t before, like how certain chromosomal abnormalities cause cancer. Also, Bridge recombinases are able to make those big genome changes without relying on cellular repair mechanisms, which could make them more predictable than other gene-editing tools.

    So let me explain what a Bridge recombinase is. At its core, it’s a genome-editing tool made from two parts: a protein (called the recombinase) that cuts and rejoins strands of DNA, and an RNA molecule (called the ‘Bridge’) that guides the recombinase to a specific site in a cell’s genome.

    Bridge recombinases were discovered in nature; not made in a laboratory. They are a type of transposase, an enzyme encoded by mobile genetic elements that naturally “cut-and-paste” themselves into new places in the genome. Transposases are found all over the place, including in plants, bacteria, and animals. Almost half of the human genome is thought to have originated from transposable elements, which get duplicated and move around over millions of years.

    Most transposase proteins recognize a specific stretch of DNA and always insert their transposable elements at that particular sequence. This means that it is very difficult to “reprogram” most transposase proteins.

    But last June, Arc Institute researchers described a family of naturally occurring transposases, called IS110, that rearrange DNA using an unusual mechanism. Unlike most transposases, which recognize DNA through protein binding, IS110 transposases use small RNA molecules (called “Bridge RNAs”) instead.

    In nature, these Bridge RNAs attach to two DNA sequences at the same time: one at the target location where the insertion occurs, and one within the transposon itself (the “donor DNA”). By bridging these two sequences, the Bridge RNA instructs the recombinase protein to cut and paste the transposon at the desired location. The Arc scientists showed that they could modify these Bridge RNAs to instruct the recombinase to edit other locations in the genome, too—not just the transposon’s original site. These scientists found two recombinases in this IS110 family (called IS621 and IS622) that can be used to edit large chunks of DNA in bacterial or human cells, respectively.

    Now, at this point you may be wondering: “OK, so that’s it? Bridge recombinases can edit genomes, but why not just use CRISPR-Cas tools to do that instead?” And the answer is this: The special thing about Bridge recombinases is that they edit the genome without relying on cellular repair mechanisms, unlike CRISPR-based tools.

    Since 2012, scientists have discovered all kinds of Cas proteins with various numbered names, like Cas9 and Cas12 and Cas13 and even (my favorite) Cas7-11. Researchers have also invented lots of CRISPR “spin-offs,” such as base editors and prime editors. Of all the CRISPR gene-editing tools available, prime editors are perhaps the most versatile. Prime editors can make lots of different types of edits to a genome, like swap one nucleotide for another (say, A → C), insert short sequences, or delete segments of DNA. These edits are usually short, though—typically 40–80 nucleotides, and rarely longer than 100 nucleotides.

    All CRISPR gene-editing tools share a flaw, though: they rely on cell repair pathways to make their edits. If a researcher wants to permanently “shut down” a gene, for example, they might use CRISPR-Cas9. The Cas9 protein goes into the genome and makes a cut at the position indicated by its guide RNA. But the Cas9 protein itself doesn’t then fix the DNA it has broken. The cell has to fix that damage a different way.

    Cells have two main ways to fix DNA breaks. Non-homologous end joining quickly slaps the two broken strands together, often adding or deleting random bits of DNA in the process. It is messy, but fast. The second option, homology-directed repair, uses a matching DNA template to fix the break. Basically, scientists can introduce a DNA “donor template” into cells alongside CRISPR-Cas9. The cell sees this template as the correct version of the sequence and copies from it to fix the break. But homology-directed repair only works reliably during specific phases of the cell cycle and happens less frequently than non-homologous end joining.

    Because CRISPR relies on these cellular repair pathways, its edits are inherently unpredictable. Cellular repair systems are non-deterministic (sometimes the cell uses one option, and sometimes it uses the other), and different cells therefore produce different results. Bridge recombinases bypass these cellular pathways, which could make their edits more predictable.

    Which brings me, finally, to the last question: “OK, so Bridge recombinases are perhaps more reliable than CRISPR tools, and they can make larger types of edits to the genome. But how can we actually use these things in the real world?”

    In a few different ways. For their recent preprint, researchers used a Bridge recombinase to precisely invert a 930,000 base-pair sequence in human cells, and also to chop out 130,000 bases in a single go. They also used Bridge recombinases to edit a gene linked to a disease called Friedreich’s ataxia. Whereas healthy people have several repeats of a sequence—GAA—in a gene called FXN, people with Friedreich’s ataxia have hundreds or thousands of the repeats. This causes the gene to make a defective protein that, in turn, slowly causes nerve damage. In a cell culture model, the researchers used a Bridge recombinase to cut out more than 80 percent of the repeating sequences (and it worked about 40 percent of the time.)
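    To build intuition for what those edits actually are, here’s a toy Python sketch that treats the genome as a string and applies an inversion or an excision between two coordinates. It ignores all of the actual biology (Bridge RNA design, the recombination chemistry, the small flap mentioned in the footnotes); it only shows what “invert 930,000 bases” or “delete 130,000 bases” means as an operation on a sequence.

    # Toy model of recombinase-style edits on a genome represented as a string.
    # The coordinates and sequence below are made up for illustration.

    def reverse_complement(seq: str) -> str:
        complement = {"A": "T", "T": "A", "G": "C", "C": "G"}
        return "".join(complement[base] for base in reversed(seq))

    def invert(genome: str, start: int, end: int) -> str:
        """Flip the segment genome[start:end], as in a recombinase-mediated inversion."""
        return genome[:start] + reverse_complement(genome[start:end]) + genome[end:]

    def excise(genome: str, start: int, end: int) -> str:
        """Remove the segment genome[start:end] in a single step."""
        return genome[:start] + genome[end:]

    toy_genome = "AAAATTTGGGCCCCAAAA"
    print(invert(toy_genome, 4, 14))   # the segment between positions 4 and 14 is flipped
    print(excise(toy_genome, 4, 14))   # the same segment is removed instead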

    Now let’s zoom out and think about bigger applications for these Bridge recombinases. I can think, almost immediately, of two uses. The first is to study cancer in the laboratory, and the second is to quickly make transgenic mouse models for preclinical trials.

    There are many types of cancers caused by large-scale genome rearrangements. Chronic myeloid leukemia, for example, happens when chunks of chromosomes 9 and 22 swap places. Using Bridge recombinases, researchers could recreate this rearrangement in healthy cells to study how it causes disease and how to reverse it. Ditto for Ewing’s sarcoma, a bone cancer caused by another type of chromosome fusion.

    Bridge recombinases could also make it much simpler to make transgenic mice. Millions of mice are used in biology research each year. But historically, researchers would spend many months or years making just one transgenic mouse because the main technology for doing so—called Cre-loxP—is such an obnoxious pain to work with.

    Cre-loxP is a genetic tool that uses an enzyme, called Cre recombinase, to cut and rearrange DNA sequences located between specific DNA markers, called loxP sites. (In other words, Cre recombinase cuts DNA, but only at places in the genome containing these little DNA markers. Cre recombinases are therefore not programmable in the same way as a Bridge recombinase.)

    So to make a mouse model, scientists engineer one mouse line to have loxP sites flanking the gene of interest. Then, they engineer another mouse line to make the Cre recombinase. After breeding these two mouse lines together, the offspring inherit both the Cre protein and the loxP-flanked DNA. Only at that point can Cre recombinase finally rearrange or remove the DNA between the loxP sites. Each genetic change requires a new set of loxP insertions and many additional breeding cycles.

    With Bridge recombinases, much of this tediousness goes out the window. Instead of spending months making custom mouse lines with loxP sites, researchers can just design a single Bridge RNA and inject that RNA and the bridge recombinase protein, together, into embryos. The bridge RNA pairs with the DNA targets, and the recombinase rearranges the genome right there, in one step. No separate mouse lines, no extended breeding cycles, no pre-installed DNA sequences.

    There are other uses for Bridge recombinases, too. Scientists can use them to make just about any type of large-scale edits, which means that “genome design” is now an actual possibility. And so maybe the questions we started with—”How does this work? Why is it better?”—aren’t even the right ones.

    For decades, biologists have mainly been observers: cataloging genes, making little mutations, mapping chromosomes, knocking stuff out and trying to put piecemeal observations together again. But now, for the first time, there is a tool good enough to rewrite large stretches of the human genome. So the questions worth asking have changed. Instead of wondering what these tools can do, we should start thinking about what we want them to do.

    Thanks to Nicholas Perry and Matt Durrant for reading drafts of this.

    1 There is most likely a 2 bp flap that remains after recombination using the Bridge system. This can probably be mitigated depending on how scientists design the target and donor sequences. But in most cases it would require cellular processes to repair that 2 bp flap, according to an author of the preprint, Matt Durrant.

    2 There are more advanced forms of prime editing, but they are out of scope for this article. David Liu’s group has used prime editors to insert recombinase “landing sites” into a genome, and then used recombinases to make gene-sized insertions. But in general, prime editors alone are only able to modify ~100 bases of DNA at once.

    3 Prime editors get around these two major repair pathways, but still rely on a cell’s machinery (the mismatch repair pathway) to fix the damage and make the edit.

    4 Another benefit is that the DNA sequence Bridge recombinases insert is fully programmable, enabling nearly scarless genome edits. In contrast, prime editing paired with recombinases or CRISPR-associated transposases (CAST systems) relies on large recombinase recognition sequences—typically 30–50 nucleotides—that cannot be customized. These large, fixed recognition sequences make it difficult to do precise insertions, especially within things like exons.

  • C57Bl6/J

    The first mouse emulation appeared in 2032; a rodent’s entire anatomy, and all of its cells—including the brain—perfectly recapitulated using computer hardware. In those early days, only a few organizations had sufficient computing power (and the necessary data files) to run the emulations, which depended on custom-designed NVIDIA chips. The military had enough compute to run seven emulations and the National Institutes of Health, or NIH, enough to run five. A thirteenth emulator was thought to exist, but no one knew for sure.

    The military’s emulators were commandeered by high-ranking officers at the Pentagon, Office for Naval Research, and CIA. The Pentagon sent a handful of chips to leading materials science laboratories, who worked tirelessly to dissect their atomic properties. The remaining emulators were mainly used to screen drugs that could make mice do various things of military interest—stay awake longer, move faster, grow larger muscles, etc. In 2035, a ProPublica investigation revealed that many of the in silico results had secretly been tested on prisoners in Guantanamo. Once the military felt that it had exhausted the emulators’ potential, they stored the chips and files somewhere in Fort Detrick. Not even the President knew exactly where.

    The NIH gave grant-making committees authority to dole out access to the mouse emulators. The committees announced a series of grants, but ultimately awarded them to close friends at various academic institutions. One emulator went to a consortium at Harvard, a second to MIT, and the others to academics at Stanford, Johns Hopkins, and the University of Utah. In exchange, the academics agreed to list all the NIH committee members as authors on all future papers in perpetuity. As h-indexes swelled to the hundreds, then thousands, they soon ceased to be relevant at all.

    These academic emulators were used to churn out biomedical research papers; about 50 per day. Every experiment that could possibly be run on mice—every possible gene deletion, or even combinations of deletions, and every battery of physiological tests—was modeled and executed in silico. Soon, every problem under the sun had been solved in mice: aging, eyesight, diabetes, cancer, you name it. The researchers spun up companies and chaired important committees. They sat on the boards of pharmaceutical companies and began to apply their findings to people. The F.D.A. agreed to remove some pre-clinical testing requirements, such that 11 academics were soon involved in 7,100 clinical trials that had collectively enrolled 2.3 million people.

    Rumors of the thirteenth emulator percolated around the Internet, but nobody knew for sure whether it was real. People in the r/biotech subreddit speculated that a disgruntled NVIDIA employee had quietly slipped away with a few chips and the data files, and was planning to sell them to a wealthy individual—perhaps Musk or Altman. So everyone was surprised when, in late 2032, a Reddit user by the name of Hitchhiker42 (later revealed to be a student living in Berkeley, California) uploaded all of the files and chip designs, for free, onto a public server. Hitchhiker42’s post began: “I think I found a bug in this emulator…”

  • Central Dogma in 7 Experiments

    Introduction

    In the days before DNA sequencing, high-powered microscopes, and molecular biology textbooks, decoding the finer workings of a living cell often required arduous experiments and clever speculation.

    ‍The history of molecular biology is rife with eccentric scientists who drummed up creative experiments to study unseen molecules, and then used deductive reasoning to piece a larger puzzle together. Mapping the Central Dogma is their crowning achievement.

    The Central Dogma was first described by Francis Crick, the Cambridge scientist who solved DNA’s structure with James Watson, based on x-ray images obtained by Rosalind Franklin. In 1958, Crick wrote that once genetic information has passed into protein, “it cannot get out again.”

    ‍Although students typically learn the Central Dogma as something like DNA → RNA → protein, or “DNA is transcribed to RNA which is translated to protein,” this is not what Crick originally said. There are also exceptions to the oft-mentioned DNA→RNA→protein depiction; RNA is reverse transcribed into DNA, for example, and prions are protein aggregates that replicate themselves. Crick regretted naming his idea the ‘Central Dogma,’ he wrote in Nature, because the idea itself was speculative. Crick had misunderstood the definition of the word dogma.

    ‍Still, the way that cells read instructions encoded in DNA to create all the proteins necessary for life is the cornerstone of modern molecular biology. The scientists who cracked this code were often brilliant thinkers, and their experiments ought to be an inspiration for future genetic designers hoping to make discoveries in areas where we are currently most blind.

    ‍In this essay, we highlight 7 experiments that elucidated the Central Dogma and information processing in cells. These experiments include those that first isolated the intermediate molecule between DNA and proteins, called messenger RNA, cracked the genetic code, and solved the basic mechanism for DNA replication in living cells.

    ‍Experiments described in this essay are important, but not exhaustive. Biological knowledge is built up, slowly, by the collective efforts of hundreds of scientists. Only a book like The Eighth Day of Creation, by Horace Judson, could even begin to do justice to the rich and beautiful history of molecular biology. This essay focuses on a few important years, and is inspired by The Generalist’s article on the history of AI.

    THE STRUCTURE OF DNA (1953)

    Friedrich Miescher, a Swiss chemist, was the first person to isolate DNA. In 1869, he collected pus-covered bandages from patients at a university hospital and extracted a sticky substance from them. Miescher called this substance nuclein.

    For decades after, most biologists believed that Miescher’s discovery was little more than a quaint curiosity. Early molecular biologists (the term was coined in 1938) thought that proteins, rather than DNA, were the genetic material of living cells. Proteins are built from many different amino acids, and appear in all kinds of different shapes and sizes. This made them seem like the likelier option for genetic material.

    By 1944, though, this view began to crumble when three scientists at the Rockefeller Institute in New York City, named Oswald Avery, Colin MacLeod, and Maclyn McCarty, performed an experiment to identify the molecule responsible for carrying genetic information. Their results pointed to DNA.

    The trio prepared extracts from Streptococcus pneumoniae, a virulent bacterium, and then used enzymes to selectively destroy either the proteins or the DNA within those extracts. Each treated extract was added to a harmless strain of bacteria, and the scientists waited to see whether it would turn the harmless cells virulent.

    When the extract with intact DNA (but destroyed proteins) was added to the cells, the harmless bacteria adopted the traits of the virulent strain; when the DNA itself had been destroyed, nothing happened. These results suggested that DNA, and not protein, was the carrier of hereditary information.

    A few miles north, at Columbia University, a biochemist named Erwin Chargaff read the Avery-MacLeod-McCarty paper and was “deeply moved by the sudden appearance of a giant bridge between chemistry and genetics,” as he later wrote. Chargaff had an academic background in molecular chemistry. He realized that, if DNA was indeed the genetic material, then perhaps a chemist could dissect how it differs across organisms and thus explain the rich diversity of the natural world.

    ‍Chargaff’s team spent several years chewing up DNA sequences, separating out the individual nucleotides on pieces of paper, and exposing the nucleotides to a UV spectrophotometer. They repeated this for DNA molecules harvested from yeast, bacteria, beef spleens, and calf thymus. By 1949, Chargaff had cracked a basic principle of the DNA code:

    “The desoxypentose nucleic acids from animal and microbial cells contain varying proportions of the same four nitrogenous constituents, namely adenine, guanine, cytosine, thymine…Their composition appears to be characteristic of the species, but not of the tissue, from which they are derived.”

    ‍In other words, Chargaff correctly determined that every organism on Earth uses DNA molecules that are made from the same four letters. Genetic material only differs, from one species to the next, by the order in which the four nucleotides appear. Chargaff also noted that “the molar ratios of total purines to total pyrimidines, and also of adenine to thymine and of guanine to cytosine, were not far from 1.” Said another way, the amount of ‘A’ in DNA is always equal to the total amount of ‘T’. Ditto for ‘G’ and ‘C’ nucleotides.

    Chargaff shared his results in a lecture at Cambridge University in 1952. Watson and Crick were in attendance. The following year, using x-ray diffraction images first obtained by Rosalind Franklin at King’s College London, and perhaps also Chargaff’s observations, Watson and Crick assembled a biophysically accurate model of DNA. Their model was made from crude metal sheets, but clearly depicted a right-handed double-helix in which ‘A’ connects to ‘T’, and ‘G’ connects to ‘C’. The model was published in Nature on 25 April 1953.

    Recent revelations have revised our understanding of Rosalind Franklin’s role in solving DNA’s structure. In the classic telling of this tale, Franklin is “portrayed as a brilliant scientist, but one who was ultimately unable to decipher what her own data were telling her about DNA,” according to an article by Matthew Cobb & Nathaniel Comfort in Nature. “She supposedly sat on the image for months without realizing its significance, only for Watson to understand it at a glance.”

    ‍But this tale is not accurate. Newly unearthed documents, including a shelved article that Franklin wrote with Crick and Watson for Time magazine in 1953, now suggest that “Franklin did not fail to grasp the structure of DNA. She was an equal contributor to solving it.”

    DNA REPLICATION (1958)

    Watson and Crick’s 1953 Nature paper concludes with one of the most famous passages in biology’s history:

    “It has not escaped our notice that the specific pairing we have postulated immediately suggests a possible copying mechanism for the genetic material.”

    The Cambridge duo’s model correctly depicted DNA as a molecule composed of two interlocking strands, wherein ‘A’ always connects to ‘T’ and ‘G’ always connects to ‘C’. If the two strands were to unwind and detach from each other, Watson and Crick noted, it should be possible to recreate the original double helix merely by pairing up each base in a separated strand with its complementary nucleotide. This idea was called the semi-conservative model of replication.

    ‍Other eminent scientists attacked this idea. Max Delbrück was a renowned physicist at the California Institute of Technology who, together with Salvador Luria, had discovered that bacteria resist phage attacks via random mutations. He penned an article arguing that the semi-conservative model could not be correct because too much energy would be required to unwind the two DNA strands.

    ‍Delbrück favored a different model, called dispersive replication, in which small chunks of a DNA molecule are broken up, and then matching DNA sequences are synthesized directly in the broken regions to create an intact, double-stranded helix. A third group of scientists favored a conservative replication model, which theorized that the entire DNA molecule is somehow copied without unwinding whatsoever.

    ‍Thanks to a particularly innovative experiment devised by two young scientists at Caltech, named Matthew Meselson and Franklin Stahl, Watson and Crick’s semi-conservative model was ultimately vindicated.

    ‍It would be relatively simple to figure out how DNA replicates if one could directly observe these molecules. But that was not possible in 1958. Instead, Meselson and Stahl devised a clever experiment, based on spinning molecules quickly in a centrifuge, to test the three models.

    Meselson and Stahl’s key insight was to tag DNA strands undergoing replication with heavy atoms, such as nitrogen-15 (15N), an isotope that carries an extra neutron. The scientists grew bacterial cells in a growth medium containing this heavy nitrogen, waited for the 15N to incorporate into all of the cells’ molecules, and then quickly transferred the ‘heavy’ microbes into growth media with normal nitrogen.

    As the DNA molecules replicated, Meselson and Stahl killed the cells and used a centrifuge to spin down the molecules. As the tubes spin, heavier DNA moves toward the bottom and lighter DNA stays closer to the top. Before the cells replicated their DNA, all of the DNA molecules contained heavy nitrogen and formed a single, low-lying band. After one round of DNA replication, every DNA molecule was a ‘hybrid’ of one heavy strand and one light strand, and a single band appeared at an intermediate position. After two rounds of replication, half of the molecules were hybrids and half contained only light nitrogen, so two bands appeared. This pattern is exactly what the semi-conservative model predicts.

    ‍This experiment is renowned for its simplicity and clever approach – it is now called “the most beautiful experiment.” Delbrück was wrong; DNA replication occurs when the two interlocking strands unwind, and each strand is then used as a ‘template’ to remake a double helix.
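    If you want to convince yourself of those band patterns, here’s a small Python sketch of the semi-conservative model. Each molecule is tracked as a pair of strands, heavy or light, and every generation each strand templates a new light strand; everything else about the experiment is abstracted away.

    # Sketch: predicted density bands under semi-conservative replication.
    # Each DNA molecule is a pair of strands, each either "heavy" (15N) or "light" (14N).
    from collections import Counter

    def replicate(molecules):
        """Semi-conservative replication: every strand templates a brand-new light strand."""
        daughters = []
        for strand_a, strand_b in molecules:
            daughters.append((strand_a, "light"))
            daughters.append((strand_b, "light"))
        return daughters

    def band(molecule):
        heavy_strands = sum(strand == "heavy" for strand in molecule)
        return {2: "heavy band", 1: "intermediate band", 0: "light band"}[heavy_strands]

    molecules = [("heavy", "heavy")]  # generation 0: all DNA is fully heavy
    for generation in range(3):
        print(f"generation {generation}:", dict(Counter(band(m) for m in molecules)))
        molecules = replicate(molecules)

    # generation 0: {'heavy band': 1}
    # generation 1: {'intermediate band': 2}
    # generation 2: {'intermediate band': 2, 'light band': 2}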

    THE CENTRAL DOGMA (1958)

    After publishing his 1953 Nature paper about DNA’s structure, Francis Crick toured the world to lecture on an idea that “permanently altered the logic of biology,” according to Horace Judson, author of The Eighth Day of Creation.

    ‍During his lectures, Crick would often draw a diagram on the auditorium’s blackboard. His diagram depicted how information flows through living cells; DNA is somehow converted into an intermediate molecule, which Crick called ‘template RNA’, that somehow encoded the amino acids in a protein molecule. Crick correctly predicted the basic details of protein synthesis years before direct experimental evidence had confirmed the existence of mRNA or tRNA.

    ‍In 1958, Crick adapted his lecture into a published article, called On Protein Synthesis. His target audience was “a general reader rather than the specialist.” The article gave two hypotheses to explain the relationship between DNA and proteins, called the Sequence Hypothesis and the Central Dogma.

    ‍“The direct evidence for both of them is negligible,” Crick wrote, “but I have found them to be of great help in getting to grips with these very complex problems.”

    The sequence hypothesis, in its simplest form, “assumes that the specificity of a piece of nucleic acid is expressed solely by the sequence of its bases, and that this sequence is a (simple) code for the amino acid sequence of a particular protein.” In other words, the bases in a strand of DNA or RNA correspond to the amino acids in a protein.

    The Central Dogma, Crick wrote, “states that once ‘information’ has passed into protein it cannot get out again.” Stated another way, “the transfer of information from nucleic acid to nucleic acid, or from nucleic acid to protein may be possible, but transfer from protein to protein, or from protein to nucleic acid is impossible.”

    ‍This passage marked the first time that the Central Dogma, the defining idea of molecular biology, had been published. But this is not why Crick’s article was so prescient.

    ‍In the article, Crick used scattered experimental evidence and anecdotal observations, including the fact that “spermatozoa contain no RNA,” to correctly predict that there must be a messenger RNA molecule in the cytoplasm that is produced by “the DNA of the nucleus.”

    ‍Crick’s astounding ability to theorize was most prominently displayed, though, when he correctly inferred the existence of tRNAs, predicted what they were made of, and explained how they likely became ‘charged’ with amino acids for protein synthesis.

    ‍Molecular biologists knew that proteins were made from 20 amino acids, but most other details of protein synthesis were a mystery. Today, we know that tRNA molecules get ‘loaded’ with the correct amino acid via the action of specific enzymes, and that this is how a message encoded in a strand of RNA is used by the ribosome to build a protein. But Crick had little evidence for any of this. And yet, in his 1958 paper, he wrote:

    “Granted that…[mRNA]…is the template, how does it direct the amino acids into the correct order? One’s first naive idea is that the RNA will take up a configuration capable of forming twenty different ‘cavities’, one for the side-chain of each of the twenty amino acids. If this were so, one might expect to be able to play the problem backwards – that is, to find the configuration of RNA by trying to form such cavities. All attempts to do this have failed, and on physical-chemical grounds the idea does not seem in the least plausible…

    Apart from the phosphate-sugar backbone, which we have assumed to be regular and perhaps linked to the structural protein of the particles, RNA presents mainly a sequence of sites where hydrogen bonding could occur. One would expect, therefore, that whatever went on to the template in a specific way did so by forming hydrogen bonds. It is therefore a natural hypothesis that the amino acid is carried to the template by an ‘adaptor’ molecule, and that the adaptor is the part which actually fits on to the RNA. In its simplest form one would require twenty adaptors, one for each amino acid.

    What sort of molecules such adaptors might be is anybody’s guess. They might, for example, be proteins…though personally I think that proteins, being rather large molecules, would take up too much space. They might be quite unsuspected molecules, such as amino sugars. But there is one possibility which seems inherently more likely than any other-that they contain nucleotides. This would enable them to join on to the RNA template by the same ‘pairing’ of bases as is found in DNA, or in polynucleotides.

    If the adaptors were small molecules one would imagine that a separate enzyme would be required to join each adaptor to its own amino acid and that the specificity required to distinguish between, say, leucine, isoleucine and valine would be provided by these enzyme molecules instead of by cavities in the RNA. Enzymes, being made of protein, can probably make such distinctions more easily than can nucleic acid.”

    ‍This paper is a tour-de-force of logical reasoning. It became the focal point, a rallying cry, for molecular biologists seeking to crack the genetic code and resolve the cell’s mysteries. Crick, fortunately, would not have to wait long for his ideas to be vindicated. A ‘template RNA,’ or messenger RNA as it’s now called, was discovered just three years later.

    ISOLATION OF MESSENGER RNA (1961)

    Messenger RNA was first isolated by two separate research groups in 1961. Their results appeared back-to-back in the 13 May issue of Nature.

    ‍At the Institut Pasteur in Paris, the French scientists François Jacob and Jacques Monod had discovered that the enzymes required to break down a sugar in bacterial cells were only made after cells were exposed to that sugar. In other words, cells somehow “process” an external cue and make proteins in response. This marked the discovery of genetic regulation, but also raised a slew of questions.

    ‍Among them: How does a cell know which genes to turn on at any given time? Why doesn’t the whole genome “turn on” at the same time? Several answers were proposed. Maybe there is a custom ribosome corresponding to each gene, some said. Or maybe, as Crick had proposed in 1958, there is an intermediate molecule – a “template RNA” – that transmits messages between DNA and proteins.

    ‍In 1960, two groups set out to isolate this mystery molecule. The first group rallied around Matthew Meselson’s laboratory at the California Institute of Technology, and included Sydney Brenner and François Jacob. A second group rallied around Wally Gilbert’s group at Harvard, and included James Watson and François Gros, a French biologist who had worked with Jacob.

    Both groups turned to a compelling experimental model: bacteriophages. When E. coli bacteria are infected with a phage, many scientists had noted that the cells stop making their own proteins, and quickly switch over to making the phage proteins. “This system thus provides an ideal model for observing the synthesis of new proteins following the introduction of specific DNA,” Gilbert’s team noted in their 1961 paper.

    ‍To isolate messenger RNA, the Caltech scientists grew bacteria in a growth medium with heavy isotopes, much like Meselson had done with Stahl several years earlier to validate the semi-conservative model of replication. These ‘heavy’ bacteria were then infected with phage and immediately transferred into a growth medium with light isotopes. Infected cells were finally lysed open at regular time points and spun down in Meselson’s ultracentrifuges.

    The bands that emerged from Meselson’s centrifuges confirmed a few things. First, bacterial cells did not make new ribosomes after they were infected. This observation was evidence against the idea that there is a unique ribosome for each gene. Second, the results confirmed that a new type of RNA molecule was swiftly made after phage infection, and that this new RNA quickly attached to existing ribosomes in the cell. This suggested that the DNA in phages was being quickly transcribed into messenger RNA. And third, the bacterial cells began to make phage proteins using their existing ribosomes.

    ‍They had discovered messenger RNA. There is an excellent, and much richer, account of this history by the scientific historian, Matthew Cobb.

    MAPPING A CODON (1961)

    Crick’s 1958 paper made a series of predictions about messenger RNA, transfer RNAs, and how a code embedded in a DNA molecule could possibly encode a protein. But one longstanding question in molecular biology had to do with the nature of the genetic code itself. Namely, how do the nucleotides in a strand of RNA encode the amino acids in a protein? What does UAG mean, or GAA, or UUU, or any other codon, for that matter?

    Nirenberg and Matthaei in the laboratory. Credit: NIH/Marshall W. Nirenberg.

    ‍The first triplet codon to be mapped to an amino acid was ‘UUU’ to phenylalanine. This connection was made by two young researchers at the National Institutes of Health (NIH) in Bethesda, Maryland.

    ‍Heinrich Matthaei was a post-doctoral fellow working in the laboratory of Marshall Nirenberg, a new researcher at the Institutes. The two scientists were interested in the Central Dogma – they had read Crick’s paper – and aimed to understand the connection between RNA and proteins, often by running experiments on cell-free extracts, a liquid made by grinding up living cells in a mortar and pestle. This enabled the two scientists to study cell biochemistry without having to deal with living organisms.

    ‍At 3 o’clock in the morning of May 27th, the two scientists took some of these ‘cell guts’ and added a few drops of synthetic RNA with the sequence:

    UUUUUUUUUUUUUUUUU

    Their concoction was next added to 20 different tubes, each of which held a different amino acid: valine, alanine, glutamine, and so on. One of the tubes contained phenylalanine amino acids that had been labeled with a radioactive isotope.

    ‍“The results were spectacular and simple at the same time,” according to a brief history from the NIH. “After an hour, the control tubes showed a background level of 70 counts, whereas the hot tube” – with the radioactive phenylalanine – “showed 38,000 counts per milligram of protein.”

    ‍In other words, when the synthetic RNA molecule was added to a tube of phenylalanine amino acids, the cell-free extract began to churn out radioactive peptides. This singular experiment suggested that the nucleotides UUU somehow encode phenylalanine during protein synthesis.

    ‍Over the next several years, Nirenberg and other researchers would go on to map all 64 codons, including the codon that signals the start of translation, AUG. Nirenberg shared the 1968 Nobel Prize in Physiology or Medicine.
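    For a modern reader, the decoding step is easy to restate in code. Here’s a short Python sketch that translates the poly-U message with a deliberately tiny codon table; a real table has all 64 entries, and the handful shown here is just for illustration.

    # Sketch: translating a synthetic poly-U message with a tiny, partial codon table.
    # A complete genetic code has 64 entries; this subset is illustrative only.
    CODON_TABLE = {
        "UUU": "Phe", "UUC": "Phe",
        "AUG": "Met",                     # also the start codon
        "UAA": "STOP", "UAG": "STOP", "UGA": "STOP",
    }

    def translate(mrna: str) -> list:
        """Read the message in non-overlapping triplets, stopping at a stop codon."""
        protein = []
        for i in range(0, len(mrna) - 2, 3):
            amino_acid = CODON_TABLE.get(mrna[i:i + 3], "???")
            if amino_acid == "STOP":
                break
            protein.append(amino_acid)
        return protein

    print(translate("UUUUUUUUUUUU"))  # ['Phe', 'Phe', 'Phe', 'Phe'], a poly-phenylalanine chain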

    CRACKING THE GENETIC CODE (1961)

    The year 1961 was molecular biology’s annus mirabilis. Messenger RNA was isolated for the first time and Nirenberg and Matthaei decoded the ‘meaning’ of the first codon – UUU. Even after those papers were published, though, mysteries remained. Among them: Is the genetic code overlapping or non-overlapping? And is it actually made from doublet, triplet, or quadruplet codons?

    ‍A messenger RNA sequence that reads ‘AUGACC’ could be read by the ribosome as ‘AUG’ and then ‘ACC,’ or it could be read by the ribosome as ‘AUG’, ‘UGA,’ ‘GAC’, ‘ACC’. The former is a non-overlapping code, and the latter is an overlapping code. Similarly, the code could be read as ‘AU’ and then ‘GA’ and then ‘CC’ if codons were doublets, or ‘AUGA’ and then ‘UGAC’ if they were quadruplets, and so on. Nirenberg and Matthaei’s experiment did not help to answer either of these questions, because their synthetic RNA had a repetitive sequence: UUUUUUUU.
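    To see why a repetitive message couldn’t settle these questions, here’s a small sketch that parses the same example under a few hypothetical coding schemes; it’s just string slicing, nothing more.

    # Sketch: the same message parsed under different hypothetical codes.
    def parse(message: str, codon_size: int, overlapping: bool) -> list:
        step = 1 if overlapping else codon_size
        return [message[i:i + codon_size] for i in range(0, len(message) - codon_size + 1, step)]

    message = "AUGACC"
    print(parse(message, 3, overlapping=False))  # ['AUG', 'ACC']
    print(parse(message, 3, overlapping=True))   # ['AUG', 'UGA', 'GAC', 'ACC']
    print(parse(message, 2, overlapping=False))  # ['AU', 'GA', 'CC']

    # A repetitive message like poly-U looks identical under every scheme:
    print(set(parse("U" * 12, 3, overlapping=True)))   # {'UUU'}
    print(set(parse("U" * 12, 2, overlapping=False)))  # {'UU'}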

    In the waning weeks of 1961, Sydney Brenner, Leslie Barnett, Francis Crick, and R.J. Watts-Tobin used fragmentary experimental evidence and thought experiments to conclude that each amino acid in a protein is encoded by a triplet code, and that the letters in this code do not overlap. Their ideas were published in a paper entitled, “General nature of the genetic code for proteins.”

    ‍Their experiments hinged on two things: A bacteriophage, called T4, that infects bacteria, and a particular type of dye, an acridine called proflavin, that precisely mutates DNA by adding or removing a single nucleotide.

    ‍Crick, the ever-careful thinker, had a beautiful idea. He decided to take some T4 bacteriophage and then expose it to proflavin, such that the phage lost its ability to make a particular protein. If Crick added one base and then removed one base, using the acridine, he noted that the phage were able to make the protein. But if he used acridine to add two bases, the phage did not make the protein. When three bases were added, the phage made the protein again. From these observations, the scientists argued that the genetic code must use triplets to encode each amino acid. Their takeaway was based on this fragmented, experimental evidence.

    ‍Even though the “combination of mutations strongly suggested that the code was based on units of three bases, the experiments could not prove that to be the case – a code using groups of six bases was consistent with the results,” wrote Matthew Cobb in a 2021 history of this paper.

    ‍Today, we know that there are 64 codons in total, and that codons appear as ‘triplets’ to encode amino acids in a final protein chain. Codons made of six bases “would raise all sorts of problems,” as Cobb notes, “by massively increasing the number of either meaningless or degenerate sequences (there would be 4096 possible combinations of bases, rather than a mere 64).”

    ‍As Crick later said: This was “hardly likely to be taken seriously.”

    TRANSLATION VIA A SINGLE RIBOSOME (2008)

    By 1961, the basic contours of the Central Dogma had been resolved. But that doesn’t mean all work has since abated, nor that the years from 1953 to 1961 are all-encompassing. Linus Pauling at Caltech predicted the main structural motifs of proteins as early as 1951. A ‘stop’ codon that halts protein synthesis was identified in 1965. The ribosome’s structure was solved in 2000, after decades of work that culminated in the 2009 Nobel Prize in Chemistry.

    ‍Today, synthetic biologists continue to expand the Central Dogma using technologies that Francis Crick, in 1958, could only have dreamed of. And yet, the molecular choreography that underlies the Central Dogma continues to surprise. There are far more enzymes and components involved than early molecular biologists ever could have realized. Transfer RNAs carry amino acids to the ribosome, proteins interact with the ribosome to push it off the RNA strand, and dozens of proteins are involved in transcription initiation, elongation, and termination in human cells.

    Molecular biologists continue to resolve this complexity today. In a 2008 study, called “Following translation by single ribosomes one codon at a time,” chemists at the University of California, Berkeley, studied individual ribosomes as they moved along a single messenger RNA molecule. Their experiment revealed the stochastic starts and stops of a ribosome during translation.

    For this experiment, each end of an mRNA molecule was attached to a polystyrene bead. One of the beads was then placed in a laser trap, holding it in place. The middle of the mRNA molecule contained a long loop, which slowly unwound as the ribosome traversed its length. As the mRNA molecule stretched out, its elongation could be tracked by measuring the distance between the two beads.

    The chemists repeated this experiment several times, and measured the rate at which the mRNA molecule stretched out each time. Their key result was this: Ribosomes do not glide along the mRNA at a steady pace (which would stretch out the molecule in a linear fashion), but rather jump from one codon to the next in time steps of around 0.1 seconds. The ribosome occasionally pauses between jumps. Each ribosome, then, translates a strand of mRNA in a slightly different amount of time.
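
    As a toy illustration of that stepping behavior (not the Berkeley group’s actual analysis), the sketch below simulates a ribosome that advances one codon at a time with random dwell times and occasional long pauses; the step time of roughly 0.1 seconds comes from the paper, while the pause probability and pause duration are illustrative assumptions.

    ```python
    import random

    def simulate_translation(n_codons=100, step_time=0.1, pause_prob=0.05, pause_time=2.0):
        """Return the total time for one simulated ribosome to translate n_codons.

        The ribosome advances one codon at a time; each step takes an exponentially
        distributed time (mean ~0.1 s), and occasionally the ribosome pauses for a
        longer, also random, interval before continuing.
        """
        t = 0.0
        for _ in range(n_codons):
            t += random.expovariate(1.0 / step_time)       # ordinary codon-to-codon jump
            if random.random() < pause_prob:
                t += random.expovariate(1.0 / pause_time)  # occasional long pause
        return t

    random.seed(0)
    for run in range(3):
        print(f"Ribosome {run + 1} finished 100 codons in {simulate_translation():.1f} s")
    ```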

    This experiment is one of thousands that have probed the Central Dogma over the last two decades. Crick’s 1958 article continues to inspire generations of molecular biologists, who have found his ideas to be rich fodder for a lifetime of scientific work. We now know a shocking amount about transcription, translation, and the genetic code: bacteria add about eight amino acids to a protein each second, human cells add about five amino acids in the same length of time, and DNA is transcribed to RNA at a rate of about 40 nucleotides per second.

    CONCLUSION #

    The way in which cells process information is a biophysical marvel, one that scientists have slowly unraveled over the last 70 years. The Central Dogma, and the seven seminal experiments described in this essay, are the basis for most everything we do in genetic engineering. But there are still many instances in which we, as genetic designers, place a gene into a cell expecting one thing to happen, but observe something entirely unexpected instead. In other words, biology does not always behave as we expect.

    Though useful, the Central Dogma is an incomplete way to think of a living cell. DNA is not always transcribed to RNA, and RNA is not always translated into protein. Sometimes RNA goes back to DNA. The only rule in biology is that there are exceptions to every rule. In future posts, we’ll continue our exploration of the Central Dogma and explain many of these exceptions.

    ***

    Contributors: Ben Gordon and Alec Nielsen. Words by Niko McCarty.

  • Think of the Eggs

    When people think of “biotech” — myself included — they tend to picture GLP-1s and gene therapies. But biotech is much broader than just medicine; it’s also pushing forward a renaissance in the egg industry.

    Eggs aren’t usually top of mind for me. I toss a carton in my grocery cart now and then, but rarely think about how those eggs landed on the shelf in the first place. Perhaps I should. Every year, the global egg industry kills around six billion male chicks shortly after they hatch. Why? Because male birds, bred from “layer” lines, don’t make eggs and don’t pack on enough meat to be profitable. Hence, they’re thrown into a blender.

    Fortunately, scientists have figured out how to determine a chicken’s sex before it hatches. These technologies are called in ovo sexing; using hyperspectral cameras or PCR, they can figure out which eggs will hatch male vs. female. With widespread adoption, in ovo sexing could spare billions of chicks from the blender. Alas, these technologies weren’t available at all in the U.S. … until last month. Hardly anyone in the mainstream biotech community seems to know what’s going on in this sector but, in my view, it’s among the most underrated and important stories of today.

    In ovo sexing has been available in Europe for years. Germany banned chick culling in 2022. In response, hatcheries were initially forced to keep male chicks alive and raise them for meat — “a practice that was costly and unsustainable,” according to Innovate Animal Ag. (Again, so-called “layer” chickens just don’t produce much meat. Broiler chickens, on the other hand, are specially bred to grow quickly; they “can grow to be over four times the weight of a natural chicken in only 6-7 weeks,” according to an article in Asimov Press.)

    Sensing an opportunity, companies launched in ovo sexing technologies in Europe so hatcheries could screen out male eggs before they hatched. If an egg is destroyed by day 12 of development, the embryo feels no pain. Thanks to this shift, about 78.4 million of Europe’s 389 million hens — or about 20 percent — came from in ovo sexed eggs last year, according to data from Innovate Animal Ag.

    But only two in ovo sexing methods have reached commercial scale so far. As Robert Yaman, CEO of Innovate Animal Ag, previously wrote for Asimov Press:

    The first of these approaches utilizes imaging technologies like MRI or hyperspectral imaging to look “through” the shell of the egg to determine the sex of the embryo inside. The second approach involves taking a small fluid sample from inside the egg, and then running PCR to identify the sex chromosomes, or using mass spectrometry to locate a sex-specific hormone…

    …Other approaches are in development and have not yet been commercially deployed. Some technologies can “smell” a chick’s sex by analyzing volatile compounds excreted through the eggshell. Another approach uses gene editing so that male eggs have a genetic marker that allows their development to be halted by a simple trigger, such as a blue light. Unlike humans, the sex of a chicken is determined by the chromosomal contribution of its mother. By only modifying the sex chromosome of the female parent line that yields male chicks, the female chicks end up without the gene edit. This means that the eggs they lay do not need to be labeled as “gene-edited” for consumers.

    As Europe rolls out these technologies, most American consumers still have no idea that chick culling is even a thing. In one poll, only 11 percent of Americans knew about chick culling; once informed, a majority opposed it. Fortunately, in ovo sexing technologies have finally arrived in the U.S.

    Three U.S. egg companies — Egg Innovations, Kipster, and NestFresh — have announced plans to adopt in ovo sexing technology. In late 2024, Agri-Advanced Technologies also rolled out a machine called “Cheggy” to hatcheries in Iowa and Texas. Cheggy can scan 25,000 eggs per hour and figure out the sex of embryos inside using hyperspectral imaging. The machine is able to “see” the color of down feathers forming beneath the shell. (Brown-egg chicken breeds typically have differently colored feathers for males and females, but this doesn’t work on white eggs.) Hyperspectral imaging is great because it’s non-invasive; the eggs don’t need to be cracked or poked at all. If the machine detects a female embryo, it sends the egg back to the incubator. Male eggs are destroyed and turned into protein for pet food.

    Also, in December, Respeggt announced that by February 2025, it will roll out its own in ovo sexing tech at a massive Nebraska hatchery, with a capacity to serve 10 percent of the entire U.S. layer market. Respeggt’s technology relies on PCR, so it works for both white and brown eggs.

     *Respeggt’s technology uses a laser to puncture eggs and retrieve a small amount of liquid to run PCR.*

    In Europe, in-ovo-sexed eggs cost only about one to three euro cents more each. That’s a tiny bump, and I’d gladly pay extra just for the peace of mind that comes from knowing farmers didn’t have to kill any male chicks to produce them. But I am not most consumers; eggs are one of the most price-sensitive grocery items. When people talk about inflation, they usually talk about the price of bread, milk, and eggs!

    Fortunately, a Nielsen survey found that 71 percent of American egg buyers say they’d pay more for in-ovo-sexed eggs. We’ll see what happens, though, as these eggs get rolled out to grocery stores (likely by mid-2025). Consumer reactions will be super important here because the U.S. government doesn’t mandate whether or not hatcheries kill baby chicks. The survival of these technologies will literally be determined by whether or not people buy the eggs.

    Finally, I just want to say that few (if any) people have been pushing for this harder than Innovate Animal Ag. They didn’t pay me to say that, either; they don’t even know I’m writing this article! But they’re the ones dropping all these reports and data about chick culling, commissioning surveys to figure out price points, and pushing for new certifications to coax consumer buy-in.

    So yeah, we often celebrate biotech’s potential — gene editing, advanced vaccines, cultivated meat — but in ovo sexing is already improving the egg industry at scale. It flies under the radar, but at least now you know the story.

  • Estimating the Size of a Single Molecule

    Many decades before the discovery of x-rays and the invention of powerful microscopes, Lord Rayleigh calculated the size of a single molecule. And he did it, remarkably, using little more than oil, water, and a pen. His inspiration was none other than Benjamin Franklin.

    Sometime around 1770, while visiting London, Franklin became intrigued by a phenomenon he had observed during his transatlantic voyage. Specifically, he noticed that when ships discarded greasy slops into the ocean, the surrounding waves would calm. This ancient practice of oiling the seas to pacify turbulent waters was known to the Babylonians and Romans, but Franklin decided to investigate further.

    On a windy day in London, he walked to a pond on Clapham Common. Carrying a small quantity of oil — “not more than a Tea Spoonful,” according to his diary — Franklin poured it onto the agitated water. The oil spread rapidly across the surface, covering “perhaps half an Acre” of the pond and rendering its waters “as smooth as a Looking Glass.” Franklin documented his observations in detail; they can be read today on the Clapham Society’s website.

    Franklin’s oil drop experiment, of course, was just one in a long line of his “amateur” science experiments. He was also the first to demonstrate that lightning is electrical in nature (via his famous kite experiments), and he charted the Gulf Stream’s course across the Atlantic ocean, noting that ships traveling from America to England sailed quicker than those going the opposite direction. His experiments at Clapham Common are not nearly as well-known.

    But Franklin was a careful experimenter, repeating his oil-drop experiment multiple times and taking notes each time. In his journal, he opined on how much oil might be needed to calm various areas of ocean (he was thinking specifically about applications for the Royal Navy) but never grasped the molecular implications of his experiments. It wasn’t until more than a century later that Lord Rayleigh, whose real name was John William Strutt, revisited Franklin’s experiment with a brilliant new perspective.

    An academic at the University of Cambridge and a baron by title, Rayleigh was renowned for his work in physics. The Rayleigh number, a parameter that describes buoyancy-driven convection in fluids, is named for him, as is Rayleigh scattering, which explains how photons scatter in the atmosphere and color the sky blue. Rayleigh also discovered the noble gas argon, earning a Nobel Prize for it in 1904.

    But a little experiment that Rayleigh performed in 1890, inspired directly by Franklin’s observations, is not nearly as well-known.

    Rayleigh carefully measured a tiny amount of olive oil — 0.81 milligrams, to be exact — and placed it onto the surface of water. The oil quickly spread out and covered an area, which Rayleigh precisely measured. And then he did something that Franklin never thought of: Rayleigh divided the volume of the oil by the area it covered, thus estimating the thickness of the oil film. Assuming that the oil formed a single layer of molecules — a monolayer — the thickness of the oil film is the same thing as the length of one oil molecule.

    This is how Lord Rayleigh became the first person to figure out a single molecule’s dimensions, many years before anyone could see such molecules.

    Rayleigh’s final result was 1.63 nanometers. Olive oil is mainly composed of fat molecules called triacylglycerols, and modern measurements put their length at about 1.67 nanometers, implying that Rayleigh’s “primitive” estimate was off by just 2 percent. His original paper detailing the experiment can be found here.
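
    The arithmetic is simple enough to replicate. Here is a minimal sketch that assumes a typical olive-oil density of about 0.9 g/cm³ and a spread area of roughly 0.55 square meters; both values are stand-ins for illustration, not figures taken from Rayleigh’s paper.

    ```python
    # Estimate the thickness of an oil monolayer, Rayleigh-style.
    mass_g = 0.81e-3          # 0.81 milligrams of olive oil
    density_g_per_cm3 = 0.9   # assumed density of olive oil
    area_m2 = 0.55            # assumed area the oil spread over (illustrative)

    volume_m3 = (mass_g / density_g_per_cm3) * 1e-6   # convert cm^3 to m^3
    thickness_m = volume_m3 / area_m2                 # volume / area = film thickness

    print(f"Estimated molecular length: {thickness_m * 1e9:.2f} nanometers")
    # Prints roughly 1.6 nm, close to the ~1.67 nm measured for triacylglycerols today.
    ```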

    I love this story because it shows, at least anecdotally, how deep scientific insights can emerge from the simplest of experiments. It’s a testament to the idea that you don’t always need sophisticated equipment to unlock the secrets of nature — sometimes, all it takes is a drop of oil and a bit of ingenuity.

    For those interested in delving deeper into the history of these oil drop experiments, Charles Tanford’s book, Ben Franklin Stilled the Waves, offers a much deeper exploration.

  • Microbial Lenses

    There’s a new paper out in PNAS that hints at some intriguing synthetic biology applications. Researchers at the University of Rochester introduced a sea sponge gene into Escherichia coli, giving the bacteria a translucent, silica-based coating. This biosilica shell transforms the cells into tiny microlenses that focus beams of light.

    Here’s an excerpt from the paper (paywalled):

    Remarkably, the polysilicate-encapsulated bacteria focus light into intense nanojets that shine nearly an order of magnitude brighter than unmodified bacteria. Polysilicate-encapsulated bacteria remain metabolically active for up to four months, potentially enabling them to sense and respond to stimuli over time. Our data show that synthetic biology can produce inexpensive and durable photonic components with unique optical properties.

    Typically, microlenses are just tiny spheres, a few micrometers across, fabricated in cleanrooms with harsh chemicals. They appear in photodetectors and camera sensor arrays. Engineered microbes can’t match the precision of these fabricated microlenses, but they offer a major advantage: you can make them at room temperature and neutral pH in a flask of liquid. (And the cells reproduce themselves “for free”!)

    Notably, lifeforms evolved primitive microlenses long before this paper. Cyanobacteria focus incoming light on their cell membranes to locate the sun’s position; they’re probably the world’s smallest and oldest camera eyes. Other cells, like yeast and red blood cells, also naturally behave as microlenses.

    What’s new about this paper is that the silica coating majorly improves the cells’ ability to focus light. More importantly, the work shows that we can tune a living organism’s optical properties through genetic engineering.

    The researchers took silicatein, an enzyme from sea sponges, and fused it to OmpA, an outer-membrane protein that allows molecules to flow in and out of the cell. Silicatein grabs silicon-containing molecules from the environment and stitches them into silica polymers; sea sponges use it to build “bioglass” structures. When fused, OmpA embeds into the cell membrane and holds silicatein outward, like a fishing hook.

    When the engineered cells are flooded with orthosilicate (a silicon-containing molecule), the silicatein “hooks” grab it and stitch together a silica shell around the entire cell. The researchers confirmed this with confocal imaging and a dye that binds specifically to silica. The engineered cells ended up surrounded by dye, while normal cells remained unstained.

     *Rho123, a dye, stains silica. Cells were engineered to express silicatein enzyme from two different microbes (hence column A and B), and were compared to wildtype. From Sidor et al.*

    This silica shell significantly changes the cells’ optical properties. To visualize this, the researchers built a custom microscope that can shine light on cells from any imaginable angle relative to the vertical axis. Uncoated cells scattered some light but didn’t create a distinct focal spot beyond their surface. In contrast, silica-encapsulated microbes produced light beams that stretched for several microns, with peak intensities nearly an order of magnitude higher than wildtype cells.

    I would have guessed this treatment might kill the cells — either because the silica shell blocks nutrients or because photons would roast them — but it doesn’t. Engineered cells continued scattering and focusing light even months after switching on the fusion protein. The only downside is that the cells grow more slowly, if at all.

    What could we do with these living lenses?

    My first step would be to engineer cells of different shapes and dimensions. A typical E. coli measures about two microns long and one micron wide. What if we engineered more spherical cells? Or longer cells? We could create a series of living microlenses, each with unique optical properties, by tuning the silicatein protein and adjusting the cells’ physical dimensions.

    (In the video below, researchers are blasting a stationary cell with light at angles ranging from -90° to 90°. There are some orientations where a nanojet appears, but it happens quickly.)

    From there, the applications depend on our imaginations. We might wire living bacteria into optical devices that don’t need batteries and last for months without a power supply. Or we could build medical devices. Instead of swallowing a pill camera powered by toxic batteries, perhaps we could engineer E. coli into a camera. I’m not sure. At this stage, it’s speculation.

    Practical limitations exist with current microlenses. As pixel sizes in camera sensor arrays shrink below two micrometers, placing microlenses becomes difficult. However, cells can “swim” to a specific destination and arrange themselves autonomously. In other words, arrays of bacteria could line up over a sensor — maybe using microfluidic channels — to focus and direct light into tiny pixels.

    Will any of these ideas actually happen? Probably not soon. Still, when a paper broadens our “design space” in biological engineering, it’s worth paying attention. One of my first questions, upon reading something like this, is usually: “Where else could this be applied, especially in unexpected ways?”

    Consider optogenetics: Ed Boyden and Karl Deisseroth took channelrhodopsins—light-responsive proteins first characterized in algae—and imagined splicing them into neurons to control action potentials. That mental leap doesn’t seem so large in hindsight.

    Engineered gas vesicles, similarly, are being used to improve ultrasound resolution within the body, enabling scientists to image individual cells moving through the bloodstream. I’ve written about these structures before for Asimov Press. Mikhail Shapiro got the idea for engineering gas vesicles after reading “two short paragraphs” about photosynthetic algae!

    In other words, pay attention when a paper like this appears. It might plant the seeds for something exciting, even if we don’t recognize it immediately.

  • How to Minimize Cell Burden

    I. Molecular Burden

    Biochemistry textbooks often depict cells as spacious places, where molecules float in secluded harmony. But cells are dense and crowded; a bit like molecular burritos, according to Michael Elowitz, a biologist at Caltech.

    Roughly three to four million proteins jostle around inside a single E. coli bacterium, which has an internal volume 50 billion times smaller than a drop of water. A typical enzyme within this crowded cell collides with its substrate 500,000 times each second. When bioengineers manipulate life, they must also consider how their modifications will impact everything else within the cell—for everything in the cell is connected to everything else.

    In 2000, Elowitz published one of the first synthetic gene circuits—called the “repressilator”—with his mentor, Stanislas Leibler. A gene circuit is made from RNA or proteins that interact with one another, enabling cells to perform logical functions. The repressilator was crafted from just three genes, each of which encoded a protein that repressed the next, forming an inhibitory loop. One of these repressors also controlled a green fluorescent protein so that, as protein levels rose and fell, the cells flashed green—on and off—in 150-minute intervals.
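
    For readers who like to see the dynamics, here is a minimal numerical sketch of a repressilator-style circuit: three repressors, each repressing the next in a loop, integrated with a simple Euler scheme. This is a standard simplification of such circuits, not the exact equations or parameters from the 2000 paper; all numbers below are illustrative.

    ```python
    # Toy repressilator: protein i is produced at a rate repressed by protein i-1
    # (cyclically) and degrades at a constant rate. With strong, cooperative
    # repression, the three protein levels oscillate out of phase.
    alpha, n, gamma = 50.0, 3.0, 1.0   # max production, Hill coefficient, decay rate
    p = [1.0, 2.0, 3.0]                # initial protein levels (arbitrary units)
    dt, steps = 0.01, 5000

    for step in range(steps):
        new_p = []
        for i in range(3):
            repressor = p[(i - 1) % 3]
            production = alpha / (1.0 + repressor ** n)
            new_p.append(p[i] + dt * (production - gamma * p[i]))
        p = new_p
        if step % 500 == 0:
            print(f"t = {step * dt:5.1f}   " + "  ".join(f"{x:6.2f}" for x in p))
    ```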

    As synthetic biology advanced, and its tools grew sharper, synthetic gene circuits swelled in size. In 2016, a paper in Science reported an engineered circuit made from 55 different sequences assembled into 11 genes, among the largest gene circuits yet built in a single cell. Building significantly larger synthetic gene circuits will require careful consideration of the finite resources available to cells.

    After all, cells are not empty vessels that have evolved to do our bidding. When we engineer an organism, coaxing it to make new proteins or molecules, we are imposing a molecular burden upon it. Typically, burdensome genes are defined as those that “impose a high enough energetic burden to be opposed by selection if they do not confer sufficient added benefits.” Any genes added to a cell must compete for cellular resources—energy, ribosomes, and RNA polymerases—and this competition can diminish the cell’s ability to carry out other functions: to grow, metabolize, and divide.

    A recent study in Nature Communications measured the molecular burden imposed by 301 different plasmids; fewer than 20 percent of them caused E. coli cells to grow more slowly. But surprisingly, some of the most burdensome plasmids were also the simplest. A plasmid encoding red fluorescent protein—and nothing more—caused a 44% reduction in growth rate.

    The study is intriguing, in part, because its dataset could provide insights into why some genes, once expressed, cause cells to grow more slowly. More importantly, though, this study reveals that there is still so much we don’t understand about biology, or toxicity, or how to ease molecular loads as we strive to engineer life in increasingly sophisticated ways.

    II. Competition

    Cells have finite resources. Insert a synthetic gene into a cell, and several things quickly happen.

    First, the gene is transcribed into RNA by an enzyme called RNA polymerase. Then, the RNA molecules are translated into protein via ribosomes, large protein-RNA complexes made from dozens of interlocking components. A typical E. coli cell contains about 3,000 RNA polymerase molecules and 30,000 ribosomes. Exogenous genes pull some of these enzymes away from other parts of the cell. And, for reasons that are not fully understood, cells burdened with recombinant DNA do not upregulate their production of RNA polymerase or ribosomes to compensate for the increased load, according to a 2020 study.

    Although the term “burden” typically refers to resource limitations—be they metabolic, transcriptional, or translational—it is often experimentally difficult to untangle from toxicity. A thorough investigation is often needed to tell whether a cell is growing slowly due to burden or toxicity, because the outcome—slow growth—is the same.

    Some proteins that are normally non-toxic also become toxic when expressed above a certain threshold. For a 2018 study, researchers expressed 29 different enzymes in yeast. All of the enzymes have well-known mechanisms and are non-toxic at normal levels. Some of the enzymes became toxic in the yeast, however, because they “aggregated together, they overloaded a transport system that [took] them to a specific cell compartment, or [they] produced too much catalytic activity.”

    A cell faced with excess burden or toxicity really only has one way out: To mutate and break the troublesome genes. A single milliliter of liquid culture holds as many as one billion E. coli cells. If just one of those cells mutates the burdensome genes and breaks its function, then that cell will grow more quickly than its neighbors. The mutated cell’s progeny will eventually take over the entire population. The more burdensome a genetic sequence, the more likely a mutant will appear and take over.

    Remember that Nature Communications study that I mentioned earlier? Well, the authors built a simple mathematical model to predict the correlation between different levels of burden and “population takeovers” when cells are grown in different sized containers. A plasmid causing more than a 30% reduction in growth rate, for example, is likely to result in a “mutant takeover” when the cells are grown in even a small container, such as a flask.
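
    To see why, here is a back-of-the-envelope sketch of the takeover logic (a toy version, not the authors’ actual model). It treats the culture as a race between two exponentially growing populations and asks how long a lone escape mutant needs to reach half of the population; the doubling time, starting mutant fraction, and burden levels are all illustrative assumptions.

    ```python
    import math

    def hours_to_takeover(burden, mutant_fraction=1e-9, doubling_time_min=30.0):
        """Hours until mutants (growing at full speed) make up 50% of the culture.

        'burden' is the fractional reduction in growth rate of the engineered cells
        (0.3 means 30% slower). Mutants are assumed to have broken the construct
        entirely and grow at the unburdened rate.
        """
        r = math.log(2) / doubling_time_min   # unburdened growth rate, per minute
        # The mutant fraction reaches 50% once the growth-rate gap (r * burden)
        # has erased the mutants' tiny head start.
        minutes = math.log((1 - mutant_fraction) / mutant_fraction) / (r * burden)
        return minutes / 60.0

    for burden in (0.1, 0.3, 0.5):
        print(f"{burden:.0%} burden -> mutant takeover in ~{hours_to_takeover(burden):.0f} hours of growth")
    ```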

    Collecting the data to build this model was straightforward. The authors placed each of the 301 different plasmids into E. coli cells, and then measured how much each plasmid slowed down their growth rates. A plate reader machine measured the cloudiness of each population over time, a proxy for cell growth. The authors also measured growth rates for E. coli carrying one of five different plasmids that imposed known levels of burden. These controls were used to normalize growth rates between experiments.

    Of the 301 plasmids tested, just six caused cells to grow more than 30% slower than unaltered cells. A further 19 plasmids caused cells to grow more than 20% slower. In total, the authors found 59 plasmids that caused measurable changes to bacterial growth rates.

    Genes expressed from constitutive promoters (meaning they are always “on”) were 2.9 times more likely to be in the burdensome set of 59 plasmids. And plasmids containing a strong ribosome binding site (the part of an mRNA strand where ribosomes bind and kickstart translation) were 2.1 times as likely to slow E. coli growth, compared to plasmids that include weaker RBS variants.

    III. Build Bigger

    If this study’s results were distilled in a single sentence, I think it would be this:

    Genetic sequences inserted into a cell do not usually cause excess burden; but when they do, it is often for reasons we don’t fully understand.

    Why, for example, is a plasmid encoding red fluorescent protein so burdensome? Plasmids encoding YFP and GFP also caused 29.5% and 27.1% reductions in growth rate, respectively. A plasmid encoding a chloramphenicol antibiotic resistance gene—and nothing more—caused cells to grow 33.4% slower. Molecular mechanisms explaining these growth defects are often unclear, or completely absent.

    At Asimov, one of our primary applications involves engineering Chinese Hamster Ovary (CHO) cells to make therapeutic proteins, like monoclonal antibodies. This particular type of cell, originally derived from animals smuggled out of China in 1948, is used to make nearly 90% of all therapeutic proteins.

    In our hands, most therapeutic antibodies can be expressed well by optimizing the genetic design or bioreactor process. In many cases, we’ve engineered CHO cells to make more than 10 grams per liter of antibodies without causing any noticeable growth defects in the cells. But other times—and for reasons we don’t fully understand—engineering CHO cells to make certain therapeutic antibodies imposes huge burdens or toxicity. Debugging these cases is an interesting exercise on its own. Sometimes the root cause remains mysterious; in other cases we can detect hallmarks of endoplasmic reticulum (ER) stress, which suggests protein misfolding or aggregation in the cell.

    Fortunately, there are steps we can take to reduce molecular burden or toxicity.

    Codon optimization is one option. This is when scientists convert the DNA sequence from one organism into codons “preferred” by another organism, without altering the order of amino acids in the final protein. In the lab, we have tested various codon configurations to find those that slow down the ribosome’s movement, thus giving proteins more time to fold and reducing toxicity.
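
    Here is a minimal sketch of the basic idea: back-translating a protein using one “preferred” codon per amino acid. The tiny codon table below covers only a handful of amino acids and its choices are illustrative; real tools weigh full codon-usage tables, GC content, and secondary structure, and (as described above) sometimes deliberately pick slower codons.

    ```python
    # Toy codon "optimization": pick one preferred codon per amino acid.
    PREFERRED_CODONS = {   # illustrative, partial table
        "M": "ATG", "K": "AAA", "L": "CTG",
        "S": "AGC", "T": "ACC", "*": "TAA",
    }

    def back_translate(protein):
        """Return a DNA sequence encoding 'protein' using the preferred codons."""
        return "".join(PREFERRED_CODONS[aa] for aa in protein)

    print(back_translate("MKLST*"))   # ATGAAACTGAGCACCTAA
    ```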

    Another way we solve this problem is by balancing the expression of genes. Antibodies are made from two kinds of protein chains—called heavy chains and light chains—that come together to make a Y-shaped molecule. If one of these chains is expressed at a much lower level than the other, it can become rate-limiting in the formation of the antibodies. At the same time, if the “excess” chain is the antibody heavy chain, it can float around the cell and cause toxicity. Yet another way to reduce burden is to integrate genes directly into the host genome, rather than using multi-copy plasmids, such that only a single copy of each gene exists and doesn’t consume too many cellular resources.

    A more complicated approach is to engineer cells with incoherent feedforward loops, or IFFLs, to mitigate burden caused by gene expression. Such gene circuits are designed to dampen mRNA levels when a gene’s expression diminishes the cell’s ability to carry out other functions.

    A balance must be struck, however. It is good to reduce burden, but not at the cost of antibody production. Molecular burden, toxicity, and economics are all valid things to consider.

    Most of these strategies are also akin to using Tylenol to treat a cold—we may get the outcome we’re after (less burden), but only because we don’t understand how to solve the problem at its core. It is only by peering deeper into living cells, and untangling their intricate complexities, that we begin to understand what goes wrong when we manipulate them.

    In this case, as in many others, greater basic science research will enable more sophisticated engineering.


    By Niko McCarty

    Thanks to Rachel Kelemen, Alec Nielsen, Ben Gordon, Kate Dray, Kevin Smith, Chris Voigt, and Arturo Casini for help with this essay.

  • Writing Moats

    Good writers should not fear AI. Many types of writing that people enjoy reading cannot easily be replicated by machines. Also, writing is a good way to think, and you shouldn’t let machines think for you. Instead of trying to compete directly with AI models, then, the best writers will adapt and double down on the parts of writing that celebrate their humanness.

    Unfortunately, I know many good writers who fear AI models. These writers think that a data center in the desert will soon make their career plans infeasible. Many of these writers are already quitting their blogs to focus on building companies in the physical world. Thousands of journalists have lost their jobs in recent years, in part because publishers think AIs can replace humans. This is true for some types of writing, but almost certainly false for other types of writing.

    If you are a writer who fears AI models, you should keep writing anyway. After all, writing is the best way to think, and the world will soon be divided into the “Writes and Write-Nots,” as Paul Graham has written. The “writes” are those people who can write, and therefore think. The “write-nots” are people who cannot write, and therefore cannot think as clearly. If you’d like to be a deep thinker in science, at work, or anywhere else, then you should keep writing, even if you never publish that writing online.

    Another reason to write is to influence the models, as Gwern has suggested. Most humans will soon use AIs to complete a majority of their cognitive tasks, because outsourcing thoughts to a data center is easier than actually thinking. But if you write a lot on a subject, then your views will be incorporated into training data, fed into AI models, and regurgitated to billions of people. Your views can therefore shape the future in strong ways, a bit like how Julius Caesar wrote his own memoirs to mask the fact that he was an egotistical psychopath. Modern writers with strong opinions will be immortalized in the models, even if the things they write don’t reflect their real beliefs or behaviors!

    There are other reasons to write, too. But lots of people, including Tyler Cowen, have already described them. I’ve seen far less discussion about how to write to stand out from AIs, though. And so I emailed three bloggers—Ruxandra Tesloianu, Abhishaike Mahajan, and Eryney Marrogi—with the same question: “What kinds of writing do you think are defensible in the age of AI?” All three responded (thank you). All three had good ideas (nice). With their permission, I’ve sifted through those ideas to arrive at an answer. (And yes, before you say it, this does mean that I outsourced part of my thinking to others.)

    One of the best ways to stand out, we all agreed, is to make things that only human hands, or the human mind, can make.

    When the camera was invented, artists feared that it would commoditize their work. The artists feared cameras would make it possible for even amateurs to create “art.” That was true, to an extent (look at Instagram), but what actually happened is that the arts were revitalized. A large swath of people began placing a premium on handmade paintings. And instead of merely painting what they saw, painters began to question realism and turn to the abstract instead. Art became an expression of individuality and taste, rather than a one-to-one mapping of reality. The same will come to pass in writing, says Adam Mastroianni:

    “It used to be that our only competitors were made of carbon. Now some of our competitors are made out of silicon. New competition should make us better at competing—this is our chance to be more thoughtful about writing than we’ve ever been before. No system can optimize for everything, so what are our minds optimized for, and how can I double down on that? How can I go even deeper into the territory where the machines fear to tread, territories that I only notice because they’re treacherous for machines?”

    Another way to stand out is to publish things that nobody else has. Maybe this seems obvious. Whereas many AI agents can search the Internet, they don’t yet have corporeal bodies to meet people face-to-face, in the same sensory environment. But there is a lot of alpha in having real conversations with real people in real places! On-the-ground reporting will retain its value for a long time for this reason. ProPublica’s investigative reporters should not fear for their jobs.

    Finding “new things” to write about doesn’t require traveling, either. A lot of information is never captured and published, even if that information seems obvious. Many powerful ideas exist only in the minds of a few people, or are only raised in a single conversation in one bar at a particular moment. Most researchers never publish failed experiments. Most people never think to write about what they did on a particular day, or how normal people reacted when the Internet was first introduced, or what people wore to Woodstock in the 1960s. But even a seemingly simple observation can become an important part of the historical record.

    Other writers will stand out because they are experts on a particular issue. Readers crave authority, and this will remain true for some time. Many people read the Wall Street Journal to get an economist’s opinion, in part so they can recite that opinion to people at a party later that evening. Many readers “hang their hats,” so to speak, on the opinions of experts.

    Brian Potter, the writer behind Construction Physics, is one example. His writing is often raised in online discussion boards because it includes original context that is otherwise missing from the public record. People see him as an expert, and rightly so. Potter reads many books (some of them obscure and out-of-print) while writing his essays, but also speaks with people in the field to gather context that nobody else has. He is uniquely equipped to say, “Y’know, this story in The New York Times says such-and-such, but I met a CEO last week who said that it’s not true for these reasons.” A large language model can’t do that.

    These “writing moats” may make the creative process feel like a painful ordeal. Perhaps it seems like the only people who will make it as writers are those who travel to war zones or go to lots of parties or spend years of their life studying a single field. But that’s not true! Most of my favorite essays have the same format: A person describes their experience with something, and then reflects on that something to arrive at a beautiful lesson. A machine can write prose that appears to reflect on an experience, but the lived nature of that experience belongs solely to the human author.

    “Looking for Alice” is, ostensibly, an essay about dating. But its actual power stems from the personal stories and anecdotes scattered throughout—all of which are based on experiences common to all people. “Always Bet on Text” is evocative because the writer takes a strong stance for a thing—text—that they think other people don’t value enough. This essay works because the writer is clearly passionate about the subject, and because they express strong opinions with examples. “I Should Have Loved Biology” does both of these things well. It takes a strong stance, but also incorporates anecdotes and personal stories to drive the argument home; namely, that biology is beautiful, but textbooks teach it in all the wrong ways.

    There is absolutely nothing in these essays that is unique to any one individual, or that only experts could understand. None of these essays required on-the-ground reporting. All of these writers simply took personal observations, reflected on them, and distilled the lessons into a singular and poetic message. I love these pieces and yearn, every day, to read more of them.

    The ultimate moat, then, is individuality. “In many ways, this is the last moat of everything,” Abhi told me. It’s “people consuming something made by a human purely because they like the vibes of that human.” This is similar to the idea of taste; people consume Scott Alexander’s monthly roundups because they feature esoteric but interesting articles that are rarely mentioned anywhere else.

    As I was writing this essay, I began to reflect on my own writing career. I thought about my first staff job at a neuroscience magazine in New York, and how my editor told me which articles to write and whom I ought to interview to write them. I didn’t have much independence at that job, and I was never allowed to express a personal opinion in my articles. So after a year, I moved to work at a small nonprofit in Boston.

    My job at that nonprofit was to write blogs about science. I could write about anything, and my boss encouraged me to express strong opinions. But when I filed my first story, he merely skimmed the text, turned his head to look at me, and said, “This is so boring. Why do you write like this?”

    The truth is that my past slew of academic and corporate jobs had neutered my ability to write evocatively and creatively. Up until that point, I had never really stood up for anything in public. Perhaps I was afraid that people would attack me, or that my former mentors would be disappointed in my decision to publish argumentative or opinionated pieces. But that single sentence, uttered by my boss, shook me up. I started writing with fewer self-imposed restrictions. I stopped fearing the reactions of others. I decided to just be myself—to be uniquely human, and not give a damn.


    Thanks to Eryney Marrogi, Xander Balwit, and Alec Nielsen for feedback.

    1 Like Tyler Cowen, I don’t use AI to generate text for my essays because I don’t want to write in the style of the AI. But I do often use AIs as a reading companion, to ask questions, to do research, to find ideas more quickly than Google search, and so on. If your brain struggles against the yearning, aching feeling to take an easy way out—then fight that feeling!

    2 And it will become easier to get to the frontier of an issue, and therefore become an expert in a particular domain, because of AI.

    3 If original information becomes more valuable to the writing process, then I’d also assume that a writer’s physical place in the world will matter more, too. Deep conversations rarely happen over the Internet or on the phone (in my experience). In-person interactions have a lot of value. If you want to write deeply about biology, for example, then it’s probably best to live in San Francisco or Boston.

  • The Most Abundant Protein

    One reason David Goodsell’s paintings attract biologists, I think, is that they are unapologetically realistic. His paintings depict seas of macromolecules splayed out in pastel shades. A Goodsell painting looks nothing like the spacious diagrams one finds in high school biology textbooks, and that’s exactly why they linger in the mind: they show, visually, how crowded cells really are.

    But crowded with what, exactly?

    Well, an E. coli cell has an internal volume of just one femtoliter (or one cubic micron) and a total mass of 1 picogram. These are handy numbers to remember. About 70 percent of that mass is water, and the other 30 percent is mostly proteins, RNA, DNA, lipids, and smaller molecules like metabolites. Proteins alone make up 55 percent of the cell’s dry mass, which made me wonder: Which protein is the most abundant?
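
    Those numbers are enough for a back-of-the-envelope check on the protein count mentioned earlier in this essay. A minimal sketch, assuming an “average” protein of roughly 300 amino acids at about 110 daltons apiece (both assumptions for illustration):

    ```python
    # Rough count of proteins in one E. coli cell, from mass fractions alone.
    cell_mass_g = 1e-12             # ~1 picogram total mass
    dry_fraction = 0.30             # ~30 percent of the cell is not water
    protein_fraction_of_dry = 0.55  # proteins are ~55 percent of dry mass

    avg_protein_da = 300 * 110      # assumed: ~300 residues at ~110 daltons each
    dalton_in_grams = 1.66e-24

    protein_mass_g = cell_mass_g * dry_fraction * protein_fraction_of_dry
    proteins_per_cell = protein_mass_g / (avg_protein_da * dalton_in_grams)

    print(f"~{proteins_per_cell:.1e} proteins per cell")   # a few million, as expected
    ```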

    If I sat down at my computer without looking up the answer, I’d guess it has something to do with translation. After all, proteins account for most of the cell’s dry mass, and other proteins are needed to build all those proteins! So maybe the most abundant protein is one of the ribosomal subunits, or something involved in transcribing DNA to RNA. Another possibility is that the most abundant protein is involved in energy production or some other critical process.

    But then I started digging. And here’s what I found.

    In 1978, researchers believed that elongation factor EF-Tu was the most abundant protein, with around 60,000 copies per cell. EF-Tu helps the ribosome grab the correct amino acid during translation. Around that same time, scientists also identified acyl carrier protein and RplL (involved in fatty acid biosynthesis and protein translation, respectively) as top contenders. They estimated that each E. coli cell has something like 60,000 to 110,000 copies of these proteins.

    Then, in 1979, a paper in Cell argued that those weren’t actually the most abundant proteins. Instead, the authors claimed that E. coli contains a protein with an order-of-magnitude more copies than either EF-Tu or RplL or anything else. They reported more than 700,000 copies of this protein inside each cell, an astounding figure given that E. coli typically holds only 3–4 million total proteins.

    That protein is called Lpp, and it basically maintains the structural integrity of the cell envelope by anchoring the outer membrane to the peptidoglycan layer. Lpp exists in two forms: one-third of the molecules are covalently bound to the peptidoglycan, and the remaining two-thirds float freely in the membrane. Together, these molecules create a network that stabilizes the cell envelope, helping cells hold their shape and preventing the envelope from collapsing. Without Lpp, the outer membrane would detach from the peptidoglycan layer and cells would get wrecked by various environmental stressors.

    Decades of experimental evidence now support the high copy number of Lpp. Way back in 1969, a duo named Braun and Rehn treated E. coli cell walls with trypsin (an enzyme that cleaves proteins) and observed a rapid decrease in light absorbance. This suggested that about 40 percent of the rigid cell wall is protein. Subsequent experiments identified Lpp as that protein.

    A follow-up study in 1972 used lysozyme and SDS-PAGE to separate the bound and unbound forms of Lpp. By tracking radiolabeled arginine incorporation, the researchers discovered that free Lpp is synthesized first and then converted to the bound form. Combining those findings with earlier data, they estimated that each cell contains around 300,000 total Lpp molecules. Later studies, including the 1979 Cell paper, refined this estimate to 720,000 copies (I don’t entirely understand how; the authors cite those earlier experiments).

    [Despite some of this shaky evidence, I do believe that Lpp is the most abundant protein by far. A 2023 paper in Science Advances visualized this protein in individual cells using atomic force microscopy, and again concluded that each E. coli cell contains hundreds of thousands to about one million copies.]

    Despite the evidence for Lpp’s abundance, some discrepancies remain. For example, the PaxDB database, which compiles protein abundance data from various studies, lists UspA (a stress response protein) as the most abundant E. coli protein. That is almost certainly not correct; many studies isolate and measure cytoplasmic proteins but lose the cell membrane in the process, which can bias results. Protein abundance also depends heavily on the E. coli strain and its growth conditions. Rapidly dividing cells ramp up Lpp to expand their membranes—but they also churn out more ribosomes to handle higher translation demands. Conversely, cells in nutrient-limited conditions might boost stress response proteins like UspA.

    So what are the lessons in all this? A few things:

    1. Few questions in biology have simple answers, and my initial guesses are often wrong.
    2. Don’t just trust a database. Instead, figure out how its data were actually collected before drawing conclusions.
    3. Cells change a lot from one moment to the next, and also between strains. Answers depend on these variables!

    1 Here’s how it works, briefly: Researchers feed cells a radioactive form of the amino acid, arginine. When the cells make proteins, they incorporate this tagged arginine and scientists can measure it using radioactivity detectors. For this particular Lpp experiment, researchers grew E. coli in the presence of radiolabeled arginine and then separated the different forms of Lpp (bound vs. free) on SDS-PAGE gels. By looking at how much radioactivity appeared in each band over time, they could see when and where new Lpp was being produced.

  • The Art of Emails

    Emails are underrated. Many people view them as purely functional — as just another part of the job. But they can be much more than that. Emails are a useful way not only to advance your career, but to actually become a better writer.

    First, consider the power of a “cold” email. Learning to reach out to strangers with a specific ask is one of the best ways to meet people you admire and to further your career. Nearly every job I’ve ever held began with a cold email, or through a connection with someone whom I had cold emailed. Cold emails will set you apart because so few people send them. They show initiative and a heartfelt desire to speak with someone. They are genuine precisely because they are “cold”; they exist outside of a job’s duties, and thus indicate a true desire to connect with another human.

    But I think that writing cold emails is even more important for an entirely different reason. Namely, it will teach you to be a better writer, without you even realizing it.

    Consider Paul Graham’s essays. Many of them have titles like: “Putting Ideas into Words,” “Write Simply,” “How to Write Usefully,” and so on. These essays are filled with useful writing advice: “The easier something is to read, the more deeply readers will engage with it.” Or, “It’s not just having to commit your ideas to specific words that makes writing so exacting. The real test is reading what you’ve written. You have to pretend to be a neutral reader who knows nothing of what’s in your head, only what you wrote. When he reads what you wrote, does it seem correct? Does it seem complete?” And: “Just as inviting people over forces you to clean up your apartment, writing something that other people will read forces you to think well. So it does matter to have an audience. The things I’ve written just for myself are no good.”

    All of this advice applies to the cold email. A great email is tailored to a specific audience, a single person who is likely to read the thing you’ve written. If you want this person to reply, then your email must be thoughtful and clear. You should re-read and re-write the cold email until you’re convinced that the email will serve its goal: time, attention, money, a meeting, a chance, whatever. The email must be simple, logical, and engaging. A great email forces you to read your own words from their perspective, and then ask: “Would I be convinced of this?” Graham refers to this as getting ideas “past the stranger.”

    The next time you are struggling to write an essay, then, just think of it as an email. This simple exercise will force you to hold a specific audience in your mind. You’ll naturally ask: What do I want to say, and how will I convince them it’s true? The words will also flow more easily. I often find it’s difficult to sit down and write an essay, compared to an email, because my audience for the essay is fuzzy and I fear people will not like what I’ve written. These concerns go away when I imagine I’m writing for an audience of one.

    So open up a browser or a notepad, and start typing. Don’t worry about the structure. Just focus on saying what you want to say, as clearly as possible, for this one person. Then refine what you’ve written until the stranger is satisfied.

  • Underrated Science Books

    It’s generally a bad idea to write a book.

    First, it takes time away from other things you could be writing. And second, it freezes your ideas in time, such that you can’t easily take them back or tell readers, “Wait, no! I’ve changed my mind!” later on. Even worse, as Gwern wrote in a recent essay, is that:

    “A book commits you to a single task, one which will devour your time for years to come, cutting you off from readers and from opportunity; in the time that you are laboring over the book, which usually you can’t talk much about with readers or enjoy the feedback, you may be driving yourself into depression.”

    And what happens when a writer finally finishes their book? Well, that’s when their true task begins, for they must pray and plead with people to buy it. Despite a writer’s best efforts, however, odds are that very few people will read it.

    About 90 percent of books sell fewer than 1,000 copies. Half of all published books sell fewer than a dozen copies. Most best-selling books are written by celebrities and politicians (or their ghost writers) and existing authors with large, established audiences — Michelle Obama, Brandon Sanderson, Stephen King…that kind of thing.

    Just because a book sells poorly, or goes out of print shortly after it’s published, does not mean it’s not a good book. The market does not always have good taste! I suspect there will always be an eager audience for books by Nick Lane and Ed Yong, but many other excellent writers fly ‘under the radar.’

    I’d like to remedy this situation — just a bit! — by sharing some of my favorite ‘underrated’ science books. I selected these books simply because I enjoyed reading them and have never heard others bring them up in conversation. Note that this is not a ranked list, because people don’t seem to like those.

    Please share your own underrated book recommendations in the comments below.

    • 40 Years of Evolution, by Peter & Rosemary Grant. This is my favorite book in the bunch. It is written by two Princeton scientists — a husband and wife duo — who spent several months on Daphne Major, an island in the Galápagos, every year for forty years. While there, they captured finches and measured their beaks, observing evolution in real-time. It’s absolutely brilliant and highly underrated.
    • Ben Franklin Stilled the Waves, by Charles Tanford. This book recounts the story of Benjamin Franklin’s experiments with oil on water (I wrote about it here). The gist is that he dropped some oil on a pond in Clapham Common, London, and noted how it “stilled the waves.” In the late 1800s, Lord Rayleigh repeated Franklin’s experiments and made more precise measurements. By dividing the volume of oil by the area it covered upon the water’s surface, Rayleigh was able to calculate the oil’s thickness and, in doing so, estimate the length of a single molecule. His estimates were off by just 2 percent. I love this book because it shows how simple experiments and mathematics can, together, reveal the invisible by measuring the visible.
    • Invisible Frontiers, by Stephen S. Hall. This is the most readable book I’ve found about biotechnology’s formative years. It covers the invention of recombinant DNA and the race between academic scientists in Massachusetts, California, and a startup company called Genentech to create human insulin using engineered microbes.
    • Life’s Ratchet, by Peter M. Hoffmann. This book is subtitled, “How molecular machines extract order from chaos.” Hoffmann does a brilliant job explaining how proteins convert electrical voltage into motion, or how ‘tiny ratchets’ transform random motion into ordered outcomes. This is an accessible introduction to biophysics, and is filled with incredible statements. For example, while describing a protein that carries cargo through a cell, Hoffmann explains that 10²¹ of them would generate as much power as a typical car engine…Yet, this number of molecular machines barely fills a teaspoon—a teaspoon that could generate 130 horsepower!
    • Projections, by Karl Deisseroth. A brilliant modern take on neuroscience, written by one of its foremost practitioners. Deisseroth is a co-inventor of optogenetics, a technique that uses pulses of light to trigger action potentials in the brain. In this book, he explains how the method works and how it’s being used to map neural circuits. Deisseroth also draws from his experiences as a physician, using patient stories to illustrate how neurodegenerative and psychiatric diseases operate at a mechanistic level. A worthy successor to Oliver Sacks.
    • Where the Sea Breaks Its Back, by Corey Ford. First published in 1966, this is an adventure book first and a science book second. It chronicles the expedition of naturalist Georg Steller and Vitus Bering in the 18th century. Despite preparing to set sail for Alaska over a ten-year period, Steller spent just ten hours in Alaska. The crew later shipwrecked on an island for about a year and many men died. Throughout the voyage, Steller writes about sea otters near the Aleutian islands (their populations later collapsed) and gives detailed anatomical descriptions of sea-cows, which were later named after him. This book is reminiscent of Endurance, about Shackleton’s escape from Antarctica, but is more scientifically-driven.
    • The Life of Isaac Newton, by Richard Westfall. An accessible portrait of Newton, covering his contributions to optics and mathematics, but also his lesser-known pursuits in alchemy and theology. This is really the first book that made me appreciate Newton’s genius; all the others seem to overcomplicate the subject.
    • Magnificent Principia, by Colin Pask. The only book that actually helped me understand Newton’s Principia Mathematica. Divided into seven parts, Pask first describes Newton’s background and character, then dives into his scientific approach and explains what classical mechanics says about the world around us. Notably, there are lessons in here about the ‘risk-averse’ nature of modern science as opposed to the freewheeling methods that often governed discoveries in Newton’s day.
    • How Economics Shapes Science, by Paula Stephan. This book paints a detailed picture of science funding in the United States. It is so detailed, in fact, that it became outdated shortly after its publication in 2012. Still, I think this book is an essential read for scientists because Stephan exposes how funding, incentives, and economic pressures influence the direction and nature of scientific inquiry. This is the first book where I really felt like I understood how science works at a meta-level, and how we might be able to make it better (shortly after reading it, I went to work at New Science.)

    Other great books not on this list:

    • Mutants by Armand Marie Leroi
    • King Solomon’s Ring by Konrad Lorenz
    • Gödel, Escher, Bach by Douglas Hofstadter
    • The Demon Under the Microscope by Thomas Hager
    • The Vital Question by Nick Lane
    • Gene Machine by Venki Ramakrishnan
    • Edison by Edmund Morris
    • Tesla: Inventor of the Electrical Age by W. Bernard Carlson
    • The Invention of Nature by Andrea Wulf
    • The Billion-Dollar Molecule by Barry Werth
    • Stories of Your Life and Others (sci-fi) by Ted Chiang
    • The Wizard and the Prophet by Charles C. Mann
    • Laws of the Game by Manfred Eigen
    • The Lives of a Cell by Lewis Thomas
    • The Genesis Machine by Amy Webb & Andrew Hessel

    Book recommendations from Twitter:

    • How Life Works by Philip Ball
    • The Eighth Day of Creation by Horace Freeland Judson
    • Power, Sex, Suicide by Nick Lane
    • Cathedrals of Science by Patrick Coffey
    • Longitude by Dava Sobel
    • Beyond the Hundredth Meridian by Wallace Stegner
    • Trilobite by Richard Fortey
    • Altered Fates by Jeff Lyon and Peter Gorner
    • Gene Dreams by Robert Teitelman
    • Breath from Salt by Bijal P. Trivedi
    • Alchemy of Air by Thomas Hager