Enzymes from Random Molecules

A new paper in Nature shows that enzymes can be made by mixing just four molecules together, none of which are amino acids. The four molecules randomly link together to form long polymer chains, some of which catalyze chemical reactions.

Though this sounds impressive, the paper itself is quite strange. For one, it is extremely short (only about 2,700 words) and has no discussion section. The text is also absurdly dense, likely written for materials scientists or physicists rather than biologists. And lastly, I think the paper is most interesting for the things it leaves unwritten: the ideas left out rather than put in. Understanding why this paper matters, then, is mostly an exercise in speculation.

For context, scientists have been trying to design new enzymes for decades. But this “design” has traditionally been done by searching for amino acid sequences which then fold into a 3D shape with some desired function. Computational biologists tend to fixate on the sequence; they tend to consider proteins as individuals rather than as populations of molecules.

Enzyme design is also a really hard problem. An enzyme’s interior holds amino acids in a precise arrangement, such that the amino acid(s) in the active site latch onto substrates and convert them into new molecules. This “active site” is surrounded by other amino acids that create a microenvironment suited to the reaction. If the substrate is negatively charged, for example, the microenvironment works to exclude positively charged molecules.

Despite this complexity, biologists have designed viable enzymes computationally. Last year, David Baker’s group at the University of Washington designed a serine hydrolase that breaks down esters, which are chemicals made by joining together an acid and an alcohol. This AI-designed enzyme has an active site made from three amino acids (a “catalytic triad”) that work together to catalyze the reaction. But it was quite slow, completing just one reaction per second, compared to the thousands of reactions per second that are typical of natural serine hydrolases. Enzyme design thus remains a mostly unsolved problem.

This new Nature paper, though, took a completely different approach. The key breakthrough, in my eyes, is its focus on populations of polymers rather than on trying to create one perfect polymer. The authors created enzymes using a statistical or probabilistic approach, rather than a deterministic one.

The researchers focused on metalloenzymes, which are arguably simpler than serine hydrolases because they only have a single amino acid in their active site, rather than a “triad”. Metalloenzymes hold metal ions (often zinc, iron, or copper) in that active site, hence the name. The researchers made two types of metalloenzymes: terpene cyclases, which take a string of carbons as substrate and “loop” it into a ring, and peroxidases, which use the iron in heme, together with hydrogen peroxide, to oxidize substrates. I’ll just focus on the terpene cyclase, as the approach taken was largely identical in both cases.

In nature, terpene cyclases take a straight chain of ten carbon atoms (a molecule called citronellal) and fold it into a ring. If all goes well, the enzyme makes isopulegol, which is a carbon ring with one alcohol group. But if water gets into the active site, this reaction is disrupted and the enzyme instead makes menthoglycol, which is the same carbon ring but with two alcohol groups.

Natural terpene cyclases have aspartate in their active site. The aspartate donates a proton to citronellal, thus making one of its carbon atoms positively charged. This triggers cyclization into a ring, as the “activated” carbon joins the carbon at the other side of the chain. The aspartate is surrounded by a hydrophobic shell, which keeps water out so that isopulegol gets made selectively instead of menthoglycol.

Seeking to create random polymers which could mimic a terpene cyclase, the researchers first analyzed 1,300 metalloproteins, looking for commonalities between them. They found two things: First, metalloproteins tend to have one “key” amino acid in their interior — often histidine or aspartate — which latches onto the metal ion, locking it in place, so that it can perform the chemical reaction. Second, metalloenzymes tend to surround their active sites with hydrophobic amino acids, which exclude water molecules. To make a metalloenzyme, then, one basically just needs to situate a single amino acid, or electron donor, inside a hydrophobic shell.

Next, the authors scoured chemical databases for molecules with these same properties, meaning they are hydrophobic or similar in shape and charge to histidine or aspartate. They ultimately settled on four molecules, referred to below by their abbreviations: MMA, EHMA, OEGMA, and SPMA.

(Note: You need some hydrophilic molecules, even when trying to build a hydrophobic active site, because the polymers won’t dissolve in water without them. Instead, they will aggregate or precipitate out of the solution. Hence the inclusion of OEGMA.)

Then, the researchers mixed these four molecules together, and each molecule randomly linked with others to create long and unique polymers. The hope was that some of these “pseudo-random” polymers would position an SPMA unit amid hydrophobic molecules, thus creating a terpene cyclase mimic. Initially, things did not go to plan.

In their first trial, the researchers mixed 50% MMA, 20% EHMA, 25% OEGMA, and 5% SPMA and added the resulting polymers to citronellal. After 24 hours, the polymers cyclized citronellal, but poorly. About half of the citronellal molecules were converted, and only 55 percent of products were isopulegol. In other words, the polymers could slowly catalyze reactions, but not selectively.

So the authors iterated. To optimize their reaction, they used a Monte Carlo algorithm to generate 100,000 polymer sequences based on each molecule’s ratio and reactivity. By tinkering with the molecular ratios and re-running these simulations, they figured out they could improve the odds that SPMA would be surrounded by hydrophobic residues — and thus act like a terpene cyclase — if they increased SPMA’s concentration (to 15%) while decreasing OEGMA (to 5%).
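To get a feel for what such a simulation does, here is a minimal sketch in Python. It is not the authors' algorithm. It assumes every monomer adds to the chain with a probability equal to its feed ratio (ignoring the reactivity differences the paper accounts for), treats neighbors along the chain as a crude stand-in for the 3D environment, and calls an SPMA "buried" when the two units on each side are hydrophobic. The chain length, the window size, and the MMA and EHMA percentages in the second formulation are hypothetical values chosen for illustration; the text above only gives the new SPMA and OEGMA percentages.

```python
import random

# MMA and EHMA are treated as the hydrophobic monomers (an assumption of this sketch).
HYDROPHOBIC = {"MMA", "EHMA"}

# First formulation, as reported in the text.
FIRST_FEED = {"MMA": 0.50, "EHMA": 0.20, "OEGMA": 0.25, "SPMA": 0.05}
# Second formulation: SPMA and OEGMA as reported; MMA and EHMA are assumed values.
SECOND_FEED = {"MMA": 0.60, "EHMA": 0.20, "OEGMA": 0.05, "SPMA": 0.15}


def make_chain(feed, length, rng):
    """Draw one random chain, picking each unit in proportion to its feed ratio."""
    monomers = list(feed)
    weights = [feed[m] for m in monomers]
    return rng.choices(monomers, weights=weights, k=length)


def mean_buried_sites(feed, n_chains=2000, length=100, flank=2, seed=0):
    """Average number of SPMA units per chain whose `flank` neighbors
    on both sides are all hydrophobic."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_chains):
        chain = make_chain(feed, length, rng)
        for i in range(flank, length - flank):
            if chain[i] != "SPMA":
                continue
            neighbors = chain[i - flank:i] + chain[i + 1:i + 1 + flank]
            if all(m in HYDROPHOBIC for m in neighbors):
                total += 1
    return total / n_chains


if __name__ == "__main__":
    print("first formulation: ", mean_buried_sites(FIRST_FEED))
    print("second formulation:", mean_buried_sites(SECOND_FEED))
```

Even in this toy version, raising SPMA and lowering OEGMA increases the number of SPMA units that land in a hydrophobic stretch, which is the qualitative trend described above.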

This yielded much better results. In a second round, the polymers converted 91 percent of citronellal after 24 hours, with a selectivity for isopulegol of 76 percent.
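One way to compare the two rounds is to multiply conversion by selectivity, which gives the share of the starting citronellal that ends up as isopulegol. The short calculation below does this, assuming "about half" means 50 percent and that selectivity is the fraction of converted product that is isopulegol. The function name is made up for this example.

```python
def isopulegol_yield(conversion, selectivity):
    """Fraction of starting citronellal that ends up as isopulegol."""
    return conversion * selectivity


# First round: "about half" converted (taken here as 0.50), 55 percent selectivity.
first_round = isopulegol_yield(0.50, 0.55)
# Second round: 91 percent converted, 76 percent selectivity.
second_round = isopulegol_yield(0.91, 0.76)

print(f"first round:  {first_round:.1%}")
print(f"second round: {second_round:.1%}")
```

Under these assumptions, the overall isopulegol yield rises from roughly 27.5 percent to roughly 69 percent, about a 2.5-fold improvement.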

So why does any of this matter? Well, the paper doesn’t really say, outside of some vague or indirect commentary. So what follows is mostly speculation…

I think one reason this paper is important is that it does away with the outdated notion that enzymes must be tuned at the sequence level. The study shows, rather, that enzymes can be made spontaneously using pseudo-random populations of molecules, much as the earliest cells on Earth probably made them. Early lifeforms didn’t need to evolve the perfect enzyme; they just needed to find concoctions of molecules that were “good enough” for a particular function.

The study also suggests that the 20 amino acids used by cells are not particularly special, and their functions can be replaced with other molecules carrying the same properties — like “charged” or “hydrophobic” or “flexible” and so on.

When I first discussed this paper with a friend, a protein biochemist, they urged me not to write about it. They said that metalloenzymes are not particularly difficult to make, and so this paper’s outcomes aren’t all that surprising. They pointed to another study demonstrating that it’s possible to make functional metalloenzymes simply by mixing purified phenylalanine with zinc ions.

My retort to their criticism, though, is that the authors have already used this same “random polymer” approach to make other types of proteins. In 2020, for example, they made protein channels that were exquisitely sensitive to protons. And when I spoke with the authors, they hinted that they have also made other classes of enzymes, including hydrolases.

But still, this paper leaves so much unsaid. I suspect many protein biochemists reading this blog still won’t find the work impressive or useful or surprising or whatever. It takes a long time to overturn dogma, after all, and it’ll be an uphill battle to change people’s perceptions of enzymes and how one can make them.

The paper itself also took seven years of work, according to a corresponding author, and involved many back-and-forth debates with a “hostile” reviewer. The manuscript was cut almost in half (from nearly 5,000 words to 2,700), losing much of its philosophical framing. “This was the hardest paper I’ve ever published,” the author told me. And after spending a week wrestling with whether to write about it, I understand why.
