A ‘Holy Grail’ of Science Is Getting Closer
The human cell is a miserable thing to study. Tens of trillions of them exist in the body, forming an enormous and intricate network that governs every disease and metabolic process. Each cell in that circuit is itself the product of an equally dense and complex interplay among genes, proteins, and other bits of profoundly small biological machinery.
Our understanding of this world is hazy and constantly in flux. As recently as a few years ago, scientists thought there were only a few hundred distinct cell types, but new technologies have revealed thousands (and that’s just the start). Experimenting in this microscopic realm can be a kind of guesswork; even success is frequently confounding. Ozempic-style drugs were thought to act on the gut, for example, but might turn out to be brain drugs, and Viagra was initially developed to treat cardiovascular disease.
Speeding up cellular research could yield tremendous things for humanity—new medicines and vaccines, cancer treatments, even just a deeper understanding of the elemental processes that shape our lives. And it’s beginning to happen. Scientists are now designing computer programs that may unlock the ability to simulate human cells, letting researchers predict the effect of a drug, mutation, virus, or any other change in the body, and in turn making physical experiments more targeted and likelier to succeed. Inspired by large language models such as ChatGPT, the hope is that generative AI can “decode the language of biology and then speak the language of biology,” Eric Xing, a computer scientist at Carnegie Mellon University and the president of Mohamed bin Zayed University of Artificial Intelligence, in the United Arab Emirates, told me.
Much as a chatbot can discern style and perhaps even meaning from huge volumes of written language, which it then uses to construct humanlike prose, AI could in theory be trained on huge quantities of biological data to extract key information about cells or even entire organisms. This would allow researchers to create virtual models of the many, many cells within the body—and act upon them. “It’s the holy grail of biology,” Emma Lundberg, a cell biologist at Stanford, told me. “People have been dreaming about it for years and years and years.”
These grandiose claims—about so ambiguous and controversial a technology as generative AI, no less—may sound awfully similar to self-serving prophecies from tech executives: OpenAI’s Sam Altman, Google DeepMind’s Demis Hassabis, and Anthropic’s Dario Amodei have all declared that their AI products will soon revolutionize medicine.
If generative AI does make good on such visions, however, the result may look something like the virtual cell that Xing, Lundberg, and others have been working toward. (Last month, they published a perspective in Cell on the subject. Xing has taken the idea a step further, co-authoring several papers about the possibility that such virtual cells could be combined into an “AI-driven digital organism”—a simulation of an entire being.) Even in these early days—scientists told me that this approach, if it proves workable, may take 10 or 100 years to fully realize—it’s a demonstration that the technology’s ultimate good may come not from chatbots, but from something much more ambitious.
Efforts to create a virtual cell did not begin with the arrival of large language models. The first modern attempts, back in the 1990s, involved writing equations and code to describe every molecule and interaction. This approach yielded some success, and the first whole-cell model, of a bacterial species, was eventually published in 2012. But it hasn’t worked for human cells, which are more complicated—scientists lack a deep enough understanding to imagine or write all of the necessary equations, Lundberg said.
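To get a feel for what that equation-based approach entails, consider a minimal sketch in Python. The species, rate constants, and numbers here are invented for illustration; a real whole-cell model of this kind tracks thousands of interacting equations, one for each molecule and reaction.

```python
# A toy, 1990s-style mechanistic model: each quantity in the cell gets a
# hand-written rate equation. Species and rate constants are hypothetical.
from scipy.integrate import solve_ivp

def cell_kinetics(t, y, k_transcribe, k_translate, k_decay):
    mrna, protein = y
    d_mrna = k_transcribe - k_decay * mrna                # transcription minus decay
    d_protein = k_translate * mrna - k_decay * protein    # translation minus decay
    return [d_mrna, d_protein]

# Simulate 100 time units, starting from an "empty" cell.
solution = solve_ivp(cell_kinetics, t_span=(0, 100), y0=[0.0, 0.0],
                     args=(2.0, 0.5, 0.1))
print(solution.y[:, -1])  # approximate steady-state mRNA and protein levels
```

The difficulty is apparent even at this tiny scale: someone has to know, and write down, every equation in advance, which is precisely what human cells have not allowed.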
The issue is not that there isn’t any relevant information. Over the past 20 years, new technologies have produced a trove of genetic-sequence and microscope data related to human cells. The problem is that the corpus is so large and complex that no human could possibly make total sense of it. But generative AI, which works by extracting patterns from huge amounts of data with minimal human instructions, just might. “We’re at this tipping point” for AI in biology, Eran Segal, a computational biologist at the Weizmann Institute of Science and a collaborator of Xing’s, told me. “All the stars aligned, and we have all the different components: the data, the compute, the modeling.”
Scientists have already begun using generative AI in a growing number of disciplines. For instance, by analyzing years of meteorological records or quantum-physics measurements, an AI model might reliably predict the approach of major storms or how subatomic particles behave, even if scientists can’t say why the predictions are accurate. The ability to explain is being replaced by the ability to predict, human discovery supplanted by algorithmic faith. This may seem counterintuitive (if scientists can’t explain something, do they really understand it?) and even terrifying (what if a black-box algorithm trusted to predict floods misses one?). But so far, the approach has yielded significant results.
“The big turning point in the space was six years ago,” Ziv Bar-Joseph, a computational biologist at Carnegie Mellon University and the head of research and development and computational sciences at Sanofi, told me. In 2018—before the generative-AI boom—Google DeepMind released AlphaFold, an AI algorithm that functionally “solved” a long-standing problem in molecular biology: how to discern the three-dimensional structure of a protein from the list of amino acids it is made of. Doing so for a single protein used to take a human years of experimenting, but in 2022, just four years after its initial release, AlphaFold predicted the structure of 200 million of them, nearly every protein known to science. The program is already advancing drug discovery and fundamental biological research, which won its creators a Nobel Prize this past fall.
The program’s success inspired researchers to design so-called foundation models for other building blocks of biology, such as DNA and RNA. Echoing how chatbots predict the next word in a sentence, many of these foundation models are trained to predict what comes next in a biological sequence, such as the next set of As, Ts, Gs, and Cs that make up a strand of DNA, or the next amino acid in a protein. Generative AI’s value extends beyond straightforward prediction, however. As they analyze text, chatbots develop abstract mathematical maps of language based on the relationships between words. They assign words and sentences coordinates on those maps, known as “embeddings”: In one famous example, the difference between the embeddings of queen and king closely matches the difference between woman and man, suggesting that the program developed some internal notion of gender roles and royalty. Basic, if flawed, capacities for mathematics, logical reasoning, and persuasion seem to emerge from this word prediction.
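That famous example reduces to toy arithmetic. The three-dimensional vectors below are invented stand-ins; real embeddings run to hundreds or thousands of dimensions learned from text, not hand-picked numbers.

```python
# Made-up embeddings chosen so the "gender direction" is visible by eye.
import numpy as np

embeddings = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.9, 0.8, 0.9]),
    "man":   np.array([0.1, 0.2, 0.1]),
    "woman": np.array([0.1, 0.2, 0.9]),
}

# The offset from "king" to "queen" matches the offset from "man" to
# "woman": the direction a real model would have learned for gender.
print(embeddings["queen"] - embeddings["king"])  # [0.  0.  0.8]
print(embeddings["woman"] - embeddings["man"])   # [0.  0.  0.8]
```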
Many AI researchers believe that the basic understanding reflected in these embeddings is what allows chatbots to effectively predict words in a sentence. This same idea could be of use in biological foundation models as well. For instance, to accurately predict a sequence of nucleotides or amino acids, an algorithm might need to develop internal, statistical approximations of how those nucleotides or amino acids interact with one another, and even how they function in a cell or an organism.
Although these biological embeddings—essentially a long list of numbers—are on their own meaningless to people, the numbers can be fed into other, simpler algorithms that extract latent “meaning” from them. The embeddings from a model designed to understand the structure of DNA, for instance, could be fed into another program that predicts DNA function, cell type, or the effect of genetic mutations. Instead of having a separate program for every DNA- or protein-related task, a foundation model can address many at once, and several such programs have been published over the past two years.
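In code, that division of labor might look like the following sketch. The foundation model is faked with a placeholder function, and the sequences and cell-type labels are invented; the point is only the pattern of frozen embeddings feeding a much simpler downstream predictor.

```python
# Hypothetical pipeline: embeddings from a (stand-in) DNA foundation model
# become input features for a simple cell-type classifier.
import numpy as np
from sklearn.linear_model import LogisticRegression

def get_dna_embedding(sequence: str) -> np.ndarray:
    # Placeholder for a real foundation model: deterministically maps a
    # sequence to a 64-dimensional vector. No biology is involved here.
    seed = sum(ord(base) for base in sequence)
    return np.random.default_rng(seed).normal(size=64)

sequences = ["ATGCGT", "TTGACA", "CCGGAA", "ATGAAA"]  # invented DNA snippets
cell_types = [0, 1, 1, 0]                             # invented labels

X = np.stack([get_dna_embedding(s) for s in sequences])
classifier = LogisticRegression().fit(X, cell_types)
print(classifier.predict(X))  # the simple model reads "meaning" out of the embeddings
```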
Take scGPT, for example. This program was designed to predict bits of RNA in a cell, but it has succeeded in predicting cell type, the effects of genetic alterations, and more. “It turns out by just predicting next gene tokens, scGPT is able to really understand the basic concept of what is a cell,” Bo Wang, one of the program’s creators and a biologist at the University of Toronto, told me. The latest version of AlphaFold, published last year, has exhibited far more general capabilities—it can predict the structure of biological molecules other than proteins as well as how they interact. Ideally, the technology will make experiments more efficient and targeted by systematically exploring hypotheses, allowing scientists to physically test only the most promising or intriguing ones. Wang, a co-author on the Cell perspective, hopes to build even more general foundation models for cellular biology.
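Stripped to its statistical core, “predicting the next gene token” resembles the miniature model below. scGPT itself is a transformer, not a bigram counter, and these gene sequences are invented, but the training objective is the same in spirit: given the tokens so far, guess what comes next.

```python
# A bare-bones next-token predictor over toy "gene tokens." The cell
# sequences are arbitrary placeholders, not real expression data.
from collections import Counter, defaultdict

cells = [
    ["CD3", "CD4", "IL2"],
    ["CD3", "CD8", "IFNG"],
    ["CD3", "CD4", "IL2"],
]

# Count which gene token tends to follow which.
follows = defaultdict(Counter)
for cell in cells:
    for current, nxt in zip(cell, cell[1:]):
        follows[current][nxt] += 1

# Predict the most likely token after "CD3".
print(follows["CD3"].most_common(1))  # [('CD4', 2)]
```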
The language of biology, if such a thing exists, is far more complicated than any human tongue. All the components and layers of a cell affect one another, and scientists hope that composing various foundation models will create something greater than the sum of their parts—like combining an engine, a hull, landing gear, and other parts into an airplane. “Eventually it’s going to all come together into one big model,” Stephen Quake, the head of science at the Chan Zuckerberg Initiative (CZI) and a lead author of the virtual-cell perspective, told me. (CZI—a philanthropic organization focused on scientific advancement that was co-founded by Priscilla Chan and her husband, Mark Zuckerberg—has been central to many of these recent efforts; in March, it held a workshop focused on AI in cellular biology that led to the publication of the perspective in Cell, and last month, the group announced a new set of resources dedicated to virtual-cell research, which includes several AI models focused on cell biology.)
In other words, the idea is that algorithms designed for DNA, RNA, gene expression, protein interactions, cellular organization, and so on might constitute a virtual cell if put together in the right way. “How we get there is a little unclear right now, but I’m confident it will,” Quake said. But not everyone shares his enthusiasm.
Across contexts, generative AI has a persistent problem: Researchers and enthusiasts see a lot of potential that may not always work out in practice. The LLM-inspired approach of predicting genes, amino acids, or other such biological elements in a sequence, as if human cells and bodies were sentences and libraries, is in its “very early days,” Quake said. Xing likened his and similar virtual-cell research to having a “GPT-1” moment, referencing an early proof-of-concept program that eventually led to ChatGPT.
Although using deep-learning algorithms to analyze huge amounts of data is promising, the quest for more and more universal solutions struck some researchers I spoke with as well-intentioned but unrealistic. The foundation-model approach in Xing’s AI-driven digital organisms, for instance, suggests “a little too much faith in the AI methods,” Steven Salzberg, a biomedical engineer at Johns Hopkins University, told me. He’s skeptical that such generalist programs will be more useful than bespoke AI models such as AlphaFold, which are tailored to concrete, well-defined biological problems such as protein folding. Predicting genes in a sequence didn’t strike Salzberg as an obviously useful biological goal. In other words, perhaps there is no unifying language of biology—in which case no embedding can capture every relevant bit of biological information.
More important than AlphaFold’s approach, perhaps, was that it reliably and resoundingly beat other, state-of-the-art protein-folding algorithms. But for now, “the jury is still out on these cell-based models,” Bar-Joseph, the CMU biologist, said. Researchers have to prove how well their simulations work. “Experiment is the ultimate arbiter of truth,” Quake told me—if a foundation model predicts the shape of a protein, the degree of a gene’s expression, or the effects of a mutation, but actual experiments produce confounding results, the model needs reworking.
Even with working foundation models, the jump from individual programs to combining them into full-fledged cells is a big one. Scientists haven’t figured out all of the necessary models, let alone how to assemble them. “I haven’t seen a good application where all these different models come together,” Bar-Joseph said, though he is optimistic. And although there are a lot of data for researchers to begin with, they will need to collect far more moving forward. “The key challenge is still data,” Wang said. For example, many of today’s premier cellular data sets don’t capture change over time, which is a part of every biological process, and might not be applicable to specific scientific problems, such as predicting the effects of a new drug on a rare disease. Right now, the field isn’t entirely sure which data to collect next. “We have sequence data; we have image data,” Lundberg said. “But do we really know which data to generate to reach the virtual cell? I don’t really think we do.”
In the near term, the way forward might not be foundation models that “understand” DNA or cells in the abstract, but instead programs tailored to specific queries. Just as there isn’t one human language, there may not be a unified language of biology, either. “More than a universal system, the first step will be in developing a large number of AI systems that solve specific problems,” Andrea Califano, a computational biologist at Columbia and the president of the Chan Zuckerberg Biohub New York, and another co-author of the Cell perspective, told me. Even if such a language of biology exists, aiming for something so universal could also be so difficult as to waste resources when simpler, targeted programs would more immediately advance research and improve patients’ lives.
Scientists are trying anyway. Every level of ambition in the quest to bring the AI revolution to cell biology—whether modeling of entire organisms, single cells, or single processes within a cell—emerges from the same hope: to let virtual simulations, rather than physical experiments, lead the way. Experiments may always be the arbiters of truth, but computer programs will determine which experiments to carry out, and inform how to set them up. At some point, humans may no longer be making discoveries so much as verifying the work of algorithms—constructing biological laboratories to confirm the prophecies of silicon.