Theorem Prover Sheds New Light on Stem-Cell Behavior
Advances in computer science in recent years have had dramatic effects on how more traditional sciences are conducted in the 21st century. That has become common knowledge by now, but if you require further proof, just turn to the June 6 issue of Science.
That magazine includes a paper, co-written by Microsoft researchers in concert with a couple of academic collaborators, that details how a new, computational model can explain how genes interact to keep mouse embryonic stem (ES) cells in an unspecialized state.
The scientific term for such a state is “pluripotency,” the potential of a stem cell to differentiate into specialized directions, such as those that can develop into brain cells or heart cells or lung cells. The achievement outlined in the Science piece would not have been possible by biologists alone, nor by computer scientists alone. By working together, though, they have been able access a significant key to the magic of life.
“For the first time,” says Sara-Jane Dunn, a Microsoft researcher and one of the co-authors of the Science paper, “we have uncovered a possible program that can explain how genes interact to keep mouse embryonic stem cells in an unspecialized state—the pluripotent state.
“Until now, the field had a wealth of information on the key regulators of pluripotency, but no explanatory or predictive models of how external signals are processed by the cell to maintain its naivety. By developing novel computational techniques, we have shown that propagation of ES cell identity can be explained by a simple process of molecular computation, which is supported by a high number of accurate, non-intuitive predictions of the response of ES cells to genetic perturbations.”
The Microsoft researchers worked with scientists and researchers from the University of Cambridge, constituting the Wellcome Trust-Medical Research Council Cambridge Stem Cell Institute, on the project resulting in the Science paper, Defining an Essential Transcription Factor Program for Naïve Pluripotency.
While the collaboration was biological in nature, it took a uniquely computational approach—a bespoke tool built upon Microsoft’s Z3 theorem prover—to attain the results described in the paper.
It has been known for a while that maintenance of the stem-cell pluripotent state is regulated by a small number of key genes—about 20. But how they were linked together remained a mystery, and a complex one at that. There are literally billions of possible ways to combine them.
The scientists and researchers analyzed correlations between the genes and created a meta-model of the network, using techniques used to identify software bugs and guarantee program correctness—particularly in safety-critical systems—to distill the connections that result in observed stem-cell behaviors.
“Our primary motivation was to uncover the biological program that governs pluripotency,” Dunn explains. “Experiments reveal a set of possible regulatory interactions and effects between key genes that govern pluripotency, which constitute a large set of possible biological programs. The challenge was to identify only those programs that could simultaneously explain many experimentally observed behaviors.
“Satisfiability Modulo Theory solvers, such as the Microsoft Z3 solver applied in this work, are used both to prove the correctness of a software program but also for program synthesis—the automatic generation of a program from high-level specifications. In this context, adapting these methods allowed us to constrain the set of possible programs against biological ‘specifications.’ The success of this approach confirms that explicitly seeking to identify information processing performed by a cell is an approach worth pursuing.”
The authors—including, in addition to Dunn, Microsoft researchers Boyan Yordanov and Stephen Emmott, along with Graziano Martello and Austin Smith from the Stem Cell Institute—found that the decision of a stem cell to remain self-renewing or to begin to differentiate can be explained by a relatively simple program, one featuring just three input signals, 12 genes, and, in its simplest form, 16 interactions.
In addition, the modeling approach can predict stem-cell behavior accurately amid new conditions, the first time computation has been able to predict this complex decision-making in practice.
“More generally,” Dunn adds, “the development and application of automated-reasoning techniques for biological systems is part of a broader, cross-group effort exploiting Microsoft’s expertise in formal methods, which has led to promising results in diverse areas from developmental biology to the engineering of DNA computing circuits.”
The hope is that the model could lead to further discoveries.
“There are four routes,” Dunn says, “that stem from this model and approach. We can now begin to examine the specific decisions that ES cells make as they differentiate to become specialized cells—and how these decisions are orchestrated within the program of organogenesis. It should also be possible to examine how and why current human pluripotent stem cells are different from those of a mouse, potentially enabling the capture of human ground-state stem cells, which currently eludes the field.
“Beyond this, we are excited to investigate the phenomenon of cellular reprogramming—a feat that reaped Japanese researcher Shinya Yamanaka a Nobel Prize in 2012. An outstanding question is how to improve the efficiency of reprogramming to induce pluripotency, which is currently only 1 percent. Lastly, the methods developed as part of this work are rather general and could also allow us to uncover and study the computation performed by other biological systems in the future.”