Computational Biology at MSR New England

Established: January 1, 2011

Inferring biologically relevant information from the vast amount of experimental data available to systems biologists poses increasingly important computational challenges.

We pursue several approaches to computational biology. In one, we frame the biological question under consideration in terms of more standard problems in computer science, such as clustering, Steiner trees, and flow problems, and then use approximation algorithms motivated by statistical physics to solve them. One of our most successful approaches in this realm involves variants of belief propagation and survey propagation algorithms. In the course of adapting a problem to this setting, we often need to derive alternative representations of the original computer science problem, which can be useful when applying other algorithms as well.

We also approach many problems from the perspective of applied statistics and machine learning, making use of latent variable models and efficient operations on them to perform inference and learning. In this vein, we have tackled problems in CRISPR gene editing; problems in statistical genetics, such as effective and efficient handling of unknown confounding factors in eQTL association studies, genome-wide association studies, and analysis of methylation data; problems in immunoinformatics, such as HLA imputation and refinement and epitope prediction; and problems in proteomics, such as alignment of vector time series resulting from liquid-chromatography-mass-spectrometry systems.

Selected Publications

Optimized sgRNA design to maximize activity and minimize off-target effects for genetic screens with CRISPR-Cas9 
JG Doench*, N Fusi*, M Sullender*, M Hegde*, EW Vaimberg*, KF Donovan, I Smith, Z Tothova, C Wilen, R Orchard, HW Virgin, J Listgarten*, DE Root, Nature Biotechnology (2016)

Warped linear mixed models for the genetic analysis of transformed phenotypes 
Fusi N., Lippert C., Lawrence N., Stegle O., Nature Communications (2014)

Epigenome-wide association studies without the need for cell-type composition
Zou J, Lippert C, Heckerman D, Aryee M, Listgarten J, Nature Methods, 309–311 (2014)

FaST-LMM-Select for addressing confounding from spatial structure and rare variants
Listgarten* J, Lippert* C, Heckerman* D (*equal contributions) Nature Genetics, 45, 470-471 (2013)

Improved linear mixed models for genome-wide association studies
Listgarten* J, Lippert* C, Kadie C, Davidson B, Eskin E, Heckerman* D (*equal contributions)
Nature Methods, 2012

FaST Linear Mixed Models for Genome-Wide Association Studies
Lippert* C, Listgarten* J, Liu Y, Kadie C, Davidson R, Heckerman* D (*equal contributions) Nature Methods, Aug. 2011

Correction for Hidden Confounders in the Genetic Analysis of Gene Expression 
Listgarten J, Kadie C, Schadt E, Heckerman D
Proceedings of the National Academy of Sciences, September 1, 2010

Statistical resolution of ambiguous HLA typing data
Listgarten J, Brumme Z, Kadie C, Xiaojiang G, Walker B, Carrington M, Goulder P, Heckerman D, PLoS Computational Biology (2008)

Statistical and computational methods for comparative proteomic profiling using liquid chromatography-tandem mass spectrometry
Listgarten J and Emili A, Molecular and Cellular Proteomics (2005)

Simultaneous reconstruction of multiple signaling pathways via the prize-collecting Steiner forest problem (N. Tuncbag, A. Braunstein, A. Pagnani, S.S. Huang, J. Chayes, C. Borgs, R. Zecchina, and E. Fraenkel) Journal of Computational Biology 20 (2013) 124 – 136.

Finding undetected protein associations in cell signaling by belief propagation (with M. Bailly-Bechet, C. Borgs, A. Braunstein, J. Chayes, A. Dagkessamanskaia, J. Francois, and R. Zecchina). Proceedings of the National Academy of Sciences (PNAS) 108 (2011) 882 – 887.

Statistical mechanics of Steiner trees (M. Bayati, C. Borgs, A. Braunstein, A. Ramezanpour, and R. Zecchina) Physical Review Letters 101, 037208 (2008), reprinted in Virtual Journal of Biological Physics Research 16, August 1 (2008).

Collaborators

Ernest Fraenkel, MIT

Ernest Fraenkel studied Chemistry and Physics as an undergraduate at Harvard College and obtained his Ph.D. in Structural Biology at MIT in the Department of Biology. After doing post-doctoral work in the same field at Harvard, he turned his attention to the emerging field of Systems Biology. His research now focuses on using high-throughput techniques and computational methods to uncover the molecular pathways that are altered in disease and to identify new therapeutic strategies.

Riccardo Zecchina, Politecnico di Torino, Italy

Riccardo is Professor of Theoretical Physics at the Politecnico di Torino in Italy. His interests are in topics at the interface between Statistical Physics and Computer Science. His current research activity is focused on combinatorial and stochastic optimization, probabilistic and message-passing algorithms and interdisciplinary applications of statistical physics (in computational biology, graphical games and statistical inference).

People