Computational Aspects of Biological Information 2016


Computational Aspects of Biological Information (CABI) 2016 is the fourth one-day symposium on challenges and successes in computational biology. CABI 2016 will be held at Microsoft Research New England on November 30, 2016 and will bring together experts in the Boston/Cambridge area to discuss computational solutions to problems in biology, including systems biology, genomics, and related areas. The symposium is open to everyone and registration is free of charge. Lunch will be served.

Symposium Speakers

Confirmed speakers include:

Organizing committee

Poster Session

There will be a poster session in the afternoon. Poster submissions are now closed.

Hospitality Notice for University and Government Employees: Microsoft Research is providing hospitality at this event. Please consult with your institution to determine whether you can accept meals and other hospitality under your institution’s ethics rules and any other laws that might apply. By accepting our invitation, you confirm that this invitation is compliant with your institution’s policies.

Arrival Guidance

Upon arrival, be prepared to show a picture ID and sign the Building Visitor Log at the Lobby Floor Security Desk. Tell the security staff the name of the event you are attending and ask them to direct you to the appropriate floor. The talks will be held in the First Floor Conference Center, Horace Mann Conference Room.

Poster Submissions

Poster submissions are now closed.

To be considered for the CABI 2016 poster session, please submit the following information to by October 31st. Submissions will be evaluated on a rolling basis, and acceptance notifications will be sent by November 9th. Space is limited, and preference will be given to early submissions.

Posters may be up to 48”W X 36”H.

  • Presenting author name
  • Presenting author affiliation/institution
  • Presenting author email address
  • Title
  • Complete list of authors (first name last name).  Please do not list author affiliations.
  • Abstract (250 words maximum)

Please email the conference organizers at if you have any questions.


Time | Session | Speaker
9:00 AM | Registration and coffee service |
9:50 AM | Opening remarks | Jennifer Chayes
10:00 AM | The role of fluctuations in individual cells | Johan Paulsson
10:30 AM | Visualizing transcription at nucleotide resolution with NET-seq | Stirling Churchman
11:00 AM | Coffee break |
11:15 AM | Systematic identification of cancer targets | Bill Hahn
11:45 AM | Cancer Genome Evolution | Ben Raphael
12:15 PM | |
1:00 PM | Poster session |
1:30 PM | Reading & Writing Omni Omics In Situ | George Church
2:00 PM | Darwinian Evolution as Learning | Leslie Valiant
2:30 PM | Measuring and Controlling the Dynamics of the Anesthetized Brain | Emery Brown
3:00 PM | Coffee break/poster session |
3:45 PM | Mapping the regulatory genome | David Gifford
4:15 PM | Genetic variation in human transcription factors | Martha Bulyk
4:45 PM | Scaling up genetic analysis | Benjamin Neale
5:15 PM | Closing remarks |


The role of fluctuations in individual cells

Speaker: Johan Paulsson

All processes in single cells involve components present in low numbers, creating spontaneous fluctuations that in turn can enslave the components present in high numbers. In the first half of the talk I will discuss some mathematical frameworks we developed to analytically predict and analyze random fluctuations in complex processes. The second half of the talk will focus on experimental methods to quantify dynamics in cells, as well as examples of microbial systems where fluctuations play a large role, including feedback control of replication, cell fate decisions, epigenetic oscillations and DNA repair.

Systematic identification of cancer targets

Speaker: Bill Hahn

Although we now have a draft view of the genetic alterations that occur in human cancer, the number of mutations found at low frequency and the molecular heterogeneity of most cancers make identifying genes that contribute to cancer phenotypes challenging. Determining the function of genes altered in cancer genomes is essential to develop new therapeutic approaches. To complement these genome characterization studies, we have used genome-scale gain- and loss-of-function approaches to identify genes required for cell survival and transformation. Specifically, we have performed systematic studies to interrogate rare alleles found altered in cancer genomes and used advances in gene synthesis to prospectively interrogate all possible alleles of known cancer genes. In parallel, we have performed both genome-scale RNAi and CRISPR-Cas9 screens in more than 500 cell lines to identify differentially essential genes and the context that specifies gene dependency. This approach now permits us to identify and classify cancer dependencies. These studies allow us to begin to define a global cancer dependencies map.

Cancer Genome Evolution

Speaker: Ben Raphael

Cancer is an evolutionary process driven by somatic mutations that accumulate in a population of cells. In this talk, I will describe several algorithms to reconstruct this process from DNA sequencing data of tumor samples. These algorithms address challenges that distinguish the cancer genome phylogeny problem from classical phylogenetic tree reconstruction. I will demonstrate the application of these algorithms to sequencing data from multiple cancer types.

Reading & Writing Omni Omics In Situ

Speaker: George Church

Exponential technologies such as Expansion Fluorescent In Situ Sequencing (FISSeq/ExSeq) enable computational analysis of multicellular organs including synapse-level-resolution of connectome and transcriptome plus nucleosome-level-resolution chromosome chain tracing in situ (via Oligopaints). We can also computationally design and build whole genomes (via synthesis and recombination) and epigenomes (via comprehensive transcription factor libraries). Data from the IARPA MICrONS BRAIN project is aimed at new insights into visual machine learning strategies.

Darwinian Evolution as Learning

Speaker: Leslie Valiant

Living organisms function according to protein circuits. Darwin’s theory of evolution suggests that these circuits have evolved through variation guided by natural selection. However, the question of which kinds of circuits can so evolve in realistic population sizes and within realistic numbers of generations has remained essentially unaddressed.

We suggest that computational learning theory offers the framework for investigating this question of how circuits can come into existence adaptively from experience, without a designer, and then be maintained. We formulate evolution as a form of learning from examples. The targets of the learning process are the functions of highest fitness. The examples are the experiences. The learning process is constrained so that the feedback from the experiences is Darwinian. We formulate a notion of evolvability that distinguishes function classes that are evolvable with polynomially bounded resources from those that are not. The dilemma is that if the function class, in particular that of the expression levels of proteins in terms of each other, is too restrictive, then it will not support biology, while if it is too expressive then no evolution algorithm will exist to navigate it. We shall review current work in this area.

Scaling up genetic analysis

Speaker: Ben Neale

With the advent of sequencing technology and ever-increasing genome-wide association datasets, we need tools to meet the challenges of scale. Here I will describe our efforts to develop a software package, Hail, that is built using Spark and Scala. Hail leverages a distributed model of computing to perform scalable sequence data quality control and analysis. In doing so, we can perform primary quality control analyses on whole genome sequencing datasets of ~5,000 individuals in under an hour. Using Hail, we have performed analyses of educational attainment on a sample of over 14,000 individuals, identifying a clear role of ultra-rare disruptive mutations in the genetic architecture. We further explored the consequences of this class of variation across a wide range of traits and demonstrate that neuropsychiatric traits appear to have a directional burden effect, in contrast to later-onset systemic disease.

Mapping the regulatory genome

Speaker: David Gifford

With the advent of multiplexed DNA oligo synthesis, CRISPR genome editing, and high-throughput sequencing, it is now possible to characterize genome function with experiments that directly observe the effect of sequence variants. We will discuss the computational design and analysis of sequence variants that have a causal effect on the binding of DNA regulatory proteins and proximal gene expression. New results include the observation of regulatory elements in regions of the genome with no observed epigenetic marks.

Genetic variation in human transcription factors

Speaker: Martha Bulyk

Sequencing of exomes and genomes has revealed abundant genetic variation affecting the coding sequences of human transcription factors (TFs), but the consequences of such variation remain largely unexplored. We developed a computational, structure-based approach to evaluate TF variants for their impact on DNA-binding activity and used universal protein binding microarrays to assay sequence-specific DNA-binding activity across 41 reference and 117 variant alleles found in individuals of diverse ancestries and families with Mendelian diseases. We found 77 variants in 28 genes that affect DNA-binding affinity or specificity and identified thousands of rare alleles likely to alter the DNA-binding activity of human sequence-specific TFs. Our results suggest that most individuals have unique repertoires of TF DNA-binding activities, which may contribute to phenotypic variation.