Project Premonition

Established: March 2, 2015



Interested in evaluating Project Premonition technologies and data?
Sign up here

Project Premonition aims to detect pathogens before they cause outbreaks

Emerging infectious diseases such as Zika, Ebola, Chikungunya and MERS are dangerous and unpredictable. Public health organizations need data as early as possible to predict disease spread and plan responses. Yet early data is very difficult to obtain, because it must be proactively collected from potential disease sources in the environment. Researchers estimate that between 60% and 75% of emerging infectious diseases originate from animals, which are very difficult to monitor.

Project Premonition aims to detect pathogens before they cause outbreaks — by turning mosquitoes into devices that collect data from animals in the environment. There are over 3,600 known species of mosquitoes, which bite a wide range of animals from dogs and chickens to snakes and mice. Each bite may collect a few microliters of blood, containing genetic information about the animal that was bitten and pathogens circulating in that animal. In fact, it has already been shown that the DNA collected from mosquitoes can be used to identify: (1) the types of animals that were bitten, (2) mosquito-borne diseases such as Zika and West Nile that infect both mosquitoes and hosts (e.g. humans and animals), and (3) previously unknown viruses of unknown origin.



Find mosquito hotspots by drone

Collect mosquitoes with robots

Detect pathogens by gene sequencing

However, catching and analyzing mosquitoes isn’t as easy as it sounds. First, the Project Premonition system must efficiently find and collect many live specimens from the environment, traditionally a labor-intensive task. Second, it must sort through the tremendous amount of data generated by gene sequencing mosquitoes to accurately detect potential pathogens. Only recent advances in robotics, gene sequencing, and cloud computing have made such a system feasible. To accomplish these tasks, we are developing drones that autonomously locate mosquito hotspots, robotic traps that identify and collect interesting specimens, and cloud-scale genomics and machine learning algorithms that search for pathogens. These technologies are being developed by researchers at Microsoft, University of Pittsburgh, Johns Hopkins University, University of California Riverside, and Vanderbilt University, in collaboration with public health organizations. This project is part of Microsoft Healthcare NExT and is powered by the Microsoft Cloud.

See more at: CNN, Today Show, GeekWire


Find mosquito hotspots by drone


Drones that locate mosquito hotspots…and eventually place robotic traps

Monitoring pathogens through mosquitoes means first finding where mosquitoes hide. It turns out finding mosquito hotspots is harder than it sounds. Mosquito populations change daily with weather. One urban block might have thousands of mosquitoes, and the next block almost none. This challenge is well-understood by public health organizations that must continually inspect large areas for new mosquito hotspots and breeding sites.

Drones offer an efficient means to scan large areas for likely mosquito habitat. To evaluate the capabilities of drones, the Project Premonition team collaborated with the island nation of Grenada to survey locations across the country. At each site, a standard mosquito trap (a CO2-baited CDC light trap) was placed to determine whether the site was a hotspot, and a drone was flown to capture high-resolution visual data. Grenada was an ideal location to evaluate drone capabilities because of its diverse habitats, ranging from low-lying urban environments to dense cloud forests, all within 135 square miles.

The map below shows the sites that were evaluated. The adjacent map shows the current method for finding potential hotspots using satellite imagery to estimate vegetation density (NDVI). From the satellite’s point-of-view, the entire island appears as a hotspot (everything is green), even though some of these locations were hotspots while others were not. The image to the right shows the drone’s point-of-view, which can clearly distinguish pools of water, containers, and unsealed structures that are associated with mosquitoes. The drone data is more effective at distinguishing hotspots because of its ability to see small features under trees and around buildings. We concluded drones have significant potential for public health applications.

Project Premonition: Grenada from satellite

Grenada: Satellite view shows most places appear as possible hotspots (green)

Project Premonition: Grenada from drone

Grenada: Drone view shows standing water and structures that can be directly observed
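For readers unfamiliar with NDVI, the index is computed per pixel from near-infrared and red reflectance as (NIR - Red) / (NIR + Red). The sketch below is only an illustration of that standard formula, not the team's processing code; the function name and the reflectance values are assumptions chosen for the example.

```python
def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index for a single pixel.

    nir and red are surface reflectances in [0, 1]. The result lies in
    [-1, 1]: dense green vegetation scores high, while water and bare
    surfaces score near zero or negative.
    """
    total = nir + red
    if total == 0:
        return 0.0  # avoid division by zero on a fully dark pixel
    return (nir - red) / total


# Hypothetical (nir, red) reflectance pairs, for illustration only.
pixels = {
    "dense canopy": (0.50, 0.05),
    "standing water": (0.02, 0.04),
    "bare roof": (0.30, 0.25),
}

for label, (nir, red) in pixels.items():
    print(f"{label}: NDVI = {ndvi(nir, red):.2f}")
```

A pixel covered by canopy scores high regardless of what sits underneath it, which is consistent with the observation above that a coarse vegetation index alone cannot separate hotspots from non-hotspots.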

Project Premonition aims to perform these tasks safely and at scale, which means drones that can autonomously navigate complex environments to search for hotspots and eventually retrieve specimens on their own. We are investigating new engineering methods to give systems such as drones more autonomous capabilities while ensuring they behave correctly. For instance, Microsoft’s protocol programming language P is being used to build safer flight logic from scratch, and Microsoft’s Corral code analyzer is being used to find potential catastrophic failures in existing drone code. A recent National Science Foundation student competition challenged teams of undergraduate students to build drones that could retrieve mosquito specimens, helping the next generation of engineers learn how to build autonomous systems.

See more at: P Language (GitHub), Corral (GitHub), NSF Student Competition (YouTube)


Collect mosquitoes with robots


Robotic mosquito traps that identify and capture interesting mosquitoes in milliseconds

Project Premonition requires lots of interesting mosquitoes, but this is easier said than done. Existing mosquito traps can’t distinguish mosquitoes from other insects, so entomologists must process the insects collected from every trap. This is a skilled and labor-intensive process.

Project Premonition redesigned the mosquito trap to be robotic and smart. It consists of 64 smart cells, each of which monitors the insects flying into it. If the wing movements of an insect match those of an interesting mosquito, the cell can close a door, capturing that insect and tagging it with key environmental data including time, temperature and light levels. The trap can learn from its mistakes to become more efficient, and it is designed to run for more than 20 hours in hot and humid environments.
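The source does not describe how a cell decides that wing movements match. As a rough sketch of one way such a decision could work, the code below estimates a wing-beat frequency from sensor samples by counting upward zero crossings and checks it against a target band. The sensor signal, the 8 kHz sample rate, the 400 to 600 Hz band, and all function names are assumptions made for illustration, not details of the actual trap.

```python
import math


def estimate_frequency_hz(samples: list[float], sample_rate_hz: float) -> float:
    """Estimate the dominant frequency of a roughly periodic signal.

    Counts upward zero crossings and divides the number of full cycles
    by the time between the first and last crossing.
    """
    crossings = [
        i for i in range(1, len(samples))
        if samples[i - 1] < 0 <= samples[i]
    ]
    if len(crossings) < 2:
        return 0.0  # not enough cycles observed to estimate anything
    cycles = len(crossings) - 1
    duration_s = (crossings[-1] - crossings[0]) / sample_rate_hz
    return cycles / duration_s


def should_capture(samples, sample_rate_hz, band_hz=(400.0, 600.0)) -> bool:
    """Return True if the estimated wing-beat frequency is in the band.

    band_hz is a hypothetical target range, not a measured species profile.
    """
    freq = estimate_frequency_hz(samples, sample_rate_hz)
    return band_hz[0] <= freq <= band_hz[1]


def make_tone(freq_hz, sample_rate_hz, n_samples, phase=0.3):
    """Synthetic stand-in for a sensor trace: a pure sine wave."""
    return [
        math.sin(2 * math.pi * freq_hz * i / sample_rate_hz + phase)
        for i in range(n_samples)
    ]


target = make_tone(500.0, 8000.0, 400)   # inside the assumed band
other = make_tone(150.0, 8000.0, 400)    # outside the assumed band

print(should_capture(target, 8000.0))
print(should_capture(other, 8000.0))
```

A real sensor trace would be noisy, so a practical version would likely need filtering or a spectral estimate rather than raw zero crossings.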

The Project Premonition robotic trap was deployed in Houston, TX in collaboration with Harris County Public Health Mosquito and Vector Disease Control Division during the summer of 2016. In 87 experiments it captured an unprecedented 20 GB of data about mosquito behavior, including the behavior of mosquitoes that carry Zika. It detected over 22,000 mosquito encounters and has been tested on 9 species of mosquitoes, including those that carry Zika, Dengue, West Nile, and Malaria.

Project Premonition: Data analytics

Data analytics using the Microsoft Cloud and R for understanding behavior

Project Premonition: Captured mosquitoes

Captured mosquitoes are high quality and suitable for laboratory analysis

In collaboration with University of California Riverside, fine-grained data collected from the Project Premonition trap has been used to evaluate the effects of chemical lures on mosquitoes. Epidemiologists at Vanderbilt University are studying how to improve models of mosquito-borne diseases by incorporating the big data collected by the system. Scalable cloud-based data analytics and machine learning with an R interface running in the Microsoft Cloud enable these studies.

See more at: CNN, Today Show, GeekWire


Detect pathogens by gene sequencing


Cloud-scale algorithms that detect known and unknown pathogens

Some emerging infectious diseases are caused by pathogens that were known to exist, but were not regularly tested for in labs, e.g. Zika and Ebola. Others are caused by pathogens that were previously unknown to science, e.g. SARS and MERS. These properties make emerging diseases very difficult to detect early using traditional techniques.

In order to detect all potential pathogens and their hosts with a single test, we are developing algorithms that analyze the genetic material obtained from mosquitoes. Gene sequencing converts mosquitoes into hundreds of gigabytes of genetic data without focusing on a specific set of pathogens. These data can tell us about the species of mosquitoes collected, the animals they have bitten, and the pathogens they may have encountered. However, new algorithms must be developed to quickly search for viruses and microbes, which are needles in this haystack of data.

The Project Premonition metagenomics pipeline reconstructs the mixture of organisms present in a sequenced biological sample. As a starting point, it uses a reference library of over 250,000 genomes (over 1 trillion base pairs) drawn from public sequence databases, covering all types of organisms from viruses and bacteria to reptiles and mammals. Using the SNAP alignment tool, small parts of the input genetic sequences (called reads) are soft-aligned against all reference genomes with weak thresholds, which identifies every plausible organism that may have contributed genetic material. The inevitable ambiguities that arise from the close genetic similarity among species are then resolved using a mixture model approach, which shares information among reads and across the tree of life, guiding the algorithm to the best-supported assignments. (The SNAP aligner was developed in a collaboration between Microsoft Research, University of California Berkeley, and University of California San Francisco.)
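To make the mixture-model idea concrete, here is a minimal sketch of expectation-maximization over mixing proportions, where each read carries soft-alignment likelihoods against several candidate organisms. It is a simplified stand-in: it does not share information across the tree of life as the pipeline is described as doing, and the organism names, likelihood values, and function name are all hypothetical.

```python
def estimate_mixture(read_likelihoods, organisms, iterations=50):
    """Estimate organism proportions from ambiguous read assignments.

    read_likelihoods: list of dicts mapping organism -> likelihood that
    the read came from that organism; missing organisms count as 0.
    Returns a dict mapping organism -> estimated share of the reads.
    """
    weights = {org: 1.0 / len(organisms) for org in organisms}
    for _ in range(iterations):
        totals = {org: 0.0 for org in organisms}
        for lik in read_likelihoods:
            # E-step: split this read among organisms in proportion to
            # current weight times alignment likelihood.
            joint = {org: weights[org] * lik.get(org, 0.0) for org in organisms}
            norm = sum(joint.values())
            if norm == 0:
                continue  # read matches nothing in the reference set
            for org in organisms:
                totals[org] += joint[org] / norm
        assigned = sum(totals.values())
        if assigned == 0:
            return weights
        # M-step: new weights are the shares of assigned reads.
        weights = {org: totals[org] / assigned for org in organisms}
    return weights


organisms = ["virus_A", "virus_B", "host_chicken"]
reads = (
    [{"virus_A": 0.9}] * 6                      # reads unique to virus_A
    + [{"virus_A": 0.5, "virus_B": 0.5}] * 10   # shared by close relatives
    + [{"host_chicken": 0.8}] * 4               # host reads
)

mixture = estimate_mixture(reads, organisms)
for org, share in mixture.items():
    print(f"{org}: {share:.3f}")
```

In this toy input, virus_B has no reads of its own, so the reads it shares with virus_A end up attributed almost entirely to virus_A. This is the sense in which evidence from unambiguous reads helps resolve ambiguous ones.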

Project Premonition: Genetic mixture

Genetic mixture is reconstructed and can be easily explored

designed mosquitoes

Mosquitoes with known viruses are prepared at Johns Hopkins University to calibrate the pipeline

To guide development, the system includes a “mosquito synthesizer” that can simulate the genetic material that would be obtained from a mosquito that has fed on a particular animal harboring a set of viruses and bacteria. On these synthetic experiments, the pipeline reconstructs the correct mixture of organisms with 99.99% accuracy, provided the correct genomes have been previously sequenced and are in the reference database. More importantly, sharing information across the taxonomy allows us to detect the correct genus and family of novel species and genera, respectively, with >90% accuracy. In collaboration with Johns Hopkins University and University of Pittsburgh, the pipeline is being evaluated and refined on laboratory and wild-caught mosquitoes.
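The source does not say how the synthesizer is built. As a hedged illustration of the general idea, the sketch below draws short reads from a set of genomes in chosen proportions and keeps the true source of each read, so that a pipeline's output could be scored against known ground truth. The toy genomes are random strings, and the proportions, read length, and function names are assumptions.

```python
import random


def synthesize_reads(genomes, proportions, n_reads, read_len, seed=0):
    """Sample labeled reads from a mixture of genomes.

    genomes: dict mapping organism name -> sequence string.
    proportions: dict mapping organism name -> relative share of reads.
    Returns a list of (source_name, read_sequence) tuples.
    """
    rng = random.Random(seed)
    names = list(genomes)
    weights = [proportions[name] for name in names]
    reads = []
    for _ in range(n_reads):
        # Pick the source organism, then a random window in its genome.
        source = rng.choices(names, weights=weights, k=1)[0]
        genome = genomes[source]
        start = rng.randrange(len(genome) - read_len + 1)
        reads.append((source, genome[start:start + read_len]))
    return reads


def random_genome(length, rng):
    """Toy genome: a uniformly random string over A, C, G, T."""
    return "".join(rng.choice("ACGT") for _ in range(length))


toy_rng = random.Random(42)
toy_genomes = {
    "mosquito": random_genome(500, toy_rng),
    "host_animal": random_genome(500, toy_rng),
    "virus": random_genome(200, toy_rng),
}
# Hypothetical shares: mostly mosquito, some blood meal, a trace of virus.
toy_proportions = {"mosquito": 0.90, "host_animal": 0.09, "virus": 0.01}

reads = synthesize_reads(toy_genomes, toy_proportions, n_reads=1000, read_len=50)
counts = {name: 0 for name in toy_genomes}
for source, _ in reads:
    counts[source] += 1
print(counts)
```

A realistic simulator would also model sequencing errors and real genome sizes, which this sketch omits.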

Inspired by the challenge mosquito metagenomics presents, our aim is to develop a generic drag-and-drop metagenomics pipeline, which makes no assumptions about the source of the genetic material, yet can accurately reconstruct the mixture of source organisms. We have begun testing this pipeline on DNA and RNA data from a range of tissues and hosts, including mosquitoes, ticks, birds, and humans. This pipeline runs end-to-end in the cloud in <12 hours for DNA/RNA samples with 100 million short reads. We are currently developing an online service to expose this pipeline.


