About

Medicine today is imprecise. For the top 20 prescription drugs in the U.S., 80% of patients are non-responders. Recent disruptions in sensor technology have enabled precise categorization of diseases and treatment effects. For example, sequencing technology has reached the exciting disruption point of a $1000 genome per person. However, progress in precision medicine is difficult, as genome-scale knowledge and reasoning become the ultimate bottlenecks in deciphering cancer and other complex diseases. Today, it takes hours for a molecular tumor board of 10-20 highly trained specialists to review one patient’s omics data and make treatment decisions. With 1.6 million new cancer cases and 600 thousand deaths in the U.S. each year, this is clearly not scalable.

My research interest can be summed up as developing “AI for Precision Medicine” to overcome these bottlenecks. For example, machine reading automates extracting knowledge from biomedical literature and converting free-text clinical notes into structured databases, whereas sophisticated machine learning methods can integrate rich prior knowledge with experimental data, for personalized cancer treatment and chronic disease management.

I have given invited talks at various places, including UIUC, the J. Craig Venter Institute, the University of Colorado at Denver, the University of Maryland, Johns Hopkins, the University of Massachusetts, MIT, and the University of Washington. Here are the slides for an MIT talk in 2015 (thanks to Regina Barzilay for inviting me), and the video for a talk at NIPS.

In our most recent project, Project Hanover, we focus on three interwoven agendas:

  • Machine reading: Develop information extraction methods that do not require annotated examples, by leveraging prior knowledge and other available structured resources.
  • Cancer decision support: Develop machine learning methods to integrate genomics knowledge with experimental data, for personalizing drug combinations in Acute Myeloid Leukemia (AML), where treatment hasn’t improved in the past three decades. We are collaborating with the Knight Cancer Institute, a pioneer in cancer precision medicine.
  • Chronic disease management: Develop machine learning methods for modeling chronic disease progression, based on EMRs and other health sensor data.

Results from our machine reading work can be found in Literome, an Azure-based cloud service for knowledge extraction from PubMed. Currently, it focuses on two types of knowledge especially pertinent to genomic medicine: gene-gene interactions (as in biological pathways) and genotype-phenotype associations, such as single nucleotide polymorphism (SNP) vs. disease predisposition or drug reaction [Bioinformatics Paper].

I’m excited to participate in the DARPA Program on automating the construction of “Big Mechanisms” for cancer systems biology by reading literature, integrating ontologies and knowledgebases, and deciphering experimental data. I am a co-PI in a team led by Andrey Rzhetsky.

My past work has been recognized with Best Paper Awards at top NLP and machine learning conferences such as NAACL, EMNLP, and UAI.

I spent some truly amazing years in the Department of Computer Science and Engineering at the University of Washington.
My Ph.D. advisor is Pedro Domingos.
My dissertation is: Markov Logic for Machine Reading.

For more information, check out my publications and LinkedIn profile.

Select Press Coverage: Bloomberg Technology, Microsoft News, Verge, ZDNet, eWeek, Puget Sound Business Journal, SWE Magazine cover story.

Select Publications

  • Wide-Open: accelerating public data release by automating detection of overdue datasets.  
    Maxim Grechkin, Hoifung Poon, and Bill Howe.
    In PLOS Biology, to appear.
  • Cross-Sentence N-ary Relation Extraction with Graph LSTMs.   [Paper]
    Nanyun Peng, Hoifung Poon, Chris Quirk, Kristina Toutanova, and Scott Yih.
    In Transactions of the Association for Computational Linguistics (TACL), 2017.
  • Distant Supervision for Relation Extraction beyond the Sentence Boundary.   [Paper]
    Chris Quirk and Hoifung Poon.
    In Proceedings of the Fifteenth Conference of the European Association for Computational Linguistics (EACL), 2017.
  • Compositional Learning of Embeddings for Relation Paths in Knowledge Bases and Text.   [Paper]
    Kristina Toutanova, Xi Victoria Lin, Wen-Tau Yih, Hoifung Poon, and Chris Quirk.
    In Proceedings of the Fifty-Fourth Annual Meeting of the Association for Computational Linguistics (ACL), 2016.
  • Representing Text for Joint Embedding of Text and Knowledge Bases.   [Paper]
    Kristina Toutanova, Danqi Chen, Patrick Pantel, Hoifung Poon, Pallavi Choudhury, and Michael Gamon.
    In Proceedings of the Annual Conference of Empirical Methods in Natural Language Processing (EMNLP), 2015.
  • Model Selection for Type-Supervised Learning with application to POS Tagging.   [Paper]
    Kristina Toutanova, Waleed Ammar, Pallavi Choudhury, and Hoifung Poon.
    In Proceedings of the SIGNLL Conference on Computational Natural Language Learning (CoNLL), 2015.
  • Grounded Semantic Parsing for Complex Knowledge Extraction.   [Paper]
    Ankur Parikh, Hoifung Poon, and Kristina Toutanova.
    In Proceedings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2015.
  • Distant Supervision for Cancer Pathway Extraction from Text.   [Paper]
    Hoifung Poon, Kristina Toutanova, and Chris Quirk.
    In Proceedings of the Pacific Symposium on Biocomputing, 2015.
  • Literome: PubMed-Scale Genomic Knowledge Base in the Cloud.   [Paper]
    Hoifung Poon, Chris Quirk, Charlie DeZiel, and David Heckerman.
    Bioinformatics 2014; doi: 10.1093/bioinformatics/btu383
  • Grounded Unsupervised Semantic Parsing.   [Paper]
    Hoifung Poon.
    In Proceedings of the Fifty-First Annual Meeting of the Association for Computational Linguistics (ACL), 2013.
  • Probabilistic Frame Induction.   [Paper]  
    Jackie Cheung, Hoifung Poon and Lucy Vanderwende.
    In Proceedings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2013.
  • An Exhaustive Epistatic SNP Association Analysis on Expanded Wellcome Trust Data.   [Paper]  
    Christoph Lippert, Jennifer Listgarten, Robert Davidson, Scott Baxter, Hoifung Poon, Carl M. Kadie, David Heckerman.
    In Scientific Reports, 2013, doi:10.1038/srep01099.
  • Sum-Product Networks: A New Deep Architecture.   [Paper]   [Slides]   [Download code and results]
    Hoifung Poon and Pedro Domingos.
    In Proceedings of the Twenty-Seventh Conference on Uncertainty in Artificial Intelligence (UAI), 2011.
    Best Paper Award
  • Unsupervised Ontology Induction from Text.   [Paper]
    Hoifung Poon and Pedro Domingos.
    In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), 2010.
  • Joint Inference for Knowledge Extraction from Biomedical Literature.   [Paper]
    Hoifung Poon and Lucy Vanderwende.
    In Proceedings of the North American Chapter of the Association for Computational Linguistics – Human Language Technologies Conference (NAACL-HLT), 2010.
  • Unsupervised Semantic Parsing.   [Paper]   [Slides]   [Download data and code]
    Hoifung Poon and Pedro Domingos.
    In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2009.
    Best Paper Award
  • Unsupervised Morphological Segmentation with Log-Linear Models.   [Paper]
    Hoifung Poon, Colin Cherry, and Kristina Toutanova.
    In Proceedings of the North American Chapter of the Association for Computational Linguistics – Human Language Technologies Conference (NAACL-HLT), 2009.
    Best Paper Award
  • Language ID in the Context of Harvesting Language Data off the Web.   [Paper]
    Fei Xia, William Lewis, and Hoifung Poon.
    In Proceedings of the Conference of European Association for Computational Linguistics (EACL), 2009.
  • Joint Unsupervised Coreference Resolution with Markov Logic.   [Paper]
    Hoifung Poon and Pedro Domingos.
    In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2008.
  • A General Method for Reducing the Complexity of Relational Inference and its Application to MCMC.   [Paper]
    Hoifung Poon, Pedro Domingos, and Marc Sumner.
    In Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence (AAAI), 2008.
  • Markov Logic.   [Book Chapter]
    Pedro Domingos, Stanley Kok, Daniel Lowd, Hoifung Poon, Matthew Richardson, Parag Singla.
    In L. De Raedt, P. Frasconi, K. Kersting and S. Muggleton (eds.), Probabilistic Inductive Logic Programming, 2008.
  • Joint Inference in Information Extraction.   [Paper]   [Online Appendix]
    Hoifung Poon and Pedro Domingos.
    In Proceedings of the Twenty-Second National Conference on Artificial Intelligence (AAAI), 2007.
  • Sound and Efficient Inference with Probabilistic and Deterministic Dependencies.   [Paper]
    Hoifung Poon and Pedro Domingos.
    In Proceedings of the Twenty-First National Conference on Artificial Intelligence (AAAI), 2006.
  • Unifying Logical and Statistical AI.   [Paper]
    Pedro Domingos, Stanley Kok, Hoifung Poon, Matthew Richardson, Parag Singla.
    In Proceedings of the Twenty-First National Conference on Artificial Intelligence (AAAI), 2006.
    Invited paper.
