
Nan Duan

Lead Researcher


Hi there, I am Nan Duan (段楠).

I received my Ph.D. from Tianjin University in 2011, supervised by Dr. Ming Zhou.

I am a researcher in the Natural Language Computing group at Microsoft Research Asia.

I work on fundamental NLP tasks, such as open-domain question answering, semantic parsing, dialogue systems, and paraphrasing, to build AI products with massive user bases, such as Xiaoice, Cortana, and Bing.

I am organizing the largest Chinese QA evaluation tasks (http://tcci.ccf.org.cn/conference/2016/pages/page05_CFPTasks.html). This year, a total of 99 teams participated in our two QA tasks (DBQA and KBQA). The QA datasets can be found at https://github.com/nanduan/MSRA-Open-Domain-QA-Tasks.


#: students I mentored at MSRA


  • Zhao Yan#, Nan Duan, Peng Chen, Ming Zhou, Jianshe Zhou, Zhoujun Li, “Building Task-Oriented Dialogue Systems for Online Shopping”, AAAI, 2017.


  • Zhao Yan#, Nan Duan, Ming Zhou, Zhoujun Li, “An Open Domain Topic Prediction Model for Answer Selection”, NLPCC-ICCPOL, 2016.
  • Nan Duan, “Overview of the NLPCC-ICCPOL 2016 Shared Task: Open Domain QA”, NLPCC-ICCPOL, 2016.
  • Junwei Bao#, Nan Duan, Zhao Yan, Ming Zhou, Tiejun Zhao, “Constraint-Based Question Answering with Knowledge Graph”, COLING, 2016.
  • Zhao Yan#, Nan Duan, Junwei Bao, Peng Chen, Ming Zhou, Zhoujun Li, Jianshe Zhou, “DocChat: An Information Retrieval Approach for Chatbot Engines Using Unstructured Documents”, ACL, 2016.


  • Nan Duan, “Overview of the NLPCC 2015 Shared Task: Open Domain QA”, NLPCC, 2015.
  • Pengcheng Yin#, Nan Duan, Ben Kao, Junwei Bao, Ming Zhou, “Answering Questions with Complex Semantic Constraints on Open Knowledge Bases”, CIKM, 2015.


  • Min-Chul Yang#, Nan Duan, Ming Zhou, Hae-Chang Rim, “Joint Relational Embeddings for Knowledge-based Question Answering”, EMNLP, 2014.
  • Junwei Bao#, Nan Duan, Ming Zhou, Tiejun Zhao, “Knowledge-based Question Answering as Machine Translation”, ACL, 2014.


  • Nan Duan, “Trends in Search Technology as Seen from Graph Search”, Communications of the CCF, 2013. (Chinese)
  • Nan Duan, “Minimum Bayes Risk based Answer Re-ranking for Question Answering”, ACL, 2013.
  • Chenguang Wang#, Nan Duan, Ming Zhou, Ming Zhang, “Paraphrasing Adaptation for Web Search Ranking”, ACL, 2013.
  • Hong Sun#, Nan Duan, Yajuan Duan, Ming Zhou, “Answer Extraction from Passage Graph for Factoid Question Answering”, IJCAI, 2013.


  • Nan Duan, Mu Li, Ming Zhou, “Forced Derivation Tree based Model Training to Statistical Machine Translation”, EMNLP, 2012.
  • Nan Duan, “Consensus Decoding to Statistical Machine Translation”, Ph.D. thesis. (Chinese)


  • Nan Duan, Mu Li, Ming Zhou, “Improving Phrase Extraction via MBR Phrase Scoring and Pruning”, MT Summit XIII, 2011.
  • Nan Duan, Mu Li, Ming Zhou, “A Comparative Analysis of Consensus Decoding Methods for Statistical Machine Translation”, Journal of Chinese Information Processing, 2011. (Chinese)
  • Nan Duan, Mu Li, Ming Zhou, “Hypothesis Mixture Decoding for Statistical Machine Translation”, ACL, 2011.


  • Chi-Ho Li, Nan Duan, Yinggong Zhao, Shujie Liu, Lei Cui, Mei-yuh Hwang, Amittai Axelrod, Jianfeng Gao, Yaodong Zhang, Li Deng, “The MSRA Machine Translation System for IWSLT 2010”, IWSLT, 2010.
  • Nan Duan, Hong Sun, Ming Zhou, “Translation Model Generalization using Probability Averaging for Machine Translation”, COLING, 2010.
  • Nan Duan, Mu Li, Dongdong Zhang, Ming Zhou, “Mixture Model-based Minimum Bayes Risk Decoding using Multiple Machine Translation Systems”, COLING, 2010.


  • Nan Duan, Mu Li, Tong Xiao, Ming Zhou, “The Feature Subspace Method for SMT System Combination”, EMNLP, 2009.
  • Mu Li, Nan Duan, Dongdong Zhang, Chi-Ho Li, Ming Zhou, “Collaborative Decoding: Partial Hypothesis Re-ranking using Translation Consensus between Decoders”, ACL, 2009.
  • Dongdong Zhang, Chi-Ho Li, Nan Duan, Shujie Liu, Mu Li, Ming Zhou, “MSRA Technical Report for the 5th China Workshop on Machine Translation”, CWMT, 2009.


  • Dongdong Zhang, Mu Li, Nan Duan, Chi-Ho Li, Ming Zhou, “Measure Word Generation for English-Chinese SMT Systems”, ACL, 2008.


We built the two largest open-domain QA datasets for the Chinese language, which can be found at https://github.com/nanduan/MSRA-Open-Domain-QA-Tasks.

The first dataset is for the knowledge-based QA (KBQA) task, and the second is for the document-based QA (DBQA) task.

Both datasets are used in the evaluation tasks described at http://tcci.ccf.org.cn/conference/2016/pages/page05_CFPTasks.html.