
Linjun Shou (寿林钧)

Principal Applied Scientist Manager, Microsoft STCA


Publications

  • Zilin Xiao, Ming Gong, Jie Wu, Xingyao Zhang, Linjun Shou, Jian Pei, and Daxin Jiang. “Instructed Language Models with Retrievers Are Powerful Entity Linkers”. In: EMNLP. 2023.
  • Zilin Xiao, Linjun Shou, Xingyao Zhang, Jie Wu, Ming Gong, Jian Pei, and Daxin Jiang. “Coherent Entity Disambiguation via Modeling Topic and Categorical Dependency”. In: Findings of EMNLP. 2023.
  • Ning Wu, Ming Gong, Linjun Shou, Jian Pei, and Daxin Jiang. “RUEL: Retrieval-Augmented User Representation with Edge Browser Logs for Sequential Recommendation”. In: CIKM. 2023.
  • Nuo Chen, Linjun Shou, Tengtao Song, Ming Gong, Jian Pei, Jianhui Chang, Daxin Jiang, and Jia Li. “Structural Contrastive Pretraining for Cross-Lingual Comprehension”. In: ACL. 2023.
  • Zimeng Li, Bo Shao, Linjun Shou, Ming Gong, Gen Li, and Daxin Jiang. “WIERT: Web Information Extraction via Render Tree”. In: AAAI. 2023.
  • Nuo Chen, Linjun Shou, Ming Gong, Jian Pei, Bowen Cao, Jianhui Chang, Daxin Jiang, and Jia Li. “Alleviating Over-smoothing for Unsupervised Sentence Representation”. In: ACL. 2023.
  • Shengyao Zhuang, Linjun Shou, Jian Pei, Ming Gong, Houxing Ren, G. Zuccon, and Daxin Jiang. “Typos-aware Bottlenecked Pre-Training for Robust Dense Retrieval”. In: SIGIR. 2023.
  • Ning Wu, Ming Gong, Linjun Shou, Shining Liang, and Daxin Jiang. “Large Language Models are Diverse Role-Players for Summarization Evaluation”. In: Natural Language Processing and Chinese Computing. 2023.
  • Houxing Ren*, Linjun Shou, Ning Wu, Ming Gong and Daxin Jiang. Empowering Dual-Encoder with Query Generator for Cross-Lingual Dense Retrieval. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP’22), Abu Dhabi, December 7-11, 2022. (* intern I mentored)
  • Shining Liang*, Linjun Shou, Jian Pei, Ming Gong, W. Zuo, X. Zuo, and Daxin Jiang. “Label-aware Multi-level Contrastive Learning for Cross-lingual Spoken Language Understanding”. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP’22), Abu Dhabi, December 7-11, 2022. (* intern I mentored)
  • Houxing Ren*, Linjun Shou, Jian Pei, Ning Wu, Ming Gong, and Daxin Jiang. “Lexicon-Enhanced Self-Supervised Training for Multilingual Dense Retrieval”. In Findings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP’22), Abu Dhabi, December 7-11, 2022. (* intern I mentored)
  • Jianhuan Zhuo, Jianxun Lian, Lanling Xu, Ming Gong, Linjun Shou, Daxin Jiang, Xing Xie, Yinliang Yue. Tiger: Transferable Interest Graph Embedding for Domain-Level Zero-Shot Recommendation. CIKM 2022.
  • N. Chen*, L. Shou, M. Gong, J. Pei, and D. Jiang. “Bridging the Gap between Language Models and Cross-Lingual Sequence Labeling”. NAACL 2022. (* intern I mentored)
  • Ning Wu, Yaobo Liang, Houxing Ren, Linjun Shou, Nan Duan, Ming Gong, Daxin Jiang. Unsupervised Context Aware Sentence Representation Pretraining for Multi-lingual Dense Retrieval. IJCAI, 2022.
  • N. Chen*, Linjun Shou, M. Gong, and J. Pei. “From Good to Best: Two-Stage Training for Cross-lingual Machine Reading Comprehension”. AAAI 2022. (* intern I mentored).
  • Yingmei Guo*, Linjun Shou, Jian Pei, Ming Gong, Mingxing Xu, Zhiyong Wu and Daxin Jiang. Learning from Multiple Noisy Augmented Data Sets for Better Cross-Lingual Spoken Language Understanding. EMNLP 2021. (* intern I mentored).
  • Junjie Huang, Duyu Tang, Wanjun Zhong, Shuai Lu, Linjun Shou, Ming Gong, Daxin Jiang and Nan Duan. WhiteningBERT: An Easy Unsupervised Sentence Embedding Approach. Findings of EMNLP 2021.
  • Shuai Lu, Daya Guo, Shuo Ren, Junjie Huang, Alexey Svyatkovskiy, Ambrosio Blanco, Colin Clement, Dawn Drain, Daxin Jiang, Duyu Tang, Ge Li, Lidong Zhou, Linjun Shou, Long Zhou, Michele Tufano, Ming Gong, Ming Zhou, Nan Duan, Neel Sundaresan, Shao Kun Deng, Shengyu Fu, Shujie Liu. CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation. NeurIPS (Datasets and Benchmarks Track), 2021.
  • Linjun Shou, Ming Gong, Jian Pei, Xiubo Geng, Xingjie Zhou and Daxin Jiang. Language Scaling: Applications, Challenges and Approaches. KDD Tutorial, 2021.
  • S. Liang, M. Gong, J. Pei, L. Shou, W. Zuo, X. Zuo, and D. Jiang. “Reinforced Iterative Knowledge Distillation for Cross-Lingual Named Entity Recognition”. In Proceedings of the Twenty-seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’21), Singapore, August 14-18, 2021.
  • Junjie Huang, Duyu Tang, Linjun Shou, Ming Gong, Ke Xu, Daxin Jiang, Ming Zhou, Nan Duan. CoSQA: 20,000+ Web Queries for Code Search and Question Answering. ACL, 2021.
  • Junjie Huang, Duyu Tang, Wanjun Zhong, Shuai Lu, Linjun Shou, Ming Gong, Daxin Jiang, Nan Duan. WhiteningBERT: An Easy Unsupervised Sentence Embedding Approach. arXiv, 2021.
  • Zenan Xu, Daya Guo, Duyu Tang, Qinliang Su, Linjun Shou, Ming Gong, Wanjun Zhong, Xiaojun Quan, Nan Duan, Daxin Jiang. Syntax-Enhanced Pre-trained Model. ACL, 2021.
  • Han Wang, Yang Liu, Chenguang Zhu, Linjun Shou, Ming Gong, Yichong Xu, Michael Zeng. Retrieval Enhanced Model for Commonsense Generation. Findings of ACL, 2021.
  • Dayiheng Liu, Yu Yan, Yeyun Gong, Weizhen Qi, Hang Zhang, Jian Jiao, Weizhu Chen, Jie Fu, Linjun Shou, Ming Gong, Pengcheng Wang, Jiusheng Chen, Daxin Jiang, Jiancheng Lv, Ruofei Zhang, Winnie Wu, Ming Zhou, Nan Duan. GLGE: A New General Language Generation Evaluation Benchmark. Findings of ACL, 2021.
  • Linjun Shou, Ming Gong, Jian Pei, Xiubo Geng, Xingjie Zhou and Daxin Jiang. Scaling out NLP Applications to 100+ Languages. TheWebConf (WWW) Tutorial. 2021.
  • Junwei Liao, Yu Shi, Ming Gong, Linjun Shou, Sefik Eskimez, Liyang Lu, Hong Qu, Michael Zeng. Generating Human Readable Transcript for Automatic Speech Recognition with Pre-trained Language Model. ICASSP 2021.
  • Fei Yuan#, Linjun Shou, Jian Pei, Wutao Lin, Ming Gong, and Daxin Jiang (# intern I mentored). Reinforced Multi-Teacher Selection for Knowledge Distillation. AAAI 2021.
  • Shining Liang#, Linjun Shou, Jian Pei, Ming Gong, Wanli Zuo, and Daxin Jiang. CalibreNet: Calibration Networks for Multilingual Sequence Labeling. WSDM. 2021 (# intern I mentored).
  • Xingyao Zhang*, Linjun Shou*, Jian Pei, Ming Gong, Lijie Wen, and Daxin Jiang. A Graph Representation of Semi-structured Data for Web Question Answering. COLING. 2020 (* Equal contribution).
  • Junhao Liu#, Linjun Shou, Jian Pei, Ming Gong, Min Yang, and Daxin Jiang. Cross-lingual Machine Reading Comprehension with Language Branch Knowledge Distillation. COLING. 2020 (# intern I mentored).
  • Huaishao Luo, Yu Shi, Ming Gong, Linjun Shou, Tianrui Li. MaP: A Matrix-based Prediction Approach to Improve Span Extraction in Machine Reading Comprehension. AACL 2020.
  • Xuguang Wang, Linjun Shou, Ming Gong, Nan Duan, Daxin Jiang. No Answer is Better Than Wrong Answer: A Reflection Model for Document Level Machine Reading Comprehension. EMNLP/IJCNLP, 2020.
  • Yaobo Liang, Nan Duan, Yeyun Gong, Ning Wu, Fenfei Guo, Weizhen Qi, Ming Gong, Linjun Shou, Daxin Jiang, Guihong Cao, Xiaodong Fan, Bruce Zhang, Rahul Agrawal, Edward Cui, Sining Wei, Taroon Bharti, Jiun-Hung Chen, Winnie Wu, Shuguang Liu, Fan Yang, Ming Zhou. XGLUE: A New Benchmark Dataset for Cross-lingual Pre-training, Understanding and Generation. EMNLP/IJCNLP, 2020.
  • Martin Kuo, Yaobo Liang, Lei Ji, Nan Duan, Linjun Shou, Ming Gong, Peng Chen. Tag and Correct: Question aware Open Information Extraction with Two-stage Decoding. arXiv, 2020.
  • Linjun Shou, Shining Bo, Feixiang Cheng, Ming Gong, Jian Pei and Daxin Jiang. Mining Implicit Relevance Feedback from User Behavior for Web Question Answering. KDD, 2020. (Acceptance rate: 121/756)
  • Fei Yuan#, Linjun Shou, Xuanyu Bai, Ming Gong, Yaobo Liang, Nan Duan, Yan Fu, Daxin Jiang. Enhancing Answer Boundary Detection for Multilingual Machine Reading Comprehension. ACL, 2020. (# Intern I mentored)
  • Zhangyin Feng, Daya Guo, Duyu Tang, Nan Duan, Xiaocheng Feng, Ming Gong, Linjun Shou, Bing Qin, Ting Liu, Daxin Jiang, Ming Zhou. CodeBERT: A Pre-Trained Model for Programming and Natural Languages. EMNLP/IJCNLP, 2020.
  • Ze Yang*, Linjun Shou*, Ming Gong, Wutao Lin, Daxin Jiang. Model Compression with Two-stage Multi-teacher Knowledge Distillation for Web Question Answering System. WSDM, 2020. (* Equal contribution)
  • Daya Guo, Akari Asai, Duyu Tang, Nan Duan, Ming Gong, Linjun Shou, Daxin Jiang, Jian Yin, Ming Zhou. Inferential Text Generation with Multiple Knowledge Sources and Meta-Learning. arXiv, 2020.
  • Junwei Liao, Sefik Emre Eskimez, Liyang Lu, Yu Shi, Ming Gong, Linjun Shou, Qu Hong and Michael Zeng. Improving Readability for Automatic Speech Recognition Transcription. arXiv, 2020.
  • Wanjun Zhong, Duyu Tang, Zhangyin Feng, Nan Duan, Ming Zhou, Ming Gong, Linjun Shou, Daxin Jiang, Jiahai Wang, Jian Yin. LogicalFactChecker: Leveraging Logical Operations for Fact Checking with Graph Module Network. ACL, 2020.
  • Shangwen Lv, Daya Guo, Jingjing Xu, Duyu Tang, Nan Duan, Ming Gong, Linjun Shou, Daxin Jiang, Guihong Cao, Songlin Hu. Graph-based Reasoning over Heterogeneous External Knowledge for Commonsense Question Answering. AAAI, 2020.
  • Ze Yang, Pengfei Wang, Lei Zhang, Linjun Shou, Wenwen Xu: A Recurrent Attention Network for Judgment Prediction. ICANN (4) 2019.
  • Haoyang Huang, Yaobo Liang, Nan Duan, Ming Gong, Linjun Shou, Daxin Jiang, Ming Zhou. Unicoder: A Universal Language Encoder by Pre-training with Multiple Cross-lingual Tasks. EMNLP, 2019.
  • Ming Gong, Linjun Shou, Wutao Lin, Zhijie Sang, Quanjia Yan, Ze Yang, Feixiang Cheng, Daxin Jiang: NeuronBlocks: Building Your NLP DNN Models Like Playing Lego. EMNLP, 2019.
  • Ze Yang#, Linjun Shou, Ming Gong, Wutao Lin, Daxin Jiang. Model Compression with Multi-Task Knowledge Distillation for Web-scale Question Answering System. arXiv, 2019. (# Intern I mentored)