Tie-Yan Liu (刘铁岩) is an assistant managing director of Microsoft Research Asia (微软亚洲研究院副院长), leading the machine learning research area. He is a fellow of the IEEE and a distinguished member of the ACM. He is an adjunct professor at Carnegie Mellon University (CMU), Tsinghua University (THU), Nankai University, and the University of Science and Technology of China (USTC), and an honorary professor at the University of Nottingham.
As a researcher in an industrial lab, Tie-Yan contributes to both industry and academia. On the one hand, many of his technologies have been transferred to Microsoft’s products and online services, such as Bing, Microsoft Advertising, Windows, Xbox, and Azure. On the other hand, he has been actively contributing to the academic community.
Tie-Yan’s seminal contribution to the field of learning to rank has been widely recognized (https://en.wikipedia.org/wiki/Learning_to_rank). He invented several highly impactful algorithms and theories, including the listwise approach to ranking, relational ranking, and statistical learning theory for ranking. He is an advocate of learning to rank as a self-contained research discipline: he gave the first batch of keynote speeches and tutorials, organized the first series of workshops, and wrote the very first book on this topic (among the top-10 Springer computer science books written by Chinese authors). He is the creator of the LETOR benchmark dataset (https://www.microsoft.com/en-us/research/project/letor-learning-rank-information-retrieval), which has become a must-have experimental platform for research on learning to rank. Through his research and community efforts, learning to rank has become a fundamental technology in major search engines today, and it continues to be one of the most important directions in the related research communities.
Tie-Yan has done impactful work on scalable and efficient machine learning. As early as 2005, Tie-Yan developed the largest text classifier in the world, which can classify documents into over 250,000 categories of the Yahoo! taxonomy using 20 machines (published in SIGKDD Explorations). More recently, Tie-Yan and his team developed many other scalable and efficient machine learning tools, including the fastest and largest topic model in the world (LightLDA, with one million topics, published at WWW 2015), the most efficient and scalable recurrent neural network (LightRNN, published at NIPS 2016) and GBDT model (LightGBM, published at NeurIPS 2016 and 2017), and the fastest neural TTS engine (FastSpeech, published at NeurIPS 2019). Some of these models were open-sourced through the Microsoft Distributed Machine Learning Toolkit (http://www.dmtk.io/), which has attracted millions of visitors, hundreds of thousands of downloads, and tens of thousands of stars on GitHub.
Recently, Tie-Yan has done advanced research on deep learning and reinforcement learning. In particular, he and his team have proposed several new machine learning concepts, such as dual learning, learning to teach, and deliberation learning. Dual learning leverages the structural duality of AI tasks to enable effective learning even when training data are insufficient. Together with other innovations, including deliberation networks, dual learning has achieved the best performance in many tasks (including human parity in Chinese-to-English news translation, and first place in 8 tasks of WMT 2019). Learning to teach goes beyond traditional machine learning and uses reinforcement learning to automate the data selection, loss function selection, and hypothesis space selection of machine learning tasks. It enlarges the scope of classical machine learning and has achieved state-of-the-art results in many tasks. These works were published at NeurIPS, ICML, and ICLR, and have attracted considerable attention from the research community. In addition, Tie-Yan’s team built the world’s best Mahjong AI, named Suphx, which achieved 10 dan on the Tenhou Mahjong platform in mid-2019.
Tie-Yan has also conducted innovative research on algorithmic game theory. For example, in order to bridge theory and practice, he introduced many practical constraints into auction mechanism design (e.g., bounded rationality, budget constraints), and proposed a data-driven framework called “game-theoretical machine learning” for auction optimization. This framework learns the bounded rationality model from data, and then optimizes the auction parameters against the learned model through simulation. It extends algorithmic game theory by introducing data, and extends machine learning by considering the strategic (non-i.i.d.) behaviors behind data generation.
Over the years, Tie-Yan and his team have been recognized as one of the global powerhouses and trendsetters in machine learning and information retrieval. He and his team have contributed hundreds of high-impact papers at top conferences. His top five papers have been cited nearly 7,000 times in refereed conferences and journals. He has won many awards, including the best student paper award at SIGIR (2008) and ACML (2018), the most cited paper award at the Journal of Visual Communication and Image Representation (2004-2006), the research breakthrough award at Microsoft Research (2012), the most cited Chinese researcher (2017-2019), the China AI Leader Award – Technical Innovation (2018), and the Most Influential Scholar Award by AMiner (2007-2017). He has been invited to serve as general chair, PC chair, or area chair for a dozen top conferences, including WWW/WebConf, SIGIR, NIPS, ICLR, IJCAI, AAAI, KDD, ACL, and ICTIR, as well as associate editor or editorial board member of IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), ACM Transactions on Information Systems (TOIS), ACM Transactions on the Web (TWEB), Information Retrieval Journal, Neurocomputing, and Foundations and Trends in Information Retrieval.
Tie-Yan Liu and his work have been covered by many international media outlets, including National Public Radio, CNET, MIT Technology Review, and PCTech Magazine. Recently, he and his team have applied their AI technologies to the digital transformation of many traditional industries, including finance, logistics, healthcare, and sustainability. One of his recent works on digital transformation was selected among the “30 Best AI Use Cases of 2019” by Synced.
Dataset and Toolkit Releases
- LightGBM (https://github.com/Microsoft/LightGBM), 2017 – With 11K stars on GitHub, one of the most popular machine learning tools used in KDD Cup and Kaggle competitions.
- Microsoft Graph Engine (https://www.graphengine.io/), 2016 – the most powerful graph engine in the world!
- Microsoft Distributed Machine Learning Toolkit (http://www.dmtk.io/), 2015 – Attracted millions of page views, hundreds of thousands of downloads, and thousands of stars on GitHub; includes record-setting machine learning algorithms such as LightLDA.
- LETOR Benchmark Dataset for Learning to Rank (https://www.microsoft.com/en-us/research/project/letor-learning-rank-information-retrieval/), 2007 – A must-have experimental platform for research on learning to rank. According to incomplete statistics, more than half of the papers on learning to rank published at major conferences and journals have used this dataset for their evaluations in the past ten years.
We Are Hiring!
- We are hiring at all levels (especially senior researchers)! If your major is machine learning (especially deep learning and distributed machine learning), and you have the passion to change the world, please send your resume to firstname.lastname@example.org.