Tie-Yan Liu is an assistant managing director of Microsoft Research Asia, leading the machine learning research area.
As a researcher in an industrial lab, Tie-Yan contributes to both industry and academia. On one hand, many of his technologies have been transferred to Microsoft’s products and online services, such as Bing, Microsoft Advertising, Windows, Xbox, and Azure. He has received many recognitions and awards within Microsoft for his significant product impact. On the other hand, he has been actively contributing to the academic community. He is an adjunct professor at CMU and several universities in China, and an honorary professor at Nottingham University. He is frequently invited to chair or give keynote speeches at major machine learning and information retrieval conferences. He is a fellow of the IEEE and a distinguished member of the ACM.
Tie-Yan’s seminal contribution to the field of learning to rank has been widely recognized (https://en.wikipedia.org/wiki/Learning_to_rank). He invented several highly impactful algorithms and theories, including the listwise approach to ranking, relational ranking, and statistical learning theory for ranking. He is an advocate of learning to rank as a self-contained research discipline: he gave the first batch of keynote speeches and tutorials, organized the first series of workshops, and wrote the very first book on this topic (among the top 10 Springer computer science books written by Chinese authors). He is the creator of the LETOR benchmark dataset (https://www.microsoft.com/en-us/research/project/letor-learning-rank-information-retrieval), which has become a must-have experimental platform for research on learning to rank. Through his research and community-building efforts, learning to rank has become a fundamental technology in major search engines today, and it continues to be one of the most important directions in the related research communities.
Tie-Yan has done impactful work on large-scale machine learning. As early as 2005, Tie-Yan developed the largest text classifier in the world, which could classify text into more than 250,000 categories on 20 machines (published at SigKDD Explorations). More recently, Tie-Yan and his team developed many other large-scale machine learning tools, including the fastest and largest topic model in the world (LightLDA, with one million topics, published at WWW 2015), the largest word embedding model, and the most efficient and scalable recurrent neural network (LightRNN, published at NIPS 2016) and GBDT (LightGBM, published at NIPS 2016 and 2017) models. Some of these models were open-sourced through the Microsoft Distributed Machine Learning Toolkit (http://www.dmtk.io/), which has attracted millions of visitors, hundreds of thousands of downloads, and tens of thousands of stars on GitHub. The Multiverso parameter server developed by his team has also served as the distributed computational engine in the Microsoft Cognitive Toolkit (CNTK).
Recently, Tie-Yan has done advanced research on deep learning and reinforcement learning. In particular, he and his team have proposed a few new machine learning concepts, such as dual learning and learning to teach. Dual learning leverages the structural duality of AI tasks to enable effective learning even when training data are insufficient. Together with other innovations, including deliberation networks, dual learning has achieved the best performance in many machine translation tasks (including human parity in Chinese-to-English news translation), and won first place in 8 tasks in the objective evaluations of WMT 2019. Learning to teach goes beyond traditional machine learning and uses reinforcement learning to automate the data selection, loss function selection, and hypothesis space selection of machine learning tasks. It enlarges the scope of classical machine learning and has achieved state-of-the-art results in many tasks. This work was published at NIPS, ICML, and ICLR, and attracted a lot of attention from the research community.
Tie-Yan has also conducted innovative research on algorithmic game theory. For example, in order to bridge theory and practice, he introduced many practical constraints into auction mechanism design (e.g., bounded rationality, budget constraints), and proposed a data-driven framework called “game-theoretical machine learning” for auction optimization. This framework learns the bounded rationality model from data, and optimizes the auction parameters based on the learned model using simulation. The framework extends algorithmic game theory by introducing data, and extends machine learning by considering the strategic (non-i.i.d.) behaviors behind data generation. He and his team have applied their AI and game-theoretic technologies to bring digital transformation to many traditional industries, including finance and logistics.
Over the years, Tie-Yan and his team have been recognized as one of the global powerhouses and trendsetters in machine learning and information retrieval. He and his team have contributed hundreds of high-impact papers at top conferences, a good indicator of their influence and impact. His top five papers have been cited about 5000 times in refereed conferences and journals. He has won quite a few awards, including the best student paper award at SIGIR (2008) and ACML (2018), the most cited paper award at the Journal of Visual Communication and Image Representation (2004-2006), the research breakthrough award at Microsoft Research (2012), and the most cited Chinese researcher (2017, 2018). He has been invited to serve as general chair, PC chair, or area chair for a dozen top conferences including WWW/WebConf, SIGIR, NIPS, IJCAI, AAAI, KDD, ACL, and ICTIR, as well as associate editor or editorial board member of ACM Transactions on Information Systems, ACM Transactions on Web, Information Retrieval Journal, Neurocomputing, and Foundations and Trends in Information Retrieval. Tie-Yan Liu and his work have been covered by many international media outlets, including National Public Radio, CNET, MIT Technology Review, and PCTech Magazine.
We published a new book on “Distributed Machine Learning” in China. Please check it out!
We established strategic research collaborations with China Asset Management (华夏基金) and China Taiping on AI + Investment, and with Orient Overseas Container Line Limited (OOCL) and SF-Express on AI + Logistics.
Dataset and Toolkit Release
Microsoft Graph Engine (https://www.graphengine.io/), 2016 – the most powerful graph engine in the world!
Microsoft Distributed Machine Learning Toolkit (http://www.dmtk.io/), 2015 – attracted millions of page views, hundreds of thousands of downloads, and thousands of stars on GitHub; includes record-setting machine learning algorithms like LightLDA and LightGBM.
LETOR Benchmark Dataset for Learning to Rank (https://www.microsoft.com/en-us/research/project/letor-learning-rank-information-retrieval/), 2007 – A must-have experimental platform for research on learning to rank. According to incomplete statistics, more than half of the papers on learning to rank published at major conferences and journals have used this dataset for their evaluations in the past ten years.
We Are Hiring!
We are hiring at all levels (especially senior researchers)! If your major is machine learning (especially deep learning and distributed machine learning), and you have the passion to change the world, please send your resume to firstname.lastname@example.org.