Few-shot Learning
Deep neural networks, including pre-trained language models like BERT, Turing-NLG, and GPT-3, require thousands of labeled training examples to obtain state-of-the-art performance on downstream tasks and applications. Such large numbers of labeled examples are difficult…
Knowledge Distillation
Modern machine learning applications have enjoyed a great boost from deep, large neural network models, which achieve state-of-the-art results on a wide range of tasks such as question-answering, conversational AI, search and…
Microsoft at ACL 2020
Microsoft is proud to be a Diversity and Inclusion Champion sponsor of the 58th Annual Meeting of the Association for Computational Linguistics (ACL), taking place July 5–10, 2020. See the details on our contributions to…