Domain-specific language model pretraining for biomedical natural language processing
COVID-19 highlights a perennial problem facing scientists around the globe: how do we stay up to date with the cutting edge of scientific knowledge? In just a few months since the pandemic emerged, tens of…
Dataflow-Based Dialogue
Code and data for building dataflow-based conversational agents. Accompanying paper: Task-Oriented Dialogue as Dataflow Synthesis (TACL 2020).
KDD 2020 TrueFact Workshop: Making a Credible Web for Tomorrow
The second international TrueFact Workshop: Making a Credible Web for Tomorrow will provide a forum where researchers and practitioners from academia, government and industry can share insights and identify new challenges and opportunities in resolving…
Audio-based Toxic Language Detection
Online gaming has grown increasingly popular. This highly competitive online social platform can sometimes lead to undesired behavior and create an unfriendly community for many players. The detection of profanity and bullying has…
ZeRO & Fastest BERT: Increasing the scale and speed of deep learning training in DeepSpeed
The latest trend in AI is that larger natural language models provide better accuracy; however, larger models are difficult to train because of cost, time, and the difficulty of code integration. With the goal of advancing…