Algorithmic Crowdsourcing

Established: February 1, 2012

To build a machine learning based intelligent system, we often need to collect training labels and feed them into the system. A useful lesson in machine learning is that “more data beats a clever algorithm”. Nowadays, through a commercial crowdsourcing platform, we can easily collect a large number of labels at a cost of pennies per label.

However, the labels obtained from crowdsourcing may be highly noisy. Training a machine learning model with highly noisy labels can be misleading. This is widely known as “garbage in, garbage out”. There are two main reasons for label noise. One is that crowdsourcing workers may not have expertise in a labeling task, and the other is that crowdsourcing workers may have no incentive to produce high quality labels.

Our goal in this project is to develop principled inference algorithms and incentive mechanisms to guarantee high quality labels from crowdsourcing in practice.
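As a simple illustration of aggregating noisy crowdsourced labels, the sketch below uses plain majority voting over redundant labels per item. This is only a baseline, not the inference algorithms developed in this project; the item IDs and label values are hypothetical.

```python
from collections import Counter

# Hypothetical data: each item was labeled by several crowd workers.
labels = {
    "item1": ["cat", "cat", "dog"],
    "item2": ["dog", "dog", "dog"],
    "item3": ["cat", "dog", "dog", "cat", "dog"],
}

def majority_vote(worker_labels):
    """Return the most frequent label among the workers' answers."""
    counts = Counter(worker_labels)
    return counts.most_common(1)[0][0]

# Aggregate one consensus label per item.
aggregated = {item: majority_vote(ws) for item, ws in labels.items()}
print(aggregated)  # {'item1': 'cat', 'item2': 'dog', 'item3': 'dog'}
```

Majority voting treats all workers as equally reliable; principled approaches instead estimate each worker's reliability jointly with the true labels (e.g., via probabilistic models fit by EM) and weight votes accordingly.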

Contact person: Denny Zhou




John Platt

Principal Scientist



Xi Chen




Nihar Shah


UC Berkeley


Qiang Liu

Visiting Scholar



Chao Gao




Tengyu Ma

Visiting Scholar