Named Entity Recognition

Established: January 1, 2013


=> Jan 2013 : Mar 2014 …
In collaboration with the Microsoft Office team, we built a Named Entity Recognition framework from Wikipedia text. The framework auto-labels Wikipedia pages into three classes: Persons, Locations, and Organisations. It then automatically extracts training data for a conditional random field (CRF) model. The framework was benchmarked against Stanford NER on multiple datasets, with the goal of surpassing it, and was then tested on five languages: English, French, German, Spanish, and Japanese.
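The training-data extraction step can be illustrated with a minimal sketch. It assumes that page-level labels are already available and that links inside a sentence point to labelled pages, so each linked span inherits its target page's class in BIO format. The names PAGE_LABELS and to_bio, the segment representation, and the sample sentence are hypothetical and not taken from the project itself.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical output of the page auto-labelling step: page title -> entity class.
PAGE_LABELS: Dict[str, str] = {
    "Barack Obama": "PER",
    "Chicago": "LOC",
    "Microsoft": "ORG",
}

# A sentence is modelled as segments of (text, link target); target is None for plain text.
Segment = Tuple[str, Optional[str]]


def to_bio(segments: List[Segment], page_labels: Dict[str, str]) -> List[Tuple[str, str]]:
    """Turn linked text segments into (token, BIO tag) rows for sequence-model training."""
    rows: List[Tuple[str, str]] = []
    for text, target in segments:
        # A span is an entity only if it links to a page with a known label.
        label = page_labels.get(target) if target is not None else None
        for i, token in enumerate(text.split()):
            if label is None:
                rows.append((token, "O"))
            else:
                prefix = "B" if i == 0 else "I"
                rows.append((token, f"{prefix}-{label}"))
    return rows


# Illustrative sentence only; whitespace tokenisation is an assumption of this sketch.
sentence: List[Segment] = [
    ("Barack Obama", "Barack Obama"),
    ("visited", None),
    ("Microsoft", "Microsoft"),
    ("in", None),
    ("Chicago", "Chicago"),
    (".", None),
]

for token, tag in to_bio(sentence, PAGE_LABELS):
    print(f"{token}\t{tag}")
```

Rows in this token/tag layout are the usual input to CRF toolkits. Whitespace tokenisation would not work for Japanese, so a language-specific tokeniser would be needed there.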