
Silviu Cucerzan

Principal Researcher

About

I am a researcher in the Machine Learning Department, affiliated with the Machine Learning and Optimization and the Natural Language Processing groups. I am currently on a two-year rotation in Bing Satori.

I joined Microsoft in 2003, after completing my Ph.D. work at Johns Hopkins University, where I was a member of the Center for Speech and Language Processing. I also received an M.S.E. in Computer Science from Johns Hopkins, as well as an M.Sc. and a B.Sc. in Mathematics and Computer Science from the University of Bucharest. Prior to coming to the United States, I held a junior faculty position in the Department of Mathematics and Computer Science at the University of Bucharest.

Research Interests

Information Extraction from Large Text Collections
Natural Language Processing
Knowledge Representation
Information Retrieval
Machine Learning

Influential Technologies Developed

Contextual Spelling Correction (How do you spell Levenshtein?)
Query Suggestion and Advertisement Smart Matching
Information-centric Browsing and Search (TechFest 2007/TechFair 2007)
Entity Linking (NEMO)
Fact Extraction from News and Web Text
InstaFact (Hackathon 2016)

Favorite quote
Samuel Goldwyn: “The harder I work, the luckier I get.”

Personal interests
Stephanie
Wildlife photography
Lego
Classical music
Jazz
Hiking
Board games
Writing

Other

Sergio Jimenez Vargas, Silviu Cucerzan, Fabio González, and Alexander Gelbukh
BM25-CTF: Improving TF and IDF factors in BM25 by using collection term frequencies
Language & Knowledge Engineering, Puebla, 2017

Ning Gao and Silviu Cucerzan
Entity Linking to One Thousand Knowledge Bases
European Conference on Information Retrieval, Aberdeen, 2017

C.J.C. Burges, T. Hart, Z. Yang, S. Cucerzan, R.W. White, A. Pastusiak, and J. Lewis
A Base Camp for Scaling AI
arXiv, 2016

Silviu Cucerzan
The MSR System for Entity Linking at TAC 2014
Text Analysis Conference, Gaithersburg, 2014
Best Accuracy and Best B-cubed+ F-measure for the EL task

Silviu Cucerzan
Named Entities Made Obvious: The Participation in the ERD 2014 Evaluation
Entity Recognition and Disambiguation Challenge, Gold Coast, 2014
Best system in ERD 2014 Long Text Track

Avirup Sil and Silviu Cucerzan
Temporal Scoping of Relational Facts based on Wikipedia Data
Computational Natural Language Learning (CoNLL), Baltimore, 2014

Silviu Cucerzan and Avirup Sil
The MSR Systems for Entity Linking and Temporal Slot Filling at TAC 2013
Text Analysis Conference, Gaithersburg, 2013
Entity Linking: Best Accuracy, Best B-cubed+ F-measure; Temporal Scoping: Best Score

Silviu Cucerzan
The MSR System for Entity Linking at TAC 2012
Text Analysis Conference, Gaithersburg, 2012
Best Accuracy, Best B-cubed+ F-measure

Chee Wee Leong and Silviu Cucerzan
Supporting Factual Statements with Evidence from the Web
The ACM Conference on Information and Knowledge Management (CIKM), Maui, 2012

Silviu Cucerzan
TAC Entity Linking by Performing Full-document Entity Extraction and Disambiguation
Text Analysis Conference, Gaithersburg, 2011
Best Accuracy, 2nd Best B-cubed+ F-measure

Niranjan Balasubramanian and Silviu Cucerzan
Beyond Ranked Lists in Web Search: Aggregating Web Content into Topic Pages
International Journal of Semantic Computing, 4 (4)
World Scientific Publishing Company, 2010

Niranjan Balasubramanian and Silviu Cucerzan
Topic Pages: An Alternative to the Ten Blue Links
IEEE International Conference on Semantic Computing (IEEE-ICSC), Pittsburgh, 2010

Silviu Cucerzan
Does Capitalization Matter in Web Search?
International Conference on Knowledge Discovery and Information Retrieval (KDIR), Valencia, 2010

Silviu Cucerzan
A Case Study of Using Web Search Statistics: Case Restoration
CICLing, Iasi, 2010
Lecture Notes in Computer Science, Volume 6008, Springer, 2010

Niranjan Balasubramanian and Silviu Cucerzan
Automatic Generation of Topic Pages using Query-based Aspect Models
The ACM Conference on Information and Knowledge Management (CIKM), Hong Kong, 2009

Mandar Rahurkar and Silviu Cucerzan
Using the Current Browsing Context to Improve Search Relevance
The ACM Conference on Information and Knowledge Management (CIKM), Napa Valley, 2008

Ryen W. White, Mikhail Bilenko, and Silviu Cucerzan
Leveraging Popular Destinations to Enhance Web Search Interaction
ACM Transactions on the Web, 2 (3)
ACM, 2008

Ziming Zhuang and Silviu Cucerzan
Exploiting Semantic Query Context to Improve Search Ranking
IEEE International Conference on Semantic Computing (IEEE-ICSC), Santa Clara, 2008
The Best Paper Award

Mandar Rahurkar and Silviu Cucerzan
Predicting when Browsing Context Is Relevant to Search
The 31st ACM SIGIR Conference, Singapore, 2008

Wisam Dakka and Silviu Cucerzan
Augmenting Wikipedia with Named Entity Tags
The Third International Joint Conference on Natural Language Processing (IJCNLP), Hyderabad, 2008

Alpa Jain, Silviu Cucerzan, and Saliha Azzam
Acronym-Expansion Recognition and Ranking on the Web
The IEEE Conference on Information Reuse and Integration (IEEE-IRI), Las Vegas, 2007

Ryen W. White, Mikhail Bilenko, and Silviu Cucerzan
Studying the Use of Popular Destinations to Enhance Web Search Interaction
The 30th ACM SIGIR Conference, Amsterdam, 2007
The Best Paper Award

Silviu Cucerzan and Ryen W. White
Query Suggestion Based on User Landing Pages
The 30th ACM SIGIR Conference, Amsterdam, 2007

Ryen W. White, Charles L. A. Clarke, and Silviu Cucerzan
Comparing Query Logs and Pseudo-Relevance Feedback for Web-Search Query Refinement
The 30th ACM SIGIR Conference, Amsterdam, 2007

Silviu Cucerzan
Large Scale Named Entity Disambiguation Based on Wikipedia Data
The EMNLP-CoNLL Joint Conference, Prague, 2007

Ziming Zhuang and Silviu Cucerzan
Re-ranking Search Results Using Query Logs
The ACM Conference on Information and Knowledge Management (CIKM), Washington, DC, 2006

Ziming Zhuang, Silviu Cucerzan, and C. Lee Giles
Network Flow for Collaborative Ranking
ECML/PKDD, Berlin, 2006
Lecture Notes in Computer Science, Volume 4213, Springer, 2006

Silviu Cucerzan and Eric Brill
Extracting Semantically Related Queries by Exploiting User Session Information
Unpublished Draft (submitted to WWW-2006, November 2005)

Silviu Cucerzan and Eugene Agichtein
Factoid Question Answering over Unstructured and Structured Content on the Web
The Text Retrieval Conference (TREC), Washington, DC, 2005

Eugene Agichtein and Silviu Cucerzan
Predicting Accuracy of Extracting Information from Unstructured Text Collections
The ACM Conference on Information and Knowledge Management (CIKM), Bremen, 2005

Eugene Agichtein, Silviu Cucerzan, and Eric Brill
Analysis of Factoid Questions for Effective Relation Extraction
The 28th ACM SIGIR Conference, Salvador, 2005

Silviu Cucerzan and Eric Brill
Spelling Correction as an Iterative Process that Exploits the Collective Knowledge of Web Users
The Conference on Empirical Methods in Natural Language Processing (EMNLP), Barcelona, 2004

Silviu Cucerzan and David Yarowsky
Minimally Supervised Induction of Grammatical Gender
The Human Language Technology Conference (HLT/NAACL), Edmonton, 2003

Radu Florian, Silviu Cucerzan, Charles Schafer, and David Yarowsky
Combining Classifiers for Word Sense Disambiguation
Journal of Natural Language Engineering, 8 (4)
Cambridge University Press, 2002

Silviu Cucerzan and David Yarowsky
Bootstrapping a Multilingual Part-of-speech Tagger in One Person-day
The 6th Conference on Natural Language Learning (CoNLL), Taipei, 2002

Silviu Cucerzan and David Yarowsky
Language Independent NER using a Unified Model of Internal and Contextual Evidence
The 6th Conference on Natural Language Learning (CoNLL), Taipei, 2002

Silviu Cucerzan and David Yarowsky
Augmented Mixture Models for Lexical Disambiguation
The Conference on Empirical Methods in Natural Language Processing (EMNLP), Philadelphia, 2002

David Yarowsky, Silviu Cucerzan, Radu Florian, Charles Schafer, and Rich Wicentowski
The Johns Hopkins Senseval2 System Descriptions
Senseval-2, Toulouse, 2001
Best Accuracy for English Lexical Choice (main task)

Silviu Cucerzan and David Yarowsky
Language Independent Minimally Supervised Induction of Lexical Probabilities
The 38th Annual Meeting of the Association for Computational Linguistics (ACL), Hong Kong, 2000

Silviu Cucerzan and David Yarowsky
Language Independent Named Entity Recognition Combining Morphological and Contextual Evidence
The 1999 Joint SIGDAT Conference on EMNLP and VLC, College Park, 1999