Christian Konig

Senior Researcher

About

I am currently working as a member of the AutoAdmin and Data Exploration research projects in the Data Management, Exploration and Mining Group at Microsoft Research. I am also a member of the cross-group BLEWS project on blogs and news.

Before joining Microsoft, I completed my Ph.D. in Computer Science at the University of the Saarland, in the Database Research Group of Prof. Gerhard Weikum.

My current research is focused on scalable algorithms for processing and indexing very large data sets in the context of web search and computational advertising. Recent work includes:

Service Intelligence: we study the use of statistical techniques in the context of monitoring, tuning and problem-diagnostics for large-scale ‘Cloud’ database instances.
B-bit Minwise Hashing: we proposed a technique that improves upon the standard minwise hashing method (as well as sign random projections, Hamming-LSH, etc.) for set similarity estimation by storing only b bits of each hashed value (e.g., b = 1 or 2); using a novel estimator, we obtain order-of-magnitude improvements in the storage space required for a given level of accuracy in practice. Subsequently, we (a) extended the framework to three-way similarities, and (b) integrated (b-bit) minwise hashing with linear learning algorithms such as linear SVM and logistic regression to solve large-scale and high-dimensional statistical learning tasks. A 3-minute video introduction to the technique is available here.
Fast Set intersection: Set intersection is a central operator in IR and data mining; we propose techniques that give novel asymptotic bounds and outperform the state of the art in practice, while being robust in that – for the cases where our approach is not the best – they are close to the best-performing one.
Integrating Vertical Content with Web Search: Current search engines surface a plethora of content other than web pages, such as advertisements, news, images, movies, ‘answers’, etc. Retrieving the appropriate ‘vertical’ content for a given query is an important research challenge. Recently, we studied frameworks for the detection of query intent, which enable the selection of relevant content types, the integration of news results in web search and the dynamic construction of ‘portal’ pages for a given query.
Improving Retrieval Latency: The perceived latency of search is of critical importance to the overall search experience. We have studied algorithms and index structures aimed at minimizing the worst-case latency when retrieving ‘vertical’ content, surfacing advertisements in sponsored search or displaying structured data about entities (such as celebrities, products, locations) related to a search query.
BLEWS (= blogs + news): In the BLEWS system we studied how to surface blog entries commenting on news stories as part of the news browsing experience. The BLEWS system shows which types of blogs are linking back to a specific story and how much ‘attention’ the story is getting, and allows the user to quickly navigate to the comments themselves. We also studied the distribution of navigational patterns used to access social media content (i.e., what type of content do users typically read in blogs and how do they get there?).
Text classification: here, our work has focused on the scalable and robust extraction/categorization of entities from very large corpora and reducing the human overhead in text classification settings.
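
To make the b-bit minwise hashing idea above concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the estimator from our papers: it simulates random permutations with salted BLAKE2b hashes, and it uses the simplified sparse-data approximation in which two unequal minimum hashes agree on their lowest b bits with probability about 2^-b. All function names (`hash64`, `bbit_signature`, `estimate_resemblance`) are hypothetical.

```python
import hashlib


def hash64(item, seed):
    """Salted 64-bit hash; stands in for one random permutation (assumption)."""
    digest = hashlib.blake2b(
        str(item).encode("utf-8"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "little")


def bbit_signature(items, k, b):
    """Keep only the lowest b bits of each of the k minimum hash values."""
    mask = (1 << b) - 1
    return [min(hash64(x, seed) for x in items) & mask for seed in range(k)]


def estimate_resemblance(sig1, sig2, b):
    """Estimate Jaccard resemblance from two b-bit signatures.

    Simplified sparse-data approximation (assumption):
        P(match) ~= 2^-b + (1 - 2^-b) * R
    so R is recovered by inverting this linear relation.
    """
    matches = sum(1 for u, v in zip(sig1, sig2) if u == v)
    p_hat = matches / len(sig1)
    chance = 1.0 / (1 << b)  # match rate expected when the minima differ
    return (p_hat - chance) / (1.0 - chance)


if __name__ == "__main__":
    a = set(range(0, 300))
    c = set(range(100, 400))  # true resemblance with a: 200 / 400 = 0.5
    k, b = 512, 2
    est = estimate_resemblance(bbit_signature(a, k, b), bbit_signature(c, k, b), b)
    print(f"estimated resemblance: {est:.3f}")
```

With b = 2, each of the k samples occupies 2 bits instead of a full 64-bit hash value; the price is a somewhat higher variance per sample, which is offset by using more samples.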

My prior work in the context of the management of databases has focused on a monitoring infrastructure for database servers, the scalable exploration of different database designs and various techniques for result-size estimation.

Projects

Rethinking Eventual Consistency

Established: July 31, 2013

The past five years have seen a resurgence of work on replicated, distributed database systems, to meet the demands of intermittently-connected clients and disaster-tolerant database systems that span data centers. Each product or prototype uses a weakened definition of replica-consistency…

SQLVM: Performance Isolation in Multi-Tenant Relational Database-as-a-Service

Established: February 14, 2013

Multi-tenancy and resource sharing are essential to make a Database-as-a-Service (DaaS). However, resource sharing usually results in the performance of one tenant’s workload being affected by other co-located tenants. In the SQLVM project, our approach to performance isolation in…

Hyder, a transactional indexed-record manager for shared flash

Established: February 8, 2013

Hyder is a transactional indexed-record manager for shared flash. That is, it supports operations on indexed records and transaction operations that bracket the record operations. It is designed to run on a cluster of servers that have shared access to…

Entity Search and Query Portals

Established: March 20, 2011

The goal of entity search is to return entities (e.g., people, products, locations) relevant to a keyword query. The goal of Query Portals is to go one step further and return not only the names of relevant entities but a…

Publications

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

2004

2002

1999

Other

Professional Activities

  • ACM International Conference on Management of Data (SIGMOD 2004): Proceedings Chair
  • First M3SN Workshop on Modeling, Managing, and Mining of Evolving Social Networks (co-located with ICDE 2009): Co-Chair
  • 1st USETIM (Using Search Engine Technology for Information Management) Workshop: Keynote Speaker
  • 2010 Dagstuhl Workshop on Robust Query Processing: Co-Organizer
  • 37th International Conference on Very Large Databases (VLDB 2011): Local Arrangements Chair

Program Committee Memberships:

  • 10th Conference on Database Systems for Business, Technology, and the Web (BTW 2003): Program Committee member
  • 21st International Conference on Data Engineering (ICDE 2005): Program Committee member
  • 11th International Conference on Database Systems for Advanced Applications (DASFAA 2006): Program Committee member
  • 12th International Conference of the Management of Data (COMAD 2005): Industrial Program Committee member
  • 12th International Conference on Database Systems for Advanced Applications (DASFAA 2007): Program Committee member
  • 14th Conference on Database Systems for Business, Technology, and the Web (BTW 2007): Industrial Program Committee member
  • 18th International Conference on Database and Expert Systems Applications (DEXA 2007): Program Committee member
  • 33rd International Conference on Very Large Databases (VLDB 2007): Program Committee member
  • International Workshop on Ranking in Databases (DBRank 2007) to be held in conjunction with ICDE 2007: Program Committee member
  • 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2007): Industrial Program Committee member
  • 13th International Conference on Database Systems for Advanced Applications (DASFAA 2008): Program Committee member
  • 11th International Conference on Extending Database Technology (EDBT 2008): Program Committee member
  • 19th International Conference on Database and Expert Systems Applications (DEXA 2008): Program Committee member
  • Second International Conference on Weblogs and Social Media (ICWSM 2008): Poster/Demo Committee member
  • 15th International Conference of the Management of Data (COMAD 2008): Program Committee member
  • 16th Conference on Database Systems for Business, Technology, and the Web (BTW 2009): Industrial Program Committee member
  • Third International Conference on Weblogs and Social Media (ICWSM 2009): Program Committee member
  • NAACL Human Language Technology (NAACL-HLT) Conference 2009: Program Committee Member
  • 2nd M3SN Workshop on Modeling, Managing, and Mining of Evolving Social Networks (co-located with ICDE 2010): Program Committee member
  • 4th International Workshop on Ranking in Databases (DBRank 2010): Program Committee member
  • 4th International Conference on Weblogs and Social Media (ICWSM 2010): Program Committee member
  • NAACL Human Language Technology (NAACL-HLT) Conference 2010: Program Committee Member
  • Fourth Workshop on Enabling Real-Time Business Intelligence (BIRTE 2010): Program Committee Member
  • 1st Workshop on Data Management Issues in Web Syndication Systems (DaMIWS 2010): Program Committee Member
  • 5th International Conference on Weblogs and Social Media (ICWSM 2011): Program Committee Member
  • 1st International Temporal Web Analytics Workshop (TWAW 2011) at WWW 2011: Program Committee Member
  • 5th IEEE International Conference on Semantic Computing (ICSC 2011): Program Committee Member
  • 17th ACM Conference on Knowledge Discovery and Data Mining (SIGKDD 2011): Program Committee Member
  • European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2011): Program Committee Member
  • Fifth Workshop on Enabling Real-Time Business Intelligence (BIRTE 2011): Program Committee Member
  • SIGIR Workshop on Entity-Oriented Search: Program Committee Member
  • 28th International Conference on Data Engineering (ICDE 2012) – Demo Track: Program Committee Member
  • 21st World Wide Web Conference (WWW 2012): Program Committee Member
  • 5th International ACM Conference on Web Search and Data Mining (WSDM 2012): Program Committee Member
  • 26th International Conference on Artificial Intelligence (AAAI 2012): Program Committee Member
  • 2nd International Temporal Web Analytics Workshop (TempWeb 2012) at WWW 2012: Program Committee Member
  • 39th International Conference on Very Large Databases (VLDB 2013/PVLDB): Program Committee Member
  • 6th Workshop on Enabling Real-Time Business Intelligence (BIRTE 2012): Program Committee Member
  • Conference on Empirical Methods on Natural Language Processing (EMNLP 2012): Program Committee Member
  • 6th ACM International Conference on Web Search and Data Mining (WSDM 2013): Program Committee Member
  • 2nd International Workshop on Searching and Integrating New Web Data Sources (VLDS 2012): Program Committee Member
  • 16th International Conference on Extending Database Technology (EDBT 2013): Industrial Program Committee Member
  • 1st Joint International Workshop on Entity-oriented and Semantic Search 2012 (JIWES 2012): Program Committee Member
  • 29th International Conference on Data Engineering (ICDE 2013) – Demo Track: Program Committee Member
  • 19th ACM Conference on Knowledge Discovery and Data Mining (SIGKDD 2013): Program Committee Member
  • 2013 IEEE International Conference on Big Data (IEEE Big Data 2013)
  • 40th International Conference on Very Large Databases (VLDB 2014/PVLDB): Program Committee Member
  • 30th International Conference on Data Engineering (ICDE 2014): Program Committee Member
  • 17th International Conference on Extending Database Technology (EDBT 2014): Demo Program Committee Member
  • 31st International Conference on Machine Learning (ICML 2014): Program Committee Member
  • 23rd World Wide Web Conference (WWW 2014): Program Committee Member
  • 20th ACM Conference on Knowledge Discovery and Data Mining (SIGKDD 2014): Program Committee Member (Research track)
  • 20th ACM Conference on Knowledge Discovery and Data Mining (SIGKDD 2014): Program Committee Member (Industry and Government track)
  • 8th International Conference on Weblogs and Social Media (ICWSM-14): Program Committee Member
  • The Neural Information Processing Systems Conference 2014 (NIPS-2014): Program Committee Member
  • 8th ACM International Conference on Web Search and Data Mining (WSDM 2015): Program Committee Member
  • 24th World Wide Web Conference (WWW 2015): Program Committee Member
  • 21st ACM Conference on Knowledge Discovery and Data Mining (SIGKDD 2015): Program Committee Member (Research track)
  • 21st ACM Conference on Knowledge Discovery and Data Mining (SIGKDD 2015): Program Committee Member (Industrial and Government track)
  • 41st International Conference on Very Large Databases (VLDB 2015/PVLDB): Demo Track Program Committee Member
  • 9th ACM International Conference on Web Search and Data Mining (WSDM 2016): Program Committee Member
  • The Neural Information Processing Systems Conference 2015 (NIPS-2015): Program Committee Member
  • 32nd International Conference on Data Engineering (ICDE 2016): Demo Track Program Committee Member
  • 25th International World Wide Web Conference (WWW 2016): Program Committee Member
  • 22nd ACM Conference on Knowledge Discovery and Data Mining (SIGKDD 2016): Program Committee Member (Research track)
  • 25th International Joint Conference on Artificial Intelligence (IJCAI-16): Program Committee Member
  • 42nd International Conference on Very Large Databases (VLDB 2016/PVLDB): Demo Track Program Committee Member
  • 43rd International Conference on Very Large Databases (VLDB 2017/PVLDB): Program Committee Member
  • 10th ACM International Conference on Web Search and Data Mining (WSDM 2017): Program Committee Member

Journal Reviews: DKE, IJHIS, IS, ISI, TIST, TKDE, TODS, VLDB Journal, WWW Journal

External Reviewer for: STOC, STACS, ICML