I am a Researcher in the Data Management, Exploration and Mining (DMX) group at Microsoft Research. Before joining Microsoft, I completed my Ph.D. in Computer Science at the University of Illinois at Urbana-Champaign under the supervision of Prof. Jiawei Han, my M.Phil. at The Chinese University of Hong Kong, advised by Jeffrey Xu Yu, and my B.S. at Renmin University of China, advised by Shan Wang and Qing Zhu.

Research Interests

My research goals and interests center on large-scale data management, including interactive querying and exploration of “big” data, privacy-preserving data analytics, query processing and optimization, and data mining algorithms. I am particularly interested in randomized and approximation algorithms that have theoretical performance guarantees and are also effective and robust in practice. More recently, I have been interested in:

Approximations in Big Data: “Approximation” has two meanings here. First, under resource budgets (e.g., storage cost and computation power), how can we enable interactive analytics by trading accuracy for instant responses? Second, under data privacy constraints, how can we enable data analytics with both privacy and precision guarantees?

  • Approximate query processing: how to answer analytical queries on large-scale data (e.g., billions of rows) approximately within interactive response times (e.g., one hundred milliseconds).
  • Privacy-preserving data analytics: how to process analytical queries and analytics tasks with precision guarantees while protecting data owners’ privacy under formal notions (e.g., differential privacy).

Querying and Searching Large-Scale Data:

  • Inventing new search models and interfaces to help people explore structured/semi-structured data (e.g., text and knowledge graphs), and developing efficient search algorithms and index structures.
  • Query optimization and query processing (e.g., set intersection and progress estimation).

Data Mining: developing data mining algorithms for various applications (e.g., knowledge graphs and pattern mining).

Recent Papers

* alphabetical ordering of authors

[NIPS 2017] Collecting Telemetry Data Privately
* Bolin Ding, Janardhan Kulkarni, and Sergey Yekhanin

[VLDB 2017] Flexible Online Task Assignment in Real-Time Spatial Data
Yongxin Tong, Libin Wang, Zimu Zhou, Bolin Ding, Lei Chen, Jieping Ye, and Ke Xu

[SIGMOD 2017] Approximate Query Processing: No Silver Bullet
Surajit Chaudhuri, Bolin Ding, and Srikanth Kandula

[CHI 2017] Trust, but Verify: Optimistic Visualizations of Approximate Queries for Exploring Big Data (Video)
Dominik Moritz, Danyel Fisher, Bolin Ding, and Chi Wang

[VLDB 2016] Online Minimum Matching in Real-Time Spatial Data: Experiments and Analysis
Yongxin Tong, Jieying She, Bolin Ding, Lei Chen, Tianyu Wo, and Ke Xu

[VLDB 2016] Design of Policy-Aware Differentially Private Algorithms
Samuel Haney, Ashwin Machanavajjhala, and Bolin Ding

[SIGMOD 2016] Sample + Seek: Approximating Aggregates with Distribution Precision Guarantee
Bolin Ding, Silu Huang, Surajit Chaudhuri, Kaushik Chakrabarti, and Chi Wang

[SIGMOD 2016] Quickr: Lazily Approximating Complex Ad-Hoc Queries in Big Data Clusters
Srikanth Kandula, Anil Shanbhag, Aleksandar Vitorovic, Matthaios Olma, Robert Grandl, Surajit Chaudhuri, and Bolin Ding

[SIGMOD 2016] Operator and Query Progress Estimation in Microsoft SQL Server Live Query Statistics
Kukjin Lee, Arnd Christian Konig, Vivek Narasayya, Bolin Ding, Surajit Chaudhuri, Brent Ellwein, Alexey Eksarevskiy, Manbeen Kohli, Jacob Wyant, Praneeta Prakash, Rimma Nehme, Jiexing Li, and Jeff Naughton

[ICDE 2016] Online Mobile Micro-Task Allocation in Spatial Crowdsourcing
Yongxin Tong, Jieying She, Bolin Ding, Libin Wang, and Lei Chen

[ICDCS 2016] Enabling Privacy-Preserving Incentives for Mobile Crowd Sensing Systems
Haiming Jin, Lu Su, Bolin Ding, Klara Nahrstedt, and Nikita Borisov

[SIGMOD 2015] S4: Top-k Spreadsheet-Style Search for Query Discovery
Fotis Psallidas, Bolin Ding, Kaushik Chakrabarti, and Surajit Chaudhuri

[VLDB 2015] Finding Patterns in a Knowledge Base using Keywords to Compose Table Answers
Mohan Yang, Bolin Ding, Surajit Chaudhuri, and Kaushik Chakrabarti

[KDD 2014] Scalable Near Real-Time Failure Localization of Data Center Networks
Herodotos Herodotou, Bolin Ding, Shobana Balakrishnan, Geoff Outhred, and Percy Fitter

[SIGMOD 2014] Discovering Queries based on Example Tuples
Yanyan Shen, Kaushik Chakrabarti, Surajit Chaudhuri, Bolin Ding, and Lev Novik

[SIGMOD 2014] Blowfish Privacy: Tuning Privacy-Utility Trade-offs using Policies
Xi He, Ashwin Machanavajjhala, and Bolin Ding

[VLDB 2013] Attraction and Avoidance Detection from Movements
Zhenhui Li, Bolin Ding, Fei Wu, Tobias Kin Hou Lei, Roland Kays, and Margaret C. Crofoot

[KDD 2013] EventCube: Multi-Dimensional Search and Mining of Structured and Text Data
Fangbo Tao, et al.

More on DBLP and Google Scholar





Selected Awards:

  • (2017) FY17 Technical Excellence, Microsoft Privacy
  • (2012) Gold, The 2nd Yahoo!-DAIS Research Excellence Award Competition
  • (2007) Best Student Paper Award, ICDE’07
  • (2007-2008) Richard T. Cheng Fellowship Award, University of Illinois at Urbana-Champaign
  • (2007) 1st place, TopCoder Programming Competition College Tour at University of Illinois
  • (2005) Honorable Mention, 2005 ACM-ICPC Programming Contest World Finals
  • (2004) 3rd place out of 255 teams, Gold Medal, ACM-ICPC Asia Regional Contest, Shanghai Site


I have worked with some amazing interns:


Program Committee Memberships:

  • International Conference on Very Large Data Bases (PVLDB): 2018, 2017
  • ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD): 2017
  • ACM International Conference on Information and Knowledge Management (CIKM): 2017
  • Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD): 2017, 2016, 2015, 2014, 2013
  • International Workshop on Privacy-Preserving Data Publication and Analysis (PrivDB, in conjunction with ICDE): 2013

NSF Panelist: 2016

Reviewer for Journals: ACM Transactions on Database Systems, IEEE Transactions on Knowledge and Data Engineering, ACM Transactions on Knowledge Discovery from Data, Theoretical Computer Science, Pattern Recognition, Information Sciences, Knowledge and Information Systems