ConfSeer: Leveraging Support Knowledge Bases for Automated Misconfiguration Detection

  • Navendu Jain

MSR-TR-2015-16

We introduce ConfSeer, an automated system that detects potential configuration issues or deviations from identified best practices by leveraging a knowledge base (KB) of technical solutions. The intuition is that these KB articles describe configuration problems and their fixes, so if the system can accurately understand them, it can automatically pinpoint both the errors and their resolutions. Unfortunately, finding an accurate match is difficult because (a) the KB articles are written in natural language text, and (b) configuration files typically contain a large number of parameters and their settings, so “expert-driven” manual systems are not scalable.

While several state-of-the-art techniques have been proposed for individual tasks such as keyword matching, concept determination and entity resolution, none offers a practical end-to-end solution for detecting problems in machine configurations. In this paper, we describe our experiences building ConfSeer using a novel combination of ideas from natural language processing, information retrieval and online learning. ConfSeer powers the recommendation engine behind Microsoft Azure Operational Insights [45], which proposes fixes for software configuration errors. The system has been deployed in production for about a year, where it is used to proactively find misconfigurations on tens of thousands of servers in near real-time. Our evaluation of ConfSeer against a rule-based commercial system, an expert survey and web search engines shows that it achieves 80%-97.5% accuracy and incurs low runtime overheads.