In large software development projects, when a programmer is assigned a bug to fix, she typically spends a lot of time searching (in an ad-hoc manner) for instances from the past where similar bugs have been debugged, analyzed and resolved. Systematic search tools that allow the programmer to express the context of the current bug, and search through diverse data repositories associated with large projects can greatly improve the productivity of debugging. This paper presents the design, implementation and experience from such a search tool called DebugAdvisor.
The context of a bug includes all the information a programmer has about the bug, including natural language text, textual rendering of core dumps, debugger output etc. Our key insight is to allow the programmer to collate this entire context as a query to search for related information. Thus, DebugAdvisor allows the programmer to search using a fat query, which could be kilobytes of structured and unstructured data describing the contextual information for the current bug. Information retrieval in the presence of fat queries and variegated data repositories, all of which contain a mix of structured and unstructured data is a challenging problem. We present novel ideas to solve this problem.
We have deployed DebugAdvisor to over 100 users inside Microsoft. In addition to standard metrics such as precision and recall, we present extensive qualitative and quantitative feedback from our users.