Code Defect AI
Altran developed a machine learning classifier that predicts which source code files carry a higher risk of containing a bug. Developers are shown an explanation of each prediction and the factors that drove it.
Bugs are a fact of life in software development. The later a defect is found in the development lifecycle, the higher the cost of fixing it. If a defect is found after deployment, customers are impacted and developers spend more time replicating the issue before issuing a fix. This cycle of bug, deployment, analysis, and fix is time-consuming and costly.
Certain patterns in a software project’s code base carry a higher risk of introducing a bug. These patterns can be learnt by a classification algorithm to predict the likelihood that a file contains a bug. This allows earlier discovery of defects, minimizing the cost of fixing them.
Custom classification models are created for GitHub projects, based on metadata associated with the historical commits. When Code Defect AI discovers new developer commits, it predicts if any files in the commit are at risk for defects. The rationale behind the prediction is presented using Local Interpretable Model-Agnostic Explanations (LIME) so that developers can trust and learn from the prediction.
Catching bugs before deployment
Software is developed through many cycles of coding, testing, finding bugs, then returning to coding for fixes. Altran and Microsoft deployed machine learning models on Azure to identify bug risks earlier in development with fewer cycles, saving time and money.
Technical details for the Code Defect AI experiment
Supervised learning allows algorithms to predict an output from historical examples of input-output pairs, i.e. labelled data. A supervised learning task is called a classification problem when the output variable is discrete. Certain patterns in a software project’s code base carry a higher risk of introducing a bug. Examples include a new developer changing a file that has historically had a higher incidence of bugs, a commit that touches files across multiple directories, or a code update spread across multiple regions of a file. These patterns can be learnt by a classification algorithm to predict the likelihood that a file in a commit contains a bug. This shifts defect discovery left, to earlier in the development cycle, which minimizes the cost of fixing defects.
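To make the classification idea concrete, here is a minimal sketch in plain Python. It is not the model used by Code Defect AI: the feature names, the tiny training set, and the choice of logistic regression are all assumptions made for illustration, loosely based on the risk patterns described above.

```python
import math

# Hypothetical per-file features for a commit (names and values are assumptions):
# [author_is_new, past_bug_fixes_on_file, directories_in_commit, changed_regions]
TRAIN_X = [
    [1, 5, 4, 6],
    [1, 3, 3, 5],
    [0, 4, 5, 4],
    [1, 6, 2, 7],
    [0, 0, 1, 1],
    [0, 1, 1, 2],
    [1, 0, 1, 1],
    [0, 1, 2, 1],
]
# Labels: 1 = the file was later found to contain a bug, 0 = it was not.
TRAIN_Y = [1, 1, 1, 1, 0, 0, 0, 0]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_risk(weights, bias, row):
    """Return the predicted probability that a file is defective."""
    return sigmoid(bias + sum(w * v for w, v in zip(weights, row)))


def train_logistic(X, y, lr=0.05, epochs=2000):
    """Fit logistic regression with batch gradient descent."""
    n_features = len(X[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for row, label in zip(X, y):
            err = predict_risk(weights, bias, row) - label
            for j, v in enumerate(row):
                grad_w[j] += err * v
            grad_b += err
        weights = [w - lr * g / len(X) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(X)
    return weights, bias


weights, bias = train_logistic(TRAIN_X, TRAIN_Y)
# Score one file from a new commit: new author, buggy history, wide-reaching change.
print(round(predict_risk(weights, bias, [1, 4, 3, 5]), 3))
```

Because the output is a discrete label (buggy or not), this is a classification problem in the sense described above. A production system would train on far more history and would likely use a stronger model.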
Three custom classification models have been created for three GitHub projects, based on metadata associated with their historical commits. The labelled training data was built from metadata collected from each GitHub repository. When Code Defect AI discovers new developer commits, it obtains the metadata for each commit and for the files in it. It then uses the project-specific model to predict whether any of the files in the commit carry a risk of having a bug. Many machine learning models behave as black boxes, so the rationale behind a prediction is not directly available. We present the rationale behind each prediction using Local Interpretable Model-Agnostic Explanations (LIME) so that users develop greater trust in the prediction.
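The core idea behind LIME can be sketched without any library: perturb the instance being explained, query the black-box model on the perturbed samples, weight each sample by its proximity to the original, and fit a weighted linear model whose coefficients serve as the explanation. The sketch below is a simplified illustration of that idea, not the LIME library and not the Code Defect AI implementation. The `black_box_risk` function, the feature names, and all parameter values are hypothetical, and every feature is treated as continuous for simplicity.

```python
import math
import random

# Hypothetical feature names (assumptions, not taken from the project).
FEATURES = ["author_is_new", "past_bug_fixes_on_file",
            "directories_in_commit", "changed_regions"]


def black_box_risk(row):
    """Stand-in for a trained project-specific model (hypothetical)."""
    new_dev, past_bugs, dirs, regions = row
    z = (-4.0 + 0.8 * past_bugs + 0.3 * dirs + 0.2 * regions
         + 0.5 * new_dev * past_bugs)
    return 1.0 / (1.0 + math.exp(-z))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def explain_locally(model, instance, n_samples=500, scale=1.0,
                    kernel_width=2.0, seed=0):
    """Fit a proximity-weighted linear surrogate around one instance."""
    rng = random.Random(seed)
    d = len(instance)
    # Accumulate the weighted normal equations (X^T W X) beta = X^T W y.
    XtWX = [[0.0] * (d + 1) for _ in range(d + 1)]
    XtWy = [0.0] * (d + 1)
    for _ in range(n_samples):
        sample = [v + rng.gauss(0.0, scale) for v in instance]
        dist2 = sum((s - v) ** 2 for s, v in zip(sample, instance))
        weight = math.exp(-dist2 / kernel_width ** 2)  # closer samples count more
        x = [1.0] + [s - v for s, v in zip(sample, instance)]
        y = model(sample)
        for i in range(d + 1):
            XtWy[i] += weight * x[i] * y
            for j in range(d + 1):
                XtWX[i][j] += weight * x[i] * x[j]
    beta = solve(XtWX, XtWy)
    return beta[0], beta[1:]  # intercept, per-feature local coefficients


intercept, coefs = explain_locally(black_box_risk, [1, 3, 2, 4])
for name, coef in sorted(zip(FEATURES, coefs), key=lambda p: -abs(p[1])):
    print(f"{name}: {coef:+.3f}")
```

Each coefficient estimates how much the predicted risk changes near this particular file when the corresponding feature increases, which is the kind of per-prediction rationale a developer can inspect. The actual LIME method adds further steps, such as discretizing tabular features and selecting a small subset of features to report.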