Project 1: Predicting Session Duration for Distributed, Cached Build & Test Systems
Modern build systems distribute and parallelize build tasks across thousands of machines, reusing cached build results whenever possible. Despite the sophistication of modern build tools, the core software architecture of the system under build or test defines the lower bound on how fast the system can build. To further speed up build and test processes, we make use of caches: we only execute what has changed. While this provides a significant performance improvement, build and test execution times become unpredictable: the run-time depends on the architecture and the state of the cache system. However, software engineers like to know how long a build or test run will take, e.g. to decide whether to switch context or to wait for the session to complete.
This internship aims to build a prediction model that notifies customers about the expected run-time of a build and updates the prediction live as the build or test session proceeds. Although this sounds like a machine learning problem, it is not: all inputs to such a model are static in nature and can be computed before and during a session, making machine learning unnecessary.
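One way such a non-ML predictor could work is a critical-path computation over the build's task graph, counting only tasks whose outputs are not already cached. The sketch below is illustrative, not the project's actual design; the task/dependency representation and the use of historical average durations are assumptions.

```python
# Hypothetical sketch: estimate remaining session time as the critical path
# through the build task DAG, where cached tasks contribute zero cost.
# Task ids, durations, and the cache state are illustrative assumptions.

def estimate_remaining(tasks, deps, est_duration, cached):
    """tasks: list of task ids; deps: task -> list of prerequisite tasks;
    est_duration: task -> seconds (e.g. a historical average);
    cached: set of tasks whose outputs are already in the cache."""
    memo = {}

    def finish_time(t):
        if t in memo:
            return memo[t]
        own = 0.0 if t in cached else est_duration[t]
        # A task finishes after its slowest prerequisite chain plus its own work.
        memo[t] = own + max((finish_time(d) for d in deps.get(t, [])), default=0.0)
        return memo[t]

    # The session ends when the longest chain of uncached work completes.
    return max((finish_time(t) for t in tasks), default=0.0)
```

Re-running the estimate whenever a task completes (adding it to `cached`) would yield the live-updating prediction described above.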
Project 2: Machine Learning Prediction Models To Scope Security Analysis Executions
Security tools can be extremely expensive to run, so running them in a continuous integration process might be infeasible. At the same time, the chances of a single commit introducing new issues are typically low, while the acceptable response time for such security tools tends to be high: hours rather than seconds. As a result, these tools do not necessarily need to be executed in a blocking manner as part of the check-in process.
The goal of this internship is to produce a lightweight model that identifies code changes that are very unlikely to produce analysis results. The model would be used to speed up CI/CD pipelines by analyzing only a small percentage of changes with expensive analysis tools, while the majority of changes would be checked using more lightweight analyses. The model must be highly accurate so that it maintains a high degree of recall in identifying relevant tool results.
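To make the routing idea concrete, here is a deliberately simple sketch of the kind of cheap, recall-biased gate such a model could implement. The features, file extensions, and thresholds are all invented for illustration; a real model would be trained on historical analysis results.

```python
# Illustrative change-routing gate: decide whether a change should go to the
# expensive analyzer or a lightweight check. Features and thresholds are
# assumptions, not a trained model; the bias is toward deep analysis (recall).

SENSITIVE_EXTS = (".config", ".sql", ".cshtml")  # assumed security-relevant

def change_features(changed_files, lines_added):
    return {
        "touches_sensitive": any(f.endswith(SENSITIVE_EXTS) for f in changed_files),
        "size": lines_added,
        "n_files": len(changed_files),
    }

def needs_deep_analysis(changed_files, lines_added,
                        size_threshold=200, file_threshold=20):
    """Route to the expensive analyzer when any cheap signal suggests the
    change could plausibly produce a finding; err toward deep analysis."""
    f = change_features(changed_files, lines_added)
    return (f["touches_sensitive"]
            or f["size"] > size_threshold
            or f["n_files"] > file_threshold)
```

In practice the interesting work is in learning which signals predict "no findings" with near-perfect recall, rather than hand-picking them as done here.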
Project 3: Performance Model to Identify System & Integration Test Run-Time Regressions
In this project, we seek to create a performance model for test case run-times using both historical data and machine metrics. The goal of the performance model is to provide a reliable benchmark for test execution times. Engineers who introduce significant slowdowns to test cases should be notified so that they can investigate the performance regression. Providing as much contextual and run-time information as possible would then help the engineers debug and, hopefully, fix the regression.
Project 4: Smart Cache Replica Balancing Across Data Centers
Large-scale build systems are executed as a distributed computation cluster over a distributed content cache. Builds are decomposed into atomic tasks, tasks are distributed across multiple computation nodes, and the output files of each task are saved into the content cache. A content cache server runs on each computation node. A subsequent task can benefit from locality and load an input file quickly from the local node, or it may need to retrieve the file from a content cache server running on a remote node, which is a slower process.
A file can be replicated across multiple nodes. In general, the more replicas a file has, the higher the odds of it being loaded locally (and quickly) by a task that needs it. At the same time, the cluster has limited storage capacity, and old files are culled to make space for new ones. This means that the more replicas a file has, the smaller the number of distinct files the cluster will be able to retain.
This internship aims to create approaches that use historical data to improve the data replication, retention, and task placement algorithms, so as to improve file locality for the tasks that need it, while balancing efficient use of disk space in the build cluster against long retention of build cache content.
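One naive heuristic for the replication side of this trade-off is to spend replica slots greedily on the files with the highest predicted access frequency per byte, subject to a storage budget. This sketch is only meant to make the locality-versus-retention tension concrete; the inputs and the greedy policy are assumptions, not the project's intended algorithm.

```python
# Hedged sketch: greedy replica allocation under a storage budget.
# files: list of (name, size_bytes, predicted_accesses), where predicted
# accesses might come from historical task traces (an assumption).

def assign_replicas(files, capacity, max_replicas=3):
    """Returns name -> replica count. Hot files (accesses per byte) get
    replicas first, in rounds, until the budget runs out."""
    replicas = {name: 0 for name, _, _ in files}
    # Highest expected locality benefit per byte of storage first.
    ranked = sorted(files, key=lambda f: f[2] / f[1], reverse=True)
    remaining = capacity
    for _ in range(max_replicas):          # add at most one replica per round
        for name, size, _ in ranked:
            if size <= remaining:
                replicas[name] += 1
                remaining -= size
    return replicas
```

Every extra replica of a hot file buys locality but evicts capacity that could retain distinct files, which is exactly the balance the project seeks to tune with historical data.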
Project 5: Studying Introduction Rates of Security Vulnerabilities
Modern CI/CD pipelines argue for constant pre-release inspection of code changes. It is not clear, however, that the rates at which security vulnerabilities are introduced warrant this degree of scrutiny and expense. The TSE-Security team has developed a sophisticated static analysis results matching algorithm for SARIF that provides more accurate tracking of logically unique issues, giving us better data than previously available for tracking problems as they appear and are resolved. This will allow us to study rates of introduction more precisely, leading to recommendations for more appropriate and efficient methodologies of applying static analysis.
This project provides ML opportunities as well, as we look to produce models that predict whether specific types of code deltas in specific file types are likely to produce a static analysis result. The main goal is to build an automation harness for driving a static analysis tool against thousands of historical code snapshots. We will target Chakra (a security-sensitive, open-source C++ code base) and some prominent C# code base (perhaps VSO's 9M lines of C#). We are particularly interested in the rates of introduction of (a) vulnerabilities introduced into web.config files, (b) binary-level security issues related to compiler switches, (c) cryptography use in code, and (d) classic XSS, SQL injection, or other misuse of untrusted data.
Project 6: Study of the Flaky Test Life Cycle
Today, we are building a flaky test management system that infers flaky tests, helps quarantine them, and creates bug reports for developers to debug those tests. However, we do not have much insight into which changes introduce flaky tests or what code changes developers make to fix them. This project aims to conduct a detailed study of the data to understand the life cycle of flaky tests: what kinds of fixes developers make, how long flaky tests remain non-flaky before turning flaky again, and so on.
The main objectives of this project are to gain a detailed understanding of the life cycle of flaky tests, the core reasons tests become flaky, and the potential fixes, which can then be shared with developers as guidance for future tests.
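As one small example of the kind of mining such a study involves, the sketch below extracts, from a test's chronological state history, how long each "fixed" period lasted before the test relapsed into flakiness. The event format is an assumption about what the management system records.

```python
# Illustrative life-cycle mining: given chronological (day, state) observations
# for one test, with state in {"flaky", "fixed"}, return the lengths (in days)
# of fixed periods that ended in a relapse. The record format is an assumption.

def stable_episodes(events):
    episodes, fixed_since = [], None
    for day, state in events:
        if state == "fixed" and fixed_since is None:
            fixed_since = day                     # a fixed episode begins
        elif state == "flaky" and fixed_since is not None:
            episodes.append(day - fixed_since)    # relapse: episode closes
            fixed_since = None
    return episodes
```

Aggregating these episode lengths across many tests would answer the "how long do fixes hold" part of the life-cycle questions above.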
Project 7: Patterns and practices to make new engineers productive quickly
The objective of this project is to study the factors that make new hires more productive and help them have an impact on their new teams faster. We would like to study the behaviors and environments of new Microsoft employees who were quickly able to contribute to code bases and compare them to the environments and behaviors of employees who had significantly longer ramp-up times. The goal is to determine which factors of the on-boarding process, including tooling, experience, team norms, mentoring, and team culture, positively influence time to first contribution for new hires. The findings could be turned into best practices and lessons learned for engineering teams to increase their productivity, especially for newly hired team members.