Enabling secure and compliant engineering with Azure DevOps

Sep 17, 2020


Microsoft Digital has implemented policy and security controls throughout our pipelines, delivering secure and compliant code across the entire pipeline process. We’re ensuring that our engineers are working in an agile environment that adheres to all our policies, from start to finish.

In a digital-transformation environment, where modern engineering principles require continuous integration and development of services, DevOps delivery models must be efficient and secure. Our goal is to apply the right controls consistently across our entire development landscape, so that our engineers work in an agile environment that adheres to all our policies, from start to finish.

Examining service pipelines at Microsoft

At Microsoft Digital, we build and operate the IT services on which Microsoft runs. This environment consists of more than 900 individual services that combine to run Microsoft and provide productivity tools and experiences for our employees and customers worldwide. Microsoft Digital develops and supports these services with more than 3,000 engineers working in 3,800 Microsoft Azure Repos.

Our move to the cloud and changing business requirements have increased demand for more rapid delivery of IT solutions. We’ve moved to an agile delivery model, supported by our modernized services. Investments in modern engineering practices, such as continuous integration and continuous delivery (CI/CD), have improved product quality and created a new engineering culture at Microsoft Digital.

In this development environment, change is continual. Our repos receive 1,000 pull requests, 7,000 commits, 2,800 builds, and 650 releases per day. In this context of rapid, constant change, we want to ensure that our service pipelines always remain within our parameters for security and policy application.

Our mission is to enable our engineers to develop in a secure and compliant environment without having to implement or track security and compliance practices manually. Our engineering environment has grown substantially because of the growth and success of Azure DevOps adoption at Microsoft. However, with 2,800 builds and 650 releases per day, it’s not possible to track and manage policies and configuration manually.

We want an engineering environment that’s unobtrusive to our engineers, one that lets them innovate and trust that the system compiles and releases checked-in code in a manner that’s always compliant and secure for the customer. One of our IT organization’s most significant overarching challenges is the growth and sprawl of IT service resources: determining where they are, who owns what, and how to ensure the effective release of first-class modern experiences for employees and partners. Our main challenges within the service pipeline include:

  • Distinguishing between production resources and nonproduction resources. With such a large engineering footprint, it’s imperative to be able to identify production releases and ensure that the right controls are applied consistently across the entire engineering environment. At Microsoft Digital, we create services in various ways. While we plan and design many services by using a standard method and process, service creation can also be more organic. For example, an internal solution or tool for a temporary team might evolve into a permanent solution that the entire organization uses. Regardless of service origins, determining whether code, builds, releases, and Azure resources are production is critical. While we apply many controls across the entire environment to identify issues early in the life cycle, we need to enforce higher standards on production.
  • Identifying ownership. When a problem arises in code for a service, someone needs to fix it. Establishing centralized tracking and clear ownership of—and accountability for—our engineering artifacts is critical. Without centralized tracking, it would be impossible to identify ownership across the high number of repos, builds, and releases that we manage.
  • Applying security and compliance tools on production code. After we establish what is production code, the next challenge is being able to run security and compliance tools centrally on that code, without relying on engineering teams to do this for each individual project. In the past, engineering teams implemented these checks within their own pipelines, but that approach relied on manual actions by the engineers and was not easily trackable or enforceable. We need to ensure that our production code always receives the necessary security and compliance checks. Running the checks centrally saves time and makes it simpler to confirm that the appropriate measures have been applied to all production builds.

Engaging modern engineering design and tooling

To release secure and compliant modern services, we established a threefold starting point: understanding what is production code and what are production Azure resources; being able to run the compliance tooling consistently across all pipelines; and having clear ownership of our code and pipelines to drive accountability.

Our approach, at its core, enables our engineers to get it right from the beginning, while addressing all challenges in the process. We want to remove manual processes where possible and enable engineering teams to be green by default with the policies that we apply in our engineering system.

Establishing a single service catalog

We use a service catalog to track the service hierarchy and organizational metadata for all the services that combine to run Microsoft. The catalog creates a single source of truth for service and organizational data. It also streamlines the compliance and onboarding experiences for other services like incident management and Azure DevOps. Using a service catalog gives us the ability to link production services, Azure DevOps code repositories, and Azure subscriptions. It also contains a structured hierarchy of our services as they relate to our business and clearly defines service ownership and dependencies.

Our system uses the metadata that defines a service to initiate a set of compliance tasks. If a service has a user interface, accessibility policies must be applied to that interface. If it has web endpoints, it must undergo vulnerability scanning. If it has buildable code, that code must undergo static analysis. If the service contains Azure resources, policies from the Secure DevOps Toolkit for Azure must be applied.
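The metadata-driven rules above can be sketched as a small lookup. This is a minimal illustration only: the `ServiceMetadata` fields and the task names are hypothetical stand-ins, not the actual schema or tooling of the service catalog.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceMetadata:
    """Hypothetical subset of the metadata a service catalog might hold."""
    name: str
    has_user_interface: bool = False
    has_web_endpoints: bool = False
    has_buildable_code: bool = False
    has_azure_resources: bool = False


# Each (attribute, task) pair mirrors one rule from the text.
# The task names are illustrative labels, not real tool identifiers.
RULES = (
    ("has_user_interface", "accessibility-policy-check"),
    ("has_web_endpoints", "vulnerability-scan"),
    ("has_buildable_code", "static-analysis"),
    ("has_azure_resources", "secure-devops-toolkit-policies"),
)


def required_compliance_tasks(service: ServiceMetadata) -> list[str]:
    """Return the compliance tasks implied by a service's metadata."""
    return [task for attribute, task in RULES if getattr(service, attribute)]
```

Keeping the rules in one table means a new policy is a one-line addition that applies to every service carrying the matching attribute.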

Identifying code ownership

From the creation process forward, a repo must be mapped to a service. There is a hard mapping between our service catalog’s structure and the structure within Azure DevOps. Every area path in Azure DevOps has a team and an associated set of repositories. This structure gives us clear ownership of the code and its engineering artifacts from the beginning and also establishes a line of accountability for both compliance-scan remediation and incident response.
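One way to picture this mapping is an index from repository to owning area path. The sketch below assumes each repo belongs to exactly one area path. The class names, fields, and sample values are hypothetical and do not come from Azure DevOps or the service catalog.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class AreaPath:
    """Hypothetical record linking an area path to a service, team, and repos."""
    path: str
    service_id: str
    team: str
    repositories: set[str] = field(default_factory=set)


class OwnershipIndex:
    """Resolves a repository name to the area path (and team) that owns it."""

    def __init__(self, area_paths: list[AreaPath]) -> None:
        self._by_repo: dict[str, AreaPath] = {}
        for area in area_paths:
            for repo in area.repositories:
                # Assumption: a repo maps to exactly one area path.
                if repo in self._by_repo:
                    raise ValueError(f"repo {repo!r} is mapped more than once")
                self._by_repo[repo] = area

    def owner_of(self, repo: str) -> AreaPath:
        """Return the owning area path, or raise if the repo is unmapped."""
        if repo not in self._by_repo:
            raise LookupError(f"repo {repo!r} is not mapped to a service")
        return self._by_repo[repo]
```

With an index like this, a failed compliance scan or an incident on a repo can be routed straight to the owning team, and an unmapped repo surfaces as an error rather than passing silently.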

Establishing governance for production code

Production builds and releases require specific compliance policies, so it’s critical to identify the production environments accurately. We’ve established a simple approach to determine what constitutes production code: we always host production code in the main branch of a repo. To create a build or release pipeline in Azure DevOps, you must create it under a service. We accomplish this by using a folder structure that matches our service tree hierarchy, and this allows us to have clear ownership for our thousands of build and release definitions. This approach is easy to implement, can be applied across the entire environment, and is easily identifiable.
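The folder convention lends itself to a simple check: given a pipeline's folder, find the service-tree path that contains it. The sketch below is an illustration under assumptions. The path format, the function names, and the case-insensitive matching are all hypothetical, not a description of the actual enforcement.

```python
from __future__ import annotations

import re


def split_path(path: str) -> tuple[str, ...]:
    """Split a folder path on '/' or '\\' and normalize case for comparison."""
    return tuple(part.strip().lower() for part in re.split(r"[\\/]+", path) if part.strip())


def owning_service_path(pipeline_folder: str, service_paths: list[str]) -> str | None:
    """Return the deepest service-tree path containing the folder, or None."""
    folder = split_path(pipeline_folder)
    best: str | None = None
    best_depth = 0
    for service_path in service_paths:
        candidate = split_path(service_path)
        # A service path matches when it is a prefix of the pipeline folder.
        if candidate and folder[: len(candidate)] == candidate and len(candidate) > best_depth:
            best, best_depth = service_path, len(candidate)
    return best
```

A pipeline whose folder resolves to `None` sits outside the service tree and has no clear owner, which is exactly the case the folder convention is meant to rule out.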

When deploying releases, only the main branch can be released to production. We use tags for Azure resources to identify production environments. We apply policies to the environment to ensure that only main builds can deploy to production, and that naming conventions are applied across environments. This helps us apply consistent stage naming. Enforced policies create an environment that guarantees 100 percent compliance in this area.
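The rule that only main builds reach production can be expressed as a small predicate that a predeployment check might evaluate. This is a sketch under assumptions: the tag key and value (`environment` / `production`) are hypothetical, since the text does not name the tags used, and the branch is written as the Git ref `refs/heads/main`.

```python
from __future__ import annotations

# Hypothetical tag used to mark production Azure resources.
PRODUCTION_TAG_KEY = "environment"
PRODUCTION_TAG_VALUE = "production"
MAIN_BRANCH_REF = "refs/heads/main"


def is_production(resource_tags: dict[str, str]) -> bool:
    """Treat a resource as production when it carries the production tag."""
    return resource_tags.get(PRODUCTION_TAG_KEY, "").lower() == PRODUCTION_TAG_VALUE


def may_deploy(source_branch: str, resource_tags: dict[str, str]) -> bool:
    """Allow any branch to nonproduction targets, but only main to production."""
    if not is_production(resource_tags):
        return True
    return source_branch == MAIN_BRANCH_REF
```

Because the decision depends only on the build's source branch and the target's tags, it can be evaluated automatically for every release without manual review.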

Benefits

We’ve created an engineering system in which we have clear ownership of our engineering resources from the outset, can deterministically differentiate between production and nonproduction resources, and can centrally manage the environment. These three fundamentals, established and enforced from the start, produce significant benefits for our engineers and for Microsoft Digital as an engineering organization. These benefits include continuous integration with security controls, which means no build-up of technical debt, and automated and enforced policy management. Other benefits include more secure pipeline definitions, increased policy-release velocity, end-to-end visibility of compliance status, and reduced effort for engineering teams to engage in secure and compliant DevOps practices across the entire organization.

Moving forward

We’ve established a robust framework for pipeline compliance management at Microsoft Digital. With modern engineering tools and design patterns, we’re moving quickly toward a defined and enforced method for pipeline compliance by using an Azure DevOps predeployment gate and artifact filter. Our automated processes and code-based deployments are creating an agile and more managed environment for running our business processes. As we move forward, we’re building a deeper level of built-in automation and gates to further refine the pipeline without requiring human intervention or manual processes. Compliance by default is a continually evolving goal as policies change, and our teams are assessing our environment and applying new toolsets to ensure 100 percent compliance across Microsoft Digital.