Operationalizing the cloud

Azure + DevOps: The awesomely ugly truth about decentralizing operations at Microsoft

May 3, 2018

So far, we’ve unleashed Azure by automating incident and change management, compromised with Azure as we moved and monitored applications, and built a plane while flying it (figuratively…while managing hybrid solutions) on our journey to the cloud.

I think we can conclude that operationalizing the cloud at Microsoft has certainly not been an “all clear and blue skies” experience, but beyond the learnings, pitfalls, and compromises we’ve gone through, the advancement in agile culture and operational processes we’re already seeing has made it all worth it for us.

In my last blog, I shared how our incident and change management systems have been evolving. And quite frankly, so have we as an organization. As our teams embrace the “DevOps” model more and more, we’re re-evaluating the day-to-day operations, services architecture and how these services are delivered.

About four years ago, our applications ran on-premises and my team delivered a service that provided physical and virtual machines. The process for employees to acquire these resources was simple: Go to an order page, fill out information about your system and storage size, and get your machine after the designated service level agreement (SLA)—which was usually a few days for a virtual machine and a few weeks for a physical one, depending on inventory.

Pete Apple sitting in the CSEO building office lobby.

The awesomely ugly truth about this operation is that we worked on replicating this exact model, somehow expecting a more lean and agile operation, and not surprisingly, it didn’t quite work out that way… at first. My team created a series of Azure Subscriptions, owned by us, that connected back to our corporate network via VPN (eventually ExpressRoute), and then uploaded a standard operating system image to build virtual machines. With the help of Windows Azure Pack, we were able to include additional options on the requisition form where application teams were able to order Azure virtual machines (VMs).

This worked fine…for a while. Then, we started to see problems. Employees wanted to manage their VMs directly in the Azure portal, but they only had remote desktop access to them. My team was the owner of the subscriptions. Any change to a VM, like resizing or adding another disk, had to have a ticket for my team to deliver this service, while the customer (sometimes impatiently) waited for us to process requests and meet our ugly SLA. We also had some application teams wanting to use other types of Azure resources beyond VMs. What were we to do?

In the name of DevOps, we decided to evolve once again and modify this service to provide a shared ownership model. Under this new model, we created business-unit owned Azure Subscriptions where application teams directly manage resources, resize, add disks, do maintenance, or whatever they need to do to their machines to keep their employees working smarter, not harder. All the while, my team maintains governance and assistance as needed since we can access the Azure Subscriptions.

When Azure Resource Manager became available along with its more nuanced Role Based Access Controls, this model worked even better. We started to decentralize Azure Subscriptions by each business unit service line so teams could apply appropriate roles to employees and flip to a true DevOps mode. My central team provided Azure Resource Manager templates that enabled employees to build VMs themselves (no order form required!) and have their machines ready in under 30 minutes. Employees could also start creating PaaS resources to modernize their applications and manage them day to day.

Transitioning our services from centralized management to decentralized ownership and management has been ugly…and awesome. In many ways it’s even more transformative than the journey from on-premises to the cloud. We now enable our teams to create and manage their own Azure resources directly, while maintaining standards and governance guardrails. This is a true reflection of how we’re empowering every person in our organization to do more. Enabling customers to work smarter, not harder, is a wonderful thing.

I love my job.

Learn more about how Microsoft evolved its operations and moved its IT infrastructure management to the cloud.
