Operationalizing the cloud

Managing solutions when moving to the cloud is like “building the plane while you’re flying it”

Feb 7, 2018

“Ship it and fix it and ship it again” is a very popular cliché in Silicon Valley, and it sounded all too familiar to my team as we focused our efforts on moving, managing, and monitoring solutions in our expedition to the cloud.

Hello again and welcome back to our blog series. In this post, I want to share what it took for us to effectively migrate solutions from on-premises to the cloud while managing and monitoring them for day-to-day operations.

When I was running the hosting environment on-premises, our physical and VM footprint was spread across multiple geographic datacenters, in two primary security zones: “Corp” and “DMZ”. You might have a similar environment. All systems were managed via centralized Microsoft System Center solutions, namely Operations Manager (SCOM) for monitoring and Configuration Manager (SCCM) for patching. As we started to look at moving solutions into the cloud, it became clear we were going to have to hybridize our management solution.

VMs undergoing an as-is “lift and shift” could stay connected to our Corp environment via Azure ExpressRoute, so we leveraged this to continue managing them the same way we always had. As more and more hosts moved from on-premises into Azure, we eventually did a lift and shift on the System Center servers themselves, so they were operating out of an Azure datacenter as well. Warning: There’s a tipping point once more than about 50 percent of your hosts are in the cloud. Exactly where it falls depends on the size of your environment and how quickly you’re moving VMs into the cloud, so think about it ahead of time.

We also learned that in many cases, a cloud transition coincides with a DevOps model of deployment and management for the application team, so we changed the technology and site reliability engineering practices in unison. For the “DMZ” and other internet-facing solutions, there were other options. We made sure VMs in the internet-facing environment were covered by Windows Server Update Services (WSUS) or, more recently, Operations Management Suite (OMS), so they stayed up to date and monitored.

For teams looking to move to a “modern” cloud solution like PaaS or SaaS, we encouraged other options rather than trying to duplicate past solutions. If an application is being refactored into a cloud-native service without an operating system (and thus without a SCOM/SCCM agent), we use modern monitoring solutions like Application Insights and Azure Monitor.
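As a rough illustration, the choices described so far can be written down as a small decision function. This is only a sketch of the reasoning in this post, not a tool we ran: the names `Zone`, `Workload`, and `management_tooling` are hypothetical, and the mapping simply restates the options mentioned above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Zone(Enum):
    """Security zones named in the post (hypothetical enum for illustration)."""
    CORP = "Corp"  # on-premises, or IaaS connected to Corp via ExpressRoute
    DMZ = "DMZ"    # internet-facing


@dataclass
class Workload:
    name: str
    zone: Zone
    has_os: bool  # False for PaaS/SaaS services with no OS to host an agent


def management_tooling(workload: Workload) -> dict[str, Optional[str]]:
    """Map a workload to the monitoring and patching options from the post."""
    if not workload.has_os:
        # No operating system means no SCOM/SCCM agent and nothing to patch.
        return {"monitoring": "Application Insights / Azure Monitor",
                "patching": None}
    if workload.zone is Zone.CORP:
        # Corp-connected hosts keep using the centralized System Center tools.
        return {"monitoring": "SCOM", "patching": "SCCM"}
    # Internet-facing VMs are covered by WSUS or OMS instead.
    return {"monitoring": "OMS", "patching": "WSUS or OMS"}


if __name__ == "__main__":
    for w in (
        Workload("lifted-vm", Zone.CORP, has_os=True),
        Workload("public-web-vm", Zone.DMZ, has_os=True),
        Workload("refactored-api", Zone.DMZ, has_os=False),
    ):
        print(w.name, management_tooling(w))
```

The operating-system check comes first because a service with no OS cannot host an agent in either zone, so the zone only matters for VMs and physical hosts.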

Thus, we built the plane while flying it. Today, Microsoft Core Services Engineering (CSE, formerly Microsoft IT) still operates a SCOM and SCCM environment in Corp, which many teams continue to use. All our physical machines and VMs on-premises (and now Corp-connected IaaS) continue to subscribe to those services. We’ve also shared best practices so internal disciplines use modern monitoring solutions for newer applications. One team is already using OMS for management, leveraging its Log Analytics and Update Management features for improved operations.

In the journey to hybrid operations, we had to learn to be flexible about management solutions because there are more options than just the simple “OS Patch/Monitor” that we lived with for years. This transition also changed the way we handle traditional Information Technology Infrastructure Library (ITIL) change and incident management, which brought a new set of challenges as we trekked further into the cloud. I’ll go into those next time.

Watch cloud experts from Microsoft share candid insights and best practices about how we design, manage, and support cloud solutions at Microsoft.

Learn more about monitoring and managing cloud infrastructure with Operations Management Suite (OMS).
