My excellent cloud adventure: how old habits and bad behaviors can torpedo your cloud move

May 4, 2017

When Microsoft IT took on the challenge of moving our entire IT footprint to the cloud, I wasn’t surprised when a lot of people told me it couldn’t be done. But I was surprised at how the move turned many of our best engineers into the functional equivalent of teenagers.

Changing the rules of the game from owning and running a datacenter, with fixed assets paid for as capital expense, to managing IT in the cloud, with a running meter, effectively disrupts practices that have been the norm for 30-plus years. And suddenly, you find engineers behaving in ways that would probably drive their parents crazy. Metaphorically, they forget to turn out the lights when they leave a room, they leave the water running when they’re done in the bathroom, and they leave old pizza boxes underneath the bed, full of half-eaten pieces.

OK, our engineers are top notch and aren’t exactly a bunch of slackers. And I’m sure their parents couldn’t be more proud of them. I know I am. The truth is just that they’re doing what they’ve always been trained to do in a fixed-datacenter world. If you’re planning to transform your IT, you’ve got to be aware that you need to break these long-established practices and behaviors to be truly successful.

For decades, the cardinal rules for IT have always been: Be on time, be on scope and be on budget. Do what you have to do, but nothing can fall over. This led to the common practice of over-provisioning – finding the largest and most powerful hardware available and spending however many months it took to get it all in place. You were on budget, yes. But to all of your business partners you were insanely expensive. And since you had to be on scope and on time, you also padded the hell out of your schedules.

In a cloud world, this is all as obsolete as punch cards and tape drives. For IT leaders, then, the job is to be the driver of change. Let me share two experiences that highlight the challenge.

Recently, I noticed we were paying for five Azure P11 SQL databases, at a cost of five thousand dollars a month each, and I saw they were just dead. Empty. Nothing happening. I went to the team and asked, “Why is this out there?” They explained that a big new release was scheduled to deploy in two weeks. Fair enough. But how long did it take to provision them, I asked. “Oh, it was easy,” they said with pride. “Just a few minutes!” (In my head I was asking if this is what an aneurysm feels like.)
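To put a number on that moment, here is a minimal Python sketch of the arithmetic. The database count, the monthly price, and the two-week wait come from the story above; the 30-day billing month and the function name `idle_cost` are assumptions made for illustration, not actual Azure billing rules.

```python
def idle_cost(resources: int, monthly_price: float, idle_days: int,
              days_in_month: int = 30) -> float:
    """Estimate what it costs to keep resources running but unused.

    Assumes a flat monthly price prorated evenly per day (a simplification).
    """
    return resources * monthly_price * idle_days / days_in_month


# Five databases at $5,000 a month each, sitting empty for two weeks.
wasted = idle_cost(resources=5, monthly_price=5_000, idle_days=14)
print(f"Roughly ${wasted:,.0f} spent on databases nobody was using")
```

Under those assumptions, the two-week head start costs well over ten thousand dollars, set against a provisioning time of a few minutes.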

Another time, we were dealing with a procurement application called MyOrder, which gets extreme use at the end of the fiscal year. In June 2015, it fell over three times in one day, back when it was an on-premises app. When we moved it to the cloud, the team did the math to provision the right VM SKU to make sure it wouldn’t go down again, and they determined that five big VMs called D14s would be enough to handle peak load. But the week after the fiscal year close, I noticed that a collection of even bigger VMs, called G5s, had been deployed – and were still running (at a rate of $10,000 a month). When I asked them why, their answer was simply: “We were afraid.” (BTW, changing fear-based decision making takes a long time.)

This fear in the culture – the residual product of years of reward and punishment for behaviors that made them successful in an on-premises datacenter world – is perfectly understandable, and it carries through in practices across IT. The team needs to scale up, so they provision new VMs. But then they just walk out of the room and leave them running. Or they set up an environment, do some cool things with it, and leave it like a pizza box under the bed. In a recent sweep of subscriptions in my org, we found 32 terabytes of unattached storage – data that had been connected to VMs at one point but then, instead of being deleted, was left just sitting there, forgotten. Worse, it was sensitive data, which meant it represented a security risk as well. Needless to say, we got that dealt with right away.
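A sweep like the one described above boils down to a simple filter over an inventory. The sketch below is not the tooling we used; it assumes you have already exported a list of disks from your subscriptions, and the `Disk` record, its field names, and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Disk:
    # Hypothetical inventory record; real exports will have different fields.
    name: str
    size_gb: int
    attached_to: Optional[str]  # name of the owning VM, or None if orphaned


def find_unattached(disks: List[Disk]) -> List[Disk]:
    """Return the disks that are no longer connected to any VM."""
    return [d for d in disks if d.attached_to is None]


# Made-up sample inventory, purely for illustration.
inventory = [
    Disk("orders-data", 1024, "orders-vm-01"),
    Disk("old-test-env", 512, None),
    Disk("release-staging", 2048, None),
]

orphans = find_unattached(inventory)
total_gb = sum(d.size_gb for d in orphans)
for d in orphans:
    print(f"{d.name}: {d.size_gb} GB, unattached")
print(f"Total unattached: {total_gb} GB")
```

The point of keeping it this small is that it can run on a schedule. Anything it flags still needs a human to decide whether to delete or reattach it, especially when the data is sensitive.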

One way to think about managing the change is to consider the changing nature of your spend. When IT is on-premises, your costs are largely capital expenditures. You pay for hardware, and it gets depreciated whether you’re using it or not. But in the cloud, it’s all operational expense, like an electrical meter on your house or the water you get billed for. And if you leave things running and they’re not being used, you’re still paying for them.

And so, like a parent teaching a young person how to get ready for a new phase of life, you need to focus on teaching how to change old behaviors in all sorts of ways. Provision when you need to, scale up when the load demands it. But scale down when it’s done, and delete the VMs when you’re finished. And automate all of it. Shut off the lights, turn off the water, and clean up after yourself. Make the family proud. And then you’ll be able to begin focusing on all the new things you can do with all that power, and all the new resources that will open up when you truly master the new world of cloud. In future posts, we’ll talk about some of the amazing opportunities that operating in the cloud gives you.
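The "scale up when the load demands it, scale down when it’s done" rule above can be written down as a tiny policy function, which is the first step toward automating it. This is a sketch under simple assumptions: load and per-VM capacity are measured in the same unit (say, requests per second), and the function name, parameters, and sample numbers are all hypothetical rather than taken from any Azure autoscale feature.

```python
import math


def desired_vm_count(current_load: float, capacity_per_vm: float,
                     minimum: int = 0) -> int:
    """Return how many VMs the current load actually needs.

    Rounds up so peak load is covered, and drops back to `minimum`
    (possibly zero) once the load goes away.
    """
    if capacity_per_vm <= 0:
        raise ValueError("capacity_per_vm must be positive")
    return max(minimum, math.ceil(current_load / capacity_per_vm))


# Illustrative load samples: quiet, fiscal-year-end peak, quiet again.
for load in (30, 450, 0):
    count = desired_vm_count(load, capacity_per_vm=100, minimum=1)
    print(f"load={load:>3} -> run {count} VM(s)")
```

The design choice worth noticing is that the same function answers both questions. It tells you when to add capacity and, just as importantly, when to give it back, so nobody has to remember to turn off the lights.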

Related links:

Determination sets 8-year-old on path to save Microsoft millions of dollars

Optimizing resource efficiency in Microsoft Azure

Managing and optimizing resources for cloud computing at Microsoft