Optimizing the cloud

Saving millions by fine-tuning Microsoft Azure usage

Aug 31, 2017

Operating in the cloud brings about a whole new set of challenges. But strangely enough, they look very similar to the ones we used to face in previous years. While we solve them in different ways and with different approaches, in the end, solving grand challenges with technology will always have similar goals–to make people’s lives easier and more productive.

My name is Rick Ochs, and I work in Microsoft Core Services Engineering (formerly Microsoft IT), focusing on cloud adoption and cost optimization. I’m one of many engineers who have been part of the very large transformation of re-inventing IT systems and creating new ways to solve problems. As we’ve migrated and rebuilt the bulk of our systems on cloud platforms, my personal passion for designing and operating systems efficiently in the cloud has grown.

In past roles, I used to tediously build application environments, one by one, weeks at a time. Now, I get to spend my time working with incredibly talented teams across our organization to reinvent how we efficiently manage our infrastructure: not one server at a time, but by embracing new concepts that drive ecosystem-wide solutions and can save millions of dollars.

Saving millions of dollars? What am I talking about?

Our team took these new cloud cost saving concepts and baked them into an optimization reporting tool that categorized cost savings by resource type and organization. It has helped drive a huge cultural shift towards individual team accountability with cloud spending. We proudly call our dashboard ARO, which stands for Azure Resource Optimization. I look forward to talking more about our lessons learned in future blog posts.

We’ve made significant progress since we last shared our learnings in this case study, which was published just a few months ago. Now, we have open source versions of our “snooze” tooling, which allows our people to turn servers off and on via native Azure automation. This has helped to drastically reduce cloud spend both inside and outside Microsoft. Within our organization, we’ve saved incredible amounts of money by breaking out of the datacenter mindset, and we’re passionate about bringing these ideas to other engineering teams around the world.

Earlier this summer, it was announced that Microsoft had purchased Cloudyn. I couldn’t be more excited–their cost reporting experience is second to none. Cloudyn helps customers budget, track spending and trends, and report cloud behavior in ways that empower people to really own their cloud deployments with great insight. When leaders can bring clear business intelligence into their operations and investments, it creates confidence and unlocks the ability to create a vision. It’s tough to build a strong cloud strategy without world-class reporting. You need the ability to tune where and how you spend your budget, and maximize each dollar spent.

Earlier this morning my daughter asked me, “Daddy, why do you love your job so much?”

I lit up like a Christmas tree when she asked. Not just because she recognized I love what I do (and I hope she learns from the example), but because I had the opportunity to tell her about my work.

“The more we can discover how to get the most out of computers for the lowest number of dollars, the more people can afford to use computers to solve really cool problems,” I told her, or something similar. “Like the DNA Premonition project, where we are trying to track viruses across the globe, so we can prevent people from getting sick. It takes huge computers lots of time to process all that information. As we figure out how to make computers even cheaper for people, they can change the world even faster!”

(Shameless plug: I’m very honored to say that the consulting team I work on helped architect Premonition.)

Near the beginning of our cloud journey, the things we learned were simple. Turn off your servers when they are not in use (very similar to your dad telling you to keep the door shut so you don’t heat the neighborhood!). Don’t pick servers too big for your project; this is akin to sitting at a restaurant and your mom asking you, “Are you sure you can eat that entire pizza?” Turning those concepts into second nature takes time, but once they are ingrained in the cultural knowledge of your engineering groups, the cost savings are immense.
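To make those two lessons concrete, here is a rough back-of-the-envelope sketch in Python. It is only an illustration: the function names, the hourly rates, and the schedule are all made-up assumptions, not Azure prices, and it is not how ARO or the snooze tooling actually works. It simply assumes a server is billed per hour that it is running.

```python
# Illustrative only: every rate and schedule here is a made-up number,
# not an Azure price or a figure from this post.

HOURS_PER_WEEK = 24 * 7  # 168 hours in an always-on week


def weekly_cost(hourly_rate, hours_on=HOURS_PER_WEEK):
    """Cost of one server for a week, assuming billing only for hours it runs."""
    return hourly_rate * hours_on


def snooze_savings(hourly_rate, hours_on):
    """Weekly savings from turning a server off outside the hours it is needed."""
    return weekly_cost(hourly_rate) - weekly_cost(hourly_rate, hours_on)


def rightsize_savings(current_rate, smaller_rate, hours_on=HOURS_PER_WEEK):
    """Weekly savings from moving to a smaller, cheaper server size."""
    return weekly_cost(current_rate, hours_on) - weekly_cost(smaller_rate, hours_on)


if __name__ == "__main__":
    # Hypothetical dev server: 0.50 per hour, needed 12 hours a day, 5 days a week.
    rate = 0.50
    hours_needed = 12 * 5

    print("Saved by snoozing:", snooze_savings(rate, hours_needed))
    # Hypothetical smaller size at half the hourly rate, on the same schedule.
    print("Saved by right-sizing:", rightsize_savings(rate, 0.25, hours_needed))
```

Even with these toy numbers, snoozing a server that is only needed during working hours removes well over half of its weekly bill, and the effect multiplies across every server in an organization.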

Now that we’ve gotten fairly decent at doing the small things, what’s next? PaaS? Serverless? Yes. All of it. Nothing is off the table for what we can use to do more with less! We have so much more to learn from the cloud. Machine learning, trending usage patterns and automatic sizing, predictive analysis, oh my! We are just getting started.

As Microsoft Core Services Engineering continues to pioneer new ways to operate in the cloud, we’ll be here, sharing our story and empowering both our own company and, hopefully, yours. I’m looking forward to writing more about our cost optimization journey, so stay tuned!