Microsoft optimizes its cloud consumption to help Microsoft Azure meet surging demand during COVID-19

Oct 7, 2020

COVID-19 has disrupted the way many people around the world work. Instead of going into an office, attending a meeting, or shaking hands with a colleague, employees are online more than ever before. Microsoft is no exception—and its IT and Operations division, Core Services Engineering and Operations (CSEO), is meeting that challenge directly.

The shift to working remotely has led to an astonishing boost to cloud consumption.

In June 2020, Microsoft shared that Microsoft Teams shot up to more than 75 million daily users, 200 million daily meeting participants, and 4.1 billion daily meeting minutes, all of which contributed to a broader increase in cloud traffic.

“That was a huge jump,” says Binu Surendranath, global process owner in Finance Payout at Microsoft. “We could meet that demand by building more datacenters—which we are doing—but we also felt that we could free up more capacity to existing datacenters by optimizing our usage of Azure internally here at Microsoft.”

One of the Microsoft groups that rose to this challenge was Microsoft Payment Central. This is a centralized onboarding and account maintenance platform for suppliers and partners working with Microsoft globally. It provides enterprise-scale services to help more than 200,000 payees and manages close to 1 million account profiles, capturing details such as contact, tax, and bank information to enable secure and compliant payments.

Working with CSEO, Payment Central made a substantial effort to reduce its cloud footprint through a pilot project. The benefits were two-fold. By optimizing its own use, the team’s project saved money and resources while also freeing server space for customers seeing their own cloud use soaring.

Notably, Payment Central undertook this optimization program shortly after completing a similar effort.

“We had just made a huge lift to make better use of our cloud resources,” says Snigdha Bora, engineering leader for CSEO’s Finance Engineering team. “But we realized there was more to be done. Customers were using their online resources to the maximum, and we wanted to make sure we kept up great service while reducing their usage costs.”

[Learn how Microsoft is making data a strategic asset.]

An engineering challenge

Pulling off this latest optimization wasn’t easy.

Microsoft engineers performed a deep inventory of Payment Central’s Microsoft Azure usage, traced areas with the highest cost and use, and developed algorithms designed to streamline their cloud use.

Changes were made across Microsoft Azure App Service, Azure Cosmos DB, Azure SQL Database, and other cloud-based services. These included choosing host locations carefully, configuring spending alerts with Microsoft Azure cost-analysis reports, right-sizing underused resources, and adopting the latest offerings from Microsoft Azure.
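
As a rough illustration of what right-sizing can look like, the sketch below flags resources whose average CPU utilization falls under a chosen threshold. It is not the team's actual algorithm: the `ResourceUsage` record, the `find_underused` helper, the 10 percent threshold, and the sample inventory are all assumptions made up for this example.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class ResourceUsage:
    """Hypothetical record: one resource and its recent CPU utilization samples (percent)."""
    name: str
    cpu_percent_samples: list[float]


def find_underused(resources, threshold_percent=10.0):
    """Return names of resources whose average CPU utilization is below the threshold."""
    return [
        r.name
        for r in resources
        # Skip resources with no samples instead of guessing their usage.
        if r.cpu_percent_samples and mean(r.cpu_percent_samples) < threshold_percent
    ]


# Illustrative inventory; the names and numbers are invented for this sketch.
inventory = [
    ResourceUsage("vm-reporting-01", [3.0, 4.5, 2.0, 5.5]),
    ResourceUsage("vm-payments-02", [55.0, 61.0, 48.0, 70.0]),
    ResourceUsage("vm-batch-03", [9.0, 12.0, 8.0, 7.0]),
]

candidates = find_underused(inventory)
print(candidates)
```

In practice the samples would come from monitoring data, and a flagged resource would be reviewed by its owner before being resized or shut down.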

The results of the effort are impressive.

The Call Center Management team, for instance, reduced its Microsoft Azure consumption by a startling 50 percent, with an accompanying reduction in cloud costs coming from shutting down inefficiently used virtual machines (VMs). As a part of this VM retirement, the team retired five production databases.

In addition, more than 1,000 stored procedures were retired as part of the optimization. In their place, the team added 23 notebooks to process the data in Azure Databricks and published 10 datasets. Because of this simplification in architecture, the team also reduced a DataMall publish to its reporting database.

“It was very satisfying to see that change,” Bora says. “We need to be very conservative in our consumption from a spending view, and we achieved that while also reducing our data footprint, which frees up resources for Azure customers.”

In addition to reducing its cloud footprint, the Payee Management team reduced its monthly Microsoft Azure subscription costs by 39 percent—a substantial margin.

Figure 1. How expenses dropped for a Microsoft team that sought to optimize its data usage.

After cloud use peaked in October 2019, the Payee Management team worked to drive down costs and data usage.

Pitching in across the board

Getting more from Microsoft Azure resources has been a widespread effort. For example, the Foundation Financial Services organization in Microsoft Finance cut thousands of dollars per month in cloud costs. They did it by:

  • Moving workloads off Azure App Service and hosting them as Azure Functions (a serverless architecture). That led to big savings compared with the previous platform, which fetched data from three different systems every five minutes.
  • Keeping Microsoft Azure Databricks workspaces in the same region as the storage to reduce data-transmission costs.
  • Reducing the storage needed by Databricks Delta tables by periodically running maintenance jobs that vacuum the tables.
  • Reducing costs by “snoozing” some Azure Analysis Services instances during non-business hours and weekends.
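
The “snoozing” idea in the last item can be sketched as a simple schedule check. This is a minimal illustration, not the organization's actual implementation: the 8:00 to 18:00 business hours, the function names, and the use of the server's local time are all assumptions.

```python
from datetime import datetime, time

BUSINESS_START = time(8, 0)   # assumed start of business hours
BUSINESS_END = time(18, 0)    # assumed end of business hours


def should_be_running(now: datetime) -> bool:
    """Return True during weekday business hours, False on nights and weekends."""
    is_weekday = now.weekday() < 5  # Monday is 0, Friday is 4
    in_hours = BUSINESS_START <= now.time() < BUSINESS_END
    return is_weekday and in_hours


def snoozed_fraction_of_week() -> float:
    """Fraction of the 168-hour week the service is paused under this schedule."""
    running_hours = 5 * (BUSINESS_END.hour - BUSINESS_START.hour)
    return 1 - running_hours / 168


print(should_be_running(datetime(2020, 10, 7, 10, 0)))   # a Wednesday morning
print(should_be_running(datetime(2020, 10, 10, 10, 0)))  # a Saturday morning
print(round(snoozed_fraction_of_week(), 2))
```

A scheduled job could call `should_be_running` and pause or resume the service accordingly; the pause and resume calls themselves are left out here. Under these assumed hours the service would sit idle for roughly 70 percent of the week, which is where the savings come from.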

More than dollar savings

Beyond the savings in data and dollars, Microsoft saw several other paybacks. Foremost among them was a benefit to customers.

“All this optimization we did helps our customers get more access to resources when they really need it,” Surendranath says. “Also, the money we save can be invested in other resources our customers can use.”

Moreover, Microsoft’s ability to get the most out of its cloud resources can serve as a model for customers who wish to emulate Microsoft’s best practices.

“Customers look to us to see how we run our own cloud infrastructure,” says James Gagnon, principal group DevOps engineering manager in CSEO. “By showing them where we’ve moved the needle significantly on cost, serverless computing, and security, customer confidence in Azure will increase.”

That helps Microsoft win market share because it shows customers that their cloud costs will be smartly optimized if they choose Microsoft Azure as their provider.

“If we can show them how we manage spend as we invest while still giving engineers the autonomy they need, that’s a very compelling story to tell,” Gagnon says.

On a more technical level, by reducing cloud consumption, servers, and computing instances, Microsoft reduces its security risk.

“Let’s say that we have some orphaned resources,” Gagnon says. “We’re not only spending money on them, but they’re probably not as secure as we’d like. By being diligent in cleaning up and consolidating resources, there are fewer objects to secure and therefore reduced risk—and consolidating or deleting their data, that becomes data that can’t be compromised.”
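
To make the orphaned-resources point concrete, here is a minimal sketch that scans an inventory for resources that are not attached to anything. The inventory format, the `attached_to` field, and the `find_orphans` helper are hypothetical and chosen only for illustration; a real cleanup would draw on the cloud provider's own resource inventory.

```python
def find_orphans(resources):
    """Return the names of resources with no attachment, sorted alphabetically."""
    # A missing or None "attached_to" value is treated as "not attached".
    return sorted(r["name"] for r in resources if r.get("attached_to") is None)


# Illustrative inventory; the names are invented for this sketch.
inventory = [
    {"name": "disk-a", "attached_to": "vm-1"},
    {"name": "disk-c", "attached_to": None},
    {"name": "disk-b"},
]

orphans = find_orphans(inventory)
print(orphans)
```

Each flagged resource is one more object to secure and pay for, so a list like this would be a starting point for review rather than for automatic deletion.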

Along with that comes an environmental benefit because Microsoft uses less energy to drive its servers. And its engineers—who now have fewer resources to monitor—can spend less time patching and updating, and more time on forward-looking initiatives that benefit Microsoft’s customers.

Realizing the potential of digital transformation

Internally, Microsoft employees saw once again how the company’s emphasis on digital transformation and teamwork has changed the way it does business.

“The level of trust we now have between the business side and the engineering side has grown exponentially,” Surendranath says. “I can say that because five years ago I was on the engineering side, and now have shifted to business. With the number of environments highly optimized, I can see changes in how the teams work together.”

The result is far more agility than Microsoft once exhibited.

Changes to major infrastructure that formerly required six months or more are often achieved in a few days, says Surendranath. That pays off when engineers and business managers hear that Microsoft CEO Satya Nadella has just been in Sweden and promised that a new datacenter will open in the coming months.

“When he says that, he’s giving a message to the business community and to the customer,” Surendranath says. “Those kinds of announcements used to send a shock down my nerves. But now we can do it without really boiling the ocean. That means my team focuses more on identifying opportunities, articulating the business value, negotiating with the business, and more.”

Technology never stops evolving. Employees at Microsoft, a leader in the cloud, understand that they must continually evolve to meet new customer challenges.

“Five years ago, people were OK with getting their deliveries in three days. But online shopping changed everything,” Bora says. “We can’t say to a business unit, ‘We got your request—come back and see us in three months.’ We have to be able to deliver in a matter of days. This agility is possible due to our use of Azure technologies and DevOps model.”
