Microsoft IT maintains a private cloud that includes dedicated storage and thousands of virtual machines. With Windows Server 2016, we use Storage Quality of Service (QoS) to manage storage usage and monitor performance. Policies in Storage QoS prevent one virtual machine from dominating storage I/O within a cluster, standardize storage I/O, and give us insight into storage I/O across the environment.

At Microsoft IT, we support thousands of virtual machines in our IT environment. In our datacenters, we use Hyper-V in Windows Server for virtualization infrastructure in our private cloud. Recently, we examined Storage Quality of Service (QoS) in Windows Server 2016 to help solve some storage management issues in our virtual environment. By using Storage QoS, we prevent storage I/O from being dominated by a single virtual machine, provide standardized storage I/O, and gain insight into storage I/O across our virtual machine environment.

Our virtualization infrastructure includes dedicated storage appliances. Our most common configuration is four Cluster Shared Volumes (CSVs) per Hyper-V cluster, which provide storage for the cluster’s virtual machines. Currently, we set storage QoS for each CSV within the storage hardware appliances. Although this provides some level of storage QoS, it isn’t as effective as we’d like. Typically, between 50 and 75 virtual machines use a single CSV for storage. One or more of those virtual machines could consume more storage I/O than they should, reducing the storage I/O available to the rest of the virtual machines that use the cluster.

Understanding Storage QoS in Windows Server 2016

In Windows Server 2016, Storage QoS can centrally manage and monitor storage performance for Hyper-V servers and the virtual machines that they host. Storage QoS is built into the Hyper-V role in Windows Server 2016. It can be used with either a Scale-Out File Server or traditional block storage in the form of CSVs. Storage QoS is represented as a cluster resource in Failover Cluster Manager and is managed directly by the failover cluster. Once virtual machines are using the Scale-Out File Server or CSV, Storage QoS monitors and tracks their storage flows.

Using policies to control storage usage

We use policies in Storage QoS to apply rules and limitations to the storage I/O consumed by Hyper-V virtual machines. We can create and apply policies to multiple virtual machines, and virtual machines can be affected by multiple policies. We can also use policies to set standard levels for minimum and maximum storage throughput for individual virtual machines, or specific virtual machine groupings. Storage QoS is designed to:

  • Prevent “bullies” and “victims” in storage use. Virtual machines that consume a large amount of storage I/O can prevent other virtual machines from accessing the amount of storage I/O they need. In this case, storage “bully” virtual machines restrict the availability of storage I/O to “victim” virtual machines.
  • Manage storage I/O with policies. Using Storage QoS policies, we can set performance minimums and maximums at either the virtual machine or virtual disk level. Policies can trigger alerts that notify us when virtual machines are out of compliance with a policy.
  • Monitor storage performance. We can use Storage QoS to monitor performance details of both Hyper-V host machines and the virtual machines they host.
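
To make the “bully” and “victim” idea concrete, here is a minimal sketch in Python of how a minimum and maximum could shape the I/O each virtual machine receives. It is a toy model, not the algorithm Windows Server 2016 uses internally. The Policy class, the allocate function, the virtual machine names, and all IOPS figures are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy: a floor and a ceiling, both in IOPS."""
    min_iops: int
    max_iops: int


def allocate(capacity: int, demands: dict[str, int], policy: Policy) -> dict[str, int]:
    """Share `capacity` IOPS among virtual machines under one policy.

    Every demand is first capped at the policy maximum. If the capped
    demands still exceed capacity, each virtual machine is granted up to
    the policy minimum, and the remaining capacity is split in proportion
    to the demand that is still unmet.
    """
    capped = {vm: min(demand, policy.max_iops) for vm, demand in demands.items()}
    if sum(capped.values()) <= capacity:
        return capped

    granted = {vm: min(demand, policy.min_iops) for vm, demand in capped.items()}
    remaining = capacity - sum(granted.values())
    if remaining < 0:
        # Assumption of this sketch: the minimums must fit within capacity.
        raise ValueError("capacity is too small to honor every minimum")

    unmet = {vm: capped[vm] - granted[vm] for vm in capped}
    total_unmet = sum(unmet.values())
    for vm in granted:
        # Integer division keeps the total at or below capacity.
        granted[vm] += remaining * unmet[vm] // total_unmet
    return granted


# Hypothetical workload: one heavy consumer and two ordinary virtual machines.
demands = {"bully": 9000, "vm-a": 1500, "vm-b": 1500}
policy = Policy(min_iops=500, max_iops=5000)

# With 10,000 IOPS available, the bully is capped at 5000 and the
# other two virtual machines receive their full 1500.
print(allocate(10_000, demands, policy))
```

In this model the maximum is what stops the heavy consumer, and the minimum only matters once the storage is oversubscribed. In Windows Server 2016 itself, policies are created and assigned with PowerShell cmdlets such as New-StorageQosPolicy rather than with code like this.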

Implementing Storage QoS

Implementing Storage QoS meant upgrading some of our Hyper-V host servers to Windows Server 2016. By then configuring several policies, we applied Storage QoS behavior to our virtual machines. Using Storage QoS in Windows Server 2016 in this way, we achieved three primary goals:

  • Prevent storage I/O from being dominated by one virtual machine. We control Storage QoS at the virtual hard disk level rather than the virtual machine level to give us the most precise control possible. A virtual machine with multiple disks can therefore have a different storage I/O allowance on each disk. This also prevents any single virtual machine from consuming more storage I/O than it should. Storage I/O is available at a reasonable rate to all virtual machines. There are no longer “victims” and “bullies.”
  • Provide standardized storage I/O for virtual machines. To prevent storage I/O bottlenecks, we established standardized storage I/O service levels that we defined and then provided through policies at the virtual hard disk level. By establishing several different service levels with varying storage I/O rates, we can adequately restrict storage I/O consumption while still ensuring that the necessary storage I/O is provided to virtual machines that need it.
  • Gain insight into storage I/O use across our virtual machine environment. By enforcing Storage QoS, we also get simple monitoring of storage flows. This gives us insight into how each virtual machine consumes storage I/O.
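
The combination of per-disk service levels and per-virtual-machine insight can be sketched as follows. This is an illustrative Python model under stated assumptions, not our production tooling: the service level names, their IOPS ranges, the DiskFlow record, and the status labels are all hypothetical, and the sketch assumes that both requested and observed IOPS are known for each virtual hard disk.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical service levels: name -> (minimum IOPS, maximum IOPS).
SERVICE_LEVELS = {
    "standard": (100, 500),
    "performance": (500, 2000),
}


@dataclass(frozen=True)
class DiskFlow:
    """One virtual hard disk and the I/O it asked for and received."""
    vm: str
    disk: str
    level: str
    requested_iops: int
    observed_iops: int


def disk_status(flow: DiskFlow) -> str:
    """Classify a disk against the service level assigned to it."""
    minimum, maximum = SERVICE_LEVELS[flow.level]
    # A disk is short-changed only if it wanted I/O and got less than
    # the lower of its request and its guaranteed minimum.
    if flow.observed_iops < min(flow.requested_iops, minimum):
        return "below_minimum"
    if flow.requested_iops > maximum:
        return "capped"
    return "ok"


def summarize(flows: list[DiskFlow]) -> dict[str, dict]:
    """Roll per-disk flows up into one entry per virtual machine."""
    report: dict[str, dict] = defaultdict(lambda: {"total_iops": 0, "disks": {}})
    for flow in flows:
        entry = report[flow.vm]
        entry["total_iops"] += flow.observed_iops
        entry["disks"][flow.disk] = disk_status(flow)
    return dict(report)


# Hypothetical sample: vm-a has two disks on different service levels.
flows = [
    DiskFlow("vm-a", "os.vhdx", "standard", 300, 300),
    DiskFlow("vm-a", "data.vhdx", "performance", 5000, 2000),
    DiskFlow("vm-b", "os.vhdx", "standard", 400, 60),
]

for vm, entry in summarize(flows).items():
    print(vm, entry["total_iops"], entry["disks"])
```

Keeping the service level on the disk rather than on the virtual machine is what lets a single virtual machine mix a modest operating system disk with a faster data disk, while the roll-up still answers the per-virtual-machine question.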

For more information

Storage Quality of Service

Microsoft IT

microsoft.com/itshowcase

 

© 2019 Microsoft Corporation. All rights reserved. Microsoft and Windows are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries. The names of actual companies and products mentioned herein may be the trademarks of their respective owners. This document is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS SUMMARY.

