At Microsoft IT, we support thousands of virtual machines in our IT environment. In our datacenters, we use Hyper-V in Windows Server for virtualization infrastructure in our private cloud. Recently, we examined Storage Quality of Service (QoS) in Windows Server 2016 to help solve some storage management issues in our virtual environment. By using Storage QoS, we prevent storage I/O from being dominated by a single virtual machine, provide standardized storage I/O, and gain insight into storage I/O across our virtual machine environment.
Our virtualization infrastructure includes dedicated storage appliances. The most common configuration we use is four Cluster Shared Volumes (CSVs) per Hyper-V cluster, which provide storage for several virtual machines. Currently, we set storage QoS for each CSV within the storage hardware appliances. Although this provides some level of storage QoS, it isn’t as effective as we’d like. Typically, we have between 50 and 75 virtual machines using a single CSV for storage. Because the limit applies to the CSV as a whole, one or more virtual machines can consume more storage I/O than they should, which reduces the storage I/O available to the rest of the virtual machines that use the cluster.
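To make the contention problem concrete, here is a minimal Python sketch. It assumes, purely for illustration, that a saturated CSV divides its throughput in proportion to what each virtual machine requests. The capacity figure, the demand figures, and the function name are all hypothetical and are not measurements from our environment.

```python
def uncapped_shares(demands, capacity):
    """Split capacity in proportion to demand when no per-VM limit exists.

    demands:  {vm_name: IOPS the VM is trying to issue}
    capacity: total IOPS the shared volume can deliver (hypothetical)
    """
    total = sum(demands.values())
    if total <= capacity:
        # Enough throughput for everyone; each VM gets what it asks for.
        return dict(demands)
    # Volume is saturated: heavier requesters take a proportionally larger cut.
    return {vm: capacity * wanted / total for vm, wanted in demands.items()}


# Hypothetical scenario: 60 VMs on one CSV that can deliver 30,000 IOPS.
demands = {f"vm{i:02d}": 400 for i in range(59)}  # 59 ordinary VMs
demands["bully"] = 20_000                         # one very busy VM

shares = uncapped_shares(demands, 30_000)
print(f"bully receives {shares['bully']:.0f} IOPS")
print(f"each ordinary VM receives {shares['vm00']:.0f} of the 400 IOPS it wants")
```

Under these assumed numbers, a single busy virtual machine takes a large fraction of the volume while every other virtual machine falls short of its modest demand, even though the per-CSV limit is never exceeded.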
Understanding Storage QoS in Windows Server 2016
In Windows Server 2016, Storage QoS can centrally manage and monitor storage performance for Hyper-V servers and the virtual machines that they host. Storage QoS is built into the Hyper-V role in Windows Server 2016. It can be used with either a Scale-Out File Server or traditional block storage in the form of CSVs. Storage QoS is represented as a cluster resource in Failover Cluster Manager and is managed directly by the failover cluster. Once virtual machines are using the Scale-Out File Server or CSV, Storage QoS monitors and tracks their storage flows.
Using policies to control storage usage
We use policies in Storage QoS to apply rules and limitations to the storage I/O consumed by Hyper-V virtual machines. We can create and apply policies to multiple virtual machines, and virtual machines can be affected by multiple policies. We can also use policies to set standard levels for minimum and maximum storage throughput for individual virtual machines, or specific virtual machine groupings. Storage QoS is designed to:
- Prevent “bullies” and “victims” in storage use. Virtual machines that consume a large amount of storage I/O can prevent other virtual machines from accessing the amount of storage I/O they need. In this case, storage “bully” virtual machines restrict the availability of storage I/O to “victim” virtual machines.
- Manage storage I/O with policies. Using Storage QoS policies, we can set performance minimums and maximums at either the virtual machine level or the virtual disk level. Policies can trigger alerts that notify us when virtual machines are out of compliance with a policy.
- Monitor storage performance. We can use Storage QoS to monitor performance details of both Hyper-V host machines and the virtual machines they host.
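The effect of policy minimums and maximums can be sketched with a small Python model. This is a simplified illustration of the idea, not the algorithm Windows Server uses: it first grants each virtual disk its minimum, then shares leftover capacity evenly without letting any disk exceed its maximum. The `Policy` class, the `allocate` function, and every number below are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical stand-in for a QoS policy with a floor and a cap."""
    name: str
    min_iops: int
    max_iops: int


def allocate(demands, policies, capacity):
    """Return {disk: granted IOPS} under min/max policies.

    demands:  {disk: IOPS the disk is trying to issue}
    policies: {disk: Policy}
    capacity: total IOPS available (hypothetical)
    """
    # A disk never receives more than it asks for or more than its policy cap.
    ceiling = {d: min(demands[d], policies[d].max_iops) for d in demands}
    # Step 1: honor the minimums first so no disk is starved.
    grant = {d: float(min(ceiling[d], policies[d].min_iops)) for d in demands}
    remaining = capacity - sum(grant.values())
    if remaining < 0:
        raise ValueError("policy minimums exceed available capacity")
    # Step 2: hand out what is left in equal rounds, stopping at each ceiling.
    active = [d for d in demands if grant[d] < ceiling[d]]
    while active and remaining > 1e-9:
        share = remaining / len(active)
        for d in active:
            extra = min(share, ceiling[d] - grant[d])
            grant[d] += extra
            remaining -= extra
        active = [d for d in active if ceiling[d] - grant[d] > 1e-9]
    return grant


# Hypothetical scenario: ten virtual disks share 10,000 IOPS under one policy.
standard = Policy("standard", min_iops=300, max_iops=2000)
demands = {f"disk{i}": 500 for i in range(9)}
demands["bully"] = 20_000
policies = {d: standard for d in demands}

grant = allocate(demands, policies, capacity=10_000)
print(grant["bully"], grant["disk0"])
```

In this model the busy disk is held to the policy maximum, and the other disks receive everything they ask for because the maximum leaves capacity free for them.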
Implementing Storage QoS
Implementing Storage QoS in Windows Server 2016 meant upgrading some of our Hyper-V host servers to Windows Server 2016. By configuring several policies, we could then apply Storage QoS behavior to our virtual machines. By using Storage QoS in Windows Server 2016 in this way, we achieved three primary goals:
- Prevent storage I/O from being dominated by one virtual machine. We control Storage QoS at the virtual hard disk level rather than the virtual machine level to give us the most precise control possible. This lets us assign different storage I/O limits to the individual disks of a virtual machine that has multiple disks. It also prevents any single virtual machine from consuming more storage I/O than it should. Storage I/O is available at a reasonable rate to all virtual machines. There are no longer “bullies” and “victims.”
- Provide standardized storage I/O for virtual machines. To prevent storage I/O bottlenecks, we established standardized storage I/O service levels that we defined and then provided through policies at the virtual hard disk level. By establishing several different service levels with varying storage I/O rates, we can adequately restrict storage I/O consumption while still ensuring that the necessary storage I/O is provided to virtual machines that need it.
- Gain insight into storage I/O use across our virtual machine environment. Because Storage QoS tracks the storage I/O of every virtual hard disk that it governs, enforcing it also gives us simple monitoring. This gives us insight into how storage I/O is being consumed by each virtual machine.
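The combination of per-disk service levels and monitoring can be illustrated with a short Python sketch. It is only a model of the idea: the tier names, the IOPS figures, the record layout, and the `report` function are hypothetical, and the sketch does not call any Windows Server interface. Each record describes one virtual hard disk, the service level assigned to it, the IOPS it was observed to receive, and the IOPS it was trying to issue.

```python
from collections import defaultdict

# Hypothetical service levels: name -> (minimum IOPS, maximum IOPS).
TIERS = {
    "bronze": (100, 500),
    "silver": (300, 2000),
    "gold": (1000, 5000),
}


def report(flows):
    """Summarize observed I/O per VM and flag disks that miss their minimum.

    flows: list of dicts with keys vm, disk, tier, iops (observed), and
           demand (IOPS the disk was trying to issue).
    Returns (per-VM observed IOPS totals, list of flagged (vm, disk) pairs).
    """
    per_vm = defaultdict(float)
    flagged = []
    for flow in flows:
        minimum, _maximum = TIERS[flow["tier"]]
        per_vm[flow["vm"]] += flow["iops"]
        # A quiet disk below its minimum is fine; a disk that wants more
        # I/O but is held below its minimum is out of compliance.
        if flow["iops"] < minimum and flow["demand"] > flow["iops"]:
            flagged.append((flow["vm"], flow["disk"]))
    return dict(per_vm), flagged


# Hypothetical sample: one VM has two disks on different service levels.
flows = [
    {"vm": "sql01", "disk": "data.vhdx", "tier": "gold", "iops": 4200, "demand": 4200},
    {"vm": "sql01", "disk": "os.vhdx", "tier": "bronze", "iops": 80, "demand": 80},
    {"vm": "web07", "disk": "os.vhdx", "tier": "silver", "iops": 150, "demand": 900},
]

totals, flagged = report(flows)
print(totals)
print(flagged)
```

Keeping the service level on the disk record rather than on the virtual machine mirrors the choice to apply policies at the virtual hard disk level: a data disk and an operating system disk on the same virtual machine can sit in different tiers, and the per-VM view is just a sum over its disks.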
© 2019 Microsoft Corporation. All rights reserved. Microsoft and Windows are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries. The names of actual companies and products mentioned herein may be the trademarks of their respective owners. This document is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS SUMMARY.