2 min read

Disk and File Layout for SQL Server

Guest blog post by Paul Galjan, Microsoft Applications specialist, EMC Corporation. To learn more from Paul and other EMC experts, please join our Everything Microsoft Community.

The RAID group is dead – long live the storage pool!   Pools fulfill the real promise of centralized storage – the elimination of storage silos.  Prior to pool technology, when you deployed centralized storage you simply moved the storage silo from the host to within the array.  You gained some efficiencies, but it wasn’t complete.  Pools are now common across the storage industry, and you can even create them within Windows 2012, where they are called “Storage Spaces.”  This post is about how you allocate logical disks (LUNs) from pools so that you can maintain the visibility into performance.  The methods described can be used with any underlying pool technology.

To give some context, here’s how storage arrays were typically laid out 10 years ago.

Layout of storage arrays ten years ago

A single array could host multiple workloads (in this case, Exchange, SQL, and Oracle), but usually it stopped there – spindles (disks) would be dedicated to a workload.  There were all sorts of goodies like virtual LUN migration that allowed you to seamlessly move workloads between the silos (RAID groups) within the array, but those silos were still there.  If you ran out of resources for Exchange, and had some spare resources assigned to SQL Server, then you’d have to go through gyrations to move those resources.  For contrast, this is how pool technology works:

Pool Technology

All the workloads are sharing the same physical resources.  When you run out of resources (either performance or capacity) you just add more. The method is really enabled by automatic tiering and extended cache techniques.  So the popularity of pool technology is understandable. Increasingly I see VNX and VMAX customers happily running with just one or two pools per array.

The question here is this: if you’re not segregating the workload at the physical resource level, is there any need to segregate the workloads at the logical level?  For example, if tempdb and my user databases are in a single pool of disk on the array, should I bother having them on multiple LUNs (Logical Disks) on the host?

If the database is performance sensitive, then the reason is “Yes.” If you don’t, you may have a difficult time troubleshooting problems down the road.  Take an example of a query that’s resulting in an extraordinarily large number of IOs.  If your tempdb is on the same LUN as your user databases, then you really don’t know where those IOs are destined for.  It also reduces your ability to potentially deal with problems.  Pools may be the default storage option, but they’re not perfect, and not all workloads are appropriate for pools.  Segregating workloads into separate LUNs allows me to move them between pools, in and out of RGs without interrupting the database.

So here’s my default starting layout for any performance sensitive SQL Server environment:

  • Disk 1: OS/SQL Binaries
  • Disk 2: System databases (aside from tempdb)
  • Disk 3: tempdb
  • Disk 4: User databases
  • Disk 5: User DB transaction logs

This allows me to get a good view of things just from perfmon.  I can tell generally where the IO is going (user DBs, master, tempdb, logs etc), and if I need to move things around, I can do so pretty easily.