SQL Server Appliances – A Workload-based Appliance Design Philosophy

I run the Appliance Engineering team for SQL Server. One of the questions I get asked most often about building appliances is how we go about designing a new one. We don’t start with a cool piece of hardware and figure out what we might be able to build out of it; rather, we start by understanding what an appliance needs to do and work our way to choosing the right hardware. Our general approach is what I like to call “workload-based appliance design,” and I thought I would share some of the thinking we have developed while engineering some of the new SQL Server appliances you may have heard about already, such as the HP Enterprise Data Warehouse (HPEDW) and the HP Business Decision Appliance (HPBDA), along with others you may not know about that we are just starting to talk about. This is not rocket science, but it is good engineering, and it allows us to work with our key hardware partners to build general purpose appliances at a lower total cost than you might expect given their capabilities.
W is for Workload: Let’s assume that we know we want to build an appliance for a specific workload and have identified that workload. For this discussion let’s choose the “Self-service BI” (SSBI) workload, targeting small to medium businesses or enterprise departments that want to use PowerPivot. From this starting point, our engineering effort kicks off by gaining a deep understanding of the workload specifics. We run the workload as we understand it on real hardware, which we call a design proxy, varying many parameters to understand workload variability. We talk with customers, consultants, MVPs, our own SQL CAT experts and the developers of the products we are thinking about using. From that collected expert knowledge we build a specific model for the workload; in the case of SSBI, that evolved into an automated workload we could run and measure. It can be tough to agree on general workload characteristics, but it is critical to gain that level of understanding so that specific tools can be built for the performance and testing work needed later.
A is for Architecture:  After a workload is understood, we survey which approaches or system architectures are appropriate for it. As you can imagine, for SSBI we looked at best practices related to PowerPivot, SharePoint, and SQL Server. We explored running the workload on the metal in different mixes of physical servers, and in VMs, splitting it up across multiple virtual servers. We also talked with customers about what capabilities they expected to find in a complete solution, and that led us to take an “ecosystem” approach, bringing all the components together into a single server. To narrow our approach, we ran an extensive battery of tests to see whether we could really get the architecture to work well.
S is for Software:   Once we have an approach we believe is sound, we start looking at how to build the solution and at the software components it requires. An SSBI workload as we have defined it needs many software components beyond the basic products, and determining exactly how to combine them takes some effort. Deciding what to enable by default and how to configure all the components so they work well together takes a great deal of iteration. I like to think of this process as learning how to set the 10,000 knobs that exist in the software, or at least establishing an initial setting for each. Reviewing those decisions with workload experts is a key activity at this stage of the process, and often we find that the “best solution” is not necessarily consistent with common “best practice”. At this point we are starting to add considerable value to the solution, value that is difficult for any single IT organization to create, since we are working directly with the world’s leading experts on all the components being utilized.
H is for Hardware:   The final step, selecting specific hardware, is an iterative process. For example, the SSBI workload is especially memory intensive, so selecting the proper amount of RAM for the system was an important decision. We bought our engineering prototype hardware with the maximum amount of available RAM, but through performance tuning and optimization we were able to reduce the total memory to 96GB without impacting the overall performance of our workload. Again, the resulting appliance hardware contains the knowledge of many experts: for example, we reviewed the configuration of our DIMMs with the engineers who designed the mainboard we are using, and we reviewed the RAID configuration with the team that built the RAID controller. This final stage is marked by rapid iteration on both hardware and software configuration, along with extensive performance and reliability testing, to reach a final configuration. We capture that configuration and deliver it with our hardware partners as an appliance.
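To make the memory-sizing step concrete, here is a minimal sketch of how one might pick the smallest RAM configuration that does not hurt workload performance. It is only an illustration, not the tooling our team used: the function name, the 2% tolerance, and every number in the sample data (including the 192GB starting point) are made-up placeholders, and the sketch assumes you already have one throughput measurement per RAM size from running the automated workload.

```python
from typing import Mapping


def smallest_sufficient_ram(
    throughput_by_ram_gb: Mapping[int, float],
    tolerance: float = 0.02,
) -> int:
    """Return the smallest RAM size (GB) that keeps throughput within
    `tolerance` of the largest tested configuration.

    Walks down from the largest size and stops at the first size that
    falls below the threshold, so a lucky result at a much smaller size
    is never chosen over a failing size above it.
    """
    if not throughput_by_ram_gb:
        raise ValueError("need at least one measurement")

    sizes = sorted(throughput_by_ram_gb, reverse=True)
    baseline = throughput_by_ram_gb[sizes[0]]  # max-RAM prototype result
    floor = baseline * (1.0 - tolerance)

    chosen = sizes[0]
    for ram_gb in sizes[1:]:
        if throughput_by_ram_gb[ram_gb] < floor:
            break  # performance is now impacted; stop shrinking
        chosen = ram_gb
    return chosen


# Hypothetical placeholder measurements (operations per minute), not real data.
sample_measurements = {192: 1000.0, 144: 995.0, 96: 990.0, 64: 820.0, 48: 610.0}

if __name__ == "__main__":
    print(smallest_sufficient_ram(sample_measurements))
```

With these placeholder numbers the sketch selects 96, because the 64GB run drops well below the 2% threshold. In practice the decision also depends on reliability testing and on which DIMM configurations the mainboard supports, which a simple threshold like this does not capture.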
When you think about the SQL Server appliance products, hopefully this will provide some context for how we create those products – our workload-centric engineering approach is important in making sure we can deliver a compelling product at a low total cost.    And when you need to explain why you think a specific SQL Server appliance might be a good solution for your organization’s workload needs, remember:  SQL Server Appliances have nothing to do with laundry, but we do use Workload, Architecture, Software, Hardware (WASH) as the basis for our engineering design process.
Britt Johnston
Principal Group Manager
SQL Server Appliance Engineering Team
Twitter: www.twitter.com/brittjohnston