Training
Certifications
Books
Special Offers
Community




 
Microsoft® SQL Server 2000™ Performance Tuning Technical Reference
Author Edward Whalen, Marcilina Garcia, Steve Adrien DeLuca, Dean Thompson
Pages 464
Disk N/A
Level Int/Adv
Published 07/03/2001
ISBN 9780735612709
Price $49.99
To see this book's discounted price, select a reseller below.
 

More Information

About the Book
Table of Contents
Sample Chapter
Index
Related Series
Related Books
About the Author

Support: Book & CD

Rate this book
Barnes Noble Amazon Quantum Books

 


Chapter 5: New Features and Performance Enhancements  continued


Federated Servers with Distributed Partitioned Views

In SQL Server 2000, you can spread databases across multiple servers, called federated servers, which make up a federation, for greater processing scalability. If your system provides services for large database systems or large Web sites, you might need the processing power of multiple servers to support the transaction load. Distributing processing over multiple servers is called scaling out, (as opposed to scaling up, which refers to upgrading to a single, more powerful computer). The servers in a federation are called member servers. Figure 5-1 shows a federation of three member servers.

Click to view graphic
Click to view graphic

Figure 5-1.  Federated servers.

You create federated servers by spreading database tables across a group of servers, which you do by creating distributed partitioned views. This involves horizontally partitioning a table into smaller tables, each of which holds a subset of the complete table data. The subset tables are called member tables, and they reside on the member servers. The member tables have the same schema on each of the member servers. The data ranges for member tables are defined by CHECK constraints on the partitioning column. After you create the member tables on the member servers, you create a distributed partitioned view with the same view name on each of the member servers—this view consists of a union of all the member tables. Each view makes it appear that each server has a full copy of the original table. Applications can reference the view rather than a specific member table; they don’t need to know which specific member server holds the requested data. You can make both SELECT and data modifications through the view.

Here’s an example of a distributed partitioned view across three federated servers, Server1, Server2, and Server3. We can split up a table with sales data into three smaller tables based on the three possible values in the partitioning column, Region_ID. The users on the system typically access data in only one of the three regions at a time, so we can split the data for this table and other related tables by region. The application can determine which region the user is referencing and route the user’s query to the appropriate member server. If all the data is on that member server, it is retrieved or modified without the application having to access any of the other member servers.

In our example we create three member tables, one on each of the federated servers, and distribute the data for those three member tables according to Region_ID. Specifically, we create a WesternRegionSales table on Server1, a CentralRegionSales table on Server2, and an EasternRegionSales table on Server3. Rows of sales data with a Region_ID of 1 go into the WesternRegionSales table, with a Region_ID of 2 into the CentralRegionSales table, with a Region_ID of 3 into the EasternRegionSales table. Next, we create a distributed partitioned view that encompasses all the member tables, so that applications can access data without having to specify in which of the three tables the data resides. The statement that creates the view on Server1 looks like this:

CREATE VIEW SALES_BY_REGION
AS
SELECT * FROM WesternRegionSales
UNION ALL
SELECT * FROM Server2.SalesDB.dbo.CentralRegionSales
UNION ALL
SELECT * FROM Server3.SalesDB.dbo.EasternRegionSales
GO

On Server2 the view is created as follows:

CREATE VIEW SALES_BY_REGION
AS
SELECT * FROM CentralRegionSales
UNION ALL
SELECT * FROM Server1.SalesDB.dbo.WesternRegionSales
UNION ALL
SELECT * FROM Server3.SalesDB.dbo.EasternRegionSales
GO

And on Server3 the view is created as follows:

CREATE VIEW SALES_BY_REGION
AS
SELECT * FROM EasternRegionSales
UNION ALL
SELECT * FROM Server1.SalesDB.dbo.WesternRegionSales
UNION ALL
SELECT * FROM Server2.SalesDB.dbo.CentralRegionSales
GO

If an application executes a SQL statement that references the Sales_By_Region view, and the application identifies the user as a region 1 user, it routes the request to Server1, although the user cannot tell where the data is located. If all the data requested by the user is found on Server1, there is no need to access Server2 or Server3—the ideal and most efficient model for federated servers. If, on the other hand, the requested data resides in member tables on more than one of the member servers, the SQL Server instance on Server1 generates a distributed query that pulls the result rows in from the other servers. This requires more overhead and processing power, and causes more contention on the tables.

Each member table requires a CHECK constraint, which you can define when you create the table, or alter later. When a SELECT statement is executed that references the partitioned view and contains the partitioning column in the search condition, SQL Server uses the CHECK constraints to determine which member servers contain the resulting rows, and searches only those servers. The CHECK constraint also allows only appropriate data to be inserted or updated in each member table. Without CHECK constraints, SQL Server still retrieves the data, but it must search each member table on each member server, causing unnecessary processing, and the application must be coded to check any modifications to ensure that they go to the appropriate server. This example of a CHECK constraint on Server1 indicates and ensures that only rows with Region_ID of 1 reside on that member server:

ALTER TABLE WesternRegionSales ADD CONSTRAINT Region_Check 
CHECK (Region_ID = 1)

On Server2 the constraint looks like this:

ALTER TABLE CentralRegionSales ADD CONSTRAINT Region_Check 
CHECK (Region_ID = 2)

And on Server3:

ALTER TABLE EasternRegionSales ADD CONSTRAINT Region_Check 
CHECK (Region_ID = 3)

Partitioned views existed in SQL Server 7.0, but you could not update them, so you could use them only for read-only data. You can update views in SQL Server 2000, which means that when you update, insert, or delete data through the partitioned view, the data is modified on the appropriate member server automatically without the user having to know on which server the data lies. Also, SQL Server 2000 has an improved query optimizer that minimizes the amount of data transferred for distributed queries.


NOTE:
For a detailed description of the rules concerning member tables and partitioning columns, and requirements concerning inserting, updating, and deleting with partitioned views, see "Creating a Partitioned View" in SQL Server Books Online, the online documentation provided with Microsoft SQL Server 2000. We recommend that you read these rules before designing your member tables.

We’ve talked about partitioning one table of a database, but what about the other tables? Ideally you partition as many tables as possible to maximize parallel processing across the federation of servers, and you locate related data on each member server to avoid distributed queries. In our example, related tables would include any more tables with data that could be partitioned by region ID, so that any data requested for a particular region would be found on the same server.

But there will probably be some tables that all users need full access to. These tables will not be part of a partitioned view. Instead, you can place full copies of these tables on each of the member servers, where they are accessed by their table name rather than a view name. You ensure that each member server gets any modifications to the data in these tables by creating INSTEAD OF triggers on each member server that perform a distributed query to update the data on other member servers. For example, say a table called Products stores the products for sale in all regions. This table exists in its entirety on each of the member servers. To update this table and keep data consistent across the member servers, you could create a trigger as follows on Server1, Server2, and Server3:

CREATE TRIGGER trig_products ON Products 
INSTEAD OF UPDATE, INSERT, DELETE AS
BEGIN
DELETE Prod FROM Server1.SalesDB.dbo.Products Prod, DELETED D
WHERE Prod.Product_ID = D.Product_ID
INSERT Server1.SalesDB.dbo.Products SELECT * FROM INSERTED

DELETE Prod FROM Server2.SalesDB.dbo.Products Prod, DELETED D 
WHERE Prod.Product_ID = D.Product_ID
INSERT Server2.SalesDB.dbo.Products SELECT * FROM INSERTED

DELETE Prod FROM Server3.SalesDB.dbo.Products Prod, DELETED D 
WHERE Prod.Product_ID = D.Product_ID
INSERT Server3.SalesDB.dbo.Products SELECT * FROM INSERTED
END

As you can see, this trigger causes distributed processing to occur in order to complete the data modification on all of the member servers. To repeat, to get optimal performance with a federation, you should minimize the amount of distributed processing. The more data transfer between the federated servers, the slower the federation performance. It is best to distribute data between the member tables in the federation such that users only need to read or modify data from the one member server they access directly.

For federated servers to be an option for your system, your databases must have certain characteristics. If your databases are somewhat complex, it can be quite difficult to redesign them to be distributed in a way that fits the federated servers model. If you are designing your databases from the starting point, it might be an easier task because you can plan your table schemas for distribution ahead of time. If implementing federated servers is not an option for you, another option might be to offload certain tasks to another server which has a copy of the entire database, and use replication between servers to keep data synchronized. This is commonly done when separating OLTP tasks from reporting tasks.


Previous   |  Table of Contents   |   Next




Top of Page


Last Updated: Saturday, July 7, 2001