Click Here to Install Silverlight*
IndiaChange|All Microsoft Sites
Microsoft
Communities 
 
Chat Transcript
 
Chat Topic : SQL Server 2005 Partitioning
Chat Expert : Vinod, SriSamp (MVP)
November 09, 2005
 
 

Subhashini  (Moderator):
Hi all, a very good evening to all of you !
 

Subhashini  (Moderator):
We shall begin the chat in the next 5 minutes
 

Subhashini  (Moderator):
Welcome to today's chat on SQL Server 2005 partitioning by Vinod M Kumar & Srinivas Sampath

Subhashini  (Moderator):
Vinod Kumar did his Bachelors of Engineering in Mechanical from Anna University, Chennai. After graduation, he joined SCT Software Solutions (then Exeter Systems) where he worked for about 4 years. He has worked for a team called Technology Labs which evaluates various Microsoft technologies and products and their relevance in this line of business. He is presently with Intel Technologies, Bangalore. He has been working on Microsoft Technologies all along his professional career and is very inclined towards learning every new concept that Microsoft comes out with. In this process, he has gathered tons of knowledge to share with all.

 

Catch all the resources and articles Vinod has written at www.ExtremeExperts.com  < http://www.ExtremeExperts.com >, the site he co-hosts. He has been very passionate about SQL Server programming and loves to experiment different programming tricks in SQL Server.

 

Vinod is an active volunteer in INETA and PASS. He has initiated a SQL Server community in Bangalore. To capture more ev

Subhashini  (Moderator):
To capture more events on SQL Server, join the group at < http://groups.msn.com/SQLBang > which was formed as an initiative to bring PASS community in India.

 

Email ID: vinodk@mvps.org  < mailto:vinodk@mvps.org >

Web Site: www.ExtremeExperts.com  < http://www.ExtremeExperts.com >


Subhashini  (Moderator):
Srinivas Sampath works for SCT Software Solutions(Bangalore). He graduated with a bachelors in Computer Technology(PSG College of Technology) and later pursued MCA (Madras University).He joined SCT (then called Exeter Systems) in 1997 and has been with them for the past 6 years. His area of expertise include database architectures (using SQL Server) and enterprise application architectures using Microsoft Technologies. Srinivas currently is a Product Development Manager and also head of Technology Labs, a group that focuses on latest Microsoft technologies and their applicability into their product lines. His interests include SQL Server 2000 programming, Office 2003 programming and .NET.

ssampath@sct.co.in  < mailto:ssampath@sct.co.in >

< http://www32.brinkster.com/srisamp >


Subhashini  (Moderator):
Just a few chat rules before we begin the chat

Subhashini  (Moderator):
Chat Procedures:

This chat will last for one hour. During this hour, our Experts will respond to as many questions as they can. Please understand that there may be some questions we cannot respond to due to lack of information or because the information is not yet public. We encourage you to submit questions for our Experts

 

Subhashini  (Moderator):
We ask that you stay on topic for the duration of the chat. This helps the Guests and Experts follow the conversation more easily. We invite you to ask off topic questions after this chat is over.


Subhashini  (Moderator):
Please use the check box " ask the experts" to ask your questions to the experts
 

Subhashini  (Moderator):
So lets get the chat rolling ..

Subhashini  (Moderator):
Hi Vinod & Srini

SriSamp [MVP] (Expert):
Welcome all.

Vinod_MVP (Expert):
Hi all ... Good Evening ...

SriSamp [MVP] (Expert):
Welcome to this chat on the new partitioning features of SQL Server 2005.

SriSamp [MVP] (Expert):
The concept of Partitioning of data is not new to SQL Server 2005. Methods were available in SQL Server 2000 (and earlier) also for partitioning of data.

SriSamp [MVP] (Expert):
What is new however, is the mechanism of partitioning and what all you can do with it.

SriSamp [MVP] (Expert):
Before we actually get into the details of partitioning in SQL Server 2005, let us understand some of the basics of partitioning and what methods were available in earlier versions.

SriSamp [MVP] (Expert):
The first question that people have is: What are partitions and why would someone want to use them?

SriSamp [MVP] (Expert):
Many companies today have data warehouses which contain historical data and these warehouses keep growing on a regular basis as more data comes in.

SriSamp [MVP] (Expert):
The basic tenet behind a warehouse is that given an entity (likes sales), out of the millions of rows that are archived in the warehouse, only a portion of it is required at any one time.

SriSamp [MVP] (Expert):
For example, managers may be interested in the current quater and the previous two quaters. The rest of the data is always there when needed, but the access patterns are usually known.

SriSamp [MVP] (Expert):
If all the data for an entity were stored in a single monolithic table, we immediately run into issues of maintenance, availability and performance.

SriSamp [MVP] (Expert):
Any operations that we do with the table will affect others who are using that table for ad-hoc reporting, analysis etc.

SriSamp [MVP] (Expert):
To understand this better, consider for example, a case wherein the sales data for an organization is stored in a single table.

SriSamp [MVP] (Expert):
If the administrator decides to do some indexing operations on the table or wants to load in new data, the entire table is locked out during these operations.

SriSamp [MVP] (Expert):
In addition to size, a table with varying access patterns might be a concern for performance and availability when there are different access patterns.

SriSamp [MVP] (Expert):
For example, consider a case wherein a sales table has the current month's data as read-write and the previous month's data as read-only.

SriSamp [MVP] (Expert):
In this case, as data moves in and out of the table, the access to the table and its ability to respond to user requests might be impacted.

SriSamp [MVP] (Expert):
In some cases, periodic downtime is acceptable, yet it can often be avoided or minimized by better design and implementations.

SriSamp [MVP] (Expert):
Given that the access patterns of data access in warehouse tables are usually known (last quater only, last two years only etc)

SriSamp [MVP] (Expert):
Would it not be great if we could have the data separated on these access points? That's what partitions are all about.

SriSamp [MVP] (Expert):
Partitions are basically useful when you are dealing with large tables, since it helps you partition the data into more manageable chunks based on well known access points.

SriSamp [MVP] (Expert):
Partitioning a large table can provide lots of benefits especially in the area of scalability and manageability.

SriSamp [MVP] (Expert):
Before we understand how to approach this new topic in SQL Server 2005, let's see what options we had in earlier versions of SQL Server.

SriSamp [MVP] (Expert):
In versions of SQL Server prior to 7.0, partitioning had to be part of the design and then built into all the data access code of applications.

SriSamp [MVP] (Expert):
The typical mechansim was to create multiple tables and then based on the partitioning column, the data access component might route the data to the appropriate table.

SriSamp [MVP] (Expert):
A view would be typically be created by UNIONing all of the tables and presenting a unified table. However, data inserts, updates etc had to happen on the appropriate table.

SriSamp [MVP] (Expert):
To understand this concept, consider the following fictitious table which stores details for customers:

SriSamp [MVP] (Expert):
CREATE TABLE MyCustomers

(

                CustomerID          INT NOT NULL PRIMARY KEY,

                CustomerName  VARCHAR(100),

                CustomerState    VARCHAR(2),

                << other columns of interest >>

)

SriSamp [MVP] (Expert):
Let us assume that this table will have several thousands of customers and thus, we have decided to partition this table along the "state" to which the customer belongs.

SriSamp [MVP] (Expert):
Thus, you might have tables like: MyMACustomers, MyCACustomers etc and each table would essentially have the same columns, but only storing data for that particular state.

SriSamp [MVP] (Expert):
We can control the data that is stored into a particular table by implementing CHECK constraints on the "CustomerState" column.

SriSamp [MVP] (Expert):
Once we have this basic structure in place, we can then create a view called MyCustomers that would aggregate all these individual tables like this:

SriSamp [MVP] (Expert):
CREATE VIEW MyCustomers AS

                SELECT .... FROM MyMACustomers

                UNION ALL

                SELECT ... FROM MyCACustomers

                UNION ALL

                ....

SriSamp [MVP] (Expert):
Given this design, as discussed earlier, in versions of SQL Server prior to 7.0, the application code can use the view to select data, however, it had to route DMLs.

SriSamp [MVP] (Expert):
For example, if we need to insert data into the customers table, a typical code block would be something like this:

SriSamp [MVP] (Expert):
IF (CustomerState = 'Boston')

                EXEC MyBostonCustomersInsert <parameters>

....


SriSamp [MVP] (Expert):
This is quite cumbersome to maintain and as new states get added, we need to modify all relevant code appropriately.

SriSamp [MVP] (Expert):
Another issue was with respect to performance. For example, if we needed to search only for customers in the MA state, SQL Server still had to query all of the underlying tables.

SriSamp [MVP] (Expert):
If only a limited subset of the underlying tables was needed, then each user and developer needed to know the design in order to reference only the appropriate tables.

SriSamp [MVP] (Expert):
This issue was to a great extent solved in SQL Server 7.0.

SriSamp [MVP] (Expert):
With the release of 7.0, views were combined with constraints to allow the query optimizer to remove irrelevant tables from the querey plan.

SriSamp [MVP] (Expert):
This basically reduced the overall plan cost when a UNIONed view accessed multiple tables.

SriSamp [MVP] (Expert):
Thus for the example situation above, if we needed only the customers in the MA state, the users would only be directed to the MyMACustomers when they queried the MyCustomers view.

SriSamp [MVP] (Expert):
How did SQL Server 7.0 achieve this?

SriSamp [MVP] (Expert):
As long as the constraints are "trusted" and the queries against the view use a WHERE clause that uses the partition key, SQL Server will access only the required tables.

SriSamp [MVP] (Expert):
So what are trusted constraints? A trusted constraint is one wherein SQL Server can guarantee that all data adheres to the properties defined by the constraint.

SriSamp [MVP] (Expert):
By using this mechansim, developers can significantly reduce the complexity of their design by using views without having to directly access the table of interest.

SriSamp [MVP] (Expert):
With trusted constraints SQL Server improves the performance by removing unnecessary tables from the execution plan.

SriSamp [MVP] (Expert):
Although SQL Server 7.0 solved the issue of performance during querying, DML operations like INSERT, UPDATE and DELETE still had to be made on the underlying tables and not the view.

SriSamp [MVP] (Expert):
This means that the application code still had to use patterns like the ones described above to route the data to the appropriate tables.

SriSamp [MVP] (Expert):
Enter SQL Server 2000. In SQL Server 2000, data modification statements also benefit from the partition capabilities introduced in SQL Server 7.0.

SriSamp [MVP] (Expert):
In SQL Server 2000, you could fire DML statements against the view and SQL Server would determine the appropriate underlying table that would get affected.

SriSamp [MVP] (Expert):
For a good treatment of how to design SQL Server 2000 Data Warehouses and thus partitions, refer to: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnsql2k/html/partitionsindw.asp

SriSamp [MVP] (Expert):
It is also useful to note that SQL Server 2000 supported two types of partitioned views: Local Partitioned View (LPV) and Distributed Partition View (DPV)

SriSamp [MVP] (Expert):
An LPV is one in which all of the underlying tables for the partition view lie in the same SQL Server instance.

SriSamp [MVP] (Expert):
A DPV is one in which the uderlying tables live on separate instances. These instances don't have to be on different machines, but they usually are.

SriSamp [MVP] (Expert):
Spreading a DPV across multiple machines helps "scale out" large SQL Server implementations (something called FEDERATION).

SriSamp [MVP] (Expert):
For more information on how to setup and use DPVs and the federation capability, refer to the SQL Server 2000 books online.

SriSamp [MVP] (Expert):
While the improvements in SQL Server 7.0 and SQL Server 2000 significantly enhanced performance when using partitioned views, they did not simply the following:

SriSamp [MVP] (Expert):
Administration, Design, Development.

SriSamp [MVP] (Expert):
People still had to create the underlying tables for the parition view manually and maintain them.

SriSamp [MVP] (Expert):
Administration is also complex, since there are numerous tables to manage and data integrity constraints must be managed for each table.

SriSamp [MVP] (Expert):
From a performance standpoint, the optimizer was required to validate and determine plans for each partition. Thus query optimization cost increases linearly with the number of partitions.

SriSamp [MVP] (Expert):
Enter SQL Server 2005.

SriSamp [MVP] (Expert):
In SQL Server 2005, In SQL Server 2005, table and index pages are contained in one or more partitions.

SriSamp [MVP] (Expert):
A partition is a user-defined unit of data organization.

SriSamp [MVP] (Expert):
By default, a table or index has only one partition that contains all the table or index pages.

SriSamp [MVP] (Expert):
The partition resides in a single filegroup.

SriSamp [MVP] (Expert):
A table or index with a single partition is equivalent to the organizational structure of tables and indexes in earlier versions of SQL Server.

SriSamp [MVP] (Expert):
When a table or index uses multiple partitions, the data is partitioned horizontally so that groups of rows are mapped into individual partitions, based on a specified column.

SriSamp [MVP] (Expert):
The partitions can be put on one or more filegroups in the database.

SriSamp [MVP] (Expert):
The table or index is treated as a single logical entity when queries or updates are performed on the data.

SriSamp [MVP] (Expert):
SQL Server 2005 introduces some new concepts and terminologies when you are dealing with partitions.

SriSamp [MVP] (Expert):
The steps for creating a partitioned table or index include the following:

SriSamp [MVP] (Expert):
a. Create a partition function to specify how a table or index that uses the function can be partitioned.

SriSamp [MVP] (Expert):
b. Create a partition scheme to specify the placement of the partitions of a partition function on filegroups.

SriSamp [MVP] (Expert):
c. Create a table or index using the partition scheme.

SriSamp [MVP] (Expert):
Rather than boring you with theory of what and how to define functions, schemes etc, let us see an example in action that you can paste and execute in SQL Server 2005.

SriSamp [MVP] (Expert):
Note that these commands will work only with the Enterprise / Developer Edition of SQL Server 2005, since partitions is only supported on these editions.

SriSamp [MVP] (Expert):
As we see each example, I will explain the syntax and what the command is doing.

SriSamp [MVP] (Expert):
Note that we will use the same example as above, wherein we want to store details of customers in various states in separate partitions.

SriSamp [MVP] (Expert):
We will start off by creating the "partition function"

SriSamp [MVP] (Expert):
CREATE PARTITION FUNCTION customer_state (CHAR) AS

                RANGE LEFT FOR VALUES ('MA', 'CA', 'AZ')

GO


SriSamp [MVP] (Expert):
A partition function specifies how the table or index is partitioned.

SriSamp [MVP] (Expert):
To create a partition function, you specify the number of partitions, the partitioning column, and the range of partition column values for each partition.

SriSamp [MVP] (Expert):
Note that you can specify only one partitioning column.

SriSamp [MVP] (Expert):
Given the above definition, we will get 4 partitions:

SriSamp [MVP] (Expert):
Partition 1 = Values <= MA, Partition 2 = Values > MA and <= CA, Partition 3 = Values > CA and <= AZ and Partition 4 = Values > AZ

SriSamp [MVP] (Expert):
Note that SQL Server would give you a warning for the above Partition Function defintion that the values need to be sorted. To correct this, provide the values in the sorted order. Thus, we would use AZ, CA and MA in that order.

SriSamp [MVP] (Expert):
Ok, now that we have created the partition function, let us test it with some values using the following query:

SriSamp [MVP] (Expert):
SELECT $PARTITION.customer_state ('BA')

 

SriSamp [MVP] (Expert):
The "$PARTITION" function returns the partition number into which a set of partitioning column values would be mapped for any specified partition function.
 

SriSamp [MVP] (Expert):
Q:
WHY DO YOU NEED LEFT THERE?

A: Good question. For our example it not needed. However, down the line, I will use it to demonstrate how values inbetween them are moved into the various partitions.

SriSamp [MVP] (Expert):
The above statement will return 2, indicating that when we insert 'BA' as the value, the record would get into the second partition.

SriSamp [MVP] (Expert):
You can also query some meta data views to get information about the partition function and its boundary values. Here is an example:

SriSamp [MVP] (Expert):
SELECT

                pf.name, pf.type_desc, pf.fanout, rv.*

FROM

                sys.partition_functions pf

                INNER JOIN sys.partition_range_values rv ON pf.function_id = rv.function_id

WHERE

                pf.name = 'customer_state'


SriSamp [MVP] (Expert):
The above query will return the partition function name, its type, the number of partitions created and range values (that we defined earlier).
 

SriSamp [MVP] (Expert):
Let us now create a partition scheme for the above partition function using the following script:

SriSamp [MVP] (Expert):
CREATE PARTITION SCHEME customer_scheme

                AS PARTITION customer_state

                ALL TO ([PRIMARY])

GO

SriSamp [MVP] (Expert):
A partition scheme maps the partitions produced by a partition function to a set of filegroups that you define.

SriSamp [MVP] (Expert):
Note that we must specify enough number of filegroups to hold the number of partitions.

SriSamp [MVP] (Expert):
If we specify more file groups than the number of partitions, SQL Server will then mark one of the file groups as NEXT USED, meaning that it will contain the next added partition.

SriSamp [MVP] (Expert):
Note that you can also map all the partitions into a single filegroup (which is what we have done above).

SriSamp [MVP] (Expert):
Once we have created the scheme, we can now create a table (or an index) that will use this scheme. Here is our customer table re-defined:

SriSamp [MVP] (Expert):
IF (OBJECT_ID ('Customers') IS NOT NULL)

                DROP TABLE Customers

GO

CREATE TABLE Customers

(

                CustomerID          INT NOT NULL,

                CustomerName  NVARCHAR(50),

                StateOfResidence               CHAR(2),

                TotalOrder             DECIMAL (10,2)

 

                CONSTRAINT CustomerPK PRIMARY KEY (CustomerID, StateOfResidence),

                CONSTRAINT CustomerState CHECK (StateOfResidence IN ('AZ', 'CA', 'MA', 'BA', 'CM', 'MZ'))

)

ON customer_scheme (StateOfResidence)

GO


SriSamp [MVP] (Expert):
Two things to note here:

SriSamp [MVP] (Expert):
a. Note that the primary key constraint also includes the partition key. This is required for SQL Server to enforce uniqueness and partition verification.

SriSamp [MVP] (Expert):
b. We include a CHECK constraint in the table exlusively to limit the values in the column

SriSamp [MVP] (Expert):
We now insert sample rows into this table. Here is a script that does the same:

SriSamp [MVP] (Expert):
INSERT INTO Customers SELECT 1, 'Contoso', 'AZ', 2000

INSERT INTO Customers SELECT 2, 'Levis', 'CA', 2000

INSERT INTO Customers SELECT 3, 'ACME Corp', 'MA', 2000

INSERT INTO Customers SELECT 4, 'Fabrikam', 'MA', 2000

INSERT INTO Customers SELECT 5, 'Books Mart', 'AZ', 2000

 

SriSamp [MVP] (Expert):
Ok, we know that we have 4 partitions. How do we determine how many rows each partition got? Here is a query for the same:

SriSamp [MVP] (Expert):
SELECT

                $PARTITION.customer_state (StateOfResidence) AS [Partition],

                COUNT(*) AS [#Records]

FROM

                Customers

GROUP BY

                $PARTITION.customer_state (StateOfResidence)

ORDER BY

                [Partition]

 

SriSamp [MVP] (Expert):
What this query is doing is displaying the number of records in each partition. Note that we have used the $PARTITION function to get the number and group on that.

SriSamp [MVP] (Expert):
In our above sample script, we inserted data using values that were exactly boundary conditions.
 

SriSamp [MVP] (Expert):
Let us now insert some more rows that fall between the boundaries and see how they are handled. Here is a script for the same:

SriSamp [MVP] (Expert):
INSERT INTO Customers SELECT 6, 'Network Mate', 'BA', 2000

INSERT INTO Customers SELECT 7, 'Soft Pro', 'CM', 2000

INSERT INTO Customers SELECT 8, 'FabMall', 'MZ', 2000


SriSamp [MVP] (Expert):
Now, if you run the above SELECT statement, you see the revised distribution. If you have a pen and paper with you draw out 4 boxes and place the values manually based on our boundaries and you will understand the behavior of the output.

SriSamp [MVP] (Expert):
Here is another query that lists the rows present in a specified partition.

SriSamp [MVP] (Expert):
SELECT * FROM Customers WHERE

                $PARTITION.customer_state (StateOfResidence) = 3


SriSamp [MVP] (Expert):
Till now, what we have defined as the partition is an example of a RANGE PARTITION.

SriSamp [MVP] (Expert):
Range partitions are partitions that are defined by specific and customizable ranges of data.
 

SriSamp [MVP] (Expert):
The range partition boundaries are chosen by the developer and can be changed as data usage patterns change.

SriSamp [MVP] (Expert):
Note that apart from partitioning tables, you can also partition indexes.

SriSamp [MVP] (Expert):
Partitioning a table and its indexes along the same partition function often optimizes performance.
 

SriSamp [MVP] (Expert):
When the indexes and the table use the same partioning function and columns in the same order, the index and the table are said to be "aligned".

SriSamp [MVP] (Expert):
When the tables and indexes also use the same partition scheme, they are considered to be "storage aligned".

SriSamp [MVP] (Expert):
One benefit of storage alignment is that all the data and indexes are present in the same filegroup which can be maintained independently, thus increasing availability.

SriSamp [MVP] (Expert):
Note that although we have defined our boundary values earlier, it is often the case that the boundary values can change over time.

SriSamp [MVP] (Expert):
This requires us to do some maintenance over our partitions and there are several new features and concepts for the same.

SriSamp [MVP] (Expert):
Some of the new features are SPLIT, MERGE and SWITCH which we will cover next:

SriSamp [MVP] (Expert):
Let's answer some questions and if there are more, we can take them too. Otherwise, we will move ahead with the more advanced features.

Vinod_MVP (Expert):
Q:
please provide me links for a new starter into SQL world !

A: There are some handy links available at: http://www.microsoft.com/sql/default.mspx  which you can goto. But from a starter point of view I would strongly recommend using the SQL Server Books Online as the place to start. It comes part of SQL Server Installation.

Vinod_MVP (Expert):
Q:
Is there a difference if we use express edition?

A: The concept of partitioning is same across all the editions. Nothing changes. But you need to understand that Partitioning is supported in the Enterprise and Developer Editions ONLY. This feature is not available in the Express Edition.

Vinod_MVP (Expert):
Q:
WHY DO YOU NEED LEFT THERE?

A: As Srini said, the Left tries to say which side of the partition should the values on the boundary fall. A left says the value in the boundary falls on the left side. As a best practice: Use RANGE RIGHT for ???continuous??? data values ??? more intuitive. Hence boundary becomes starting point for each range. But for character based partitioning like the example Srini shows it is based on your data distribution you might want to use one over the other. Hope this clarifies further.

SriSamp [MVP] (Expert):
Any other questions?

Vinod_MVP (Expert):
Ok. Let us get rolling if there is no further questions.

Vinod_MVP (Expert):
Well, now that Srini has laid some basic foundation to the whole partitioning concept lets dwel a step deeper in what it is to offer in addition to this.

Vinod_MVP (Expert):
Because partitions are used for large tables that need to scale to support better throughput, it is possible that the number of partitions chosen when the partition function was created will need to change over time.

Vinod_MVP (Expert):
Using the ALTER TABLE statement with the new ???split??? option, you can add another partition to the table.

Vinod_MVP (Expert):
When a partition is ???split,??? data may be moved to the new partition; however, for best performance rows should not move.

Vinod_MVP (Expert):
Conversely, if you want to remove a partition you would perform a switch and merge functionalities.

Vinod_MVP (Expert):
In the case of range partitions, a merge request is made by stating which boundary point should be removed.

Vinod_MVP (Expert):
For example, you might choose to have one year of data available and each month-end you will switch in the current month and then switch out the earliest month thereby differentiating between the current month's read/write OLTP data and the ealier month???s read-only analysis type of data.

Vinod_MVP (Expert):
However, before you merge a boundary point you should switch out its associated data otherwise the merge could be an expensive operation.

Vinod_MVP (Expert):
There is a specific flow to this process to make it the most efficient. The concept of switch, merge and split is somewhat complex at first glance nevertheless once we start playing around with them it gets really easy.

Vinod_MVP (Expert):
Let us take simple examples of each of these operations and the nuances of each. Let us start with SPLIT:

Vinod_MVP (Expert):
Syntax:

Vinod_MVP (Expert):
ALTER PARTITION FUNCTION ??? SPLIT RANGE ???

Vinod_MVP (Expert):
This adds a new boundary point.

Vinod_MVP (Expert):
The performance caveat is whether rows need to be moved to the new partition.  If rows are moved, an exclusive table lock applied until the rows are moved.

Vinod_MVP (Expert):
Affects all objects using that Partition Function. One partition is split into two.

Vinod_MVP (Expert):
The new partition is the one containing the new boundary point:

1. To the right of boundary if RANGE RIGHT

2. To the left of boundary if RANGE LEFT

Vinod_MVP (Expert):
Some considerations for SPLIT:

Vinod_MVP (Expert):
1. Instantaneous if partition is empty

Vinod_MVP (Expert):
2. This is IO-intensive if partition is populated

                a. Any data on new side of the boundary is physically deleted from old partition and inserted into the new partition

                b. Fully logged operation

                c. Exclusive table lock held for duration of Split

Vinod_MVP (Expert):
3. Schema change, so plans are invalidated

Vinod_MVP (Expert):
4. Always assign ???Next Used??? filegroup in Partition Scheme before using Split

                ALTER PARTITION SCHEME NEXT USED ???

Vinod_MVP (Expert):
 that code specifies which filegroup holds new partition???s data

Vinod_MVP (Expert):
EG:


ALTER PARTITION FUNCTION TransactionRangePF1()

SPLIT RANGE ('01/01/2005')

Vinod_MVP (Expert):
Moving forward, lets next take a look at MERGE:

Vinod_MVP (Expert):
As the keyword suggests it removes a boundary point from a Partition Function

Vinod_MVP (Expert):
Affects all objects using that Partition Function

Vinod_MVP (Expert):
Syntax:

ALTER PARTITION FUNCTION ??? MERGE RANGE

Vinod_MVP (Expert):
Considerations for MERGE are same as the SPLIT operations.

Vinod_MVP (Expert):
EG:

ALTER PARTITION FUNCTION TransactionRangePF1()

MERGE RANGE ('01/01/2004')

Vinod_MVP (Expert):
The next important operation is the SWITCH operation.

Vinod_MVP (Expert):
1. Instantly swaps the content of one partition or table (source) with another table???s empty partition or an empty table (target)

Vinod_MVP (Expert):
2. Metadata-only operation, no data moves

Vinod_MVP (Expert):
3. Restrictions:

                a. Target table or partition must be empty

                b. Source and target must be in same filegroup

                c. Source must have all indexes required by the target, aligned and in matching filegroups

                d. If Target is a partition, Source must have check constraints (if table) or a partition range that fits within the target range

Vinod_MVP (Expert):
4. All associated indexes are automatically Switched along with the table / partition

Vinod_MVP (Expert):
5. If your table and indexes are aligned, you can instantly switch in/out data

Vinod_MVP (Expert):
Switch can happen between:

1. Partitioned Table's partition into a Table

2. Table into a Partitioned Table's partition

3. Partitioned Table into another Partitioned Table's partition

Vinod_MVP (Expert):
EG:


ALTER TABLE dbo.TransactionHistory

SWITCH Partition 1

TO dbo.TransactionHistoryStaging PARTITION 1

Vinod_MVP (Expert):
So there can be interesting scenarios we can have because of partitioning. But before we go there, do you have any question to bounce us with?

SriSamp [MVP] (Expert):
Q:
hi....sorry for bugging you guys....is there any repository to get all the information regarding sqlserver partioning

A: Resources will be posted in the end. So you can get all the links.

SriSamp [MVP] (Expert):
Q:
can any body explain the exact difference between horizontal and vertical partioning

A: Horizontal partitioning is wherein an object is partitioned along its columns and vertical partitioning is wherein an object is partitioned along its rows. All the examples that we have seen till now are "vertical" in nature.

SriSamp [MVP] (Expert):
Oops sorry. Its the other way round.

SriSamp [MVP] (Expert):
Horizontal is along the rows and vertical is along the columns. All our examples have been "horizontal" in nature.

SriSamp [MVP] (Expert):
 Sorry about that.....

Vinod_MVP (Expert):
Q:
can any body explain the exact difference between horizontal and vertical partioning

A: The problem with implementing a vertical partitioning is only using views. The problem is to implement multiple partition views we need to create that many number of sql views.

SriSamp [MVP] (Expert):
Q:
i have a doubt in Reporting Services, can i use a parameter derived from a SQL Dataset in a MDX query

A: Sumit, the question is out of context. Can you drop in a mail to us and we can then answer?

Vinod_MVP (Expert):
Ok. With no further questions over Partitioning, let me talk over some scenario's.

Vinod_MVP (Expert):
1. Sliding Window Load and Delete

Vinod_MVP (Expert):
This is a typical scenario for Large databases.

Vinod_MVP (Expert):
Partitions might be requested for each hour/day/week/month???add new partition and remove the oldest one.

Vinod_MVP (Expert):
New partition ???

                a. May need to batch load, scrub, and transform before incorporating into the whole table

                b. Or start partition as empty and populate gradually using transactions.

Vinod_MVP (Expert):
Old partition ??? may need backup, archive, restore.

Vinod_MVP (Expert):
Given this scenario what will be the partitioning activity we might need to do

Vinod_MVP (Expert):
Steps for Sliding Window:

Vinod_MVP (Expert):
1. Create staging table in same filegroup as target partition

Vinod_MVP (Expert):
2. Split most recent partition, adding boundary point for following period

Vinod_MVP (Expert):
3. Bulk load and index staging table

Vinod_MVP (Expert):
4. Switch data into next-to-last partition

Vinod_MVP (Expert):
Simple. That was half the job.

Vinod_MVP (Expert):
Deletion step in Sliding window:

Vinod_MVP (Expert):
1. Create unload table in same filegroup as partition to remove

Vinod_MVP (Expert):
2. Switch data out of second partition

Vinod_MVP (Expert):
3. Merge first partition, removing the boundary point for the unload period

Vinod_MVP (Expert):
4. Archive or Truncate the unload table

Vinod_MVP (Expert):
Main design point for sliding window is you always reserve an empty partition at either end.  This means switching operations will be instantaneous for either adding (switching in) or archiving (switching out)

Vinod_MVP (Expert):
The idea of doing this as efficiently as possible ??? keep table available, meta data only e.g. pointer switches.

Vinod_MVP (Expert):
When you build the staging table, it is outside of the partitioned table.  Thus, you have no availability hit while loading the staging table, and then add indexes to the staging table.  Lastly, the pointer switch is instantaneous.

Vinod_MVP (Expert):
Partitioning has got a new dimension to backup and recovery too


Vinod_MVP (Expert):
Lets take that next

Vinod_MVP (Expert):
a. Partitioning leverage Filegroup Backup / Restore enhancements in SQL Server 2005

                - Read-Only Filegroups can now be restored without applying transaction logs

Vinod_MVP (Expert):
b. Reduces data volume for regular backups when historical data is not changing

Vinod_MVP (Expert):
Functionally this means, Mark filegroups for unchanging, historical partitions as READ ONLY. Perform one-time backup of these READ ONLY filegroups.

Vinod_MVP (Expert):
Regularly backup only the Primary filegroup and other filegroups containing active, changing data.

Vinod_MVP (Expert):
To Restore, restore Primary and active filegroups and recover log.  Then restore the read-only filegroups separately.

Vinod_MVP (Expert):
Apart from this it is very important to understand what to be used as a partition key

Vinod_MVP (Expert):
Consider a Orders Table with Order_Date and Order_ID columns with other columns.

Vinod_MVP (Expert):
What are the Design alternatives we have:

Vinod_MVP (Expert):
1. Application enforced uniqueness.  Partition on order_date, align on Order_ID

Vinod_MVP (Expert):
2. Partition on Order_date.  Create a unique monolithic (spanning all partitions) non-aligned index on Order_ID.  Non-aligned unique index enforces unique Order_ID.

Vinod_MVP (Expert):
3. Maintain uniqueness and index alignment by design Order_ID to monotonically increase e.g. identity value.  Partition on Order_ID.

Vinod_MVP (Expert):
Choice of one option over the other depends on the workloads like:

Vinod_MVP (Expert):
1. DataWarehouse probably best to choose option #1. Because that is what makes sense to them.

Vinod_MVP (Expert):
2. OLTP probably best to choose option #3

Vinod_MVP (Expert):
3. Some reporting scenarios (drill down reporting, customer history questions) may benefit from option #2, the non-aligned index.  Pay price each time a new partition is added, the non-aligned index has to be rebuilt.

Vinod_MVP (Expert):
There are also Engine enhancements because of Partitioning.

Vinod_MVP (Expert):
If two tables are partitioned the same way (same scheme), it may be possible to perform a join of each pair of partitions separately

Vinod_MVP (Expert):
This is called a ???collocated join???

Vinod_MVP (Expert):
Works only for some join conditions like Equi-joins on partitioning key (A.a=B.b)

Vinod_MVP (Expert):
I see this as an big time asset. Apart from this many other operations can be performed ???per-partition??? like:

Vinod_MVP (Expert):
1. Grouping, Filtering, Projection

2. Inserts, Updates, Deletes

3. Create Index, Bulk Insert

Vinod_MVP (Expert):
So any more questions???

SriSamp [MVP] (Expert):
Q:
What is the difference between DataWarhouse & Data partition?

A: A data warehouse is a comprehensive source of data (usually collected over several years) and an infrastructure for building Business Intelligence (BI) solutions. Typically data warehouses deal with concepts like "facts" and "dimensions" and will have a schema structure that is more de-normalized to facilitate querying. Data Partitioning on the other hand is the concept of splitting data into small chunks (as opposed to a single monolithic table) based on a "partition key". Data Partitions are the corner stone for effective and efficient data warehouses.

Vinod_MVP (Expert):
When working with partitioning you might need to query system tables to get the partition details.


-- Some of the system tables to concentrate

SELECT * FROM sys.partitions

SELECT * FROM sys.tables

SELECT * FROM sys.indexes

 

SELECT * FROM sys.partition_functions

SELECT * FROM sys.partition_schemes

SELECT * FROM sys.partition_range_values

SELECT * FROM sys.partition_parameters

SriSamp [MVP] (Expert):
Here are some resources that can help you understand the concept of partitions, data warehouses etc better:

Strategies for Partitioning Relational Data Warehouses in SQL Server: http://www.microsoft.com/technet/prodtechnol/sql/2005/spdw.mspx

Using Partitions in a Microsoft SQL Server 2000 Data Warehouse: http://msdn.microsoft.com/library/en-us/dnsql2k/html/partitionsindw.asp?frame=true

SQL Server 2000 Incremental Bulk Load Case Study: http://www.microsoft.com/technet/prodtechnol/sql/2000/maintain/incbulkload.mspx

Partitioned Tables and Indexes in SQL Server 2005: http://msdn.microsoft.com/library/en-us/dnsql90/html/sql2k5partition.asp?frame=true

Understanding LEFT and RIGHT in Partition Functions: http://www.sqlskills.com/blogs/kimberly/PermaLink.aspx?guid=cf223632-c93b-4242-b0f2-af493e051266

Vinod_MVP (Expert):
The last two links are very informative ...

Subhashini  (Moderator):
Subhashini  (Moderator): Hi guys , we are close to time up!   Please conclude your queries and feel free to email Vinod and Srini for any additional queries

Vinod_MVP (Expert):
You can also get some resources on SQL Server 2005 at:

My Website: www.ExtremeExperts.com

My Blog: http://blogs.sqlxml.org/vinodkumar/

SriSamp [MVP] (Expert):
Web Site: http://www32.brinkster.com/srisamp

Blog: http://blogs.sqlxml.org/srinivassampath

Email: srisamp@gmail.com

Vinod_MVP (Expert):
Vicjk ...Although partitioning is transparent to the application, partitioning adds administrative overhead and complexity to the implementation of your objects.

Vinod_MVP (Expert):
While partitioning can offer great benefits, you may not want to consider it when your table is small. And I know these are subjective.

Vinod_MVP (Expert):
You need to understand:

1. Your performance needs

2. Additional maintenance needs

3. How is the data access like

4. How many physical disks are available

See that the administrative burden doesnt overshadow the performance gains achieved.

Vinod_MVP (Expert):
There is no rule of thumb that we cannot at any point in time say which number of rows are ideally suited for partitioning candidate.

SriSamp [MVP] (Expert):
The SQL Server Books online has some parameters to look out for before deciding on when you will need partitioning.

Vinod_MVP (Expert):
You are a better judge as an Developer and/or DBA

Vinod_MVP (Expert):
Thanks all. Hope you enjoyed the chat. Do come to the local UG in Bangalore for more SQL Server 2005 sessions. http://groups.msn.com/BDotNet . We have "Introduction to Data Mining in SQL Server 2005" this week.

SriSamp [MVP] (Expert):
Thanks guys. It was great sharing "one more" great feature of SQL Server 2005 and yes, do walk-in into the UG meeting and you will always learn something new...

Subhashini  (Moderator):
Thanks to Vinod and Srini for such an informative session !

Subhashini  (Moderator):
This brings us to the timeup! And please feel free to email your queries to Vinod & Srini

Subhashini  (Moderator):
Thanks to you guys for taking tiem out for this chat

Subhashini  (Moderator):
Have a great evening ! :)

SriSamp [MVP] (Expert):
Thanks and bye all.
 
     

©2009 Microsoft Corporation. All rights reserved. Contact Us |Terms of Use |Trademarks |Privacy Statement