Click Here to Install Silverlight*
IndiaChange|All Microsoft Sites
Microsoft
Communities 
 
Chat Transcript
 
Chat Topic : CLR and XML Best practices
Chat Expert : Srinivas [MVP]
September 14, 2005
 
 
Subhashini[MSFT] (Moderator):
hello everybody :) a very good evening to all of you !
Subhashini[MSFT] (Moderator):
Welcome to today's chat on CLR and XML best Practices
Subhashini[MSFT] (Moderator):
hosted by our two SQL Server MVPs
Subhashini[MSFT] (Moderator):
To give you a quick intro about our experts
Subhashini[MSFT] (Moderator):
Srinivas Sampath works for SCT Software Solutions(Bangalore). He graduated with a bachelors in Computer Technology(PSG College of Technology) and later pursued MCA (Madras University).He joined SCT (then called Exeter Systems) in 1997 and has been with them for the past 6 years.
Subhashini[MSFT] (Moderator):
His area of expertise include database architectures (using SQL Server) and enterprise application architectures using Microsoft Technologies. Srinivas currently is a Product Development Manager and also head of Technology Labs, a group that focuses on latest Microsoft technologies and their applicability into their product lines.
Subhashini[MSFT] (Moderator):
His interests include SQL Server 2000 programming, Office 2003 programming and .NET. And you can reach him at ssampath@sct.co.in
<http://www32.brinkster.com/srisamp>
Subhashini[MSFT] (Moderator):
To introduce Vinod M Kumar,
Subhashini[MSFT] (Moderator):
Vinod Kumar did his Bachelors of Engineering in Mechanical from Anna University, Chennai. After graduation, he joined SCT Software Solutions (then Exeter Systems) where he worked for about 4 years. He has worked for a team called Technology Labs which evaluates various Microsoft technologies and products and their relevance in this line of business. He is presently with Intel Technologies, Bangalore. He has been working on Microsoft Technologies all along his professional career and is very inclined towards learning every new concept that Microsoft comes out with. In this process, he has gathered tons of knowledge to share with all.
Subhashini[MSFT] (Moderator):
Catch all the resources and articles Vinod has written at www.ExtremeExperts.com <http://www.ExtremeExperts.com>, the site he co-hosts. He has been very passionate about SQL Server programming and loves to experiment different programming tricks in SQL Server.
Subhashini[MSFT] (Moderator):
Vinod is an active volunteer in INETA and PASS. He has initiated a SQL Server community in Bangalore. To capture more events on SQL Server, join the group at <http://groups.msn.com/SQLBang> which was formed as an initiative to bring PASS community in India.
Subhashini[MSFT] (Moderator):
and you can reach Vinod at Email ID: vinodk@mvps.org <mailto:vinodk@mvps.org>
Web Site: www.ExtremeExperts.com <http://www.ExtremeExperts.com>
Subhashini[MSFT] (Moderator):
A few chat rules as usual before we start the chat ..
Subhashini[MSFT] (Moderator):
Please refrain from sending any private messages to the expert during the chat
Subhashini[MSFT] (Moderator):
Chat Procedures:
This chat will last for one hour. During this hour, our Experts will respond to as many questions as they can. Please understand that there may be some questions we cannot respond to due to lack of information or because the information is not yet public. We encourage you to submit questions for our Experts.
Subhashini[MSFT] (Moderator):
We ask that you stay on topic for the duration of the chat. This helps the Guests and Experts follow the conversation more easily. We invite you to ask off topic questions after this chat is over.
Subhashini[MSFT] (Moderator):
Please use the radial button "submit a question " to ask any questions to the expert
Subhashini[MSFT] (Moderator):
So we are all set for a great chat session by our MVPs , we thank you all for joining us today
Subhashini[MSFT] (Moderator):
and let's get the chat rolling ...
Subhashini[MSFT] (Moderator):
Hi Vinod and Srinivas
Vinod_MVP (Expert):
Hi All ...
Srinivas [MVP] (Expert):
Hi all. Good to get in touch with all of you once again.
Srinivas [MVP] (Expert):
So, let's get started.
Srinivas [MVP] (Expert):
As you all know, this chat is titled CLR and XML Best Practices for SQL Server 2005.
Srinivas [MVP] (Expert):
The integration of the CLR and the introduction of the XML as a first class data type in SQL Server opens the doors for many new applications.
Srinivas [MVP] (Expert):
In the past chats, we have seen each of these features in greater detail and there are many online resources to help you get started.
Srinivas [MVP] (Expert):
In today's chat we will explore some of the best practices that you should be aware off, when using these new methods.
Srinivas [MVP] (Expert):
Note that the CLR and XML features are not being introduced here. The assumption here is that you have either worked with them or have read about them.
Srinivas [MVP] (Expert):
First, we will start off with the CLR best practices.
Vinod_MVP (Expert):
Integration of CLR into SQL Server is one of the most demanding and highly anticipated feature in the next generation of SQL Server 2005.
Vinod_MVP (Expert):
Microsoft SQL Server 2005 significantly enhances the database programming model by hosting the Microsoft .NET Framework 2.0 Common Language Runtime (CLR).
Vinod_MVP (Expert):
This enables developers to write procedures, triggers, and functions in any of the CLR languages, particularly Microsoft Visual C# .NET, Microsoft Visual Basic .NET, and Microsoft Visual C++.
Vinod_MVP (Expert):
In this chat we assume there is some understanding about this integration, our prime focus would be to identify some of the scenario's where we can capitalize this new feature.
Vinod_MVP (Expert):
The chat will try to provide a set of high-level guidelines on how to choose between the available programming alternatives.
Vinod_MVP (Expert):
To start off with, Use CLR and XML judiciously to complement T-SQL.
Vinod_MVP (Expert):
The TSQL enhacements are huge and for set based operation it is always better to try use the TSQL way. Because it is more efficient.
Vinod_MVP (Expert):
It is important to understand that CLR is NOT a black box. Moving forward now the application developer has the option to write a code in TSQL and CLR.
Vinod_MVP (Expert):
But making the right choice for the operation can dictate the performance of the application.
Vinod_MVP (Expert):
If I were to compare the raw benefits that CLR brings to the table:
Vinod_MVP (Expert):
a. T-SQL is interpreted and hence slower.
b. Error handling in T-SQL was frankly abysmal (although it has improved in SQL Server 2005)
c. String handling in T-SQL is unoptimized and slow
d. SQL-CLR is compiled so faster
e. SQL-CLR lets you take advantage of the huge base class library
f. Capitalize existing .NET developers experience
Vinod_MVP (Expert):
But we will try to compare the CLR integration in various categories. I have tried to talk over deciding over TSQL or CLR is also an important decision.
Vinod_MVP (Expert):
Here are some simple rules that you can use to decide the same.
Vinod_MVP (Expert):
1. T-SQL is best suited for data access and has a simple programming model to work with.
2. Managed code is all about procedural programming and computation.
3. Because it is compiled the CLR outperforms over computing-intensive business logic functions.
4. If the procedure primarily involves forward-only, read-only row navigation through a result set with some processing of each row, using the CLR is likely more efficient.
Vinod_MVP (Expert):
So any questions till now over the CLR feature?
Srinivas [MVP] (Expert):
Q: Why a clustered primary key must exist in the table containing the indexed xml column ?
A: I will answer this one, even though it is not with respect to the current CLR topic. The clustered index is required to correlate index rows with the rows in the base table.
Vinod_MVP (Expert):
Can you post it as a question to the experts please?
Srinivas [MVP] (Expert):
All, please post questions to the expert and not as regular comments. This will help us seggregate the questions and answer to them.
Vinod_MVP (Expert):
Q: how GC works in environment ?
A: The crus is this, now CLR gets loaded just as another loader that exists for ASP.NET, or Office VSTO application. But the only difference here is the SQL Server takes complete ownership of the GC cycles, when it happens and when an app domain needs to be created etc. Even the methods that needs to be loaded is loaded ONLY in instruction from SQL Server. This process is called as Fusion loading.
Vinod_MVP (Expert):
Q: what is the difference between 1.1 clr and 2.0 clr
A: The differences are the same what the framework can provide. What in specific are you looking for from an integration of SQL Server?
Vinod_MVP (Expert):
Q: Safty measures for SQL in CLR?
A: the security aspect is quite detailed when it comes to using CLR data type. It falls under three different category of Safe, UnSafe and External. It follows the CAS policies and is bound to the database. And I feel this aspect is critical. Moreover, SQL Server CLR code CANNOT access memory areas outside its context boundaries, else the App domain will be restarted. So we have a more granular security boundary.
Srinivas [MVP] (Expert):
Q: Can I directly replace an application that uses SQL2K with SQL 2005 to enjoy these features?
A: Not sure what you mean by replacing the application. If you just take a SQL 2K database and put it onto SQL 2K5, it will run in 2K compatibility mode. After this, you need to change the compatibility to 2K5 and then change your application to start using the new features.
Vinod_MVP (Expert):
Q: Hi vnoid..can we have unique clustered index on table and create xml index ?
A: Yes. This is very much possible. the only restriction was an Primary clustured index on a table. This is to get a locator of the pointer to the data row. And this clustered index can be unique. And I generally prefer this model too.
Vinod_MVP (Expert):
So let us get rolling
Vinod_MVP (Expert):
These are like rule of thumb but it is very vital to make some performance numbers to the whole activity to get a hang of which outperforms the other based on the scenario in hand.
Vinod_MVP (Expert):
I would like to re-iterate that CLR is not replacement for set-based query processing. Here the query processor outperforms all.
Vinod_MVP (Expert):
So to summarize it from an usage perspective:
a. Use CLR for procedural and computation-intensive logic
b. Logic that will benefit from use of data structures not supported in T-SQL (e.g. arrays)
c. Lastly, programs that take advantage of .NET Framework library
d. T-SQL better suited for procedures with mostly data access statements with set based operations with DML and DDL operations.
Vinod_MVP (Expert):
CLR Vs XP's
So this is a prime candidate for usage of CLR. Building XP's in SQL Server have ALWAYS been the domain of C++ geeks. So the benefits here include:
1. Easier in-proc data access: No need to use loop-back connections and binding sessions
2. Better administrative control: 3 permission buckets.
3. Reliable and scalable
I cannot really recollect an area where one needs to still use the older model of XP's given the API's that .NET framework tries to expose.
Vinod_MVP (Expert):
Let us look at some of the common scenarios of use:
Vinod_MVP (Expert):
1. Sending Results to the Client
This involves sending a set of rows to the client without "consuming" them in the server (meaning no navigation of the rows inside the routine). With T-SQL, simply embedding a SELECT statement in the T-SQL proc has the effect of sending rows produced by the SELECT to the client. With managed code, you use the SqlPipe object to send results to the client. T-SQL and in-proc ADO.NET perform equally in this scenario.
Vinod_MVP (Expert):
2. Submitting SQL statements to the server
This involves the round-trip of executing an SQL statement from the procedural code. In this case, T-SQL has a significant advantage (more than twice as fast as in-proc ADO.NET). An important thing to note here is that, the slower performance in CLR occurs because of the additional layers of code that are involved in submitting a T-SQL statement from managed code to native SQL code. After the statement is submitted to the query processor, there is no difference in performance based on the source of the statement (T-SQL or in-proc ADO.NET).
Vinod_MVP (Expert):
3. Forward-only, Read-only row navigation
This involves navigating the rows produced by a SQL statement one at a time in a forward only, read-only fashion. In T-SQL, this is implemented through a forward only, read-only cursor. In CLR, it is implemented through SqlDataReader. Typically, there is some processing involved for each statement. If you ignore the processing associated with each row, then the navigation of rows in CLR is slightly slower than in T-SQL cursors. If you however factor in the processing that is performed for every row, the advantage tilts more towards CLR since the CLR significantly outperforms T-SQL on such processing.
Vinod_MVP (Expert):
4. Row-Navigation with updates
If the problem involves updating rows based on the current position in the cursor, there is no relevant performance comparison because this functionality is not supported through in-proc ADO.NET, and should be done through T-SQL updateable cursors.
Vinod_MVP (Expert):
These are some interesting thoughts that I want you all to start thinking on those lines. Now programming in SQL Server is a two part game and making the right choice is critical.
Vinod_MVP (Expert):
Any questions over the CLR use inside SQL Server ? Go ahead and shoot it ...
Srinivas [MVP] (Expert):
Q: So We can implement all the Business Logic on SQL server side in CLR compatible language?
A: No. Never do that. The CLR features of SQL Server will have to be used only if you have procedural code that is CPU intensive and also requires occasional access to data. Also, other data validation not possible in T-SQL can be done in the CLR. Use T-SQL as much as you can, since set-based queries are always faster. Business logic must always be on the application tier, since that is a layer than can easily scale and be changed without affecting other tiers.
Vinod_MVP (Expert):
Q: So We can implement all the Business Logic on SQL server side in CLR compatible language?
A: Yes. Perfectly crafted Srini. To add to this understand that it is easy for me to scale up when it comes to SQL Server and difficult for me to scale out easily which have always been the asset of the application tier. So use the code that is bound to have computational intensive and some time chatty with the data tier and have the application logic still reside at the middle tier.
Srinivas [MVP] (Expert):
Ok, now that we have seen the CLR best practices, we will move on to the XML best practices.
Srinivas [MVP] (Expert):
By now, most of you would be knowing that one of the biggest improvements in SQL Server 2005 was the introduction of the new XML data type.
Srinivas [MVP] (Expert):
Just to provide a brief history of the same, SQL Server 2000 had very limited support for XML.
Srinivas [MVP] (Expert):
You had to store XML documents as (N)TEXT columns and there was no option to query the XML directly.
Srinivas [MVP] (Expert):
Simple APIs in the form of sp_xmlpreparedocument and OPENXML existed to work on XML data and factor it into a relation model.
Srinivas [MVP] (Expert):
SQLXML was introduced as a middle-tier solution for submitting XML queries to SQL Server and get the results back as XML through a concept called XML views.
Srinivas [MVP] (Expert):
You could also do some post processing on the XML by applying XSL stylesheets. Bulk-load of XML was also supported through the SQLXML OLEDB provider.
Srinivas [MVP] (Expert):
If you are still using SQL Server 2000, and want an XML solution, I would recommed that you look at SQLXML: http://www.microsoft.com/Downloads/details.aspx?displaylang=en&FamilyID=05F1A309-BE6E-49FE-968D-9833FA15CC12
Srinivas [MVP] (Expert):
SQL Server 2000 also allowed you to stream the results of a SELECT clause to the client as XML using the FOR XML directive.
Srinivas [MVP] (Expert):
As the usage of XML increases and more applications adopt XML as the de-facto way of expressing hierarchical data and are looking at ways to store, retrieve and query the XML data, its time SQL Server did something to help people.
Srinivas [MVP] (Expert):
SQL Server 2005 provides full support for XML as a data type. XML documents can now be stored natively (either as documents or fragments) in the database and can be used as variables and parameters.
Srinivas [MVP] (Expert):
You can also type the XML data with an XSD schema (or to a collection of schemas).
Srinivas [MVP] (Expert):
SQL Server 2005 also supports the querying of the data in the XML using XQuery, the new W3C standard for querying with XML data.
Srinivas [MVP] (Expert):
You can also modify the XML data in place (for example adding nodes) by using the new XML DML facilities.
Srinivas [MVP] (Expert):
All of these enhancements for XML actually increase the surface area of the solutions that you can now model using SQL Server 2005.
Srinivas [MVP] (Expert):
Sure.
Srinivas [MVP] (Expert):
Can you pose that as a question to the experts please?
Vinod_MVP (Expert):
Q: Why a clustered primary key must exist in the table containing the indexed xml column ?
A: In simple terms yes very much. But I would like to take a closer look at the requirement to gauge the scenario. Consider if this is just a some sort of data looping with a DML processing ... Then I think the TSQL with the set based approach will hats off outperform. But consider the situation where the looping of data is accompanied with some complex matematical operation with very less data feed to the processing, under such a circumstances the CLR will outperform. This is the trade off between using compiled computational activity vs performing interpreted data intensive task. Hope this demystifies the scenarios of usage.
Vinod_MVP (Expert):
Q: Can we replace cursors with CLR code
A: Oops... That was for a different question. In simple terms yes very much. But I would like to take a closer look at the requirement to gauge the scenario. Consider if this is just a some sort of data looping with a DML processing ... Then I think the TSQL with the set based approach will hats off outperform. But consider the situation where the looping of data is accompanied with some complex matematical operation with very less data feed to the processing, under such a circumstances the CLR will outperform. This is the trade off between using compiled computational activity vs performing interpreted data intensive task. Hope this demystifies the scenarios of usage.
Srinivas [MVP] (Expert):
Let me continue while the question is being asked.
Srinivas [MVP] (Expert):
But, what are use cases where you need to use XML? What are some of the best practices? The focus of this chat is to tell you when and how to use the XML data type.
Srinivas [MVP] (Expert):
Note that this chat does not introduce you to the XML features and how to use it. It assumes some familiarity of having worked with (or read about) the same.
Srinivas [MVP] (Expert):
The first question that people usually have is: Why should I store XML documents in the database? Should you go the relational way or the XML way? Let's address this...
Srinivas [MVP] (Expert):
If the data that you want to store is highly structured with a known and fixed schema, you should always model it in a relation table.
Srinivas [MVP] (Expert):
For example, if you know that the addresses of your customers have Line 1, City, State and Zip and there can be more than 1 address record, you can simply have a table of addresses.
Srinivas [MVP] (Expert):
Each address record can then be linked to the customer via a foreign key.
Srinivas [MVP] (Expert):
If, on the other hand, the structure of your data is not known (or is flexible), then modelling it as a relational table may not be very good.
Srinivas [MVP] (Expert):
One example, that I can think of a structure for storing user preferences. The information for user preferences is very flexible and it is not useful to model it as a relational table.
Srinivas [MVP] (Expert):
In this case, you can store the preferences inside an XML structure and perform queries over it.
Srinivas [MVP] (Expert):
Another thing that you have to ask yourself is, will you just need to store the XML data or perform some operations on it in the database?
Srinivas [MVP] (Expert):
If you need to just store it and get it back each time, you can always use the (N)VARCHAR(MAX) data type. In this case, SQL Server just acts as the data store for the XML.
Srinivas [MVP] (Expert):
Q: What is the internal format, the XML document is stored in SQL server ?
A: It is a binary format.
Srinivas [MVP] (Expert):
This still does not answer the question: Why should I store XML in the database?
Srinivas [MVP] (Expert):
Here are some of the top reasons for doing this (adapted from the article: XML Best Practices for SQL Server 2005 by Shankar Pal et al)
Srinivas [MVP] (Expert):
1. You want to use administrative functionality of the database server for managing your XML data (for example backup)
Srinivas [MVP] (Expert):
2. You want to work with your XML data along with other relational data in a transacted manner.
Srinivas [MVP] (Expert):
3. You need an efficient way of extracting portions of your XML data and querying the same.
Srinivas [MVP] (Expert):
4. You want the server (SQL Server in this case) to validate your XML and assure well-formedness.
Srinivas [MVP] (Expert):
5. You want indexing of XML data for efficient query processing and good scalability and use a world-class query optimizer.
Srinivas [MVP] (Expert):
6. You want SOAP, ADO.NET and OLE DB accesses to the XML data.
Srinivas [MVP] (Expert):
Once you have the decision that XML is going to be stored in the database, the next thing that you must consider is how to optimize the access to the XML.
Srinivas [MVP] (Expert):
SQL Server 2005 allows you to "index" an XML column. XML indexes are created using new DDL statements that can be applied on typed and un-typed XML columns.
Srinivas [MVP] (Expert):
Indexing an XML column creates a B+ Tree for all the XML instances in the column.
Srinivas [MVP] (Expert):
XML Indexes are of two types: (1) Primary Index and (2) Secondary Index (which depends on the primary index).
Srinivas [MVP] (Expert):
The Primary XML index requires a clustered index on the primary key of the table where the XML is stored.
Srinivas [MVP] (Expert):
Once the index is created, a B-Tree of the various tags, elements, node values and node types is created.
Srinivas [MVP] (Expert):
Apart from this, document order, structure and the path from the root of the XML instance to each node is captured for efficient evaluation of path expressions.
Srinivas [MVP] (Expert):
Note that the primary key of the base table is duplicated in the primary index to correlate index rows with the base table rows.
Srinivas [MVP] (Expert):
Queries involving XML data type methods use the primary XML index.
Srinivas [MVP] (Expert):
Q: see meet, what srini is saying we can have primary kjey over the XML datatype column... so i guess, once we have the PK on a column index is automatically created...
A: Let me clear some confusion here. If you create a table having just an XML column, you cannot index it. For example, if you are creating a table called Books which store some information about the book and sample chapters as XML documents, you need to have an primary key for that table if you need to index your XML content. Otherwise, the optimizer will not know what XML document to return from the list of rows that you have for a query that you made. Thus, when you create the PK on the table, the XML indexes in the internal storage structures of SQL Server are co-related to the correct table rows.
Vinod_MVP (Expert):
Q: We can store only XML files in the discussed fashion or any type of files? If 'Only XML', then, will SQL validate the XML file [well formed or not]?
A: It is a general understanding that XML is a special and first class data type. So the discussion revolves around that. Having said, XML datatypes are of two kinds. Typed and UnTyped. Typed is one with a schema attached which enforces a hard binding over the formation of XML and for UnTyped it is not bound to a schema and can even be XML fragments as your question. Under UnTyped, SQL willnot validate the structure of contents. If you want to have raw/XML fragments with no XML related query power then I would recommend you use VARCHAR(MAX) datatype. BTW, you can convert from VARCHAR to XML and vice-versa etc. This is the next bonus ...
Subhashini[MSFT] (Moderator):
attention guys , we have the last 15 mins for the chat to conclude . Please submit your questions to the experts
Srinivas [MVP] (Expert):
Let me continue. We are are running short of time.
Srinivas [MVP] (Expert):
Once the primary XML index has been created, secondary XML indexes can be created to speed-up different classes of queries based on usage patterns.
Srinivas [MVP] (Expert):
Three types of secondary XML indexes can be created: PATH, PROPERTY and VALUE.
Srinivas [MVP] (Expert):
These indexes benefit path-based queries, custom property management scnearios and value-based queries.
Srinivas [MVP] (Expert):
Here are some guidelines on when to create each of these types of secondary indexes.
Srinivas [MVP] (Expert):
1. If the workload uses path based expressions heavily on XML columns, PATH based indexes can speed up the queries.
Srinivas [MVP] (Expert):
The most common case is the exist() method on XML columns in the WHERE clause.
Srinivas [MVP] (Expert):
2. If the workload retrieves multiple values from individual XML expressions using path expressions, clustering paths within each XML instance in the PROPERTY index is useful.
Srinivas [MVP] (Expert):
3. If the workload involves querying for values within XML instances without knowing the element or attribute names that contains those values, VALUE indexes are useful.
Srinivas [MVP] (Expert):
You can also create a full-text index on XML columns. Full text indexes, index the contents of the XML values while ignoring the XML markup.
Vinod_MVP (Expert):
Q: how index works on XML datatype columns....wht is key ?
A: Understand the key can be either a value, XPath or a property. The concepts are very simple and it maps right back to the data with a clustered index. And hence we have the restriction of a Clustered index before we cna define a XML index.
Srinivas [MVP] (Expert):
Attribute values are not full-text indexed (since they are considered part of the markup) and element tags are used as token boundaries.
Srinivas [MVP] (Expert):
Full Text XML indexes can be combined with XML indexes in some cases where we want to filter the XML values of interest based on full-text searches.
Srinivas [MVP] (Expert):
For a good treatment of XML indexes, refer to the following article by Bob Beauchemin: http://msdn.microsoft.com/sql/default.aspx?pull=/library/en-us/dnsql90/html/xmlindexes.asp
Srinivas [MVP] (Expert):
If you notice that queries on an XML column are made principally on a few element values, you may want to think about the notion of "property promotion".
Srinivas [MVP] (Expert):
Property promotion is a concept wherein you pick out the frequently accessed elements into a separate relational table and sync the table through triggers.
Srinivas [MVP] (Expert):
If the property is single-valued, for example the ISBN of a book, you can create a computed column and store the same.
Srinivas [MVP] (Expert):
However, if the property is multi valued (for example, all the first names of your customers), then a separate property table can be created.
Srinivas [MVP] (Expert):
Usage of XML schemas to type XML columns can be very beneficial for query processing and storage optimization.
Srinivas [MVP] (Expert):
Note that XML data not bound to a schema is considered as untyped and the node values are stored as Unicode strings.
Srinivas [MVP] (Expert):
The only check that SQL Server does is for well-formedness.
Srinivas [MVP] (Expert):
Typing of XML is done by associating an XML data type with registered XML schemas.
Srinivas [MVP] (Expert):
An XML column, parameter or variable bound to an XML schema collection is typed according to all the schemas in the collection.
Srinivas [MVP] (Expert):
Within an XML schema collection, the type system identifies each schema using its target namespace.
Srinivas [MVP] (Expert):
Each top-level XML element in an instance must specify a possible empty target namespace it conforms to.
Srinivas [MVP] (Expert):
Data is validated during insert and modification according to the target namespace.
Srinivas [MVP] (Expert):
Also, the binary XML representation encodes typed values based on the associated XML schema information and is fully described, so that reparsing is faster.
Srinivas [MVP] (Expert):
During query compilation, XML schemas are used for type checking and static errors are issued for type mismatches.
Srinivas [MVP] (Expert):
The query compiler also uses XML schemas for query optimizations.
Srinivas [MVP] (Expert):
Here are some other best practices to consider:
Srinivas [MVP] (Expert):
1. When updating an XML instance in-place, take care that you insert elements as the rightmost sibling or rightmost fragment.
Srinivas [MVP] (Expert):
We saw earlier that an XML document is shredded and stored in an internal format that has tags, elements, node types and values etc.
Srinivas [MVP] (Expert):
Thus, when you insert a new node into an XML element, take care that you insert that element as the last sibling.
Srinivas [MVP] (Expert):
For example, consider that you have a <book> node with several <author> child nodes as siblings.
Srinivas [MVP] (Expert):
Now, if you need to add a new author to this set, it would be faster if you add the new <author> element as the last element of the set rather than the first one.
Srinivas [MVP] (Expert):
2. Always use exist() in the WHERE clause to check for the existence of something. exist() uses indexes more efficiently (PATH and VALUE indexes).
Srinivas [MVP] (Expert):
3. Usage of schemas for typing XML allows the optimizer to allow range scans, since the values are stored internally as per the schema.
Srinivas [MVP] (Expert):
4. A new optimization has been recently added, wherein if you have multiple value() predicates on the same SELECT, under certain conditions, they are merged as one scan.
Srinivas [MVP] (Expert):
5. New profiler event (XQuery Typing) that allows you to check whether static type checking is happening in the query.
Srinivas [MVP] (Expert):
6. The debugging techniques are the same for XML data types. By using PerfMon, query plans, we can understand the operations used and then re-write the queries efficiently.
Srinivas [MVP] (Expert):
7. When uploading large amounts of data into a table containing XML columns, it is usually good to disable the XML indexes and rebuild them again.
Srinivas [MVP] (Expert):
8. XML column corruption is caught during DBCC checks and any anomalies (for example non-conformance with schemas etc) is reported back.
Srinivas [MVP] (Expert):
Apart from the "physical" checks that DBCC does, it now also does "logical" checks for XML data.
Srinivas [MVP] (Expert):
These logical checks include checking XML data against the XML index contents, validating the schema against the XML data etc.
Srinivas [MVP] (Expert):
Once an anomaly is reported, you can either opt to restore the database, or decide to rebuild the index.
Vinod_MVP (Expert):
Q: what if i have registered a schema and later on there is a change...
A: Go ahead with the alter schema and change the same. No issues. Check this article: http://www.extremeexperts.com/SQL/Yukon/ChangingXMLSchema.aspx
Srinivas [MVP] (Expert):
That brings me to the end of some of the most common best practices. We can now answer a few questions before the chat ends.
Srinivas [MVP] (Expert):
I saw some questions on memory usage. Loading and parsing a large XML document does have some performance improvements over SQL 2000, but no data has been reported yet on the increase in the performance. As I said before, you need to ask yourself why you want to load such a huge document in the database. The application tier is better suited for it.
Srinivas [MVP] (Expert):
Q: please list best practices for XML usage in YUKON ? (coding Standards,If any)
A: No specific coding standards exist. You can use the list that I provided in this chat, like the usage of indexes, schemas and the proper XML operators to ensure that your XML access performs the way you want.
Vinod_MVP (Expert):
Q: how easy or how diff is it to extract data from the column defined as XML data type, if no indexes are defined...
A: This is same as compared to extracting a data out of a table with no indexes :). If you feel you will be making frequent calls to a node, method or property then try considering the indexing option for it makes the query faster. Or even think on the lines of property promotion to enhance your queries.
Srinivas [MVP] (Expert):
Q: if i have a column of type XML... Is it compulsary to define a schema for that column? if i want to store XML with different XSDs, how should i proceed?
A: Schemas are optional. Thus, if each row in your table has to store a different XML document, one option is you can create a schema collection of all the schemas that you want and then tie the XML to that collection. The other option is to make the column as un-typed and do the schema checking in the application tier.
Vinod_MVP (Expert):
Q: i think, defining schema is not compulsary...
A: As said before, it is NOT mandatory. To define a Typed XML datatype a schema is required. You can have untyped XML's that DONOT require a schema attached to them and they can even be XML fragments generated on the fly.
Srinivas [MVP] (Expert):
Q: Is yukon uses DIFFGRAM internally for XML handling in case of ADO ?
A: Not sure I understand the question. Will have to check.
Srinivas [MVP] (Expert):
Here are some resources that you can use to know more about the XML features.
Srinivas [MVP] (Expert):
1. My Blog about good XML articles: http://blogs.sqlxml.org/srinivassampath/archive/2005/07/05/3858.aspx
Srinivas [MVP] (Expert):
2. Webcast: http://go.microsoft.com/fwlink/?LinkId=43856 (Developing Smart Client Applications Using SQL Server 2005 Native XML Support)
Srinivas [MVP] (Expert):
3. Webcast: http://go.microsoft.com/fwlink/?LinkId=43858 (Managing XML Data on the Database with SQL Server 2005 and Visual Studio 2005)
Srinivas [MVP] (Expert):
4. Webcast: http://go.microsoft.com/fwlink/?LinkId=43877 (Optimizing and Troubleshooting XML Applications at the Server)
Vinod_MVP (Expert):
Q: Is yukon uses DIFFGRAM internally for XML handling in case of ADO ?
A: Can you be more specific here. SQLXML did try to use Diffgrams but these were needed to improve performance as data used to pass through the middle tier. But with XML datatype the definition of XML is a column value rather than a row valued indices. Hope you are able to make the distinction.
Vinod_MVP (Expert):
Some more links of interest include:
Vinod_MVP (Expert):
Using CLR Integration in SQL Server 2005: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnsql90/html/sqlclrguidance.asp
Vinod_MVP (Expert):
Making the Most of the CLR : http://www.windowsitpro.com/Article/ArticleID/45673/45673.html
Vinod_MVP (Expert):
Using CLR features in SQL Server 2005 : http://www.sswug.org/see/22708
Vinod_MVP (Expert):
Security for CLR Assemblies in SQL Server 2005 : http://www.windowsitpro.com/SQLServer/Article/ArticleID/45948/SQLServer_45948.html
Vinod_MVP (Expert):
SQL Server 2005-Managed Execution : http://www.extremeexperts.com/SQL/Yukon/ManagedExecution.aspx
Vinod_MVP (Expert):
Can get to the CLR teams Blog at: http://blogs.msdn.com/sqlclr
Vinod_MVP (Expert):
You can catch some resources over my website at: www.ExtremeExperts.com too
Subhashini[MSFT] (Moderator):
great , so that one fully loaded session !
Subhashini[MSFT] (Moderator):
So that was one fully loaded session !
Srinivas [MVP] (Expert):
Q: How can we handle CDATA section of XML, while reading from table ?
A: Not sure what you mean by handle. Can you give an example?
Subhashini[MSFT] (Moderator):
Thanks to all of you for joining us for this chat .
Subhashini[MSFT] (Moderator):
Thanks to Vinod and Srinivas for delivering such an informative session
Subhashini[MSFT] (Moderator):
This brings us to the closure of today's chat
Subhashini[MSFT] (Moderator):
and please email your queries to Vinod and Srinivas
Srinivas [MVP] (Expert):
Our pleasure. In case you still have questions, do post them at the SQLBang group at: http://groups.msn.com/sqlbang.
Subhashini[MSFT] (Moderator):
Srinivas :ssampath@sct.co.in <mailto:ssampath@sct.co.in>
Srinivas [MVP] (Expert):
If you post in the group, more people can benefit from the answer.
Subhashini[MSFT] (Moderator):
Vinod at mailto:vinodk@mvps.org>
Vinod_MVP (Expert):
Thanks all for your time. I hope you had some good information and fun too. Catch you all at the Usergroups.
Srinivas [MVP] (Expert):
Thanks all. Have a nice evening (or day, wherever you are :-))
Vinod_MVP (Expert):
Bye all.
Subhashini[MSFT] (Moderator):
check out the sql user groups in India at http://www.microsoft.com/india/communities/usergroups/default.aspx
Subhashini[MSFT] (Moderator):
Please pool in your feedback about the chat at commind@microsoft.com
Subhashini[MSFT] (Moderator):
Also feel free to email at commind@microsoft.com if you would like to host a webchat ! :)
Subhashini[MSFT] (Moderator):
Thanks all of you for joining us and thanks to Srinivas and Vinod for their time and expertise
Subhashini[MSFT] (Moderator):
have a wonderful evening :)
Srinivas [MVP] (Expert):
Bye all.
 
     

©2009 Microsoft Corporation. All rights reserved. Contact Us |Terms of Use |Trademarks |Privacy Statement