How to bridge the gap
Brian Lawton and Don Awalt
Article from SQL Server Magazine
While developing SQL Server 7.0, Microsoft was so concerned about backward compatibility that one of its design goals included the capability for all existing application source code to run unmodified. However, as with most product upgrades, in order for the company to implement several new features, it had to sacrifice some backward compatibility. In this article, we explore some of SQL Server's internal implementation changes that resulted in potential compatibility problems and examine several situations that developers might face when upgrading applications.
| Why All the Changes? | |
| System-Level Compatibility | |
| Syntactical Compatibility | |
| Behavioral Compatibility | |
| Putting It to the Test | |
| The Gotchas in Upgrading |
With any new software release, you expect change. SQL Server 7.0 provides change and more. Internally, SQL Server 7.0 is almost a completely new product. Why the drastic change? Put simply, the architecture and code base for SQL Server had reached their limits with the 6.5 release. The only way to achieve the long list of features and design goals for this release and still position SQL Server with enough growth room was to reengineer the internals completely. Doing so created an opportunity for Microsoft to enhance the efficiency of the query optimizer by adding new algorithms and to increase ANSI compatibility by tightening many of the Transact SQL (T-SQL) constructs. These two changes most directly contribute to the compatibility conflicts in moving to SQL Server 7.0.
Let's examine some changes in SQL Server 7.0 and how they affect compatibility. We'll categorize the issues into three groups: system-level, syntactical, and behavioral.
Knowing that backward compatibility is a large concern for customers, Microsoft incorporated the concept of compatibility levels into SQL Server 7.0. SQL Server 7.0 has three compatibility levels: 60 for SQL Server 6.0 databases, 65 for SQL Server 6.5 databases, and 70 for SQL Server 7.0 databases. These levels let legacy databases in the 7.0 environment operate as if they were running under an earlier release of the product.
The system stored procedure sp_dbcmptlevel lets developers report on and change the compatibility level of their databases. For example, to set the Northwind database to level 65 compatibility, execute the following command:
sp_dbcmptlevel [Northwind], 65
In SQL Server 7.0, the Master database must always be at the 70 compatibility level. If you add any user-defined objects to Master, you must test and ensure that they work correctly at the 70 level. However, you can set other user databases within the server to the 60 or 65 compatibility levels. This functionality lets developers migrate their existing databases to SQL Server 7.0 in phases. For example, a development environment might support a new project that uses a version 70 database while simultaneously supporting two older projects that use one database at the 60 compatibility level and another at the 65 compatibility level.
For databases at a 60 or 65 compatibility level, many of SQL Server 7.0's new features are not available. Also, although SQL Server 7.0 currently supports these legacy databases, future releases might not. Therefore, we strongly recommend that you upgrade all applications and databases to support 7.0 as soon as possible.
System Tables and Information Schema Views. Microsoft has always frowned on developers directly accessing SQL Server's internal system tables. With each release of SQL Server, the company provides a disclaimer that tells users to access system tables only via one of the provided APIs: system stored procedures, SQL Distributed Management Objects (SQL-DMO), or T-SQL. The reason for this warning is that the format of the system tables depends on the internal architecture of SQL Server and might change from release to release. However, until SQL Server 7.0, Microsoft made few changes to these tables, and consequently, many developers incorporated system tables into their applications.
Developers who did not heed Microsoft's warning are in for a rude awakening. Many of the system tables' structures have drastically changed. In addition, SQL Server 7.0 storage format is natively Unicode; consequently, system meta data now uses the new Unicode data format. For example, SQL Server 7.0 identifiers are all nvarchar(128). During beta testing, system-table changes gave sites the most problems in upgrading their applications. To give some compatibility, SQL Server 7.0 provides system-level views that mimic the 6.x system tables. Although these views exist in all compatibility modes, they are for backward compatibility only and do not fully support many of the new features.
In SQL Server 7.0, meta data management is still available through system stored procedures and SQL-DMO. In addition to these two access methods, SQL Server now supports the SQL-92 definition for the Information Schema. The Information Schema is a series of read-only views that provide an internal, system table-independent view of the SQL Server meta data. Developers who need to access system information but are concerned about database portability can use these views instead of the system tables.
Many syntax changes that occurred in the 7.0 release (e.g., the addition of the TOP keyword, increased data-type sizes, and increased object-name size limits) involve the addition of new features. Fortunately, those changes have little impact on your migration strategy. However, a few other changes are worth mentioning.
UPDATE and Aliases. One significant difference between SQL Server 6.5 and 7.0 is the use of aliases in an UPDATE statement. When the FROM clause contains an alias and the SET clause references that alias, SQL Server 7.0 does not let the UPDATE clause reference the table name. The UPDATE clause must reference the alias instead. For example, Listing 1 shows a SQL statement that works in SQL Server 6.5, but when you execute it in SQL Server 7.0, the compiler reports an error. Listing 2 shows the syntax that works in both the 6.5 and 7.0 environments.
Foreign Keys. The data types of foreign-key columns must be the same between tables. Before SQL Server 7.0, this requirement applied to numeric data, but SQL Server allowed char and varchar key combinations. Listing 3 is an example of a foreign-key declaration that is valid in SQL Server 6.5, but fails in SQL Server 7.0 (unless it's running in 65 compatibility mode). SQL Server 6.5 provided the implicit type conversion between the two types. In SQL Server 7.0, the types must exactly match. Updated versions of the creation statements appear in Listing 4.
Deferred Name Resolution. Deferred name resolution allows procedure compilation without all table references being present. Deferred name resolution works in much the same way as the object-oriented concept of late binding. At compile time, the compiler attempts to resolve all table names that the procedure references. But if a table does not yet exist, the compiler defers this name resolution until execution time.
For developers who have used temporary tables within their stored procedures or triggers, this subtle new feature is long overdue. Although this feature is useful, it has a side effect that many developers might initially miss—the compiler no longer reliably catches table-name typos. Yes, you now must test your code. This statement might sound funny at first, but if you are not aware of deferred name resolution, you might find yourself wondering why the compiler missed this error.
Outer Joins. In earlier versions of SQL Server, you use the *= and =* operators to specify the left and right outer join conditions in the WHERE clause. In some cases, this syntax results in an ambiguous query that might not return the expected results. For example, consider the two SELECT statements in Listing 5 and their output in Figure 1 on page 32. Logically, both statements are identical, but they result in different data sets. The ambiguity exists under the old syntax because the optimizer does not always distinguish the JOIN condition from the selection criteria. SQL Server 6.5 supports SQL-92-compliant outer joins using the LEFT OUTER JOIN and RIGHT OUTER JOIN extensions in the FROM clause, thus removing this ambiguity. For backward compatibility, SQL Server 7.0 continues to support the old-style syntax. However, Microsoft has stated that future releases of SQL Server will support only the SQL-92 syntax. Therefore, we strongly recommend that you convert any remaining queries that use the old-style syntax to a SQL-92-compliant form.
The behavioral category contains some of the hardest-to-identify compatibility changes. Generally, these subtle changes cause problems only at execution time. Further complicating matters, some of the conflicts produce errors or warning messages, but others might go unnoticed and lead to data corruption. The following items highlight some of the more frequent compatibility challenges developers encounter.
The Query Optimizer. A major change that affects backward compatibility is the addition of several new JOIN algorithms to the query optimizer. Before SQL Server 7.0, only one algorithm, nested iteration, was available. Each time a query executed, the result sets always returned in the same order. In SQL Server 7.0, the optimizer has several options: a hash join (in-memory, grace, or recursive), a merge join, or the nested iteration. Each of these alternatives can produce the correct result; however, each has the potential to return that result in a different order. For more information on how the query optimizer can improve query processing, see Dusan Petkovic and Christian Unterreitmeier, "New Features for Query Processing," July 1999.
If you rely on the implicit ordering of your data, you need to be especially conscious of this change. The change is most visible when you use the GROUP BY clause. Including a GROUP BY clause in your SELECT statement does not imply any ordering of the result set. Before SQL Server 7.0, developers often took for granted that the data would return in the same order as the columns in the GROUP BY clause. Therefore, they omitted the ORDER BY clause. SQL Server 7.0, like the ANSI standard, does not guarantee this ordering. If you need ordered data, you must supply the ORDER BY clause.
Also, because of the new join algorithms and internal changes (e.g., row-level locking, increased page size, and I/O and optimizer improvements), consider removing optimizer hints from your queries unless SQL Server 7.0 still requires the hints. Because of query-engine inefficiencies, earlier releases of SQL Server sometimes require that you use hints to get the best execution plan for large queries. The many enhancements to SQL Server 7.0 change this requirement. For maximum query efficiency, leave query-plan determination to the optimizer and use hints only as a last resort.
Null Comparisons. Dealing with three-state logic, which nulls introduce, has always been difficult for developers. Not only do queries have to consider the equality of an attribute, but when the attribute allows nulls, queries must also deal with the issue of the attribute's existence. When an attribute is null, its value is unknown. Therefore, logical comparison to another value is not possible. To get around this problem, earlier versions of SQL Server implement a T-SQL extension that allows for the comparison of nulls. For example, value=null was always true when value was null. This capability changes with SQL Server 7.0. In adhering to ANSI standards, SQL Server 7.0 uses the SET ANSI_NULLS option to control null comparisons. When set to ON, which is the default value, the statement value = null equates to unknown, not true or false. To test for the presence of nulls, you must now use the IS NULL or IS NOT NULL clause. Consider the SELECT statements in Listing 6: Under SQL Server 6.x, the first SELECT statement returns row "BB"; under SQL Server 7.0, it returns no rows. The second SELECT statement returns row "BB" under both SQL Server 6.5 and SQL Server 7.0.
Aggregation and Nulls. Another area involving nulls (and requiring good defensive programming) is aggregation. When SQL Server 6.5 encounters a null value in an aggregate function (such as AVG, MIN, and MAX), it skips the row and continues without generating any warning or error messages. Under SQL Server 7.0, the same statement completes but generates the message Warning: Null value eliminated from aggregate. For example, if you use the table definition and the first UPDATE statement Listing 7 shows, SQL Server 7.0 generates the warning. The second UPDATE statement correctly accounts for the nullibility of col2 and is preferred in all SQL Server versions.
String Truncation. SQL Server 7.0 no longer allows implicit truncation of strings. For example, consider the code in Listing 8, page 34. When you execute the code, SQL Server 6.5 does not generate any errors. However, the result set from the SELECT statement returns two rows; each contains the value 0123456789. If you execute that same SQL script under SQL Server 7.0, the second INSERT statement generates an error message, String or binary data would be truncated, and the statement ignores the INSERT. The result set returns only one row. The good news is that implicit data loss is eliminated. However, when moving data, developers and designers now must be more conscious of field sizes. Note that even when running under 65 compatibility mode, SQL Server 7.0 does not support 6.x behavior.
Arithmetic Overflow. In SQL Server 6.x, UPDATE and INSERT statements that encounter an arithmetic overflow condition set the value to null, or skip the action if the value belongs to a non-null column. In SQL Server 7.0, however, the action terminates and no modifications occur. For example, executing the SQL code in Listing 9, page 34, in SQL Server 6.x generates the warning Arithmetic overflow occurred, but the row still inserts with the col12 value set to NULL. Under SQL Server 7.0, these same SQL statements still produce an error message, Arithmetic overflow error for data type tinyint, Value = 1600000000. However, no row is inserted. As with string truncation, designers and developers must be conscious of this change and either adjust their data types or modify their calculations accordingly.
Empty Strings. Another new feature in SQL Server 7.0 is its ability to handle empty strings. Pre-7.0 releases interpret an empty string as either a null or a single blank character. In SQL Server 7.0, empty strings are treated as empty strings. This change affects all string-related functions. Some examples appear below; for a complete list of affected functions, consult SQL Server 7.0's Books Online (BOL).
Before 7.0 | With 7.0 |
DATALENGTH(") = 1 | DATALENGTH(") = 0 |
RTRIM(' ') = null | RTRIM(' ') =" |
Timestamps. Originally, the purpose of the timestamp data type was to support SQL Server recovery algorithms. Every time someone modified a page, SQL Server stamped it with the current @@DBTS value and incremented @@DBTS by one. This approach was sufficient to determine the relative sequence of page modifications for recovery, but the timestamp values have no relation to time. In SQL Server 7.0, the only remaining use of the timestamp data type is for concurrency, because the @@DBTS value is incremented only for use in timestamp columns. If a table contains a timestamp column, every time an INSERT, UPDATE, or DELETE statement modifies a row, the timestamp value in the row is set to the current @@DBTS value, and @@DBTS is incremented by one.
Also, you need to be aware of the effect of timestamps on replication. Merge replication does not support timestamp columns. Therefore, organizations that want to exploit SQL Server's merge replication features need to remove the timestamp column from any published table. In a merge replication scenario, changes to this column have no common meaning between servers, because local servers generate their own timestamp values, and the uniqueness of those values is guaranteed only within their specific databases.
Each deployment and database is different. We have touched on just a few of the potential pitfalls you might encounter. Microsoft invested in the development of many resources to ensure a smooth upgrade path, including BOL, so be sure to check them out. BOL provides an excellent reference on compatibility issues and discusses many additional topics such as ODBC and operational (DBA) changes.
As with any project, before deploying your databases to production, be sure to thoroughly test your applications in the SQL Server 7.0 environment. If your developers have good SQL writing habits, you should not encounter many problems. However, many of the changes are behavioral in nature, so they won't show until execution time. In the end, most of the compatibility conflicts are a direct result of stronger ANSI compliance. Therefore, resolving those conflicts will result in a more robust, portable, and stable database. Make the time and do the migration. SQL Server 7.0 is worth upgrading to.
Listing 1 UPDATE Statement, Invalid in SQL Server 7.0
USE pubs go UPDATE Titles SET t.Price = t.Price + 100 -- References alias. FROM Titles t Go
Listing 2 UPDATE Statement, Works in SQL Server 6.5 and 7.0
USE pubs go UPDATE t SET t.Price = t.Price + 100 -- References alias FROM Titles t go
The following code is also valid:
UPDATE Titles SET Price = Price + 100 -- Alias not referenced. FROM Titles t Go
Listing 3 Foreign-Key Declaration, Invalid in SQL Server 7.0
USE tempdb go CREATE table sample (col1 char(10) not null primary key) go CREATE table another (col1 varchar(10) not null foreign key (col1) references sample(col1)) go DROP table another go DROP table sample go
Listing 4 Foreign-Key Declaration, Works in SQL Server 6.5 and 7.0
USE tempdb go CREATE table sample (col1 char(10) not null primary key) go CREATE table another (col1 char(10) not null foreign key (col1) references sample(col1)) go DROP table another go DROP table sample go CREATE table sample (col1 varchar(10) not null primary key) go CREATE table another (col1 varchar(10) not null foreign key (col1) references sample(col1)) go DROP table another go DROP table sample Go
Listing 5 ANSI JOIN Syntax
USE tempdb
go
CREATE table Department
( DepartmentId int not null
, Description varchar(50) not null )
go
CREATE table Employment
( DepartmentId int not null
, EmployeeId int null )
go
INSERT INTO Department values (1, "Sales")
INSERT INTO Department values (2, "Marketing")
INSERT INTO Department values (3, "Information Services")
go
INSERT INTO Employment values (1, 99)
INSERT INTO Employment values (2, 88)
go
PRINT ''
PRINT 'Old Style Join = Incorrect results!!'
PRINT ''
SELECT d.DepartmentId
, d.Description
, e.DepartmentId
FROM Department d
, Employment e
WHERE d.DepartmentId *= e.DepartmentId
AND d.DepartmentId <> 2
OR e.DepartmentId IS NULL
go
PRINT ''
PRINT 'ANSI Join = Correct results!!'
PRINT ''
SELECT d.DepartmentId
, d.Description
, e.DepartmentId
FROM Department d
LEFT OUTER JOIN Employment e ON d.DepartmentId = e.DepartmentId
WHERE d.DepartmentId <> 2
OR e.DepartmentId IS NULL
go
DROP table Department, Employment
Go
Figure 1 ANSI Join vs. *= Results
Old-style join yields incorrect results:
| DepartmentId Description | DepartmentId | |
1 | Sales | 1 |
2 | Marketing | (null) |
3 | Information Services | (null) |
(3 row(s) affected)
ANSI Join Yields Correct results:
| DepartmentId Description | DepartmentId | |
1 | Sales | 1 |
3 | Information Services | (null) |
(2 row(s) affected)
Listing 6 Comparing Nulls
USE tempdb
go
CREATE table sample
( col1 char(10) not null
, col2 tinyint null )
go
INSERT INTO sample values ("AA", 1)
INSERT INTO sample values ("BB", NULL)
go
Valid under 6.x, incorrect results under 7.0:
SELECT * FROM sample WHERE col2 = null go
Valid under both 6.x and 7.0:
SELECT * FROM sample WHERE col2 IS null go DROP table sample Go
Listing 7 Aggregation and NULLs
USE tempdb
go
CREATE table sample
( col1 char(10) not null
, col2 int null )
go
INSERT INTO sample values ("AA", 1)
INSERT INTO sample values ("AA", 1)
INSERT INTO sample values ("AA", 1)
INSERT INTO sample values ("AA", null)
go
This statement generates a warning:
SELECT col1, sum(col2)
FROM sample
GROUP BY col1
go
Good, defensively written SQL:
SELECT col1, sum(IsNull(col2, 0))
FROM sample
GROUP BY col1
go
DROP table sample
Go
Listing 8 String Truncation
USE tempdb
go
CREATE table sample (col1 varchar(10) not null)
go
INSERT INTO sample values ("0123456789")
INSERT INTO sample values ("01234567890123456789")
go
SELECT * FROM sample
go
DROP table sample
Go
Listing 9 Arithmetic Overflow
USE tempdb
go
CREATE table sample
( col1 char(10) not null
, col2 tinyint null )
go
INSERT INTO sample values ("AA", power(200, 4))
go
SELECT * FROM sample
go
DROP table sample
Go
Before you start an upgrade from SQL Server 6.5 to SQL Server 7.0, take some time to prepare. First, check to make sure your system meets the requirements:
| • | You must have 10MB of free space on your C drive for temporary files and logs. |
| • | Tempdb on your SQL Server 6.5 database must be at least 10MB. |
| • | The Master database must have 3MB of free space. To verify that you meet this requirement, run Use Master exec sp_spaceused @updateusage = 'TRUE' |
and verify that the unallocated space is at least 3MB (or 3072KB).
| • | The syscomments text must be intact for all objects to be migrated to SQL Server 7.0 (e.g., views, stored procedures, defaults, rules). |
| • | If you are upgrading from SQL Server 6.5 on one server to a SQL Server 7.0 installation on a different server, both computers must be in the same administrative domain structure. |
| • | The value for @servername must not be null. If you run select @@servername and it returns null, run exec sp_addserver 'myserver','local' (substitute your server name for myserver). |
Before you begin your upgrade:
1. | Back up your SQL Server 6.5 installation. |
2. | If you're using replication, upgrade your distribution server first. |
3. | Ensure that logins exist for every database user. If you restored a database that originated on another server, run sp_change_users_login on the database first to make sure all users map to valid SQL Server logins. |
4. | Make sure you have no undistributed transactions. To verify that no undistributed transactions exist, run exec sp_repltrans against the Publisher database. If activity has stopped, no rows are returned. Then, run exec sp_MSDistribution_counter @publisher = 'publisher_servername' on the distribution database for each supported Publisher. The No rows returned message should have 0 in the undelivered_jobs column. |
Some other considerations:
| • | You must run the upgrade as an administrator on both the 6.5 and the 7.0 machines. |
| • | Stored procedures that modify system tables are not upgraded. |
| • | The sp_rename system stored procedure does not change an object's name in syscomments. Therefore, any object you upgrade retains its old name if you haven't dropped or recreated the object after you run sp_rename. |
Brian Lawton is a consultant with RDA Consultants Limited, an MCSE, MCDBA, and MCSD.
Don Awalt is president of RDA Consultants Limited. He is an MCSE, MCSD, and MCP+I.
The above article is courtesy of SQL Server Magazine.
We at Microsoft Corporation hope that the information in this work is valuable to you. Your use of the information contained in this work, however, is at your sole risk. All information in this work is provided "as -is", without any warranty, whether express or implied, of its accuracy, completeness, fitness for a particular purpose, title or non-infringement, and none of the third-party products or information mentioned in the work are authored, recommended, supported or guaranteed by Microsoft Corporation. Microsoft Corporation shall not be liable for any damages you may sustain by using this information, whether direct, indirect, special, incidental or consequential, even if it has been advised of the possibility of such damages. All prices for products mentioned in this document are subject to change without notice.
International rights = English only.