| |
| Introduction |
| |
| When I met my friend Sarah the
other day in the cafeteria around the street corner, I wasn’t
able to believe myself. She wasn’t quite herself on that
day. I know of Sarah as a very bubbly character who is always
enthusiastic; the last time I met her, she was pretty excited
about her new job in a mid-cap company. We just sat over a cup
of cappuccino and she slowly began to open up about her new
job. She has been inducted into a new project recently and much
to her excitement, she is disappointed that there is no proper
documentation on what has been already done. For instance, there
is no information on the Database structure and no one seems
to really know about it! |
| |
| Just then, an idea to write
a utility to document the database struck me! No I am not competing
against the SQL Enterprise Manager or other tools like Erwin.
I am talking about a utility that can depict the basic database
structure. |
| |
| Let us say that you are assigned
to a project on a Monday and your tasks include some changes
to a couple of Stored Procedures. Worse, you find that a Stored
Procedure interacts with some three databases! After a while
it is obvious that you will find yourself in soup if there is
no proper documentation on the databases. I bet this situation
will turn out to be one of the worse nightmares in your programming
life! And you can even imagine that you always don’t have
the luxury to read a bunch of documents before starting to be
productive. I tell you, these managers are always pushy if greedy
is a strong word when it comes to productivity and turn-around
time!! |
| |
| TIP: Do you know this? |
| |
| Sp_help |
| |
| We can get complete information
about each table in a database using the following
syntax: |
| |
| Execute
sp_help <<tablename>> |
|
| |
| But if we have hundreds of tables
in a database then the above method would surely be a tedious
way to do. So to over come that we could use the following query: |
| |
| Execute
sp_MSForeachTable @command1 = "sp_help '?'" |
|
| |
| As sp_help lists information
about a single table this sp_MSForeachTable displays all information
about all tables within the existing database. Neat isn’t
it? |
| |
| The only problem which I see
with the above system stored procedure is it gives too much
of information which one wouldn’t need all the time. So
in order to retrieve only that information which a developer
would be interested in I have come up with a utility, which
is explained below. |
| |
| System Tables |
| |
| The entire system, from the
tables you create, to configuration options that control SQL
Server's operation, is stored in relational tables within the
system. These tables are installed automatically when you install
SQL Server on your system. It contains the information that
SQL Server needs to manage and operate your system. |
| |
| I have used System tables extensively
in this article. So if you aren’t familiar with system
tables then the below table would give you a better idea about
what's in each of those tables. Note: I have explained only
those tables that are used in this article. |
| |
| Table Name |
Description |
| SYSCOLUMNS |
Contains one row for each column
in every table and view in the database. Each row's
columns give the target column's attributes and
describe its behavior. |
| SYSCOMMENTS |
Contains one or more rows for each
CHECK constraint, DEFAULT constraint, rule, stored
procedure, trigger, and view in the database. The
table's TEXT column contains the SQL or Transact-SQL
statement that defines the constraint, rule, stored
procedure, trigger, or view. Since definitions can
be up to 4MB in size while the TEXT column can hold
only 4,000 bytes (characters), a single object may
be described by more than one row in the SYSCOMMENTS
table. |
| SYSFOREIGNKEYS |
Contains one row for each FOREIGN
KEY constraint used in a table definition. |
| SYSINDEXES |
Contains one row for each index,
table, and BLOB in the database. The columns in
each row describe such things as the location of
the object's first page of data, the row size, and
the number of keys, row count, the name of the table,
and the name of the key columns. |
| SYSINDEXKEYS |
Contains one row with the table
ID, index ID, column ID, and ordinal position of
the column in the index for each column in every
index in a database. |
| SYSOBJECTS |
Contains one row for each object
(constraint, default, index, log, rule, stored procedure,
table, and so on) in the database. |
| SYSTYPES |
Contains one row for each system-supplied
and user-defined data type in the database. |
| SYSUSERS |
Contains one row for each Windows
user, Windows group, MS-SQL Server user, and MS-SQL
Server role in the database. |
|
|
| |
| Introducing DBDoc |
| |
| The utility is really a stored
procedure that has four parts. In other words, it analyzes the
given database and does 4 tasks namely |
| |
| 1 |
List all columns in all
the tables with its datatype, size and allow null flag.
|
| 2 |
List check and default constraints for
all the tables in the database |
| 3 |
List the foreign key details of all tables
in the database |
| 4 |
The primary key, index name and the type
of index for all the tables. |
|
| |
| Well, that’s quite a lot
of information, isn’t it? Let us take these items one
by one and look into with a magnifying glass |
| |
| Listing all columns
in all the tables |
| |
| Here is the SQL statement that
will do the trick. |
| |
Select
so.name "Table Name",
sc.name "Field Name",
st.name "Data Type",
sc.length "Size",
'Is Null' =
Case
When
sc.isnullable = 1 then 'Null'
Else
'Not Null'
End
From
syscolumns sc,
sysobjects so,
systypes st
Where
so.type = 'U' and
sc.id = so.id and
sc.xtype = st.xtype and
so.status > 0 and
st.xusertype <> 256 -- this
is for sysname which is equivalent to nvarchar |
|
| |
| This query is based on information
from the SYSOBJECTS, SYSCOLUMNS, and SYSTYPES system tables
that contains one row for each column in the current database
that has a user-defined data type |
| |
| And here is the partial result
when the above statement was run on the PUBs database. |
| |
| |
Table Name |
Field Name |
Data Type |
Size |
IsNullable |
| 1 |
titleauthor |
au_id |
varchar |
11 |
Not Null |
| 2 |
titleauthor |
au_id |
id |
11 |
Not Null |
| 3 |
titleauthor |
au_id |
tid |
11 |
Not Null |
| 4 |
titleauthor |
title_id |
varchar |
6 |
Not Null |
| 5 |
titleauthor |
title_id |
id |
6 |
Not Null |
| 6 |
titleauthor |
title_id |
tid |
6 |
Not Null |
| 7 |
titleauthor |
au_ord |
tinyint |
1 |
Null |
| 8 |
titleauthor |
royaltyper |
int |
4 |
Null |
|
|
| |
| Now, examine the shaded area
closely. The table titleauthor has two user-defined datatypes
“id” and “tid” and hence the result
is a Cartesian product of the two rows. Here is the slightly
modified query to fix the problem. Try this on PUBs and NorthWind
also. |
| |
Select
so.name "Table Name",
sc.name "Field Name",
type_name (sc.xusertype) "Data
Type",
sc.length "Size",
"Is Nullable" =
Case
When
sc.isnullable = 1 then 'Null'
Else
'Not Null'
End
From
syscolumns sc,
sysobjects so
Where
so.type = 'U' and
sc.id = so.id and
so.status > 0 |
|
| |
| And the partial result when
run on the PUBs database is |
| |
| |
Table Name |
Field Name |
Data Type |
Size |
IsNullable |
| 1 |
Titleauthor |
au_id |
Id |
11 |
Not Null |
| 2 |
Titleauthor |
title_id |
Tid |
6 |
Not Null |
| 3 |
Titleauthor |
au_ord |
Tinyint |
1 |
Null |
| 4 |
Titleauthor |
royaltyper |
Int |
4 |
Null |
|
|
| |
| There you go, good work bud! |
| |
| Point to note in this query
is “xusertype” field of SYSCOLUMNS would contain
the ID of the datatype. I am using the function type_name to
convert that ID into actual datatype value. For better understanding
check the below table: |
| |
| xusertype |
type_name(xusertype) |
| 34 |
image |
| 35 |
text |
| 36 |
uniqueidentifier |
| 48 |
tinyint |
| 52 |
smallint |
| 56 |
int |
| 59 |
real |
| 61 |
datetime |
| 98 |
sql_variant |
| 99 |
ntext |
| 104 |
bit |
| 106 |
decimal |
| 108 |
numeric |
| 127 |
bigint |
| 165 |
varbinary |
| 167 |
varchar |
| 173 |
binary |
| 175 |
char |
| 231 |
nvarchar |
| 239 |
nchar |
| 256 |
sysname |
|
|
| |
| List check and default
constraints on the tables |
| |
| This one is slightly complex
when compared with the previous one. |
| |
Select
object_name(parent_obj) "Table
Name",
col_name(parent_obj,info) "Field
name",
sc.TEXT "Check/Default Constraint
Text"
From
sysobjects so,
SysComments sc
Where
parent_obj in (select id from
sysobjects where type='U') and
xtype in ('D','C') and
sc.id = so.ID |
|
| |
| This query is based on information
from the SYSOBJECTS and SYSCOMMENTS system tables that contain
one row for each Check and Default constraint in the current
database. |
| |
| TIP: You
might also want to check the following queries, |
| |
|
SELECT * FROM INFORMATION_SCHEMA.CHECK_CONSTRAINTS
|
|
SELECT * FROM INFORMATION_SCHEMA.COLUMN_DOMAIN_USAGE |
| |
SELECT * FROM INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE |
|
| |
| And the partial result of this
query is |
| |
| |
Table Name |
Field Name |
Check/Default
Constraint Text |
| 1 |
titles |
type |
('UNDECIDED') |
| 2 |
titles |
pubdate |
(getdate()) |
| 3 |
Jobs |
Job_desc |
('New Position - title not formalized
yet') |
| 4 |
Jobs |
min_lvl |
([min_lvl] >= 10) |
| 5 |
Jobs |
max_lvl |
([max_lvl] <= 250) |
|
|
| |
| Listing foreign key
information |
| |
| Actually I got the base logic
of this query from the web. I took the query and just modified
it slightly to suit our need. Why re-invent the wheel ah? |
| |
Select
so.[name] "Table Name",
sc.[name] "Field Name",
Fkey_Constraints = so2.[name]
+ '(' + sc2.[name] +')'
From
sysforeignkeys sf
Inner join
(select uid, id, [name] from sysobjects
where xtype = 'U') so on sf.fkeyid = so.id
Inner join
(select uid, id, [name] from sysobjects
where xtype = 'U') so2 on sf.rkeyid = so2.id
Inner join
(select id, colid,[name] from
syscolumns) sc on sf.fkeyid = sc.id and sf.fkey = sc.colid
Inner join
(select id, colid,[name] from
syscolumns)sc2 on sf.rkeyid = sc2.id and sf.rkey = sc2.colid
Inner join
(select id, [name] from sysobjects)
so3 on sf.constid = so3.id
Inner join
sysusers su on so.uid = su.uid
Inner join sysusers su2 ON so2.uid = su2.uid |
|
| |
| And the result of this query
is: |
| |
| |
Table Name |
Field Name |
Fkey_Constraints |
| 1 |
titles |
pub_id |
publishers(pub_id) |
| 2 |
titleauthor |
au_id |
authors(au_id) |
| 3 |
titleauthor |
title_id |
titles(title_id) |
| 4 |
sales |
stor_id |
stores(stor_id) |
| 5 |
sales |
title_id |
titles(title_id) |
| 6 |
roysched |
title_id |
titles(title_id) |
| 7 |
discounts |
stor_id |
stores(stor_id) |
| 8 |
pub_info |
pub_id |
publishers(pub_id) |
| 9 |
employee |
job_id |
jobs(job_id) |
| 10 |
employee |
pub_id |
publishers(pub_id) |
|
|
| |
| TIP: You
might also want to check the following queries, |
| |
|
SELECT * FROM INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE
|
|
SELECT * FROM INFORMATION_SCHEMA.CONSTRAINT_TABLE_USAGE |
| |
SELECT * FROM INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS |
|
| |
| Listing primary keys
and index information |
| |
| I hate to admit it; this one
took most of the time. |
| |
Select
so.name "Table name",
sc.name "Field name",
si.name "Index name",
"Primary Key / Unique"
=
Case
When (convert(bit,(si.status
& 0x800)/0x800) = 1 and convert(bit,(si.status &
2)/2) = 1) then
'Primary
Key, Unique Key'
When convert(bit,(si.status
& 0x800)/0x800) = 1 then
'Primary
Key'
When convert(bit,(si.status
& 2)/2) = 1 then
'Unique
Key'
Else '--'
End,
"Index Type" =
Case
When convert(bit,(si.status
& 16)/16) =1 then
'(Clustered)'
Else
'(Non-Clustered)'
End
From
sysobjects so,
sysindexes si,
syscolumns sc,
sysindexkeys sik
Where
so.type in ('U') and
si.id = so.id and
so.id = sc.id and
so.id = sik.id and
si.indid = sik.indid and
sc.colid = sik.colid and
sik.keyno <= si.keycnt and
(si.status&32) = 0 |
|
| |
| This would give out the result
|
| |
| |
Table Name |
Field Name |
Index name |
Primary / Unique
Key |
Index Type |
| 1 |
titleauthor |
au_id |
UPKCL_taind |
Primary Key, Unique Key |
(Clustered) |
| 2 |
titleauthor |
title_id |
UPKCL_taind |
Primary Key, Unique Key |
(Clustered) |
| 3 |
titleauthor |
au_id |
auidind |
-- |
(Non-Clustered) |
| 4 |
titleauthor |
title_id |
titleidind |
-- |
(Non-Clustered) |
| 5 |
Stores |
stor_id |
UPK_storeid |
Primary Key, Unique Key |
(Clustered) |
| 6 |
Sales |
stor_id |
UPKCL_sales |
Primary Key, Unique Key |
(Clustered) |
| 7 |
Sales |
ord_num |
UPKCL_sales |
Primary Key, Unique Key |
(Clustered) |
| 8 |
Sales |
title_id |
UPKCL_sales |
Primary Key, Unique Key |
(Clustered) |
| 9 |
Sales |
title_id |
titleidind |
-- |
(Non-Clustered) |
|
|
| |
| How to run the queries |
| |
| It is not a bad idea to put
all the three statements in a Stored Procedure and run at a
single go. Thus DBDoc was born J another good idea is to store
the DBDoc as part of the model database so that all the new
databases created thereafter inherit the Stored Procedure DBDoc. |
| |
| Conclusion |
| |
| Ideally documentation needs
to happen before the development is started and should continue
throughout the life of the project. Well is there anything that
is “ideal” in this world?! Nobody is perfect. In
a majority of the projects, people tend to take documentation
lightly and don’t give the due respect. Better be late
than never. Overlooking documentation can certainly bring down
catastrophe in any project if ignored or not taken seriously. |
| |
| |