Chapter 7 - Protecting Data

Windows NT Server provides several tools for managing disk resources to enhance performance and protect data. These tools include the Disk Administrator and Backup programs, which are in the Administrative Tools (Common) folder, and the uninterruptible power supply (UPS) option in Control Panel. Only Disk Administrator and UPS are discussed in this chapter. For more information about Backup, see Chapter 6, "Backing Up and Restoring Network Files."

The tools discussed in this chapter protect data against the following:

Disk failures, by using fault tolerance, redundancy, and the chkdsk program.

Power outages, by using the UPS option to configure uninterruptible power supplies.

Corrupted or missing system or boot files, by using the Last Known Good Configuration option.

This chapter also includes information about using Disk Administrator to set up and organize your hard disks to function more efficiently. The fault-tolerant options in Disk Administrator enable you to take advantage of Redundant Array of Inexpensive Disks (RAID) data management, mirror sets, stripe sets, and stripe sets with parity.

Disk Administrator Overview

Disk Administrator is a graphical tool for managing disks. Disk Administrator encompasses and extends the functionality of character-based disk management tools, such as MS-DOS Fdisk and the Microsoft LAN Manager Fault Tolerance character applications.

The following list provides an overview of what you can do with Disk Administrator:

Create and delete partitions on a hard disk.

Create and delete logical drives within an extended partition.

Format and label volumes.

Read status information about disks, such as the partition sizes and the amount of free space that is available for creating additional partitions.

Read status information about Windows NT volumes, such as the drive-letter assignment, volume label, file system type, size, and available space.

Make and change drive-letter assignments for hard disk volumes and CD-ROM devices.

Create and delete volume sets.

Extend volumes and volume sets.

Create and delete stripe sets with or without parity.

Regenerate a missing or failed member of a stripe set with parity.

Establish or break disk-mirror sets.

Save and restore disk configuration.

Note You cannot use Disk Administrator to further partition the system or the boot partition because it contains files required to operate Windows NT. Disk Administrator can be used to partition free space on an existing disk or to partition new disks only. For more information, see "Partitioning Disks" later in this chapter.

The Windows NT Server version of Disk Administrator includes the common disk organizational tools (volume sets and stripe sets) and then adds the data-protection (fault tolerance) tools (mirror sets and stripe sets with parity). For more information about fault tolerance, mirror sets, and stripe sets with parity, see "Fault Tolerance" later in this chapter.


Disk and File Terms 

A partition is a portion of a physical disk that functions as though it were a physically separate unit. A partition is usually referred to as either a primary or an extended partition.

A primary partition is a portion of a physical disk that can be marked for use by an operating system. A physical disk can have up to four primary partitions (or up to three, if there is an extended partition). A primary partition cannot be subpartitioned.

An extended partition is created from free space on a hard disk and can be subpartitioned into logical drives. Only one of the four partitions allowed per physical disk can be an extended partition, and no primary partition needs to be present to create an extended partition.

Free space is an unused and unformatted portion of a hard disk that can be partitioned or subpartitioned. Free space within an extended partition is available for the creation of logical drives. Free space within extended partitions on several disks can also be used to create volume sets or other kinds of volumes for fault tolerance purposes. Free space that is not within an extended partition is available for the creation of a partition with a maximum of four partitions allowed.

A volume is a partition or collection of partitions that have been formatted for use by a file system. A Windows NT volume can be assigned a drive letter and used to organize directories and files.

A volume set is a combination of partitions that appear as one logical drive.

A stripe set is a combination of same-sized partitions on different drives across which data is written in stripes. A stripe set is not fault tolerant.

Fault tolerance ensures data integrity when hardware failures occur. In Windows NT, fault tolerance is provided by the FTDISK.SYS driver.

A mirror set is a fully redundant or shadow copy of data.

The system partition contains the hardware-specific files (Ntldr, Osloader.exe, Boot.ini, Ntdetect.com) needed to load Windows NT.

The boot partition contains the Windows NT operating system files, which are located in the %Systemroot% and %Systemroot%\System32 directories.


Managing Disks

If you use only the Windows NT operating system, you can create one partition that occupies your entire disk or as many as four partitions. If you want to use other operating systems on your hard disk (such as UNIX or MS-DOS) with file systems that are not recognized by Windows NT, you must create separate partitions for each non-Microsoft operating system. Notice, though, that MS-DOS and Windows NT can share the same partition when using the file allocation table (FAT) file system.

On an x86-based computer, the operating system starts from the active system partition on the first internal hard disk (that is, Disk 0). Computers using reduced instruction set computing (RISC) processors can have several system partitions that are configurable by the manufacturer's configuration program. Such partitions must be formatted for the file allocation table (FAT) file system. For detailed information about setting up more than one system partition on a RISC-based computer, see your hardware documentation.

Note Disk Administrator cannot be used to partition a disk that contains Windows NT system files. Disk Administrator can be used to partition free space on an existing disk or to partition new disks only.

Partitioning Disks

Disk management under Windows NT is very flexible. You can create up to four partitions in the free space on a physical hard disk, create multiple logical drives in the free space of an extended partition, and delete partitions. You can also add hard disks to your system configuration, recover disk configuration information, and assign specific drive letters to each primary partition or logical drive.
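The partition limits described in this chapter (at most four partitions per physical disk, of which at most one can be extended) can be expressed as a small check. The following Python sketch is illustrative only; the function name `validate_layout` and the string labels are hypothetical and are not part of Windows NT or Disk Administrator.

```python
MAX_PARTITIONS = 4  # limit per physical disk, as stated in this chapter


def validate_layout(partitions):
    """Check a proposed layout for one physical disk.

    partitions is a list of the strings "primary" or "extended".
    Returns a list of problems; an empty list means the layout is allowed.
    """
    problems = []
    unknown = [p for p in partitions if p not in ("primary", "extended")]
    if unknown:
        problems.append("unknown partition type")
    if len(partitions) > MAX_PARTITIONS:
        problems.append("more than four partitions")
    if partitions.count("extended") > 1:
        problems.append("more than one extended partition")
    return problems


# Three primary partitions plus one extended partition is allowed.
print(validate_layout(["primary", "primary", "primary", "extended"]))
# An extended partition does not require a primary partition to exist.
print(validate_layout(["extended"]))
# Two extended partitions on one disk are rejected.
print(validate_layout(["primary", "extended", "extended"]))
```

Logical drives inside the extended partition are deliberately not modeled, because their number is limited only by the size of the partition.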

Note Windows NT cannot recognize free space that was created on a FAT partition using the UNDELETE SENTRY feature in MS-DOS version 6.2. With the SENTRY method, MS-DOS reserves part of the hard disk to store deleted files and then compensates during MS-DOS queries about free space. Because Windows NT does not understand SENTRY, it reports the space on the FAT partition as used.

Each partition can have a different file system, such as the file allocation table (FAT) file system or the Windows NT file system (NTFS). If you want multiple file systems and your existing hard disk has only one partition, you must create more than one partition on the hard disk before installing Windows NT.

The Windows NT Setup program can be used to partition the disk while installing Windows NT. However, if you are installing Windows NT on an existing disk and you are going to partition the disk, you should first back up the data.

Support for creating high performance file system (HPFS) partitions is not available in Windows NT version 4.0. HPFS partitions must be removed before running setup.

Partitioning the internal hard disk on a new computer is done (using the Setup program) during initial setup when you load the Windows NT operating system software. Making changes to that disk or partitioning a new hard disk is done using Disk Administrator.

After you partition a disk, you can either commit the changes immediately or wait until you quit Disk Administrator to save them.

Note Before you install Windows NT on a computer with other operating systems, you should use the fdisk program (or another comparable program) that is included with MS-DOS to determine the number of existing partitions on the disk. If the entire disk has only a single primary partition, you cannot use Disk Administrator to divide that primary partition after Windows NT is installed.

For more information about the fdisk program, see the Command Reference in Help.

Setting Up a New Hard Disk

You can create any of the following in the free space on a hard disk:

A single primary partition

Additional partitions up to the maximum of four

An extended partition with a number of logical drives that is limited only by the size of the partition

Other types of Windows NT volumes, such as volume sets and stripe sets

The following illustration shows examples of different disk-partitioning schemes on x86-based and RISC-based computers and where certain files might be located.

 

(Illustration panels: x86 or RISC-based computers; x86-based computers; RISC-based computers.)

Creating Primary Partitions

When creating primary partitions, the system assigns space to a partition starting from the beginning of the space available. Therefore, in the beginning, there are no gaps between partitions. Gaps occur only when you delete a partition later on. For example, if you delete the second of three partitions and create a new, smaller second partition, that will leave a gap of free space between the second and third partitions.
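The gap behavior described above can be reproduced with a simplified model. The following Python sketch assumes that a new partition is placed at the start of the first free region large enough to hold it; the helper names `allocate` and `free_gaps` and the sizes in megabytes are hypothetical, and the model ignores details such as cylinder alignment.

```python
def allocate(partitions, disk_size, size):
    """Place a new partition at the start of the first free region that fits.

    partitions is a sorted list of (start, length) tuples in MB.
    Returns a new sorted list that includes the new partition.
    """
    cursor = 0
    for start, length in partitions:
        if start - cursor >= size:
            break  # the free region in front of this partition is big enough
        cursor = start + length
    else:
        # No gap between partitions fits, so try the space after the last one.
        if disk_size - cursor < size:
            raise ValueError("not enough contiguous free space")
    return sorted(partitions + [(cursor, size)])


def free_gaps(partitions, disk_size):
    """Return the free regions as (start, length) tuples."""
    gaps = []
    cursor = 0
    for start, length in partitions:
        if start > cursor:
            gaps.append((cursor, start - cursor))
        cursor = start + length
    if cursor < disk_size:
        gaps.append((cursor, disk_size - cursor))
    return gaps


DISK_MB = 1000
layout = []
for size in (200, 300, 200):            # three partitions, created in order
    layout = allocate(layout, DISK_MB, size)
gaps_before = free_gaps(layout, DISK_MB)  # only the space at the end is free

layout.remove((200, 300))               # delete the second partition
layout = allocate(layout, DISK_MB, 100)  # create a smaller second partition
gaps_after = free_gaps(layout, DISK_MB)  # a gap now precedes the third partition
print(gaps_before, gaps_after)
```

In this model the smaller replacement partition starts where the deleted one started, so the leftover 200 MB sits between the second and third partitions.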

For information about how to create a primary partition, see "Creating Primary Partitions" in Disk Administrator Help.

Creating an Extended Partition

One of the four partitions that you can create under Windows NT, if disk space allows, is an extended partition. You can use the free space in the extended partition to create multiple logical drives or use all or part of it when creating volume sets or other kinds of volumes for fault-tolerance purposes.

For information about how to create extended partitions, see "Creating an Extended Partition," and "Creating Logical Drives in an Extended Partition" in Disk Administrator Help.

Formatting and Labeling Partitions

Before you can store files and directories on the partitions that you have created, you must first commit the changes to disk and then format each partition individually to use with the file system you want to work with. You can also assign descriptive volume labels at this time.

To format and label volumes, you can use the Format and Set Volume Label commands from the Disk Administrator Tools menu, or you can use the Format and Label commands at the command prompt. For information about the Format and Label commands, see the Command Reference in Help.

For information about how to format and label volumes using Disk Administrator, see "Formatting and Labeling Partitions" in Disk Administrator Help.

Marking Partitions as Active

The names commonly used for the partitions containing the startup and operating system files are the system and boot partitions, respectively.

The system partition for Windows NT is the volume that contains the hardware-specific files needed to load Windows NT. On x86-based computers, it must be a primary partition that has been marked as active for boot purposes and must be located on the disk that the computer accesses when starting the system. There can be only one active system partition at a time. If you want to use another operating system, you must first mark its system partition as active before restarting the computer.

For information about how to mark a partition as active, see "Marking Partitions as Active" in Disk Administrator Help.

Partitions on a RISC-based computer are not marked active. Instead, they are configured by a hardware configuration program supplied by the manufacturer. On RISC-based computers, the system partition must be formatted for the FAT file system. On either type of computer, the system partition can never be part of a stripe set or volume set.

The boot partition for Windows NT is the volume, formatted for either the NTFS or FAT file system, that contains the Windows NT operating system and its support files. The boot partition can be (but does not have to be) the same as the system partition. The boot partition also cannot be part of a stripe set or volume set.

Securing System Partitions

Because the system partition on a RISC-based computer must be formatted for the FAT file system, there is no way to secure information in individual directories and files on that partition unless the manufacturer included hardware protection (for information, see the vendor's documentation). Therefore, the only way to secure the system partition is to allow access only to members of the Administrators group. When securing the partition, it is best to keep the system and boot files on separate partitions.

For information about how to secure system partitions, see "Securing System Partitions" in Disk Administrator Help.

Assigning Drive Letters

You can create more than 24 volumes with Windows NT, but you cannot assign more than 24 drive letters for accessing these volumes. Drive letters A and B are reserved for floppy disk drives. However, if you do not have a B floppy disk drive, you can use the letter B for a network drive.

Windows NT enables the static assignment of drive letters. This means that a drive letter can be permanently assigned to a specific hard disk and partition/volume. When a new hard disk is added to an existing computer system, it does not affect statically assigned drive letters.

In addition to supporting the static assignment of drive letters on volumes and partitions, Disk Administrator also supports assigning a permanent drive letter to CD-ROM drives.

However, this static assignment of drive letters occurs only after Disk Administrator has been used on the computer. Until then, drive letters are assigned by Windows NT in a manner similar to that used by MS-DOS, which adheres to the following rule: first, the primary partitions on each hard disk get letters assigned starting with the letter C. Windows NT then continues assigning the next available drive letter to each of the logical drives in alphabetical order on each hard disk and then to the other primary partitions on each hard disk. The active system partition is typically the C drive.
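The default ordering can be sketched in Python. This is one reading of the rule above, not the actual Windows NT implementation: it assumes that the first pass assigns a letter to the first primary partition on each disk, the second pass covers logical drives disk by disk, and the third pass covers the remaining primary partitions. The function name and the volume names are hypothetical.

```python
from string import ascii_uppercase


def default_drive_letters(disks):
    """Assign default letters to volumes that have no static assignment.

    disks is a list (Disk 0 first) of dictionaries with two lists of
    volume names: "primaries" and "logicals".
    Returns a dictionary that maps each volume name to a letter, or to
    None when the 24 available letters (C through Z) are used up.
    """
    letters = iter(ascii_uppercase[2:])  # A and B are reserved for floppy drives
    assignment = {}
    # Pass 1: the first primary partition on each hard disk.
    for disk in disks:
        if disk["primaries"]:
            assignment[disk["primaries"][0]] = next(letters, None)
    # Pass 2: the logical drives on each hard disk, in order.
    for disk in disks:
        for name in disk["logicals"]:
            assignment[name] = next(letters, None)
    # Pass 3: the other primary partitions on each hard disk.
    for disk in disks:
        for name in disk["primaries"][1:]:
            assignment[name] = next(letters, None)
    return assignment


example = [
    {"primaries": ["disk0-primary1", "disk0-primary2"], "logicals": ["disk0-logical1"]},
    {"primaries": ["disk1-primary1"], "logicals": ["disk1-logical1", "disk1-logical2"]},
]
letters = default_drive_letters(example)
for volume, letter in letters.items():
    print(letter, volume)
```

Because the second primary partition on Disk 0 is handled last, it receives a later letter than every logical drive, which is why adding a disk can shift letters that have not been statically assigned.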

The following is an example with three hard disks.

 

The following is an example with only one hard disk.

 

Note You should be careful when making drive-letter assignments because many MS-DOS and Windows programs make references to a specific drive letter. For example, the Path environment variable shows specific drive letters with program names.

For information about how to assign drive letters, see "Assigning Drive Letters" and "Assigning CD-ROM Drive Letters" in Disk Administrator Help.

Deleting Partitions, Volumes, or Logical Drives

Before deleting partitions, volumes, or logical drives under Windows NT, you need to ensure that the information on them has been backed up onto another storage medium and verified or is no longer needed.

Windows NT places certain restrictions on your ability to delete. It will not let you delete the volume with the system files (the system partition). Nor can you delete individual partitions that are part of a set without deleting the entire set. However, on a RISC-based computer, you can delete the system partition with the files needed to load Windows NT, so be very careful. Windows NT also requires that all the logical drives or other volumes in an extended partition be deleted before you can delete the extended partition.

Once you delete partitions, volumes, or logical drives, you must first commit the changes before anything else can be done to the partitions, volumes, or logical drives.

For information about how to delete partitions, volumes, or logical drives, see "Deleting Partitions, Volumes, or Logical Drives" in Disk Administrator Help.

Committing Changes

After you have made significant changes to your disk partitions, Disk Administrator displays a message to remind you about the irreversibility of certain changes, such as deleting a partition, and to ask whether you want to save those changes. Disk Administrator performs updates to the disks only after you agree to saving those changes and quit the program or if you commit the changes before quitting Disk Administrator.

Sometimes, after you have committed your changes, another message advises you that changes have been made that require you to restart the computer. (This happens under some circumstances, such as if you extended a volume set, locked a volume, or searched for or restored disk configuration information.) Disk Administrator initiates a complete system shutdown, closes all open applications, and restarts the computer.

Adding Hard Disks

The maximum number of hard disks that you can add to a computer depends on your hardware configuration, such as how many SCSI adapters you have attached. After adding additional hard disks to your computer, restart your computer, and then start Disk Administrator. Before the Disk Administrator window opens, a message advises you that Disk Administrator has noticed a change and will update the system configuration. However, drive letters are not changed by the system when you add new hard disks if they have already been statically assigned.

Saving and Restoring Disk Configuration Information

Disk Administrator provides options for saving and restoring the following currently defined disk configuration information: assigned drive letters, volume sets, stripe sets, stripe sets with parity, and mirror sets. You should be sure to save the disk configuration information before upgrading the operating system to a newer version to ensure that you do not lose your current configuration information.

You can also search for disk configuration information among different installed versions of Windows NT and select a specific version to replace another. However, you should be careful to update this version's information every time you make a change to your disk's configuration. Make your changes first, quit Disk Administrator, restart your computer and Disk Administrator, and then save the configuration information and quit Disk Administrator.

For information about how to save, restore, and search for disk configuration information, see "Saving Disk Configuration Information," "Restoring Disk Configuration Information," and "Searching for Disk Configuration Information" in Disk Administrator Help.

Changing the File System on a Partition

If you want to change the file system on an existing partition, you should back up the information on the partition.

If Windows NT is not installed on the partition, you can use the Format command from the Disk Administrator Tools menu to reformat that partition to another file system. Alternatively, you can use the format program at the command prompt. Either way, reformatting the partition destroys all existing data.

If you want to change the file system on an existing FAT partition to the NTFS format, you can use the convert program at the command prompt. Using the Convert program does not overwrite data on the disk. The Convert program cannot be used to convert an NTFS partition to FAT.

For more information about Format and Convert, see the Command Reference in Help.

If Windows NT is installed on the partition, you cannot delete it from within Disk Administrator nor reformat the partition using Format. Instead, you must use the Setup program. For more information about using the Setup program, see "Formatting and labeling partitions" and Windows NT Server Start Here.

For information about how to reformat partitions, see "Reformatting Existing Partitions" in Disk Administrator Help.

Creating and Deleting Volume Sets

Volume sets are a mechanism for more effectively using the total available free space on several disks. Volume sets are created, as shown in the following illustration, by combining various-sized areas of free space from 1 to 32 disks into one large logical volume set that is recognized as a single partition. 

Note Operating systems, such as MS-DOS, that do not have volume-set functionality cannot recognize any volume sets that are created by Windows NT. Therefore, if you create a volume set on a dual-boot computer, those partitions become unusable by MS-DOS.

The areas of free space used to create volume sets can be different sizes, as shown in the following illustration. Volume sets are organized in such a way that the space on one disk gets filled up and then, starting at the beginning of the next disk, all that space gets filled up. The process continues in the same way on each subsequent disk up to a maximum of 32 disks.

 

Deleting smaller partitions and combining them into one volume set frees drive letters for other uses, enables the creation of a large volume for file system use, and can improve system performance by better balancing data input and output (I/O) across the drives. However, volume sets do not have fault tolerance.
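The fill order of a volume set described earlier in this section (one area is filled completely before the next one is used) amounts to treating the areas as one concatenated address range. The following Python sketch shows how an offset within the volume set maps to a member area; the function name `locate`, the disk names, and the sizes are hypothetical.

```python
def locate(areas, offset):
    """Map an offset within a volume set to (disk name, offset in that area).

    areas is a list of (disk_name, size) tuples in the order in which the
    free-space areas were combined. Sizes and offsets use the same unit.
    """
    if offset < 0:
        raise ValueError("offset must not be negative")
    for disk_name, size in areas:
        if offset < size:
            return disk_name, offset
        offset -= size  # skip this area and continue on the next disk
    raise ValueError("offset is beyond the end of the volume set")


# Three areas of different sizes form one 400 MB volume set.
areas = [("Disk 0", 100), ("Disk 1", 250), ("Disk 2", 50)]
print(locate(areas, 99))   # last megabyte of the first area
print(locate(areas, 100))  # first megabyte of the second area
print(locate(areas, 350))  # first megabyte of the third area
```

Because consecutive offsets stay on one disk until that area is full, a single large file usually involves one disk at a time, in contrast to the striping described later in this chapter.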

Before making any changes to a volume set, first back up all the information on it. Only then delete the volume set, because deleting the set also deletes all the information it contains.

Existing NTFS volumes and volume sets can be extended by adding free space. Disk Administrator forces the system to restart after you quit and save your changes and then formats the new area without affecting any existing files on the original volume or volume set.

For information about how to create, delete, and extend volume sets, see "Creating a Volume Set," "Deleting a Volume Set," and "Extending Volumes and Volume Sets" in Disk Administrator Help.

Creating and Deleting Stripe Sets

Stripe sets are created similarly to volume sets, but with more restrictions. Each member partition of the stripe set must be on a different disk up to a limit of 32 disks. Also, Disk Administrator will make all the partitions the same size. 

Note Operating systems, such as MS-DOS, that do not have stripe-set functionality cannot recognize any stripe sets that are created by Windows NT. Therefore, if you create a stripe set on a dual-boot computer, those partitions become unusable by MS-DOS.

Stripe sets are created by combining areas of free space from 2 to 32 disks into one large logical volume. The partitions in stripe sets are all approximately the same size so that the data can be written in stripes across each partition. This enables I/O commands to be issued concurrently and increases throughput.

The following illustration shows a set of six hard disks and how the stripes are distributed across them.

 

When you no longer want a stripe set, or when you have a problem with a faulty disk drive, first back up all the information on the stripe set. Only then delete the stripe set, because deleting the set also deletes all the information it contains. Stripe sets without parity do not provide fault tolerance.

For information about how to create and delete stripe sets, see "Creating a Stripe Set" and "Deleting a Stripe Set" in Disk Administrator Help.


Fault Tolerance

Fault tolerance is the ability of a system to continue functioning when part of the system fails. Normally, the expression fault tolerance is used to describe disk subsystems, but it can also apply to other parts of the system or the entire system. Fully fault-tolerant systems use redundant disk controllers and power supplies as well as fault-tolerant disk subsystems. You can also use uninterruptible power supplies (UPSs) to safeguard against local power failure. For more information about UPS, see "Managing Uninterruptible Power Supplies" later in this chapter.

Although the data is always available and current in a fault-tolerant system, you still need to make tape backups to protect the information on your disk subsystem against destructive events such as fire, earthquakes, tornadoes, floods, and user errors. Disk fault tolerance is not an alternative to a backup strategy with offsite storage. For more information about backing up to tape, see Chapter 6 "Backing Up and Restoring Network Files."

Fault tolerance is designed to combat problems with disk failures, power outages, or corrupted operating systems which can include boot files, the operating system itself, or system files.

Fault-tolerant disk systems are standardized and categorized in six levels known as Redundant Arrays of Inexpensive Disks (RAID) level 0 through level 5. Each level offers a different mix of performance, reliability, and cost. Disk Administrator includes RAID levels 0, 1, and 5. Only levels 1 and 5 provide fault tolerance.

RAID strategies can be implemented using hardware or software solutions. In a hardware solution, the controller interface handles the creation and regeneration of redundant information. In Windows NT Server, this activity can be performed in the software. A hardware implementation of a RAID strategy can offer performance advantages over the software implementation included in Windows NT Server.

Understanding RAID

Disk arrays consist of multiple disk drives coordinated by a controller. Individual data files are typically written to more than one disk in a manner that, depending on the RAID level used, can improve performance and/or reliability.

However, after a disk in a fault-tolerant array fails, there is no fault tolerance until the fault is repaired. Few RAID implementations can withstand two simultaneous failures. When the failed disk is replaced, the data can be regenerated using the redundant information. Data regeneration occurs without bringing in backup tapes or performing manual update operations to cover transactions that took place since the last backup. When data regeneration is complete, all data is current and again protected against disk failure. The ability to provide cost-effective high data availability is the key advantage of disk arrays.

Level 0: Stripe Sets

Stripe sets are created by combining areas of free space from 2 to 32 disks into one large logical volume. Data is divided into blocks and spread in a fixed order among all the disks in the array.

Level 0 stripe sets do not provide any fault tolerance.

 

RAID Level 0 

Stripe sets in Windows NT write data to multiple partitions, as is done with volume sets. However, striping writes files across all disks so that data is added to all disks in the set at the same rate.
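The fixed order in which blocks are spread across a stripe set can be illustrated with a round-robin mapping. The following Python sketch is a simplified model: the function name `stripe_location` is hypothetical, and a "block" here stands for one stripe unit of unspecified size.

```python
def stripe_location(block, num_disks):
    """Return (disk index, row) for a logical block in a stripe set without parity.

    Blocks are dealt out to the disks in a fixed round-robin order, so
    every disk in the set fills at the same rate.
    """
    if not 2 <= num_disks <= 32:
        raise ValueError("a stripe set uses from 2 to 32 disks")
    if block < 0:
        raise ValueError("block must not be negative")
    return block % num_disks, block // num_disks


# Six consecutive blocks on a three-disk stripe set.
placement = [stripe_location(block, 3) for block in range(6)]
print(placement)
```

Consecutive blocks land on different disks, which is what allows I/O commands to be issued concurrently. It is also why the loss of any one member makes the whole set unreadable: every file larger than a block has pieces on every disk.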

Stripe sets offer the best performance of all the Windows NT Server disk management strategies, including volume sets. However, like volume sets, they do not provide fault tolerance. If any partition in the set fails, all data is lost.

Level 1: Mirror Sets

Mirror sets provide an identical twin for a selected disk; all data written to the primary disk is also written to the shadow or mirror disk. This results in disk space utilization of only 50 percent. If one disk fails, the system uses data from the other disk. For more information about dealing with boot failures, see "Fixing a System or Boot Failure" later in this chapter.

Mirror sets protect a partition on a disk from media and, possibly, controller failure by maintaining a fully redundant copy on another disk. When a mirrored partition fails, you must break the mirror set to expose the remaining partition as a separate volume with its own drive letter. That volume then becomes the main partition, and you can create a new mirror-set relationship with unused free space of the same size or greater on another disk.

Mirror sets are created by duplicating a partition using free space on another disk. If the second partition is larger, the remaining space becomes free space. The same drive letter is used for both partitions. Any existing partition, even the system and boot partitions, can be mirrored onto another partition of the same size, or greater, on another disk using either the same or a different controller. When creating mirror sets, it is best to use disks that are the same size, model, and manufacturer.

 

RAID Level 1 

Mirror sets have better overall read and write performance than level 5, stripe sets with parity. Another advantage of mirror sets over stripe sets with parity is that there is no loss in performance when a member of a mirror set fails. Mirror sets are more expensive in terms of dollars per megabyte because their disk space utilization is lower. But their entry cost is lower because they require only two disks, whereas stripe sets with parity require three or more disks.

The following illustration shows examples of mirror sets using the same and different controllers.

 

Mirror sets reduce the chance of an unrecoverable error by providing a duplicate set of data, which doubles the number of disks required and the input/output (I/O) operations when writing to the disk. However, some performance gains are achieved for reading data because of I/O load balancing of requests between the two partitions.
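The I/O pattern described above can be modeled in a few lines of Python. This is a toy model, not the FTDISK.SYS implementation: the class name `MirrorSet` is hypothetical, each partition is represented by a dictionary, and reads simply alternate between the two members, which is an assumption about how requests might be balanced.

```python
class MirrorSet:
    """Toy model of a mirror set with two member partitions."""

    def __init__(self):
        self.members = [{}, {}]  # each member maps block number to data
        self.writes = [0, 0]     # write operations issued to each member
        self.reads = [0, 0]      # read operations issued to each member
        self._next_reader = 0

    def write(self, block, data):
        # Every write goes to both members, which doubles the write I/O.
        for index, member in enumerate(self.members):
            member[block] = data
            self.writes[index] += 1

    def read(self, block):
        # Either member can satisfy a read; alternate between them.
        index = self._next_reader
        self._next_reader = 1 - index
        self.reads[index] += 1
        return self.members[index][block]


mirror = MirrorSet()
mirror.write(0, b"first")
mirror.write(1, b"second")
results = [mirror.read(block) for block in (0, 1, 0, 1)]
print(mirror.writes, mirror.reads)
```

Two logical writes produce four physical writes, while four logical reads are split evenly between the members.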

When you want to use the space in a mirror set for other purposes, you must first break the mirror set and then delete the partition. Breaking the mirror set does not delete the information, but it is still safer to do a backup first. You will then be ready to delete one of the partitions that made up the mirror set to regain free space.

In the case of an unrecoverable error on a partition within a mirror set, you need to break the mirror-set relationship to expose the remaining partition as an individual partition or logical drive. You can then reassign some free space on another disk to create a new mirror set. For more information about breaking mirror sets, see the Windows NT Server Resource Kit version 4.0.

For information about how to establish, break, or delete a mirror set, see "Establishing a Mirror Set" and "Breaking a Mirror Set," in Disk Administrator Help.

Level 5: Stripe Sets with Parity

Level 5 is commonly known as striping with parity. The data is striped in large blocks across all the disks in the array. Level 5 differs from RAID levels that keep parity on a dedicated disk because it writes the parity across all the disks. The data redundancy is provided by the parity information. The data and parity information are arranged on the disk array so that the two are always on different disks.

 

RAID Level 5 Configuration 

Stripe sets with parity have better read performance than mirror sets. However, when a member is missing, such as when a disk has failed, the read performance is degraded by the need to recover the data with the parity information.

Nevertheless, this strategy is recommended over mirror sets for applications that require redundancy and are primarily read-oriented. Write performance is reduced by the parity calculation. Because of that calculation, a write operation also requires three times as much memory as a read operation during normal operation, and when a partition fails, a read requires at least three times the memory that it would normally use.


Understanding Windows NT Parity Usage 

The data redundancy method used in Windows NT Server for striping with parity is a function of the Boolean operation called exclusive OR, also referred to as XOR. The important concept to remember about parity is that regeneration uses the parity information with the data on the good disks to re-create the data on the failed disk. The Windows NT Server stripe-sets-with-parity form of fault tolerance maintains an XOR of the total data. This enables the reconstruction of missing data (on a failed disk or sector) from the remaining disks in the stripe set with parity.
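The XOR relationship can be demonstrated with a short Python sketch. It assumes a minimal set of three disks (two data strips and one parity strip in a row); the helper name `xor_blocks` and the sample data are hypothetical, and the strips are only four bytes long to keep the example readable.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-by-byte XOR of several equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One row of a three-disk stripe set with parity: two data strips, one parity strip.
data = [b"ABCD", b"EFGH"]
parity = xor_blocks(data)

# Suppose the disk that holds the second data strip fails. XOR of the
# surviving data strip with the parity strip re-creates the lost strip.
regenerated = xor_blocks([data[0], parity])
print(regenerated == data[1])
```

The same function serves both purposes because XOR is its own inverse: combining all the surviving strips in a row, data and parity alike, always yields the one that is missing.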

Note Using stripe sets with parity requires more system memory than using mirror sets. The recommended minimum RAM is 16 MB or greater.

Stripe sets with parity include one parity strip per row. Therefore, you must use at least three, rather than two, disks to allow for the parity information. Parity strips, as shown in the following illustration, are distributed across all the partitions to balance the I/O load. The protection provided here is as complete as with disk mirroring. This technique provides data redundancy at a cost of only one additional disk for the set. Recovery from the failure of a disk in a parity stripe set is more time consuming, though, than for mirror sets.

The following illustration shows a set of six hard disks and how the parity stripes are distributed across the partitions.

 

If you want to reclaim the space in a stripe set with parity for other purposes, first back up any information that you want to keep, and then delete the stripe set with parity.

For information about how to create and delete stripe sets with parity, see "Creating a Stripe Set with Parity" and "Deleting a Stripe Set with Parity" in Disk Administrator Help.

Windows NT RAID Strategy Summary

In Windows NT Server, stripe sets provide the best performance but provide no fault tolerance (that is, data redundancy).

When compared to stripe sets with parity, a mirror-set implementation has a lower entry cost, requires less system memory, provides the best overall performance, and does not show performance degradation during a failure. However, its cost-per-megabyte is higher than that for stripe sets with parity.

A stripe-set-with-parity implementation has better read performance and a lower cost-per-megabyte, but it requires more system memory and loses its performance advantage while a member is missing.
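The cost-per-megabyte difference follows from how much raw disk space each method gives up to redundancy. The following Python sketch works through that arithmetic; the function names and the 500 MB member size are hypothetical, and the sketch assumes that all members are the same size.

```python
def mirror_usable_mb(member_mb):
    """A mirror set keeps two copies, so only one member's size is usable."""
    return member_mb

def parity_stripe_usable_mb(member_mb, members):
    """A stripe set with parity gives up one member's worth of space to parity."""
    if members < 3:
        raise ValueError("a stripe set with parity needs at least three disks")
    return member_mb * (members - 1)

def overhead_fraction(raw_mb, usable_mb):
    """Fraction of the raw disk space that is consumed by redundancy."""
    return (raw_mb - usable_mb) / raw_mb

# Hypothetical 500 MB members: a two-disk mirror set versus a five-disk
# stripe set with parity.
mirror_overhead = overhead_fraction(2 * 500, mirror_usable_mb(500))
parity_overhead = overhead_fraction(5 * 500, parity_stripe_usable_mb(500, 5))
print(mirror_overhead, parity_overhead)
```

Under these assumptions the mirror set spends half of its raw space on redundancy, while the five-disk stripe set with parity spends one fifth, and that share shrinks as members are added.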

Stripe sets with parity are a good solution for data redundancy in a computing environment in which most activity consists of reading data. For example, if your network has a server on which you maintain all copies of the programs used by the people at that site, this might be a good case for using a stripe set with parity. This would enable you to protect the programs against the loss of a single disk in the stripe set. In addition, the read performance would improve due to concurrency of the reads across the disks making up the stripe set with parity.

In an environment in which frequent updates to the information occur, it can be better to use mirror sets. However, you can use a stripe set with parity if you want redundancy and if the storage overhead cost of a mirror is prohibitive.

Notice that operating systems, such as MS-DOS, that do not have fault-tolerance functionality cannot recognize the partitions that Windows NT Server creates for fault tolerance. Therefore, if you mirror your MS-DOS system partition on a dual-boot computer, MS-DOS cannot use or start either partition. Also, as a precaution, you should be sure to create a recovery disk so that you can start your computer from the mirrored partition if the system partition is lost. For information on creating and using such a recovery disk, see "Creating a Recovery Disk on x86-based Systems" later in this chapter.


Managing Uninterruptible Power Supplies

An uninterruptible power supply (UPS) provides power when the local power fails. It is usually rated to provide a specific amount of power for a specific period of time. This power comes from batteries that are kept charged while main power is available. The main power is converted from an AC voltage to a DC voltage used to charge the battery. When needed, the DC power is converted to an AC voltage compatible with the computer power supply. Usually, all that is needed from a UPS is time to shut down the system in an orderly fashion by quitting processes and closing sessions.

To minimize downtime from power failures and provide some advance warning before total power loss, Windows NT provides the UPS option in Control Panel for managing an uninterruptible power supply.

Before purchasing a UPS device to use with Windows NT, confirm with the UPS manufacturer that both the device and its serial cable are compatible with Windows NT.

Understanding UPS Types

Uninterruptible power supplies fall into two categories: online and standby.

You connect an online UPS between the main power and the computer to constantly supply your computer system with power. Connecting it to the main power keeps its battery charged. This method provides power conditioning, which means that it removes spikes, surges, sags, and noise.

 

Configuration of an Online UPS 

A standby UPS is configured to provide either the main power or its own power source and to switch from one to the other as necessary. When the main power is available, the UPS device connects the main power directly to the computer and monitors the main power voltage level. The UPS power supply is kept in standby mode (that is, ready to provide power but using very little), and the battery is kept charged. When the main power fails or the voltage falls below an acceptable level, the UPS device switches the power fed to the computer from the main power to its own power. This should occur so quickly that the computer power supply can provide uninterrupted service. A standby UPS can also provide power conditioning during regular service if it is built into the main power path, but it is not a function of the conversion process of the UPS power supply.

 

Configuration of a Standby UPS 

Hybrid versions of these two types can also exist. Check the reliability and failure-handling mechanism of the UPS device before buying or installing it.

Understanding How a UPS Interacts with Operating Systems

Many UPS devices can interface with operating systems, enabling the operating system to notify users automatically of the pending shutdown process or provide notification that the power has been restored and a shutdown is no longer necessary.

During a power failure, the UPS service immediately pauses the Server service to prevent any new connections and sends a message to notify users of the power failure. The UPS service then waits a specified interval of time before notifying users to quit their sessions. If power is restored during the interval, another message is sent to inform users that power has been restored and normal operations have resumed.

Setting Up the Uninterruptible Power Supply

You can use the UPS option in Control Panel to set the following options:

The serial port where the UPS device is connected.

Whether the UPS device sends a signal if the regular power supply fails.

Whether the UPS device sends a warning when battery power is low.

Whether the UPS service sends a signal telling the UPS device to shut off.

A command file to execute at shutdown time.

The expected life and recharge time for the battery.

The timing for warning messages.

The actual options for configuring the UPS service depend on the specific UPS hardware installed on your system. Incorrect settings can cause undesirable operation of your UPS hardware. For details about possible settings, see the documentation for your UPS device.

For information about how to set up a UPS, see "Configuring the Uninterruptible Power Supply (UPS)" in Help.

After configuring UPS options, be sure to test that your computer is protected from power failure. For information about how to test the configuration of your UPS, see "Testing your UPS Configuration" in Help.

Understanding How the Windows NT UPS Service Works

The UPS option in Control Panel enables the UPS service to communicate with a UPS device through a serial port with the following signals.

Signal          Pin                          Asserted by
Power Failure   CTS (Clear To Send)          UPS hardware
Low Battery     DCD (Data Carrier Detect)    UPS hardware
UPS Shutdown    DTR (Data Terminal Ready)    Windows NT UPS service

The assertion of each of these pins can be either positive or negative, depending on the UPS device's implementation. Use the UPS dialog box to specify the polarity used by the UPS hardware. For configuration details, see the vendor's manual.

To support contact-closure type UPS devices, the UPS service always does the following:

TXD (Transmit Data) pin is set permanently low.

RTS (Request To Send) pin is set permanently high.

When the UPS service is started, it verifies the settings in the UPS dialog box by assuming that the system is not starting during a power failure and by ensuring that the signal polarity on the CTS and DCD pins is opposite to that specified as the failure condition in the UPS dialog box. For example, if the UPS dialog box specifies that the UPS device supports a Power Failure Signal (CTS pin) with a positive signal, the UPS service checks to make sure that this pin is not already asserted positive (which would not happen unless you had started the system during a power failure).
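The startup check described above can be sketched in Python as follows. This is an illustration of the logic only, not the UPS service's code: the function names are hypothetical, pin levels are modeled as simple booleans, and the sketch assumes that both the Power Failure and Low Battery signals are enabled in the UPS dialog box.

```python
def failure_signal_asserted(pin_is_high, failure_polarity):
    """True if the pin currently shows the configured failure condition.

    failure_polarity is "positive" or "negative", as chosen in the UPS dialog box.
    """
    if failure_polarity not in ("positive", "negative"):
        raise ValueError("polarity must be 'positive' or 'negative'")
    return pin_is_high == (failure_polarity == "positive")

def startup_settings_plausible(cts_high, cts_polarity, dcd_high, dcd_polarity):
    """Mimic the startup check: neither Power Failure (CTS) nor Low Battery (DCD)
    should already read as asserted when the service starts."""
    return not (failure_signal_asserted(cts_high, cts_polarity)
                or failure_signal_asserted(dcd_high, dcd_polarity))

# A CTS pin that idles low with "positive" configured as the failure polarity
# passes the check; the same pin with "negative" configured does not.
print(startup_settings_plausible(False, "positive", False, "positive"))
print(startup_settings_plausible(False, "negative", False, "positive"))
```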

This has some important implications. With an online UPS, the UPS device can shut itself off immediately if the configuration is incorrect. With a standby UPS, an incorrect configuration typically shuts the UPS device off as soon as a power failure is detected, effectively circumventing the purpose of the UPS. This is why it is important to configure and test your UPS device to ensure that it operates correctly.

When the UPS service starts, it waits until the CTS pin is asserted by the UPS. If you have indicated in the UPS dialog box that your UPS device does not support a Low Battery signal, the UPS service uses the parameters specified in the UPS Characteristics box of the UPS dialog box to estimate the charge level of your battery in terms of minutes. Each time you start the UPS service, the charge level of your battery is reset to 0 minutes. As time elapses, the Battery Recharge parameter is used to estimate the battery life to a maximum time specified by the Expected Battery Life parameter. The UPS service requires at least two minutes to perform a graceful shutdown. Therefore, if your battery does not have more than two minutes of life remaining, a shutdown is performed immediately. Since it is important that the parameters in the UPS Characteristics box be set accurately, it is best to use worst-case estimates.
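The battery estimate can be worked through with a short Python sketch. It assumes that the Battery Recharge parameter is expressed as minutes of charging needed for each minute of run time; that unit, the function names, and the sample values are assumptions made for illustration, so check the UPS dialog box and your device documentation for the actual meaning of each field.

```python
MIN_SHUTDOWN_MINUTES = 2  # the service needs at least two minutes to shut down gracefully

def estimated_battery_minutes(minutes_since_start, recharge_minutes_per_run_minute,
                              expected_battery_life):
    """Estimate run time, starting from 0 when the service starts and growing
    with elapsed time up to the Expected Battery Life ceiling."""
    if recharge_minutes_per_run_minute <= 0:
        raise ValueError("recharge time must be positive")
    earned = minutes_since_start / recharge_minutes_per_run_minute
    return min(earned, expected_battery_life)

def must_shut_down_immediately(estimated_minutes):
    """True if the estimate does not exceed the two-minute shutdown window."""
    return estimated_minutes <= MIN_SHUTDOWN_MINUTES

# Hypothetical values: 100 minutes of recharge per minute of run time,
# 5 minutes of expected battery life, power fails 150 minutes after startup.
estimate = estimated_battery_minutes(150, 100, 5)
print(estimate, must_shut_down_immediately(estimate))
```

Because the estimate restarts at 0 each time the service starts, a power failure soon after startup leads to an immediate shutdown under these assumptions, which is one reason to use worst-case values.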

When a power failure occurs, the UPS service uses the parameters in the UPS Service box of the UPS dialog box to decide how to respond. The first parameter is the time between the power failure and the initial warning message. For noisy power (that is, power that fluctuates regularly), set this parameter to a few seconds so that brief fluctuations do not cause messages to be broadcast.

The UPS service continues to send warning messages at the interval specified in the second parameter of the UPS Service box, which is the delay between warning messages. Set the second parameter very low if you want to ensure that users are aware of the power failure, or set it high if it is not important to let users know about the power failure. When the UPS battery is low, the service initiates a shutdown and then turns off the UPS device (if this feature is supported).
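The effect of the two timing parameters can be seen in a small Python sketch. The function name and the sample values are hypothetical, and the sketch assumes that a message falling exactly at the end of the outage is still sent.

```python
def warning_times(first_delay_s, repeat_interval_s, outage_length_s):
    """Seconds after the power failure at which warning messages would be sent,
    for an outage that ends (or reaches shutdown) at outage_length_s."""
    if repeat_interval_s <= 0:
        raise ValueError("the delay between warning messages must be positive")
    times = []
    t = first_delay_s
    while t <= outage_length_s:
        times.append(t)
        t += repeat_interval_s
    return times

# A 3-second flicker with a 5-second initial delay produces no messages,
# while a 5-minute outage produces a message roughly every 2 minutes.
print(warning_times(5, 120, 3))
print(warning_times(5, 120, 300))
```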

Using a UPS Device with Windows NT

You should use the following Windows NT services in combination with the UPS device that you select for your computer:

UPS

Alerter

Messenger

Event Log

The following are some basic points to be aware of to ensure that your UPS is installed correctly and to protect your computer from the hazards of a power failure.

A UPS device supplies power to your computer and peripherals (for example, the monitor and printer) when the main power supply is interrupted or fails completely. Some UPS devices can supply power for only a few minutes, while others can supply power for many hours. In any case, you should configure UPS properly to work with Windows NT so that the UPS can track power fluctuations and take appropriate actions. For example, if a power failure is prolonged, the UPS might not be able to supply power for the entire duration of the failure. In this case, Windows NT warns users of the power failure. When the UPS reaches a critical state, the operating system shuts down, and the UPS device is turned off. Therefore, you should ensure that your UPS device guarantees at least two minutes to enable the operating system to perform a graceful shutdown.

To select a UPS device that works with Windows NT, see the Windows NT Server Hardware Compatibility List. Making the device work with Windows NT usually also means ordering the correct serial cable from the UPS manufacturer. This cable is designed to follow the UPS interfacing specification for Windows NT. If you are upgrading your computer to Windows NT and already have a UPS, check with the manufacturer to ensure that the existing cable works with Windows NT. Unexpected results can occur if the wrong cable is used.

UPS manufacturers might offer their own software, purchased separately, that takes advantage of the unique features of their UPS device. In this case, you should not use the UPS service that comes with Windows NT. Follow the instructions that come with the UPS manufacturer's software.

The UPS service is configured using the UPS option in Control Panel. You should base the configuration on the features supported by the UPS device that you are using. The three features that Windows NT supports are:

Main-power failure detection.

Low battery detection.

UPS shutdown capability.

To determine the correct settings, read the user's manual for your UPS device carefully, or contact the manufacturer. Based on the features supported by the UPS, you might have to enter additional parameters in the UPS Characteristics box of the UPS dialog box.

The UPS service can be controlled in several ways. One way is to configure the settings in the UPS dialog box and click OK. A message appears and asks if you want to start the UPS service. Another way to start the service is by using the Services option in Control Panel.

The Alerter and Messenger services are started automatically when Windows NT starts. The Alerter service sends alerts to selected users, and the Messenger service sends messages to your local Windows NT computer and to other users on the network. All detected power fluctuations and power failures are recorded in the event log, along with UPS service start failures and server shutdown initiations.

To ensure that the computer is protected from power failures, test it by simulating a power failure (that is, by disconnecting the main power supply to the UPS device). Your computer and peripherals connected to the UPS device should remain operational, and messages should be displayed and events logged. Wait until the UPS battery reaches a low level to verify that a graceful shutdown occurs. Restore the main power to the UPS device, and check the event log to ensure that all actions were logged and there were no errors.

Running a Command File upon UPS Shutdown

The Windows NT UPS service can run a command file that the administrator defines. You should specify a command file only if your system requires special actions prior to a system shutdown. For example, you might have a custom application running that is connected to another computer. You can use the command file to end the session and log off the connected computer automatically prior to system shutdown.

You cannot specify a command file that causes a dialog box to appear because dialog boxes can require user input and would therefore impede a graceful system shutdown.

The command file must reside in your \Systemroot\System32 directory and have one of the following extensions: .exe, .com, .bat, or .cmd. After you have created the file and placed it in the proper directory, use the UPS dialog box to activate its use upon UPS shutdown. Select the Execute Command File option, type the file name, and click OK.

Note The command file must finish running in 30 seconds. A run time that is greater than 30 seconds threatens the capability of Windows NT to complete a graceful system shutdown. You should test the operation of the command file under a worst-case scenario.
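The requirements for the command file can be collected into a simple checklist. The following Python sketch is only a planning aid, not part of Windows NT; the function name is hypothetical, and treating a path prefix as a problem is an assumption based on the instruction to type the file name of a file placed in \Systemroot\System32.

```python
import ntpath

ALLOWED_EXTENSIONS = {".exe", ".com", ".bat", ".cmd"}
MAX_RUN_SECONDS = 30

def command_file_problems(file_name, measured_run_seconds=None):
    """Return a list of reasons the file would not suit use at UPS shutdown."""
    problems = []
    # Assumption: only a bare file name is expected, because the file must
    # reside in \Systemroot\System32.
    if ntpath.dirname(file_name):
        problems.append("type the file name only; the file must be in \\Systemroot\\System32")
    extension = ntpath.splitext(file_name)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append("extension must be .exe, .com, .bat, or .cmd")
    if measured_run_seconds is not None and measured_run_seconds > MAX_RUN_SECONDS:
        problems.append("run time exceeds 30 seconds")
    return problems

# Hypothetical file name and worst-case run time measured during testing.
print(command_file_problems("endsess.cmd", measured_run_seconds=12))
```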


System Diagnosis, Recovery, and Repair

These system diagnosis, recovery, and repair methods are described in this section:

Use Windows NT Diagnostics to diagnose configuration problems.

Use the Recovery box (Startup/Shutdown tab) in the System option of Control Panel to specify how Windows NT Server records and responds to severe errors.

Restore the previous working configuration using the Last Known Good Configuration.

Restore corrupt or missing system files, as well as the boot sector, and configuration information using the Repair process in Windows NT Setup. Depending on what you need to repair, you might need to use the Emergency Repair Disk. You can also use the Emergency Repair Disk program (Rdisk.exe) to update your system information or create a new Emergency Repair Disk.

Create and use a recovery disk to start the computer after failure of the primary partition.

A list of other error detection tools available in Windows NT Server appears at the end of this section.

Using Windows NT Diagnostics for System Diagnosis

You can use Windows NT Diagnostics (Winmsd.exe), the diagnostic tool for Windows NT Server, to view and print configuration information for a local or remote computer. Windows NT Diagnostics is located in the Administrative Tools folder. With Windows NT Diagnostics, you can view the following:

Operating system information, such as the version number and system boot options, plus process, system, and user environment variables

Hardware details such as BIOS information, video resolution, CPU type, and CPU steppings

Physical memory, paging file information, and DMA usage

The current state of each driver and service on the computer

Drives and devices installed on the computer, plus related interrupt (IRQ) and port information

Network information, including transports, configuration settings, and statistics

Printer settings, fonts settings, and system processes that are running

Using System Recovery in Control Panel

When a severe error (also called a STOP error, fatal system error, or blue screen) occurs, by default the system does the following:

Writes an event to the system log.

Alerts administrators.

Dumps system memory to a file you can use for debugging.

Automatically restarts the server.

Because Windows NT Server automatically restarts rather than waiting for administrator intervention, fatal system errors cause less server down time than they would otherwise.

The dump of system memory to a log file can be valuable for debugging the cause of the STOP error. If you contact your technical support representatives about the error, they might ask for the log file. Notice that Windows NT writes the log file to the same file name (Memory.dmp, by default) each time a STOP error occurs. To preserve log files, you should copy them to a new file name after the computer restarts.
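Copying the file to a new name after each restart can be automated. The following Python sketch shows one way to do it, assuming that a Python interpreter is available on the computer; the function name and the timestamped naming scheme are hypothetical choices, not something Windows NT provides.

```python
import shutil
import time
from pathlib import Path

def preserve_dump(dump_path, archive_dir):
    """Copy the dump to a timestamped name so that the next STOP error
    cannot overwrite it. Returns the new path, or None if no dump exists."""
    dump = Path(dump_path)
    if not dump.is_file():
        return None
    archive = Path(archive_dir)
    archive.mkdir(parents=True, exist_ok=True)
    # Name the copy after the time the dump was last written.
    stamp = time.strftime("%Y%m%d-%H%M%S", time.localtime(dump.stat().st_mtime))
    target = archive / f"{dump.stem}-{stamp}{dump.suffix}"
    shutil.copy2(dump, target)
    return target
```

A call such as preserve_dump(r"C:\winnt40\Memory.dmp", r"D:\dumps") would then be run once after each restart; both paths here are placeholders for your own \Systemroot directory and archive location.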

If you want to change how Windows NT reacts to a STOP error, use the System option (Startup/Shutdown tab) in Control Panel. For information about how to configure the System option, see "Configuring System Recovery Options" in Help.

Using the Last Known Good Configuration

If you encounter difficulty starting Windows NT Server after you installed a new driver or changed a driver configuration, you can choose to start Windows NT Server using the Last Known Good Configuration.

To use the Last Known Good Configuration

1. Start your computer and press the SPACEBAR immediately when the words "OS Loader V4.00" appear.

A Hardware Profile/Configuration Recovery menu appears that lets you select one of the following:

A hardware profile to be used when starting the computer

Switch to the Last Known Good configuration

Restart computer

2. Select L to use the Last Known Good Configuration to start Windows NT Server as it was before you made the changes that prevented it from starting.

All configuration changes that were made since your system was last successfully started are lost.

Using the Repair Process

If your system files or boot partition are corrupt and you are unable to start the computer using the Last Known Good method, you can use the Repair process in Windows NT Setup to restore your system.

To repair a Windows NT Server installation, Windows NT Setup needs the configuration information that is saved either in the \Systemroot\Repair directory or on the Emergency Repair Disk that was created when you installed Windows NT (or created later using the rdisk program).

For information about how to use or create an Emergency Repair Disk, see "Repair Disk Utility" in Help.

If your system becomes corrupt and you cannot repair it using the Emergency Repair Disk or using the information in the \Repair directory, you must reinstall Windows NT Server from the original installation source. For more information about restoring your system, see the Windows NT Server Resource Kit version 4.0.

To restore Windows NT Server on an x86-based computer using the repair process in Windows NT Setup

1. If you installed Windows NT Server using the original Setup floppy disks or CD-ROM or using Winnt.exe, start Setup just as you did originally. That is, insert the Setup Boot Disk in drive A and start the computer.

2. In the text-based Setup screen that asks whether you want to install Windows NT Server or repair files, type r to indicate that you want to repair your Windows NT Server files.

3. Windows NT Setup asks you for the Emergency Repair Disk. If you do not have one, Setup presents a list of the Windows NT installations that it found on your computer and lets you select the one you want to repair.

4. Follow the instructions on the screen, inserting the Emergency Repair Disk (if you have one) in drive A and providing any other Windows NT Setup disks as requested.

5. When the final message appears, remove the Emergency Repair Disk from drive A, and then press CTRL+ALT+DEL to restart your computer.

To restore Windows NT Server on a RISC-based computer with an Emergency Repair Disk

1. Start the Windows NT Setup program as instructed in your manufacturer-supplied documentation. (How you start Windows NT Setup depends on the type of RISC-based computer you have.)

2. In the text-based Setup screen that asks whether you want to install Windows NT Server or repair files, type r to indicate that you want to repair your Windows NT Server files.

3. Follow the instructions on the screen, inserting the Emergency Repair Disk (if you have one) in drive A if Setup asks for it.

4. When the final message appears, remove the Emergency Repair Disk, and then press ENTER to restart your computer.

The repair process in Windows NT Setup enables you to select what you want to repair.

Note The Emergency Repair Disk program (rdisk) does not back up user accounts or file security unless you specify the /s parameter with the rdisk command at the command prompt.

Caution Be sure to update the system repair information in the \Repair directory on your hard disk and to create and maintain an up-to-date Emergency Repair Disk. This way, your system repair information will account for new configuration information such as drive letter assignments, stripe sets, volume sets, mirrors, and so on. Otherwise, drives can be inaccessible in the event of a system failure.

You can use the rdisk program to update your system repair information and to create a new Emergency Repair Disk. For more information about rdisk, see Windows NT Server Start Here.

Recovering From Disk and Sector Failures

The Windows NT Server fault-tolerance tools enable you to recover quickly and easily from problem situations. Mirror sets enable you to have instant access to another disk with a redundant copy of the information on a failed disk. Using stripe sets with parity enables you to regenerate data by using the parity strip if a disk fails. Bad-sector mapping capabilities enable the system to fix sector failures without user intervention.

This section briefly discusses how to recover data from the following types of error situations:

Failed disk in a stripe set with parity

Sector failures

Failed disk in a mirror set

Fixing Mirror Sets and Stripe Sets with Parity

When a member of a mirror set or a stripe set with parity fails, it becomes an orphan. The fault-tolerance driver (Ftdisk.sys) then determines that it can no longer use it and directs all new reads and writes to the remaining members of the fault-tolerance volume.

When a member of a mirror set is orphaned, you must first break the mirror-set relationship to expose the remaining partition as a separate volume. The remaining, working member of the mirror set receives the drive letter that was previously assigned to the complete mirror set. The orphaned partition receives the next available drive letter or whatever letter you want to assign.

You can then create a new mirror-set relationship from unused free space on another disk. When you restart the computer, the data from the good partition is copied to the new member of the mirror set.

For information about how to break a mirror set, see "Breaking a Mirror Set" in Help.

When a member of a stripe set with parity is orphaned, you can regenerate the data for the orphaned member from the remaining members. In Disk Administrator, select a new area of free space that is the same size as, or greater than, the other members of the stripe set with parity, and then regenerate the data. If you are required to restart the computer, the fault-tolerance driver reads the information from the strips on the other member disks and then re-creates the data of the missing member and writes it to the new member.

Regenerating a stripe set with parity requires that the volume be locked by the operating system. All network connections to the volume are lost while the volume is regenerated.

For information about how to regenerate a recoverable stripe set with parity, see "Regenerating a Recoverable Stripe Set with Parity" in Help.

Fixing Sector Failures

The file system verifies all sectors when it formats a volume. All faulty sectors are spared from service. Windows NT Server fault-tolerance services add sector-recovery capabilities to the system.

When there is a sector I/O failure in a fault-tolerant system with redundant copies of the data, the fault-tolerance driver attempts to spare the bad sector from use. This includes performing a device control asking the disk device driver to spare the sector from use. Small Computer System Interface (SCSI) devices can do this, but AT devices, such as Integrated Device Electronics (IDE) and Enhanced Small Device Interface (ESDI), cannot.

When the sector cannot be spared, the correct information obtained from the redundant copy is returned to the file system with a status message stating that there is a faulty sector in the I/O. The file system then attempts to locate the failure and spare the bad sectors by removing them from the sector map of the file system. An error is logged in Event Viewer about the potential for data loss if the partition containing the redundant copy also fails. For more information about Event Viewer, see Chapter 9, "Monitoring Events."

Disk Administrator also enables you to check for errors on your disk. Click Check For Errors on the Tools menu. For information about how to check your disk for errors, see "Checking for Errors" in Help.


Fixing a System or Boot Failure

If the system or boot partition of a disk fails, the system will not start. The process used to recover from a startup failure depends on the disk configuration and the computer system's microprocessor type. If the system or boot partition on the disk is not part of a mirror set, the system-backup copies should be restored to the replacement disk.

For a system or boot partition configured as part of a mirror set, which is the only fault-tolerant method that can be applied to the boot and system partitions, the recovery procedure depends on whether the computer is x86-based or RISC-based. Both use a recovery disk for startup protection, and both use Advanced RISC Computing (ARC) names to describe the path to the boot partition. You should create and test the recovery disk in advance of any failure, or you cannot start your computer from the mirror if the primary partition is lost.


Understanding ARC Names 

To set up the boot information for recovery in the Windows NT environment, you must understand ARC names and how they are constructed. ARC names are a generic method of identifying devices within the ARC environment. For disk devices, ARC names are constructed as follows:

<component>(x)disk(y)rdisk(z)partition(a) 

where <component> identifies the hardware adapter for the device. The two valid values for this field are scsi and multi, where scsi indicates a SCSI disk and multi indicates a disk interface other than SCSI. (Multi can also be used for SCSI disk interfaces if the BIOS is enabled on the disk controller.) For Windows NT, a multi device could be a disk supported by the AtDisk driver or one supported by AbiosDsk.

x is the ordinal number of the adapter. For example, if there are two SCSI adapters in the system, the first to load and initialize is assigned the ordinal 0 and the next number assigned is 1. This continues for all adapter drivers that initialize.

y is, for scsi, the SCSI ID of the target disk, combined with the SCSI bus number on multiple-bus SCSI adapters. For multi, this is always 0.

z is, for scsi, always 0. For multi, this is the ordinal for the disk on the adapter, which determines the order the disk appears in Disk Administrator.

a is the partition ordinal for the partition used on the disk. All partitions receive a number, beginning with 1, except type 5 (MS-DOS Extended) and type 0 (unused) partitions.

For example, if the Windows NT tree is located on the fourth partition on a SCSI disk with the target ID of 3 on the second SCSI controller in the system, the ARC name is:

scsi(1)disk(3)rdisk(0)partition(4) 
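When you check or build Boot.ini entries, it can help to split an ARC name into its fields. The following Python sketch does that for disk ARC names of the form shown above; the class and function names are hypothetical, and the sketch validates only the syntax and the partition numbering, not whether the device exists.

```python
import re
from typing import NamedTuple

class ArcName(NamedTuple):
    component: str   # "scsi" or "multi"
    adapter: int     # x: ordinal number of the adapter
    disk: int        # y
    rdisk: int       # z
    partition: int   # a: partition ordinal, beginning with 1

_ARC_PATTERN = re.compile(
    r"^(scsi|multi)\((\d+)\)disk\((\d+)\)rdisk\((\d+)\)partition\((\d+)\)$",
    re.IGNORECASE,
)

def parse_arc_name(text):
    """Split a disk ARC name into its five fields."""
    match = _ARC_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a disk ARC name: {text!r}")
    component, *numbers = match.groups()
    arc = ArcName(component.lower(), *(int(n) for n in numbers))
    if arc.partition < 1:
        raise ValueError("partition numbers begin with 1")
    return arc

def format_arc_name(arc):
    """Rebuild the text form of an ARC name from its fields."""
    return (f"{arc.component}({arc.adapter})disk({arc.disk})"
            f"rdisk({arc.rdisk})partition({arc.partition})")

# The example from the text: second SCSI controller, target ID 3, fourth partition.
print(parse_arc_name("scsi(1)disk(3)rdisk(0)partition(4)"))
```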

Starting Windows NT Server on an x86-based Computer

Windows NT Server starts in the following sequence on an x86-based computer:

1. When Windows NT Server is installed, it alters the system's boot sector to look for and run a program called Ntldr.

2. Ntldr reads Boot.ini and builds a menu of the operating systems that you can start. (The Boot.ini file is described following this list.)

3. Ntldr runs Ntdetect.com, which builds a list of the system's hardware components.

4. You can select an operating system from the menu, or let the time-out count down to 0 to start the default operating system.

If you don't see the menu and the default operating system automatically starts, the time-out value has been set to 0 in Boot.ini.

5. The low-level components of Windows NT Server load, and then Windows NT Server initializes the drivers and starts the services based on information stored in the registry.

6. The high-level components of Windows NT Server load, and then the Welcome screen is displayed so you can log on.


Editing Boot.ini 

If Windows NT Server does not start, make sure that the statements in Boot.ini (found in the root directory of your system partition) refer to the correct path for the \Systemroot directory.

Boot.ini is a system text file that has two sections: the first specifies the default operating system to start and a time-out value specifying how long to wait before starting automatically, and the second specifies the operating systems that you can start. For example, if your system is configured to run either Windows NT Server or MS-DOS, Boot.ini typically looks like this:

[boot loader]
timeout=30
default=SCSI(0)disk(0)rdisk(0)partition(1)\winnt40

[operating systems]
SCSI(0)disk(0)rdisk(0)partition(1)\winnt40="Windows NT Server"
c:\="MS-DOS"

You can edit the text displayed in quotes to customize the operating system choices, but you must first change the read-only, hidden, and system attributes of Boot.ini.

To view and change the attributes of the Boot.ini file, see "Viewing all file and filename extensions" and "Changing file or folder properties" in Help. Attributes can also be changed using the attrib command at the command prompt. For more information about attrib, see the Command Reference in Help.
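
The two-section layout of Boot.ini described above can be read with Python's standard configparser module, which is one way to check that the default entry actually appears in the list of operating systems. This is a minimal sketch, not a Windows NT tool: the function name read_boot_ini is hypothetical, and the sample text is the example file from this section.

```python
import configparser

# The sample file from this section, kept as a raw string so the
# backslashes in the paths are taken literally.
SAMPLE_BOOT_INI = r'''[boot loader]
timeout=30
default=SCSI(0)disk(0)rdisk(0)partition(1)\winnt40

[operating systems]
SCSI(0)disk(0)rdisk(0)partition(1)\winnt40="Windows NT Server"
c:\="MS-DOS"
'''


def read_boot_ini(text):
    """Return (timeout, default_path, {path: menu_label}) from Boot.ini text."""
    # Only "=" separates key and value; the default ":" delimiter would
    # split the "c:\" entry in the wrong place.
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep the original case of the ARC paths
    parser.read_string(text)

    loader = parser["boot loader"]
    systems = {
        path: label.strip('"')
        for path, label in parser["operating systems"].items()
    }
    return int(loader["timeout"]), loader["default"], systems


timeout, default, systems = read_boot_ini(SAMPLE_BOOT_INI)
if default not in systems:
    print("Warning: the default entry is not listed under [operating systems]")
print(timeout, default, systems[default])
```

A check like this only confirms that the two sections agree with each other. It cannot confirm that the ARC path matches the disk layout of the computer.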

Creating a Recovery Disk on x86-based Systems

For x86-based systems, a Boot.ini file is located on the system partition. This file contains a menu selection and the ARC-name location for the Windows NT boot partition.

In this example, the system contains one Adaptec SCSI adapter and two Future Domain SCSI adapters. The Adaptec controller is the first on the SCSI chain. The boot partition is mirrored on the Future Domain adapter. The ARC names are as follows:

scsi(0)disk()rdisk()partition() would be for the Adaptec SCSI devices.
scsi(1)disk()rdisk()partition() would be for the first Future Domain adapter.
scsi(2)disk()rdisk()partition() would be for the second Future Domain adapter.

Create the recovery disk by formatting a floppy disk using the Windows NT operating system and then copying the following files to the disk. The files are located in the root directory of the system partition:

Ntldr

Ntdetect.com

Ntbootdd.sys (required only if the boot partition is on a SCSI disk and BIOS is not enabled on the controller; this file is the SCSI miniport driver used to find the mirror disk)

Boot.ini (with an alternate path pointing to the mirrored copy of the system partition, specified using ARC naming conventions)

Bootsect.dos (only on multiple-boot computers)
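
Before relying on a recovery disk, it can help to confirm that every file in the list above is present. The following Python sketch does that for a directory standing in for the floppy disk. The function name missing_recovery_files and its two flags are hypothetical, and names are compared without regard to case on the assumption that the floppy uses the FAT file system.

```python
from pathlib import Path

# Files the list above requires on every x86 recovery disk.
ALWAYS_REQUIRED = ("Ntldr", "Ntdetect.com", "Boot.ini")


def missing_recovery_files(disk_root, scsi_without_bios=False, multiple_boot=False):
    """Return the required file names that are absent from disk_root.

    scsi_without_bios -- boot partition is on a SCSI disk whose controller
                         BIOS is not enabled, so Ntbootdd.sys is needed
    multiple_boot     -- the computer starts more than one operating
                         system, so Bootsect.dos is needed
    """
    required = list(ALWAYS_REQUIRED)
    if scsi_without_bios:
        required.append("Ntbootdd.sys")
    if multiple_boot:
        required.append("Bootsect.dos")

    # Compare names case-insensitively (assumption: FAT-formatted floppy).
    present = {entry.name.lower() for entry in Path(disk_root).iterdir()}
    return [name for name in required if name.lower() not in present]
```

For example, missing_recovery_files("A:\\", scsi_without_bios=True) would list whichever of the four files needed in that configuration are not on the disk in drive A. The check covers file names only, not whether Boot.ini points to the mirrored copy.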

Creating a Recovery Disk on RISC-based Systems

RISC-based systems store information equivalent to the Boot.ini file in nonvolatile RAM. To create a recovery disk for a RISC-based system that uses the Microsoft firmware, copy the following files to a blank floppy disk formatted under Windows NT:

Osloader.exe

Hal.dll

The vendor firmware provides for boot maintenance operations through menu selections. To set up a boot selection for the boot disk, set the OSLOADER value to scsi(0)disk(0)fdisk(0)\Osloader.exe and the SYSTEMPARTITION value to scsi(0)disk(0)fdisk(0). The value for fdisk(x) can be changed to 1 to use the second floppy disk in the system. You also need to set appropriate values for the following selections:

OSLOADPARTITION, which is the ARC name that specifies the secondary mirrored partition

OSLOADFILENAME, which is the path to the \Systemroot directory for Windows NT (for example, \Winnt40 or \Windows) on the secondary mirrored partition
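
The four firmware values described above can be collected in one place before you enter them through the firmware menus. The Python sketch below only assembles the strings and does not talk to any firmware. The function name risc_boot_selection is hypothetical, and the mirror partition ARC name used in the example call is a made-up placeholder that you would replace with the real one.

```python
def risc_boot_selection(mirror_partition_arc, systemroot, floppy=0):
    """Build the firmware boot-selection values for a recovery floppy.

    mirror_partition_arc -- ARC name of the secondary mirrored partition
    systemroot           -- path to the \\Systemroot directory on that partition
    floppy               -- 0 for the first floppy drive, 1 for the second
    """
    floppy_arc = f"scsi(0)disk(0)fdisk({floppy})"
    return {
        "OSLOADER": floppy_arc + "\\Osloader.exe",
        "SYSTEMPARTITION": floppy_arc,
        "OSLOADPARTITION": mirror_partition_arc,
        "OSLOADFILENAME": systemroot,
    }


# Placeholder mirror location; substitute the ARC name of your own mirror.
values = risc_boot_selection("scsi(0)disk(1)rdisk(0)partition(1)", "\\Winnt40")
for name, value in values.items():
    print(f"{name} = {value}")
```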

After the Windows NT boot loader program, Osloader.exe, has finished loading Windows NT and the configured drivers, the fault-tolerant services are present and can perform the remaining corrective actions for starting the system. Notice, however, that RISC-based systems can access only SCSI devices connected to the built-in SCSI adapter during the firmware boot process. Therefore, to protect the boot or system partition on a RISC-based system, both drives used must be connected to the internal SCSI adapter.

Testing Your Newly Created Recovery Disk

To test your recovery disk, use it to boot the system from the shadow partition. Test the disk twice: first with the primary disk powered on, and then with the primary disk powered off. In both cases, if you can log on, the recovery disk works.

Maintaining Your Recovery Disk

You should update the recovery disk every time partitions are changed. For example, if you are using partition 2 to start and you delete partition 1, you must change the ARC name to start using partition 1. Likewise, if you are starting on partition 2 and you delete partition 1 and repartition it into partitions 1 and 2, you must change the ARC name to start using partition 3. For more information about ARC names, see "Understanding ARC Names" earlier in this chapter.
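
Updating an ARC name after a partition change amounts to replacing the number inside partition(). The Python sketch below shows that substitution for the first case described above, where the start partition moves from 2 to 1. The function name with_partition is hypothetical. Working out the correct new number is still up to you, and the result must still be written to Boot.ini on the recovery disk.

```python
import re


def with_partition(arc_path, new_partition):
    """Return arc_path with its partition(n) field set to new_partition."""
    if new_partition < 1:
        raise ValueError("partition numbers start at 1")
    updated, count = re.subn(
        r"partition\(\d+\)",
        f"partition({new_partition})",
        arc_path,
        count=1,
        flags=re.IGNORECASE,
    )
    if count == 0:
        raise ValueError("no partition() field found in: " + arc_path)
    return updated


# Partition 1 was deleted, so the old partition 2 is now partition 1.
print(with_partition(r"scsi(0)disk(0)rdisk(0)partition(2)\winnt40", 1))
```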

Maintaining the Boot Configuration

Once Windows NT Server starts successfully, back up the configuration directory (\Systemroot\System32\Config), and maintain current backups as you change the configuration and accounts. The registry is made up of the files in the configuration directory. For more information about backing up the registry, see Chapter 6, "Backing Up and Restoring Network Files."

If you have to use the Repair option in Windows NT Setup to restore the registry files, you can restore the configuration from your backup.

For x86-based systems, do not delete Boot.ini, Ntldr, Bootsect.dos, Ntdetect.com, or Ntbootdd.sys (if Windows NT Server is installed on a SCSI disk) in the root directory of the system partition. For RISC-based systems, do not delete Hal.dll or Osloader.exe in \OS\NT. If these hidden system files are deleted, Windows NT Server will not start. Use the Emergency Repair Disk to recover these files.

If you made changes to a system that previously started Windows NT Server successfully and it now does not start, you can return to your previous configuration by selecting Last Known Good Configuration at system boot. If Windows NT Server still will not start, use the Repair process in Windows NT Setup to restart the system. Once the system is restarted, you can restore data using backup tapes.

Other Error Detection Tools

Some other ways in which Windows NT Server reports errors and preserves your system configuration and data include the following:

If your system uses NTFS, Windows NT Server logs all file transactions, replaces bad clusters automatically, and stores copies of key information for all files on the NTFS volume.

Services and applications record events, including errors, in the event logs that you can view in Event Viewer. For information about Event Viewer, see Chapter 9, "Monitoring Events."

The Alerter and Messenger services work together to provide warnings on printer, security, and user session problems. These services also provide server shutdown warnings if the system uses the UPS service. For information, see "Managing Uninterruptible Power Supplies" earlier in this chapter.

Performance Monitor can be configured to create an alert log for monitoring performance and generating network alerts. For information about Performance Monitor, see Chapter 8, "Monitoring Performance."

Chkdsk examines disk space and use for the NTFS and FAT file systems. If there are errors on the disk, chkdsk alerts you and corrects the errors if the /f switch is used. If files are open when chkdsk attempts to correct disk errors, chkdsk offers to check the disk and correct the errors automatically the next time the computer restarts. For more information, see the Command Reference in Help.

The chkdsk program can also be run using the Check for Errors command from the Disk Administrator Tools menu.

Caution Using chkdsk to repair file system errors or bad sectors on very large volumes can render the volume inaccessible for several days.

Dr. Watson for Windows NT can be used to detect, diagnose, and log application errors for use by technical support personnel. To run Dr. Watson, type start drwtsn32 at the command prompt. Press F1 for Help on setting up and using Dr. Watson.

Repairing Config.nt and Autoexec.nt

If Windows NT Server displays an error message concerning these files, or if you have problems running MS-DOS–based applications, check whether Config.nt or Autoexec.nt is incorrect or missing. The files are located in the \Systemroot\System32 subdirectory.

If the files are incorrect or missing, you can copy new versions from the Emergency Repair Disk to the \System32 subdirectory, as described earlier in this section.

Recovering a Windows NT Server

The most common failures requiring system recovery are hardware failure (irreparable physical disk damage) and accidental deletion or modification of data. These failures can happen on your system partition, boot partition, or data partition.

If a failure occurs on a critical server, you will want to recover and get running as soon as possible. Information in the following sections should help you formulate a recovery plan.

In general, the steps outlined next also apply to recovering a Windows NT Workstation.

Making Recovery Easier

When you first set up a server, you can reduce the time needed for system recovery by putting the Windows NT Server system and boot partitions and the data partition(s) on separate drives. This will greatly simplify recovery if a disk is damaged.

If you use Disk Administrator to create stripe or mirror sets, you should save the disk configuration data each time you change the configuration. You can save the configuration to the server's Emergency Repair Disk or to a separate disk.

Note Always run the Emergency Repair Disk (rdisk) program just before and after you make any changes to the disk configuration. Doing so enables you to return to a stable configuration that was in place before changes were made. For information about how to use the rdisk program, see "Repair Disk Utility" in Help.

Another helpful record to have during disk recovery is a written list of disk partitions and their sizes. Attach this information to the front of each disk drive.

The Emergency Repair Disk contains configuration information needed to recover a server if the system partition is lost. If system recovery of a server requires reinstallation of system software, you can use the Emergency Repair Disk to start the system and then select the Repair option in the Setup program to automatically restore this information.

How to Recover a Server

If a server has a disk failure on the disk containing the server's system partition, use the following steps to recover the server from tape.

1. If there was a disk failure, replace the disk.

2. If the failed disk contained the system partition, reinstall Windows NT Server on the new disk. You can recover the server's system partition and part of its registry information using the server's Emergency Repair Disk to start the system and then using the Setup program's Repair option.

However, if a backup tape is more current than the Emergency Repair Disk, restore the registry from the backup tape.

3. Restart the server.

4. From a tape drive attached to the server (not over the network), restore the system partition from the last normal backup.

The system partition contains hardware-specific files needed to load Windows NT. Therefore, drivers that were installed after the first Windows NT installation will not be restored unless you restore the system partition from tape. The files on the system partition are not stored or updated on the Emergency Repair Disk.

5. Restore any applicable incremental or differential backup sets. Be sure to select the Restore Local Registry option to recover the rest of the registry information.

6. Restart the server.

It is now ready for normal use.
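
Steps 4 and 5 above imply an order in which backup sets are restored: the last normal backup first, followed by the sets taken after it. The Python sketch below works out that order from a list of sets. It rests on assumptions not stated in this chapter: that a site uses either incremental or differential backups between normal backups (not both), that every later incremental set is needed in date order, and that only the most recent differential set is needed. The BackupSet type and the restore_order function are hypothetical and are not part of the Backup program.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class BackupSet:
    taken: date
    kind: str  # "normal", "incremental", or "differential"


def restore_order(sets):
    """Return the backup sets to restore, oldest first."""
    ordered = sorted(sets, key=lambda s: s.taken)
    normals = [s for s in ordered if s.kind == "normal"]
    if not normals:
        raise ValueError("no normal backup to start from")

    base = normals[-1]  # step 4: the last normal backup
    later = [s for s in ordered if s.taken > base.taken]
    incrementals = [s for s in later if s.kind == "incremental"]
    differentials = [s for s in later if s.kind == "differential"]

    # Step 5: every later incremental set, or only the newest differential.
    if incrementals:
        return [base, *incrementals]
    return [base, *differentials[-1:]]
```

With a normal backup on Sunday and differential backups on Monday and Tuesday, restore_order returns the Sunday set followed by the Tuesday set.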

If the server performs time-critical functions, step 2 (reinstalling Windows NT Server on the new disk) can cause a significant delay. To minimize the time necessary to recover the server, you can create a recovery drive: an external SCSI drive, which can be as small as 100 MB. It can be a dedicated disk drive that sits on the server's SCSI chain but is powered off to prevent accidental modification, or a pooled portable drive that you can cable to any server that fails.

To prepare the recovery drive for future use, install Windows NT Server on the drive, and configure a local paging file and tape driver on it. Then create a recovery disk containing the files needed to load and initialize Windows NT Server. Be sure that the Boot.ini file points to the SCSI address of the recovery drive. For more information about the files needed to load and initialize Windows NT Server and about creating a recovery disk, see the Windows NT Server Resource Kit version 4.0.

When a system failure occurs, cable the recovery drive to the server (or power it on if it is already attached to the server). Restart the server using the recovery disk you created for the recovery drive.

If the recovery drive has the minimal software and user accounts to run your server, you can operate the server with the recovery drive until the next scheduled maintenance period and then make a full restore from tape.

Depending on the size of the recovery drive, you can either make the recovery drive your new system drive and restore from tape, or replace the failed drive and restore to the new drive in the background while the recovery drive keeps the server running.

