Chapter 9 - The Art of Performance Monitoring
Detecting the source of a performance problem isn't always a straightforward task. Sometimes it requires that you try different tools, running each in several ways, examining computer performance, and repeating the tests in a rigorous, scientific manner.
Problems can appear intermittently or be camouflaged by some greater or lesser matter. The following graph is an example of what you might see.
This is a Performance Monitor graph of processor and disk use over a 61-second interval. The white line represents disk activity; the black line represents processor activity. If you viewed just the first half of the interval, you would conclude that you have a disk bottleneck; the second half might lead you to believe you have a processor bottleneck. When the data is logged over time, you find that the processor is actually the problem—but you'd never know it from a one-minute glance.
This part of the Windows NT 4.0 Workstation Resource Guide is designed to help you tune and optimize Windows NT 4.0 Workstation. The remainder of the chapter includes some history and important information about Windows NT Workstation that affect how you monitor it.
The Resource Guide and Other Resources
The following materials might be of interest as well:
Optimizing Windows NT 4.0 Workstation
An original design goal of Windows NT was to eliminate the many parameters that characterized earlier systems. Adaptive algorithms were incorporated in Windows NT so that correct values are determined by the system as it runs. The 32-bit address space removed many limitations on memory and the need for users to manually adjust parameters to partition memory.
Windows NT has fundamentally changed how computers will be managed in the future. The task of optimizing Windows NT is not the art of manually adjusting many conflicting parameters. Optimizing Windows NT is a process of determining what hardware resource is experiencing the greatest demand and then adjusting the operation to relieve that demand.
Windows NT did not achieve the goal of automatic tuning in every case. A few parameters remain, mainly because it is not possible to know precisely how every computer is used. Default values for all parameters are set for a broad range of normal system use, and they rarely need to be altered. But special circumstances sometimes call for changes. In this book we will be sure to mention the few tuning parameters that remain in Windows NT and indicate when it is appropriate to change them from their default values.
A bottleneck is a condition in which the limitations in one component prevent the whole system from operating faster. The device with the lowest maximum throughput is the most likely to become a bottleneck if it is in demand. Making any other device faster can never yield more throughput; it can only result in lower utilization of the faster device.
Even if all other components are infinitely fast, a bottleneck holds the system at a stall until it is cleared.
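The relationship between the slowest component and overall throughput can be sketched in a few lines of Python. This is a minimal illustration, assuming a simple serial pipeline in which every request passes through every component; the component names and throughput figures are hypothetical.

```python
def system_throughput(max_throughput):
    """Return (bottleneck, throughput) for a serial pipeline.

    max_throughput maps a component name to the most requests per
    second it can complete. Assumes every request visits every component.
    """
    bottleneck = min(max_throughput, key=max_throughput.get)
    return bottleneck, max_throughput[bottleneck]


def utilization(max_throughput):
    """Utilization of each component when the pipeline runs flat out."""
    _, rate = system_throughput(max_throughput)
    return {name: rate / cap for name, cap in max_throughput.items()}


# Hypothetical figures, in requests per second.
components = {"processor": 500.0, "disk": 80.0, "network": 200.0}
slow_name, slow_rate = system_throughput(components)

# Doubling the processor speed leaves throughput unchanged and only
# lowers the processor's utilization.
faster = dict(components, processor=1000.0)
```

Under these assumptions the disk caps the system at 80 requests per second whether the processor handles 500 or 1000, which is the sense in which speeding up any other device only lowers that device's utilization.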
Although a foolproof bottleneck alarm and a direct bottleneck counter aren't available, you can combine several different indicators to look for bottlenecks. The primary indicator is an extended high rate of use on one hardware resource and resulting low rates of use on related components. It is accompanied by sustained queues for one or more services, and slow response time.
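One way to combine two of these indicators is sketched below. It is a rough heuristic, not a feature of any Windows NT tool: the function names, the thresholds (80% busy, a queue longer than two), and the sample logs are all hypothetical, and the sketch checks only sustained utilization and sustained queuing, not the use of related components or response time.

```python
def sustained(samples, threshold, min_fraction=0.9):
    """Return True if at least min_fraction of samples exceed threshold."""
    if not samples:
        return False
    over = sum(1 for value in samples if value > threshold)
    return over / len(samples) >= min_fraction


def likely_bottleneck(utilization, queue_length,
                      busy_threshold=0.8, queue_threshold=2.0):
    """Flag a resource only when it is busy AND queued for most of the log.

    utilization  -- samples between 0.0 and 1.0
    queue_length -- samples of the number of waiting requests
    Both thresholds are illustrative, not recommended values.
    """
    return (sustained(utilization, busy_threshold)
            and sustained(queue_length, queue_threshold))


# Hypothetical ten-sample logs for two resources.
disk_busy = [0.95, 0.90, 0.20, 0.10, 0.15, 0.10, 0.20, 0.10, 0.10, 0.15]
disk_queue = [4, 3, 0, 0, 0, 0, 0, 0, 0, 0]
cpu_busy = [0.85, 0.90, 0.95, 0.99, 0.97, 0.98, 0.99, 0.96, 0.97, 0.98]
cpu_queue = [3, 3, 4, 5, 4, 6, 5, 4, 5, 4]
```

With these made-up logs the disk looks saturated in the first two samples but is not flagged, because neither its use nor its queue is sustained; the processor is flagged because both are.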
Bottlenecks, Utilization, and Queues
The best bottleneck alarm is system response time, as perceived by the user. Users' perceptions are affected by their expectations and the kind of work they do. An accurate bottleneck alarm would be designed to reflect these same expectations and requirements. You needn't demand the same throughput on a system supporting word processing as you do on one madly calculating routes to Jupiter. Even if your processors, disks, and memory are running at near capacity, if they are not developing the queues that degrade their response time, you don't have a problem (although you might want to plan more capacity for the future).
Although 100% utilization of a resource is a clear warning, it is neither a necessary nor sufficient condition for a bottleneck. You can have bottlenecks on devices with utilization well below 100% and you can, at least in theory, have a device perking along at nearly 100% utilization with no signs that it is a bottleneck. That is, the device is not preventing any other resource from getting its work done, nothing is waiting for it, and even if it were infinitely fast, things wouldn't happen any sooner.
A bottleneck is determined by the number of requests for service, the arrival pattern of the requests, and the amount of time requested. If these factors are perfectly synchronized, no queues develop. But if they are random or unpredictable, queues develop at much lower utilization rates.
For example, suppose a process had ten threads, each of which used exactly 0.999 seconds of processor time once every ten seconds. If each request arrived exactly one second after the previous one in perfect sequence, the processor would be 99.9% busy, but there would be no queue, no interference between the threads and, technically, no bottleneck.
Admittedly, this is a highly idealized situation, but it's easy to see how any disruption in the pattern would quickly create a large queue. According to queuing theory, if the arrival pattern of requests and the duration of requested services are random or unpredictable, a device that is 66% utilized will produce a queue of two items. Even worse, if, instead of being random, requests for service are either very short or very long, queues can form at even lower utilization. That is, fewer requests for service produce even longer queues.
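Both claims can be checked numerically. The sketch below first simulates a single first-come, first-served server with the ten-thread load described above, once with evenly spaced requests and once with the same requests arriving in pairs. It then evaluates the textbook M/M/1 result (random arrivals, random service times, one server), under which the mean number of requests waiting or in service is rho / (1 - rho). The text does not name a queuing model, so the choice of M/M/1 is an assumption, as are the function names.

```python
def simulate_fifo(arrivals, service_times):
    """Single server, first come first served.

    Returns (total_wait, utilization) where utilization is busy time
    divided by the time at which the last request finishes.
    """
    free_at = 0.0
    total_wait = 0.0
    busy = 0.0
    for arrive, service in zip(arrivals, service_times):
        start = max(arrive, free_at)   # wait if the server is still busy
        total_wait += start - arrive
        free_at = start + service
        busy += service
    return total_wait, busy / free_at


def mean_in_system(utilization):
    """M/M/1 mean number of requests waiting or in service: rho / (1 - rho)."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    return utilization / (1.0 - utilization)


service = [0.999] * 100

# One request per second, in perfect sequence: no request ever waits.
evenly_spaced = [float(i) for i in range(100)]
even_wait, even_util = simulate_fifo(evenly_spaced, service)

# Same load, but requests arrive two at a time every two seconds.
paired = [float(2 * (i // 2)) for i in range(100)]
paired_wait, paired_util = simulate_fifo(paired, service)
```

In the evenly spaced run the processor is about 99.9% busy and the total wait is zero. In the paired run the utilization is essentially the same, but every second request waits almost a full second. Under the M/M/1 assumption, a utilization of 0.66 gives a mean of about 1.9 requests, and two-thirds gives 2.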
These chapters introduce several tools to help you monitor hardware and software performance. Many tasks require switching between or combining tools. But no matter which tools you choose, some basic concepts are common to all of them. This section describes those commonalities and describes how to monitor
These topics are an introduction to the larger topics of using Performance Monitor, Task Manager, and the other Windows NT Resource Kit 4.0 CD tools to optimize Windows NT.
Windows NT sees the active components running on the system as objects with characteristic properties. Some, such as processes and threads, are familiar; others, such as mutexes and semaphores, are less well known. For more information on Windows NT 4.0 objects, see "Microkernel Objects" in Chapter 5, "Windows NT 4.0 Workstation Architecture."
System object counts are important because each object takes up space in the operating system's nonpaged memory. Some objects just perform quick housekeeping and bookkeeping functions at background priority and rarely become a bottleneck. However, too many threads and processes can degrade performance on all functions, resulting in a bottleneck in processor or memory use.
Several performance monitoring tools let you keep track of the number of objects in your system:
Processes, which include both user applications and Windows NT services, can become bottlenecks. While investigating processor, disk, or memory use, chart use by process, and then start and stop the processes to see how your system responds.
Performance Monitor and Task Manager both show counts of running processes, including user programs and Windows NT services:
Many of the tools on the Windows NT Resource Kit 4.0 CD also monitor processes in detail, including Process Viewer (Pviewer.exe) and Process Monitor (Pmon.exe). For more information, see Chapter 11, "Performance Monitoring Tools," and Rktools.hlp.
Note The Services Control Panel also displays Windows NT services and lets you start and stop them. The Services Control Panel shows all Windows NT services, regardless of the process in which they run. However, it lists services by service name whereas Performance Monitor and Task Manager display the names of executable files.
For a list of the default services and a description of each, see Windows NT Help in the Services Control Panel. Click Start, click Help, and type Default Services.
In Task Manager, select the Processes tab. It displays a table of active processes. From the View menu, click Select Columns to add additional measures of the processor time, memory use, process priority, handle and thread counts, and the process ID.
In Performance Monitor, select the Process object from the Add To dialog box. All active applications and services appear in the Instances box.
The following table lists processes commonly running on Windows NT 4.0 Servers and Workstations without a network connection. It shows them as they appear in Performance Monitor and in Task Manager.
Note Process Explode (Pview.exe), Process Viewer (Pviewer.exe), and Process Monitor (Pmon.exe) all display important counts of system processes. Although the information from these tools is instantaneous and cannot be logged or collected, the tools require almost no setup, so they are very valuable for a quick look.
No matter what tool you choose, the processes that appear depend upon whether the computer is a server or workstation, and upon the services installed on the computer, including network services. User applications, including the executables for Performance Monitor and Task Manager, appear only when they are running.
Also, a process instance might not be visible for every active service. Performance Monitor and Task Manager display an instance for each executable process running on the system. Many services share a process to conserve system resources, so these appear together as one instance.
For example, many Windows NT 32-bit services, including Alerter, Clipbook Server, and Event Viewer, share the Services.exe process with the Windows NT Services Control Manager, a general process that starts all system services. Net Logon shares the Lsass.exe process with other security services.
It's difficult to monitor these services separately, although you can experiment in associating a service with threads in the process. The SC utility, in the Computer Configuration subdirectory on the Resource Kit CD, displays useful service configuration information, including the name of the process in which the service runs. For more information on SC, see Rktools.hlp.
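The effect of shared processes on what the tools display can be pictured as a simple grouping. The sketch below is illustrative only: the mapping is entered by hand from the examples above, it does not call SC or any Windows NT interface, and the function name is hypothetical.

```python
from collections import defaultdict

# Service -> hosting executable, limited to pairs named in the text.
SERVICE_HOST = {
    "Alerter": "Services.exe",
    "Clipbook Server": "Services.exe",
    "Net Logon": "Lsass.exe",
}


def services_by_process(service_host):
    """Invert a service -> process map into process -> sorted service list."""
    grouped = defaultdict(list)
    for service, process in service_host.items():
        grouped[process].append(service)
    return {process: sorted(names) for process, names in grouped.items()}


instances = services_by_process(SERVICE_HOST)
```

Three services collapse into two process instances here, which is why a tool that lists executables shows fewer entries than the Services Control Panel.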
Optimizing 16-bit Windows Applications
In Windows NT 4.0 Workstation and Server, by default, all active 16-bit Windows applications run as separate threads in a single multithreaded process called NT Virtual DOS Machine (NTVDM). The NTVDM process simulates a 16-bit Windows environment complete with all of the DLLs called by 16-bit Windows applications.
This configuration poses two challenges for running 16-bit applications:
As a result, Windows NT 4.0 includes an option to run a 16-bit application in its own separate NTVDM process with its own address space.
You can monitor 16-bit Windows applications by identifying them by their Thread ID while they are running, or by running each application in a separate address space.
In addition to the 16-bit applications, each NTVDM process includes a heartbeat thread that interrupts every 55 milliseconds to simulate a processor timer tick, and the Wowexec.exe thread, which helps to create 16-bit tasks and to handle the delivery of the 16-bit interrupt. You will see the heartbeat and Wowexec threads when monitoring 16-bit applications.
Win16 Application Performance
The NTVDM process is multitasking: A thread in the process (in this case, a 16-bit Windows application) can run at the same time as threads of other processes if the computer has more than one processor. It is also preemptible: Threads can be interrupted and resumed to allow virtual multitasking on a single-processor computer.
However, only one 16-bit Windows application thread in an NTVDM can run at one time and, if an application thread is preempted, the NTVDM always resumes with the same thread. This limits the performance of multiple 16-bit applications running in the same NTVDM process, although this limitation becomes an issue only when the processor is very busy.
Monitoring Win16 Applications
Almost all performance monitoring tools can monitor 16-bit applications on Windows NT 4.0 Server and Workstation. However, because they run in the same process, the trick to monitoring more than one 16-bit application is to distinguish among the threads of the NTVDM process.
To monitor one 16-bit application, simply select the NTVDM process in Performance Monitor, Task Manager, Process Explode, Process Viewer, Process Monitor, or another tool. If you have multiple 16-bit applications running in NTVDM, you can distinguish them by their thread IDs in all tools except Process Monitor. You might have to start and stop a 16-bit application to determine which thread ID is associated with it.
This figure is a Performance Monitor report on a single NTVDM process (Process ID 105) with three threads. One of the threads is the heartbeat thread (Thread #0, Thread ID 118), one is the Wowexec thread (Thread #1, Thread ID 140), and one is a 16-bit application, Write.exe (Thread #2, Thread ID 46).
Performance Monitor identifies threads by the process name and a thread number. The thread numbers are ordinal numbers (beginning with 0) that represent the order in which the threads started. The thread number of a running thread changes when a thread with a lower number stops; all threads with higher numbers move up in order to close the gap. For example, if thread 1 stops, thread 2 becomes thread 1. Therefore, thread numbers are not reliable indicators of thread identity.
Performance Monitor can monitor the Process ID and Thread ID of a thread. The Process ID is the ID of the process in which the thread runs. Thread ID is the ID of the thread. Unlike thread number, it is assigned when the thread starts and remains with it until the thread stops.
The Process and Thread IDs are just ordinal numbers that are associated with the process or thread only for a single run. On subsequent runs, the process or thread is just as likely to be assigned a different ID. However, you can use the ID to track it during execution.
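The difference between a thread number and a Thread ID can be shown with a short sketch. It uses the Thread IDs from the report described above; the function name is hypothetical, and the assumption that the thread with ID 140 stops is made purely for illustration.

```python
def numbered(thread_ids):
    """Number threads 0, 1, 2... in start order, as described above."""
    return {number: tid for number, tid in enumerate(thread_ids)}


# Thread IDs in start order:
# heartbeat (118), Wowexec (140), Write.exe (46).
running = [118, 140, 46]
before = numbered(running)

# Suppose, purely for illustration, that the thread with ID 140 stops.
running.remove(140)
after = numbered(running)
```

Before the change, thread number 2 refers to Thread ID 46. Afterward, the same thread is number 1, and number 2 no longer exists, but its Thread ID is still 46.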
This figure shows Process Explode monitoring a 16-bit Windows application running in a single process (Ntvdm.exe). The three threads displayed in the Thread ID box (midway down the first column) represent the heartbeat thread, the Wowexec thread, and the thread of the 16-bit Windows application.
To see information about the thread in Process Explode, click on the Thread ID of the thread in the Thread ID box.
Task Manager makes it easy to identify 16-bit applications, because it displays the names of the executable files indented below the NTVDM process name. To monitor 16-bit processes in Task Manager, click the Processes tab, and from the Options menu click Show 16-bit Tasks.
In this example, you can see the Wowexec and Write threads. The heartbeat thread is not an executable and does not appear in Task Manager. However, the Thread Count column on the far right shows that all three threads are running in the NTVDM process.
Running Win16 Applications in a Separate Process
Windows NT 4.0 lets you opt to run a 16-bit Windows application in a separate, unshared NTVDM process with its own memory space. This eliminates competition between NTVDM threads in a single process, making the 16-bit application thread fully multitasking and preemptible. It also simplifies monitoring.
To run a 16-bit application in its own address space, you can do any of the following:
In Task Manager and Performance Monitor, two instances of the NTVDM process appear in the Process object Instances box. You can use their process IDs to distinguish between them.
This example shows Task Manager monitoring two copies of 16-bit Write, each in its own NTVDM process.
When a 16-bit process runs in its own memory space, Performance Monitor shows two instances of the NTVDM process. You need to use process IDs to distinguish between them. (You might have to stop and start the processes to make the distinction.)
Monitoring MS-DOS Applications
In Windows NT 4.0, each MS-DOS application runs in its own NTVDM process, eliminating some of the problems encountered in Win16 applications. Unfortunately, all of the NTVDM processes are called Ntvdm.exe by default, but you can change that.
To create a new process name for an NTVDM
Tip You don't have to restart the computer for the registry change to take effect. Thus, you can change the registry between starting different DOS applications and have each start in a uniquely named process. It is also prudent to set it back to Ntvdm.exe when you are finished.
Unfortunately, this doesn't work with 16-bit Windows applications, so you need to distinguish those by thread or by process ID.
The Cost of Performance Monitoring
Performance monitoring tools are quite sophisticated, but they are plagued by the problem common to all investigative tools: Using them changes their results. Performance tools are just applications and, as such, they occupy the processor, use memory and disk space, and tax the graphics subsystem of the Windows NT Executive. Make sure to measure the effects of these tools, and subtract them from your data.
Note Performance Monitor for Windows NT 4.0 has lower overhead than previous versions, due almost entirely to changes in the Windows NT 4.0 architecture. Most of Performance Monitor's overhead is consumed by its graphic displays, which are now more efficient, not by data collection.
Response Probe, a monitoring tool included on the CD, has no apparent overhead. It monitors its own toll on the system and subtracts it before displaying its results.
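For tools that do not correct for their own cost, the subtraction can be done by hand. The sketch below assumes you have recorded one set of counters with only the monitoring tool running and another during the real test; the counter names, the values, and the function name are hypothetical.

```python
def net_usage(measured, tool_overhead):
    """Subtract the monitoring tool's own cost from each counter.

    Counters missing from tool_overhead are left unchanged, and results
    are clamped at zero so noise cannot produce a negative reading.
    """
    return {
        name: max(0.0, value - tool_overhead.get(name, 0.0))
        for name, value in measured.items()
    }


# Hypothetical readings taken with the tool running on an otherwise
# idle system (overhead) and then during the real test (measured).
overhead = {"processor_percent": 3.0, "memory_kb": 2048.0}
measured = {"processor_percent": 42.0, "memory_kb": 30720.0, "disk_percent": 12.0}
adjusted = net_usage(measured, overhead)
```

This treats the overhead as constant, which is only an approximation. A tool's cost can vary with the number of counters it collects and how often it updates its display.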