Special Offers

Inside Microsoft® Windows® 2000, Third Edition
Author David A. Solomon and Mark E. Russinovich
Pages 944
Disk 1 Companion CD(s)
Level Intermediate
Published 08/16/2000
ISBN 9780735610217
ISBN-10 0-7356-1021-5
Price(USD) $49.99
To see this book's discounted price, select a reseller below.

More Information

About the Book
Table of Contents
Sample Chapter
Related Series
Related Books
About the Author

Support: Book & CD

Rate this book
Barnes Noble Amazon Quantum Books


Chapter 6: Processes, Threads, and Jobs (continued)

Thread Scheduling

This section describes the Windows 2000 scheduling policies and algorithms. The first subsection provides a condensed description of how scheduling works on Windows 2000 and a definition of key terms. Then Windows 2000 priority levels are described from both the Win32 API and the Windows 2000 kernel points of view. After a review of the relevant Win32 functions and Windows 2000 utilities and tools that relate to scheduling, the detailed data structures and algorithms that comprise the Windows 2000 scheduling system are presented.

Overview of Windows 2000 Scheduling

Windows 2000 implements a priority-driven, preemptive scheduling system—the highest-priority runnable (ready) thread always runs, with the caveat that the thread chosen to run might be limited by the processors on which the thread is allowed to run, a phenomenon called processor affinity. By default, threads can run on any available processor, but you can alter processor affinity by using one of the Win32 scheduling functions.

Viewing Ready Threads

You can view the list of ready threads with the kernel debugger !ready command. This command displays the thread or list of threads that are ready to run at each priority level. In the following example, two threads are ready to run at priority 10 and six at priority 8. Because this output was generated using LiveKd on a uniprocessor system, the current thread will always be the kernel debugger (/Kd or I386kd).

kd> !ready 1
Ready Threads at priority 10
 THREAD 810de030  Cid 490.4a8 Teb: 7ffd9000  Win32Thread: e297e008 READY
 THREAD 81110030  Cid 490.48c Teb: 7ffde000  Win32Thread: e29425a8 READY
Ready Threads at priority 8
 THREAD 811fe790  Cid 23c.274 Teb: 7ffdb000  Win32Thread: e258cda8 READY
 THREAD 810bec70  Cid 23c.50c Teb: 7ffd9000  Win32Thread: e2ccf748 READY
 THREAD 8003a950  Cid 23c.550 Teb: 7ffda000  Win32Thread: e29a7ae8 READY
 THREAD 85ac2db0  Cid 23c.5e4 Teb: 7ffd8000  Win32Thread: e297a9e8 READY
 THREAD 827318d0  Cid 514.560 Teb: 7ffd9000  Win32Thread: 00000000 READY
 THREAD 8117adb0  Cid 2d4.338 Teb: 7ffaf000  Win32Thread: 00000000 READY

When a thread is selected to run, it runs for an amount of time called a quantum. A quantum is the length of time a thread is allowed to run before Windows 2000 interrupts the thread to find out whether another thread at the same priority level or higher is waiting to run or whether the thread's priority needs to be reduced. Quantum values can vary from thread to thread (and differ between Windows 2000 Professional and Windows 2000 Server). (Quantums are described in more detail elsewhere in this chapter.) A thread might not get to complete its quantum, however. Because Windows 2000 implements a preemptive scheduler, if another thread with a higher priority becomes ready to run, the currently running thread might be preempted before finishing its time slice. In fact, a thread can be selected to run next and be preempted before even beginning its quantum!

The Windows 2000 scheduling code is implemented in the kernel. There's no single "scheduler" module or routine, however—the code is spread throughout the kernel in which scheduling-related events occur. The routines that perform these duties are collectively called the kernel's dispatcher. Thread dispatching occurs at DPC/dispatch level and is triggered by any of the following events:

  • A thread becomes ready to execute—for example, a thread has been newly created or has just been released from the wait state.
  • A thread leaves the running state because its time quantum ends, it terminates, or it enters a wait state.
  • A thread's priority changes, either because of a system service call or because Windows 2000 itself changes the priority value.
  • The processor affinity of a running thread changes.

At each of these junctions, Windows 2000 must determine which thread should run next. When Windows 2000 selects a new thread to run, it performs a context switch to it. A context switch is the procedure of saving the volatile machine state associated with a running thread, loading another thread's volatile state, and starting the new thread's execution.

As already noted, Windows 2000 schedules at the thread granularity. This approach makes sense when you consider that processes don't run but only provide resources and a context in which their threads run. Because scheduling decisions are made strictly on a thread basis, no consideration is given to what process the thread belongs to. For example, if process A has 10 runnable threads and process B has 2 runnable threads, and all 12 threads are at the same priority, each thread would receive one-twelfth of the CPU time—Windows 2000 wouldn't give 50 percent of the CPU to process A and 50 percent to process B.

To understand the thread-scheduling algorithms, you must first understand the priority levels that Windows 2000 uses.

Thread-Scheduling State Changes

You can watch thread-scheduling state changes with the Performance tool in Windows 2000. This utility can be useful when you're debugging a multithreaded application if you're unsure about the state of the threads running in the process. To watch thread-scheduling stage changes using the Performance tool, follow these steps:

  1. Run the Microsoft Notepad utility (Notepad.exe).
  2. Start the Performance tool by selecting Programs from the Start menu and then selecting Administrative Tools/Performance.
  3. Select chart view if you're in some other view.
  4. Right-click on the graph, and choose Properties.
  5. Click the Graph tab, and change the chart vertical scale maximum to 7. (As you'll see from the explanation text for the performance counter, thread states are numbers from 0 through 7.) Click OK.
  6. Click the Add button on the toolbar to bring up the Add Counters dialog box.
  7. Select the Thread performance object, and then select the Thread State counter. Click the Explain button to see the definition of the values:
  8. Click to view graphic
    Click to view graphic

  9. In the Instances box, scroll down until you see the Notepad process (notepad/0); select it, and click the Add button.
  10. Scroll back up in the Instances box to the Mmc process (the Microsoft Management Console process running the System Monitor ActiveX control), select all the threads (mmc/0, mmc/1, and so on), and add them to the chart by clicking the Add button. Before you click Add, you should see something like this:
  11. Click to view graphic
    Click to view graphic

  12. Now close the Add Counters dialog box by clicking Close.
  13. You should see the state of the Notepad thread (the very top line in the following figure) as a 5, which, as shown in the explanation text you saw under step 5, represents the waiting state (because the thread is waiting for GUI input):
  14. Click to view graphic
    Click to view graphic

  15. Notice that one thread in the Mmc process (running the Performance tool snap-in) is in the running state (number 2). This is the thread that's querying the thread states, so it's always displayed in the running state.
  16. You'll never see Notepad in the running state (unless you're on a multiprocessor system) because Mmc is always in the running state when it gathers the state of the threads you're monitoring.

Priority Levels

As illustrated in Figure 6-11, internally, Windows 2000 uses 32 priority levels, ranging from 0 through 31. These values divide up as follows:

  • Sixteen real-time levels (16-31)
  • Fifteen variable levels (1-15)
  • One system level (0), reserved for the zero page thread
  • Click to view graphic
    Click to view graphic

    Figure 6-11 Thread priority levels

Thread priority levels are assigned from two different perspectives: those of the Win32 API and those of the Windows 2000 kernel. The Win32 API first organizes processes by the priority class to which they are assigned at creation (Real-time, High, Above Normal, Normal, Below Normal, and Idle) and then by the relative priority of the individual threads within those processes (Time-critical, Highest, Above-normal, Normal, Below-normal, Lowest, and Idle).

In the Win32 API, each thread has a priority based on a combination of its process priority class and its relative thread priority. The mapping from Win32 priority to internal Windows 2000 numeric priority is shown in Figure 6-12.

Click to view graphic
Click to view graphic

Figure 6-12 Kernel priorities in Win32 vs. Windows 2000

The priorities shown in Figure 6-12 are thread base priorities. Threads start out inheriting the process base priority, which can be changed with Task Manager (as described in the section "Relevant Tools") or with the Win32 SetPriorityClass function.

Normally, the process base priority (and hence the starting-thread base priority) will default to the value at the middle of each process priority range (24, 13, 10, 8, 6, or 4). However, some Windows 2000 system processes (such as the Session Manager, service controller, and local security authentication server) have a base process priority slightly higher than the default for the Normal class (8). This higher default value ensures that the threads in these processes will all start at a higher priority than the default value of 8. A system process uses internal Windows 2000 functions to set its process base priority to a numeric value other than its default starting Win32 base priority.

Whereas a process has only a single priority value (base priority), each thread has two priority values: current and base. The current priority for threads in the dynamic range (1 through 15) might be, and often is, higher than the base priority. Windows 2000 never adjusts the priority of threads in the real-time range (16 through 31), so they always have the same base and current priority.

Win32 Scheduling APIs

The Win32 API functions that relate to thread scheduling are listed in Table 6-15. (For more information, see the Win32 API reference documentation.)

Table 6-15 Scheduling-Related APIs and Their Functions

API Function
Suspend/ResumeThread Suspends or resumes a paused thread from execution.
Get/SetPriorityClass Returns or sets a process's priority class (base priority).
Get/SetThreadPriority Returns or sets a thread's priority (relative to its process base priority).
Get/SetProcessAffinityMask Returns or sets a process's affinity mask.
SetThreadAffinityMask Sets a thread's affinity mask (must be a subset of the process's affinity mask) for a particular set of processors, restricting it to running on those processors.
Get/SetThreadPriorityBoost Returns or sets the ability for Windows 2000 to boost the priority of a thread temporarily (applies only to threads in the dynamic range).
SetThreadIdealProcessor Establishes a preferred processor for a particular thread but doesn't restrict the thread to that processor.
Get/SetProcessPriorityBoost Returns or sets the default priority boost control state of the current process. (This function is used to set the thread priority boost control state when a thread is created.)
SwitchToThread Yields execution for one or more quantums.
Sleep Puts the current thread into a wait state for a specified time interval (figured in milliseconds [msec]). A zero value relinquishes the rest of the thread's quantum.
SleepEx Causes the current thread to go into a wait state until either an I/O completion callback is completed, an APC is queued to the thread, or the specified time interval ends.

Relevant Tools

You can view (and change) the base-process priority class with Task Manager, Pview, or Pviewer. You can view the numeric base-process priority value with the Performance tool or Pstat. You can view thread priorities with the Performance tool, Pview, Pviewer, and Pstat. There is no general utility to change relative thread priority levels, however. Table 6-16 lists the tools related to thread scheduling.

Table 6-16 Tools Related to Thread Scheduling

Click to view graphic
Click to view graphic

The only way to specify a starting priority class for a process is with the start command in the Windows 2000 command prompt.

Examining and Specifying Process and Thread Priorities

Try the following experiment:

  1. From the command prompt, type start/realtime notepad. Notepad should open.
  2. Run the Process Viewer utility in the Support Tools (Pviewer.exe), and select Notepad.exe from the list of processes, as shown here. Notice that the dynamic priority of the thread in Notepad is 24. This matches the real-time value shown in Figure 6-12.
  3. Click to view graphic
    Click to view graphic

  4. Task Manager can show you similar information. Press Ctrl+Shift+Esc to start Task Manager, and go to the Processes tab. Right-click on the Notepad.exe process, and select the Set Priority option. You can see that Notepad's process priority class is Realtime, as shown here:
  5. Click to view graphic
    Click to view graphic

Real-Time Priorities

You can raise or lower thread priorities within the dynamic range in any application; however, you must have the increase scheduling priority privilege to enter the real-time range. (If you attempt to move a process into the Real-time priority class and don't have the privilege, the operation doesn't fail—the High class is used.)

Be aware that many important Windows 2000 kernel-mode system threads run in the real-time priority range, so if your process spends excessive time running in this range, it might be blocking critical system functions in the memory manager, cache manager, local and network file systems, and even other device drivers. It won't block hardware interrupts because they have a higher priority than any thread, but it might block system threads from running.

There is one behavioral difference for threads in the real-time range (mentioned in the section "Preemption"): their thread quantum is reset if they are preempted.

Although Windows 2000 has a set of priorities called real-time, they are not real-time in the common definition of the term in that Windows 2000 doesn't provide true real-time operating system facilities, such as guaranteed interrupt latency or a way for threads to obtain a guaranteed execution time. For more information, see the sidebar "Windows 2000 and Real-Time Processing" in Chapter 3 as well as the MSDN Library article "Real-Time Systems and Microsoft Windows NT."

Interrupt Levels vs. Priority Levels

As illustrated in Figure 6-13, all threads run at IRQL 0 or 1. (For a description of how Windows 2000 uses interrupt levels, see Chapter 3.) User-mode threads run at IRQL 0; only kernel-mode APCs execute at IRQL 1, since they interrupt the execution of a thread. (For more information on APCs, see in Chapter 3.) Also, threads running in kernel mode can raise IRQL. Because of this, no user-mode thread, regardless of its priority, blocks hardware interrupts (although high-priority real-time threads can block the execution of important system threads).

Click to view graphic
Click to view graphic

Figure 6-13 Interrupt priorities vs. thread priorities

Thread-scheduling decisions are made at DPC/dispatch level. Thus, while the kernel is deciding which thread should run next, no thread can be running and possibly changing scheduling-related information (such as priorities). On a multiprocessor system, access to the thread-scheduling data structures is synchronized by acquiring the Dispatcher spinlock (KiDispatcherLock).

Thread States

Before you can comprehend the thread-scheduling algorithms and data structures, you need to understand the various execution states that a thread can be in. Figure 6-14 illustrates the state transitions for a Windows 2000 thread. More details on what happens at each transition are included later in this section.

Click to view graphic
Click to view graphic

Figure 6-14 Thread states

The thread states are as follows:

  • Ready When looking for a thread to execute, the dispatcher considers only the pool of threads in the ready state. These threads are simply waiting to execute.
  • Standby A thread in the standby state has been selected to run next on a particular processor. When the correct conditions exist, the dispatcher performs a context switch to this thread. Only one thread can be in the standby state for each processor on the system.
  • Running Once the dispatcher performs a context switch to a thread, the thread enters the running state and executes. The thread's execution continues until the kernel preempts it to run a higher priority thread, its quantum ends, it terminates, or it voluntarily enters the wait state.
  • Waiting A thread can enter the wait state in several ways: a thread can voluntarily wait on an object to synchronize its execution, the operating system (the I/O system, for example) can wait on the thread's behalf, or an environment subsystem can direct the thread to suspend itself. When the thread's wait ends, depending on the priority, the thread either begins running immediately or is moved back to the ready state.
  • Transition A thread enters the transition state if it is ready for execution but its kernel stack is paged out of memory. For example, the thread's kernel stack might be paged out of memory. Once its kernel stack is brought back into memory, the thread enters the ready state.
  • Terminated When a thread finishes executing, it enters the terminated state. Once terminated, a thread object might or might not be deleted. (The object manager sets policy regarding when to delete the object.) If the executive has a pointer to the thread object, it can reinitialize the thread object and use it again.
  • Initialized Used internally while a thread is being created.


As mentioned earlier in the chapter, a quantum is the amount of time a thread gets to run before Windows 2000 checks whether another thread at the same priority should get to run. If a thread completes its quantum and there are no other threads at its priority, Windows 2000 reschedules the thread to run for another quantum.

Each thread has a quantum value that represents how long the thread can run until its quantum expires. This value isn't a time length but rather an integer value, which we'll call quantum units.

Quantum Accounting

By default, threads start with a quantum value of 6 on Windows 2000 Professional and 36 on Windows 2000 Server. (We'll explain how you can change these values later.) The rationale for the longer default value on Windows 2000 Server is to minimize context switching. By having a longer quantum, server applications that wake up as the result of a client request have a better chance of completing the request and going back into a wait state before their quantum ends.

Each time the clock interrupts, the clock-interrupt routine deducts a fixed value (3) from the thread quantum. If there is no remaining thread quantum, the quantum end processing is triggered and another thread might be selected to run. On Windows 2000 Professional, because 3 is deducted each time the clock interrupt fires, by default a thread runs for 2 clock intervals; on Windows 2000 Server, by default a thread runs for 12 clock intervals.

Even if the system were at DPC/dispatch level or above (for example, if a DPC or an interrupt service routine was executing) when the clock interrupt occurred, the current thread would still have its quantum decremented, even if it hadn't been running for a full clock interval. If this was not done and device interrupts or DPCs occurred right before the clock interval timer interrupts, threads might not ever get their quantum reduced.

The length of the clock interval varies according to the hardware platform. The frequency of the clock interrupts is up to the HAL, not the kernel. For example, the clock interval for most x86 uniprocessors is 10 milliseconds, and for most x86 multiprocessors, 15 milliseconds.

Determining the Clock Interval Frequency

To determine the approximate clock interval, run the Performance tool on an idle system (that is, no processes that are performing I/O operations should be running—check the per-process I/O counters to verify that the system is idle). Monitor the value Interrupts/sec (in the Processor object). Divide the average value into 1 to calculate the clock interval.

For example, most uniprocessor x86 systems show an average interrupts/second of 100, so you would calculate the clock interval as 1/100 = 10 milliseconds. On a multiprocessor x86 system, the average is 67 interrupts per second, which is a clock interval of 1/67 = .015, or 15 milliseconds.

The reason quantum is expressed in terms of a multiple of 3 quantum units per clock tick rather than as single units is to allow for partial quantum decay on wait completion. When a thread that has a base priority less than 14 executes a wait function (such as WaitForSingleObject or WaitForMultipleObjects), its quantum is reduced by 1 quantum unit. (Threads running at priority 14 or higher have their quantums reset after a wait.)

This partial decay addresses the case in which a thread enters a wait state before the clock interval timer fires. If this adjustment were not made, it would be possible for threads never to have their quantums reduced. For example, if a thread ran, entered a wait state, ran again, and entered another wait state but was never the currently running thread when the clock interval timer fired, it would never have its quantum charged for the time it was running.

Controlling the Quantum

The registry value HKLM\SYSTEM\CurrentControlSet\Control\PriorityControl\Win32PrioritySeparation allows you to specify the relative length of thread quantums (short or long) and whether or not threads in the foreground process should have their quantums boosted (and if so, the amount of the boost). This value consists of 6 bits divided into the three 2-bit fields shown in Figure 6-15.

Click to view graphic
Click to view graphic

Figure 6-15 Fields of the Win32PrioritySeparation registry value

  • Short vs. Long 1 specifies long, and 2 specifies short. 0 or 3 indicates that the default will be used (short for Windows 2000 Professional, long for Windows 2000 Server).
  • Variable vs. Fixed 1 means to vary the quantum for the foreground process, and 2 means that quantum values don't change for foreground processes. 0 or 3 means that the default will be used (variable for Windows 2000 Professional, fixed for Windows 2000 Server).
  • Foreground Quantum Boost This field, which must have a value of 0, 1, or 2 (3 is invalid and treated as 2) is an index into a three-entry quantum table used to obtain the quantum for the threads in the foreground process. The quantum for threads in background processes is taken from the first entry in this quantum table. The field's value is stored in the kernel variable PsPrioritySeparation.

The foreground process is the process that owns the thread that owns the window that's in focus. When the foreground window changes to one owned by a thread in a process higher than the Idle priority class, the Win32 subsystem changes the quantum values for all the threads in that process by using the lower-order 2 bits of the Win32PrioritySeparation registry value as an index into a three-element array named PspForegroundQuantum. This array contains values determined by the other two bit fields in the Win32PrioritySeparation registry value. Table 6-17 shows the possible settings for PspForegroundQuantum.

Table 6-17 Quantum Values

  Short Long
Variable 6 12 18 12 24 36
Fixed 18 18 18 36 36 36

The reason that Windows 2000 boosts the quantum of foreground threads and not the priority is best illustrated with the following example, which shows the potential problems resulting from an approach based on foreground priority boosting. Suppose you start a long-running spreadsheet recalculation and then switch to a CPU-intensive application (such as a graphics-intensive game). The spreadsheet process running in the background will get little CPU time if the game process, which is in the foreground, has its priority boosted. Increasing the quantum of the game process doesn't prevent the spreadsheet calculation from running but instead favors the game process. If you do want to run an interactive application at a higher priority than all other interactive processes, you can always change the priority class to Above Normal or High using Task Manager (or start the application from the command prompt with the command start /abovenormal or start /high ).

When you're setting the quantum length by modifying the Win32PrioritySeparation registry value directly, you can select any combination. When you're using the Performance Options dialog box, you can choose from only two combinations. To see this dialog box (shown in Figure 6-16), open the System utility in Control Panel (or right-click on My Computer and select Properties), click the Advanced tab, and then click the Performance Options button.

Click to view graphic
Click to view graphic

Figure 6-16 Adjusting the quantum settings

The Applications option under Optimize Performance For designates the use of short, variable quantums—the default for Windows 2000 Professional. The Background Services option designates the use of long, fixed quantums—the default for Windows 2000 Server. If you install Terminal Services on Windows 2000 Advanced Server or Windows 2000 Datacenter Server and configure the server as an application server, this setting is changed to optimize for applications.

Scheduling Data Structures

To make thread-scheduling decisions, the kernel maintains a set of data structures known collectively as the dispatcher database, which is illustrated in Figure 6-17. The dispatcher database keeps track of which threads are waiting to execute and which processes are executing which threads. The most important structure in the dispatcher database is the dispatcher ready queue (located at KiDispatcherReadyListHead). This queue is really a series of queues, one queue for each scheduling priority. The queues contain threads that are in the ready state, waiting to be scheduled for execution.

Click to view graphic
Click to view graphic

Figure 6-17 Dispatcher database

To speed up the selection of which thread to run or preempt, Windows 2000 maintains a 32-bit bitmask called the ready summary (KiReadySummary). Each bit set indicates one or more threads in the ready queue for that priority level. (Bit 0 represents priority 0, and so on.) Windows 2000 maintains another bitmask, the idle summary (KiIdleSummary), in which each set bit represents an idle processor.

As noted earlier, thread dispatching takes place at DPC/dispatch level. In addition to preventing other threads from running, being at DPC/dispatch level synchronizes access to the dispatcher database. On a multiprocessor system, however, changes to the dispatcher database require the additional step of acquiring the kernel dispatcher spinlock (KiDispatcherLock). Table 6-18 shows the kernel-mode kernel variables that are related to thread scheduling.

Table 6-18 Thread-Scheduling Kernel Variables

Variable Type Description
KiDispatcherLock Spinlock Dispatcher spinlock
KeNumberProcessors Byte Number of processors active in system
KeActiveProcessors Bitmask (32 bits) Bitmask of active processors in system
KiIdleSummary Bitmask (32 bits) Bitmask of idle processors
KiReadySummary Bitmask (32 bits) Bitmask of priority levels that have one or more ready threads
KiDispatcherReadyListHead Array of 32 list entries List heads for the 32 ready queues

Scheduling Scenarios

Windows 2000 bases the question of "Who gets the CPU?" on thread priority; but how does this approach work in practice? The following sections illustrate just how priority-driven preemptive multitasking works on the thread level. Note that there are differences in the way Windows 2000 handles scheduling decisions on a multiprocessor system vs. on a uniprocessor system. These differences are explained in the section "Thread Scheduling on Symmetric Multiprocessing Systems" later in this chapter.

Voluntary Switch

First a thread might voluntarily relinquish use of the processor by entering a wait state on some object (such as an event, a mutex, a semaphore, an I/O completion port, a process, a thread, a window message, and so on) by calling one of the many Win32 wait functions (such as WaitForSingleObject or WaitForMultipleObjects). Waiting on objects is described in more detail in Chapter 3.

Voluntary switching is roughly equivalent to a thread ordering an item that isn't ready to go at a fast-food counter. Rather than hold up the queue of the other diners, the thread will step aside and let the next thread place its order (execute its routine) while the first thread's hamburger is being prepared. When the hamburger is ready, the first thread goes to the end of the ready queue of the priority level. However, as you'll see later in the chapter, most wait operations result in a temporary priority boost so that the thread can pick up its hamburger right away and start eating.

Figure 6-18 illustrates a thread entering a wait state and Windows 2000 selecting a new thread to run.

Click to view graphic
Click to view graphic

Figure 6-18 Voluntary switching

In Figure 6-18, the top block (thread) is voluntarily relinquishing the processor so that the next thread in the ready queue can run (as represented by the halo it has when in the Running column). Although it might appear from this figure that the relinquishing thread's priority is being reduced, it's not—it's just being moved to the wait queue of the objects the thread is waiting on. What about any remaining quantum for the thread? The quantum value isn't reset when a thread enters a wait state—in fact, as explained earlier, when the wait is satisfied, the thread's quantum value is decremented by 1 quantum unit, equivalent to one-third of a clock interval (except for threads running at priority 14 or higher—they have their quantum reset after a wait).


In this scheduling scenario, a lower-priority thread is preempted when a higher-priority thread becomes ready to run. This situation might occur for a couple of reasons.

  • A higher-priority thread's wait completes. (The event that the other thread was waiting on has occurred.)
  • A thread priority is increased or decreased.

In either of these cases, Windows 2000 must determine whether the currently running thread should still continue to run or whether it should be preempted to allow a higher-priority thread to run.

Threads running in user mode can preempt threads running in kernel mode—the mode in which the thread is running doesn't matter. The thread priority is the determining factor.

When a thread is preempted, it is put at the head of the ready queue for the priority it was running at. Threads running in the real-time priority range have their quantum reset to a full time slice while threads running in the dynamic priority range finish their quantum when they get to run again. Figure 6-19 illustrates this situation.

Click to view graphic
Click to view graphic

Figure 6-19 Preemptive thread scheduling

In Figure 6-19, a thread with priority 18 emerges from a wait state and repossesses the CPU, causing the thread that had been running (at priority 16) to be bumped to the head of the ready queue. Notice that the bumped thread isn't going to the end of the queue but to the beginning; when the preempting thread has finished running, the bumped thread can complete its quantum. In this example, the threads are in the real-time range; as explained in the note in the section "Priority Boosts," no dynamic priority boosts are allowed for threads in the real-time range.

If voluntary switching is roughly equivalent to a thread letting another thread place its lunch order while the first thread waits for its meal, preemption is roughly equivalent to a thread being bumped from its place in line because the president of the United States has just walked in and ordered a hamburger. The preempted thread doesn't get bumped to the back of the line but is simply moved aside while the president gets his lunch. As soon as the president leaves, the first thread can resume ordering its meal.

Quantum End

When the running thread exhausts its CPU quantum, Windows 2000 must determine whether the thread's priority should be decremented and then whether another thread should be scheduled on the processor.

If the thread priority is reduced, Windows 2000 looks for a more appropriate thread to schedule. (For example, a more appropriate thread would be a thread in a ready queue with a higher priority than the new priority for the currently running thread.) If the thread priority isn't reduced and there are other threads in the ready queue at the same priority level, Windows 2000 selects the next thread in the ready queue at that same priority level and moves the previously running thread to the tail of that queue (giving it a new quantum value and changing its state from running to ready). This case is illustrated in Figure 6-20. If no other thread of the same priority is ready to run, the thread gets to run for another quantum.

Click to view graphic
Click to view graphic

Figure 6-20 Quantum end thread scheduling


When a thread finishes running (either because it returned from its main routine, called ExitThread, or was killed with TerminateThread), it moves from the running state to the terminated state. If there are no handles open on the thread object, the thread is removed from the process thread list and the associated data structures are deallocated and released.

Context Switching

A thread's context and the procedure for context switching vary depending on the processor's architecture. A typical context switch requires saving and reloading the following data:

  • Program counter
  • Processor status register
  • Other register contents
  • User and kernel stack pointers
  • A pointer to the address space in which the thread runs (the process's page table directory)

The kernel saves this information from the old thread by pushing it onto the current (old thread's) kernel-mode stack, updating the stack pointer, and saving the stack pointer in the old thread's KTHREAD block. The kernel stack pointer is then set to the new thread's kernel stack, and the new thread's context is loaded. If the new thread is in a different process, it loads the address of its page table directory into a special processor register so that its address space is available. (See the description of address translation in Chapter 7.) If a kernel APC that needs to be delivered is pending, an interrupt at IRQL 1 is requested. Otherwise, control passes to the new thread's restored program counter and the thread resumes execution.

Idle Thread

When no runnable thread exists on a CPU, Windows 2000 dispatches the per-CPU idle thread. Each CPU is allotted one idle thread because on a multiprocessor system one CPU can be executing a thread while other CPUs might have no threads to execute. Windows 2000 reports the priority of the idle thread as 0. In reality, however, such threads don't have a priority level because they run only when there are no threads to run. (Remember, only one thread per Windows 2000 system is actually running at priority 0—the zero page thread.) In fact, the idle loop runs at DPC/dispatch level, polling for work to do: deferred procedure calls (DPCs) to deliver or threads to dispatch to. Although some details of the flow vary between architectures, the basic flow of control of the idle thread is as follows:

  1. Enables and disables interrupts (allowing any pending interrupts to be delivered).
  2. Checks whether any DPCs (described in Chapter 3) are pending on the processor. If DPCs are pending, clears the pending software interrupt and delivers them.
  3. Checks whether a thread has been selected to run next on the processor, and if so, dispatches that thread.
  4. Calls the HAL processor idle routine (in case any power management functions need to be performed).

Various Windows 2000 process viewer utilities report the idle process using different names. Task Manager calls it "System Idle Process," Process Viewer reports it as "Idle," Pstat calls it "Idle Process," Process Explode and Tlist call it "System Process," and Qslice calls it "SystemProcess."

Previous  |  Table of Contents   |  Next

Top of Page

Last Updated: Friday, July 6, 2001