Device-Driver Performance Considerations for Multimedia Platforms

Updated: October 5, 2006

This paper provides information about writing device drivers for the Microsoft Windows family of operating systems. It provides guidelines for driver developers to minimize the negative impact of driver behavior on multimedia applications.


Introduction

This paper provides information about writing kernel-mode device drivers for client versions of the Microsoft Windows family of operating systems.

Device-driver behavior significantly affects the performance of multimedia applications, so it is important that driver developers minimize any negative impact. This paper describes how driver behavior affects the Microsoft Windows environment and offers tips on how to minimize these effects. The information and tips include:

The maximum amount of processor time an Interrupt Service Routine (ISR) or Deferred Procedure Call (DPC) should consume.

How to measure ISR and DPC execution times.

How drivers can execute at an interrupt request level (IRQL) of PASSIVE_LEVEL.


How Driver Behavior Affects Multimedia Applications

ISRs and DPCs consume the processing time of the actively running thread. If this thread is a multimedia thread, the multimedia application could miss a deadline and present either incorrect content to the user or correct content at the wrong time. For example, a DVD playback application must load, decode, and display a video frame every 33 milliseconds. The application cannot meet this deadline if too much processing time is taken from the decoding thread to run a long DPC. Consequently, the user perceives a glitch.

To deliver high-quality audio and video, multimedia applications must account for the amount of processing time that each driver on the system consumes at an IRQL higher than PASSIVE_LEVEL. However, driver behavior is defined as unpredictable if the IRQL remains higher than PASSIVE_LEVEL for an unbounded amount of time. Because driver behavior is currently unpredictable in the Microsoft Windows environment, multimedia applications often provide a less-than-optimal experience. In contrast, dedicated set-top boxes and game systems maintain predictable environments because they use well-behaved, proprietary drivers. These systems can thus support high-quality multimedia applications, even though the platforms are composed of the same hardware that is used to run the Microsoft Windows operating system.

Note:
The guidelines described in this paper also apply to server systems, even though they do not run multimedia applications. For server systems, it is important to increase the predictability of the Microsoft Windows environment so that other applications with deadlines, such as the user interface, execute correctly.

Improving Users' Multimedia Experiences

Microsoft is incorporating functionality into Windows Vista that improves users’ multimedia experiences by enabling them to determine why glitches happen. A user can identify the devices that consume large amounts of processing time and investigate ways to reconfigure, disable, or replace the devices or their drivers, with the goal of a glitch-free multimedia experience.

Guidelines on ISR and DPC Behavior

Because the definition of the phrase "too much processing time" is subjective, the guidelines in this section are derived from the requirements of multimedia applications and other time-sensitive applications. The guidelines thus apply to a moderately powerful system, which is defined as a system that contains at least one 1-GHz processor, 512 MB RAM, and a 133-MHz system bus.

To provide a predictable and bounded environment for multimedia applications, all device drivers must adhere to the following guidelines:

Drivers should never collectively take more than 400 microseconds in any 2-millisecond period on a system with a 1-GHz processor. This guideline cannot be applied to any specific driver, because no single driver can control or influence the behavior of other drivers, but it is worth stating to make the system-level target clear and to keep the behavior of a single driver in perspective.

An ISR should never take more than 25 microseconds. This means that the ISR should be able to clear all hardware signals that are currently asserting its IRQ, queue up any required DPCs, and return within 25 microseconds. Any work that takes more than this amount of time should be performed at the DPC level.

A single DPC should never take more than 100 microseconds. A lengthier DPC leaves little time for other processing to occur before some application’s deadline arises. Any required work that takes more than 100 microseconds should be performed by a thread.

A driver or thread should not perform any action that blocks the execution of other drivers for more than 25 microseconds because doing so has the same effect as running an excessively long ISR. Actions that block drivers include holding a spin lock, executing CLI or STI instructions, raising the IRQL, and masking interrupts. (Regardless of how little time they take, CLI and STI instructions should be used only in extremely rare cases.)

If possible, call the KeDelayExecutionThread function instead of KeStallExecutionProcessor. If the latter must be called, the delay should not exceed 100 microseconds in either a single call or accumulated back-to-back calls.
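The numeric limits in the guidelines above can be checked mechanically once execution times have been measured. The following is a minimal sketch in portable C, not driver code: the routine_sample type and the function names are hypothetical, times are assumed to be in microseconds, and samples are assumed to come from a single processor without nested ISRs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Budgets taken from the guidelines above, in microseconds. */
#define ISR_BUDGET_US     25u
#define DPC_BUDGET_US     100u
#define WINDOW_US         2000u  /* 2-millisecond period */
#define WINDOW_BUDGET_US  400u   /* collective ISR and DPC time per period */

/* Hypothetical record of one measured routine; not a Windows structure. */
typedef enum { ROUTINE_ISR, ROUTINE_DPC } routine_kind;

typedef struct {
    routine_kind kind;
    uint64_t start_us;  /* when the routine began */
    uint64_t end_us;    /* when the routine returned */
} routine_sample;

/* True if a single ISR or DPC stays within its individual budget. */
bool sample_within_budget(const routine_sample *s)
{
    assert(s != NULL && s->end_us >= s->start_us);
    uint64_t elapsed = s->end_us - s->start_us;
    return elapsed <= (s->kind == ROUTINE_ISR ? ISR_BUDGET_US : DPC_BUDGET_US);
}

/* Time that sample s spends inside [win_start, win_start + WINDOW_US). */
static uint64_t overlap_us(const routine_sample *s, uint64_t win_start)
{
    uint64_t win_end = win_start + WINDOW_US;
    uint64_t lo = s->start_us > win_start ? s->start_us : win_start;
    uint64_t hi = s->end_us < win_end ? s->end_us : win_end;
    return hi > lo ? hi - lo : 0;
}

/*
 * Largest total ISR/DPC time found in any 2 ms window. Only windows that
 * start at a sample's start or end at a sample's end are examined, because
 * the total can peak only at those positions.
 */
uint64_t worst_window_us(const routine_sample *samples, size_t count)
{
    assert(samples != NULL || count == 0);
    uint64_t worst = 0;
    for (size_t i = 0; i < count; i++) {
        uint64_t candidates[2];
        candidates[0] = samples[i].start_us;
        candidates[1] = samples[i].end_us >= WINDOW_US
                            ? samples[i].end_us - WINDOW_US : 0;
        for (int c = 0; c < 2; c++) {
            uint64_t total = 0;
            for (size_t j = 0; j < count; j++)
                total += overlap_us(&samples[j], candidates[c]);
            if (total > worst)
                worst = total;
        }
    }
    return worst;
}
```

A caller would flag any sample for which sample_within_budget returns false, and flag the system as a whole when worst_window_us exceeds WINDOW_BUDGET_US.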

Adhering to the preceding guidelines is essential to the success of Microsoft Windows as a home multimedia platform. A system containing a loaded driver that does not meet these guidelines is unsuitable for multimedia scenarios.


Measuring ISR and DPC Execution Times

Microsoft added infrastructure to Microsoft Windows XP Service Pack 2 (SP2) that enables the reporting of ISR and DPC execution times. You can use this infrastructure at driver-development time to measure and optimize the driver’s behavior. The following section briefly describes how to use the event-tracing tools that are built on the infrastructure.

Note:
You can find complete documentation on using the infrastructure and its tools in the "Event Tracing" section of the Microsoft Platform SDK.

Event-Tracing Tools

You can use the Tracelog.exe and Tracerpt.exe tools to leverage the event-tracing infrastructure. Tracelog.exe turns trace collection on and off, while Tracerpt.exe converts a trace log file into a comma-separated file in text format and generates a summary file in text format.

To gather driver-relevant data:

1. Run tracelog -start -f kernel.etl -b 64 -UsePerfCounter -eflag 8 0x307 0x4084 0 0 0 0 0 0 to start logging the following execution times: process, thread, image load, thread context swap, DPC, Timer DPC, and ISR.

2. Run tracelog -stop to stop logging.

3. Run tracerpt kernel.etl to generate a summary of event counts in Summary.txt and a full text trace in Dumpfile.csv.

4. Run tracerpt kernel.etl -report to generate a text report that summarizes DPC and ISR execution times.

The first row in Dumpfile.csv begins with the text "Event Name" and contains column titles for the report.

In Dumpfile.csv, the DPC, Timer DPC, and ISR events have the following format:

PerfInfo, (DPC|ISR|Timer DPC), unused, end time, kernel time, user time, start time, routine address, unused, unused

To get an approximation of the execution time for a DPC, Timer DPC, or ISR that a routine address points to, subtract the start time from the end time.
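The subtraction described above can be automated for each line of Dumpfile.csv. The following is a minimal sketch in portable C that assumes the field order given in the format line above, decimal timestamps, and a hexadecimal routine address. The perf_event type and the function names are hypothetical, and the result is in whatever units the trace timestamps use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical holder for the fields of interest; not a Windows structure. */
typedef struct {
    char kind[16];             /* "DPC", "ISR", or "Timer DPC" */
    uint64_t start_time;
    uint64_t end_time;
    uint64_t routine_address;
} perf_event;

/* Copy comma-separated field `index` (0-based) of `line` into `out`,
 * with surrounding spaces trimmed. Returns false if the field is missing
 * or does not fit. */
static bool get_field(const char *line, int index, char *out, size_t out_size)
{
    const char *p = line;
    for (int i = 0; i < index; i++) {
        p = strchr(p, ',');
        if (p == NULL)
            return false;
        p++;
    }
    while (*p == ' ')
        p++;
    size_t len = strcspn(p, ",\r\n");
    while (len > 0 && p[len - 1] == ' ')
        len--;
    if (len >= out_size)
        return false;
    memcpy(out, p, len);
    out[len] = '\0';
    return true;
}

/* Parse one DPC, Timer DPC, or ISR line. Returns false for any other line. */
bool parse_perf_event(const char *line, perf_event *ev)
{
    char field[64];
    assert(line != NULL && ev != NULL);

    if (!get_field(line, 0, field, sizeof field) || strcmp(field, "PerfInfo") != 0)
        return false;
    if (!get_field(line, 1, ev->kind, sizeof ev->kind))
        return false;
    if (strcmp(ev->kind, "DPC") != 0 && strcmp(ev->kind, "ISR") != 0 &&
        strcmp(ev->kind, "Timer DPC") != 0)
        return false;

    if (!get_field(line, 3, field, sizeof field))   /* end time */
        return false;
    ev->end_time = strtoull(field, NULL, 10);
    if (!get_field(line, 6, field, sizeof field))   /* start time */
        return false;
    ev->start_time = strtoull(field, NULL, 10);
    if (!get_field(line, 7, field, sizeof field))   /* routine address */
        return false;
    ev->routine_address = strtoull(field, NULL, 16);

    return ev->end_time >= ev->start_time;
}

/* Approximate execution time: end time minus start time. */
uint64_t perf_event_elapsed(const perf_event *ev)
{
    return ev->end_time - ev->start_time;
}
```

Grouping the elapsed values by routine_address gives a per-routine view of which ISRs and DPCs run longest.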

In Dumpfile.csv, the context swap events have the following format:

Thread, ContxtSwap, unused, end time, kernel time, user time, new thread id, old thread id, new thread priority, old thread priority, , , , 1, old thread state, old thread ideal processor number, unused, unused

Note:
For more information about each event, see "MOF Classes" in the "Event Tracing" section of the Microsoft Platform SDK.


Executing at PASSIVE_LEVEL

When a driver must perform work that would cause it to exceed the time limits in the preceding guidelines, the driver should perform the work at PASSIVE_LEVEL. Two options exist for performing work at PASSIVE_LEVEL: the driver can create dedicated threads to execute long work items, or it can use kernel-mode worker threads that the system provides.

Although creating a dedicated thread is a more flexible approach, leveraging system-worker threads is simpler and more efficient from the system's point of view.

For more information on how drivers can create and use dedicated-driver threads and system-worker threads, see the "Thread Objects" section in the Microsoft DDK.
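The deferral pattern itself can be illustrated outside the kernel. The following is a user-space model in portable C, not driver code: the work_queue type and every function name are hypothetical stand-ins, there is no locking, and run_worker is called directly rather than by a separate thread. A real driver would use the thread and worker-thread routines documented in the DDK and would need proper synchronization.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_CAPACITY 8

/* A deferred unit of work: a routine plus the context it operates on. */
typedef void (*work_routine)(void *context);

typedef struct {
    work_routine routine;
    void *context;
} work_item;

/* Fixed-size ring buffer of pending work items (hypothetical, unsynchronized). */
typedef struct {
    work_item items[QUEUE_CAPACITY];
    size_t head;
    size_t count;
} work_queue;

/* Queue a work item; cheap enough to call from the time-critical path. */
bool queue_work(work_queue *q, work_routine routine, void *context)
{
    assert(q != NULL && routine != NULL);
    if (q->count == QUEUE_CAPACITY)
        return false;
    size_t tail = (q->head + q->count) % QUEUE_CAPACITY;
    q->items[tail].routine = routine;
    q->items[tail].context = context;
    q->count++;
    return true;
}

/* Stands in for a worker thread: runs every pending item, returns how many ran. */
size_t run_worker(work_queue *q)
{
    assert(q != NULL);
    size_t ran = 0;
    while (q->count > 0) {
        work_item item = q->items[q->head];
        q->head = (q->head + 1) % QUEUE_CAPACITY;
        q->count--;
        item.routine(item.context);
        ran++;
    }
    return ran;
}

/* Hypothetical device state shared between the two paths. */
typedef struct {
    int pending_bytes;
    int processed_bytes;
} device_state;

/* The long-running part, which belongs on the worker side. */
static void long_processing(void *context)
{
    device_state *dev = context;
    dev->processed_bytes += dev->pending_bytes;
    dev->pending_bytes = 0;
}

/* Stands in for a DPC: records what arrived and defers everything else. */
bool dpc_stand_in(work_queue *q, device_state *dev, int bytes)
{
    assert(dev != NULL);
    dev->pending_bytes += bytes;
    return queue_work(q, long_processing, dev);
}
```

The point of the split is that dpc_stand_in does a bounded, small amount of work, while the unbounded part runs later in run_worker, which models code executing at PASSIVE_LEVEL.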
