This paper describes how the Operating Systems and Middleware group at Hasso Plattner Institut (HPI) extends the Windows® Operating System Internals Curriculum Resource Kit (CRK)i with hands-on labs using the Windows Research Kernel (WRK)ii for teaching undergraduate/graduate operating systems classes. The paper also demonstrates the applicability of the WRK as a proof-of-concept environment for current research projects of the Operating Systems and Middleware group.
About the School
The Hasso Plattner Institut (HPI) for Software Systems Engineering is a privately funded and independent research institute associated with the University of Potsdam in Germany. Founded in 1998, HPI has rich experience conducting research projects in conjunction with industry partners, nationally and internationally.
The Operating Systems and Middleware group at HPI offers various lectures, courses, and seminars as part of the undergraduate and graduate Software Engineering curriculum at HPI. Teaching activities focus on the architecture of operating systems, on component-based middleware, and on predictable distributed and embedded computing. The curriculum discusses operating system issues with standard platforms (such as Windows 2000 and Windows XP, Mac OS X [for BSD UNIX], and Solaris) as well as with embedded systems (such as Windows Embedded and Embedded Linux).
About the Operating Systems Curriculum
At HPI, undergraduate students must attend the introductory Operating Systems course and optionally can attend the successor course Operating Systems II. Both courses are based on the CRK, which is extended by hands-on labs that examine the WRK.
Commercial off-the-shelf operating systems are complex software systems of remarkable size. Gaining a thorough understanding of an operating system's design rationale, algorithms, heuristics, and implementation strategies from reading source code is difficult at best. More helpful is to learn about these details through real-world case studies.
HPI's approach to teaching an operating system curriculum is threefold: The curriculum starts with a two-semester undergraduate lecture series on the various aspects of an operating system. Classroom experiments, labs, and homework assignments accompany this series. Also, the first semester of the lecture series is mandatory for undergraduate students.
A seminar on operating system services and administration is the second offering to students. In this seminar, students are expected to set up, experiment, and evaluate systems. Interoperability of heterogeneous settings is the main focus.
The third step in the operating system curriculum is a source code analysis project. Students are asked to explain the exact implementation of certain operating system functionality based on the actual source code.
Within the lecture, one section discusses Windows scheduling principles, priorities (Windows API versus NTOS kernel), scheduling data structures, scheduling scenarios, priority boosts, and priority adjustments. For example, students look at the central thread state (see Figure 1).
Figure 1. The central thread diagram.
In contrast to many other operating systems, Windows is thoroughly instrumented. Tools such as the Performance Monitor can be used to investigate a multitude of performance counters, thus gathering data about the internal state of the Windows operating system. In the lecture, after describing the scheduling algorithm used by Windows and the expected system behavior, HPI incorporates labs that dig into an actual live system. For example, it uses the Performance Monitor to display scheduling state changes in real time (10 millisecond resolution) for a given application.
CPU utilization, fairness, throughput, and minimal response time are typical optimization criteria for CPU schedulers. Most scheduling algorithms try to achieve these goals by dynamically assigning priorities to processes. There are two general classes of scheduler implementation: (1) schedulers based on priority elevation, and (2) decreasing priority schedulers with aging. The VMS and Windows schedulers fall into the first class, whereas Mach, UNIX, and Linux fall into the second class. Because schedulers are workload-dependent and based on heuristics, many variations may exist within both classes (see Figure 2).
Figure 2. Priority boosting and decay.
Priority boosting is discussed in the context of the Windows scheduler implementation, and system behavior is demonstrated in a lab using performance counter data (see Figure 3). Select labs are demonstrated during the lecture; however, students are also expected to experiment on their own. In some cases, this requires an elevation of privileges for the students' user accounts.
Figure 3. Observing the scheduler using performance counters.
To prepare students to accomplish lab goals, faculty first introduce how to build the kernel, how the WRK is organized, how to set up virtual machines, and how to use kernel debuggers, as well as Windows programming conventions.
The WRK requires a working Windows Server® 2003 operating system. To save students the time of installing and setting up virtual machines, HPI prepared Microsoft® Virtual PC 2007 and VMware images for download on its Web site.
Windows source code is complex. To help students find their way through all the documentation, HPI faculty prepared a browsable HTML version using the open source tool Doxygen. Because the Windows coding conventions differ from what Doxygen can process, HPI wrote some scripts that transform the Windows source code into a Doxygen-compatible format.
The advantage of the documentation Web site is that students can easily click through all the source files by following hyperlinks on function names or data structure definitions. The HTML representation provides syntax highlighting, which improves code readability.
Figure 4. doxygen interactive documentation.
HPI also created a Web site, Windows Research Kernel @HPI,iii that aggregates its labs, tutorials, “how-tos,” and other information garnered while working with the WRK. For example, HPI provides a Microsoft Visual Studio® 2005 development system–based solution on this site that makes it possible to build the Windows kernel directly within Visual Studio.
Figure 5. The HPI Windows Research Kernel internal Web site.
Tutorial on Kernel Debuggers
Most of the students have used debuggers for programs they have written on their own. Kernel debuggers, however, usually require more setup and are more difficult to handle. Typically, debugging an operating system kernel requires two machines: a debugger host and a target. The debugger host is necessary because when debugging the target machine, the kernel is usually halted. This debugging model is something new for most students. Historically, the debugger host and target were connected via a null modem cable; nowadays, an IEEE 1394 (FireWire) connection is possible. Fortunately, most virtual machine vendors allow for virtual COM ports, which simplifies the debugging process. In the preliminary lab, students learn how to set up kernel debuggers and use them.
Kernel Programming Project: Implementation of a System Service Call
After the preliminary lab, students train their skills toward implementing the system service call. Implementing the service call requires several steps, as shown in the following figure.
In a first exercise, students must implement their own system service call wrapper library (usually called myntdll.dll). The purpose of this exercise is to repeat the concept and implementation of dynamic link libraries. Also, it is a good idea to let students debate the sense of such system service call wrappers. The basic myntdll.dll library should provide only one service: WrkSetThreadPriority. The semantics of this service are such that the priority of a thread identified by a handle should be set explicitly to a value between 1 and 31. The implementation is already given inside the kernel: NtSetInformationThread. Students only have to set up the stack appropriately and issue an interrupt instruction with the appropriate service call number specified.
The second step toward the implementation of a system service call is the modification of an existing one (NtSetInformationProcess is a promising candidate). The purpose of this exercise is to learn how to transfer data between user-mode and kernel-mode and to check transferred data for validity. NtSetInformationProcess accepts a parameter that defines what information should be set inside the process. By extending this enumeration, one can easily extend the functionality of the service call. The implemented functionality is artificial. However, the task is to (mis-)use the NtSetInformationProcess service call to pass two integer values down to the kernel, add them, and print the result on the debugger console. Besides modifying NtSetInformationProcess, myntdll.dll also has to be modified appropriately.
Third, students implement their own system service call. All that is left to do is to extend the service call table in order to inform the service dispatcher of the new service and combine steps one and two to accomplish the goal. The task is to provide a system service that allows for adding two integers. This rather simple service expects two integers as input and returns the sum as one integer. Once again, myntdll.dll must be modified appropriately. Where the system service implementation is located is arbitrary, depending on the students' preferences. However, the compiler and linker must be able to build the new kernel.
As already stated, this first implementation of a system service call is artificial rather than useful. Thus, the exercise was further refined, to recapitulate process and thread management and to combine it with implementing a system service call. In Windows, the unit of scheduling is a thread, not a process. So, the faculty came up with the idea of removing a process from the list of active processes that Windows maintains. If a process is not in the list of processes in the system, applications like Task Manager will not be able to list that process, which means the process is invisible to the system. However, the application represented by the process will continue executing, as its threads are still visible for the scheduler. The task for students is to implement a system service call that removes and adds, respectively, a specified process from the system. Calling the system service will thus toggle the visibility of a process in the system.
Two approaches are generally used to solve this problem: First, students can remove the process from the list of active processes. Second, the EPROCESS data structure can be extended with a flag indicating whether a process should occur in the Task Manager. While the first approach requires a detailed understanding of locking schemes inside the kernel, the second approach is rather simple to implement. Besides adding the flag, students only have to modify the PsEnumProcesses kernel function that is used by the Task Manager. The function must be modified in such a way that it evaluates the flag inside the EPROCESS structure and does not return any information if the flag is set.
WRK for Research
The WRK is valuable not only for teaching purposes, but also for experimentally evaluating research projects. This section presents two current research projects at the Operating Systems and Middleware group: the Windows Monitoring Kerneliv (WMK) and the Kernel-mode Scheduling Serverv. The WMK is an efficient event-logging infrastructure—even more efficient than Event Tracing for Windows—that is more appropriate for analyzing operating system behavior than other existing instrumentation methods. The Windows Soft Real-Time Scheduler enables the WRK to manually assign a portion of the CPU for a certain set of threads.
The Windows Monitoring Kernel
Monitoring the kernel while it is active greatly facilitates understanding the details of the WRK. The Windows kernel itself is fully equipped with instrumentation facilities: Windows Management Instrumentation, Event Tracing for Windows, and Windows performance counters. However, after reviewing each of these, HPI faculty drew the conclusion that, for the purpose of analyzing specific core details of the kernel, all of these great tools are inappropriate. The following requirements had to be met:
- Applicable at a very early stage in the boot process
- Applicable at any interrupt request level (IRQL)
- Low in overhead
- Extensible with respect to the number of supported events
- Equipped with comprehensive visualization and analysis tools
- The search for the right performance monitor finally resulted in the Windows Monitoring Kernel, a version of the WRK that is extended by an efficient event-logging infrastructure, as well as a set of powerful analysis tools. The characteristics of the WMK are:
- Averages 32 bytes per event
- Handles up to 60,000 events per second
- Averages 6 percent performance loss compared to the original WRK
- Supports 12 or more types of events, spread throughout different code locations
- Supports custom driver/library–generated events
The WMK has been used several times for evaluating certain aspects of the kernel. For instance, HPI measured the amount of time a thread can continuously be run before it is preempted by the operating system. Interestingly, it learned that most threads never get the chance to completely consume their quantum. Also, HPI was able to monitor synchronization dependencies between threads of applications, indicating possible bottlenecks due to lock contention.
Kernel-Mode Scheduling Server
The idea of the scheduling server was developed in context of the SONiC system, where a regular workstation, used interactively by users, was to be integrated into a cluster of workstations within a distributed computing facility. The scheduling server was supposed to guarantee that the background processes use only a specified number of CPU cycles.
A scheduling server implementation utilizes the different priorities of operating system scheduling entities. Most operating systems use a priority-based, round-robin scheduling approach. Each unit of scheduling (for example, a thread on a Windows-based system) has an assigned priority value. The scheduler always chooses the (runnable) activity with the highest priority to run next.
This scheduling approach can be utilized by the scheduling server to manage a set of assigned tasks independently of the kernel scheduler: A scheduling server dispatcher thread runs on the system's highest priority level and boosts the controlled tasks to the second-highest priority level. By controlling the run time of the tasks at the highest priority level, and by suspending the task for a certain time, the scheduling server gives a specific share of CPU cycles to each controlled thread.
In a kernel-mode implementation, the dispatcher thread can be removed by integrating the dispatching algorithm into the operating system scheduler code. Each Windows thread has an assigned quantum of time that defines the time the specific thread can reside in the running state. When a clock interrupt occurs, the quantum value is decremented. When the quantum value reaches zero, a software interrupt is issued, indicating the end of the quantum. This interrupt results in the execution of a Deferred Procedure Call (DPC), which schedules another thread for execution. HPI modified this implementation by adding another counter, the partitioning clock that counts, for each thread under scheduling server control, the amount of time it may run. When this partitioning clock reaches zero, the same DPC is scheduled as if a quantum has been exhausted. The overhead of this implementation arises only from two additional clock decrement instructions in the clock interrupt handler.
About the Course Authors
Andreas Polze is the Operating Systems and Middleware Professor at the Hasso-Plattner-Institut for Software Systems Engineering at University of Potsdam, Germany. He received a doctoral degree from Freie University Berlin, Germany, in 1994 and a habilitation degree from Humboldt University Berlin in 2001, both in Computer Science. His habilitation thesis investigates Predictable Computing in Multicomputer Systems.
His current research interests include Interconnecting Middleware and Embedded Systems, Mobility and Adaptive System Configuration, and End-to-End Service Availability for standard middleware platforms. Andreas Polze has authored more than 50 papers in scientific journals and conference proceedings and has contributed to five books.
Alexander Schmidt studied computer science at the Chemnitz University of Technology in Germany, where he graduated and received his diploma. In 2006, Alexander Schmidt joined the Operating Systems and Middleware group at HPI as a Ph.D. student focusing on the area of monitoring in the operating system context. He is currently engaged in the Windows Research Kernel project hosted at HPI and contributes to the Windows Monitoring Kernel, an efficient event-logging infrastructure for monitoring arbitrary applications based on Windows systems.
Michael Schöbel studied Software Systems Engineering at HPI, where he received a M.Sc. degree. In 2005, he joined the HPI research school on service-oriented system engineering. His current research focuses on monitoring, dynamic program analysis, and operating system support for service-based systems. New concepts and abstractions are implemented in the context of the Windows Research Kernel.
Windows Operating Systems Internals Curriculum Resource Kit (CRK)
The CRK is a collection of instructional material that follows the ACM/IEEE-CS Operating System Body of Knowledge to illustrate operating systems concepts using Windows XP as a case study. The CRK is based on the book Microsoft Windows Internals and adapted for university use by the book's authors, Mark Russinovich and David Solomon, and Professor Andreas Polze of the University of Potsdam, Germany.
The experiments, quizzes, and assignments integrated with the course materials were tested over the five years in Professor Polze's operating systems architecture class, as described in a paper presented at the SIGCSE 2006 Technical Symposium on Computer Science Education. Universities can download the CRK core units from the Faculty Connection Resource Center and use them to supplement lectures on operating systems.
Windows Research Kernel (WRK)
The WRK contains the bulk of the source code for the core Windows (NTOS) kernel (compatible with Windows Server 2003 Service Pack 1 [SP1] and Windows XP for x86-based and x64-based computers). It includes the core sources for object management, processes, threads, virtual memory, and more. Other modules are provided as binary objects that can be linked to produce a fully functional NTOS executable file and booted using Windows Server 2003 SP1 or Windows XP Professional x64 Edition. The current WRK release (version 1.2) is a kit with a build environment, Microsoft Virtual PC 2007, and documentation.
iii Windows Research Kernel @ HPI: www.dcl.hpi.uni-potsdam.de/research/WRK
iv Schmidt, A. and Schöbel, M. "Analyzing System Behavior: How the Operating System Can Help," in 1st Workshop on Applied Program Analysis Bremen, Germany, September 2007.
v Schöbel, M. and Polze, A. "Kernel-Mode Scheduling Server for CPU Partitioning: A Case Study Using the Windows Research Kernel," in Proceedings of the 23rd ACM Symposium on Applied Computing, Fortaleza, Brazil. March 2008.
This case study is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS SUMMARY.
Document published June 2009