
Optimizing Windows Media Services

Abstract
Microsoft Windows Media Services 9 Series is the distribution component of the Windows Media 9 Series platform. This article describes performance and scalability tests that were conducted on a streaming media system in a controlled lab environment. Tuning recommendations are included to help you to achieve optimal performance from your Windows Media server.

 

Alexandre Ferreira
Microsoft Corporation
March 2005
 

Applies to:
   Microsoft® Windows Media® Services 9 Series
 

Introduction

This article provides a technical overview of the performance and scalability of Microsoft Windows Media Services 9 Series (including the Windows Media Services update released with Windows Server 2003 Service Pack 1). It describes common Windows Media Services performance issues, limitations, and performance monitoring techniques. It also presents the results from a set of performance tests conducted in a controlled lab environment.

It is recommended that you use the information presented in this document as a guideline. The performance results are based on specific hardware configurations that represent simplified versions of real-world scenarios. The actual capacity of your streaming media system depends on several factors, including network topology, user utilization patterns, hardware configuration, and software configuration. Based on the guidelines and performance information in this article, you should be able to design, fine-tune, and maximize the capacity of your servers to achieve the best results for your individual situation.



Basic Guidelines

To guarantee the best experience for your users, consider the following basic guidelines:
  • Limit the total number of users to 50 percent of the maximum user capacity achieved in your load tests.
  • Ensure that overall network utilization is less than 50 percent of the maximum network interface capacity or less than 50 percent of the maximum throughput capacity of the most common bit rate.
  • Ensure that your server has enough available memory to perform at the desired performance levels.
  • Whenever possible, use a dedicated computer for streaming. Avoid running additional CPU-intensive services, such as Internet Information Services (IIS) or SQL Server, on the server you use when streaming content.
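The 50-percent guidelines above can be sketched as simple helper functions. This is an illustrative sketch only; the function names are hypothetical, and the inputs should come from your own load tests.

```python
# Sketch of the 50-percent safety guidelines. All numbers passed in are
# hypothetical examples; substitute results from your own load tests.

def safe_user_limit(max_users_from_load_test: int) -> int:
    """Cap concurrent users at 50 percent of the tested maximum."""
    return max_users_from_load_test // 2

def network_within_guideline(current_mbps: float, nic_capacity_mbps: float) -> bool:
    """True if utilization stays below 50 percent of interface capacity."""
    return current_mbps < 0.5 * nic_capacity_mbps

# Example: a server that sustained 10,000 users in load tests
print(safe_user_limit(10_000))              # 5000
print(network_within_guideline(430, 1000))  # True: 43% of a 1 Gbps NIC
```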



Windows Media Services 4.1 vs. Windows Media Services 9 Series

Windows Media Services 9 Series provides significant performance improvements when compared to the previous version, Windows Media Services 4.1. The following list contains some of the features of Windows Media Services 9 Series that contribute to these performance gains:
  • New object model and extensible plug-in architecture.
  • Improved I/O and threading model.
  • Improved user experience provided by Fast Streaming. Fast Streaming comprises four components: Fast Start, Fast Cache, Fast Reconnect, and Fast Recovery.
  • Fewer disk-seek operations due to the retrieval of larger blocks of data.
  • Increased number of simultaneous users in common streaming scenarios.
  • In-memory caching of frequently accessed content using the file-buffering capabilities of the Windows Server 2003 operating system.
  • Higher packet recovery rates over User Datagram Protocol (UDP) transport.
  • Support for Real Time Streaming Protocol (RTSP).
  • An improved load simulation tool (Windows Media Load Simulator).

The following charts summarize the performance gains of Windows Media Services 9 Series in Windows Server 2003 compared to Windows Media Services 4.1 in Windows 2000 Server. The first pair of columns presents a comparison between the maximum number of broadcast users in both platforms. The second pair of columns presents a comparison between the maximum number of on-demand users sourcing from a hardware RAID 0 set of three Ultra SCSI 3 15000 rpm disks. The third pair of columns presents a comparison between the maximum number of on-demand users sourcing from a single Ultra SCSI 3 15000 rpm disk.

Modem/Dial-up

Chart showing version comparison for modem/dial-up connections

DSL/Broadband

Chart showing version comparison for DSL/broadband connections

Intranet

Chart showing version comparison for intranet connections



Bottlenecks

Bottlenecks occur when one part of a process or activity is slower than or impedes the other parts, thus hindering the overall progress or performance. The most common bottlenecks in a Windows Media Services system are processor (CPU) capacity, data source throughput, outgoing network bandwidth, and system memory. CPU utilization and system memory bottlenecks are usually easier to identify than data source throughput and network bandwidth limitations.

It is crucial that you identify, remove, or minimize the effects of all bottlenecks in your system to maximize the Windows Media server capacity and improve the end-user experience.

Processor (CPU) Capacity

System administrators tend to believe that their systems are healthy as long as the CPU utilization does not reach 100 percent. Unfortunately, this assumption is not always true. For example, there are several cases in which the server is not able to accept extra loads, even though the CPU utilization level is very low.

There are other types of bottlenecks that are not caused by high CPU utilization. These bottlenecks may significantly degrade the end-user experience without affecting the server CPU utilization level. In general, system memory bottlenecks may cause the CPU utilization to approach 100 percent due to memory-paging operations. Network bandwidth and data source throughput bottlenecks, on the other hand, usually do not affect the average CPU utilization level.

You can monitor CPU utilization by using several different programs and utilities, including:
  • The Windows Media Services snap-in for Microsoft Management Console (MMC). This snap-in provides system CPU utilization information on the Monitor tab in the details pane.
  • Windows Task Manager.
  • Performance Monitor. The \Processor(*)\% Processor Time counter provides more details about CPU utilization, such as charts and history information.
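For scripted monitoring, one common approach is to sample the \Processor(*)\% Processor Time counter with the Windows `typeperf` utility and parse its CSV output (a quoted timestamp column followed by one quoted value per counter). The sketch below parses one such row; the sample line is illustrative, not captured output.

```python
import csv
import io

# typeperf writes CSV rows: a quoted timestamp followed by one quoted
# value per requested counter. This sample row is illustrative output
# for "\Processor(_Total)\% Processor Time".
sample = '"03/01/2005 10:15:00.000","23.456"\n'

def parse_typeperf_row(line: str) -> float:
    """Return the first counter value from one typeperf CSV data row."""
    row = next(csv.reader(io.StringIO(line)))
    return float(row[1])

print(parse_typeperf_row(sample))  # 23.456
```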

It is very important to determine the correct hardware requirements, capacity, and CPU power for your streaming media system. The average CPU utilization depends primarily on the operations that users perform, such as connecting to a stream, streaming content, changing between playlist entries, fast-forwarding, seeking, or submitting log entries.

As a general rule, average CPU utilization should not exceed 25 percent, and the number of concurrent users should remain below 50 percent of the maximum server capacity at the specific bit rate. While this guideline might seem conservative, it is derived from the fact that the CPU usage for internal server operations varies significantly. For example, consider a server broadcasting to a few thousand steady clients. The average CPU utilization, depending on the bit rate and server used, easily stays below 20 percent. If a few hundred clients attempt to connect to the server or switch to different playlist entries at the same time, the server CPU utilization may spike to high levels for a few seconds. In general, streaming to multiple clients requires less server CPU usage than processing individual client requests. Consequently, keeping the average CPU load below 25 percent does not cause a significant reduction in the maximum number of streaming users. The remaining 75 percent of the server's CPU capacity is available to process more CPU-intensive user requests, minimizing response times and maximizing the user experience. Fast Streaming and seek operations are two examples of CPU-intensive actions that can benefit from additional idle CPU cycles.

If you find that the average CPU utilization of your server becomes temporarily higher than the recommended value, then:
  • Avoid real-time server-side playlist manipulation.
  • Reduce the Connection rate (per second) limit.
  • Reduce the Fast Start bandwidth per player connection limit.

On a permanent basis, consider increasing your hardware capacity to decrease the average CPU utilization level and provide a higher quality of service to your users.

Number of Processors

Windows Media Services 9 Series relies mostly on I/O operations. Therefore, increasing the number of processors does not necessarily increase the server's processing power. Other factors, such as internal bus layout, bus speed, network interface bus location, interrupt-handling distribution among different processors, and data source throughput capacity, have a significant effect on the overall performance.

When Windows Media servers use 1 gigabit per second (Gbps) network adapters, scalability tests show that dual-physical-processor systems provide optimal results in the majority of cases. When servers use 100 megabits per second (Mbps) network adapters, tests show that single-processor servers can handle the load in the majority of cases. It is recommended that you use computers with four or more processors when streaming to wireless networks and using CPU-intensive plug-ins.

The following charts show the maximum number of 22 kilobits per second (Kbps) and 300 Kbps RTSPU streams that Windows Media Services can serve when running on computers with one, two, four, and eight processors enabled. Roughly, as the number of processors increases, the number of connected users grows less than proportionally. This behavior is caused mainly by I/O resource limitations. Due to I/O constraints, the additional processing power is not fully utilized. See Appendix A: Lab Setup Description for specific hardware details about the eight-processor server used in this test (reference hardware S3).

Chart showing maximum number of 22 Kbps streams

Chart showing maximum number of 300 Kbps streams

Follow these guidelines to achieve optimal server performance with regard to processor capacity:
  • Limit the average CPU utilization level to be less than or equal to 25 percent of the total processor capacity.
  • Avoid running CPU-intensive operations while streaming content to multiple clients.
  • Quit as many programs as possible, including the Windows Media Services snap-in for MMC, if your CPU usage is above the normal utilization level.

Data Source Throughput

Windows Media Services 9 Series supports streaming digital media from various sources through built-in and non-Microsoft data source plug-ins. The default installation of Windows Media Services includes a set of plug-ins that provide access to content over a network (streamed from other Windows Media servers or encoders), HTTP download, and file data sources (stored on local or remote file systems).

To guarantee a good user experience, regardless of the data source, ensure that the connection between the server and the data source can sustain the required data delivery rate. Data sources can be as simple as streams stored on the local hard disk. More complex data sources include streams received from distribution servers or streams stored in Network Attached Storage (NAS) devices and Storage Area Networks (SAN) infrastructures. Depending on the way in which the data source is connected to the server (architecture, drivers, protocols, and so on), the maximum throughput and effect on the server CPU utilization varies significantly. Specific performance results and comparisons between the various solutions are outside the scope of this article. Check with your storage solution vendor to determine the maximum sustainable throughput and the resulting CPU utilization impact.

The publishing point type is a key factor in determining your data source throughput requirements. Broadcast publishing points are less data source-intensive than on-demand publishing points. At any given moment, the users connected to a broadcast publishing point all receive copies of the same piece of data. In general, a broadcast publishing point retrieves one instance of data from the source, and then splits and sends it to multiple users. In contrast, on-demand publishing points require distinct data reads for every client connection, resulting in increased data source load.

There are cases in which several on-demand users stream content from a few very popular files. To improve performance in this scenario, the built-in WMS File Data Source plug-in takes advantage of the file-buffering capabilities of Windows Server 2003. This way, if several on-demand users access the same stream, Windows Media Services retrieves data from the server's cache instead of from the original data source. This behavior helps to reduce the data source throughput requirements significantly. By default, Windows Media Services takes advantage of the file buffering capabilities when sourcing from the local NTFS or FAT file system. File buffering support for other types of storage systems depends on the driver implementation. Check with your storage solution vendor to determine if their solution supports the Windows Server 2003 file buffering feature.
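The effect of file buffering described above reduces to simple arithmetic: the rate delivered to players grows with the number of clients, while the back-end read rate grows only with the number of distinct files being streamed (assuming the hot files fit in the server's cache). The sketch below models this under that assumption; real cache behavior varies with available memory.

```python
# Back-of-the-envelope model of the file-buffering effect. Assumes each
# distinct file is read from the data source roughly once, with cache
# hits serving every subsequent client -- an idealized assumption.

def serve_rate_kbps(clients: int, bit_rate_kbps: int) -> int:
    """Aggregate rate the server must deliver to players."""
    return clients * bit_rate_kbps

def disk_read_rate_kbps(distinct_files: int, bit_rate_kbps: int) -> int:
    """Approximate back-end read rate when all hot files fit in cache."""
    return distinct_files * bit_rate_kbps

# 5,000 clients all streaming the same 300 Kbps file:
print(serve_rate_kbps(5_000, 300))   # 1500000 Kbps out to players
print(disk_read_rate_kbps(1, 300))   # only ~300 Kbps read from disk
```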

The following diagrams show a comparison between the PhysicalDisk(_Total)\Disk Read Bytes/sec and Windows Media Services\Current File Read Rate (Kbps) counters (both scaled to same unit), for 22 Kbps and 300 Kbps streams that use the RTSPT protocol. These diagrams illustrate an ideal scenario in which all users receive an on-demand stream from a single piece of content. In this scenario, the total amount of information that Windows Media Services reads from the built-in WMS File Data Source plug-in grows linearly while the amount of information actually retrieved from the disk remains roughly constant and very low.

Diagram showing comparison of 22 Kbps streams that use RTSPT

Diagram showing comparison of 300 Kbps streams that use RTSPT

You can view several performance counters to diagnose the status of the different data source layers. The primary counter used to evaluate data source bottlenecks is the \Windows Media Services\Current Late Read Rate counter in Performance Monitor, which indicates the current number of read operations that were completed with delays. The counter is available at both the server level and the publishing point level; use it at the publishing point level to determine which specific publishing point is experiencing read delays.

If the values for these counters are higher than zero for an extended period of time, it is an indication that one or more publishing points are experiencing data source throughput problems. In such cases, you can use the \Windows Media Services\Current File Read Rate (Kbps) and \Windows Media Services\Current Incoming Bandwidth (Kbps) counters to identify the incoming data rate. Depending on the server's data source, you can use specific counters such as Local File System\Physical Disks, Network Data Source\Network Interface and Remote File Systems\NTB connections to identify the current level at which those interfaces are operating. If you find that one or more interfaces used by the data source connection are responsible for the system bottleneck, you may consider upgrading them, adding more resources, or distributing the load among different servers.

Data sources can be split into three groups: locally cached (in memory), remotely cached, and non-cached.

Locally cached (in memory) environments occur when the majority of users stream a very small set of content. Adding more clients to locally cached environments usually results in increased memory usage and CPU utilization. Depending on the data source used, such an increase usually does not cause late reads. Instead, the server CPU utilization reaches a high level, and the \Windows Media Services\Current Late Send Rate counter increases above zero. The result is a typical case of CPU bottleneck, a topic covered in more detail in the following sections. A common example of the locally cached scenario is a situation in which the majority of users stream a single piece of content stored in the local file system.

Remotely cached or non-cached environments occur when multiple users access multiple pieces of content. In this scenario, regardless of whether content is cached remotely or not cached at all, the late reads occur when the connection between the server and the data source reaches a throughput limit or the data source capacity is exhausted. NAS and SAN infrastructures are good examples of remotely cached environments in which the maximum throughput is determined by the network or proprietary interface capacity. An example of a non-cached environment occurs when users stream multiple pieces of content stored in the local file system. In this case, the maximum throughput typically is determined by the disk or disk interface capacity.

In some cases, the incoming data source connection and outgoing streaming traffic share the same network interface. This configuration is not recommended because the resulting throughput may overwhelm the interface more quickly.

All on-demand performance tests presented in this document were conducted using single streams stored in the local NTFS file system. This locally cached approach was chosen to demonstrate the maximum capacity of Windows Media Services independent of any data source hardware limitations. The capacity of Windows Media Services in on-demand cache-miss scenarios depends on the data source's hardware configuration and throughput capacity. As client distribution increases among multiple streams, there are more dependencies on data source capacity due to the reduction of cache hits. Appendix B: Test Profiles contains a list of performance test results. Using the reference hardware S1 and S2, Windows Media Services was able to deliver 22,000 22 Kbps on-demand streams at a throughput rate of approximately 470 megabits per second (Mbps). Similar tests indicate that Windows Media Services can support 990 1 Mbps on-demand streams at approximately 970 Mbps.
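The aggregate throughput behind those figures is a straightforward product of stream count and bit rate. The sketch below is payload arithmetic only; protocol and packaging overhead account for the difference between this estimate and the measured wire rates quoted above.

```python
# Rough payload arithmetic behind the measured throughput figures.
# Protocol overhead means the measured wire rate differs somewhat.

def aggregate_mbps(streams: int, bit_rate_kbps: float) -> float:
    """Aggregate payload throughput in Mbps for N identical streams."""
    return streams * bit_rate_kbps / 1000.0

print(aggregate_mbps(22_000, 22))  # 484.0 -> consistent with ~470 Mbps measured
print(aggregate_mbps(990, 1_000))  # 990.0 -> consistent with ~970 Mbps measured
```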

The following guidelines may help you minimize the effects of bottlenecks in data source constrained systems:
  • Avoid storing the content on the hard disk on which the operating system is installed.
  • Disable page-file operations on the hard disk on which the content is stored.
  • Whenever possible, store content on the local hard disks or in storage systems that support Windows Server 2003 file buffering.
  • In a multiple server environment, replicate content across the local file systems of all servers, if the amount of data and frequency of updates is not prohibitive. If replicating data is not a cost-effective solution, partition the data and traffic across multiple servers.
  • When storing content remotely, avoid using the same connection for incoming and outgoing traffic. Use dedicated interfaces for each task.
  • Do not retrieve content from HTTP sources. Use HTTP for dynamic playlist processing and retrieval only.

Enhanced trick mode

The Windows Media Services update released with Windows Server 2003 Service Pack 1 (SP1) includes the Advanced FF/RW Media Parser plug-in, which helps mitigate data source bottlenecks that can occur when clients send a fast-forward or fast-rewind ("trick mode") request. The plug-in can be especially useful in high-bit-rate streaming scenarios, such as on-demand video delivery in an IPTV system. To use the plug-in, the content provider hosts files encoded at multiple FF/RW rates. These files are then accessed by clients when sending a trick mode request. For more information, see Using enhanced trick mode.

Consider the case where a user plays at normal speed for 60 seconds, plays in fast-forward for 60 seconds with a speed factor of 5, and then resumes normal playback for another 60 seconds. The following chart shows the I/O Read Bytes/sec of the WMServer.exe process for a single client performing this scenario.

Chart showing enhanced trick mode

Notice that the number of bytes read is substantially lower with the Advanced FF/RW plug-in. When a number of clients are using a fast play mode simultaneously, the plug-in can substantially reduce back-end data source traffic.
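The saving can be estimated with simple arithmetic: without pre-encoded FF content, a 5x fast-forward forces the server to read source data roughly five times faster, while a pre-encoded FF file is read at its own (typically normal) rate. The bit rates below are hypothetical, not figures from the test above.

```python
# Illustrative read-volume comparison for a 60-second fast-forward at 5x.
# Both bit rates are hypothetical assumptions, not measured values.

BIT_RATE_KBPS = 1_000   # assumed source bit rate
FF_FILE_KBPS = 1_000    # assumed bit rate of the pre-encoded FF file

def kbits_read_without_plugin(speed: int, seconds: int) -> int:
    """Fast play reads source data 'speed' times faster than normal."""
    return BIT_RATE_KBPS * speed * seconds

def kbits_read_with_plugin(seconds: int) -> int:
    """The pre-encoded FF file is read at its own rate."""
    return FF_FILE_KBPS * seconds

print(kbits_read_without_plugin(5, 60))  # 300000 Kbits during fast-forward
print(kbits_read_with_plugin(60))        # 60000 Kbits with the FF file
```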

Bandwidth

The two main aspects of bandwidth bottlenecks are the total capacity of the server network interfaces and the overall network capacity/topology between one or multiple servers and users. This article does not discuss the second aspect because most of the variables are outside the scope and control of the Windows Media Services configuration.

The two most common Ethernet network adapters have connection speeds of 100 Mbps and 1 Gbps. Depending on the hardware used, Windows Media Services can achieve up to 95 percent of the maximum capacity of network adapters.

Using a typical server configuration, Windows Media Services can reach the limit of a single 100 Mbps network adapter by using only a fraction of the CPU resources. Because of this characteristic, this article focuses only on performance results achieved using 1 Gbps network adapters. This article does not discuss server configurations with one or more 100 Mbps network adapters.

To maximize the overall throughput of Windows Media Services, it is recommended that you use 1 Gbps network adapters. The 1 Gbps network adapters yield impressive results that can reach single-server levels of up to 960 Mbps. When two or more 1 Gbps network adapters are used with reference hardware S1 and S2, the CPU usage becomes the limiting factor. Nevertheless, in performance tests, Windows Media Services achieved throughput levels higher than 1.3 Gbps when used with two 1 Gbps network adapters.

The \Windows Media Services\Current Late Send Rate counter is the most relevant performance counter when evaluating network bottlenecks. Windows Media Services reports a late send every time it sends out a packet at least one-half second later than the scheduled send time. Late sends could be the result of a lack of available bandwidth to fulfill all client requests, a lack of processor cycles to finish all scheduled operations in time, or late reads from the original data source. The second and third reasons are not directly related to network bottlenecks. To identify whether a network bottleneck is the cause of a late send, consider the following factors:
  • Late send spikes that last for five seconds or less usually indicate that the server was temporarily overloaded or was unable to retrieve data from the data source in time. The client-side buffer has enough data to sustain the playback until the server stream flow returns to normal. Common causes of temporary system overload are an influx of new client connections, a large number of simultaneous client requests (such as seek, fast-forward, and so on), and certain server-side playlist transitions during a broadcast. If the system overload is due to client activities, consider reducing the connection rate limit and reserving additional CPU capacity for processing client requests.
  • Sustained late sends that last longer than 10 seconds combined with low CPU utilization and no late reads usually indicate that there is an outbound network bottleneck. Check the \Windows Media Services\Current Player Allocated Bandwidth (Kbps) and \Windows Media Services\Current Player Send Rate (Kbps) performance counters to determine whether a network bottleneck is causing the late sends. A discrepancy between the allocated bandwidth and the actual player send rate indicates that clients are not receiving data quickly enough. Note that network bottlenecks may be caused by local interface constraints and/or by other remote bandwidth limitations that occur outside your network. Under such circumstances, you have the following options: increase your network interface and/or external network capacity, limit the maximum number of players, or add additional streaming servers. If you do not change the server configurations, the network bottleneck may affect the user playback quality.
  • Sustained late sends that last longer than 10 seconds combined with low CPU utilization and sustained late reads usually indicate that the server is not receiving data quickly enough from one or more data sources. This is a typical example of a data source bottleneck. Only users that are connected to publishing points affected by late reads will experience playback issues. Users that are connected to other publishing points may not experience any playback issues and may receive data at the normal rate.
  • Sustained late sends combined with very high CPU utilization, with or without late reads, usually indicate that one or more resources in the server have been exhausted. In this scenario, many users will experience playback problems. This is a typical example of a CPU or memory bottleneck. In this case, you should either decrease the maximum number of users or add additional servers to distribute the load.
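The four diagnosis rules above can be encoded as a small decision function. This is a sketch only: the 5-second and 10-second thresholds follow the text, but the "high CPU" flag and the return labels are assumptions you would adapt to your own monitoring setup.

```python
# Sketch encoding the late-send diagnosis rules from the list above.
# Thresholds follow the text; labels and inputs are illustrative.

def diagnose(late_send_secs: float, cpu_high: bool, late_reads: bool) -> str:
    if late_send_secs <= 5:
        return "transient overload"          # brief spike; client buffers absorb it
    if cpu_high:
        return "cpu-or-memory bottleneck"    # server resources exhausted
    if late_send_secs > 10 and late_reads:
        return "data source bottleneck"      # server starved by its sources
    if late_send_secs > 10:
        return "outbound network bottleneck" # bandwidth limit reached
    return "inconclusive"

print(diagnose(3, False, False))   # transient overload
print(diagnose(30, False, False))  # outbound network bottleneck
print(diagnose(30, False, True))   # data source bottleneck
print(diagnose(30, True, True))    # cpu-or-memory bottleneck
```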

The following guidelines may help you optimize server performance and avoid network-related issues:
  • Perform network load tests using Windows Media Load Simulator to determine the maximum capacity of your server and establish appropriate limits for your system.
  • Limit the aggregate player bandwidth to 50 percent of the maximum network interface capacity or 50 percent of the maximum aggregate bandwidth of the most common bit rate stream, whichever is lower. See Limits for more details. See Appendix B: Test Profiles for specific bit-rate aggregate bandwidth limits.
  • Use a 1 Gbps network adapter to maximize your server capacity, except in multicast scenarios.
  • Use dedicated network adapters for incoming and outgoing traffic if content is stored remotely.
  • Ensure that the network adapters and network infrastructure support full-duplex transfer.
  • If your network infrastructure permits, enable multicast streaming for broadcast publishing points.
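The aggregate bandwidth limit recommended above takes the lower of two 50-percent figures. The sketch below computes it; the NIC capacity and load-test result are hypothetical inputs.

```python
# Sketch of the aggregate player bandwidth limit: the lower of 50% of
# NIC capacity and 50% of the aggregate bandwidth sustainable at the
# most common bit rate. Inputs are hypothetical load-test results.

def player_bandwidth_limit_kbps(nic_capacity_kbps: int,
                                max_streams_at_bit_rate: int,
                                bit_rate_kbps: int) -> int:
    nic_half = nic_capacity_kbps // 2
    aggregate_half = (max_streams_at_bit_rate * bit_rate_kbps) // 2
    return min(nic_half, aggregate_half)

# 1 Gbps NIC; load tests sustained 2,800 streams at 300 Kbps:
print(player_bandwidth_limit_kbps(1_000_000, 2_800, 300))  # 420000
```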

System Memory

There are several utilities available for tracking system memory, including Windows Task Manager and the Performance Monitor \Process(WMServer)\Private Bytes counter. As a general guideline, the system memory should always exceed the amount of memory being used by the Windows Media Services process (WMServer.exe). Ideally, for broadcast scenarios, the Windows Media server should have at least 30 percent more memory than is required to perform at the target utilization level. For on-demand scenarios, it is recommended that the server have at least 50 percent more free memory than is required, since the operating system can use the additional memory for file buffering operations.

The average memory use per user depends on the publishing point type and the encoding settings used in the content, such as the bit rate, packet size, and the number of audio and video streams. See Appendix B: Test Profiles for more information about the average memory requirements per user. Depending on the usage pattern, target number of users, client distribution among different streams, and publishing point types, you should be able to estimate how much memory will be required for your streaming media system.

If your server has a large amount of physical memory, the additional memory can minimize the delays caused by memory-paging operations and increase your overall server performance. Given the nature of streaming media, Windows Media Services has a small window of time to send data to all connected users. In memory-constrained systems, memory-paging operations may cause unexpected delays while the server is processing, sending, or reading data from the sources. The resulting late sends usually affect the overall client experience. If the server has a large amount of available system memory, the operating system can maximize the use of file buffering and minimize the effects of data source throughput bottlenecks.

There are several factors that determine the server memory requirements. For example, streaming from broadcast publishing points requires less memory than streaming from on-demand publishing points. The connection protocols also affect the total amount of allocated memory. In general, Transmission Control Protocol (TCP)-based protocols require less system memory than protocols based on the User Datagram Protocol (UDP) because Windows Media Services stores extra information for UDP connections. Depending on the bit rate, Windows Media Services stores up to 10 seconds of sent data in the server memory to fulfill packet resend requests when UDP packets are lost during transmission. For more information about the protocols supported by Windows Media Services and the performance differences, see Protocols.

Depending on the bit rate, on-demand user connections require 3 to 10 times more memory than broadcast connections. In addition, on-demand connections that use UDP usually require 2 to 3 times more memory than TCP connections. In practice, a server with 4 gigabytes (GB) of RAM can support up to a maximum of 22,800 22 Kbps broadcast audio streams and 22,000 22 Kbps on-demand audio streams.
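The ratios above can be combined into a rough memory estimator. Note the heavy assumptions: the per-client baseline figure is hypothetical (not a measured value from this article), and the 3-10x and 2-3x ranges are applied at their midpoints.

```python
# Rough memory estimator built from the ratios in the text: on-demand
# uses 3-10x the memory of a broadcast connection, UDP roughly 2-3x TCP,
# plus up to 10 seconds of sent data buffered per UDP client for resends.
# BROADCAST_TCP_KB is a hypothetical baseline, not a measured figure.

BROADCAST_TCP_KB = 0.5   # assumed per-client baseline (illustrative)

def per_client_kb(on_demand: bool, udp: bool, bit_rate_kbps: int) -> float:
    kb = BROADCAST_TCP_KB
    if on_demand:
        kb *= 6                          # midpoint of the 3-10x range
    if udp:
        kb *= 2.5                        # midpoint of the 2-3x range
        kb += bit_rate_kbps * 10 / 8     # up to 10 s of sent data, in KB
    return kb

def server_memory_kb(clients: int, on_demand: bool, udp: bool,
                     bit_rate_kbps: int) -> float:
    headroom = 1.5 if on_demand else 1.3  # 50% / 30% headroom guidelines
    return clients * per_client_kb(on_demand, udp, bit_rate_kbps) * headroom

print(round(per_client_kb(False, False, 22), 2))  # 0.5
```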

Like other 32-bit applications, Windows Media Services has a memory limit that restricts the maximum server capacity. If the Windows Media Services process (WMServer.exe) memory utilization reaches 2 GB, see Server Namespace Settings for instructions on how to decrease the average memory utilization per user.

The following guidelines may help you optimize server performance:
  • Calculate the optimal amount of memory required for your system based on the target performance level and client distribution pattern.
  • Provide at least 30 percent more system memory than the target Windows Media Services memory utilization level for broadcast scenarios.
  • Exceed the target Windows Media Services memory utilization level for on-demand scenarios by at least 50 percent to maximize file buffering capabilities and minimize the effects of memory-paging operations.
  • Ensure that the Windows Media Services memory utilization level does not exceed the total amount of physical memory in the system.
  • Keep the Windows Media Services memory utilization level below 2 GB. If memory utilization reaches 2 GB, quit and restart Windows Media Services, or distribute the load across additional servers.



Performance Evaluation

This section explains how Windows Media Services 9 Series performed during a series of performance and stress tests. Included in this section are a list of test variables, an explanation of Windows Media Services limitations, and guidelines for maximizing your Windows Media server capacity. This section also includes side-by-side diagrams that show test results for 22 Kbps and 300 Kbps streams to illustrate Windows Media Services scalability in different test situations.

Protocols

Windows Media Services supports three different unicast streaming protocols: Real Time Streaming Protocol (RTSP), Hypertext Transfer Protocol (HTTP), and Microsoft Media Server protocol (MMS). Both RTSP and MMS can operate over a TCP or UDP transport. This document uses the following name conventions to refer to these transport methods: RTSPU (RTSP over UDP), RTSPT (RTSP over TCP), MMSU (MMS over UDP), and MMST (MMS over TCP). HTTP operates only over a TCP transport.

Windows Media Services also supports multicast streams. As the number of multicast users increases, there is no significant impact on server performance because clients connect to the stream and not to the server. Theoretically, a Windows Media server can handle an unlimited number of multicast users. Because multicast streams do not greatly impact server performance, multicast performance will not be discussed in this document.

In terms of performance, the protocols can be divided into two major categories: TCP-based protocols (RTSPT, HTTP, and MMST) and UDP-based protocols (RTSPU and MMSU). Performance results show that Windows Media Services can handle, on average, a higher number of streams using TCP-based protocols than streams using UDP-based protocols. The following diagrams illustrate the maximum number of 22 Kbps and 300 Kbps connections that Windows Media Services can support, depending on the protocol used.

Diagram showing maximum number of 22 Kbps streams by protocol

Diagram showing maximum number of 300 Kbps streams by protocol

When a client connects to Windows Media Services, the URL prefix determines what protocol Windows Media Services uses to stream content. If the URL has the default mms:// prefix, the player and server negotiate the best possible protocol automatically. Each protocol has advantages and limitations, and some protocols are more suitable than others in certain situations. For example, firewalls or proxy servers may block certain protocols from transmitting data. In cases like this, the player and server automatically attempt to re-establish the connection by using a protocol that can pass through the firewall or proxy. Therefore, it is highly recommended that you do not disable any of the protocols for performance reasons.

In a typical streaming media system deployment, you will support concurrent streams across all protocols. Therefore, you can determine the server capacity by factoring the client distribution over the average capacity of each specific protocol.
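One reasonable way to model this is a weighted harmonic combination of the per-protocol capacities: each protocol's share of clients consumes server capacity in proportion to its fraction of the population. The capacities and client mix below are hypothetical placeholders, standing in for your own load-test results.

```python
# Blended server capacity from per-protocol capacities and client mix.
# All numbers are illustrative; substitute your own load-test figures.

per_protocol_capacity = {   # max concurrent streams per protocol (from load tests)
    "RTSPT": 10000,
    "RTSPU": 7000,
    "HTTP": 9000,
}
client_mix = {              # expected fraction of clients per protocol
    "RTSPT": 0.5,
    "RTSPU": 0.2,
    "HTTP": 0.3,
}

# Server is saturated when sum(N * share_p / capacity_p) reaches 1,
# so the blended maximum N is the weighted harmonic combination:
capacity = 1 / sum(share / per_protocol_capacity[p]
                   for p, share in client_mix.items())
print(round(capacity))   # blended maximum number of concurrent clients
```

With these placeholder figures the blended capacity lands between the best and worst per-protocol numbers, weighted toward the protocols most of your clients actually use.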

The following protocol guidelines may help you optimize server performance:
  • Always use the default mms:// prefix for connection URLs. This enables the player and server to negotiate the best possible protocol.
  • Whenever possible, enable the HTTP protocol. This protocol is very useful when firewalls or proxy servers block the RTSP and MMS protocols.
  • Specify a TCP-based protocol, such as RTSPT or HTTP, to minimize data loss when connecting from server to server, such as in a distribution or cache/proxy scenario.

Bit Rates

The bit rate of the digital media content, usually measured in Kbps, affects Windows Media server performance and the maximum number of clients that can connect to a stream. You can determine the total amount of data flowing through the server by multiplying the bit rate of the content by the number of users connected to each stream. This aggregate bit rate directly affects the network throughput, CPU utilization level, memory "footprint," and data throughput. As the bit rate of the content increases, the maximum number of users decreases. The correlation between the two is not always linear. Different bottlenecks restrict the maximum server capacity at different content bit rate levels. For example, servers usually encounter system memory and CPU utilization bottlenecks while streaming content encoded at low bit rates, such as between 8 Kbps and 22 Kbps. On the other hand, the maximum number of users streaming high-bit-rate content, such as content above 500 Kbps, is usually limited by network throughput bottlenecks.
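The aggregate data rate described above is straightforward to compute and compare against the 50 percent network headroom rule from the Basic Guidelines. The stream populations and 1 Gbps interface below are illustrative, not measured values.

```python
# Aggregate throughput = sum over streams of (bit rate x connected users).
# Figures are hypothetical examples.

streams = [                   # (bit rate in Kbps, connected users)
    (22, 4000),               # low-bit-rate audio stream
    (300, 1500),              # mid-bit-rate video stream
]
aggregate_kbps = sum(rate * users for rate, users in streams)
print(aggregate_kbps)         # 538000 Kbps, i.e. 538 Mbps

nic_capacity_kbps = 1_000_000 # 1 Gbps network interface
utilization = aggregate_kbps / nic_capacity_kbps
print(f"{utilization:.0%}")   # 54% -- above the 50 percent guideline
```

In this example the server is already past the recommended utilization ceiling, so you would reduce the player connection limit or add capacity.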

Appendix B: Test Profiles presents a comprehensive set of performance test results that illustrate how a Windows Media server scales when streaming content with bit rates ranging from 22 Kbps to 1 Mbps. The following diagrams present a subset of that data for content streams with bit rates of 22 Kbps and 300 Kbps. These diagrams show scalability comparisons using the RTSPU and RTSPT protocols for the following counters: Processor(_Total)\% Processor Time, Process(WMServer)\Working Set, and Windows Media Services\Current Player Send Rate (Kbps).

Diagram showing scalability comparisons using RTSPT and RTSPU protocols with 22 Kbps streams

Diagram showing scalability comparisons using RTSPT and RTSPU protocols with 300 Kbps streams

Diagram showing scalability comparisons using RTSPT and RTSPU protocols with 22 Kbps streams

Diagram showing scalability comparisons using RTSPT and RTSPU protocols with 300 Kbps streams

Diagram showing scalability comparisons using RTSPT and RTSPU protocols with 22 Kbps streams

Diagram showing scalability comparisons using RTSPT and RTSPU protocols with 300 Kbps streams

Multiple-Bit-Rate Content

Content encoded at multiple bit rates results in a single file or stream that includes several combined audio, video, or script streams. Each individual stream usually has a different bit rate. When a user connects to a multiple-bit-rate (MBR) stream, the player plays back the most suitable set of streams depending on the content bit rate and the available bandwidth.

Consider an MBR stream scenario with three video streams (Stream A – 285 Kbps, Stream B – 135 Kbps, and Stream C - 20 Kbps) and one audio stream (Stream D – 15 Kbps). While connecting to the stream, users have the following range of options: A+D (300 Kbps), B+D (150 Kbps), C+D (35 Kbps) and D (15 Kbps). A user with a 56 Kbps modem, for example, will most likely view the C+D (audio + video) stream, while a user with a 28 Kbps modem would be limited to the D (audio only) stream.

Automatic speed detection and stream selection are the key advantages of using MBR streams. This feature provides a better user experience and does not require that you set up multiple publishing points for different stream bit rates. However, server performance may be affected when MBR streams are used due to increased memory utilization per user and increased data source load. This behavior occurs because the server must retrieve, allocate memory for, and process information for all the streams combined (A+B+C+D = 455 Kbps), regardless of the stream that is selected. Consequently, your system is more likely to encounter a memory or data source throughput bottleneck when streaming MBR content. When several users are connected to the same stream, the Windows Server 2003 file buffering feature can help to alleviate this limitation and can enable the server to handle a larger number of streams simultaneously.

An easy way to determine the overall server capacity when using MBR streaming is to account for the memory and data source bottlenecks for the sum of all streams combined.
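A back-of-the-envelope comparison makes the point, using the stream set from the example above. The 800 Mbps data-source read budget is a hypothetical figure chosen for illustration only.

```python
# Per-client cost of an MBR stream is the sum of all combined streams,
# regardless of which combination the player actually renders.
# The data-source budget below is a hypothetical figure.

mbr_streams_kbps = [285, 135, 20, 15]          # Streams A, B, C, and D
cost_per_client_kbps = sum(mbr_streams_kbps)   # 455 Kbps read per client

data_source_budget_kbps = 800_000              # hypothetical read budget
max_clients_mbr = data_source_budget_kbps // cost_per_client_kbps
max_clients_single = data_source_budget_kbps // 300   # single 300 Kbps file

print(cost_per_client_kbps)    # 455
print(max_clients_mbr)         # 1758 clients for the MBR file
print(max_clients_single)      # 2666 clients for a single-bit-rate file
```

Even when every client ends up watching the 300 Kbps combination, the MBR file supports noticeably fewer clients per data source than a single-bit-rate file would.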

Variable-Bit-Rate Content

Windows Media Services offers limited support for variable-bit-rate (VBR) streams. Windows Media Services uses the Fast Cache feature to deliver VBR streams at a constant bit rate that is high enough to avoid client-side re-buffering. Because VBR streaming support was introduced in the Windows Media 9 Series platform, VBR streaming can only be used if Windows Media Player 9 Series, Windows Media Services 9 Series, and the Windows Media 9 Series codecs (Windows Media Audio 9 Series and Windows Media Video 9 Series) are used in combination. As with MBR streaming, you can determine the overall server capacity when using VBR streaming by accounting for the limitations imposed by the actual constant bit rate Fast Cache uses to deliver data.

Content Encoding Settings

Encoding settings such as packet size, buffer size, and key frame interval are usually transparent to the server and to end users because these settings are configured during content creation. Even though Windows Media Services cannot control these settings, they still may affect the overall server capacity. Specific encoding recommendations are outside the scope of this document, but consider the following basic guidelines before you determine which encoding settings to use for content that will be streamed:
  • Whenever possible, encode the content with packet sizes smaller than 1,452 bytes. This way, when the packets are combined with header information, they can fit within a single Maximum Transmission Unit (MTU) frame (1,500 bytes). Note that this may not be required for RTSP connections since the server automatically resizes the packets.
  • Use a buffer size between two and four seconds to minimize open latency and seek response time. Note that reducing the buffer size might adversely affect the encoding quality.
  • When setting up the encoder for live content streaming, use key frame intervals that are less than eight seconds to minimize open latency during broadcasts.
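The 1,452-byte figure in the first guideline can be sanity-checked against standard header sizes. The split of the remaining bytes between IP, UDP, and streaming-protocol headers shown here is an inference from standard header sizes, not a breakdown taken from this article.

```python
# Check that a 1,452-byte media packet plus transport headers fits in
# one Ethernet MTU. IP and UDP header sizes are standard values.

MTU = 1500
IP_HEADER = 20        # IPv4 header without options
UDP_HEADER = 8
media_packet = 1452

headroom = MTU - IP_HEADER - UDP_HEADER - media_packet
print(headroom)       # 20 bytes left for streaming-protocol headers
assert media_packet + IP_HEADER + UDP_HEADER <= MTU
```

A packet at the 1,452-byte limit leaves a small budget for protocol headers; anything larger risks IP fragmentation on a standard 1,500-byte MTU link.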

The following are a few basic guidelines that may help you achieve optimal performance with regard to bit rates:
  • Use bit rates and codecs that best fit your user needs. For example, even if your users have high-speed connections, do not encode your content at 2 Mbps if a 500 Kbps bit rate satisfies your quality needs.
  • Use MBR encoding to maximize the user experience and minimize the setup complexity when clients access the same stream over a wide range of bit rates.

Live vs. On-Demand Streaming

Earlier we discussed the performance differences between broadcast (live) and on-demand streaming. Broadcast streams use fewer resources than on-demand streams because broadcast publishing points share memory and data source resources among multiple users, while on-demand publishing points must obtain memory and data source resources for each connected user.

The following charts illustrate the differences between broadcast publishing points and on-demand publishing points in terms of the Processor(_Total)\% Processor Time, Process(WMServer)\Working Set, and Windows Media Services\Current File Read Rate (Kbps) counters for 22 Kbps and 300 Kbps streams that use the RTSPU and RTSPT protocols.

Chart comparing on-demand and broadcast publishing points streaming 22 Kbps content

Table comparing on-demand and broadcast publishing points streaming 300 Kbps content

Table comparing on-demand and broadcast publishing points streaming 22 Kbps content

Table comparing on-demand and broadcast publishing points streaming 300 Kbps content

Chart comparing on-demand and broadcast publishing points streaming 22 Kbps content

Chart comparing on-demand and broadcast publishing points streaming 300 Kbps content

As shown in the previous diagrams, RTSPT scales better than RTSPU, and broadcast publishing points require fewer resources than on-demand publishing points. See Appendix B: Test Profiles for a comprehensive set of performance results, including details about maximum server capacity and recommended utilization levels. The matrix in Appendix B covers all protocols, common bit rates, and publishing point types. It also presents on-demand performance results achieved using modified namespace settings.

Fast Streaming

Fast Streaming provides an instant-on, always-on streaming experience by effectively eliminating buffering time and reducing the likelihood of an interruption in playback due to network conditions. Fast Streaming consists of the following four components:
  • Fast Cache provides a way to stream content to clients faster than the data rate specified by the stream.
  • Fast Start provides a way for clients to fill the initial buffer at speeds higher than the bit rate of the content requested.
  • Fast Recovery provides a way to recover lost or damaged data packets without the client having to request that the data be resent by the Windows Media server.
  • Fast Reconnect enables the client to reconnect to the server and restart streaming automatically after a temporary network outage.

During Fast Start, the server sends data to the client buffer at speeds higher than the requested content bit rate. This enables users to start receiving and rendering content quickly. The publishing point starts to stream the content at the defined bit rate only after the initial buffer requirement is satisfied. Consequently, the network usage is slightly higher when establishing a Fast Start connection than when establishing a regular connection. This difference is only noticeable when many users connect to the stream at the same time because the connection duration is very small compared to the total stream duration. To maximize the Fast Start experience for your users, follow the guidelines presented in this document, especially those related to network and CPU preservation.

To calculate the effects of Fast Start for a specific user scenario, use the following equation:

Buffering time (seconds) = (Content bit rate × Buffer duration) / Available bandwidth

Where:

Content bit rate is the encoded bit rate of the stream (Kbps), Buffer duration is the length of the initial client buffer (seconds), and Available bandwidth is the effective throughput of the client connection (Kbps). With conventional streaming, the buffer fills at the content bit rate, so the buffering time equals the buffer duration.

For example, consider an Internet radio stream sent over a 56 Kbps modem connection. Assuming that a 56 Kbps modem yields roughly 45 Kbps of throughput and that the audio content was encoded at 22 Kbps, the buffering time is reduced from 5 seconds (using conventional streaming) to about 2.4 seconds (using Fast Start). The buffering time for the same stream over a 700 Kbps DSL connection would be much smaller, around 160 milliseconds. In both cases, the user experience is better than it would be if Fast Start were not used.
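As a quick check, the following sketch applies the buffering-time equation to the modem and DSL examples above. The 45 Kbps and 700 Kbps throughput figures come from the paragraph itself; the function name is simply illustrative.

```python
def fast_start_buffering_time(content_kbps, buffer_seconds, link_kbps):
    """Seconds needed to fill the initial buffer when Fast Start sends
    data at the available link bandwidth rather than the content bit rate."""
    return content_kbps * buffer_seconds / link_kbps

# 22 Kbps audio with a 5-second buffer over ~45 Kbps of modem throughput:
print(round(fast_start_buffering_time(22, 5, 45), 1))        # 2.4 seconds

# The same stream over a 700 Kbps DSL connection, in milliseconds:
print(round(fast_start_buffering_time(22, 5, 700) * 1000))   # about 157 ms
```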

The following diagrams show the behavior of the Processor(_Total)\% Processor Time and Windows Media Services\Current Player Send Rate (Kbps) counters while users are connected to a 300 Kbps broadcast stream. Initially, no clients were connected to the stream. After 8 seconds, 180 clients connected simultaneously and remained connected; a second set of 180 clients connected about 20 seconds later.

As shown in the first diagram, regardless of whether Fast Start is used, a brief increase in the CPU utilization level occurs while the clients are connecting. The second diagram shows that the network utilization reached levels higher than normal for a small period of time before stabilizing at the expected throughput level. The spike in the network utilization usually lasts for a few seconds if several clients connect at the same time. The effect of a single connection is usually much smaller and lasts for less than one second.

Graph showing Fast Start comparison

Graph showing Fast Start comparison

Fast Cache provides a way to deliver content to clients faster than the data rate specified by the stream format. For example, when Fast Cache is enabled, the server can transmit a 100 Kbps stream at 500 Kbps. Windows Media Player still renders the stream at the specified data rate, but the Player can buffer a much larger portion of the content before rendering it. Fast Cache only works with on-demand streams. Using Fast Cache affects performance and reduces the maximum number of simultaneous users because data source and network bandwidth utilization increase. From a performance point of view, a 100 Kbps stream that is transmitted at 500 Kbps with Fast Cache requires almost as many system and network resources as a 500 Kbps stream that does not use Fast Cache.

Although Fast Cache may require higher resource usage per user, this usage is diminished over time because a Fast Cache client connection lasts for a fraction of the time of a non-Fast Cache connection. For example, a two-minute, 100 Kbps clip streaming at 500 Kbps can be transferred in about 24 seconds, while the same clip streaming in real time would require two minutes of network usage. Over a certain period, the total amount of data streamed from your server is the same regardless of whether Fast Cache is enabled. Nevertheless, it is highly recommended that you enable Fast Cache, because it provides a better user experience by making the connection more tolerant of network bandwidth fluctuations.
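The trade-off can be quantified with the two-minute, 100 Kbps example, delivered at five times the content bit rate. The figures below are the article's own; only the variable names are ours.

```python
clip_seconds = 120            # two-minute clip
content_kbps = 100
fast_cache_multiplier = 5     # server delivers at 500 Kbps

# Connection duration shrinks by the delivery multiplier...
transfer_seconds = clip_seconds / fast_cache_multiplier
print(transfer_seconds)       # 24.0 seconds instead of 120

# ...but the total amount of data streamed is unchanged.
total_kbits = clip_seconds * content_kbps
print(total_kbits)            # 12000 Kbits either way
```

This is why Fast Cache raises peak bandwidth and data source load without changing the total data served over time.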

The following diagrams illustrate the difference between 100 clients streaming a two-minute, 100 Kbps clip without Fast Cache versus 100 clients streaming the same clip with Fast Cache at five times the original speed. As shown in the first diagram, the difference in CPU utilization (\Processor(_Total)\% Processor Time) with a small number of users is insignificant. The second diagram shows that when Fast Cache is enabled, the aggregate throughput (\Windows Media Services\Current Player Send Rate (Kbps)) is roughly five times higher and lasts only 20 percent of the time of the stream without Fast Cache. The third diagram shows that clients (\Windows Media Services\Current Streaming Players) using Fast Cache finish streaming the content and disconnect from the server more quickly.

Graph showing Fast Cache comparison with % Processor Time

Graph showing Fast Cache comparison with Current Player Send Rate

Graph showing Fast Cache comparison with Current Streaming Players

The following guidelines may help you optimize performance when using Fast Streaming:
  • Do not change the default Fast Streaming settings. They are configured to provide the best possible client experience.
  • Increase the default Fast Start bandwidth per player (Kbps) limit when streaming high-bit-rate content (700 Kbps and above) in a LAN environment. Use five times the target content bit rate as a baseline value.

Wireless

You can use the Windows Media Services Fast Recovery feature to improve end user experience in wireless streaming scenarios. To use Fast Recovery, you must enable forward error correction (FEC) on a publishing point. FEC is a common method of preserving the integrity of data transmitted over unreliable or slow network connections. When FEC is enabled, the server sends additional data that can be used to rebuild any packets that might be lost before they reach the client. These redundant packets enable the client to reconstruct the original transmission even if a significant number of packets are missing. The server creates these extra packets based on parameters submitted by clients during the connection process. Wireless support only works when the client connects to the server using the RTSPU protocol.

Wireless streaming with FEC can affect CPU performance, memory utilization, and network utilization. The amount of overhead imposed depends on the parameters submitted by the clients. Tests using the default server namespace configuration settings have shown that a high-recovery-rate FEC setting of 50 percent (for example, WMFecSpan=4, WMFecPktsPerSpan=2, and WMThinning=0) has the following effects on system resources:
  • The maximum number of clients is reduced by more than 40 percent.
  • The network utilization increases by at least 50 percent.
  • The average memory utilization per client increases between 30 and 70 percent.

Increased resource utilization only happens if the client specifies FEC parameters during the connection. Simply enabling wireless support at the publishing point level does not cause any significant performance degradation. See Appendix B: Test Profiles for more details about the server performance for typical FEC bit rates (32 Kbps, 64 Kbps, and 128 Kbps).
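The 50 percent figure can be related to the cited parameters. The interpretation below, that each span of WMFecSpan data packets carries WMFecPktsPerSpan redundant packets, is an assumption based on the parameter names, not a formula documented in this article.

```python
# Rough FEC packet-overhead model for the setting cited above.
# Parameter semantics are assumed from the parameter names.

def fec_overhead(span, pkts_per_span):
    """Fraction of extra packets added on top of the original data."""
    return pkts_per_span / span

overhead = fec_overhead(span=4, pkts_per_span=2)
print(f"{overhead:.0%}")   # 50% -- consistent with the setting above
```

Under this model the roughly 50 percent increase in network utilization reported in the tests follows directly from the packet overhead.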

The following guidelines may help you optimize server performance when streaming over wireless networks:
  • Enable wireless support only when the client connections require FEC.
  • Determine the lowest-possible FEC settings to guarantee the necessary recovery rate, and configure the settings accordingly. Using settings that are higher than required may place unnecessary demands on the system.
  • Use high-performance servers with four or more processors to support the increased CPU and memory utilization requirements for wireless streaming.
  • If all the clients connecting to your server are going to use FEC, disable UDP resends by reducing the MaxResendBufferSizeInMSecs namespace setting to zero. This change reduces the amount of memory allocated per user. For more information, see Server Namespace Settings.

Playlists

Playlists provide a way of organizing multiple pieces of digital media content into a single list. Windows Media Services supports both client-side playlists and server-side playlists. Client-side playlists are usually created by a player or by Web scripts and are saved as Windows Media metafiles with an .asx file name extension. Server-side playlists are usually created by content producers, server administrators, or Web page scripts and are saved as Windows Media metafiles with a .wsx file name extension.

Both types of playlists can affect server performance and resource utilization. The most relevant impact on the server occurs when a client transitions between playlist elements. Windows Media Services and the client must negotiate new stream settings every time the playlist switches from one element to the next. The cost of such operations can be quantified as a subset of a full client connection. While the server streams a playlist entry itself, there are no noticeable differences in server performance.

When streaming playlists, there is a strong correlation between publishing point type and the server's capacity. When playlists are streamed from an on-demand publishing point, client operations are distributed over time. Because it is unlikely that many clients will connect to the publishing point at the same time, resource-intensive operations like playlist transitions also happen at different points in time. When playlists are streamed from a broadcast publishing point, however, transitions typically occur simultaneously. As a result, the server CPU utilization may increase significantly during a broadcast because of the additional overhead required to process multiple client requests related to the playlist transition.

The following diagrams illustrate the effects in the \Processor(_Total)\% Processor Time, System\Context Switches/sec and \Windows Media Services\Current Player Send Rate (Kbps) performance counters during a playlist transition. In this example, the stream was delivered to 1,000 broadcast clients by using the RTSPT protocol. The diagrams show two transitions between three 22 Kbps, 60-second playlist elements. Other protocols demonstrate similar behavior.

Graph showing playlist transition effects on performance monitors

Graph showing playlist transition effects on performance monitors

Graph showing playlist transition effects on performance monitors

The following guidelines may help you optimize the server when using client-side or server-side playlists:
  • Whenever possible, use small server-side playlists (50 elements or fewer) for on-demand publishing points to decrease the amount of memory allocated per user.
  • If you plan to use a playlist that contains multiple elements during a broadcast, limit the total number of users to 20 percent or less of the maximum number of users for a certain bit rate.
  • Store static playlist (.wsx) files in your local file system to minimize client connection delays that would occur if the server had to retrieve these files from HTTP sources.

Limits

You can use limits to specify the performance boundaries for your Windows Media server. By adjusting the limit values, you can ensure that your transmission does not exceed the capabilities of your server, network, or audience. It is highly recommended that you evaluate the capacity of your system and set appropriate server limits before deploying it to a production environment. You can specify limits at both server and publishing point levels, depending on your specific needs.

The following are some of the limits you can configure in Windows Media Services:
  • Limit player connections. It is highly recommended that you change the value of this limit from Unlimited (the default value) to a value that suits your system. Based on your hardware profile, streaming scenarios, and system requirements, establish a maximum number of player connections.
  • Limit outgoing distribution connections. Determine how many distribution servers your system requires and set this limit appropriately. Do not leave it as Unlimited.
  • Limit aggregate player bandwidth (Kbps) and Limit aggregate outgoing distribution bandwidth (Kbps). Always set these limits to 100 percent of your network interface throughput limit. You should not use these limits to keep the network utilization level below 50 percent of the maximum capacity. Setting them to lower values may inhibit network utilization and Fast Streaming functionality. You should keep the network utilization level below 50 percent of the network capacity by limiting the maximum number of player connections instead.
  • Limit bandwidth per stream per player (Kbps) and Limit bandwidth per outgoing distribution stream (Kbps). You can use the default values for these limits. You can also set these limits to be equal to the value of the Fast Start bandwidth per player (Kbps) limit or the highest stream bit rate, whichever is higher.
  • Limit connection rate (per second). Set this limit to a value that is less than or equal to 50 clients per second. This limit helps to ensure that existing connections are not adversely affected when a large number of new clients connect to the server. It also helps ensure that users get the best possible experience while connecting to the stream. Performance tests using the reference hardware have shown that setting the limit to 50 clients per second provides optimal results in most situations.
  • Limit incoming bandwidth (Kbps). Determine the amount of incoming bandwidth your system requires and set this limit appropriately. Do not leave it as Unlimited.
  • Limit player timeout inactivity (seconds) and Limit connection acknowledgement (seconds). You can use the default values for these limits. For more information about these limits, see Windows Media Services Help.
  • Limit Fast Start bandwidth per player (Kbps). This value is set at the publishing point level only. You can use the default value for this limit. You should only increase the value of this limit when streaming high-bit-rate content over local area networks (LANs).
  • Limit Fast Cache content delivery rate. This value is set at the publishing point level only. You can use the default value for this limit. You should reduce this value in an on-demand scenario if the server is overburdened by a large number of concurrent users.
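Putting these limits together, a starting value for the Limit player connections setting can be derived from the 50 percent network-utilization guideline. The interface capacity and common bit rate below are illustrative inputs, not test results.

```python
# Derive a 'Limit player connections' starting value from the 50 percent
# network headroom guideline. Inputs are illustrative.

nic_kbps = 1_000_000          # 1 Gbps network interface
common_bit_rate_kbps = 300    # most common content bit rate
headroom = 0.5                # keep utilization at or below 50 percent

player_limit = int(nic_kbps * headroom / common_bit_rate_kbps)
print(player_limit)           # 1666 players
```

You would then validate this number against your own load tests and adjust it downward if memory or CPU becomes the bottleneck first.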



Advanced Tuning

TCP/IP Registry Keys

Windows Server 2003 uses the FastSendDatagramThreshold registry key to determine whether a datagram should go through the fast I/O path or should be buffered during a send operation. Fast I/O means that the server bypasses the I/O subsystem and copies data directly to the network interface buffer.

The default value of the FastSendDatagramThreshold key is 1024. If the size of a datagram exceeds this value, in bytes, additional buffering operations are necessary. As a result, CPU utilization and context switches increase, while the maximum number of simultaneous clients that the server can handle decreases. Performance tests showed that changing the default threshold to a higher value, such as 1500 bytes, improves server performance.

In general, only high-bit-rate streams are affected by changing this key. Packet sizes larger than 1024 bytes usually appear in content that has bit rates higher than 100 Kbps. A side effect of changing this key value is an increase in the number of non-paged pool bytes allocated for the server. This change does not cause any significant problems.

See Appendix E: Registry Keys for more information on changing the FastSendDatagramThreshold settings in the registry.

 Note   Incorrectly editing the registry may severely damage your system. Before making changes to the registry, you should back up any valued data on the computer.
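If you decide to raise the threshold, the value is a REG_DWORD conventionally stored under the AFD service parameters key. The following command is a sketch: the key path and the 1500-byte value reflect common practice for this setting, but confirm the exact location against Appendix E before applying it, and restart the computer afterward so the AFD driver reads the new value.

```shell
reg add "HKLM\SYSTEM\CurrentControlSet\Services\AFD\Parameters" ^
    /v FastSendDatagramThreshold /t REG_DWORD /d 1500 /f
```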

Server Namespace Settings

Windows Media Services is configured, by default, to achieve the best performance during typical utilization scenarios. Under certain conditions, however, you may need to change some namespace settings to work around memory limitations. The primary reason for changing namespace settings is to prevent the server from running out of memory address space, which can happen when you use high-performance hardware that has more than 2 GB of system memory. Changes to the namespace settings can also minimize the effects of memory bottlenecks in memory-constrained on-demand systems. In general, each 32-bit process is limited to 4 GB of address space: 2 GB for user-mode memory and 2 GB for kernel-mode operations. When the Windows Media Services process (WMServer.exe) reaches the 2 GB user-mode limit, it may no longer be able to allocate memory for additional client connections. Although the process can allocate only 2 GB of user-mode memory, you should still consider the recommendations presented in System Memory. It is highly recommended that you add memory to your server; the kernel can use the additional memory for system operations such as file buffering.

Because Windows Media Services has been configured for the best possible performance, the following changes may negatively affect the server's performance:
  • Increased number of read operations per second. Changing this value might reduce the overall performance of your server, because the server might require a larger number of data source reads per second and the data source read sizes might decrease.
  • Reduced UDP resend buffer. If you reduce this value, the amount of data that the server keeps to acknowledge UDP resend requests is reduced as well. Therefore, clients that connect to your server on high latency networks by using UDP may be adversely affected.

It is recommended that you do not make changes to the server namespace before determining the effects on the overall server performance and the end-user experience. Note that the following settings are correlated, meaning that the actual internal buffer sizes and the amount of memory per client may not be exactly equal to the values specified. The amount of memory used per client depends on a combination of several internal parameters.

You can use the following configuration settings to reduce the amount of memory allocated per user in the event that the server reaches a memory limit. See Appendix B: Test Profiles for information about how to make such changes.
  • OptimalBufferSizeInMSecsOnDemand. Defines the maximum buffer size, in milliseconds, allocated per connection for an on-demand publishing point.
    Minimal setting = 0x3E8 (1000 ms)
    Default/Maximum setting = 0x2710 (10000 ms)
    
    <node name="OptimalBufferSizeInMSecsOnDemand" opcode="create" type="int32" value="HEX_VALUE_HERE" />
  • MaxBufferSizeInBytes. Defines the maximum buffer size, in bytes, allocated per connection for any publishing point.
    Minimal setting = 0x200 (512 bytes)
    Default/Maximum setting = 0x40000 (256 Kbytes)
    
    <node name="MaxBufferSizeInBytes" opcode="create" type="int32" value="HEX_VALUE_HERE" />
  • MaxResendBufferSizeInMSecs. Defines the maximum buffer size, in milliseconds, allocated per connection for UDP resend operations.
    Minimal setting = 0x0 (0 ms)
    Default/Maximum setting = 0x2710 (10000 ms)
    
    <node name="MaxResendBufferSizeInMSecs" opcode="create" type="int32" value="0x2710" />

Use the following table as a guideline for server configurations when your target audience uses a low-speed connection (typically dial-up connection speeds of 10 Kbps to 40 Kbps).

Namespace value                       Range
OptimalBufferSizeInMSecsOnDemand      0x7D0 - 0xBB8 (2000 - 3000 ms)
MaxBufferSizeInBytes                  0x2000 - 0x4000 (8 - 16 KB)
MaxResendBufferSizeInMSecs            0x7D0 - 0xBB8 (2000 - 3000 ms)

Use the following table as a guideline for server configurations when your target audience uses a high-speed connection (typically broadband connection speeds of 100 Kbps to 400 Kbps).

Namespace value                       Range
OptimalBufferSizeInMSecsOnDemand      0xFA0 - 0x1F40 (4000 - 8000 ms)
MaxBufferSizeInBytes                  0x8000 - 0x10000 (32 - 64 KB)
MaxResendBufferSizeInMSecs            0x1388 - 0x1B58 (5000 - 7000 ms)

If your target audience connects to your server with a connection speed of 500 Kbps or higher, it is not likely that you will need to make any namespace changes. In this scenario, other system resources will be exhausted before the server experiences memory limitation problems.



Appendix A: Lab Setup Description

Windows Media Services 9 Series performance tests were conducted in a controlled laboratory environment. The following diagram depicts the hardware configuration used during the tests.

Diagram showing the lab setup for Windows Media Services tests

Server Hardware Profile 1 (S1)

S1 - Dell PowerEdge 2650

Dual 2.4 gigahertz (GHz) Intel Xeon (HT) Processors

400 megahertz (MHz) Front Side Bus

512 KB L2 Advanced Transfer Cache

4 GB 200 MHz DDR SDRAM

PCI-X (1 X 64 bit/133 MHz) support

PowerEdge Expandable RAID controller, Version 3, Dual-Channel (PERC 3/DC) with three 18.2 GB Ultra3 SCSI 15,000 rpm hard disks

Intel PRO/1000 XF Server Adapter (supports PCI-X bus at 64 bit/133 MHz)

Windows Server 2003, Enterprise Edition

Windows Media Services 9 Series

Server Hardware Profile 2 (S2)

S2 - HP/Compaq ProLiant ML530 G2

Dual 2.4 GHz Intel Xeon (HT) Processors

400 MHz Front Side Bus

512 KB L2 Advanced Transfer Cache

4 GB 200 MHz DDR SDRAM

PCI-X (1 X 64 bit/133 MHz) support

18.2 GB Ultra3 SCSI 15,000 rpm hard disk

Intel PRO/1000 XF Server Adapter (supports PCI-X bus at 64 bit/133 MHz)

Windows Server 2003, Enterprise Edition

Windows Media Services 9 Series

Server Hardware Profile 3 (S3)

S3 - Compaq ProLiant 8500

Eight 550 MHz Intel Pentium III Xeon Processors

100 MHz Front Side Bus

1 MB L2 Cache

8 GB 100 MHz SDRAM

PCI-X (1 X 64 bit/133 MHz) support

18.2 GB Ultra3 SCSI 10,000 rpm hard disk

Intel PRO/1000 F Server Adapter

Windows Server 2003, Enterprise Edition

Windows Media Services 9 Series

Clients: 16 client computers

C1 … C16 - Dell Optiplex 240

1.5 GHz Intel Pentium 4 Processor

512 MB SDRAM

40 GB ATA100 IDE hard disk

3Com 3C920 Integrated Fast Ethernet 10/100 or Intel PRO/1000 F Server Adapter

Windows Server 2003, Enterprise Edition

Windows Media Load Simulator

Network Switch:

N - Extreme Networks Summit 4

Sixteen 100 Mbps full-duplex ports

Six 1 Gbps full-duplex fiber ports



Appendix B: Test Profiles

22 Kbps

Bar graph showing maximum number of 22 Kbps streams for each scenario

Table showing 22 Kbps streams for broadcast scenario

Table showing 22 Kbps streams for on-demand scenario

Table showing 22 Kbps streams for on-demand with namespace changes scenario

Buffer namespace changes:


    <node name="PacketPump" opcode="create" >
      <node name="MaxResendBufferSizeInMSecs" opcode="create" type="int32" value="0x7d0" />
      <node name="MaxBufferSizeInBytes" opcode="create" type="int32" value="0x2000" />
      <node name="OptimalBufferSizeInMSecsOnDemand" opcode="create" type="int32" value="0x7d0" />
    </node> <!-- PacketPump -->

56 Kbps

Bar graph showing maximum number of 56 Kbps streams for each scenario

Table showing 56 Kbps streams for broadcast scenario

Table showing 56 Kbps streams for on-demand scenario

Table showing 56 Kbps streams for on-demand with namespace changes scenario

Buffer namespace changes:


    <node name="PacketPump" opcode="create" >
      <node name="MaxResendBufferSizeInMSecs" opcode="create" type="int32" value="0x1388" />
      <node name="MaxBufferSizeInBytes" opcode="create" type="int32" value="0x2000" />
      <node name="OptimalBufferSizeInMSecsOnDemand" opcode="create" type="int32" value="0x7d0" />
    </node> <!-- PacketPump -->

100 Kbps

Bar graph showing maximum number of 100 Kbps streams for each scenario

Table showing 100 Kbps streams for broadcast scenario

Table showing 100 Kbps streams for on-demand scenario

Table showing 100 Kbps streams for on-demand with namespace changes scenario

Buffer namespace changes:


    <node name="PacketPump" opcode="create" >
      <node name="MaxResendBufferSizeInMSecs" opcode="create" type="int32" value="0x1388" />
      <node name="MaxBufferSizeInBytes" opcode="create" type="int32" value="0x8000" />
      <node name="OptimalBufferSizeInMSecsOnDemand" opcode="create" type="int32" value="0xfa0" />
    </node> <!-- PacketPump -->

300 Kbps

Bar graph showing maximum number of 300 Kbps streams for each scenario

Table showing 300 Kbps streams for broadcast scenario

Table showing 300 Kbps streams for on-demand scenario

Table showing 300 Kbps streams for on-demand with namespace changes scenario

Buffer namespace changes:


    <node name="PacketPump" opcode="create" >
      <node name="MaxResendBufferSizeInMSecs" opcode="create" type="int32" value="0x2710" />
      <node name="MaxBufferSizeInBytes" opcode="create" type="int32" value="0x10000" />
      <node name="OptimalBufferSizeInMSecsOnDemand" opcode="create" type="int32" value="0x1f40" />
    </node> <!-- PacketPump -->

500 Kbps

Bar graph showing maximum number of 500 Kbps streams for each scenario

Table showing 500 Kbps streams for broadcast scenario

Table showing 500 Kbps streams for on-demand scenario

1 Mbps

Bar graph showing maximum number of 1 Mbps streams for each scenario

Table showing 1 Mbps streams for broadcast scenario

Table showing 1 Mbps streams for on-demand scenario

Wireless 36 Kbps, 64 Kbps, 128 Kbps - Broadcast Wireless Scenario

Bar graph showing maximum number of broadcast wireless users with and without FEC enabled

Table showing wireless broadcast comparison with and without FEC enabled

Wireless 36 Kbps, 64 Kbps, 128 Kbps - On-Demand Wireless Scenario

Bar graph showing maximum number of on-demand wireless users with and without FEC enabled

Table showing wireless on-demand comparison with and without FEC enabled

Buffer namespace settings without FEC enabled:

32 Kbps

    <node name="PacketPump" opcode="create" >
      <node name="MaxResendBufferSizeInMSecs" opcode="create" type="int32" value="0x7d0" />
      <node name="MaxBufferSizeInBytes" opcode="create" type="int32" value="0x2000" />
      <node name="OptimalBufferSizeInMSecsOnDemand" opcode="create" type="int32" value="0x7d0" />
    </node> <!-- PacketPump -->
64 Kbps

    <node name="PacketPump" opcode="create" >
      <node name="MaxResendBufferSizeInMSecs" opcode="create" type="int32" value="0x1388" />
      <node name="MaxBufferSizeInBytes" opcode="create" type="int32" value="0x2000" />
      <node name="OptimalBufferSizeInMSecsOnDemand" opcode="create" type="int32" value="0x7d0" />
    </node> <!-- PacketPump -->
128 Kbps

    <node name="PacketPump" opcode="create" >
      <node name="MaxResendBufferSizeInMSecs" opcode="create" type="int32" value="0x1388" />
      <node name="MaxBufferSizeInBytes" opcode="create" type="int32" value="0x8000" />
      <node name="OptimalBufferSizeInMSecsOnDemand" opcode="create" type="int32" value="0xfa0" />
    </node> <!-- PacketPump -->
Buffer namespace settings with FEC enabled:

32 Kbps

    <node name="PacketPump" opcode="create" >
      <node name="MaxResendBufferSizeInMSecs" opcode="create" type="int32" value="0x0" />
      <node name="MaxBufferSizeInBytes" opcode="create" type="int32" value="0x2000" />
      <node name="OptimalBufferSizeInMSecsOnDemand" opcode="create" type="int32" value="0x7d0" />
    </node> <!-- PacketPump -->
64 Kbps

    <node name="PacketPump" opcode="create" >
      <node name="MaxResendBufferSizeInMSecs" opcode="create" type="int32" value="0x0" />
      <node name="MaxBufferSizeInBytes" opcode="create" type="int32" value="0x2000" />
      <node name="OptimalBufferSizeInMSecsOnDemand" opcode="create" type="int32" value="0x7d0" />
    </node> <!-- PacketPump -->
128 Kbps

    <node name="PacketPump" opcode="create" >
      <node name="MaxResendBufferSizeInMSecs" opcode="create" type="int32" value="0x0" />
      <node name="MaxBufferSizeInBytes" opcode="create" type="int32" value="0x8000" />
      <node name="OptimalBufferSizeInMSecsOnDemand" opcode="create" type="int32" value="0xfa0" />
    </node> <!-- PacketPump -->



Appendix C: Test Content Specifications

The following table presents the encoding settings used for content streamed during the Windows Media Services performance tests. These settings are provided for reference only and should not be treated as baseline values for achieving maximum capacity.

Table showing content encoding settings



Appendix D: Changing Namespace Buffer Settings

 Note   Before you edit the namespace, verify that you have a backup copy of the configuration file that you can restore if a problem occurs. If you edit the namespace incorrectly, you may be required to reinstall any product that uses the Windows Media Services namespace settings. Microsoft cannot guarantee that problems resulting from incorrectly editing the namespace can be solved.
  1. Stop Windows Media Services (run the net stop wmserver command).
  2. Change to the directory where the namespace file is located (%SystemRoot%\System32\Windows Media\Server).
  3. Open the ServerNamespace.xml file in a text editor, such as Notepad.
  4. Locate the Other node in the namespace.
  5. Add the PacketPump sub-node under the Other node, after any existing sub-nodes:
    <node name="PacketPump" opcode="create" > ... </node> <!-- PacketPump -->
  6. Add the following values to the PacketPump sub-node to override the defaults. Any value that you do not add keeps its default. See Advanced Tuning for recommended values.
    <node name="OptimalBufferSizeInMSecsOnDemand" opcode="create" type="int32" value="0x1f40" />
    <node name="MaxBufferSizeInBytes" opcode="create" type="int32" value="0x10000" />
    <node name="MaxResendBufferSizeInMSecs" opcode="create" type="int32" value="0x2710" />
  7. Restart Windows Media Services (run the net start wmserver command).

The following is an example of the code that you can add to the ServerNamespace.xml file:


    <node name="Other" opcode="create" >
      <node name="Client Upgrade" opcode="create" >
        ...
      </node> <!-- Client Upgrade -->
      <node name="PacketPump" opcode="create" >
        <node name="OptimalBufferSizeInMSecsOnDemand" opcode="create" type="int32" value="0x1f40" />
        <node name="MaxBufferSizeInBytes" opcode="create" type="int32" value="0x10000" />
        <node name="MaxResendBufferSizeInMSecs" opcode="create" type="int32" value="0x2710" />
      </node> <!-- PacketPump -->
    </node> <!-- Other -->
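If you prefer to script the edit, the same change can be sketched with Python's xml.etree.ElementTree module. The fragment below operates on a simplified stand-in for ServerNamespace.xml (the real file contains many more nodes); treat it as an illustrative sketch, not a supported tool, and keep a backup of the real file as noted above.

```python
import xml.etree.ElementTree as ET

# Hypothetical, minimal stand-in for ServerNamespace.xml. The real file lives
# in %SystemRoot%\System32\Windows Media\Server and must be backed up first.
SAMPLE = """<root>
  <node name="Other" opcode="create">
    <node name="Client Upgrade" opcode="create" />
  </node>
</root>"""

# Example values from the Advanced Tuning tables.
BUFFER_SETTINGS = {
    "OptimalBufferSizeInMSecsOnDemand": "0x1f40",
    "MaxBufferSizeInBytes": "0x10000",
    "MaxResendBufferSizeInMSecs": "0x2710",
}

def add_packet_pump(xml_text: str) -> str:
    """Insert a PacketPump node with buffer settings under the Other node."""
    root = ET.fromstring(xml_text)
    other = root.find(".//node[@name='Other']")
    pump = ET.SubElement(other, "node",
                         {"name": "PacketPump", "opcode": "create"})
    for name, value in BUFFER_SETTINGS.items():
        ET.SubElement(pump, "node",
                      {"name": name, "opcode": "create",
                       "type": "int32", "value": value})
    return ET.tostring(root, encoding="unicode")

result = add_packet_pump(SAMPLE)
print(result)
```

In a real deployment the script would read and rewrite the file on disk (with Windows Media Services stopped, per steps 1 and 7 above) rather than operate on an in-memory string.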



Appendix E: Registry Keys

 Note   Incorrectly editing the registry may severely damage your system. Before making changes to the registry, you should back up any valued data on the computer.
FastSendDatagramThreshold

HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\AFD\Parameters\

FastSendDatagramThreshold. This setting controls how datagrams are sent to client computers. Datagrams smaller than this threshold are copied and sent through the fast I/O path; larger datagrams are buffered until the send operation actually completes. Fast I/O means that the server copies the data and bypasses the I/O subsystem, instead of mapping memory and going through the I/O subsystem.

Set this key to a value that is larger than the packet size of the highest bit rate stream the server will deliver.

Type     Default   Recommended value
DWORD    1024      1500
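The change can also be applied from a .reg file rather than interactively in Registry Editor. The following fragment is a sketch (verify the path against your system before merging it) that sets the recommended value of 1500, which is 0x5dc in hexadecimal:

```
Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\AFD\Parameters]
"FastSendDatagramThreshold"=dword:000005dc
```

As with any registry change, back up the key before merging the file, and restart the server for the setting to take effect.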




Appendix F: Performance Counters

The following performance counters were used to collect performance and scalability information during the Windows Media Services tests. The sampling interval was set to 1 second.

Processor
  • \Processor(_Total)\% Processor Time
  • \Processor(_Total)\Interrupts/sec

Windows Media Services
  • \Windows Media Services\Current Streaming Players
  • \Windows Media Services\Current Connected Players
  • \Windows Media Services\Current Late Send Rate
  • \Windows Media Services\Current Late Read Rate
  • \Windows Media Services\Current Connection Queue Length
  • \Windows Media Services\Current Connection Rate
  • \Windows Media Services\Current File Read Rate (Kbps)
  • \Windows Media Services\Current Player Allocated Bandwidth (Kbps)
  • \Windows Media Services\Current Player Send Rate (Kbps)
  • \Windows Media Services\Current UDP Resend Requests Rate
  • \Windows Media Services\Current UDP Resends Sent Rate

Windows Media Services process
  • \Process(WMServer)\% Privileged Time
  • \Process(WMServer)\% Processor Time
  • \Process(WMServer)\% User Time
  • \Process(WMServer)\Handle Count
  • \Process(WMServer)\Page Faults/sec
  • \Process(WMServer)\Page File Bytes
  • \Process(WMServer)\Private Bytes
  • \Process(WMServer)\Working Set
  • \Process(WMServer)\Pool Nonpaged Bytes
  • \Process(WMServer)\Pool Paged Bytes
  • \Process(WMServer)\Virtual Bytes

System counters
  • \System\Context Switches/sec

System memory
  • \Memory\Page Faults/sec
  • \Memory\Cache Bytes
  • \Memory\Committed Bytes
  • \Memory\Available Bytes

Physical disk
  • \PhysicalDisk(_Total)\% Disk Time
  • \PhysicalDisk(_Total)\Current Disk Queue Length
  • \PhysicalDisk(_Total)\Disk Bytes/sec
  • \PhysicalDisk(_Total)\Disk Read Bytes/sec

Network interface
  • \Network Interface(*)\Output Queue Length
  • \Network Interface(*)\Bytes Sent/sec
  • \Network Interface(*)\Packets Sent/sec
  • \Network Interface(*)\Bytes Received/sec
  • \Network Interface(*)\Packets Received/sec
  • \Network Interface(*)\Packets/sec

Remote file systems – NBT connections (NetBIOS over TCP/IP)
  • \NBT Connection(Total)\Bytes Received/sec
  • \NBT Connection(Total)\Bytes Sent/sec
  • \NBT Connection(Total)\Bytes Total/sec






© 2014 Microsoft Corporation. All rights reserved.