
Advanced Encoding Techniques with Windows Media 9 Series

 
Abstract
This document describes advanced techniques that you can apply when using Microsoft Windows Media Encoder 9 Series to create the best-quality encoded content possible. The target audience is audio and video professionals and anyone else interested in creating high-quality encoded content. It is assumed that readers have some experience with audio and video compression concepts.

 

Jennifer Winters
Microsoft Corporation
December 2004
 

Applies to:
   Microsoft® Windows Media® Encoder 9 Series
 



Introduction

Working with digital media is an art, not a science, so be prepared to practice, test, and tweak to achieve the highest quality. This document provides tips that you can follow to ensure that you start with the best-quality content possible before you begin encoding. It also provides information about techniques that you can apply in the encoding session to ensure that you end up with high-quality encoded content.



Capturing Quality Content

This section outlines topics to keep in mind as you prepare to capture your audio and video content. The following points are explained in detail throughout the rest of the section.
  • Capturing to an AVI File. For the best quality, avoid combining the capturing and encoding processes. Instead, capture to an AVI file first, and then encode.
  • Comparing Audio and Video Sources. Keep in mind that some audio and video sources are better than others. For the best quality, capture SDI video and digital audio.
  • Setting Proper Audio and Video Levels. Set your video and audio levels properly before you start capturing.
  • Optimizing Your Computer. Check that your computer is optimized.
  • Capturing to Optimal Pixel Formats. Capture to a YUY2 pixel format to avoid color conversions during encoding.
  • Capturing Optimal Resolutions. Capture video at either a resolution of 320×480 or 640×480.

Capturing to an AVI File

To ensure the highest-quality results, it is recommended that you capture to an AVI file before encoding. Doing so has the following advantages:
  • Removes any issues related to the processor falling behind the capture process, and enables the encoder to optimize all calculations.
  • Enables the use of editing programs to perform steps such as trimming the start and end times of the file, or doing color correction.
  • Simplifies batch encoding when you source from an AVI file.

You can use a number of programs to capture to an AVI file, including Adobe Premiere, Avid/DS HD, and VirtualDub.

Despite the higher-quality results, there are a few disadvantages related to capturing to an AVI file:
  • Requires additional steps to capture and encode.
  • Involves higher system storage requirements.

Comparing Audio and Video Sources

It is important to start with the best-quality source. This section lists possible sources, in order from best to worst:
  • Serial digital interface (SDI) video. Used for digital video cameras and camcorders. Because the content stays in a digital format throughout the capturing and encoding processes, it undergoes the fewest data translations and produces the best-quality video.
  • Component video. Used when sourcing from DVDs. With this source, the video signals are separated, for example, into the RGB or Y/R-Y/B-Y format. Results in good-quality video.
  • S-Video. Used for S-VHS, DVD, or Hi-8 camcorders. The video signal is divided into luminance and chrominance. Results in good-quality video.
  • DV video. Used with DV devices, such as MiniDV digital camcorders connected through an IEEE 1394 video port. Results in good-quality video.
  • Composite video. Used for analog cameras, camcorders, cable TV, and VCRs. Composite video should only be used as a source as a last resort. With composite video, luminance and chrominance components are mixed, which makes it difficult to get good-quality video.
  • Audio. If possible, capture digital audio. If you must capture audio from an analog source, balanced audio connections are better than RCA.

Setting Proper Audio and Video Levels

To set audio and video levels properly:
  • Adjust your video monitor using SMPTE color bars, and then adjust your computer monitor to match, using a high-resolution bitmap of the SMPTE bars.
  • Adjust your video capture card levels (hue, saturation, and brightness), so that the picture matches the video monitor.
  • Check and normalize all audio levels in your system. Use a professional-grade audio card, such as the Echo Layla24 or the M-Audio Delta Series.
  • If possible, use a digital waveform monitor.

Optimizing Your Computer

Before you start capturing, optimize your computer using the following steps:
  • Defragment your hard disk.
  • Turn off network and file sharing.
  • Close all other programs, especially if a program accesses the hard disk.
  • Monitor system resources, making sure that the computer is sufficiently powerful to keep pace with the data feed.
  • During the capture, watch for frame dropping. It should be possible to capture an entire movie with no dropped frames.
  • Watch for direct memory access (DMA) buffer conflicts between the capture card and the SCSI card, which can result in frame dropping. This is less likely to occur now than in the past. If conflicts occur, one solution is to use a dual PCI bus motherboard configuration, in which the capture card and the SCSI card are on different buses.

Capturing to Optimal Pixel Formats

It is recommended that you capture to the YUY2 (4:2:2) pixel format, which enables you to avoid color-space conversions during encoding. The Windows Media Video 9 Series codec primarily uses a 4:2:0 pixel format; however, if you choose to maintain the interlacing in your content (a new feature in Windows Media Encoder 9 Series), a 4:1:1 pixel format is used instead. Because YUY2 carries at least as much chroma detail as both the 4:2:0 and 4:1:1 formats, the encoder can derive either format from it by chroma subsampling alone, with no loss beyond what the target format itself imposes.

An important note is that if you capture to a 4:2:0 AVI file (for example I420, YV12, or IYUV), you will not be able to maintain the interlacing in your source video.

Older capture devices may create AVI files that do not fully conform to published specifications, resulting in upside-down video with the YUY2 pixel format. To prevent this, you can either set the driver on your capture device to use a different pixel format, or you can "flip" the image if your driver provides such a feature. Finally, there is also an option to flip the video in the encoder.

The RGB pixel formats are not recommended, because they result in extra color-space conversions, bigger files, and more data to move across the bus.
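The difference in data volume between pixel formats can be estimated with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, the per-pixel sizes are those of the uncompressed I420 (12 bits), YUY2 (16 bits), and RGB24 (24 bits) layouts, and AVI container overhead and audio are ignored.

```python
# Bytes per pixel for common uncompressed capture formats.
# I420 is 4:2:0 (12 bits), YUY2 is 4:2:2 (16 bits), RGB24 is 24 bits per pixel.
BYTES_PER_PIXEL = {"I420": 1.5, "YUY2": 2.0, "RGB24": 3.0}


def capture_rate_bytes_per_sec(width, height, fps, pixel_format):
    """Uncompressed video data rate that must cross the bus and reach the disk."""
    return width * height * BYTES_PER_PIXEL[pixel_format] * fps


def storage_gib(width, height, fps, pixel_format, seconds):
    """Approximate disk space for a capture, in GiB (video only)."""
    rate = capture_rate_bytes_per_sec(width, height, fps, pixel_format)
    return rate * seconds / 2**30


if __name__ == "__main__":
    for fmt in ("YUY2", "RGB24"):
        rate_mb = capture_rate_bytes_per_sec(640, 480, 29.97, fmt) / 1e6
        hour_gib = storage_gib(640, 480, 29.97, fmt, 3600)
        print(f"{fmt}: {rate_mb:.1f} MB/s, {hour_gib:.1f} GiB per hour")
```

For the same frame size and frame rate, RGB24 moves 1.5 times as much data as YUY2, which is the extra bus and storage load mentioned above.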

Capturing Optimal Resolutions

If you capture 320×240 to an AVI file, the capture card throws away one of the fields, which effectively deinterlaces the video. If your target audience plays the video at 320×240, this usually produces acceptable results. However, to ensure the highest quality, you should capture both fields, so that you can use Microsoft Windows Media Encoder to deinterlace the video or apply the inverse telecine feature. Deinterlacing and inverse telecine require both fields of a frame to be present in order to function properly. For this reason, it is recommended that you capture at either 320×480 or 640×480. After the encoder deinterlaces the video or applies the inverse telecine filter, output video encoded at 320×240 will have higher quality than video captured directly at 320×240.
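To make the role of the two fields concrete, here is a minimal sketch that models a frame as a list of scan lines and separates it into its fields. It assumes the common convention that even-numbered lines form the top field and odd-numbered lines form the bottom field; the function name is hypothetical and no real capture API is involved.

```python
def split_fields(frame_rows):
    """Split an interlaced frame (a list of scan lines) into its two fields."""
    top_field = frame_rows[0::2]     # lines 0, 2, 4, ...
    bottom_field = frame_rows[1::2]  # lines 1, 3, 5, ...
    return top_field, bottom_field


if __name__ == "__main__":
    # A 480-line frame, with each scan line represented by its line number.
    frame = list(range(480))
    top, bottom = split_fields(frame)
    print(len(top), len(bottom))
```

A 480-line frame yields two 240-line fields. A 240-line capture keeps only one of them, so the encoder has nothing left to deinterlace or to match against during inverse telecine.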



Encoding Techniques

Once you have captured your source, you are ready to set up an encoding session. This section provides information about the following techniques that you can apply in the encoding session to ensure you end up with high-quality encoded content:
  • Optimizing Video
  • Preserving Nonsquare Pixel Output
  • Selecting an Encoding Mode
  • Compression Settings

Optimizing Video

You can use the encoder to deinterlace source video that is interlaced, apply an inverse telecine filter to content that is telecined, or maintain the interlacing in your source video. The option you choose depends on the source of your video, as follows:
  • Film-originated content. Standard motion picture film is shot at 24 frames per second (fps). Before it can be broadcast on television, it must be converted to videotape and put through a telecine process where frames are added that convert it to the 29.97 fps required by the NTSC standard. It is recommended that you apply the inverse telecine filter to remove the extra frames added during the telecine process. This removes redundant data and improves the quality of the encoded video.
  • Video-originated content. It is recommended that you either deinterlace or maintain the interlacing in the source video, depending on the playback device you are targeting. Deinterlace for playback on progressive-scan devices such as computers, but maintain interlacing for playback on interlaced devices such as televisions. If you deinterlace the video, Windows Media Player 9 Series or later can detect whether the computer's graphics card supports hardware deinterlacing during playback. If it does, the codec passes all of the information to the card for deinterlacing. If the graphics card does not support hardware deinterlacing, deinterlacing is handled by the codec.
  • Mixed film and video content. If the source video is 640×480 and 30 fps, deinterlacing is recommended. The encoded video will still have the telecined frames, but interlacing artifacts are removed. Or, you can maintain the interlacing.
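The telecine process described for film-originated content can be illustrated with a toy model. The sketch below assumes the common 2:3 pulldown cadence, in which every four film frames become five interlaced video frames, and it represents each field by the label of the film frame it came from. It is not how the encoder's inverse telecine filter is implemented: a real filter has to detect the cadence from pixel data, whereas this model relies on the labels.

```python
def telecine_2_3(film_frames):
    """Spread each group of 4 film frames over 5 video frames (top, bottom fields)."""
    video = []
    usable = len(film_frames) - len(film_frames) % 4  # ignore a trailing partial group
    for i in range(0, usable, 4):
        a, b, c, d = film_frames[i:i + 4]
        # Frames 3 and 4 of each group mix fields from two different film frames.
        video += [(a, a), (b, b), (b, c), (c, d), (d, d)]
    return video


def inverse_telecine(video_frames):
    """Recover the film frames by dropping the repeated fields."""
    film = []
    for top, bottom in video_frames:
        for field in (top, bottom):
            if not film or film[-1] != field:
                film.append(field)
    return film


if __name__ == "__main__":
    film = ["A", "B", "C", "D"]
    video = telecine_2_3(film)
    print(video)
    print(inverse_telecine(video))
```

In this model two of every five video frames combine fields from different film frames, which is why encoding telecined content without inverse telecine spends bits on redundant and mixed frames.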

Preserving Nonsquare Pixel Output

If your video source has nonsquare pixels, you can now preserve its pixel aspect ratio by using the encoder. The pixel aspect ratio is the width (x) of the pixel with respect to its height (y). A square pixel has a ratio of 1:1, but a nonsquare (rectangular) pixel does not have the same height and width. This concept is similar to the frame aspect ratio, which is the total width of an image with respect to its height. However, these aspect ratios are not necessarily tied together. For example, a widescreen image with a frame aspect ratio of 16:9 can be made of square or nonsquare pixels.

If you encode a video source with nonsquare pixels as though the pixels are square, the output will be distorted.

If you set the size of the output video to be the same as the source video, and the source video has nonsquare pixels, then the pixel aspect ratio of your source video is automatically preserved in the output video. Windows Media Player 9 Series or later automatically interprets and scales the video appropriately during playback.
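The relationship between pixel aspect ratio and frame aspect ratio can be checked with a short calculation. In this sketch the function names are hypothetical, and the 720×480 frame with pixel aspect ratios of 8:9 and 32:27 is an illustrative assumption, not a value taken from the encoder.

```python
from fractions import Fraction


def frame_aspect_ratio(width, height, pixel_aspect_ratio):
    """Frame aspect ratio = (stored width / stored height) * pixel aspect ratio."""
    return Fraction(width, height) * pixel_aspect_ratio


def display_size(width, height, pixel_aspect_ratio):
    """Size in square pixels at which the frame appears undistorted."""
    return round(width * pixel_aspect_ratio), height


if __name__ == "__main__":
    for par in (Fraction(8, 9), Fraction(32, 27)):
        print(par, frame_aspect_ratio(720, 480, par), display_size(720, 480, par))
```

The same 720×480 stored frame works out to 4:3 with the first pixel aspect ratio and 16:9 with the second, which shows why treating nonsquare pixels as square distorts the picture.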

Selecting an Encoding Mode

Using Windows Media Encoder, you can encode audio and video content at either a constant bit rate (CBR) or a variable bit rate (VBR). The mode to use depends both on your source and on the scenario you are targeting:
  • One-pass CBR. Use when capturing live content, when broadcasting, or when you are targeting older players or devices.
  • Two-pass CBR. Use when capturing from files, when encoding to a file, and when you want to set up an on-demand streaming scenario.
  • Quality-based VBR (one-pass). Use when you want to ensure a constant quality level, for example, when you are archiving content. Quality stays consistent throughout, but the bit rate is unconstrained and rises as much as needed to maintain that quality, so the file size is not predictable.
  • Bit rate-based VBR (two-pass). Use when you want to achieve the best-quality level while staying within a predictable average bandwidth. Use this mode when you are planning to create files that can be downloaded before being played, or when you want to control the size of the output file.
  • Peak bit rate-based VBR (two-pass). Use when you want to create content that will be played back on a device that has a constrained reading speed, such as a CD or DVD player. Similar to bit rate-based VBR encoding, except that you also specify the peak bit rate.

With one-pass encoding, the content passes through the encoder once, and compression is applied as the content is encountered. With two-pass encoding, the content is analyzed during the first pass, and then encoded in the second pass based on data gathered in the first pass.

Two-pass encoding can result in better-quality content, because the encoder can allocate the bits more effectively within the window specified by the buffer. However, two-pass encoding takes longer, because the encoder goes through all of the content twice. Two-pass encoding is not available in all situations, such as when you are broadcasting a live event.
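When the average bit rate is fixed, as with CBR or bit rate-based VBR, the output file size follows directly from the bit rate and the duration. The sketch below is a rough estimate under stated assumptions: the function names are hypothetical, bit rates are in kilobits per second with 1 kilobit taken as 1,000 bits, and container overhead is ignored.

```python
def estimated_file_size_bytes(video_kbps, audio_kbps, duration_seconds):
    """Approximate output size for a given average video and audio bit rate."""
    total_bits = (video_kbps + audio_kbps) * 1000 * duration_seconds
    return total_bits / 8


def average_kbps_for_target(size_bytes, duration_seconds):
    """Total average bit rate (video plus audio) that fits a target file size."""
    return size_bytes * 8 / duration_seconds / 1000


if __name__ == "__main__":
    size = estimated_file_size_bytes(video_kbps=700, audio_kbps=64, duration_seconds=600)
    print(f"{size / 1e6:.1f} MB")
    print(f"{average_kbps_for_target(size, 600):.0f} kbps")
```

Quality-based VBR has no such relationship, because its bit rate is not fixed in advance.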

Compression Settings

This section describes important compression settings, including tradeoffs to consider:
  • Key frame. A key frame is a point in encoded video where the data for the entire frame is transmitted, rather than just the changes. Key frames are generally inserted when there is a scene change; they are also inserted at regular intervals to improve seeking. The key frame setting is the maximum time between points where the encoder will insert key frames (they may be inserted more often automatically, if necessary, for example when a scene changes). Decreasing the distance between key frames can improve the quality of the video, but also significantly increases the overall file size. If you plan to edit the encoded video, reduce the distance between key frames to improve edit quality. If you want to minimize the file size, increase the key frame value, for example, to 20 seconds. Keep in mind, though, that a long key frame distance affects both the ability to seek within the video and the amount of time a user may need to wait for video in a multicast scenario.
  • Buffer size. The bit rate and quality of content fluctuate within the confines of the buffer size. A larger buffer size enables more bits within the buffer range to be allocated to complex scenes. For example, if you set the buffer size to 10 seconds, the codec may choose to allocate a certain number of bits to the first 8 seconds and the remainder to the last 2 seconds, rather than spreading them evenly. This allows the more complex parts of the video to receive more bits within the buffer. Typically, increasing the buffer improves overall quality. However, it also increases the delay between the time when the user requests the content and the time when playback starts. For download-and-play scenarios, increasing the buffer size has little impact on this delay. For lower bit rates, it is recommended that you increase the buffer size. For higher bit rates, increasing the buffer size has a smaller impact on quality.
  • Video smoothness. Video smoothness determines the tradeoff between sharp images and smooth motion. Video appears smooth when objects move easily from one position to another on the screen, and the edges of objects are not jagged. Video appears clear when images and motion are well-defined and clearly delineated. The bit rate setting determines the number of bits that can be allocated over a period of video. Within that limit, the codec can choose to include more frames, which makes motion appear smoother but leaves fewer bits for each frame. With a higher video smoothness value, the codec may include fewer frames. This increases the number of bits allocated per frame, resulting in sharper images; however, because there are fewer frames, motion may not appear as smooth. The video smoothness setting only comes into effect when there are not enough bits to encode at the specified frame rate, and a tradeoff must be made. At higher bit rates, this value can be increased. If you are dropping frames during encoding, consider decreasing video smoothness.
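The buffer and smoothness tradeoffs above come down to how a fixed bit budget is divided. The following sketch uses hypothetical function names and illustrative numbers; it shows only the arithmetic, not how the codec actually allocates bits.

```python
def buffer_budget_bits(bit_rate_bps, buffer_seconds):
    """Total bits the codec can distribute freely within one buffer window."""
    return bit_rate_bps * buffer_seconds


def average_bits_per_frame(bit_rate_bps, frames_per_second):
    """Average bits available for each frame at a given frame rate."""
    return bit_rate_bps / frames_per_second


if __name__ == "__main__":
    bit_rate = 300_000  # 300 kbps, an illustrative value

    # A longer buffer gives the codec a larger pool to shift toward complex scenes.
    for seconds in (5, 10):
        print(seconds, "s buffer:", buffer_budget_bits(bit_rate, seconds), "bits")

    # Halving the frame rate doubles the bits available for each frame.
    for fps in (30, 15):
        print(fps, "fps:", average_bits_per_frame(bit_rate, fps), "bits per frame")
```

At 300 kbps, dropping from 30 to 15 frames per second doubles the average bits per frame, which is the kind of exchange the video smoothness setting governs when the bit rate cannot sustain the specified frame rate.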






© 2008 Microsoft Corporation. All rights reserved.