This article provides an overview of VC-1 and details its status as an established open standard. Comparative information is given to demonstrate how well VC-1 compares to two other codecs in use today, H.264 and MPEG-2.
Microsoft® Windows Media® Video 9
Microsoft VC-1 Encoder
Introduction
VC-1 is a video codec specification that has been standardized by the Society of Motion Picture and Television Engineers (SMPTE) and implemented by Microsoft as Microsoft Windows Media Video (WMV) 9. Formal standardization of VC-1 represents the culmination of years of technical scrutiny by over 75 companies. SMPTE 421M details the complete bit stream syntax and is accompanied by two companion documents (SMPTE RP227 and SMPTE RP228) that describe VC-1 transport and conformance. These documents provide comprehensive guidance to ensure content delivery and interoperability. Standardizing the decoder bit stream facilitates independent implementation of interoperable VC-1 encoders and decoders. This, in turn, drives innovation around software and hardware encoding solutions.
SMPTE Standardization Background
SMPTE is the preeminent society of film and video experts, with members in 85 countries worldwide. The standards that SMPTE produces are widely used by professionals in the fields of video, motion pictures, and digital cinema.
The SMPTE standard for VC-1, SMPTE 421M, was originally based on the Windows Media Video 9 codec. The Windows Media Video 9 codec is functionally equivalent to VC-1; it is the Microsoft implementation of the VC-1 standard. VC-1 includes the Simple, Main, and Advanced Profiles that are described in the Overview of VC-1 section of this article.
The standardization process was undertaken by the SMPTE Video Compression Technology Committee, also known as C24. This committee is responsible for technologies that encode, process, switch, and decode video signals to, in, and from the compressed domain.
Microsoft chose to standardize the Windows Media Video 9 codec for a number of reasons, including accessibility and interoperability. Standardizing enables independent implementations and ensures those implementations will be interoperable. Standardizing the bitstream syntax and decoding process gives hardware manufacturers the resources and stability required to invest in creating decoders on chips in a variety of hardware devices. Isolating the video codec standard from the other parts of a complete video system enables the use of the codec on many types of hardware and in many different systems.
In addition to the other reasons for standardization, having an SMPTE standard that can be referenced by other format and system specifications makes the inclusion of VC-1 easier for independent companies and discourages the adoption of different versions of the technology by different organizations. Standardization also helps gain adoption of the technology by organizations that are committed to the use of open industry standards.
There are three documents produced for SMPTE to describe VC-1. The first document, SMPTE 421M, is the VC-1 specification itself. This is the main document providing comprehensive details of the VC-1 bitstream syntax and decoder semantics. The second document, SMPTE RP228, is the VC-1 conformance specification. This document describes the test procedures and criteria for determining conformance to the SMPTE 421M specification and includes reference source code and bitstreams. The third document, SMPTE RP227, is the VC-1 transport specification. The transport document provides details about carrying VC-1 elementary streams in MPEG-2 Program and Transport Streams.
Overview of VC-1
The VC-1 codec is designed to achieve state-of-the-art compressed video quality at bit rates that may range from very low to very high. The codec can easily handle 1920 pixel × 1080 pixel presentation at 6 to 30 megabits per second (Mbps) for high-definition video. VC-1 is capable of higher resolutions such as 2048 pixels × 1536 pixels for digital cinema, and of a maximum bit rate of 135 Mbps. An example of very low bit rate video would be 160 pixel × 120 pixel presentation at 10 kilobits per second (Kbps) for modem applications.
The basic functionality of VC-1 involves a block-based motion compensation and spatial transform scheme similar to that used in other video compression standards since MPEG-1 and H.261. However, VC-1 includes a number of innovations and optimizations that make it distinct from the basic compression scheme, resulting in excellent quality and efficiency. VC-1 Advanced Profile is also transport and container independent. This provides even greater flexibility for device manufacturers and content services.
Innovations
VC-1 includes a number of innovations that enable it to produce high quality content. This section provides brief descriptions of some of these features.
Adaptive Block Size Transform
Traditionally, 8 × 8 transforms have been used for image and video coding. However, there is evidence to suggest that 4 × 4 transforms can reduce ringing artifacts at edges and discontinuities. VC-1 is capable of coding an 8 × 8 block using either an 8 × 8 transform, two 8 × 4 transforms, two 4 × 8 transforms, or four 4 × 4 transforms. This feature enables coding that takes advantage of the different transform sizes as needed for optimal image quality.
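The four ways of covering an 8 × 8 block can be enumerated in a few lines. The Python sketch below is illustrative only: the function name `partition_block` and the mode labels are hypothetical, and the sizes are read as width × height, which is an assumption about the notation rather than something stated in this article.

```python
def partition_block(mode):
    """Return the (x, y, width, height) rectangles that tile an 8x8 block.

    `mode` is one of "8x8", "8x4", "4x8", "4x4" (hypothetical labels,
    read here as width x height).
    """
    sizes = {"8x8": (8, 8), "8x4": (8, 4), "4x8": (4, 8), "4x4": (4, 4)}
    w, h = sizes[mode]
    # Step across the block in units of the transform size.
    return [(x, y, w, h) for y in range(0, 8, h) for x in range(0, 8, w)]


if __name__ == "__main__":
    for mode in ("8x8", "8x4", "4x8", "4x4"):
        print(mode, partition_block(mode))
```

An encoder would pick, per block, whichever partition gives the best result for the content; how that choice is made is outside the scope of this sketch.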
16-Bit Transforms
In order to minimize the computational complexity of the decoder, VC-1 uses 16-bit transforms. This also has the advantage of easy implementation on the large amount of digital signal processing (DSP) hardware built with 16-bit processors. Among the constraints put on VC-1 transforms is the requirement that the 16-bit values used produce results that can fit in 16 bits. The constraints on transforms ensure that decoding is as efficient as possible on a wide range of devices.
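The "results must fit in 16 bits" constraint can be made concrete with a small range check. The following Python sketch applies a 4-point integer transform and raises an error if any result leaves the signed 16-bit range. The matrix `T4` and the function names are assumptions chosen for illustration; they are not quoted from SMPTE 421M.

```python
INT16_MIN, INT16_MAX = -32768, 32767

# Illustrative integer basis rows (assumed for this sketch).
T4 = [
    [17, 17, 17, 17],
    [22, 10, -10, -22],
    [17, -17, -17, 17],
    [10, -22, 22, -10],
]


def fits_int16(value):
    """True if `value` can be stored in a signed 16-bit integer."""
    return INT16_MIN <= value <= INT16_MAX


def transform_1d(samples):
    """Apply T4 to four samples, checking each result stays within 16 bits."""
    out = []
    for row in T4:
        acc = sum(c * s for c, s in zip(row, samples))
        if not fits_int16(acc):
            raise OverflowError(f"result {acc} does not fit in 16 bits")
        out.append(acc)
    return out


if __name__ == "__main__":
    print(transform_1d([255, 255, 255, 255]))
```

With 8-bit input samples the largest output in this sketch is 68 × 255 = 17340, which stays inside the 16-bit range; larger inputs trigger the overflow check.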
Motion Compensation
Motion compensation is the process of generating a prediction of a video frame by displacing the reference frame. Typically, the prediction is formed for a block (an 8 × 8 pixel tile) or a macroblock (a 16 × 16 pixel tile) of data. The displacement of data due to motion is defined by a motion vector, which captures the shift along both the x- and y-axes.
The efficiency of the codec is affected by the size of the predicted block, the granularity of sub-pixel data that can be captured, and the type of filter used for generating sub-pixel predictors. VC-1 uses 16 × 16 blocks for prediction, with the ability to generate mixed frames of 16 × 16 and 8 × 8 blocks. The finest granularity of sub-pixel information supported by VC-1 is 1/4 pixel. Two sets of filters are used by VC-1 for motion compensation. The first is an approximate bicubic filter with four taps. The second is a bilinear filter with two taps.
VC-1 combines the motion vector settings defined by the block size, sub-pixel granularity, and filter type into modes. The result is four motion compensation modes that suit a range of different situations. This classification of settings into modes also helps compact decoder implementations.
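As a rough illustration of quarter-pixel motion compensation with the two-tap bilinear filter, the Python sketch below predicts a block from a reference frame displaced by a motion vector given in quarter-pixel units. The function name `predict_block`, the rounding rule, and the edge clamping are assumptions made for this example and do not reproduce the exact arithmetic of the standard; the four-tap bicubic filter is not shown.

```python
def predict_block(ref, bx, by, mv_x4, mv_y4, size=8):
    """Predict a size x size block at (bx, by) from `ref`.

    `ref` is a list of rows of luma samples. The motion vector
    (mv_x4, mv_y4) is in quarter-pixel units. Bilinear interpolation
    is used for the fractional part.
    """
    height, width = len(ref), len(ref[0])
    # Split each component into whole pixels and a fraction in quarters.
    ix, fx = mv_x4 >> 2, mv_x4 & 3
    iy, fy = mv_y4 >> 2, mv_y4 & 3

    def sample(x, y):
        # Clamp coordinates so reads outside the frame repeat the edge.
        x = min(max(x, 0), width - 1)
        y = min(max(y, 0), height - 1)
        return ref[y][x]

    block = []
    for y in range(size):
        row = []
        for x in range(size):
            x0, y0 = bx + x + ix, by + y + iy
            top = sample(x0, y0) * (4 - fx) + sample(x0 + 1, y0) * fx
            bottom = sample(x0, y0 + 1) * (4 - fx) + sample(x0 + 1, y0 + 1) * fx
            # Weights sum to 16; add 8 to round to nearest.
            row.append((top * (4 - fy) + bottom * fy + 8) // 16)
        block.append(row)
    return block


if __name__ == "__main__":
    ramp = [[4 * x for x in range(16)] for _ in range(16)]
    print(predict_block(ramp, 4, 4, 1, 0, size=2))
```

On a horizontal ramp where each pixel is 4 larger than its left neighbor, shifting by one quarter pixel raises every predicted sample by exactly 1, which is an easy way to sanity-check the interpolation.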
Loop Filtering
VC-1 uses an in-loop deblocking filter that attempts to remove block-boundary discontinuities introduced by quantization errors in interpolated frames. These discontinuities can cause visible artifacts in the decompressed video frames and can impact the quality of the frame as a predictor for future interpolated frames.
The loop filter takes into account the adaptive block size transforms. The filter is also optimized to reduce the number of operations required.
Interlace Coding
Interlaced video content is widely used in television broadcasting. When encoding interlaced content, the VC-1 codec can take advantage of the characteristics of interlaced frames to improve compression. This is achieved by using data from both fields to predict motion compensation in interpolated frames.
Advanced B Frame Coding
A bi-directional or B frame is a frame that is interpolated from data both in previous and subsequent frames. B frames are distinct from I frames (also called key frames), which are encoded without reference to other frames. B frames are also distinct from P frames, which are interpolated from previous frames only. VC-1 includes several optimizations that make B frames more efficient.
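The basic idea of bi-directional prediction can be sketched as an average of two motion-compensated blocks, one taken from an earlier frame and one from a later frame. The Python below is a generic illustration; the function name `bidirectional_predict` and the rounding rule are assumptions, and the VC-1-specific B frame optimizations are not modeled.

```python
def bidirectional_predict(prev_block, next_block):
    """Average two equally sized blocks sample by sample, rounding half up."""
    return [
        [(a + b + 1) // 2 for a, b in zip(row_prev, row_next)]
        for row_prev, row_next in zip(prev_block, next_block)
    ]


if __name__ == "__main__":
    print(bidirectional_predict([[10, 20]], [[20, 21]]))
```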
Fading Compensation
Due to the nature of compression that uses motion compensation, encoding of video frames that contain fades to or from black is very inefficient. With a uniform fade, every macroblock needs adjustments to luminance. VC-1 includes fading compensation, which detects fades and uses alternate methods to adjust luminance. This feature improves compression efficiency for sequences with fading and other global illumination changes.
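One way to see why a global luminance adjustment helps is to model a fade as a linear remapping of the reference frame and compare the prediction error before and after. The Python sketch below assumes such a linear model; the names `apply_fade`, `scale`, and `shift` are hypothetical and are not the syntax elements of the standard.

```python
def apply_fade(ref, scale, shift):
    """Remap every luma sample of `ref` as scale * p + shift, clipped to 0..255."""
    return [[min(255, max(0, round(scale * p + shift))) for p in row] for row in ref]


def residual_energy(current, prediction):
    """Sum of squared differences between the current frame and its prediction."""
    return sum(
        (c - p) ** 2
        for cur_row, pred_row in zip(current, prediction)
        for c, p in zip(cur_row, pred_row)
    )


if __name__ == "__main__":
    reference = [[100, 200], [50, 150]]
    faded = [[50, 100], [25, 75]]  # the same picture at half brightness
    print(residual_energy(faded, reference))
    print(residual_energy(faded, apply_fade(reference, 0.5, 0)))
```

Without compensation every sample carries a residual; with a single scale and shift applied to the whole reference frame, the residual for this uniform fade drops to zero.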
Differential Quantization
Differential quantization, or dquant, is an encoding method in which multiple quantization steps are used within a single frame. Rather than quantize the entire frame with a single quantization level, macroblocks are identified within the frame that might benefit from lower quantization levels and a greater number of preserved AC coefficients. Such macroblocks are then encoded at lower quantization levels than the one used for the remaining macroblocks in the frame. The simplest and typically most efficient form of differential quantization involves only two quantizer levels (bi-level dquant), but VC-1 supports multiple levels, too.
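Bi-level dquant can be sketched as a per-macroblock choice between two quantizer values. In the Python below, the selection rule (a finer quantizer for macroblocks whose activity measure falls below a threshold) is an assumption made for illustration, as are the names `assign_quantizers` and `quantize`; the article does not specify how an encoder identifies the macroblocks that benefit.

```python
def assign_quantizers(activity, frame_q, alt_q, threshold):
    """Pick `alt_q` for macroblocks with activity below `threshold`, else `frame_q`."""
    return [alt_q if a < threshold else frame_q for a in activity]


def quantize(coeffs, q):
    """Quantize coefficients by dividing by `q` and truncating toward zero."""
    return [int(c / q) for c in coeffs]


if __name__ == "__main__":
    print(assign_quantizers([2, 40, 5, 90], frame_q=8, alt_q=4, threshold=10))
    print(quantize([20, -7, 3], 8), quantize([20, -7, 3], 4))
```

With the coarser quantizer only one of the three example coefficients survives as non-zero; with the finer quantizer two survive, which is the "greater number of preserved AC coefficients" effect described above.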
Profiles and Levels
VC-1 contains a number of profile and level combinations that support the encoding of many types of video. The profile determines the codec features that are available, and thereby determines the required decoder complexity (mathematical intensity). The following table lists VC-1 profiles and levels.
|Profile||Level||Max Bit Rate||Representative Resolutions by Frame Rate|
|Simple||Low||96 Kbps||176 × 144 @ 15 Hz (QCIF)|
|Simple||Medium||384 Kbps||240 × 176 @ 30 Hz; 352 × 288 @ 15 Hz (CIF)|
|Main||Low||2 Mbps||320 × 240 @ 24 Hz (QVGA)|
|Main||Medium||10 Mbps||720 × 480 @ 30 Hz (480p); 720 × 576 @ 25 Hz (576p)|
|Main||High||20 Mbps||1920 × 1080 @ 30 Hz (1080p)|
|Advanced||L0||2 Mbps||352 × 288 @ 30 Hz (CIF)|
|Advanced||L1||10 Mbps||720 × 480 @ 30 Hz (NTSC-SD); 720 × 576 @ 25 Hz (PAL-SD)|
|Advanced||L2||20 Mbps||720 × 480 @ 60 Hz (480p); 1280 × 720 @ 30 Hz (720p)|
|Advanced||L3||45 Mbps||1920 × 1080 @ 24 Hz (1080p); 1920 × 1080 @ 30 Hz (1080i); 1280 × 720 @ 60 Hz (720p)|
|Advanced||L4||135 Mbps||1920 × 1080 @ 60 Hz (1080p); 2048 × 1536 @ 24 Hz|
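The profile and level table lends itself to a simple lookup. The Python sketch below encodes the maximum bit rates from the table and returns the lowest level of a profile that can carry a given bit rate. The names `LEVELS` and `lowest_level` are hypothetical, and the conversion 1 Mbps = 1000 Kbps is an assumption.

```python
# (profile, level, maximum bit rate in Kbps), in the order of the table above.
LEVELS = [
    ("Simple", "Low", 96),
    ("Simple", "Medium", 384),
    ("Main", "Low", 2_000),
    ("Main", "Medium", 10_000),
    ("Main", "High", 20_000),
    ("Advanced", "L0", 2_000),
    ("Advanced", "L1", 10_000),
    ("Advanced", "L2", 20_000),
    ("Advanced", "L3", 45_000),
    ("Advanced", "L4", 135_000),
]


def lowest_level(profile, bitrate_kbps):
    """Return the lowest level of `profile` whose maximum bit rate covers
    `bitrate_kbps`, or None if no level of that profile is sufficient."""
    for prof, level, max_kbps in LEVELS:
        if prof == profile and bitrate_kbps <= max_kbps:
            return level
    return None


if __name__ == "__main__":
    print(lowest_level("Advanced", 30_000))
    print(lowest_level("Simple", 500))
```

A real conformance check would also consider resolution and frame rate, which this sketch leaves out.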
VC-1 Compared to Other Codecs
VC-1 is very competitive when compared to other codecs in use today. This section compares the performance of VC-1 with MPEG-2 and H.264.
Quality Comparison
VC-1 and H.264 represent a logical technological evolution in video compression compared to MPEG-2. Both of these codecs are generally able to achieve superior quality over MPEG-2 at comparable bit rates.
Measuring the quality of a video codec is not easy, because the reconstructed image is not meant to be identical to the original. Ideally, only information that is perceptually irrelevant will be lost in the compression/decompression process, but what counts as "irrelevant" depends on the viewer's subjective response. It is important to note that an empirical codec comparison is always based on a practical implementation of a codec specification, which therefore means a compared codec is only as good as its implementation. It is very difficult to compare video compression standards in general terms when there can be significant differences between quality and performance of various implementations within the same codec class. Codec comparison results can therefore vary greatly depending on selected encoder implementations, decoder (post-processing) implementations, video sources, encoding methods and user scenarios.
One useful objective metric is the peak signal-to-noise ratio (PSNR) plotted against bit rate. PSNR is the ratio, usually expressed in decibels, between the maximum possible power of a signal (peak value 255 for 8-bit video) and the power of the noise introduced by quantization. A higher PSNR indicates a less noisy signal. For any codec, PSNR is expected to increase at higher bit rates, because higher bit rates translate to less aggressive compression. Thus, a graph that plots PSNR against bit rate shows the performance of the codec over a range of compression settings.
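PSNR is commonly computed from the mean squared error (MSE) between the original and decoded frames as 10 · log10(peak² / MSE), in decibels. The Python sketch below implements that common definition for frames stored as lists of rows; the function name `psnr` is chosen for illustration, and the measurement procedure used in the tests cited in this article is not specified here.

```python
import math


def psnr(original, decoded, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized frames."""
    squared_error = 0
    count = 0
    for row_o, row_d in zip(original, decoded):
        for o, d in zip(row_o, row_d):
            squared_error += (o - d) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical frames: no noise at all
    return 10 * math.log10(peak ** 2 / mse)


if __name__ == "__main__":
    print(round(psnr([[10, 20]], [[11, 19]]), 2))
```

Computing this value for the same clip encoded at several bit rates gives the points of a PSNR-versus-bit-rate curve.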
The key arbiter of codec quality is the subjective appearance of the decoded video. For example, the DVD Forum conducted tests in the winter of 2002 to select codecs for the next-generation optical disc standard. Viewers from Hollywood film studios and major consumer electronics companies rated video clips on a scale of 1 to 5 for resolution, noise, and overall impression. Multiple codec implementations were tested, including MPEG-2, VC-1 (provided by Microsoft), H.264, and MPEG-4 Advanced Simple Profile. The baselines against which the codecs were compared were D5 masters and D-VHS (24 Mbps). During the tests, viewers were not told which codec was used to encode each of the clips.
On all three measures (resolution, noise, and overall impression), the quality of the Microsoft VC-1 encoder was judged closest to the original D5 master. By comparison, the H.264 encoder that was tested rated as comparable only to MPEG-2 on two of the three measures (resolution and overall impression), and was rated somewhat worse than VC-1 on noise.
VC-1 codecs have performed well in other independent subjective quality tests:
DV Magazine found VC-1 to be superior to both MPEG-2 and MPEG-4.
TANDBERG Television found VC-1 produces significantly better quality than MPEG-2 and comparable quality to H.264. These results were presented at the 2003 International Broadcasting Convention (IBC).
C'T Magazine, Germany's premier audio-video magazine, compared various codecs, including VC-1, H.264, and MPEG-2, and selected VC-1 as producing the best subjective and objective quality for high-definition (HD) video.
The European Broadcasting Union (EBU) found VC-1 had the most consistent quality in tests that compared Microsoft VC-1, RealMedia V9, the Envivio MPEG-4 encoder, and the Apple MPEG-4 encoder.
Complexity Comparison
It is not enough to deliver high-quality video. A video codec must also be efficient to decode, particularly when the codec is implemented in hardware. Lower complexity means less silicon, lower cost, and fewer problems with power consumption and heat.
Because they are more sophisticated, VC-1 and H.264 are both more complex to decode than MPEG-2. Yet VC-1 is more than twice as efficient to decode as H.264. A study by 3GPP, a collaboration group that is setting 3G mobile phone standards, found that VC-1 Main Profile requires 25 percent fewer cycles than H.264 Baseline. It should be noted that H.264 Main Profile requires even more cycles than Baseline, because it includes highly complex arithmetic coding, also known as CABAC.
In fact, software decoding of VC-1 at 1080p (1920 × 1080 progressive) resolution is possible on today's off-the-shelf computer hardware. In the hardware domain, companies can do more with a single DSP because VC-1 is easier to implement.
VC-1 Adoption
VC-1 has already been adopted by the digital video industry and a number of standards bodies and industry organizations in addition to SMPTE.
Next-Generation Optical Media. All of the leading next-generation optical media formats have adopted VC-1 as a mandatory codec. The DVD Forum has mandated VC-1, H.264, and MPEG-2 for the HD DVD format. The Blu-ray Disc Association has mandated the same three codecs for their blue-laser Blu-ray Disc format. And the recent FVD standard from Taiwan has adopted VC-1 as the only mandated video codec.
Chips. Numerous DSP and chip manufacturers have begun to support VC-1.
Professional Video Equipment. VC-1 is being used for professional video broadcast and delivery today. Leading industry companies already have products on the market that support VC-1, ranging from encoders and decoders to professional video test equipment.
Home Networks. VC-1 is an optional format in the Digital Living Network Alliance (DLNA) standards. DLNA is developing a set of interoperability guidelines for home networks. These guidelines will enable computers, portable devices, and home consumer electronic devices such as stereos and set-top boxes to share digital media seamlessly over a home network.
Mobile Devices. VC-1 is one of the formats included in the Digital Video Broadcasting - Handheld (DVB-H) specification, and is a key component of Modeo's new DVB-H solution. VC-1 is also part of new broadband, Wi-Fi, and cellular delivery solutions such as MobiTV.
Transport Independence
The VC-1 codec is not tied to any particular transport mechanism. From the beginning, the codec was designed to take into account the existing MPEG-2 Systems layer. As a result, it can be used easily with existing broadcast infrastructures.
Closed captions, active format descriptions (AFD), and other information can be carried in user data.
The organization of the video stream into I, B, and P frames enables conventional tuning and trick modes.
In addition to participating in standards development, Microsoft is also working with companies to support broadcast solutions. For example, at the 2004 IBC convention, a prototype system was demonstrated that delivers VC-1 over satellite using DVB-S2. Pre-encoded video files were multiplexed, encapsulated in transport streams, and streamed to a DVB-S2 modulator. The satellite uplink was located in the United Kingdom, and the signal was received at the IBC convention center in Amsterdam, where it was demodulated, decoded, and played back on the convention floor.
VC-1 video encoded for file-based playback commonly uses the Microsoft Advanced Systems Format (ASF) file container as part of the Windows Media ecosystem, but VC-1 has also been standardized for storage in industry-recognized MP4 and MXF (Material eXchange Format) file containers.
Tools
Microsoft has a number of tools available to help companies adopt VC-1 technology:
Microsoft VC-1 Encoder is a commercial software development kit (SDK) for third-party solution companies to incorporate into their commercial products. It allows for full API control over a large range of encoding parameters and has been designed for easy integration and upgrades. Licensees quickly benefit from the quality and performance of the SDK to provide new features and improved results to their customers. For additional information about licensing, see the Microsoft VC-1 Encoder Web site.
Windows Media Format 11 SDK is a freely downloadable software development kit (SDK) that includes VC-1 compliant Windows Media Video 9 (WMV9) codecs for Windows XP and Windows Vista. It supports a range of profiles and bit rates, from real-time encoding (suitable for live videoconferencing) to best quality 2-pass VBR encoding. Examples of encoder applications that use WMV9 codecs include Windows Media Encoder 9 Series and Microsoft Expression Encoder. For additional information about licensing, see the Windows Media Licensing page.
Windows Media licensees have access to a collection of utilities that convert between ASF and MPEG-2 transport streams. These utilities allow studios to convert existing ASF-encapsulated VC-1 content (WMV9) to IPTV-ready VC-1 content in MPEG-2 Transport Streams. For additional information about licensing, see the Windows Media Licensing page.
The SMPTE test materials are the official tools for VC-1 conformance testing. These materials include the source code for a reference decoder and a sample encoder as well as bitstreams for testing. The SMPTE test materials are acquired through the SMPTE VC-1 Test Materials Access program found at the SMPTE Web site.
Conclusion
VC-1 is a cutting-edge codec that offers very high image quality with excellent compression efficiency. VC-1 is capable of delivering high-definition video at bit rates as low as 6 to 8 Mbps.
The emphasis during development on reducing the computational power required by the VC-1 decoder provides advantages for a broad range of media consumers. Personal computer users can decode full 1080i/p resolution video with off-the-shelf hardware, making HD video delivery a reality for the home computer. Perhaps more important than the benefits of VC-1 to the personal computer market is its value in the consumer electronics space. Hardware supporting VC-1 includes next generation DVD players, set-top boxes, portable media devices, wireless phones, and more. Major industry players are selecting VC-1 for its scalability and quality.
VC-1 is leading the next wave of digital video. It is a high-quality codec that benefits from the resources of Microsoft while being an open standard. Adopters can choose to develop custom implementations and solutions or to use the existing support provided by Windows Media technologies. VC-1 offers something for every digital video solution.