Surrounding Windows Media 9 Series: Windows Media Audio 9 Series Codecs for Surround Sound Audio

Hector La Torre
Mike Sokol
Fits & Starts Productions

March 2004

Applies to:
Microsoft® Windows Media® Audio 9 codec
Microsoft Windows Media Audio 9 Voice codec
Microsoft Windows Media Audio 9 Professional codec
Microsoft Windows Media Audio 9 Lossless codec

Introduction

Over the next several months, visitors to this website will find our series of articles about multichannel audio, also known as surround sound audio. Our goal is to deliver articles that will educate and instruct our readers about the technology and equipment used for multichannel audio and to demonstrate how anyone can get involved with it.

Microsoft® Windows Media® Audio 9 Series consists of a family of audio codecs with applications from low-data rate voice encoding (Windows Media Audio 9 Voice codec) to CD-quality music encoding (Windows Media Audio 9 codec) to high-resolution stereo and multichannel delivery (Windows Media Audio 9 Professional codec) to lossless encoding for mastering and archiving (Windows Media Audio 9 Lossless codec).

About Hector and Mike
Hector La Torre is the founder and managing partner of Fits & Starts Productions, LLC. The company has conducted nearly 300 surround sound seminars and is the leading provider of audio seminars in the United States. In another life, Hector was a Boston- and New York-based musician and record producer, former executive director of EQ and Gig magazines, and technical editor of Pro Sound News magazine. Hector La Torre also is an audio consultant and technical writer for the music and recording industry. For more information, go to the Fits & Starts Productions website.

Mike Sokol is a live and recording audio engineer with more than three decades of experience in the music field. He has authored more than 1,200 articles for numerous pro-audio magazines over the last 20 years and has published a book on acoustic instrument sound reinforcement. He also owned a computer integration and repair business for 15 years before joining Fits & Starts Productions LLC as a lead instructor, with more than 250 seminars on surround production techniques taught over the last five years. His new book on surround audio is currently being written for publication late 2004. For more information, go to the Fits & Starts Productions website.

Our series will focus primarily on the capabilities of the Windows Media Audio 9 Professional and Windows Media Audio 9 Lossless compression technology and tools—after all, this is the Microsoft Web site—but we will also be providing hands-on information about the fundamentals of multichannel audio, including equipment choices and operation guidelines. No matter how useful a tool the Windows Media Audio 9 codec might be, its effectiveness would be minimized if the basic recording and playback system were improperly set up and calibrated.

Because of the extensive creative potential of the Windows Media Audio 9 Professional codec, these articles are targeted to a broad spectrum of disciplines and interests: from the active content creators—engineers, producers, and musicians—who use surround technology as a new musical palette, to the music lovers who want to experience a new way of listening to audio by enveloping themselves "inside" the music. More specifically, we will try to serve:
  • Audio professionals working in commercial and project studios who will look to the Windows Media Audio 9 codec as a means to streamline workflow and help to make audio files more secure by using digital rights management (DRM).
  • Working musicians who can use their existing production tools, such as Steinberg Nuendo 2.0, Digidesign Pro Tools and others, as well as freely available tools such as Microsoft Windows Media Encoder 9 Series to efficiently transfer compressed audio files of works in progress to other band members.
  • Hard-core music fans who can use the Windows Media Audio 9 codec and these articles to set up a listening system that enables them to appreciate true multichannel audio playback.

Those individuals who are already involved with multichannel audio will find additional helpful techniques on how to properly outfit their studios and calibrate the components for quality surround sound playback. Those new to the game will find detailed information on the necessary components in a surround system, and how the computer is at the heart of that system. In brief, we want to take you on a multichannel ride from start to finish—from capturing music to encoding to mastering to final playback.

As we progress in this series, you can expect to learn about Windows Media Audio 9 Professional system requirements, sound cards, playback systems, additional multichannel distribution methods (for example, Digital Theater Systems, Dolby Digital AC3, SRS Circle Surround, DVD-A, SACD, and MP4) and the differences among them, and much more.


Defining Multichannel High-Resolution Audio

Let's begin by discussing exactly what multichannel audio is, and how you can create, shape, and play it back all from your computer. To do so, we need to get a few definitions—and a bit of history—out of the way so we are all speaking the same language.

"Multichannel audio" is technically defined as anything with more than two channels. So stereo is not multichannel, and mono certainly isn't multichannel. However, a three-channel system, such as an LCR (left, center, right) is multichannel as is four-channel LCRS (left, center, right, surround). After you get into four channels of audio, you have truly entered surround territory. The next level, 5.1 surround, has three speakers in front of the listener in an LCR setting and two speakers somewhat behind the listener. The .1 channel (called "point one," not "dot one") in 5.1 is a holdover from the cinema days when it was used as a low-frequency effects (LFE) channel for thunder, guns, and other low-end rumbling special effects. The .1 channel is called such because it is about one-tenth the bandwidth of the other channels. Its frequency range is about 5 hertz (Hz) to 120 Hz. This 5.1 format is what is generally referred to as "surround," and the focus of these articles will be multichannel surround of 5.1 channels and above. (The Windows Media Audio 9 Professional codec can actually handle up to 8 discreet channels of audio, where the eighth channel can be a full bandwidth channel, not just a reduced bandwidth LFE-type channel).


The Multichannel Movement

The drive behind the original multichannel movement came from the film industry, where surround audio is more than 50 years old, having started with a mouse named Mickey in a movie named Fantasia. Ten years ago, the technology became inexpensive enough that home consumers could own a multichannel sound system, the now ubiquitous home-theater system, and surround music was on its way. The ride is only just beginning.


All Things in Good Time

Like many great ideas before their time, surround sound just wasn't easily available to consumers in their homes as soon as it could be produced by the professional industry. In fact, the music industry had a brief dalliance with surround music in the 1970s called Quad, which failed primarily due to a lack of a workable consumer delivery system. It wasn't until Dolby Pro Logic, Dolby Digital, and Digital Theater Systems (DTS) audio became widespread in the cinema industry in the late 1980s that the technology trickled down to the home theater surround systems we currently enjoy.

What made surround sound for the home possible was a new technology called a multichannel codec. This codec lets you encode 5 or more channels of music into a single audio file. This file can then be transported, stored, and played much like any stereo information, but can be decoded back to its original multichannel streams. The term codec is an abbreviation of compressor/decompressor.


Day of the Codec

The basic idea behind a multichannel audio codec is simple. Let's say you have six music files designated for various speakers in the listening room, as shown in Figure 1.

Figure 1. Speaker configuration for surround sound

The speakers are labeled L for Left Front, C for Center Front, R for Right Front, Ls for Left surround, Rs for Right surround, and LFE for the subwoofer. An encoding application, such as Windows Media Encoder 9 Series, lets you assign these six files to the appropriate speakers through a dialog box on your screen. After you make the appropriate assignments, you provide a name for the output file, as shown in Figure 2, click OK, and a few minutes later a single file is created from the original six.

Figure 2. Using Windows Media Encoder to assign audio files to speakers

You can then store this file on your hard disk or copy it to removable media such as a CD-R or USB RAM card. When the time comes, you can either e-mail the file to another computer or place it on a Web server so it can be streamed in real time. On the playback end, the decoder turns the single file back into its six original streams and routes them to the appropriate speakers in the listening room. Et voilà, you have surround audio you can move around just as you would a stereo file.
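
To make the "six files in, one file out, six streams back" idea concrete, here is a minimal sketch in Python. It is not Windows Media Encoder and it does no compression at all: it simply interleaves six uncompressed 16-bit channels into a single multichannel WAV container using the standard `wave` module, then splits them apart again. The function names `mux` and `demux`, the channel order, and the tiny placeholder sample values are our own assumptions for illustration.

```python
import io
import struct
import wave

# Channel order is an assumption made for this sketch.
CHANNEL_ORDER = ["L", "C", "R", "Ls", "Rs", "LFE"]


def mux(channels, sample_rate=44100):
    """Interleave six equal-length lists of 16-bit samples into one WAV byte string."""
    length = len(channels[CHANNEL_ORDER[0]])
    if any(len(channels[name]) != length for name in CHANNEL_ORDER):
        raise ValueError("all channels must have the same number of samples")
    frames = bytearray()
    for i in range(length):
        for name in CHANNEL_ORDER:
            frames += struct.pack("<h", channels[name][i])
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(len(CHANNEL_ORDER))
        wav.setsampwidth(2)  # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))
    return buffer.getvalue()


def demux(data):
    """Split the single multichannel file back into one sample list per speaker."""
    with wave.open(io.BytesIO(data), "rb") as wav:
        count = wav.getnchannels()
        raw = wav.readframes(wav.getnframes())
    samples = struct.unpack("<%dh" % (len(raw) // 2), raw)
    return {name: list(samples[i::count]) for i, name in enumerate(CHANNEL_ORDER)}


# Six tiny placeholder "recordings", one per speaker.
sources = {
    name: [index * 100 + n for n in range(4)]
    for index, name in enumerate(CHANNEL_ORDER)
}
single_file = mux(sources)
print(len(single_file), "bytes in one file")
print(demux(single_file) == sources)
```

A real codec adds the compression step discussed in the next section; the routing of named channels into and out of one container is the part this sketch is meant to show.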


Compress This

A secondary function of the Windows Media Audio codec is to reduce the bandwidth needed to carry six or more channels of Pulse Code Modulation (PCM) audio, which streams out at 1.5 megabits per second (Mbps) per channel, down to a manageable size. After all, if a disc could hold only one-third the amount of play time as its stereo capacity, it would be of limited use. In fact, codecs use a lossy compression method (see the topic "Defining Compression") to reduce the file size (and bandwidth) by a factor of 10:1 or more. The surround encoder with the Windows Media Audio family of codecs can compress audio in excess of 20:1 and still maintain musical fidelity. The Windows Media Audio 9 Professional encoders can achieve data rates as low as 128 kilobits per second (Kbps) (see Figure 3) for 5.1 surround sound audio, which makes the size and bandwidth of a surround sound file on par with a traditional stereo MP3 audio file.

Figure 3. Data rates for the Windows Media Audio 9 Professional codec
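
The effect of data rate on file size and play time is simple arithmetic, sketched below. The numbers are assumptions chosen for illustration: CD-quality resolution (16 bits at 44.1 kHz) for the uncompressed PCM streams, a four-minute song, and a disc holding 700 million bytes of data. The helper names are ours.

```python
def file_size_megabytes(bit_rate_bps, seconds):
    """Size of a constant-bit-rate stream in megabytes (10**6 bytes)."""
    return bit_rate_bps * seconds / 8 / 1_000_000


def play_time_minutes(capacity_bytes, bit_rate_bps):
    """Minutes of audio that fit in a given capacity at a constant bit rate."""
    return capacity_bytes * 8 / bit_rate_bps / 60


STEREO_PCM_BPS = 2 * 44_100 * 16       # assumed CD-quality stereo PCM
SIX_CHANNEL_PCM_BPS = 6 * 44_100 * 16  # same resolution, six channels
SURROUND_WMA_BPS = 128_000             # lowest 5.1 rate quoted in the text
DISC_BYTES = 700_000_000               # assumed disc capacity
SONG_SECONDS = 4 * 60                  # assumed song length

for label, rate in [
    ("stereo PCM", STEREO_PCM_BPS),
    ("six-channel PCM", SIX_CHANNEL_PCM_BPS),
    ("5.1 at 128 Kbps", SURROUND_WMA_BPS),
]:
    size = file_size_megabytes(rate, SONG_SECONDS)
    minutes = play_time_minutes(DISC_BYTES, rate)
    print(f"{label}: {size:.2f} MB per song, {minutes:.1f} minutes per disc")
```

Under these assumptions, six uncompressed channels fit exactly one-third as many minutes on the disc as stereo does, which is the problem the codec's compression is meant to solve.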

High Resolution

The second aspect of audio compression is the audio resolution, that is, the frequency at which audio is sampled as it is converted from analog to digital format, and the number of bits it uses at each sample time. With the Windows Media Audio 9 codec, audio is sampled at 44.1 or 48 kilohertz (kHz) using 16 bits, similar to the current CD standard. The Windows Media Audio 9 codec is able to offer CD quality at data rates from 64 to 192 Kbps, at 16 bits per sample. The Windows Media Audio 9 Professional codec, however, is able to sample audio at a higher sample rate (96 kHz) using 24 bits per sample. This additional sampling information enables the codec to deliver a better-than-CD experience at far lower data rates than current CDs.
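
Resolution translates directly into raw data: the uncompressed PCM rate is the sample rate times the bits per sample times the number of channels, and each extra bit doubles the number of amplitude steps available. The following sketch works that out for the two resolutions mentioned above. The function names are ours, and the six-channel case at 96 kHz and 24 bits is only an example configuration.

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Uncompressed PCM data rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


def quantization_levels(bits_per_sample):
    """Number of distinct amplitude values a sample can take."""
    return 2 ** bits_per_sample


cd_stereo = pcm_bit_rate(44_100, 16, 2)
high_res_surround = pcm_bit_rate(96_000, 24, 6)

print("CD stereo:", cd_stereo, "bits per second")
print("96 kHz / 24-bit, six channels:", high_res_surround, "bits per second")
print("16-bit levels:", quantization_levels(16))
print("24-bit levels:", quantization_levels(24))
```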


Defining Compression

Now that you are familiar with codecs, a few definitions about compression types are in order. First, remember this is computer data compression, definitely not the dynamic-range compression we are all used to in the recording studio and on stage. The two fundamental types of data compression are lossless and lossy.

Lossless compression is what occurs every time you use a file compression program such as PKWARE PKZIP to reduce the size of a database file. There are several technologies involved, but the simplest way to think about it is to imagine taking all the extra spaces out of a document and putting the words into one long line. Of course, some documents have few extra spaces, and hence little wasted space to recover, while other documents have a lot of blank places that can be compressed. When we decompress a file, the opposite action takes place: the spaces are reinserted into the document.

We generally talk about this compression in ratios so that, for instance, a 2:1 compressed file becomes one-half its original size. We can extend this same idea of lossless compression to music, but the problem is that most audio files have very few blank spaces to compress. So trying to use PKZIP on a music track is futile. However, a few codecs exist that accomplish this feat through a series of folding techniques in which unused bit areas are shared with more complex parts of the song. The Windows Media Audio 9 Lossless codec is quite effective at reducing the size of the files in a lossless manner. You can generally achieve compression ratios of only around 2:1 or so, but upon decompression an exact bit-for-bit duplicate of the original audio file is produced. No data is lost, which makes the codec ideal for archiving content masters.
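
You can see both halves of this argument with a general-purpose lossless compressor. The sketch below uses Python's standard `zlib` module, which is in the same family as ZIP tools but is not the Windows Media Audio 9 Lossless algorithm. The "document" full of spaces and the seeded random bytes standing in for dense audio data are our own assumptions; real music is not purely random, so treat the second case as the extreme end of the scale.

```python
import random
import zlib


def ratio(data):
    """Compression ratio: original size divided by compressed size."""
    return len(data) / len(zlib.compress(data, 9))


# A "document" with a lot of blank space between words.
padded_text = (b"word" + b" " * 60) * 500

# Seeded random bytes as a stand-in for data with no blank spaces to remove.
rng = random.Random(0)
noise_like = bytes(rng.randrange(256) for _ in range(len(padded_text)))

# Lossless means the round trip is bit-for-bit identical.
restored = zlib.decompress(zlib.compress(padded_text, 9))
print("identical after round trip:", restored == padded_text)

print("ratio for padded text: %.1f:1" % ratio(padded_text))
print("ratio for noise-like data: %.2f:1" % ratio(noise_like))
```

The padded text shrinks dramatically, while the noise-like data does not shrink at all, which is why a purpose-built lossless audio codec is needed to reach even 2:1 on music.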

Lossy compression is a very different thing. Instead of looking for empty spaces to store extra bits, it works by using a perceptual coding technique. That is, a lossy compression algorithm understands how human beings hear things and what parts of the data are important to musical fidelity. For instance, loud musical notes mask our ability to hear softer notes in the same passage. So a cymbal crash gets our attention, and we don't hear the acoustic guitar harmonics that happen to be playing at the same time. This allows the codec to eliminate the acoustic guitar for a few milliseconds and save a few thousand bytes of storage with every cymbal crash.

Because impact noise clouds our ear for a few milliseconds, the codec can eliminate some complex musical data after every transient. By doing these reductions, as well as looking for related harmonic content and such, a lossy codec can attain data compression rates of 20:1 or more. This means you can compress a 20-megabyte (MB) file to a size of 1 MB or less, reducing the download time and bandwidth by a similar factor. Note that higher compression rates result in smaller files requiring less bandwidth.
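
As a toy illustration of masking, the sketch below takes the sounds present in one short slice of music, each with a loudness level, and drops anything that falls too far below the loudest sound. This is emphatically not how the Windows Media Audio codecs work: real perceptual coders use frequency-dependent and time-dependent masking models. The 30 dB margin, the level values, and the function name `drop_masked` are all invented for the example.

```python
def drop_masked(components, mask_margin_db=30.0):
    """Keep only components within mask_margin_db of the loudest one.

    components is a list of (name, level_db) pairs for one short time slice.
    """
    if not components:
        return []
    loudest = max(level for _, level in components)
    return [
        (name, level)
        for name, level in components
        if loudest - level <= mask_margin_db
    ]


# One imaginary slice of a song, with made-up levels in decibels.
time_slice = [
    ("cymbal crash", 95.0),
    ("bass guitar", 80.0),
    ("acoustic guitar harmonic", 50.0),
]

kept = drop_masked(time_slice)
print("kept:", [name for name, _ in kept])
print("components saved:", len(time_slice) - len(kept))
```

Every component the coder can leave out is data that never has to be stored or transmitted, which is where the large lossy compression ratios come from.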

On the decoding or playback side of the compression, the codec then estimates what must have happened originally during compression and expands the audio file back to its original size. However, the output file is never exactly the same as the original, as there is some loss in musical fidelity. This loss can vary from being imperceptible to most listeners at low compression rates (high bandwidth) to quite obvious at very high compression rates (low bandwidth). How much fidelity loss occurs is a function of the quality of the codec, the complexity of the music, and how much compression is needed to get the file to a usable size and small enough bandwidth.
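
The defining property of a lossy round trip, that the output has the same length as the input but not the same values, can be shown with a much cruder technique than perceptual coding. The sketch below simply throws away the low-order bits of each 16-bit sample and then estimates the original value on decode. It is plain requantization, used here only as a stand-in; the function names, the choice of 8 retained bits, and the sample values are assumptions for illustration.

```python
def encode(samples, keep_bits=8):
    """Keep only the top keep_bits of each 16-bit sample (the lossy step)."""
    shift = 16 - keep_bits
    return [s >> shift for s in samples]


def decode(codes, keep_bits=8):
    """Estimate each original sample as the middle of its quantization step."""
    shift = 16 - keep_bits
    return [(c << shift) + (1 << (shift - 1)) for c in codes]


original = [0, 1000, -1000, 12345, -20000, 32767]
decoded = decode(encode(original))

print("original:", original)
print("decoded: ", decoded)
print("same length:", len(decoded) == len(original))
print("bit-for-bit identical:", decoded == original)
print("largest error:", max(abs(a - b) for a, b in zip(original, decoded)))
```

Keeping more bits (less compression) shrinks the error, and keeping fewer bits (more compression) enlarges it, mirroring the trade-off between data rate and fidelity described above.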

The Windows Media Audio 9 Professional codec can reduce 5.1 surround sound data rates to as low as 128 Kbps. This is lower than the original bandwidth of nearly 3,600 Kbps for six tracks of CD-quality 16-bit/44.1 kHz audio. That is a compression ratio of around 28:1 with acceptable mid-fidelity music. At higher data rates (less compression) of 192 Kbps, the Windows Media Audio 9 Professional codec compares favorably in listening tests to other high-resolution multichannel codecs at twice the data rate. In addition, when using 5.1 surround sound audio compressed at 384 Kbps with the Windows Media Audio 9 Professional codec, most listeners cannot discern any differences between the compressed music and the original PCM files. We will discuss that in more detail in future articles.
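
The ratios quoted here are easy to check. The short sketch below divides the article's round figure of 3,600 Kbps by each of the data rates mentioned, and also computes the raw PCM rate from first principles. One plausible reading (an assumption on our part) is that the "nearly 3,600 Kbps" figure counts five full-bandwidth channels plus a narrow LFE channel, since five channels at 16 bits and 44.1 kHz come to 3,528 Kbps, while six full-bandwidth channels would come to about 4,234 Kbps.

```python
def compression_ratio(original_kbps, compressed_kbps):
    """How many times smaller the compressed stream is than the original."""
    return original_kbps / compressed_kbps


ARTICLE_PCM_KBPS = 3600                    # round figure used in the text
FIVE_FULL_CHANNELS = 5 * 44_100 * 16 / 1000  # five full-bandwidth channels
SIX_FULL_CHANNELS = 6 * 44_100 * 16 / 1000   # six full-bandwidth channels

print("five full channels:", FIVE_FULL_CHANNELS, "Kbps")
print("six full channels:", SIX_FULL_CHANNELS, "Kbps")

for target_kbps in (128, 192, 384):
    r = compression_ratio(ARTICLE_PCM_KBPS, target_kbps)
    print(f"{target_kbps} Kbps -> about {r:.1f}:1")
```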


Stay Tuned

The next article in this series will quickly get you up and listening to surround sound encoded with the Windows Media Audio 9 codec, so you'll have something with which to test your encoded audio. We also will show you how to get Microsoft Windows Media Player 9 Series to recognize your files as multichannel. Then, in future articles, we will jump into a whole series of how-to steps on using Windows Media Player 9 Series to play streams from your server, its down mixing and metadata functions, and much more.

© 2009 Microsoft Corporation. All rights reserved.