Chapter 3: Programming DirectShow Applications
Although sample DirectShow filter graphs can be constructed and tested in GraphEdit, application programmers want to use standard programming languages (either C or C++) to construct DirectShow applications. Although Visual Basic is an easy-to-learn and fully functional programming environment, the Visual Basic support for DirectShow programming interfaces is minimal. If you're a Visual Basic programmer, don't despair: nearly everything that follows is useful information, even if it can't be directly applied to your programming needs. For the purposes of this text, we'll be using the Microsoft Visual C++ integrated development environment, which provides a robust platform for the design and testing of DirectShow applications.
The design of a DirectShow application is straightforward and generally has three logical parts: initialization, where the application environment is established, followed by the construction of the DirectShow filter graph; execution, when the filter graph enters the running state and processes a stream of data; and cleanup, when data structures are deallocated and system resources released. This isn't significantly different from the model used by any other Windows application, and as a result, DirectShow applications can be combined with existing Windows applications very easily.
Before we can dive in and create a simple "media player" using DirectShow, a player that can be used to play any media types for which there are corresponding DirectShow filters (at the very least, these will include AVI, WAV, and Windows Media files), we need to cover some ground for programmers unfamiliar with the Microsoft Component Object Model (COM) programming interfaces. The DirectShow programming interfaces to filters and filter graphs present themselves as COM objects, so most DirectShow applications are COM-intensive. Although this might sound daunting, you don't need to know very much about COM to write fully functional DirectShow programs.
For example, a word processor and an e-mail client both need access to a spelling checker, so why write two spelling checkers when a reusable code object could be invoked by both programs when needed? In the ideal case, nothing would need to be known about the reusable code object other than its name. Once the programmer included this ideal object, the program would be able to query it to determine its properties. Like a normal object, this object would have data and methods, both of which would be accessible (if public) to the invoking program. In short, the ideal reusable code object would act just like code written by the programmer.
By the mid-1990s, Microsoft had introduced COM, its version of the reusable code object. Although the first generations of COM were somewhat rough in their design, nearly a decade of refinement has produced a level of functionality that begins to approach the ideal of the reusable code object. A COM object, like a C++ object, has properties that can be inspected, methods to be invoked, and interfaces that illustrate characteristics inherited from base classes. The creator of a COM object can choose to hide or reveal any of these qualities, producing an object that is both easy to manage and easy to use.
CoInitializeEx(NULL, COINIT_APARTMENTTHREADED); // Initializes COM
After COM has been initialized, calls to the COM libraries can be made in any desired fashion. Before the application terminates its execution, COM must be shut down again. (Failure to shut down COM could result in execution errors when another program attempts to use COM services.) To release COM services, add this line of code to the application's cleanup code:
CoUninitialize(); // Releases COM
No calls to the COM services can be made after COM has been uninitialized.
The COM routine CoCreateInstance is used to create COM objects. In the case of the Filter Graph Manager, the code to create it might look like this:
IGraphBuilder *pGraphBuilder = NULL; // Pointer to created object
HRESULT hr = CoCreateInstance(CLSID_FilterGraph, NULL, CLSCTX_INPROC_SERVER, IID_IGraphBuilder, (void **)&pGraphBuilder);
The CoCreateInstance call takes five arguments, beginning with a class ID, in this case CLSID_FilterGraph, which requests that a COM object representative of a filter graph (really, a Filter Graph Manager) be created. The NULL parameter indicates that this is not an aggregate object, which will be the case in any DirectShow application. The value CLSCTX_INPROC_SERVER indicates that the COM object is being loaded from an in-process (local to your application) DLL. This value is always present in this parameter.
The next argument is an interface ID, which informs COM of the unique interface being requested by the caller. In this case, the value is IID_IGraphBuilder, which means that you will be retrieving the object's IGraphBuilder interface, which has methods for building filter graphs. Later, you'll need to use another interface on the same object, IMediaControl, which provides methods to start, stop, and pause the graph. The pointer address is returned in the last parameter. This pointer is cast as void**, a generic pointer to a pointer, because the function could return a pointer to any number of objects.
Nearly every COM invocation returns a status code of some sort or another; this code should always be examined for error values, using the macros SUCCEEDED and FAILED to test for success or failure. A COM call that generates an error indicates either a logical error in the program or some failure of the operating system to fulfill a request, perhaps because resources already in use or as yet uninitialized have been requested by the program.
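The test these macros perform can be illustrated without the Windows headers. The sketch below is a portable model, not the SDK's own definitions: HResult, succeeded, and failed are hypothetical stand-ins for HRESULT, SUCCEEDED, and FAILED. It relies on one property of real status codes: the sign bit marks failure, so any non-negative value counts as success, including S_FALSE.

```cpp
#include <cstdint>

// Hypothetical stand-in for HRESULT: a signed 32-bit status code.
using HResult = std::int32_t;

// These values match the Windows SDK constants named in the comments.
constexpr HResult kOk    = 0;                                  // S_OK
constexpr HResult kFalse = 1;                                  // S_FALSE
constexpr HResult kFail  = static_cast<HResult>(0x80004005u);  // E_FAIL

// Success means the sign bit is clear; failure means it is set.
constexpr bool succeeded(HResult hr) { return hr >= 0; }
constexpr bool failed(HResult hr)    { return hr < 0; }
```

This is why comparing a status code directly against S_OK can be too strict: a call that returns S_FALSE has still succeeded, and only a sign-based test reports that correctly.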
When you're through with a COM interface, you need to invoke its Release method so that the object will know when it can delete itself. For the preceding code fragment, this call might look like this:
pGraphBuilder->Release(); // Release the object
If you fail to release COM interfaces, objects will not get deleted and you'll clutter up your memory, suffer a performance hit, and possibly confuse the operating system into thinking that resources are being used after you've finished with them. So make sure you clean up after your COM invocations.
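To see why a forgotten Release keeps an object alive, it helps to model the reference count directly. The class below is a simplified, portable sketch and not the real IUnknown: RefCounted and its destroyedFlag parameter are hypothetical names, and the counter is not thread-safe, whereas real COM objects typically use interlocked operations.

```cpp
// Portable model of COM-style reference counting (not the real IUnknown).
class RefCounted {
public:
    explicit RefCounted(bool* destroyedFlag) : destroyed_(destroyedFlag) {}

    // Adds a reference and returns the new count.
    unsigned long AddRef() { return ++refs_; }

    // Drops one reference; the object deletes itself when none remain.
    unsigned long Release() {
        unsigned long remaining = --refs_;
        if (remaining == 0) {
            *destroyed_ = true;
            delete this;
        }
        return remaining;
    }

private:
    ~RefCounted() = default;  // Only Release may destroy the object.
    unsigned long refs_ = 1;  // The creator holds the first reference.
    bool* destroyed_;         // Lets a caller observe destruction.
};
```

An object created this way starts with a count of one. Every extra AddRef needs a matching Release, and the object is destroyed only when the final Release brings the count to zero, so a single missed call leaves it allocated for the life of the process.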
To acquire this interface, you need to send a query (request) to the object using any of its interfaces that you have already obtained. In this case, we already have its IGraphBuilder interface, so we'll use that interface. If we assume that the code fragment in the previous section has already executed successfully, that call might look like this:
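IMediaControl *pMediaControl = NULL; // Store pointer to interface
hr = pGraphBuilder->QueryInterface(IID_IMediaControl, (void **)&pMediaControl); // hr is the HRESULT from the previous fragment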
The QueryInterface method takes two parameters. The first parameter is the interface ID (a GUID) for the requested interface. In this case, the interface ID references the IMediaControl interface. The second parameter is a pointer to a storage location for the returned interface. Once again, an error code will be returned if the query fails.
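The mechanics of asking one object for another of its interfaces can be modeled in portable C++. This is a sketch under stated assumptions, not the COM implementation: InterfaceId, Builder, Control, and GraphManagerModel are hypothetical names, an enum replaces the GUID, and a bool replaces the HRESULT.

```cpp
// Hypothetical interface IDs; real COM identifies interfaces by GUID.
enum class InterfaceId { Builder, Control, Event };

struct Builder {
    virtual bool RenderFile(const char* path) = 0;
protected:
    ~Builder() = default;
};

struct Control {
    virtual bool Run() = 0;
protected:
    ~Control() = default;
};

// One object exposing two interfaces, loosely like the Filter Graph Manager.
class GraphManagerModel final : public Builder, public Control {
public:
    // On success, stores the requested interface pointer and returns true.
    // For an unsupported ID, stores nullptr and returns false.
    bool QueryInterface(InterfaceId id, void** out) {
        if (out == nullptr) return false;
        *out = nullptr;
        switch (id) {
            case InterfaceId::Builder: *out = static_cast<Builder*>(this); break;
            case InterfaceId::Control: *out = static_cast<Control*>(this); break;
            default: return false;
        }
        ++refs_;  // Each interface handed out counts as a new reference.
        return true;
    }

    unsigned long Release() { return --refs_; }  // Model only: never deletes.
    unsigned long refs() const { return refs_; }
    bool running() const { return running_; }

    bool RenderFile(const char* path) override { return path != nullptr; }
    bool Run() override { running_ = true; return true; }

private:
    unsigned long refs_ = 1;
    bool running_ = false;
};
```

Two details carry over to real COM code. The caller must cast the returned void pointer to the interface type that matches the ID it asked for, and each successful query adds a reference that must later be released.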
All DirectShow filters are COM objects, including those you create for yourself, so when we get into the subject of writing your own filters, we'll cover the internal construction of COM objects in much greater detail.
Beyond Visual Studio .NET, the DirectX 9.0 Software Development Kit (SDK) is an essential element in the creation of DirectShow applications. The DirectX 9.0 SDK contains all the source files, headers, and libraries (along with a lot of helpful documentation) that will need to be linked with your own source code to create a functional DirectShow application.
If you already have Visual Studio .NET installed on the computer you'll be using for DirectShow application development, you might need to check whether the correct DirectX 9.0 SDK directories are in the include paths for the Visual C++ compiler and linker. (You'll know pretty quickly if your environment hasn't been set up correctly because your applications will generate errors during the compile or linking phases of program generation.)
To inspect the settings for your projects, open the Property Pages dialog box for your Visual C++ project. In the C/C++ folder, examine the value of the field labeled Additional Include Directories. The file path for the DirectX 9.0 SDK include files should be the first value in that field.
After you've ensured that the DirectX 9.0 SDK include files are available to the compiler, click on the folder labeled Linker and examine the value of the field Additional Dependencies. Here you should find a file path that points to the DirectX 9.0 SDK object libraries. If you don't, add the file path to the list of other file paths (if any) in the field.
At this point, Visual Studio .NET is ready for your programming projects. To test it, open the project DSRender (on the CD-ROM) and try to build it. If it compiles and executes without errors, everything has been set up correctly. If you have problems, ensure that the DirectX 9.0 SDK has been installed correctly.
Now, with all these important essentials out of the way, let's take a look at DSRender, our first peek at a DirectShow application program. Like many of the other projects presented in this book, it's designed for console-mode operation. This means that many of the Windows API calls that deal with the particulars of the graphical user interface (windows, menus, dialog boxes, and the like) have been left out of the project, leaving only the meat of the DirectShow application. The code samples provided with this book are designed to become the kernels of your own DirectShow applications, so the code is "clean" and uncluttered by the requirements of a fully loaded Windows application.
If all of this has proceeded successfully (and it should), the application next instantiates a Filter Graph Manager object with a call to CoCreateInstance, and obtains in that same call the IGraphBuilder interface on that object, which provides methods that allow you to build a filter graph. Once the IGraphBuilder interface has been obtained (if this fails, this might indicate problems with DirectX or the operating system), two QueryInterface method calls are made to retrieve additional interfaces that are exposed by the Filter Graph Manager. The first of these calls returns an IMediaControl interface, which has methods for changing the execution state of the filter graph, as explained previously. The second of these calls requests the IMediaEvent interface, which provides a way for the filter graph to signal its own state changes to the DirectShow application. In this case, IMediaEvent will be used to track the progress of media playback, and it will pause execution of the application until playback is done. (For operating system geeks: this is possible because the DirectShow filters execute in a different thread from the DirectShow application.)
Now some magic happens. With just a single line of code, the entire filter graph is built. When the IGraphBuilder method RenderFile is invoked (with the name of the media file), the Filter Graph Manager object examines the media file's type and determines the appropriate set of filters (source, transform, and renderer) that need to be added to the filter graph. These filters are added to the filter graph and then connected together. If RenderFile returns without errors, DirectShow found a path from source to renderer. If the call to RenderFile fails, DirectShow lacked the filters to play the media file, or perhaps the file was corrupted.
With the filter graph built, a one-line call to the IMediaControl interface invoking its Run method begins execution of the filter graph. Although the filter graph begins executing, the Run method returns immediately because the data streaming code is running in a separate thread that has been started by the source filter. Media file playback commences. If the media file is a movie, a playback window will open on the display; if it's a sound file, there won't be any visible sign of playback, but sounds should start coming from the computer's speakers. Figure 3-1 shows an AVI file being played.
Figure 3-1 DSRender playing the AVI file Sunset.avi
This application, as written, needs to pause during playback of the media file. If it didn't, the application would terminate just after the filter graph had started playback, and that wouldn't be very useful. This is where the IMediaEvent interface comes into play. Invoking its WaitForCompletion method with a value of INFINITE causes the application to wait until the Filter Graph Manager learns that the media file has completed its playback. In a real-world application, you wouldn't use a value of INFINITE in the call to WaitForCompletion; if something happened to stall or halt the playback of the media file, the application would wait forever. This is fine for a first DirectShow example, but other programming examples in this book will show you how to exploit the IMediaEvent interface more effectively.
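One common alternative to an INFINITE wait is to wait in short slices and give up after a total budget. When given a finite time-out, the real WaitForCompletion method returns E_ABORT if the time-out expires before playback finishes. The sketch below models only the control flow, in portable C++: waitWithBudget, WaitResult, and the waitSlice callback are hypothetical names, and waitSlice stands in for one finite-timeout wait that reports whether playback has completed.

```cpp
#include <functional>

enum class WaitResult { Completed, TimedOut };

// Calls waitSlice(sliceMs) repeatedly until it reports completion or the
// total budget is spent. A non-positive slice is treated as a time-out
// so that the loop can never spin forever.
WaitResult waitWithBudget(const std::function<bool(long)>& waitSlice,
                          long sliceMs, long budgetMs) {
    if (sliceMs <= 0) return WaitResult::TimedOut;
    for (long waited = 0; waited < budgetMs; waited += sliceMs) {
        if (waitSlice(sliceMs)) return WaitResult::Completed;
        // Between slices, an application could check for a cancel request.
    }
    return WaitResult::TimedOut;
}
```

With this shape, a stalled playback costs at most the budget rather than hanging the application, and the gap between slices is a natural place to respond to the user.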
After playback is complete, a call to the Stop method of the IMediaControl interface halts the execution of the filter graph. This stop call is necessary because a filter graph doesn't stop by itself when a media file has been fully rendered.
// Pass it a file name in wszPath, and it will save the filter graph
This function is straightforward, although it uses a few components we haven't yet encountered. Beginning with a call to the Windows function StgCreateDocfile, an output file is opened, creating an IStorage object (in other words, an object that exposes the IStorage interface) that represents the file. (Note that this is an example of a COM object that is not created directly through CoCreateInstance but rather through a helper function.) Next an IStream stream object is created; this stream is used to provide a data path to the output file. The magic in this function happens when the Filter Graph Manager's IPersistStream interface is obtained by a call to the QueryInterface method of IGraphBuilder. The IPersistStream interface contains methods that create persistent stream objects, which can be written to a storage medium such as a file and retrieved later. When the Save method of IPersistStream is invoked (with a parameter that points to the IStream object), the filter graph data structure is written to the stream.
If all of this goes as planned, a call to the Commit method of the IStorage interface writes the data to disk. At this point, a "snapshot" of the filter graph has been written out. This program uses the hard-coded string C:\MyGraph.GRF as the file name, but you can change this name to any system-legal file path and name. After you run DSRender you'll find the file MyGraph.GRF on your hard disk. Double-click it and GraphEdit will launch; you'll see the filter graph created by DSRender. This filter graph will vary, depending on the media type of the file being rendered. Figure 3-2 shows the MyGraph.GRF filter graph.
Figure 3-2 GraphEdit showing the filter graph MyGraph.GRF created by DSRender
DSRender is a very slapdash example of a DirectShow application: no frills, no extra UI details, just media playback. Yet a very broad set of media can be played with this simple application because the DirectShow IGraphBuilder object handles the hard work of selecting and connecting the appropriate filters together to create a functional filter graph. Now we need to move on and learn how to do the heavy lifting for ourselves, building a filter graph in C++ code line by line. Well, mostly.
// DSBuild implements a very simple program to render audio files
The opening lines of the function are essentially the same as those from DSRender. A Filter Graph Manager object is instantiated through a COM call, and subsequent QueryInterface calls return pointers to its IMediaControl and IMediaEvent interfaces. That's everything needed to begin building the filter graph. At this point, we use a new method of IGraphBuilder, AddSourceFilter, which takes a file name as a parameter and returns a pointer to an IBaseFilter interface on the filter that was chosen and instantiated. The IBaseFilter interface is exposed by all DirectShow filters.
Next the audio renderer filter is created using CoCreateInstance, with a class ID value of CLSID_DSoundRender, which returns the IBaseFilter interface for that object. Once that filter has been created successfully, it is added to the filter graph with the ingeniously named IGraphBuilder method AddFilter. The AddFilter method takes two parameters. The first parameter is a pointer to the IBaseFilter interface on the filter to be added, while the second parameter is an application-defined string used to identify the filter. (You can use this string to name the filter whatever you like. This feature is particularly worthwhile when examining a filter graph in GraphEdit.)
Now we have two filters in the filter graph: a source filter pointing to the file and an audio output filter. They need to be connected together, probably through a path of transform filters. The transform filters required to connect source to renderer will vary by media type of the source file. Rather than examining the source file ourselves to determine what intermediate filters are needed (which would be a long and involved process), we'll use the DirectShow Intelligent Connect feature to do the work for us.
To begin, we'll need to obtain IPin interfaces (which, as the name suggests, are exposed by the pins on a filter) for both the output of the source filter and the input of the renderer. We use the local function GetPin (explained in detail in the next section) to obtain these interfaces on the pins we want to connect. Once we have both of these, we can invoke the IGraphBuilder method Connect. (Connect takes as parameters two pins; if successful, the method connects the two pins through some set of intermediate filters.) If the call to Connect fails, DirectShow wasn't able to build a path between source and renderer, possibly because the media type of the source file isn't supported by DirectShow or because the file didn't contain any audio.
As in DSRender, the application uses the IMediaControl interface's Run method to begin execution of the filter graph, and the IMediaEvent method WaitForCompletion pauses execution of the application until the media file has been completely rendered. At this point, the Stop method is called and the filter graph halts its execution. The filter graph is written to a file with a call to SaveGraphFile, the allocated interfaces are released, and the application terminates.
Even when created by hand, a filter graph isn't a difficult object to build or maintain. However, this application would have been significantly more difficult to write without Intelligent Connect, which allowed us to ignore the specifics of the media in the source file.
// This code allows us to find a pin (input or output) on a filter.
The IBaseFilter interface has a member function, EnumPins, which returns an IEnumPins interface. This interface enables you to iterate through a list of all the pins on a filter. Each element in the IEnumPins list contains an IPin object. As the code walks through this list of pins, each pin is queried through an invocation of its IPin::QueryDirection method. If the direction matches the requirements, that IPin interface pointer becomes the function's return value, with one caveat: some filters have multiple input and output pins, and these pins can have different media types, so you can't know that a returned IPin will be useful in every situation. You could call GetPin on a digital video filter, expecting to get an output pin for digital video, only to find that it won't connect to a video renderer because the output pin is for the audio track that accompanies the video. This function doesn't discriminate.
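The caveat is easier to see with the pin search reduced to its essentials. The following portable sketch is a model, not DirectShow code: PinModel, firstPin, and firstPinOfType are hypothetical names, a vector replaces the IEnumPins enumerator, and the media type is a plain string label rather than a real media type structure. The first function mirrors the direction-only search described above; the second shows one way a stricter search could also check the media type.

```cpp
#include <string>
#include <vector>

enum class PinDirection { Input, Output };

// Hypothetical stand-in for a pin: a name, a direction, and a type label.
struct PinModel {
    std::string name;
    PinDirection direction;
    std::string majorType;  // For example "audio" or "video".
};

// Direction-only search: returns the first pin with the requested
// direction, or nullptr if the filter has no such pin.
const PinModel* firstPin(const std::vector<PinModel>& pins, PinDirection dir) {
    for (const PinModel& pin : pins) {
        if (pin.direction == dir) return &pin;
    }
    return nullptr;
}

// Stricter variant: direction and media type must both match.
const PinModel* firstPinOfType(const std::vector<PinModel>& pins,
                               PinDirection dir, const std::string& type) {
    for (const PinModel& pin : pins) {
        if (pin.direction == dir && pin.majorType == type) return &pin;
    }
    return nullptr;
}
```

For a modeled filter whose output pins are listed as audio first and video second, firstPin returns the audio pin when asked for an output, which is exactly the surprise described above. Asking firstPinOfType for a video output skips it and returns the video pin.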