Writing A Complete DSP Plug-in for Windows Media Player 9 in C#
Ianier Munoz
Applies to:Chronotron.com
This article shows how to build a complete DSP (Digital Signal Processing) plug-in for Windows Media Player 9 in C#. The accompanying sample code comprises a Phase Shifter DSP plug-in and a User Interface plug-in for controlling the DSP plug-in parameters in real time. This article assumes that you're familiar with C# and with COM Interop.

Introduction

Windows Media Player 9 is more extensible than its predecessor. In addition to "skins", third-party developers can write code that is hosted by the player and adds extra functionality to it, such as animations synchronized with the music, or audio effects. Code hosted by the player is usually referred to as a "plug-in". Windows Media Player 9 supports several types of plug-ins; in this article we will focus on two of them, DSP plug-ins and User Interface plug-ins. We will develop a DSP plug-in that implements the Phase Shifter audio effect (also known by musicians as "Phaser") and a User Interface plug-in that implements a window containing sliders for controlling the effect parameters in real time. See Figure 1.

Figure 1

If you are not interested in the technical details and just want to use the plug-in, you can download and install the binaries from the wmplugins.com web site.

Figure 2

The DSP Plug-in

DSP plug-ins are nothing more than DirectX Media Objects (DMO). A DMO is a COM object that implements a relatively simple interface named IMediaObject. Although this interface may seem to have many methods, implementing it is easy compared to alternatives such as DirectShow filters. The IMediaObject interface translated into C# is shown below. You'll find the complete definition, including all support structures, in MediaObj.cs.
public interface IMediaObject
{
    void GetStreamCount(out uint pcInputStreams, out uint pcOutputStreams);
    void GetInputStreamInfo(uint dwInputStreamIndex, out DMO_INPUT_STREAM_INFO_FLAGS pdwFlags);
    void GetOutputStreamInfo(uint dwOutputStreamIndex, out DMO_OUTPUT_STREAM_INFO_FLAGS pdwFlags);
    void GetInputType(uint dwInputStreamIndex, uint dwTypeIndex, [Out] DMO_MEDIA_TYPE pmt);
    void GetOutputType(uint dwOutputStreamIndex, uint dwTypeIndex, [Out] DMO_MEDIA_TYPE pmt);
    void SetInputType(uint dwInputStreamIndex, [In] DMO_MEDIA_TYPE pmt, DMO_SET_TYPE_FLAGS dwFlags);
    void SetOutputType(uint dwOutputStreamIndex, [In] DMO_MEDIA_TYPE pmt, DMO_SET_TYPE_FLAGS dwFlags);
    void GetInputCurrentType(uint dwInputStreamIndex, [Out] DMO_MEDIA_TYPE pmt);
    void GetOutputCurrentType(uint dwOutputStreamIndex, [Out] DMO_MEDIA_TYPE pmt);
    void GetInputSizeInfo(uint dwInputStreamIndex, out uint pcbSize, out uint pcbMaxLookahead,
        out uint pcbAlignment);
    void GetOutputSizeInfo(uint dwOutputStreamIndex, out uint pcbSize, out uint pcbAlignment);
    void GetInputMaxLatency(uint dwInputStreamIndex, out long prtMaxLatency);
    void SetInputMaxLatency(uint dwInputStreamIndex, long rtMaxLatency);
    void Flush();
    void Discontinuity(uint dwInputStreamIndex);
    void AllocateStreamingResources();
    void FreeStreamingResources();
    void GetInputStatus(uint dwInputStreamIndex, out DMO_INPUT_STATUS_FLAGS dwFlags);
    void ProcessInput(uint dwInputStreamIndex, IMediaBuffer pBuffer,
        DMO_INPUT_DATA_BUFFER_FLAGS dwFlags, long rtTimestamp, long rtTimelength);
    void ProcessOutput(DMO_PROCESS_OUTPUT_FLAGS dwFlags, uint cOutputBufferCount,
        [MarshalAs(UnmanagedType.LPArray)] DMO_OUTPUT_DATA_BUFFER[] pOutputBuffers,
        out uint pdwStatus);
    void Lock(int bLock);
}
The methods in IMediaObject can be split into two main groups: negotiation of the stream format, and data streaming itself. The next sections dig into the details of each of these tasks.

Media Type Negotiation

Most of the IMediaObject methods are actually about describing what the plug-in can do. This includes the number of streams that it can handle and their exact format. Note that a DMO is not limited to audio streams; a DMO could also handle video data or any other media type. In this article I use the terms format and media type interchangeably. A media type is described by the DMO_MEDIA_TYPE structure, which is declared in MediaObj.cs. The AudioDmoBase class in DmoBase.cs implements most of the plumbing we need to develop an audio effect plug-in. Let's have a look at the methods involved in format negotiation.

GetStreamCount tells the caller how many inputs and outputs the plug-in has. Note that the number of inputs and outputs is not related to the format of the data. An audio plug-in typically has only one input and one output (the audio stream) no matter how many audio channels the stream contains (mono, stereo, etc.).
public void GetStreamCount(out uint pcInputStreams, out uint pcOutputStreams)
{
    pcInputStreams = 1;
    pcOutputStreams = 1;
}
The purpose of GetInputType and GetOutputType is to enumerate the formats the plug-in can accept at input or output respectively. Because a typical audio effect does not change the format of the stream, if the input format has already been set then that format is offered by GetOutputType. Similarly, if the output format has already been set, then that format is enumerated by GetInputType. Our plug-in does not offer any format by itself, so it calls Marshal.ThrowExceptionForHR with the DMO_E_NO_MORE_ITEMS error code when no input or output format has been set. The CLR will return the exception's error code to the caller in the form of an HRESULT.
public void GetInputType(uint dwInputStreamIndex, uint dwTypeIndex, [Out] DMO_MEDIA_TYPE pmt)
{
    if (dwInputStreamIndex != 0)
        Marshal.ThrowExceptionForHR(DMOErrorCodes.DMO_E_INVALIDSTREAMINDEX);
    if (OutputType != null && dwTypeIndex == 0)
        pmt.CopyFrom(OutputType);
    else
        Marshal.ThrowExceptionForHR(DMOErrorCodes.DMO_E_NO_MORE_ITEMS);
}

public void GetOutputType(uint dwOutputStreamIndex, uint dwTypeIndex, [Out] DMO_MEDIA_TYPE pmt)
{
    if (dwOutputStreamIndex != 0)
        Marshal.ThrowExceptionForHR(DMOErrorCodes.DMO_E_INVALIDSTREAMINDEX);
    if (InputType != null && dwTypeIndex == 0)
        pmt.CopyFrom(InputType);
    else
        Marshal.ThrowExceptionForHR(DMOErrorCodes.DMO_E_NO_MORE_ITEMS);
}
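To see how an HRESULT travels through an exception, the following minimal sketch throws with Marshal.ThrowExceptionForHR and reads the code back with Marshal.GetHRForException, which approximates what happens when the exception crosses the COM boundary. HResultDemo and RoundTrip are illustrative names that are not part of the sample code, and the numeric value used for DMO_E_NO_MORE_ITEMS is an assumption taken from the mediaerr.h header of the DirectX SDK.

```csharp
using System;
using System.Runtime.InteropServices;

public static class HResultDemo
{
    // Assumed value of DMO_E_NO_MORE_ITEMS (mediaerr.h in the DirectX SDK).
    public const int DMO_E_NO_MORE_ITEMS = unchecked((int)0x80040206);

    // Throws for a failure HRESULT, then reads the code back from the exception.
    // Returns 0 (S_OK) when hr is a success code, because nothing is thrown.
    public static int RoundTrip(int hr)
    {
        try
        {
            Marshal.ThrowExceptionForHR(hr);
            return 0;
        }
        catch (Exception ex)
        {
            return Marshal.GetHRForException(ex);
        }
    }

    public static void Main()
    {
        int hr = RoundTrip(DMO_E_NO_MORE_ITEMS);
        Console.WriteLine("HRESULT seen by the caller: 0x{0:X8}", hr);
    }
}
```

The value returned by the round trip is expected to match the value that was thrown, which is why the DMO methods can signal COM error codes without declaring an HRESULT return type.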
The host application calls SetInputType and SetOutputType to suggest an input or output media type for the plug-in to use. The plug-in can reject a media type by returning DMO_E_TYPE_NOT_ACCEPTED. The AudioDmoBase class has a method named CanAcceptMediaType that decides whether a media type can be accepted. In addition, if the output format has already been set then we only accept the same format at the input, and vice versa. An accepted media type is stored only when the host passes no flags; if the DMO_SET_TYPEF_CLEAR flag is set the current type is reset, and with any other flag the type is validated but not stored.
public void SetInputType(uint dwInputStreamIndex, [In] DMO_MEDIA_TYPE pmt, DMO_SET_TYPE_FLAGS dwFlags)
{
    if (dwInputStreamIndex != 0)
        Marshal.ThrowExceptionForHR(DMOErrorCodes.DMO_E_INVALIDSTREAMINDEX);
    if ((dwFlags & DMO_SET_TYPE_FLAGS.DMO_SET_TYPEF_CLEAR) != 0)
        InputType = null;
    else
    {
        bool accepted = OutputType != null ? OutputType.Equals(pmt) : CanAcceptMediaType(pmt);
        if (accepted)
        {
            if (dwFlags == 0)
                InputType = pmt;
        }
        else
            Marshal.ThrowExceptionForHR(DMOErrorCodes.DMO_E_TYPE_NOT_ACCEPTED);
    }
}

public void SetOutputType(uint dwOutputStreamIndex, [In] DMO_MEDIA_TYPE pmt, DMO_SET_TYPE_FLAGS dwFlags)
{
    if (dwOutputStreamIndex != 0)
        Marshal.ThrowExceptionForHR(DMOErrorCodes.DMO_E_INVALIDSTREAMINDEX);
    if ((dwFlags & DMO_SET_TYPE_FLAGS.DMO_SET_TYPEF_CLEAR) != 0)
        OutputType = null;
    else
    {
        bool accepted = InputType != null ? InputType.Equals(pmt) : CanAcceptMediaType(pmt);
        if (accepted)
        {
            if (dwFlags == 0)
                OutputType = pmt;
        }
        else
            Marshal.ThrowExceptionForHR(DMOErrorCodes.DMO_E_TYPE_NOT_ACCEPTED);
    }
}
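The body of CanAcceptMediaType is not shown in this article. As a rough sketch only, a check of this kind might restrict the plug-in to formats that the rest of the code can handle, namely mono or stereo PCM at 8 or 16 bits. The PcmFormat struct and the FormatCheck class below are hypothetical stand-ins invented for this example; the real method works on DMO_MEDIA_TYPE and may apply different rules.

```csharp
using System;

// Hypothetical stand-in for the wave format fields carried by a media type.
public struct PcmFormat
{
    public ushort FormatTag;     // 1 = WAVE_FORMAT_PCM
    public ushort Channels;
    public ushort BitsPerSample;
    public ushort BlockAlign;    // bytes per sample frame (all channels)
}

public static class FormatCheck
{
    private const ushort WAVE_FORMAT_PCM = 1;

    // Accepts mono or stereo PCM at 8 or 16 bits with a consistent block alignment.
    public static bool CanAccept(PcmFormat f)
    {
        if (f.FormatTag != WAVE_FORMAT_PCM) return false;
        if (f.Channels != 1 && f.Channels != 2) return false;
        if (f.BitsPerSample != 8 && f.BitsPerSample != 16) return false;
        return f.BlockAlign == f.Channels * f.BitsPerSample / 8;
    }

    public static void Main()
    {
        PcmFormat cd = new PcmFormat { FormatTag = 1, Channels = 2, BitsPerSample = 16, BlockAlign = 4 };
        PcmFormat hd = new PcmFormat { FormatTag = 1, Channels = 2, BitsPerSample = 24, BlockAlign = 6 };
        Console.WriteLine(CanAccept(cd)); // True
        Console.WriteLine(CanAccept(hd)); // False
    }
}
```

Rejecting a format during negotiation is generally preferable to accepting it and passing the audio through unprocessed, because the host then knows not to route that stream through the effect.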
GetInputCurrentType and GetOutputCurrentType just return the current input and output media types respectively, if they have been set.
public void GetInputCurrentType(uint dwInputStreamIndex, [Out] DMO_MEDIA_TYPE pmt)
{
    if (dwInputStreamIndex != 0)
        Marshal.ThrowExceptionForHR(DMOErrorCodes.DMO_E_INVALIDSTREAMINDEX);
    if (InputType == null)
        Marshal.ThrowExceptionForHR(DMOErrorCodes.DMO_E_TYPE_NOT_SET);
    pmt.CopyFrom(InputType);
}

public void GetOutputCurrentType(uint dwOutputStreamIndex, [Out] DMO_MEDIA_TYPE pmt)
{
    if (dwOutputStreamIndex != 0)
        Marshal.ThrowExceptionForHR(DMOErrorCodes.DMO_E_INVALIDSTREAMINDEX);
    if (OutputType == null)
        Marshal.ThrowExceptionForHR(DMOErrorCodes.DMO_E_TYPE_NOT_SET);
    pmt.CopyFrom(OutputType);
}
GetInputStreamInfo, GetOutputStreamInfo, GetInputSizeInfo and GetOutputSizeInfo return more information about the streams, such as the memory requirements for the buffers holding the stream data (e.g. alignment and minimum size). See Writing a DMO in the DirectX SDK documentation for detailed information on these methods.

GetInputMaxLatency and SetInputMaxLatency respectively get and set the maximum possible difference between a time stamp on the input stream and the corresponding time stamp on the output stream. We assume that in most plug-ins the nth output sample corresponds to the nth input sample, so we return zero in GetInputMaxLatency and do nothing in SetInputMaxLatency.

Data Streaming

The methods involved in data streaming are Flush, Discontinuity, AllocateStreamingResources, FreeStreamingResources, GetInputStatus, ProcessInput and ProcessOutput. The streaming process can be summarized as follows: the host calls GetInputStatus to determine whether the DMO can accept data and, if so, calls ProcessInput, passing a buffer with data to process. The host then keeps calling ProcessOutput to obtain the processed data until the DMO signals that it has no more data to process. The whole procedure is repeated until the host has no more data to process.

The code for GetInputStatus, ProcessInput and ProcessOutput is shown below. Note that stream samples are handled through the IMediaBuffer interface, which has methods to expose the actual buffer memory pointer as well as its size.
private IMediaBuffer m_Buffer;
private int m_BufferPos;
...
private IMediaBuffer CurrentBuffer
{
    get { lock(this) return m_Buffer; }
    set
    {
        lock(this)
        {
            if (m_Buffer != null)
                Marshal.ReleaseComObject(m_Buffer);
            m_Buffer = value;
            m_BufferPos = 0;
        }
    }
}
...
public void GetInputStatus(uint dwInputStreamIndex, out DMO_INPUT_STATUS_FLAGS dwFlags)
{
    if (dwInputStreamIndex != 0)
        Marshal.ThrowExceptionForHR(DMOErrorCodes.DMO_E_INVALIDSTREAMINDEX);
    lock(this)
    {
        dwFlags = CurrentBuffer == null ?
            DMO_INPUT_STATUS_FLAGS.DMO_INPUT_STATUSF_ACCEPT_DATA : 0;
    }
}

public void ProcessInput(uint dwInputStreamIndex, IMediaBuffer pBuffer,
    DMO_INPUT_DATA_BUFFER_FLAGS dwFlags, long rtTimestamp, long rtTimelength)
{
    if (dwInputStreamIndex != 0)
        Marshal.ThrowExceptionForHR(DMOErrorCodes.DMO_E_INVALIDSTREAMINDEX);
    lock(this)
    {
        CurrentBuffer = pBuffer;
    }
}

public void ProcessOutput(DMO_PROCESS_OUTPUT_FLAGS dwFlags, uint cOutputBufferCount,
    DMO_OUTPUT_DATA_BUFFER[] pOutputBuffers, out uint pdwStatus)
{
    lock(this)
    {
        pdwStatus = 0; // reserved, should be zero
        if (cOutputBufferCount != 1)
            Marshal.ThrowExceptionForHR(DMOErrorCodes.DMO_E_INVALIDSTREAMINDEX);
        IntPtr inbuf;
        int inlen;
        CurrentBuffer.GetBufferAndLength(out inbuf, out inlen);
        IntPtr outbuf;
        int outlen;
        pOutputBuffers[0].pBuffer.GetBufferAndLength(out outbuf, out outlen);
        outlen = pOutputBuffers[0].pBuffer.GetMaxLength();
        int tofeed = Math.Min(inlen - m_BufferPos, outlen);
        // process data
        Process(inbuf, m_BufferPos, ref tofeed, outbuf, 0, ref outlen);
        pOutputBuffers[0].pBuffer.SetLength(outlen);
        m_BufferPos += tofeed;
        if (m_BufferPos >= inlen)
            CurrentBuffer = null;
    }
}
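The interplay between GetInputStatus, ProcessInput and ProcessOutput can be illustrated without any COM plumbing. The following sketch models the same bookkeeping with plain byte arrays: a pending input buffer, a read position, and a host loop that drains the input through a smaller output buffer. MiniDmo and HostLoop are hypothetical names made up for this illustration, and the "effect" is a plain copy.

```csharp
using System;
using System.Collections.Generic;

// Simplified model of the DMO streaming state: one pending input buffer and a read position.
public class MiniDmo
{
    private byte[] m_Buffer;
    private int m_BufferPos;

    // Mirrors GetInputStatus: data is accepted only when no buffer is pending.
    public bool AcceptsData { get { return m_Buffer == null; } }

    // Mirrors ProcessInput: keep a reference to the host's buffer.
    public void ProcessInput(byte[] input)
    {
        if (!AcceptsData) throw new InvalidOperationException("input buffer still pending");
        m_Buffer = input;
        m_BufferPos = 0;
    }

    // Mirrors ProcessOutput: copy as much as fits, return the bytes written,
    // and release the input once it has been fully consumed.
    public int ProcessOutput(byte[] output)
    {
        if (m_Buffer == null) return 0;
        int tofeed = Math.Min(m_Buffer.Length - m_BufferPos, output.Length);
        Array.Copy(m_Buffer, m_BufferPos, output, 0, tofeed); // a real effect would transform here
        m_BufferPos += tofeed;
        if (m_BufferPos >= m_Buffer.Length) m_Buffer = null;
        return tofeed;
    }
}

public static class HostLoop
{
    // Pushes one input buffer through the DMO using output buffers of the given size.
    public static List<int> Run(int inputSize, int outputSize)
    {
        MiniDmo dmo = new MiniDmo();
        List<int> chunks = new List<int>();
        byte[] output = new byte[outputSize];
        dmo.ProcessInput(new byte[inputSize]);
        while (!dmo.AcceptsData)
            chunks.Add(dmo.ProcessOutput(output));
        return chunks;
    }

    public static void Main()
    {
        // 10 input bytes drained through a 4-byte output buffer.
        Console.WriteLine(string.Join(", ", Run(10, 4))); // 4, 4, 2
    }
}
```

Unlike the listing above, this model returns zero bytes when ProcessOutput is called with no pending input and refuses a second ProcessInput while a buffer is pending; a production DMO would report those conditions to the host through the corresponding DMO status and error codes.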
We use the m_Buffer member variable to hold the buffer that we are currently processing. If we don't have a buffer, either because streaming has just started or because we have finished with the previous buffer, we return DMO_INPUT_STATUSF_ACCEPT_DATA through the dwFlags output parameter of GetInputStatus to signal to the host application that it can call ProcessInput with a new buffer. ProcessOutput uses the m_BufferPos member variable to keep track of how much of the input buffer has already been processed. When all data has been consumed, m_Buffer is cleared and the DMO can accept a new input buffer.

Audio samples in Windows can have different bit depths (e.g. 8 bit, 16 bit, 24 bit). The Process method, which ProcessOutput calls, converts 8-bit and 16-bit samples to floating point, calls a virtual method to process the converted samples, and finally converts the samples back to their native format before returning them to the caller. Samples in any other format are copied through unchanged. This ensures that inheritors only have to deal with floating-point numbers by overriding ProcessFloat.
protected virtual void Process(IntPtr src, int srcofs, ref int srclen, IntPtr dst, int dstofs, ref int dstlen)
{
    WaveFormat fmt = AudioFormat;
    int len = Math.Min(dstlen, srclen);                  // bytes to process
    int samples = fmt.nChannels * len / fmt.nBlockAlign; // individual samples, all channels
    if (m_Temp == null || m_Temp.Length < samples)
        m_Temp = new float[samples];
    // convert to float
    switch(fmt.wBitsPerSample)
    {
        case 8:
            for (int i = 0; i < samples; i++)
                m_Temp[i] = ByteToFloat(Marshal.ReadByte(src, srcofs + i));
            ProcessFloat(m_Temp, samples / fmt.nChannels, fmt.nChannels);
            for (int i = 0; i < samples; i++)
                Marshal.WriteByte(dst, dstofs + i, FloatToByte(m_Temp[i]));
            break;
        case 16:
            // srcofs and dstofs are byte offsets, so only the sample index is scaled by 2
            for (int i = 0; i < samples; i++)
                m_Temp[i] = ShortToFloat(Marshal.ReadInt16(src, srcofs + i * 2));
            ProcessFloat(m_Temp, samples / fmt.nChannels, fmt.nChannels);
            for (int i = 0; i < samples; i++)
                Marshal.WriteInt16(dst, dstofs + i * 2, FloatToShort(m_Temp[i]));
            break;
        default:
            // we can't process this format, so just copy
            for (int i = 0; i < len; i++)
                Marshal.WriteByte(dst, dstofs + i, Marshal.ReadByte(src, srcofs + i));
            break;
    }
    srclen = len;
    dstlen = len;
}
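The ByteToFloat, FloatToByte, ShortToFloat and FloatToShort helpers are not listed in this article. A minimal sketch of what such conversions could look like follows, assuming the usual PCM conventions (8-bit samples are unsigned with 128 as silence, 16-bit samples are signed) and a floating-point range of roughly -1 to 1. The SampleConvert class name and the scaling and clipping choices are assumptions for this example, not the code from DmoBase.cs.

```csharp
using System;

public static class SampleConvert
{
    // 8-bit PCM is unsigned, with 128 representing silence.
    public static float ByteToFloat(byte b) { return (b - 128) / 128f; }

    public static byte FloatToByte(float f)
    {
        int v = (int)Math.Round(f * 128f) + 128;
        return (byte)Math.Max(0, Math.Min(255, v)); // clamp to avoid wrap-around
    }

    // 16-bit PCM is signed.
    public static float ShortToFloat(short s) { return s / 32768f; }

    public static short FloatToShort(float f)
    {
        int v = (int)Math.Round(f * 32768f);
        return (short)Math.Max(short.MinValue, Math.Min(short.MaxValue, v));
    }

    public static void Main()
    {
        Console.WriteLine(ByteToFloat(128));                   // 0
        Console.WriteLine(FloatToShort(ShortToFloat(-12345))); // -12345
        Console.WriteLine(FloatToShort(2.0f));                 // 32767 (clipped)
    }
}
```

Clamping on the way back matters for an effect with feedback, such as the Phaser, because the processed signal can exceed the input range and would otherwise wrap around into loud distortion.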
The implementation of Flush, Discontinuity, AllocateStreamingResources and FreeStreamingResources is shown below. Flush should release any IMediaBuffer references and clear the internal state of the effect. A discontinuity is a break in the input, such as a gap in the data. When this happens, the host application just calls ProcessOutput until there's no more data to process. In many audio effects there's no special action to take when a discontinuity is detected, so we just release the current buffer to make sure that processing can go on.
public virtual void Flush()
{
    CurrentBuffer = null;
}

public void Discontinuity(uint dwInputStreamIndex)
{
    if (dwInputStreamIndex != 0)
        Marshal.ThrowExceptionForHR(DMOErrorCodes.DMO_E_INVALIDSTREAMINDEX);
    CurrentBuffer = null;
}

public virtual void AllocateStreamingResources()
{
}

public virtual void FreeStreamingResources()
{
    CurrentBuffer = null;
}
Deriving from AudioDmoBase

The PhaserDSPPlugin class in MyPlugin.cs inherits from AudioDmoBase to implement the actual Phaser effect. The important methods to be overridden are ProcessFloat, AllocateStreamingResources, FreeStreamingResources and Flush.
public override void AllocateStreamingResources()
{
    Lock(1);
    try
    {
        base.AllocateStreamingResources();
        int rate = AudioFormat.nSamplesPerSec;
        m_LeftFilter = new PhaseShiftFilter(rate);
        m_RightFilter = new PhaseShiftFilter(rate);
        UpdateParameters();
        PhaserParameters.Current.Changed += new EventHandler(ParametersChanged);
    }
    finally
    {
        Lock(0);
    }
}

public override void FreeStreamingResources()
{
    Lock(1);
    try
    {
        base.FreeStreamingResources();
        PhaserParameters.Current.Changed -= new EventHandler(ParametersChanged);
        m_LeftFilter = null;
        m_RightFilter = null;
    }
    finally
    {
        Lock(0);
    }
}

public override void Flush()
{
    base.Flush();
}

protected override void ProcessFloat(float[] data, int samples, int channels)
{
    switch(channels)
    {
        case 1:
            for (int i = 0; i < samples; i++)
                data[i] = m_LeftFilter.FilterSample(data[i]);
            break;
        case 2:
            // stereo data is interleaved: even indices are left, odd indices are right
            for (int i = 0; i < samples * 2; i += 2)
                data[i] = m_LeftFilter.FilterSample(data[i]);
            for (int i = 1; i < samples * 2; i += 2)
                data[i] = m_RightFilter.FilterSample(data[i]);
            break;
    }
}
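The listing above subscribes to PhaserParameters.Current.Changed, which implies a shared parameter object with change notification. The following is a minimal sketch of that pattern under stated assumptions: SharedParameters is a hypothetical class written for this example, it exposes a single Feedback property with an arbitrary default, and it is not the PhaserParameters class from the sample code.

```csharp
using System;

// Hypothetical sketch of a parameter holder shared by a DSP plug-in and its UI.
public sealed class SharedParameters
{
    // Single instance, created when the type is first used. Both plug-ins see the same
    // object as long as they are loaded into the same process and application domain.
    private static readonly SharedParameters s_Current = new SharedParameters();
    public static SharedParameters Current { get { return s_Current; } }

    private readonly object m_Sync = new object();
    private float m_Feedback = 0.5f; // arbitrary default for illustration

    public event EventHandler Changed;

    private SharedParameters() { }

    public float Feedback
    {
        get { lock (m_Sync) return m_Feedback; }
        set
        {
            lock (m_Sync) m_Feedback = value;
            // Raise the event outside the lock so handlers cannot deadlock on m_Sync.
            EventHandler handler = Changed;
            if (handler != null) handler(this, EventArgs.Empty);
        }
    }
}

public static class SharedParametersDemo
{
    public static int CountNotifications()
    {
        int count = 0;
        EventHandler onChanged = delegate { count++; };
        SharedParameters.Current.Changed += onChanged; // what the DSP plug-in would do
        SharedParameters.Current.Feedback = 0.7f;      // what a slider would do
        SharedParameters.Current.Feedback = 0.2f;
        SharedParameters.Current.Changed -= onChanged; // as in FreeStreamingResources
        SharedParameters.Current.Feedback = 0.9f;      // no longer observed
        return count;
    }

    public static void Main()
    {
        Console.WriteLine(CountNotifications()); // 2
    }
}
```

Because the UI thread sets the values while the streaming thread reads them, some form of synchronization is advisable; here a private lock object guards the field, and the DSP side can copy the values into its filters when Changed fires.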
The actual processing code is in the PhaseShiftFilter class, which is basically my own implementation of the Phase Shifter algorithm explained by Scott Lehman in his article Phase Shifting. I took this class unchanged from my article Programming Audio Effects in C#. The details of how the algorithm works are beyond the scope of this article.

The PhaserParameters class holds all the parameters of the Phaser algorithm, namely Dry, Wet, Feedback, SweepRate, SweepRange and Frequency. Dry is the amount of unprocessed input signal present at the output, and Wet is the amount of processed signal present at the output. Feedback is the fraction of the output signal that is re-injected at the input. SweepRate, SweepRange and Frequency are Phaser-specific parameters that control the "depth" of the effect. The best way to get familiar with these is to launch Windows Media Player and tweak the various settings while playing a music file.

The Windows Media Player 9 SDK has no standard mechanism for bridging a DSP plug-in and its user interface. Jim Travis describes some workarounds in his excellent article Making Windows Media Player Plug-ins Work Together. My choice was to make the PhaserParameters class a singleton so that it can be shared by both the DSP plug-in and the corresponding User Interface plug-in.

The User Interface Plug-in

Windows Media Player User Interface plug-ins are just COM objects that implement the IWMPPluginUI interface (see the code below). For in-depth information about User Interface plug-ins, check About User Interface Plug-ins in the Windows Media Player SDK documentation.
public interface IWMPPluginUI
{
    void SetCore([MarshalAs(UnmanagedType.IDispatch)] /*IWMPCore*/ object pCore);
    IntPtr Create(IntPtr hwndParent);
    void Destroy();
    void DisplayPropertyPage(IntPtr hwndParent);
    object GetProperty(string pwszName);
    void SetProperty(string pwszName, [In] ref object pvarProperty);
    void TranslateAccelerator(IntPtr lpmsg);
}
Windows Media Player calls the SetCore method, passing an object that implements the IWMPCore interface. This mechanism allows the plug-in to control the player functionality, such as playing files, getting information about the currently playing media and so on. As we don't use this feature in the Phaser, we just treat it as a generic Object. If you need to use IWMPCore functionality in your own plug-in, you can add the "Windows Media Player 1.0" type library (exposed by wmp.dll) to your project, which declares this interface.

Create and Destroy are called by Windows Media Player to ask the plug-in to create and destroy its own window respectively. Note that the plug-in window must be a child window of the window handle passed to Create. In this example we don't do anything special in any of the other methods.

The PhaserUIPlugin class implements IWMPPluginUI as shown below. You'll find the code in MyPlugin.cs.
public class PhaserUIPlugin : IWMPPluginUI
{
    [DllImport("user32.dll")]
    private static extern IntPtr SetParent(IntPtr hWndChild, IntPtr hWndNewParent);

    public PhaserUIPlugin()
    {
    }

    private /*IWMPCore*/ object m_Core;
    private UICtrl m_Control;

    // IWMPPluginUI
    public void SetCore([MarshalAs(UnmanagedType.IDispatch)] /*IWMPCore*/ object pCore)
    {
        m_Core = pCore;
    }

    public IntPtr Create(IntPtr hwndParent)
    {
        m_Control = new UICtrl(PhaserParameters.Current);
        IntPtr h = m_Control.Handle;
        SetParent(h, hwndParent);
        return h;
    }

    public void Destroy()
    {
        if (m_Control != null)
            m_Control.Dispose();
        m_Control = null;
    }

    public void DisplayPropertyPage(IntPtr hwndParent)
    {
        MessageBox.Show("This plug-in has no property page");
    }

    public object GetProperty(string pwszName)
    {
        return null;
    }

    public void SetProperty(string pwszName, [In] ref object pvarProperty)
    {
    }

    public void TranslateAccelerator(IntPtr lpmsg)
    {
    }
}
The actual plug-in window is a Windows Forms user control defined in UICtrl.cs, which has a set of sliders that manipulate the effect parameters by calling into the PhaserParameters singleton. In PhaserUIPlugin we create a new UICtrl instance and then call the Windows API SetParent function through P/Invoke to make our control window a child of the window passed in the call to Create. The rest of the implementation is trivial.

Hooking into Windows Media Player

Now that we have implemented all the required interfaces, we need to register our plug-ins with Windows Media Player. The first thing to do is to make the PhaserDSPPlugin and PhaserUIPlugin classes available to COM. The next step is to make Windows Media Player aware of our plug-ins. To register the DSP plug-in we create a WMPMediaPluginRegistrar object and call its WMPRegisterPlayerPlugin method. Similarly, to unregister the plug-in we call WMPUnRegisterPlayerPlugin. User Interface plug-ins are registered by adding some specific entries to the Windows registry.

The MyPluginInstaller installer class in MyPluginInstaller.cs has the necessary code to register both the DSP plug-in and the User Interface plug-in with Windows Media Player. MyPluginInstaller inherits from System.Configuration.Install.Installer, which is the base class for all custom installers in the .NET Framework. Installers are components that help install applications on a computer. See the .NET Framework documentation for more information.

For COM objects that implement plug-ins, Windows Media Player also requires that the HKCR\CLSID\{Object_CLSID}\InProcServer32 registry entry contain the full path of the DLL that implements the COM object. For managed classes exposed to COM, the InProcServer32 entry is set to mscoree.dll, which implements the COM Callable Wrapper (CCW) for managed classes. To comply with this undocumented requirement, MyPluginInstaller also contains the code to add the full path to InProcServer32 if necessary.
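How the sample marks its classes for COM is not shown in this article. As a general sketch, a managed class is usually made available to COM by giving it a fixed CLSID and making it COM-visible with attributes from System.Runtime.InteropServices, as below. ExamplePlugin is a hypothetical class and its GUID is a placeholder invented for this example; a real plug-in would use its own GUID and implement the plug-in interface.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical class; the GUID below is a placeholder made up for this sketch.
[ComVisible(true)]
[Guid("11111111-2222-3333-4444-555555555555")]
[ClassInterface(ClassInterfaceType.None)] // expose only explicitly implemented interfaces
public class ExamplePlugin
{
    // COM needs a public parameterless constructor to create the object.
    public ExamplePlugin() { }
}

public static class ComVisibilityDemo
{
    public static void Main()
    {
        // The CLSID used for COM registration is the GUID declared on the class.
        Console.WriteLine(typeof(ExamplePlugin).GUID);
    }
}
```

The attributes only describe the class. The registry entries themselves still have to be written at install time, for example by the .NET Framework regasm.exe tool or from code inside a custom installer.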
We run installutil.exe as a post-build action to execute the code in MyPluginInstaller, so that you can easily debug the plug-in in the Visual Studio .NET 2003 IDE using Windows Media Player as the host application. A setup project for the plug-in is also part of the solution. The generated Windows Installer (MSI) file is available for download from wmplugins.com.

Conclusion

Once again I was satisfied with the overall performance of the .NET Framework and the Common Language Runtime. Some parts of the Phaser plug-in code could be considerably optimized for speed; however, the performance hit incurred by COM Interop cannot be avoided, because Windows Media Player is a native Windows application.

The uniqueness of a good plug-in often comes down to its processing algorithm. In addition to performance, one important thing to take into account when developing a plug-in in managed code is that it can be decompiled relatively easily. Obfuscation helps, but since the plug-in interfaces are well known, your intellectual property is still likely to be exposed. That's why I would suggest developing the effect algorithm in native code and implementing the user interface in managed code, so that you can still benefit from all the power of Windows Forms.

Any plug-in developed using managed code requires the .NET Framework Redistributable to be installed on the end user's computer. Although the .NET Framework is installed out of the box in the most recent versions of the Windows operating system, users with older versions will need to download and install it, which is a package of about 20 MB.