For the Telephony API, Press 1;
For Unimodem, Press 2;
or Stay on the Line
This article assumes you're familiar with Win32.
Code for this article: simpleTAPI.exe (12KB)
Hiroo Umeno supports TAPI developers at Microsoft Technical Support. He lives every day building a mystery, fumbling towards sweet surrender.
As a COMM and TAPI specialist at Microsoft, I often talk to people porting older communications applications to Win32®. They ask questions such as "How can I get the modem's initialization string?" or "How can I tell on which port the modem is installed?" My suggestion that TAPI may be their answer is usually met with responses like these: "TAPI? That's too hard," or "But all I want to do is to make a simple phone call."
I won't claim that TAPI is simple. If I could do that with a straight face, I'd consider a career in politics. TAPI is a complex API set that covers broad ranges of devices and functionality. For those used to doing things directly using low-level APIs, TAPI requires a certain conceptual shift. In some ways, it's similar to the change you went through when you switched from programming in MS-DOS® to programming in Windows®. Under MS-DOS, you coded for everything an application did, from talking to ports to updating the screen. Sure there were BIOS and INT calls, but to do exactly what you wanted, you had to talk to the hardware directly.
Windows changed all that by inserting an abstraction layer between you and the hardware. But when it came to telephony, you were still in the dark ages, talking directly to hardware and proprietary libraries. TAPI brings the Win32 model to telephony by offering a standardized generic API set for telephony call control.
This article outlines the Telephony API architecture and basic programming model. It's an introduction and overview of the Telephony API, and is not meant to be a complete discussion of its capabilities.
I'll discuss TAPI 1.x and 2.0. TAPI 3.0, which is scheduled to be introduced with Windows NT® 5.0, has a distinctly different flavor. Its use of the new COM-based programming model warrants its own article.
TAPI is an abstraction layer, and by itself does not provide any useful functionality. Well, this is not quite true, but for the moment let's leave it at that. In many ways, TAPI parallels other Win32 models. As an abstraction layer, its function is to provide a consistent, device-independent programming model to applications that use it. In Win32, an application does not need to know whether it's drawing a line on a 16-color VGA display or an accelerated super VGA display with millions of colors. All it needs to know is where it's drawing the line. TAPI aims to do for telephony what GDI does for graphics.
TAPI is a generalized API that covers a wide range of telephony equipment with diverse capabilities. But TAPI is not a "Modem API," and can't perform device-specific tasks such as modifying initialization strings or setting up auto-answer.
TAPI uses components provided by the operating system or third parties to provide call control functionality. TAPI programs often use media streams such as voice recordings, but TAPI itself does not provide media control. For media control (for example, the recording and playback of voice messages), you must use the APIs and methods appropriate for the media.
The idea of using computers to communicate is not new. Using a telephone line as a communication medium between computers isn't exactly a space-age concept either. Fast dedicated computers have controlled telephone switches for quite some time, and modems are now one of the most common PC peripherals. Despite this, CTI (Computer Telephony Integration) is not among the technical acronyms known to the average PC user.
Historically, CTI has been associated with proprietary enterprise solutions such as IVRs and PBX switches. This equipment is expensive, and is commonly supplied by the vendor with proprietary software to drive it. Relatively inexpensive telephony peripherals have been available for some time, but the software to drive them has also been hardware-specific. Modems have presented a somewhat easier programming problem since the Hayes AT commands became standard, but the devil, as they say, is in the details. The content of AT commands and their effects are vendor-specific; different modems may require different strings to achieve the same effect. So communications software is commonly shipped with databases of supported modems and their command strings.
Microsoft designed TAPI to offer an alternative to hardware-dependent CTI solutions. TAPI's goal is to present both software and hardware vendors with a consistent, device-independent programming model. With TAPI, applications need not know anything about the underlying hardware.
TAPI achieves this abstraction by inserting itself between custom applications and the hardware (see Figure 1). If you were to substitute the word "TAPI" in the center box with "Windows," and replace the "TSP" labels along the bottom with labels saying "device driver," it would be a diagram of the Windows architecture. TSPs (Telephony Service Providers) are the components that provide hardware- or service-specific functionality. When an application requests that a telephony device perform an action, TAPI figures out which TSP services that device and makes a call to it. It is then up to the TSP to get the job done.
Figure 1 Generic TAPI Architecture
TAPI's general architecture is the same on all supported operating systems, except Windows CE 1.0, which doesn't support the TSP layer. TAPI support is currently most complete on the Windows NT 4.0 and Windows 95 platforms. The TAPI support in Windows CE is very limited; only outbound data modem calls are supported. For this article, I will concentrate on Windows NT and Windows 95.
Windows 95 comes with TAPI version 1.x (see Figure 2). Like many other Windows 95 components, its internal implementation is 16-bit. 32-bit applications link to TAPI32.DLL, which is a simple thunk layer into the 16-bit TAPI.DLL. TAPI.DLL is loaded into a 16-bit VM with a helper application called TAPIEXE.EXE. Communication between the 32- and 16-bit DLLs is accomplished through a Flat Thunk mechanism. TSPs on Windows 95 are 16-bit DLLs, and are loaded together with TAPI.DLL. Individual TSPs communicate as needed with the hardware device drivers or any kernel-mode components they may include.
Figure 2 Windows 95 TAPI 1.x Architecture
Windows 95 uses a system mutex to protect its 16-bit system components from reentrancy problems that could occur due to preemptive multitasking. When the system is running 16-bit code, another 16-bit call cannot be made. This is normally not a problem since 16-bit apps are cooperatively multitasked, but it becomes a problem when 32-bit applications are thunking to 16-bit code. When multiple applications make 32-bit TAPI calls that call down to 16-bit code, the applications block one another until they are out of the 16-bit code.
Therefore, when writing a TSP for Windows 95, you must be careful not to release the system mutex (by getting out of 16-bit code) until it is safe to receive another TSP call. Since the 16-bit TAPI.DLL is not reentrant, you must be sure not to cause inadvertent reentrancy by 32-bit preemption. TAPI service providers on Windows 95 are 16-bit DLLs. All the strings that are passed to or returned from TAPI 1.x are in ANSI code. There is no Unicode support.
Because of the underlying 16-bit components, misbehaving TAPI 1.x applications can cause serious problems. If an application using TAPI 1.x dies an untimely death, as may happen when encountering faults, stopping a debug session, or making a call to TerminateProcess, the results can be gruesome. Resources may be left open but unavailable, and TAPI 1.x may be left in an undetermined useless state that necessitates a system reboot.
The 16-bit components present a challenge in debugging, as well. Typical 32-bit debuggers like Developer Studio® cannot intercept the debug traces emitted by 16-bit components.
Windows NT 4.0
Windows NT 4.0 introduced TAPI 2.0, which is 32-bit from the ground up and offers significantly improved stability and reliability (see Figure 3). For application programs, the changes are transparent. TAPI 2.0 supports TAPI 1.3 and TAPI 1.4 applications, as well as those specifically written for TAPI 2.0.
Figure 3 Windows NT TAPI 2.0 Architecture
On the other hand, 16-bit TSPs will need a complete rewrite. All TSPs must be 32-bit DLLs that use Unicode. This doesn't mean that every string in the TSP must be represented in Unicode. Internally, a TSP can choose a format that fits its purpose. It is important that all the strings passed to TAPI 2.0 are in Unicode and all the strings received from TAPI 2.0 are in Unicode.
TAPI 2.0 provides two versions, ANSI and Unicode, of the functions that handle strings for application programs. At compile time, standard TAPI calls are resolved into the type-specific calls depending on whether UNICODE is defined.
In TAPI 2.0, the TAPIEXE.EXE helper application is replaced by TAPISRV.EXE, which runs as a service under the local system context. Because of this architecture, a TSP cannot perform user interface functions like displaying a dialog box on the user's desktop. Version 2.0 introduced a secondary user interface DLL for this purpose. This UI DLL contains TSP-specific UI elements and is loaded into the application's context. This allows the dialog box to be displayed, provided that the application is running in a context that has access to the interactive desktop. TAPI 2.0 uses LRPC to communicate between processes running under separate security contexts.
When designing TSPs to run under Windows NT, you must understand the Windows NT security model. Special circumstances arise when a TSP must access resources that are specific to account contexts. Examples of these are network resources, desktops, secured resources, and the registry. Articles in both the Platform SDK and MSDN Knowledge Base discuss specific issues relating to Windows NT services and security contexts.
TAPI 2.1 is a post-release update for Windows 95 and Windows NT that adds a client-server model to TAPI 2.0 (see Figure 4). It mainly differs from TAPI 2.0 in its remote capabilities, which allow TAPI applications on client PCs to transparently access a TSP on a server as if the TSP were located locally. For the most part, no special rewriting is necessary for either TSPs or applications to take advantage of the client-server architecture.
Figure 4 TAPI 2.1 Architecture
Since TAPI 2.1 introduces no new API calls, there is no new SDK for it. TAPI 2.1 does introduce a new component called TCM (Telephony Client Management), a small set of DLLs that can control access to telephony lines on the server. TAPI 2.1 controls device access using the Windows NT security model based on domain user accounts. A third party can extend this capability by providing a custom TCM DLL.
One of the hidden benefits of the TAPI 2.1 upgrade for Windows 95 is that it brings TAPI 2.0 support with it. This applies to both application programs and TSPs. TSPs that are written for Windows NT 4.0 can run on Windows 95 with the TAPI 2.1 upgrade if the following conditions are met:
- The TSP does not have a device driver component.
- The TSP does not call Unicode APIs or CRT functions that are not supported by Windows 95. (The TSP itself still must be written in Unicode.)
- The TSP does not rely on Windows NT-specific features or API calls.
Similarly, the following conditions must be met in order for a TAPI 2.0 application to run on Windows 95 with the TAPI 2.1 upgrade:
- The application does not use a completion port.
- There is no Windows NT-specific code.
A common misconception about the TAPI 2.1 upgrade is that it allows a TAPI 2.0 application to run on Windows 95 with a 16-bit TSP. As discussed below, the TAPI version is a function of both the application version and the TSP version. If there are no 2.0 service providers available, then TAPI 2.1 will only negotiate up to 1.4. Architecturally, the TAPI 2.1 upgrade for Windows 95 replaces the 16-bit TAPI subsystem with a 32-bit implementation similar to Windows NT.
Currently, the only functionality available remotely is call control. There is no remote media accessibility through TAPI 2.1. Thus, you can place a call using UNIMODEM on a remote server from a telephony client (the modem will dial and a connection will be established), but the call will be useless since there is no way to access the media stream once the connection is established.
On the Horizon
A new version of TAPI is scheduled to be included in the upcoming release of Windows NT 5.0. TAPI 3.0 will include all-new COM-based objects and interfaces that allow more language-neutral accessibility to TAPI. Version 3.0 will also make it easier to access TAPI's rich set of call control features from Visual Basic®-based or Java applications.
TAPI 3.0 will also introduce media streaming to the TAPI world. As I mentioned earlier, TAPI so far has been a call control API with media handling left up to the applications. TAPI 3.0 will introduce the MSP (Media Stream Provider) to handle media streaming. This new feature will enable you to write applications such as an IP-to-POTS gateway with relative ease. Despite TAPI 3.0's new programming model, it will retain backward compatibility with existing applications.
On the TSP side, TSPI 3.0 will be introduced. The TAPI 2.x model for TSPs will still exist, with the addition of MSP-related hooks. TAPI 3.0 is built on TAPI 2.x; it's not a replacement.
At this point, it is too early for a detailed description of TAPI 3.0. As development progresses, the information will be provided with the Platform SDK documentation and future MSJ articles.
A common point of confusion about TAPI concerns UNIMODEM. I often hear frustrated developers complain: "TAPI doesn't support Caller ID!" or "TAPI gives me a 'CONNECTED' message even though the other side hasn't picked up the call!" The misunderstanding is that, in both complaints, the word "TAPI" should be replaced with "UNIMODEM."
UNIMODEM is a service provider and is not a part of TAPI. It implements a small subset of the functionality that TAPI supports. A "least common denominator" TSP for consumer modem devices, it implements the features that are commonly available on popular modems. IHVs typically ship INF files that specify the AT command sets and expected responses for their modems.
Two versions of UNIMODEM are distributed. The retail versions of Windows 95 and Windows NT 4.0 ship with a version of UNIMODEM that supports data calls. There is no voice functionality implemented on these versions.
A second version of UNIMODEM comes with the OSR2 version of Windows 95, which is currently only available preloaded on new PCs. This version, called UNIMODEM/V, supports some voice functionality. When used with supported voice modems, it lets you use the modem as a speakerphone, an answering machine, and so on. UNIMODEM/V also supports caller ID.
UNIMODEM/V can be downloaded for PCs with the retail version of Windows 95. Currently there is no UNIMODEM/V counterpart for Windows NT. UNIMODEM 5, which is scheduled to ship with Windows NT 5, will match UNIMODEM/V in functionality.
The specification for UNIMODEM is available on Microsoft's FTP site (ftp://ftp.microsoft.com/developr/drg/Modem/modemdev.exe), and may be helpful in developing applications meant to work with UNIMODEM-supported devices.
TAPI uses four types of telephony models: LineApp, Line, Phone, and Call. I am going to refer to these models as "objects" even though I may be publicly flogged for it later by die-hard object purists. The word "object" suggests formally defined COM / OLE objects, but I don't mean it in this way. I use the word "object" because it's a handy analogy for explaining the concept.
The LineApp object is the mother of all TAPI objects. It doesn't have a corresponding physical entity. Each LineApp represents a distinct TAPI session. A LineApp is created when lineInitialize(Ex) is called, and destroyed when lineShutdown is called. A process can create multiple LineApps if necessary. The LineApp owns all other TAPI objects; the ownership is exclusive and not transferable. Each instance of LineApp is identified by its LineApp handle.
The Line object is an abstraction of the telephony line represented by the TSP. Sometimes this is the telephone line, but often TAPI Line objects don't correspond to actual lines. A TAPI Line object is a logical representation of a device that functions as a method of conveyance for a telephony media stream. It is up to the TSP to determine how to represent a physical device in terms of TAPI Lines. For example, a particular TSP may model a standard BRI ISDN line as one data line and two voice lines for each B channel. Another TSP may implement it as two lines that each support multiple media modes.
Line objects are created by an application's call to lineOpen and are represented by their Line handles. Line devices are owned by the LineApp object, and multiple Line objects can be owned concurrently by a single LineApp. Line devices are most commonly used for call control purposes.
A Phone object is a logical representation of the terminal equipment. As in the physical telephony world, Phone objects can be used without calls. For example, you can use a telephone handset as an interface to a voice mail system, recording and retrieving voice mail messages without actually creating a call. The Phone object closely models the physical telephone device. It includes the concepts hookswitch, handset, speaker, microphone, programmable button, and display. A TSP can be written to expose these physical properties of telephone devices to TAPI applications. Note that the Phone object is not a call control object as the Line object is.
A Call object can be created by either TAPI or a TSP. Typically, the application creates a call by calling lineMakeCall or lineAnswer. Each Call object is identified by a Call handle. Under some conditions, a TSP can also create a Call object and inform TAPI after the fact. Running applications will be notified of this by a LINE_NEWCALL message. Call objects are created on a Line, and so are always associated with Line objects. Though a Call is always owned by a Line, the relationship is not necessarily one-to-one. A single Line object can own more than one call; call waiting and conferencing are examples of this.
Keeping Track of What's Happening
Part of what makes TAPI unique is its dynamic nature. Most Windows APIs are primarily concerned with what goes on inside your computer, but TAPI is designed to interface with external, noncomputer devices that can be highly dissimilar and unpredictable. When an application such as a dialer or communications application initiates a call control action, it's not too difficult to anticipate how the device will behave. Cases like this, however, are an exception to the rule.
TAPI applications often must handle actions and conditions that occur spontaneously on a device, so TAPI defines state notifications for each type of event. These notifications are sent via the notification mechanism that the application chose when calling lineInitialize.
TAPI applications need not handle every notification for every object. TAPI's event model is very extensive and many of the notifications and events may be irrelevant to what the application is trying to do. The application can choose to handle or disregard the notifications that TAPI or the TSP generate. Many notifications are informational, describing changes in the state or condition of the device or call. The danger in disregarding this information is that TAPI calls may fail at a later point.
The Versioning Riddle
TAPI's versioning scheme is another frequent source of confusion. Typically, versioning tracks the revision history of a program's components. Each release of executable modules, documents, or suites carries a fixed version number. Windows NT 4.0 is always Windows NT 4.0, whether you are running an old, 16-bit application or a newer, 32-bit one.
The term "versioning" is used in TAPI to describe a slightly different concept. Instead of referring to the revision history of the running code, it refers to the revision history of the API calling and parameter-passing conventions. As TAPI evolved, new API functions were added, new events and messages were added, and modifications were made to existing standards. Each such revision has been given a version number. To assure backward compatibility, operating systems ship with binaries that support a certain range of TAPI versions (see Figure 5). For example, Windows NT 4.0 out of the box can run TAPI versions 1.3, 1.4, and 2.0.
Typically, applications and TSPs support one specific version of TAPI. It is neither impossible nor illegal to write an application or a TSP that supports more than one version of TAPI, but the few benefits of doing so are far outweighed by the effort required to ensure proper operation. For TAPI applications that must run on both Windows 95 and Windows NT, TAPI 1.4 is the de facto standard.
TAPI applications for Windows NT can be compiled for both Unicode and ANSI, but TAPI applications for Windows 95, Windows 98, and Windows CE can use only one of these. The issue is usually resolved at compile time by defining UNICODE, but this causes problems when building a TAPI 1.x application targeted for Windows 95. In this case, it is necessary to define the TAPI application version as follows:
#define TAPI_CURRENT_VERSION 0x00010004
While it's not always necessary, it's good programming practice to define the intended TAPI version. This will help to avoid potential problems in the future when the default version of TAPI is updated.
Defining this value does not determine which API version the application will use. It's merely a compile time directive to conditionally link a particular version of the API. The actual API version used is negotiated at runtime using the lineNegotiateAPIVersion call. On occasion, I have seen customer code that called this API function as follows:
lineNegotiateAPIVersion(hLineApp, dwDeviceID,
    0x00010004, // I want at least 1.4
    0x7fffffff, // Highest I can get
    &dwAPIVersion, &extensionID);
While it might seem like a good idea to always ask for the latest and the greatest, doing so can introduce some practical problems. As TAPI has evolved, many of its data structures have been modified. Also, some new TAPI messages have been introduced while others have been retired or subtly changed in their meaning. Writing an application that will work with multiple versions of TAPI is not impossible, but it's a complex task that should be discouraged.
There are two factors that determine which version of TAPI an application should use: feature set and platform. If an application needs a particular feature that's only available on a certain TAPI version, use the version that supports that capability. This may limit the platforms on which the application can run. On the other hand, if an application must support as many platforms as possible, you'll have to sacrifice features that are not available on early versions of TAPI.
The Platform SDK reference contains a section called "What's New" that describes the differences between TAPI 1.4 and TAPI 2.0. The 16-bit TAPI SDK outlines TAPI 1.3. Unfortunately, there is no single document that clearly outlines the differences between all versions of TAPI.
Life Cycle of a Call
So far I've been discussing TAPI in abstract terms. In the following section, you'll look at TAPI from the API perspective by following the life of a call. The discussion will follow roughly the accompanying sample code (see Figure 6), which was written for TAPI 2.0 using event-based notification.
The sample program is a simple console mode application that takes a few parameters. In outbound mode, it calls a specified destination. In incoming mode, it answers an incoming call in the specified media mode. Once the call is connected, pressing Ctrl-C will hang up the call and exit the application. Disconnection from the target address will also terminate the application.
Before jumping into the details, I'll discuss some programming concepts that may be unfamiliar to developers who have not yet worked with TAPI.
VARSTRING and Variable Length Structures
This TAPI concept makes many programmers cry. Much of the information that is passed in TAPI programs is provided by the TAPI service provider. Often this data is amorphous in nature and variable in length. To efficiently pass this data around, TAPI uses variable length "structures" that are really headers. Variable length structures consist of conventional structures along with offset and size information for additional data. The additional information is appended to the structure itself.
Variable length structures always start with three DWORD members: dwTotalSize, dwNeededSize, and dwUsedSize. When allocating and initializing a structure, dwTotalSize must be initialized with a value that reflects the total amount of allocated memory. Upon return from an API call, the caller should check the value of dwNeededSize against dwTotalSize. If dwNeededSize is larger than dwTotalSize, then the memory initially allocated was insufficient. You will need to reallocate the memory and try the call again.
There is no guarantee that the amount of memory needed will be the same on the second call. The contents of the structure are dynamic so the size can change. This is not a problem with relatively static TSPs like UNIMODEM, but it's an important consideration in high-volume TSPs such as PBX switches. It's a good idea to keep checking the dwNeededSize until the memory requirement is satisfied.
It's an acceptable shortcut to arbitrarily allocate extra buffer space in anticipation of extra memory requirements. The sample code demonstrates this approach by allocating 1K buffers. It's still necessary to check afterward that what you anticipated was indeed enough. Continue looping until you have confirmed that the buffer size is enough to hold the necessary information.
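The allocate, check, and retry loop can be sketched in portable C. This is a minimal illustration rather than the article's sample code: VarHeader mirrors only the three leading DWORD members, and query_device is a hypothetical stand-in for a TAPI query (such as lineGetDevCaps) that fills a caller-supplied buffer.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Mirrors only the three leading DWORDs of a TAPI variable length structure. */
typedef struct {
    uint32_t dwTotalSize;   /* set by the caller: bytes allocated */
    uint32_t dwNeededSize;  /* set by the callee: bytes required  */
    uint32_t dwUsedSize;    /* set by the callee: bytes written   */
} VarHeader;

/* Hypothetical stand-in for a TAPI query. It pretends the provider has
   'payload' extra bytes to append after the fixed header. */
long query_device(VarHeader *hdr, uint32_t payload)
{
    if (hdr->dwTotalSize < sizeof(VarHeader))
        return -1;                              /* header itself does not fit */
    hdr->dwNeededSize = (uint32_t)sizeof(VarHeader) + payload;
    if (hdr->dwNeededSize <= hdr->dwTotalSize) {
        memset((unsigned char *)(hdr + 1), 0xAB, payload);  /* fake payload */
        hdr->dwUsedSize = hdr->dwNeededSize;
    } else {
        hdr->dwUsedSize = (uint32_t)sizeof(VarHeader);      /* header only */
    }
    return 0;
}

/* Allocate, query, and grow until the buffer is large enough.
   Returns a malloc'd buffer the caller must free, or NULL on failure. */
VarHeader *query_with_retry(uint32_t payload, uint32_t initialSize)
{
    uint32_t size = initialSize;
    VarHeader *hdr = NULL;

    if (size < sizeof(VarHeader))
        size = (uint32_t)sizeof(VarHeader);

    for (;;) {
        VarHeader *grown = realloc(hdr, size);
        if (grown == NULL) { free(hdr); return NULL; }
        hdr = grown;
        memset(hdr, 0, size);
        hdr->dwTotalSize = size;

        if (query_device(hdr, payload) != 0) { free(hdr); return NULL; }
        if (hdr->dwNeededSize <= hdr->dwTotalSize)
            return hdr;                         /* everything fit */
        size = hdr->dwNeededSize;               /* may differ on every pass */
    }
}
```

Starting with a generous initialSize, such as the 1K the sample uses, usually makes the loop finish on the first pass, but the check still guards against a provider whose data grew in the meantime.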
VARSTRING is the most extreme example of the variable length structure. Its header is minimal and its sole purpose is to hold opaque and amorphous data. Most commonly, it's used to pass data between TSPs and running applications that TAPI doesn't touch.
Asynchronous Calls and LINE_REPLY
TAPI calls are frequently asynchronous. When an API function requests that a certain action be performed, the request is passed to TAPI, TAPI passes it down to the TSP, the request is queued, and the API function returns. Later when the operation is completed, the application is notified.
It is important to realize that the successful return of the API call doesn't mean the requested operation was successful. It simply means that the TSP has acknowledged the request. The true result of the operation won't be available until some later point when the operation is actually completed. If the operation is asynchronous, then when the API call returns the application will be given a request identifier for the pending operation. When TAPI advises the application of request completion through LINE_REPLY, it uses this ID to identify which request has completed.
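One way to keep track of pending operations is a small table keyed by request ID. The sketch below is an assumption-laden illustration, not part of the sample: the table, track_request, and on_line_reply are hypothetical names, and it relies on the Platform SDK convention that an asynchronous function returns a positive request ID on acceptance and a negative error otherwise, with the reply carrying the ID and a result of zero for success.

```c
#include <stddef.h>

#define MAX_PENDING 8

typedef enum { REQ_FREE = 0, REQ_PENDING, REQ_SUCCEEDED, REQ_FAILED } ReqState;

typedef struct {
    long     requestID;  /* positive ID returned by the asynchronous call */
    ReqState state;
    long     result;     /* meaningful once the reply has arrived */
} PendingRequest;

static PendingRequest g_pending[MAX_PENDING];

/* Record the return value of an asynchronous call.
   Returns 0 if the request is now tracked, or -1 if the call was not
   accepted (non-positive value) or the table is full. */
int track_request(long returnValue)
{
    if (returnValue <= 0)
        return -1;
    for (size_t i = 0; i < MAX_PENDING; ++i) {
        if (g_pending[i].state == REQ_FREE) {
            g_pending[i].requestID = returnValue;
            g_pending[i].state = REQ_PENDING;
            g_pending[i].result = 0;
            return 0;
        }
    }
    return -1;
}

/* Handle a LINE_REPLY-style notification. Returns the matching entry with
   its final state filled in, or NULL if no pending request has this ID. */
PendingRequest *on_line_reply(long requestID, long result)
{
    for (size_t i = 0; i < MAX_PENDING; ++i) {
        if (g_pending[i].state == REQ_PENDING &&
            g_pending[i].requestID == requestID) {
            g_pending[i].result = result;
            g_pending[i].state = (result == 0) ? REQ_SUCCEEDED : REQ_FAILED;
            return &g_pending[i];
        }
    }
    return NULL;
}
```

Only when on_line_reply reports REQ_SUCCEEDED has the operation actually completed; the caller would then set the entry's state back to REQ_FREE so the slot can be reused.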
In the Beginning
Typically, the first TAPI call an application makes is lineInitialize or lineInitializeEx. A TAPI 1.3 application uses lineInitialize, and a TAPI 2.x application uses lineInitializeEx. These calls will load TAPI.DLL into the application's memory space. TAPI will also start TAPISRV (or TAPIEXE, in the case of Windows 95) and load the installed TSPs. If this is the first TAPI session, TAPI will call TSPI_providerEnumDevices for each TSP installed. Each TSP will respond with the number of line and phone devices it's exposing, and TAPI will assign a device ID for each of them. TAPI also calls TSPI_providerInit to initialize each provider. Once complete, the API call returns the total number of enumerated line devices; device IDs are zero-based, so they run from 0 to one less than that number.
TAPI also returns the LineApp Handle, which is like a ticket stub at the movies. The LineApp Handle identifies the TAPI session your application is using. Keep this somewhere safe as you'll need to present your ticket stub later.
When calling the initialization API function, the application must select and provide TAPI with an event notification method. Telephony applications are strongly event-driven. Telephony events reported to applications by TAPI include changes in existing calls, new incoming calls, connections, and disconnections.
In TAPI 1.x, callback functions are the only method of event notification. TAPI uses a hidden window to drive this callback mechanism. This means that TAPI 1.x applications must meet the following conditions: the application must dispatch messages, and the thread that calls lineInitialize must handle the callbacks. When writing a console-based application, the first requirement is often problematic.
TAPI 2.0 added two additional notification mechanisms: event-based and completion-port-based. These are useful in writing Windows NT service or console-based applications because they provide more flexible implementation options. While they can be used interchangeably, some applications benefit more from one than the other. Events are relatively simple to use and are good for small applications or services. The completion ports can be queued and queried, and are the logical choice for higher volume applications where many lines are opened and many activities take place concurrently. Completion ports are not implemented on Windows 95, even with the TAPI 2.1 update.
Selecting a Line
Users frequently have more than one telephony line device installed on a machine. Some of them are obvious devices like modems and telephone lines. Others are less obvious. For example, Direct Cable Connection on Windows 95 installs a telephony line for each port it can use. It's also common to have more telephony devices than physical hardware. For example, more than one modem can be installed on a single port, each with a different device ID. An ISDN adapter TSP may model its various modes (voice, data, bonded data, and so on) as different logical devices. In traditional programming before TAPI, you chose communication devices by choosing the underlying hardware. For example, you'd choose COM1 to use the fax/modem on that port. TAPI's device IDs have nothing to do with the underlying hardware.
With TAPI, instead of picking devices according to where they are, you pick them according to what they can do. The TAPI API function lineGetDevCaps lets you query each device for its capabilities. The function populates the LINEDEVCAPS structure, which includes useful information such as the supported media modes, the TSP name, and line features. An application can use the information returned by the function to choose the line device with the desired capabilities.
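Choosing a device by capability amounts to a scan over the device IDs with a bitmask test. The following is a portable sketch under stated assumptions: DeviceCaps holds only a media-mode field standing in for what lineGetDevCaps would report in LINEDEVCAPS, and the MODE_ values are illustrative bits, not the LINEMEDIAMODE_ constants from the TAPI headers.

```c
#include <stdint.h>

/* Illustrative media-mode bits. Real code would use the LINEMEDIAMODE_
   constants from the TAPI headers instead. */
#define MODE_INTERACTIVEVOICE 0x1u
#define MODE_DATAMODEM        0x2u
#define MODE_G3FAX            0x4u

/* Hypothetical, cut-down stand-in for the capabilities of one line device. */
typedef struct {
    uint32_t dwMediaModes;  /* bitmask of supported media modes */
} DeviceCaps;

/* Returns the first device ID whose capabilities include every bit in
   'wanted', or -1 if no device qualifies. IDs run from 0 to numDevs - 1. */
long find_device(const DeviceCaps *caps, uint32_t numDevs, uint32_t wanted)
{
    for (uint32_t id = 0; id < numDevs; ++id) {
        if ((caps[id].dwMediaModes & wanted) == wanted)
            return (long)id;
    }
    return -1;
}
```

An application that needs both data and fax would pass MODE_DATAMODEM | MODE_G3FAX as wanted; which port the matching device sits on never enters the decision.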
Once the desired device is selected, the API version must be negotiated using the lineNegotiateAPIVersion function, which allows the caller to specify low and high versions separately. In theory, you can select a range of API versions. In practice, as discussed earlier, it's a good idea to specify the API version for which the application was written.
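The effect of the low and high values can be modeled as a range intersection. This sketch assumes a simplified view in which negotiation settles on the highest version that both the application's range and the system's range (TAPI plus the TSP) support; negotiate_version is a hypothetical helper, not a TAPI function, and the version constants follow the 0x00010004 encoding used earlier in this article.

```c
#include <stdint.h>

#define TAPI_VERSION_1_3 0x00010003u
#define TAPI_VERSION_1_4 0x00010004u
#define TAPI_VERSION_2_0 0x00020000u

/* Simplified model of version negotiation: returns the highest version
   inside the overlap of the two ranges, or 0 if the ranges do not overlap. */
uint32_t negotiate_version(uint32_t appLow, uint32_t appHigh,
                           uint32_t sysLow, uint32_t sysHigh)
{
    uint32_t low  = (appLow  > sysLow)  ? appLow  : sysLow;
    uint32_t high = (appHigh < sysHigh) ? appHigh : sysHigh;
    return (low <= high) ? high : 0;
}
```

With both bounds pinned to TAPI_VERSION_1_4, the result is 1.4 on any system that supports it. With an open-ended high bound such as 0x7fffffff, the same application would get 2.0 on one system and 1.4 on another, and would have to cope with both sets of structures and messages.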
Opening the Line
Now that you have selected a line device, you can open the line. But first you must make a few decisions. The Platform SDK documentation does a good job of explaining what each parameter means and what it does. If UNIMODEM is the TSP for the selected line, you must watch out for a few additional details:
- If a line is open with LINECALLPRIVILEGE_MONITOR, no call state information will be available unless the line is opened with LINECALLPRIVILEGE_OWNER, either by another process or by another call to lineOpen.
- UNIMODEM will only open the port if LINECALLPRIVILEGE_OWNER is set.
- UNIMODEM will accept LINEMEDIAMODE_INTERACTIVEVOICE even though the modem device itself is data only. This allows dialer applications to dial outbound voice calls.
Optionally, an application can call lineOpen with a value passed in the dwCallbackInstance parameter. TAPI doesn't touch this variable; it simply saves it internally. When TAPI notifies the application with a line event, it will pass back this value. The dwCallbackInstance parameter can be used for any purpose, but is especially useful as a "token" that identifies the line in a multiline application.
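The "token" idea can be sketched as follows. This is a simplified stand-in, not the real TAPI callback signature: DemoLineContext and demo_line_callback are hypothetical, and I assume the application passed the address of its own per-line record as the callback instance value when it opened the line.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-line bookkeeping owned by the application. */
typedef struct {
    const char *name;         /* label the application gave this line */
    unsigned    events_seen;  /* number of events delivered so far */
} DemoLineContext;

/* Simplified stand-in for a line callback. The instance value comes
   back exactly as the application supplied it, so it can be cast
   straight back to the per-line record it identifies. */
void demo_line_callback(uint32_t message, uintptr_t callback_instance)
{
    DemoLineContext *ctx = (DemoLineContext *)callback_instance;
    assert(ctx != NULL);
    (void)message;  /* a real handler would switch on the message */
    ctx->events_seen++;
}
```

With one record per open line, a single callback can serve every line in a multiline application without a lookup table.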
A successful call to lineOpen yields a line handle that can be used to address the line in subsequent line device API calls.
Creating a Call
You can initiate outbound calls by calling the lineMakeCall API function. As described in the previous section, the line must already be open since this call requires a valid line handle. You also will need to specify the destination address, a fancy way of describing the "phone number" you are dialing. You may simply feed in the number to be dialed as needed, or have TAPI prepare one for you by "translating" a canonically formatted number.
What is "translation"? You may recall that when you first installed a modem on a TAPI-supported PC, you filled out some dialog boxes asking for your area code, country, whether to dial 9 to get an outside line, and so on. That information is used for the translation process. TAPI provides the API function lineTranslateAddress to achieve this. You must pass the destination address in the canonical format, which is defined as follows: +Country Code (Area Code) Exchange-Station. For example, the phone number for Microsoft would be represented as +1 (425) 882-8080. The spaces and parentheses are important. TAPI will perform the translation based on the location profile and translation options, and populate the structure LINETRANSLATEOUTPUT with the translation results.
LINETRANSLATEOUTPUT contains, among other things, two types of translated destination addresses: the dialable string and the displayable string. The dialable string is the string that should be passed to lineMakeCall. It stores the digits in exactly the way they should be dialed, including calling card numbers in plain text. The displayable string contains a "safe" string that hides the card information. The displayable string always should be used for user interface displays to avoid inadvertently disclosing a user's calling card information.
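Because the spaces and parentheses in the canonical format matter, it can help to build the string in one place rather than by hand. The helper below is a hedged sketch in portable C; demo_format_canonical is a hypothetical name, and the function only assembles the string that would then be handed to lineTranslateAddress. It does no translation itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Writes "+CountryCode (AreaCode) Subscriber" into `out`.
   Returns 0 on success, or -1 if a required part is empty or the
   buffer is too small to hold the whole string. */
int demo_format_canonical(char *out, size_t out_size,
                          const char *country_code,
                          const char *area_code,
                          const char *subscriber)
{
    assert(out != NULL && country_code != NULL);
    assert(area_code != NULL && subscriber != NULL);

    if (strlen(country_code) == 0 || strlen(subscriber) == 0)
        return -1;

    int n = snprintf(out, out_size, "+%s (%s) %s",
                     country_code, area_code, subscriber);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

Called with "1", "425", and "882-8080", it produces the same string as the Microsoft example above.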
In most cases, simply passing the translated dialable string to lineMakeCall will satisfy your dialing needs. There are, however, some specific instances where it is desirable to break up the dialing into multiple parts. In particular, you must use this method when it's necessary to insert delays in dialing between sets of digits. A semicolon appended to a dialable string tells TAPI that dialing is incomplete and more digits are to follow. Subsequent digits can be dialed using the lineDial API call. The dialing stays in an incomplete state as long as a semicolon is suffixed to the dialable string. You can complete the dialing by either omitting the semicolon or calling the lineDial function with a null destination address.
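The semicolon rule is easy to capture in a small predicate. This is a minimal sketch, assuming the dialable string is a plain NUL-terminated ANSI string; demo_dialing_incomplete is a hypothetical helper, not a TAPI function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 when the dialable string ends in ';', meaning more digits
   are expected to follow, and 0 when dialing would be complete.
   An empty string counts as complete. */
int demo_dialing_incomplete(const char *dialable)
{
    assert(dialable != NULL);
    size_t len = strlen(dialable);
    return len > 0 && dialable[len - 1] == ';';
}
```

An application that dials in stages could use a check like this to decide whether it still owes TAPI a final lineDial call.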
Once dialing is completed, TAPI will start reporting the call state using LINE_CALLSTATE messages. Typically, the call state first will transition to LINECALLSTATE_PROCEEDING, indicating that the call is being routed to the destination. If the call is answered, it will shift to the LINECALLSTATE_CONNECTED state, indicating that the call is established. Depending on the capability of the service provider and the underlying hardware, it is common to encounter several other call states before reaching the state LINECALLSTATE_CONNECTED. The exact meanings of each state are documented in the Platform SDK documentation.
Incoming call handling is simpler than outgoing call handling. For an application to respond to an incoming call, it must open the line with OWNER privilege. Ownership can be requested on a per media mode basis, allowing multiple applications to have ownership of different media modes over the same line. If the TSP can detect media modes for incoming calls, you can use this ability to efficiently share lines between multiple applications. For example, UNIMODEM/V uses the distinctive ring capability of POTS to determine the media modes predefined by the user.
In the POTS + UNIMODEM scenario, however, it is most common to request all supported media modes plus UNKNOWN. This is because POTS and UNIMODEM are normally incapable of detecting the media mode. Applications answer each incoming call, determine its media mode, then hand it off to the application that owns that media mode. A good example of an application of this is the Operator application that ships with UNIMODEM/V.
Once the line is successfully opened, the application simply waits for a call. When a call arrives that matches the media modes requested, the application receives a LINE_CALLSTATE message with the call state LINECALLSTATE_OFFERING, which indicates that a call is being offered to the application. Depending on which messages were enabled by the lineSetStatusMessages call, the application may also receive a LINEDEVSTATE_RINGING notification as the call rings. If this is the case, dwParam2 indicates the ring mode and dwParam3 indicates the ring count for this call.
While it makes intuitive sense to rely on LINEDEVSTATE_RINGING to determine the presence of an incoming call rather than LINECALLSTATE_OFFERING, it's bad programming practice for several reasons. LINEDEVSTATE_RINGING simply means that the line is in a ringing state (the station is being alerted of an incoming call), and does not mean the application necessarily owns that particular call. For example, the call could be for a media mode that the application doesn't support. Also, since you are likely to get more than one ring, it's very difficult in a high-volume situation to tell how many distinct calls have come in by looking at the LINEDEVSTATE_RINGING notification.
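The difference is easy to see with a toy event stream. In this sketch the DemoEvent values are hypothetical stand-ins for the two notifications discussed above, and demo_count_calls counts calls the recommended way: one per offering, no matter how many rings arrive.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the two notifications discussed above. */
typedef enum {
    DEMO_EVT_RINGING,   /* models LINEDEVSTATE_RINGING */
    DEMO_EVT_OFFERING   /* models LINECALLSTATE_OFFERING */
} DemoEvent;

/* Counts distinct incoming calls by counting offerings only.
   Ring notifications are ignored because a single call usually
   rings several times. */
unsigned demo_count_calls(const DemoEvent *events, size_t count)
{
    assert(events != NULL || count == 0);
    unsigned calls = 0;
    for (size_t i = 0; i < count; ++i) {
        if (events[i] == DEMO_EVT_OFFERING)
            calls++;
    }
    return calls;
}
```

A stream containing two offerings and five rings yields two calls; counting rings would have given five.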
In the United States using POTS, caller ID information is sent by the telephone company between the first and second ring. If the TSP supports it, caller ID information is reported by a LINE_CALLINFO notification when the information becomes available. When an application receives this notification, it can call lineGetCallInfo to obtain the actual data.
Answering Offered Calls
When an application receives a call offering, it can answer it by issuing a lineAnswer call. As with outbound calls, after a successful API function return, the call state should transition to LINECALLSTATE_CONNECTED, indicating that the call is established.
When a Connection is Made
The sample code really doesn't do much once the call is connected. It simply reports status changes if they occur. Here are some hints on where to go from here. In most cases, once the connection is made it is no longer in TAPI's domain until something happens on the line that requires call control action. The media stream itself is handled by the APIs that are appropriate for the media of interest.
If the application needs to make a data connection, then you will need to obtain a 32-bit communications handle. A call to the API function lineGetID with "comm/datamodem" as the device class will give the application the Win32 communications handle it needs. This handle is very similar to what you would get by opening the COM port using the CreateFile API call. By default, the handle is opened for overlapped I/O operation.
UNIMODEM duplicates the communications handle it owns when an application calls lineGetID. When the line is closed, UNIMODEM closes the original communications handle, but doesn't close the handles owned by the application. If an application obtained a communications handle by calling lineGetID, the application is responsible for closing the handle once it's done with it. Failure to do so will cause the handle to remain opened for the duration of the process instance, and UNIMODEM will not be able to make any subsequent calls or receive any notifications.
On LINEMEDIAMODE_AUTOMATEDVOICE calls, the wave API is used to play back or record voice signals on the phone line. Applications must call lineGetID using the device class "wave/in" or "wave/out" to obtain a handle to a wave audio device. Many sound drivers (including the SERWAVE driver that ships with UNIMODEM/V) are half-duplex, so the same device cannot be opened simultaneously for both input and output.
DTMF and Tones
While TAPI doesn't provide much in terms of media control, it does provide methods to generate and monitor telephony-specific media activities. DTMF (Dual Tone Multi Frequency) and tone functions fall in this category. These functions, if supported by the underlying hardware and TSP, allow TAPI applications to generate DTMF digits and tones. The term "DTMF digits" refers to the specific tones generated by the digit buttons on touch-tone telephones. "Tones" here refers to specific sounds indicating line conditions, such as a dial tone or busy signal. Similarly, applications can monitor tones or gather digits if the underlying hardware and TSP are capable.
Ending a Call
A call can be dropped in one of two ways: by the remote end or by the application via an API call. This is a fancy way of saying either you hang up or I hang up. TAPI applications hang up by calling lineDrop. As with lineMakeCall, calling lineDrop doesn't mean that the call is immediately hung up. The call to lineDrop simply initiates the process. Depending on the TSP and how the call is set up, the call might go through several call states before settling down to LINECALLSTATE_IDLE, which indicates that the call no longer exists on the line. If a call is dropped by the other side or by divine intervention, the TSP will signal LINECALLSTATE_IDLE when the call no longer exists on the line.
LINECALLSTATE_IDLE is a terminal call state; it cannot transition to any other. Consider it the "corpse" of a call. Once this state is reached, call control API functions will not work on the call, but information pertaining to the call will be available until the call is deallocated.
In general, it's good practice to clean up when handling the LINECALLSTATE_IDLE notification. This is a convenient place for cleanup because the notification will be received no matter how the call was dropped. It is also a safe place for cleanup since no action can be taken on the call that may change the call state. Whenever possible, avoid doing cleanup when calling lineDrop. Cleanup at this stage may cause deallocation of the call, or the closing of handles that are in use. If no more calls are to be placed on the line, the application may elect to close the line.
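The cleanup rule can be modeled with a tiny state handler. This is a sketch under stated assumptions: DemoCall, DemoCallState, demo_request_drop, and demo_on_call_state are hypothetical, and clearing call_allocated stands in for whatever the application really does to release the call, such as calling lineDeallocateCall.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for a few of the call states discussed above. */
typedef enum {
    DEMO_CALL_CONNECTED,
    DEMO_CALL_DISCONNECTED,
    DEMO_CALL_IDLE          /* terminal state */
} DemoCallState;

/* Hypothetical per-call bookkeeping. */
typedef struct {
    int call_allocated;  /* 1 while the application holds the call */
    int drop_requested;  /* 1 once the application asked to hang up */
} DemoCall;

/* Models the point where the application calls lineDrop.
   It only records the request and performs no cleanup. */
void demo_request_drop(DemoCall *call)
{
    assert(call != NULL);
    call->drop_requested = 1;
}

/* Models the call state handler. Resources are released only when
   the call reaches IDLE, whichever side ended the call. */
void demo_on_call_state(DemoCall *call, DemoCallState state)
{
    assert(call != NULL);
    if (state == DEMO_CALL_IDLE && call->call_allocated)
        call->call_allocated = 0;  /* stands in for deallocating the call */
}
```

Because the release happens in one place, a hang-up by the remote party follows exactly the same path as a hang-up the application requested.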
After all the telephony resources are deallocated, the TAPI session can be terminated by a call to lineShutdown. As a general guideline, an application should keep its calls to lineInitializeEx and lineShutdown to a minimum. These are costly API calls. On each call to lineInitializeEx, every TSP is loaded and initialized. Depending on the TSP, this can be a lengthy process. Thus, it's a good idea to retain the TAPI session until the application is completely done with it.
That wasn't too bad, was it? The good news is that I have covered most of the basic programming concepts in TAPI. It's true that TAPI programming is quite a bit different from conventional Win32 communication. These models, however, are functional and efficient, and do an excellent job of providing consistent and flexible programming methods. The telecommunications industry is notoriously diverse and proprietary, with few accepted programming standards. While TAPI is far from perfect, it provides what it set out to provide: a device-independent programming environment for both ISVs and IHVs.
The bad news is that I've only scratched the surface of a very rich API set. I have only touched on phone devices, and I didn't mention any of the more advanced control functions like conferencing and handoff. TAPI 2.0 introduced a call agent and proxy functionality that I have yet to mention. Then there is the topic of TSP development, which I haven't discussed at all. No need to worry, though. The hard part is over. All of these advanced features are built on top of the same basic concepts I have already examined.
In closing, I would like to mention some resources and publications that are helpful in developing TAPI applications or TSPs. The Platform SDK is an indispensable resource. While Developer Studio and Microsoft Visual C++® ship with headers and libraries that are pulled from the SDK, they lack some extremely useful samples and utilities. The Platform SDK is available from the Microsoft Web site or through an MSDN subscription. The Platform SDK is updated more frequently than Microsoft's developer products. New information and fixes are first included in the SDK, and then propagated to developer products.
Platform SDK Resources
In the Platform SDK documentation, TAPI is now grouped under "Networking and Distributed Services" rather than "Files and I/O Systems" where it was previously. This reflects TAPI's new positioning as a strategic networking component in future Microsoft operating systems. The documentation includes a fairly detailed overview of each component. If you are new to TAPI, it is worth going through each section.
The SDK ships with two development tools that are extremely useful in working with and diagnosing TAPI applications. TB (TAPI Browser) lets you issue individual API calls without actually having to write code. API calls are presented in a list format, and parameters can be specified individually. The return codes and event notifications are presented in another window. TB is useful in "prototyping" an application because it can simulate how TAPI functions will work. It can also be used as a monitoring application to aid debugging. ESP (Economical Service Provider) is the TSP counterpart of TB, and simulates a TSP. You can create calls and various line events using ESP.
The SDK ships with several sample programs. If you are new to TAPI, I recommend taking a look at these and playing with them. TAPICOMM demonstrates the creation of an outbound datamodem call. It's similar in functionality to the TTY sample, using TAPI to create the outbound call and emulating TTY once connected. Dialer is similar to the Phone Dialer applet that comes with Windows. Dialer creates outbound voice calls and logs line activity if requested. ATSP32 is a skeleton TSP sample program that demonstrates the minimum requirements for a functional TSP. This is a good starting point for writing a TAPI 2.0 service provider. ACD demonstrates the new call agent capability of TAPI 2.0.
From the April 1998 issue of Microsoft Systems Journal.