James Finnegan is a developer at Communica, Inc., a Massachusetts-based company specializing in system software design. He can be reached via firstname.lastname@example.org.
Many of the articles and columns in MSJ present techniques and technologies that are far more than just a flash in the pan. In fact, I find that I get the most feedback on two of my oldest articles, "Hook and Monitor Any 16-bit Windows Function With Our ProcHook DLL" (January 1994) and "Test Drive Win32 from 16-bit Code Using the Windows NT WOW Layer and Generic Thunk" (June 1994), even though they are both very 16-bit-centric.
So I'm dedicating this column as my last hurrah of 16-bit Windows®, showing the 16-bit to Win32® integration enhancements that are now available under Windows 9x, Windows NT®, and Windows 2000. This month I'd like to stress that system-level development doesn't always have to involve kernel mode explicitly.
Now without further ado, let's look at the changes and enhancements that are present in the current incarnations of Windows on Win32 (WOW). All of the relevant APIs are listed in Figure 1. I've enhanced the original applications presented in my June 1994 WOW article so you can compare and contrast the enhancements and implementation differences on the various Win32 implementations available today.
Let's start by complaining. Architecturally, WOW under Windows NT is implemented so that each 16-bit task is an individual thread under a single Win32 process (NTVDM.EXE). Windows 9x, however, is a baroque hodgepodge of 16-bit and 32-bit code, without the strict structure that exists under Windows NT. This was largely done to ensure a high level of compatibility with legacy apps and drivers.
Therefore, not surprisingly, many of the nonstandard thread features that I employed in WOW under Windows NT do not work under Windows 9x. Most notable is the ability to create threads from the context of a 16-bit app. (See the Create New Thread from Win32 and Create New Thread from Win16 menu items and their associated code in WOWTEST.EXE.) Even though each 16-bit app under Windows 9x has associated Win32 data structures (process IDs and the like), the infrastructure to do advanced Win32 things such as thread creation does not exist. Although you may know to stay away from thread creation when operating under Windows 9x, this issue may haunt you when calling Win32 API functions that you did not create. As noted in the SDK documentation, the functions most likely to fall into this category are the common dialog calls, which implicitly create additional threads for processing.
Likewise, since 16-bit apps under Windows 9x are not threads within the context of a Win32 process, a 16-bit app's Task Database (TDB) does not contain a Win32 thread ID at offset 0xFC (as shown in the List WOW Tasks menu item). This behavior under Windows NT is undocumented anyway, so I shouldn't complain about its absence in Windows 9x. In addition, since there's no need for Windows 9x to associate a task within a Win32 process, I assume that this data was missing from the get-go anyway.
Disappointingly, under Windows 9x, Win32 code cannot post user-defined messages to 16-bit windows. As shown in the Post Messages from Win32 menu item, documented window messages can be passed freely between Win32 and 16-bit code, and their parameters are converted transparently to suit each environment. But where Windows NT permits undocumented or user-defined messages to be passed back and forth (under the assumption that parameters passed by reference are translated by your code), Windows 9x quietly disposes of such messages, never posting them to the target 16-bit window. Bummer.
However, as you'll see toward the end of this column, enhancements to the Generic Thunk API have more than made up for this deficiency by finally permitting callbacks from Win32 into 16-bit code. So even though you can't use all of the Generic Thunk features across Windows 9x and Windows NT, you can easily overcome any particular design limitation with functionality available in both.
Finally, a Real CallProc in 16-bit Windows!
From the outset, the most puzzling aspect of WOW has always been its quirky CallProc32W function. Prototyped as a Pascal-type function, CallProc32W took a variable number of parameters (something that Pascal and its function prototypes do not support). Because the Pascal-style prototype was used, the called function is responsible for removing passed arguments from the stack rather than the caller, as in C. This created the need for some funky code in the implementation of CallProc32W, which used an indexed jump to adjust the stack properly upon the function's return, depending on the number of parameters passed in. Outside of programming in assembler, I have no idea how a developer was expected to call this function from within a modern high-level language without great pain. This inspired me to create WOWCallProc32 (in WOWGlue.c), which used a standard variable-argument C-style prototype and some inline assembler to flip the parameters around in order to satisfy CallProc32W's cracked implementation.
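To make the idea of flipping parameters concrete, here is a minimal portable sketch. It is not the WOWCallProc32 code from WOWGlue.c (which relies on inline assembler on the 16-bit stack); it only shows a C-style variadic function gathering its 32-bit arguments and storing them in the opposite order. The name collect_reversed and the cap MAX_THUNK_ARGS are hypothetical, and the sketch assumes every argument is passed as a 32-bit unsigned value.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_THUNK_ARGS 32  /* hypothetical cap chosen for this sketch */

/*
 * Collect `count` 32-bit arguments passed C-style (variadic) and store
 * them in `out` in reverse order. Returns the number of values stored.
 * Every variadic argument must really be a uint32_t at the call site.
 */
size_t collect_reversed(uint32_t *out, size_t count, ...)
{
    va_list ap;
    size_t i;

    assert(out != NULL && count <= MAX_THUNK_ARGS);

    va_start(ap, count);
    for (i = 0; i < count; i++) {
        /* The first variadic argument lands in the last slot, and so on. */
        out[count - 1 - i] = va_arg(ap, uint32_t);
    }
    va_end(ap);
    return count;
}
```

Note that the caller has to cast each argument to a 32-bit type explicitly, because a variadic function has no prototype information to widen them.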
Added to the Generic Thunk in Windows 9x, Windows NT 4.0, and Windows 2000 is an export in KRNL386.EXE called CallProcEx32W (export ordinal 518). This function, which still takes a variable number of parameters, is finally prototyped as a C-style function. This permits you to dispose of any contortions previously required to call into Win32 code. After all, CallProc32W is the secret sauce to the Generic Thunk. You'd think that it wouldn't have been designed to be so impenetrable in the first place!
Using CallProcEx32W is straightforward. Here's its prototype and related defines:
#define CPEX_DEST_STDCALL 0x00000000L
#define CPEX_DEST_CDECL 0x80000000L
These are defined in WOWNT16.H, which is included in all 32-bit versions of Visual C++®. The first parameter indicates the calling convention of the API being called (either of a Pascal type, indicated by CPEX_DEST_STDCALL, or a C type, indicated by CPEX_DEST_CDECL). This value is ORed with the number of parameters that are being passed to the target Win32 function. The second parameter is a bitmap indicating which parameters are passed by reference and therefore need their addresses converted from selector:offset to linear address. The rightmost bit indicates the first parameter, the one to its left indicates the second one, and so on. You can OR the #defines I've placed in WOWGlue.h to make your life easier. The third parameter is the address of the target Win32 function, obtained with a call to GetProcAddress32W. Any parameter after the third is passed to the target function (from left to right or right to left, depending on the calling convention referenced). Here's how to call the Win32 MessageBox from 16-bit Windows:
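// Get a pointer to MessageBox in USER32.DLL
// (hUser32 and pfnMessageBox are illustrative DWORD variables)
hUser32 = LoadLibraryEx32W("user32.dll", 0L, 0L);
pfnMessageBox = GetProcAddress32W(hUser32, "MessageBoxA");
// Call it via the cool, new CallProcEx32W!
CallProcEx32W(CPEX_DEST_STDCALL | 4,
    PARAM_02 | PARAM_03,
    pfnMessageBox,
    (DWORD)0, (LPSTR)"Hello world!",
    (LPSTR)"WOW Call from Win32", (DWORD)MB_OK );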
As noted in my June 1994 WOW article and as shown in the previous code, all parameters must be explicitly cast to 32 bits in size. This is critical; C-type variable-argument functions do not expand parameters automatically. (The compiler, of course, cannot determine the expected sizes at compile time.) When in doubt, play it safe and cast!
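The way the first two arguments are composed can be sketched in portable C. This is only an illustration of the bit layout described above, not code from WOWGlue.h: the helpers thunk_param_bit and thunk_call_flags are hypothetical names, and the upper bound of 32 parameters is an assumption made for the sketch.

```c
#include <assert.h>
#include <stdint.h>

#define CPEX_DEST_STDCALL 0x00000000UL
#define CPEX_DEST_CDECL   0x80000000UL

/*
 * Hypothetical helper: bit in the address-conversion bitmap for the
 * n-th parameter (1-based). The rightmost bit is the first parameter.
 */
uint32_t thunk_param_bit(unsigned n)
{
    assert(n >= 1 && n <= 32);
    return (uint32_t)1 << (n - 1);
}

/*
 * Hypothetical helper: OR the calling-convention flag with the number
 * of parameters that will be passed to the target Win32 function.
 */
uint32_t thunk_call_flags(uint32_t convention, unsigned param_count)
{
    assert(convention == CPEX_DEST_STDCALL || convention == CPEX_DEST_CDECL);
    assert(param_count <= 32);
    return convention | (uint32_t)param_count;
}
```

For the MessageBox call, the text and caption are the second and third parameters, so the bitmap would be thunk_param_bit(2) | thunk_param_bit(3), and the first argument would be thunk_call_flags(CPEX_DEST_STDCALL, 4).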
Also, when pulling CallProcEx32W into your code (via the .DEF file or by calling GetProcAddress), don't forget to add the leading underscore, since this function is of the C variety (unlike most other Windows API calls):
Added Features in WOW32.DLL
Although the addition of the Generic Thunk within Windows 9x is interesting, the real enhancements that show the Generic Thunk's maturity are on the Win32 side of the house.
I struggled philosophically with these enhancements at first, largely because under Windows NT and Windows 2000 WOW is compartmentalized within the context of NTVDM, a Win32 process. Empowering Win32 with the intrinsic knowledge of a foreign environment like NTVDM and 16-bit Windows is something that made little sense architecturally, in my opinion. Windows 9x, on the other hand, has a greater level of integration in which 16-bit and Win32-based apps are treated more or less as peers. As such, some compromises and enhancements were needed to accommodate proper integration and compatibility with the Generic Thunk. All of the new APIs have been added to WOW32.DLL and are prototyped in WOWNT32.H, which is included in all versions of Visual C++. Shortly I'll go over each of these features.
Most of the functions I'll mention are for the manipulation (obviously) of the 16-bit Windows world. Under Windows 9x, this environment is always available and is presented alongside Win32 and its infrastructure. This means that Win32-based apps that do not directly relate to a running 16-bit task can utilize these functions to manipulate the 16-bit environment. In contrast, with Windows NT these functions must be called from within the context of a 16-bit thread. This means that these functions should only be used in Win32 function calls that are made directly from 16-bit code via the Generic Thunk. The Generic Thunk functions are even off limits to threads created within your Win32 code; everything must be handled within the originating 16-bit thread's context. Consider yourself warned.
VDM Memory Management and Manipulation
To show how 16-bit Windows is morphing into the world of Win32, let's first take a look at functions made available to allocate and manipulate 16-bit memory. You may be wondering what business Win32 has even knowing anything about 16-bit memory in the first place. It will become painfully obvious throughout the remainder of this column that 16-bit features and functionality exposed to Win32 are largely for the support of 16-bit callbacks, which is the ability of Win32 code to call back into the 16-bit application. Any Win32 code utilizing 16-bit callbacks clearly has to do things the 16-bit way. These memory functions, as well as the other enhancements discussed later, are here to serve that purpose.
The 16-bit Windows memory management functions exposed in WOW32.DLL are shown in Figure 2. The allocation and lock functions correlate to their 16-bit counterparts directly. The combined functions wrap the allocate/lock and deallocate/unlock calls since the locking functions have become obsolete with the introduction of protected mode Windows. They no longer interfere with Windows' memory management.
These functions strictly allocate from the heap of the calling 16-bit app context. Under Windows 9x this means nothing, as all 16-bit apps execute in the same place. For Windows NT, where users have the option of starting separate VDMs for 16-bit apps, memory allocation is clearly occurring in multiple environments. In short, do not assume anything outside of the context of the thread that's calling your Win32 code.
Also added to the Win32 side of the fence is the ability to convert real and protected mode 16-bit pointers to 32-bit linear addresses. The function that does this, WOWGetVDMPointer, is functionally identical to GetVDMPointer32W (from the 16-bit Generic Thunk API). The first parameter is a segment:offset or selector:offset, and the second parameter specifies the size, in bytes, of the referenced memory. The third parameter, if TRUE, indicates that the memory address is a protected mode selector:offset; otherwise, it is treated as a real-mode pointer. The call returns a linear address that Win32 can use. Note that no validation occurs in regard to the length requested; WOWGetVDMPointer simply creates page table entries (on Intel-based platforms) for the referenced memory. This means that memory limit overruns can occur, which can easily wreak havoc on your 16-bit apps.
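The pointer arithmetic involved can be sketched in portable C. This is not how WOW32.DLL is implemented; it only shows how a 16:16 value is packed, how a real-mode segment:offset maps to a linear address (segment times 16 plus offset), and one way a caller might guard against the overruns mentioned above. All three function names are hypothetical, and the segment size passed to range_fits_segment is assumed to be known to the caller (for example, conveyed from the 16-bit side). A protected mode selector cannot be resolved this way, since that requires a descriptor table lookup.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a segment or selector (high word) and an offset (low word). */
uint32_t make_vp(uint16_t seg_or_sel, uint16_t offset)
{
    return ((uint32_t)seg_or_sel << 16) | offset;
}

/* Real-mode translation only: linear = segment * 16 + offset. */
uint32_t real_mode_linear(uint32_t vp)
{
    uint32_t segment = vp >> 16;
    uint32_t offset  = vp & 0xFFFFu;
    uint32_t linear  = (segment << 4) + offset;

    /* The highest address reachable from real mode is FFFF:FFFF. */
    assert(linear <= 0x10FFEFu);
    return linear;
}

/*
 * Caller-side guard: returns 1 if the cb bytes starting at the pointer's
 * offset stay inside a segment of segment_size bytes, otherwise 0.
 */
int range_fits_segment(uint32_t vp, uint32_t cb, uint32_t segment_size)
{
    uint32_t offset = vp & 0xFFFFu;
    return cb <= segment_size && offset <= segment_size - cb;
}
```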
For the benefit of Windows 9x, two additional pointer creation functions exist: WOWGetVDMPointerFix and WOWGetVDMPointerUnfix. WOWGetVDMPointerFix takes the same parameters as WOWGetVDMPointer, but GlobalFix is called on the segment prior to the creation of the linear address pointer. This is to prevent any compaction of the global heap from moving memory to which you hold a pointer. Generally, this cannot happen under Windows NT within the processing of a Win32 call from the Generic Thunk (unless you execute 16-bit code through a 16-bit callback).
However, due to the architecture of Windows 9x, global heap manipulation can occur during processing of Win32 code. Regardless, in both environments linear address pointers obtained from any of the aforementioned functions should be disposed of after processing your Generic Thunk Win32 code, since the state of the 16-bit global heap can easily change between calls to your Win32 functions.
In my June 1994 WOW article, I showed you how to convert 16-bit window handles for use by Win32. Expansion of the 16-bit-long window handle to 32 bits required all the upper 16 bits to be set to 1s. This is easily done by ORing the upper half of the newly created window handle:
hWnd32 = hWnd | 0xffff0000;
This technique still works in all current implementations of WOW. However, its future support is never guaranteed. Furthermore, this technique only addresses window handles, not the plethora of other opaque handle types throughout Windows.
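As a standalone illustration of the hack, the following portable sketch widens a 16-bit window handle by setting the upper 16 bits and narrows it again. It mirrors only the OR trick shown above, not whatever WOW does internally. The function names are hypothetical, and mapping a null handle to null is a choice made for this sketch rather than documented behavior.

```c
#include <assert.h>
#include <stdint.h>

/* Widen a 16-bit window handle by setting all upper 16 bits to 1. */
uint32_t widen_hwnd16(uint16_t hwnd16)
{
    if (hwnd16 == 0) {
        return 0;  /* assumption: keep a null handle null */
    }
    return (uint32_t)hwnd16 | 0xFFFF0000u;
}

/* Reverse the trick; only meaningful for handles widened as above. */
uint16_t narrow_hwnd32(uint32_t hwnd32)
{
    if (hwnd32 == 0) {
        return 0;
    }
    assert((hwnd32 & 0xFFFF0000u) == 0xFFFF0000u);
    return (uint16_t)(hwnd32 & 0xFFFFu);
}
```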
Fortunately, two functions were conveniently added to WOW32: WOWHandle32 and WOWHandle16. WOWHandle32 takes a 16-bit handle of any type and converts it for use by Win32 API calls. WOWHandle16 does the opposite. To make their use as straightforward as possible, WOWNT32.H includes a series of conversion macros for all supported handle types (see Figure 3). To ensure future compatibility of your code, I highly recommend utilizing these functions, rather than any other hacks that you may have employed in the past.
There are two things I'd like to note. First, the fact that these functions are present on the Win32 side of the fence just doesn't make sense. In my opinion, use of the Generic Thunk should happen without the involvement or volition of Win32 code. All handle conversion and pointer manipulation should be within the code that's explicitly using the Generic Thunk (the 16-bit code). Win32 code should remain the unwitting target. However, from an implementation standpoint, Win32 has greater visibility than 16-bit code. Therefore, there really is no other practical place for implementation of these functions. Of course, if I wanted to put the architectural burden of handle conversion squarely on 16-bit code, I could just expose the Win32 conversion functions via the Generic Thunk!
Second, you may be wondering why Win32 code needs 16-bit handle conversion routines. This is clearly for the support of passing parameters to 16-bit callback functions. It's probably becoming obvious that many of the Generic Thunk's newer features are for callback support.
Manipulation of the 16-bit Nonpreemptive Scheduler
As 16-bit and 32-bit code becomes more intertwined (thanks to callbacks), more of what makes 16-bit Windows work is becoming visible to Win32 via WOW32.DLL. Another example of this is the inclusion of WOWYield16 and WOWDirectedYield16. These functions are identical to the 16-bit Yield and DirectedYield functions. Calling these new functions from Win32 permits the nonpreemptive scheduler of the associated 16-bit Windows task to execute. The associated task is the one that is currently executing the Win32 code. In the case of Windows NT, this directly correlates to the thread that represents the currently executing 16-bit task. This point becomes more important when discussing callbacks, which I'll do now.
Finally, 16-bit Callbacks!
In my June 1994 WOW article, I mentioned that you could not mix 16-bit code and Win32 when utilizing callback functions. In other words, even though the Generic Thunk offered a trap door to Win32, there was no reaching back into 16-bit Windows in the form of any callback. You could theoretically fudge something by using window messages, but as discussed earlier, this facility is restricted under Windows 9x to documented window messages. Defining your own messages and passing your own data is just something you cannot do consistently across all Win32 implementations. In addition, even under Windows NT you are restricted to posting messages, which precludes you from obtaining any type of return values from processing a 16-bit window message. Now the fairly complex feature of real callbacks has been added to the Generic Thunk, but it does come with some restrictions.
Two functions have been added to let you call into 16-bit code from Win32 code within the context of a Generic Thunk call. These two functions are WOWCallback16 and WOWCallback16Ex, which are prototyped in WOWNT32.H.
#define WCB16_PASCAL (0x0)
#define WCB16_CDECL (0x1)
DWORD WINAPI WOWCallback16(DWORD vpfn16,
    DWORD dwParam);
BOOL WINAPI WOWCallback16Ex(DWORD vpfn16,
    DWORD dwFlags, DWORD cbArgs, PVOID pArgs,
    PDWORD pdwRetCode);
WOWCallback16 permits calling a Pascal-type 16-bit function (the callback function's address is passed in the first parameter), allowing the passing of a single DWORD parameter to the callback. The 16-bit callback can return up to a DWORD to the Win32 caller. Of course, if this value is smaller than 32 bits, the unused bits are undefined. Likewise, if a pointer is returned to Win32, the pointer needs to be appropriately converted to make it usable by Win32.
WOWCallback16Ex lets you call Pascal or C-prototyped functions. In addition, up to 16 parameters can be passed to the callback function. Unlike the CallProc32W and CallProcEx32W functions, which can optionally convert pointers for you automatically, WOWCallback16Ex cannot convert any parameters for you. Conversion must be performed manually, which can be done with the previously discussed VDM memory management functions.
Calling WOWCallback16Ex is straightforward. The first parameter is the address of the 16-bit callback, and the second parameter indicates whether the callback is defined in Pascal or C format. The third parameter is the size, in bytes, of the arguments passed. The fourth parameter is a pointer to an array of the arguments to be passed to the 16-bit callback. The final parameter receives the return value from the callback.
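Building the argument array is the fiddly part, so here is a rough portable sketch of one way to pack it. It rests on assumptions that should be checked against the comments in WOWNT32.H: that the array is laid out as 16-bit words, that a DWORD or far pointer is stored low word (offset) first, that arguments for a Pascal callback are stored in reverse prototype order, and that the total is capped at 64 bytes. The ArgBuffer type, its helper functions, and SKETCH_MAX_CBARGS are hypothetical names invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_CBARGS 64  /* assumed cap on argument bytes */

typedef struct {
    uint16_t words[SKETCH_MAX_CBARGS / 2];
    size_t   count;               /* number of 16-bit words in use */
} ArgBuffer;

void args_init(ArgBuffer *b)
{
    assert(b != NULL);
    b->count = 0;
}

/* Append one 16-bit argument; returns 0 if the buffer is full. */
int args_push_word(ArgBuffer *b, uint16_t w)
{
    if (b->count + 1 > SKETCH_MAX_CBARGS / 2) {
        return 0;
    }
    b->words[b->count++] = w;
    return 1;
}

/* Append one 32-bit argument, low word (or offset) first. */
int args_push_dword(ArgBuffer *b, uint32_t d)
{
    if (b->count + 2 > SKETCH_MAX_CBARGS / 2) {
        return 0;
    }
    b->words[b->count++] = (uint16_t)(d & 0xFFFFu);
    b->words[b->count++] = (uint16_t)(d >> 16);
    return 1;
}

/* Byte count to pass as the cbArgs parameter. */
size_t args_cb(const ArgBuffer *b)
{
    return b->count * sizeof(uint16_t);
}
```

Under those assumptions, a Pascal callback taking a 16-bit window handle followed by a far string pointer would have the pointer pushed first and the handle second, giving six bytes of arguments. Any 16:16 pointer placed in the buffer must already be valid in the 16-bit world, since no conversion is done for you.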
As you can see, the use of these functions is fairly clear. Once you convey the 16-bit address of the callback to your Win32 code, calling it is easy. (Implementation is demonstrated under the Invoke WOW Callback menu item in WOWTEST.EXE.) However, there is at least one restriction that is less than obvious in the callback's implementation. The 16-bit callback will only work when invoked from the context (thread) that called your Win32 code. This may restrict the design of your Win32 code. For example, you can't spawn a Win32 thread to do some asynchronous processing, calling the 16-bit callback at some later point when interesting things happen. Life's tough, eh? Again, none of this matters under Windows 9x, since you can't create new threads and break away from the context of the 16-bit code that called you in the first place.
Have a suggestion for Nerditorium? Send it to Jim Finnegan at email@example.com.