Matt Pietrek

Matt Pietrek is the author of Windows 95 System Programming Secrets (IDG Books, 1995). He works at NuMega Technologies Inc., and can be reached at mpietrek@tiac.com or at

In my May 1997 column, I used some APIs from

This month, I’ll go over a different subset of the IMAGEHLP APIs to show how their powerful features can be used with a few simple lines of code. To demonstrate how easy it is to use the IMAGEHLP APIs, I wrote the EZPE program, a PE file-display program that also displays the debug symbols belonging to an executable (that is, EXEs, DLLs, OCXs, and so on). It displays information similar to programs like DUMPBIN from Visual C++® or PEDUMP from my book, Windows 95 System Programming Secrets (IDG Books, 1995). The key difference is that EZPE never touches the executable file directly, and it doesn’t grovel through data structures like other PE file-display programs. Instead, EZPE lets the IMAGEHLP APIs do all the hard work, and demonstrates the proper use of the APIs as a by-product.

Another nice feature that falls out of EZPE’s use of the IMAGEHLP APIs is that you can see the symbol names and addresses contained within debug information, such as the DBG files that are provided for Windows NT components. You can also use EZPE to see the symbols contained within PDB files, something that even DUMPBIN can’t do. All you have to do is make sure that the symbol table file (for example, PDB or DBG) is in the same directory as the executable it belongs to. Running EZPE on the executable file causes it to automatically find the symbols in the PDB or DBG file, as appropriate. The beauty is that EZPE kicks back and lets IMAGEHLP.DLL do the hard work of finding and loading the symbol tables. More on this later.

Before jumping into a description of the IMAGEHLP APIs, a quick review of IMAGEHLP’s availability is worthwhile. IMAGEHLP is a standard component of Windows NT® 4.0.
However, it’s a redistributable DLL, so you can ship it with your app if it needs to run on Windows® 95. Be aware, though, that certain functions in IMAGEHLP don’t work under Windows 95 (at least not in the IMAGEHLP.DLL that was available when I wrote this). The import library and header file for IMAGEHLP.DLL can be found in any Win32® SDK that shipped on or after the release date of Windows NT 4.0 (July 31, 1996). IMAGEHLP isn’t specific to any one CPU platform. I built my EZPE program on a DEC Alpha, and it worked perfectly the first time.

The first IMAGEHLP API to look at is MapAndLoad. You’d use this if you were interested only in the contents of an executable and didn’t care about any debug information that might be available. Although the IMAGEHLP documentation is vague about exactly what MapAndLoad does, it’s really quite simple. First, the function goes through

After memory mapping the executable file, MapAndLoad fills in the LOADED_IMAGE structure that was passed in. A number of key fields in this structure are likely to be valuable to you. The MappedAddress field is where the executable is mapped into memory (that is, it’s what the internal call to MapViewOfFile returned). The FileHeader field contains a pointer to an IMAGE_NT_

The Sections field in the LOADED_IMAGE structure is a pointer to the PE section table, which is an array of IMAGE_SECTION_HEADER structures, also defined in WINNT.H. The number of sections in the array is given by the (you guessed it) NumberOfSections field. An IMAGE_

The final field in the LOADED_IMAGE structure worth mentioning here is the Characteristics field. Using it is just a shortcut for grabbing the Characteristics field out of the PE header. The characteristics flags are defined in WINNT.H and include values such as IMAGE_FILE_DLL, which means the executable is a DLL rather than a program (EXE) file.
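To make the Characteristics field concrete, here is a minimal sketch that decodes a flags value into names. It is not part of EZPE. The numeric values are the ones WINNT.H defines for the corresponding IMAGE_FILE_ flags, but they are redeclared under hypothetical k-prefixed names so the sketch compiles without the Windows headers, and the helper names IsDll and DescribeCharacteristics are made up for illustration.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Values match the IMAGE_FILE_* flags in WINNT.H; the k-prefixed names are
// hypothetical stand-ins so this sketch builds without <windows.h>.
constexpr std::uint16_t kRelocsStripped    = 0x0001; // IMAGE_FILE_RELOCS_STRIPPED
constexpr std::uint16_t kExecutableImage   = 0x0002; // IMAGE_FILE_EXECUTABLE_IMAGE
constexpr std::uint16_t kLineNumsStripped  = 0x0004; // IMAGE_FILE_LINE_NUMS_STRIPPED
constexpr std::uint16_t kLocalSymsStripped = 0x0008; // IMAGE_FILE_LOCAL_SYMS_STRIPPED
constexpr std::uint16_t k32BitMachine      = 0x0100; // IMAGE_FILE_32BIT_MACHINE
constexpr std::uint16_t kDll               = 0x2000; // IMAGE_FILE_DLL

// True when the flags describe a DLL rather than a program (EXE) file.
bool IsDll(std::uint16_t characteristics)
{
    return (characteristics & kDll) != 0;
}

// Returns the names of the flags (from the subset above) that are set.
std::vector<std::string> DescribeCharacteristics(std::uint16_t characteristics)
{
    struct Flag { std::uint16_t bit; const char* name; };
    static const Flag flags[] = {
        { kRelocsStripped,    "IMAGE_FILE_RELOCS_STRIPPED" },
        { kExecutableImage,   "IMAGE_FILE_EXECUTABLE_IMAGE" },
        { kLineNumsStripped,  "IMAGE_FILE_LINE_NUMS_STRIPPED" },
        { kLocalSymsStripped, "IMAGE_FILE_LOCAL_SYMS_STRIPPED" },
        { k32BitMachine,      "IMAGE_FILE_32BIT_MACHINE" },
        { kDll,               "IMAGE_FILE_DLL" },
    };

    std::vector<std::string> names;
    for (const Flag& f : flags)
    {
        if ((characteristics & f.bit) != 0)
            names.push_back(f.name);
    }
    return names;
}
```

Passing the value 0x010B that Figure 2 reports for EZPE.EXE yields four flag names and no IMAGE_FILE_DLL, which is what you would expect for a program file. In real code you would test li.Characteristics against the WINNT.H constants directly.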
As the main function in EZPE.CPP shows (see Figure 1), the MapAndLoad and UnMapAndLoad APIs can be used without any advance preparation, unlike the symbol table APIs that I’ll get to shortly. MapAndLoad is relatively lightweight and executes quickly. Using just MapAndLoad, and knowing the contents of various PE file data structures, you can quickly access nearly everything of importance in an executable file.

Figure 1 EZPE.CPP
//==========================================
// Matt Pietrek
// Microsoft Systems Journal, August 1997
// FILE: EZPE.CPP
//==========================================
#include <windows.h>
#include <stdio.h>
#include <imagehlp.h>
//=============================================================================
// Macros to make formatted output of numerous data structure fields easier
//=============================================================================
#define DISPLAY_COLUMNS "35"
#define DisplayPtrField( ptr, field, fmt ) \
printf( "%-" DISPLAY_COLUMNS fmt, #field, ptr->field );
#define DisplayPtrFieldD( ptr, field ) DisplayPtrField( ptr, field, "s%08X\n" )
#define DisplayPtrFieldW( ptr, field ) DisplayPtrField( ptr, field, "s%04X\n" )
#define DisplayPtrFieldStr(ptr,field ) DisplayPtrField( ptr, field, "s%s\n" )
#define DisplayPtrVersionFields( name, ptr, field1, field2 ) \
printf("%-" DISPLAY_COLUMNS "s%u.%02u\n", name,ptr->field1,ptr->field2);
#define DisplayField( struct, field, fmt ) \
printf( "%-" DISPLAY_COLUMNS fmt, #field, struct.field );
#define DisplayFieldD( struct, field ) DisplayField( struct, field, "s%08X\n")
#define DisplayFieldW( struct, field ) DisplayField( struct, field, "s%04X\n")
#define DisplayFieldStr(struct,field ) DisplayField( struct, field, "s%s\n")
#define DisplayDataDir( ptr, x ) \
printf( "%-" DISPLAY_COLUMNS "sAddress: %08X Size: %08X\n", \
#x, ptr->DataDirectory[x].VirtualAddress, ptr->DataDirectory[x].Size );
//=============================================================================
// Prototypes for helper functions
//=============================================================================
BOOL CALLBACK EnumSymbolsCallback( LPSTR, ULONG, ULONG, PVOID );
void ShowImageFileHeaders( PIMAGE_NT_HEADERS pNTHdrs );
void ShowSectionHeaders( PIMAGE_SECTION_HEADER pSectionHdr, DWORD cSections );
void StartNewDisplaySection( PSTR pszSectionName );
BOOL ParseCommandLine(int argc, char * argv[], PSTR pszFilename, PSTR pszPath);
void ShowSymbols( PSTR pszFilename, PSTR pszPath );
//=============================================================================
// Global variables
//=============================================================================
BOOL g_fDecoratedNames = FALSE;
BOOL g_fShowSymbols = TRUE;
char g_szHelp[] =
"EZPE - August 1997 Microsoft Systems Journal, by Matt Pietrek\n"
"Syntax: EZPE [options] <filename>\n"
" -d Decorated C++ names\n"
" -n No symbol display\n";
//=============================================================================
// Main program - First displays info about the executable, then shows symbols
//=============================================================================
int main( int argc, char * argv[] )
{
char szFilename[MAX_PATH];
char szPath[MAX_PATH];
if ( !ParseCommandLine(argc, argv, szFilename, szPath) )
{
printf( "%s", g_szHelp ); // Don't use the help text itself as a format string
return 1;
}
printf( "Display of file %s\n", szFilename );
LOADED_IMAGE li;
if ( MapAndLoad( szFilename, 0, &li, FALSE, TRUE) )
{
StartNewDisplaySection( " LOADED_IMAGE" );
DisplayFieldStr( li, ModuleName )
DisplayFieldD( li, hFile )
DisplayFieldD( li, MappedAddress )
DisplayFieldD( li, FileHeader )
DisplayFieldD( li, LastRvaSection )
DisplayFieldD( li, NumberOfSections )
DisplayFieldD( li, Sections )
DisplayFieldD( li, Characteristics )
DisplayFieldD( li, fSystemImage )
DisplayFieldD( li, fDOSImage )
DisplayFieldD( li, Links )
DisplayFieldD( li, SizeOfImage )
StartNewDisplaySection( "PE File Headers (LOADED_IMAGE.FileHeader)" );
ShowImageFileHeaders( li.FileHeader );
StartNewDisplaySection( "Section Headers (LOADED_IMAGE.Sections)" );
ShowSectionHeaders( li.Sections, li.NumberOfSections );
}
else
{
printf( "MapAndLoad failed - exiting\n" );
return 1;
}
UnMapAndLoad( &li );
if ( g_fShowSymbols )
ShowSymbols( szFilename, szPath );
return 0;
}
#define MY_PROCESS_HANDLE 0
//=============================================================================
// Loads a symbol table and enumerates the symbols in it
//=============================================================================
void ShowSymbols( PSTR pszFilename, PSTR pszPath )
{
PIMAGE_DEBUG_INFORMATION pidi;
//
// First, see if there's any debug information worth displaying. If
// this function succeeds, the PE file and its debug information
// are memory mapped in.
//
pidi = MapDebugInformation( 0, pszFilename, "", 0 );
if ( pidi )
{
StartNewDisplaySection( "IMAGE_DEBUG_INFORMATION" );
DisplayPtrFieldD( pidi, Size )
DisplayPtrFieldD( pidi, MappedBase )
DisplayPtrFieldW( pidi, Machine )
DisplayPtrFieldW( pidi, Characteristics )
DisplayPtrFieldD( pidi, CheckSum )
DisplayPtrFieldD( pidi, ImageBase )
DisplayPtrFieldD( pidi, SizeOfImage )
DisplayPtrFieldD( pidi, NumberOfSections )
DisplayPtrFieldD( pidi, Sections )
DisplayPtrFieldD( pidi, ExportedNamesSize )
DisplayPtrFieldD( pidi, ExportedNames )
DisplayPtrFieldD( pidi, NumberOfFunctionTableEntries )
DisplayPtrFieldD( pidi, FunctionTableEntries )
DisplayPtrFieldD( pidi, LowestFunctionStartingAddress )
DisplayPtrFieldD( pidi, HighestFunctionEndingAddress )
DisplayPtrFieldD( pidi, NumberOfFpoTableEntries )
DisplayPtrFieldD( pidi, FpoTableEntries )
DisplayPtrFieldD( pidi, SizeOfCoffSymbols )
DisplayPtrFieldD( pidi, CoffSymbols )
DisplayPtrFieldD( pidi, SizeOfCodeViewSymbols )
DisplayPtrFieldD( pidi, CodeViewSymbols )
DisplayPtrFieldStr( pidi, ImageFilePath )
DisplayPtrFieldStr( pidi, ImageFileName )
DisplayPtrFieldStr( pidi, DebugFilePath )
DisplayPtrFieldD( pidi, TimeDateStamp )
DisplayPtrFieldD( pidi, RomImage )
DisplayPtrFieldD( pidi, DebugDirectory )
DisplayPtrFieldD( pidi, NumberOfDebugDirectories )
}
else
{
printf( "MapDebugInformation failed - exiting\n" );
return;
}
//
// Init symbol handler for "process" (in our case, "0" is the process)
//
if ( !SymInitialize( MY_PROCESS_HANDLE, pszPath, FALSE ) )
{
printf( "SymInitialize failed - exiting\n" );
return;
}
//
// If the command line specified "decorated" names (-d), set the symbol
// options before loading the symbol table below.
//
if ( g_fDecoratedNames )
{
DWORD symOptions = SymGetOptions();
symOptions &= ~SYMOPT_UNDNAME; // Turn off the SYMOPT_UNDNAME flag
SymSetOptions( symOptions );
}
//
// Load the symbol table for the specified file. A "MappedAddress" from
// MapAndLoad, or a "MappedBase" from MapDebugInformation is needed as
// parameter 5.
//
if ( !SymLoadModule( MY_PROCESS_HANDLE, 0, pszFilename, 0,
(DWORD)pidi->MappedBase, 0 ) )
{
printf( "SymLoadModule failed\n" );
return;
}
IMAGEHLP_MODULE im = { sizeof(im) };
if ( SymGetModuleInfo( MY_PROCESS_HANDLE, (DWORD)pidi->MappedBase, &im ) )
{
StartNewDisplaySection( "IMAGEHLP_MODULE" );
// DisplayFieldD( im, SizeOfStruct ); // Not worth showing
// DisplayFieldD( im, BaseOfImage );
// DisplayFieldD( im, ImageSize );
// DisplayFieldD( im, TimeDateStamp );
DisplayFieldD( im, CheckSum );
DisplayFieldD( im, NumSyms );
DisplayFieldD( im, SymType );
PSTR pszSymType;
switch( im.SymType )
{
case SymNone: pszSymType = "SymNone"; break;
case SymCoff: pszSymType = "SymCoff"; break;
case SymCv: pszSymType = "SymCv"; break;
case SymPdb: pszSymType = "SymPdb"; break;
case SymExport: pszSymType = "SymExport"; break;
case SymDeferred: pszSymType = "SymDeferred"; break;
default: pszSymType = "?";
}
printf( "%-" DISPLAY_COLUMNS "s%s\n", " ", pszSymType );
DisplayFieldStr( im, ModuleName );
DisplayFieldStr( im, ImageName );
DisplayFieldStr( im, LoadedImageName );
}
StartNewDisplaySection( "Symbols" );
printf( " RVA Name\n" );
printf( "-------- ----\n" );
//
// Kick off the symbol enumeration. The EnumSymbolsCallback function
// is called once for each symbol.
//
SymEnumerateSymbols( MY_PROCESS_HANDLE, (DWORD)pidi->MappedBase,
EnumSymbolsCallback, pidi->MappedBase );
SymUnloadModule( MY_PROCESS_HANDLE, (DWORD)pidi->MappedBase ); // Undo SymLoadModule
SymCleanup( MY_PROCESS_HANDLE ); // Undo the SymInitialize
UnmapDebugInformation( pidi ); // Close the PE file and debug info mapping
}
//=============================================================================
// Callback function for use by the SymEnumerateSymbols API
//=============================================================================
BOOL CALLBACK EnumSymbolsCallback(
LPSTR SymbolName,
ULONG SymbolAddress,
ULONG SymbolSize,
PVOID UserContext )
{
// User Context is whatever was passed to SymEnumerateSymbols. Here,
// we passed the mapped address of the executable. This allows us
// to convert the symbol addresses we get into RVAs, below.
DWORD mappedBase = (DWORD)UserContext;
// print out the RVA, and the symbol name passed to us.
printf( "%08X %s\n", SymbolAddress - mappedBase, SymbolName );
//
// If "decorated" names were specified, and if the name is "decorated,"
// undecorate it so that a human readable version can be displayed.
//
if ( g_fDecoratedNames && ('?' == *SymbolName) )
{
char szUndecoratedName[0x400]; // Make symbol name buffers for the
char szDecoratedName[0x400]; // decorated & undecorated versions
// Make a copy of the original SymbolName, so that we can modify it
lstrcpy( szDecoratedName, SymbolName );
PSTR pEnd = szDecoratedName + lstrlen( szDecoratedName );
// Strip everything off the end until we reach a 'Z'
while ( (pEnd > szDecoratedName) && (*pEnd != 'Z') )
*pEnd-- = 0;
// Call the IMAGEHLP function to undecorate the name
if ( 0 != UnDecorateSymbolName( szDecoratedName, szUndecoratedName,
sizeof(szUndecoratedName),
UNDNAME_COMPLETE |
UNDNAME_32_BIT_DECODE ) )
{
// End the output line with the undecorated name
printf( " %s\n", szUndecoratedName );
}
}
return TRUE;
}
//=============================================================================
// Shows the contents of a PE header. Called by main()
//=============================================================================
void ShowImageFileHeaders( PIMAGE_NT_HEADERS pNTHdrs )
{
PIMAGE_FILE_HEADER pImageFileHeader = &pNTHdrs->FileHeader;
DisplayPtrFieldW( pImageFileHeader, Machine )
DisplayPtrFieldW( pImageFileHeader, NumberOfSections )
DisplayPtrFieldD( pImageFileHeader, TimeDateStamp )
DisplayPtrFieldD( pImageFileHeader, PointerToSymbolTable )
DisplayPtrFieldD( pImageFileHeader, NumberOfSymbols )
DisplayPtrFieldW( pImageFileHeader, SizeOfOptionalHeader )
DisplayPtrFieldW( pImageFileHeader, Characteristics )
PIMAGE_OPTIONAL_HEADER pImageOptHeader = &pNTHdrs->OptionalHeader;
DisplayPtrFieldW( pImageOptHeader, Magic )
DisplayPtrVersionFields("LinkerVersion", pImageOptHeader,
MajorLinkerVersion, MinorLinkerVersion );
DisplayPtrFieldD( pImageOptHeader, SizeOfCode )
DisplayPtrFieldD( pImageOptHeader, SizeOfInitializedData )
DisplayPtrFieldD( pImageOptHeader, SizeOfUninitializedData )
DisplayPtrFieldD( pImageOptHeader, AddressOfEntryPoint )
DisplayPtrFieldD( pImageOptHeader, BaseOfCode )
DisplayPtrFieldD( pImageOptHeader, BaseOfData )
DisplayPtrFieldD( pImageOptHeader, ImageBase )
DisplayPtrFieldD( pImageOptHeader, SectionAlignment )
DisplayPtrFieldD( pImageOptHeader, FileAlignment )
DisplayPtrVersionFields("OperatingSystemVersion", pImageOptHeader,
MajorOperatingSystemVersion,
MinorOperatingSystemVersion )
DisplayPtrVersionFields("ImageVersion", pImageOptHeader, MajorImageVersion,
MinorImageVersion )
DisplayPtrVersionFields("SubsystemVersion", pImageOptHeader,
MajorSubsystemVersion,
MinorSubsystemVersion )
DisplayPtrFieldD( pImageOptHeader, Win32VersionValue )
DisplayPtrFieldD( pImageOptHeader, SizeOfImage )
DisplayPtrFieldD( pImageOptHeader, SizeOfHeaders )
DisplayPtrFieldD( pImageOptHeader, CheckSum )
DisplayPtrFieldW( pImageOptHeader, Subsystem )
DisplayPtrFieldW( pImageOptHeader, DllCharacteristics )
DisplayPtrFieldD( pImageOptHeader, SizeOfStackReserve )
DisplayPtrFieldD( pImageOptHeader, SizeOfStackCommit )
DisplayPtrFieldD( pImageOptHeader, SizeOfHeapReserve )
DisplayPtrFieldD( pImageOptHeader, SizeOfHeapCommit )
DisplayPtrFieldD( pImageOptHeader, LoaderFlags )
DisplayPtrFieldD( pImageOptHeader, NumberOfRvaAndSizes )
DisplayDataDir( pImageOptHeader, IMAGE_DIRECTORY_ENTRY_EXPORT )
DisplayDataDir( pImageOptHeader, IMAGE_DIRECTORY_ENTRY_IMPORT )
DisplayDataDir( pImageOptHeader, IMAGE_DIRECTORY_ENTRY_RESOURCE )
DisplayDataDir( pImageOptHeader, IMAGE_DIRECTORY_ENTRY_EXCEPTION )
DisplayDataDir( pImageOptHeader, IMAGE_DIRECTORY_ENTRY_SECURITY )
DisplayDataDir( pImageOptHeader, IMAGE_DIRECTORY_ENTRY_BASERELOC )
DisplayDataDir( pImageOptHeader, IMAGE_DIRECTORY_ENTRY_DEBUG )
DisplayDataDir( pImageOptHeader, IMAGE_DIRECTORY_ENTRY_COPYRIGHT )
DisplayDataDir( pImageOptHeader, IMAGE_DIRECTORY_ENTRY_GLOBALPTR )
DisplayDataDir( pImageOptHeader, IMAGE_DIRECTORY_ENTRY_TLS )
DisplayDataDir( pImageOptHeader, IMAGE_DIRECTORY_ENTRY_LOAD_CONFIG )
DisplayDataDir( pImageOptHeader, IMAGE_DIRECTORY_ENTRY_BOUND_IMPORT )
DisplayDataDir( pImageOptHeader, IMAGE_DIRECTORY_ENTRY_IAT )
}
//=============================================================================
// Enumerates through a section table and displays basic info for each section
//=============================================================================
void ShowSectionHeaders( PIMAGE_SECTION_HEADER pSectionHdr, DWORD cSections )
{
printf( " # Name Address VirtSize RawSize\n" );
printf( "-- -------- -------- -------- --------\n" );
for ( unsigned i=1; i <= cSections; i++, pSectionHdr++ )
{
printf( "%2u %8.8s %08X %08X %08x\n", i, pSectionHdr->Name,
pSectionHdr->VirtualAddress, pSectionHdr->Misc.VirtualSize,
pSectionHdr->SizeOfRawData );
}
}
//=============================================================================
// Helper function used to provide consistent delimitation of output sections
//=============================================================================
void StartNewDisplaySection( PSTR pszSectionName )
{
printf( "\n==== %s ====\n", pszSectionName );
}
//=============================================================================
// Called by main() to extract any command-line arguments, as well as the
// filename to be used. Also attempts to come up with a complete path to
// the directory containing the named executable. Returns this value in the
// pszPath param buffer.
//=============================================================================
BOOL ParseCommandLine( int argc, char * argv[], PSTR pszFilename, PSTR pszPath )
{
if ( argc < 2 )
return FALSE;
BOOL fSawFilename = FALSE;
for ( int i = 1; i < argc; i++ )
{
PSTR pszArg = argv[i];
if ( *pszArg == '-' ) // Is the first character a '-' ?
{
pszArg++; // Skip past the '-'
if ( (*pszArg=='d') || (*pszArg=='D') ) // allow "d" or "D"
g_fDecoratedNames = TRUE; // set global flag
else if ( (*pszArg=='n') || (*pszArg=='N') ) // allow "n" or "N"
g_fShowSymbols = FALSE; // set global flag
}
else
{
if ( fSawFilename ) // We should only get here once
return FALSE;
PSTR pszFilenamePart;
if (GetFullPathName( argv[i], MAX_PATH, pszPath, &pszFilenamePart))
{
// Truncate the filename portion off the path
if ( pszFilenamePart )
*pszFilenamePart = 0;
// Copy the input argument to the passed in filename buffer
lstrcpy( pszFilename, argv[i] );
fSawFilename = TRUE;
}
else // GetFullPathName failed. Ooops!
return FALSE;
}
}
return fSawFilename;
}

As a final note on MapAndLoad, it’s important to remember that it creates a linear mapping of the entire file in one contiguous chunk. This is different from the Win32 loader bringing an executable module into memory, where a distinct mapping is created for each section so that each section starts on a page boundary in memory. The result of this linear mapping is that any Relative Virtual Addresses (RVAs) that you might see in the PE header aren’t directly usable with the image as loaded by MapAndLoad. To use an RVA in this situation, you’d have to adjust it to account for the difference between the section’s file offset and its in-memory address. Luckily, IMAGEHLP.DLL provides an API, ImageRvaToVa, that will do this for you.

If it’s symbol table information you’re after, the equivalent to MapAndLoad is the MapDebugInformation API. You can think of MapDebugInformation as a superset of the MapAndLoad API. Besides mapping the executable file into memory, this API also figures out what the best type of symbol information is, along with some basic information about that symbol table. What exactly do I mean by “best”? It turns out that an executable can be built with more than one type of debug information. For example, you can create an executable with both CodeView (PDB) information and a COFF symbol table. IMAGEHLP knows how to read both formats, as well as a few others, and knows which one is optimal for your executable. More on this later.

Just as the MapAndLoad API eventually needs to be followed by a call to UnMapAndLoad, MapDebugInformation also needs to be cleaned up by calling UnmapDebugInformation.

Because the symbols for an executable may be in a file other than the executable itself, the MapDebugInformation API takes a parameter not needed for the MapAndLoad API: the symbol search path. By default, IMAGEHLP searches for symbol files in a series of paths that I’ll describe later. However, the MapDebugInformation API lets you override these paths.
This is what I’ve done in the EZPE source where it calls MapDebugInformation.

Besides mapping and loading the executable and its symbols (if present), the MapDebugInformation API returns a pointer to an IMAGE_DEBUG_INFORMATION structure. This structure contains many more fields than a LOADED_IMAGE structure, although nearly every field in the LOADED_IMAGE structure can be found in the IMAGE_DEBUG_INFORMATION structure. For example, the MappedBase field contains the address where the executable was mapped, and is the same as the MappedAddress field in a LOADED_IMAGE structure. Similarly, the Sections field is a pointer to the executable’s section table, and so forth.

More useful information found in the IMAGE_DEBUG_

The meaning of some fields in the IMAGE_DEBUG_

Finally, the IMAGE_DEBUG_INFORMATION structure has a variety of fields that indicate whether CodeView and COFF information are present, and if so, where. There’s even a pointer to the debug directory. This is the data structure in the PE file that tells you what types of debug information are present and where. The MapDebugInformation API does a good job of extracting this information and presenting it in the IMAGE_DEBUG_INFORMATION structure. Still, if you’re so inclined, you can go straight to the same raw data that IMAGEHLP uses to generate the IMAGE_

So far, the two APIs I’ve examined (MapAndLoad and MapDebugInformation) simply map an executable into memory and extract some useful information from it. Neither API loads a symbol table, although calling one of them is effectively a prerequisite to using the symbol table APIs. The key piece of data needed is the mapped address of the executable. The symbol table APIs work from the mapped executable to find and load the appropriate symbol table into memory.
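The debug directory pointer is a handy way to check your understanding of how RVAs relate to a linearly mapped file. The sketch below is not how IMAGEHLP implements the conversion; it is a simplified illustration using a hypothetical Section structure that stands in for IMAGE_SECTION_HEADER. The section RVAs and sizes come from Figure 2, but the PointerToRawData values are an assumption: Figure 2 doesn’t show them, so they are derived by packing the sections one after another following the 0x400 bytes of headers. RVAs that fall inside the headers are not handled.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, trimmed-down stand-in for IMAGE_SECTION_HEADER.
struct Section
{
    std::uint32_t virtualAddress;   // RVA where the section starts
    std::uint32_t virtualSize;      // Size of the section in memory
    std::uint32_t sizeOfRawData;    // Size of the section's data in the file
    std::uint32_t pointerToRawData; // File offset of the section's data
};

// Returned when an RVA isn't backed by any section's file data.
constexpr std::uint32_t kNoOffset = 0xFFFFFFFF;

// Converts an RVA to an offset within the file (and therefore within a
// linear mapping of the file).
std::uint32_t RvaToFileOffset(const std::vector<Section>& sections,
                              std::uint32_t rva)
{
    for (const Section& s : sections)
    {
        if (rva >= s.virtualAddress &&
            rva - s.virtualAddress < s.sizeOfRawData)
        {
            return s.pointerToRawData + (rva - s.virtualAddress);
        }
    }
    return kNoOffset;
}

// Section table of EZPE.EXE as shown in Figure 2. The last column
// (pointerToRawData) is ASSUMED, not taken from the figure.
std::vector<Section> Figure2Sections()
{
    return {
        { 0x1000, 0x206F, 0x2200, 0x0400 }, // .text
        { 0x4000, 0x0164, 0x0200, 0x2600 }, // .rdata
        { 0x5000, 0x0E53, 0x0C00, 0x2800 }, // .data
        { 0x6000, 0x04FC, 0x0600, 0x3400 }, // .idata
    };
}
```

With these assumed offsets, the debug data directory RVA of 0x4000 converts to file offset 0x2600. Adding the MappedBase of 008B0000 gives 008B2600, which agrees with the DebugDirectory value that Figure 2 reports. In a real program you would let ImageRvaToVa do this work.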
The first IMAGEHLP symbol table API you should be aware of is SymInitialize, which sets up internal variables in IMAGEHLP so that the DLL is prepared to load symbol tables for the executable and possibly multiple DLLs within a process. As you might expect, there’s a corresponding shutdown API, SymCleanup, that should be called when you’re finished working with symbols.

The first parameter to SymInitialize is an identifier for the process that you want to use when working with symbols. If you were using IMAGEHLP as part of a real debugger, you’d want to pass a valid process handle. This allows IMAGEHLP to enumerate all the loaded modules in a process address space and load the associated symbol tables. You can turn off this automatic module enumeration by passing FALSE as the third parameter to SymInitialize.

If you’re not a debugger process (EZPE isn’t), you can pass whatever value you’d like as the process handle. Just remember to pass the same value to subsequent symbol APIs that expect a process handle. In the case of EZPE, I used the value zero through a #define called MY_

(As a side note to the automatic module enumeration I referred to, the Windows NT 4.0-supplied IMAGEHLP won’t do this under Windows 95. The module enumeration APIs under Windows NT are different from those in Windows 95. However, these differences are slated to be resolved in a subsequent release.)

The second parameter to SymInitialize is the symbol search path. If you pass a valid string pointer in the form of a path (that is, directories separated by semicolons), IMAGEHLP searches those directories when looking for a symbol table that’s in a different file than the executable. Passing zero causes IMAGEHLP to use three environment variables as the path: _NT_SYMBOL_PATH, _NT_ALTERNATE_SYMBOL_PATH, and SYSTEMROOT.
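Since the search path is just a list of directories separated by semicolons, building or inspecting one takes only a few lines. The following is a small sketch of that format, not anything IMAGEHLP exposes; the helper names SplitSearchPath and JoinSearchPath are hypothetical, and nothing here reflects how IMAGEHLP itself parses the string or orders the environment variables.

```cpp
#include <string>
#include <vector>

// Splits a semicolon-separated search path into its directories.
// Empty entries (as in "a;;b" or a trailing ';') are skipped.
std::vector<std::string> SplitSearchPath(const std::string& path)
{
    std::vector<std::string> dirs;
    std::string::size_type start = 0;
    while (start <= path.size())
    {
        std::string::size_type end = path.find(';', start);
        if (end == std::string::npos)
            end = path.size();
        if (end > start)
            dirs.push_back(path.substr(start, end - start));
        start = end + 1;
    }
    return dirs;
}

// Joins directories into a single semicolon-separated string, the form
// expected for a symbol search path argument.
std::string JoinSearchPath(const std::vector<std::string>& dirs)
{
    std::string path;
    for (const std::string& dir : dirs)
    {
        if (dir.empty())
            continue;
        if (!path.empty())
            path += ';';
        path += dir;
    }
    return path;
}
```

For example, joining "E:\column\col47" and "C:\WINNT\SYSTEM32" produces a string whose c_str() could be handed to SymInitialize as its second parameter.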
After you’ve called SymInitialize, the next step is to load the symbol tables you’re interested in. That is, assuming you didn’t pass a valid process handle to SymInitialize so that it enumerated and loaded all the symbol tables automatically. EZPE doesn’t do this, so it’s necessary to manually load the symbol table for the executable file that it’s working with. The API that manually loads a symbol table is SymLoadModule. Not surprisingly, there’s a SymUnloadModule to use when you’re done with a given symbol table.

Although the SymLoadModule API takes six parameters, only three are required for a simple program like EZPE. The first parameter is the process handle value that was passed to SymInitialize earlier in the program. Parameter three is the name of the executable file whose symbols are to be loaded. Parameter five is the address where the executable is mapped into memory. As I alluded to earlier, this value can be obtained easily by calling MapAndLoad or MapDebugInformation. Assuming all goes well, SymLoadModule returns TRUE.

After loading a symbol table, there are a variety of actions available. For example, in my April 1997 column I used the SymGetSymFromAddr function to take an address and find the name of the nearest symbol. The end result was a stack trace containing symbolic function names. If I were writing a debugger, I could use the SymGetSymFromName API to look up the address of a function or variable name that the user requested.

With EZPE, the first action after loading a symbol table is to find out more about what was just loaded. This can be done with the SymGetModuleInfo API. The first parameter is the process handle value used with the other symbol APIs. The second parameter needs to be an address somewhere within the module that the symbol table belongs to. In the EZPE code, the easiest thing to use is the base address to which the executable was memory mapped.
The third parameter to SymGetModuleInfo is a pointer to an IMAGEHLP_MODULE structure that the API fills with information about the module and its symbol table. The first group of fields in an IMAGEHLP_MODULE structure is standard stuff that you can get in ways that I described earlier. More interesting are the SymType and NumSyms fields. The SymType field contains an enum that indicates what type of symbol table was loaded (for example, SymCoff, SymCv, SymPdb, or SymExport).

The SymExport type is worth a mention. Exports aren’t formally considered to be debug information. However, the information stored for an exported function (its name and address) is the bare minimum required for inclusion in a symbol table. Therefore, IMAGEHLP can synthesize a symbol table out of an executable’s exports. The upshot is that any executable that exports symbols can be considered to have at least a minimal symbol table available. (By the way, if you’re a SoftIce user, the Load EXPORTS capability works along the same lines.)

Another, more useful action you can take with a loaded symbol table is to enumerate all the symbols. For this purpose, IMAGEHLP has the SymEnumerateSymbols API. The first parameter is the process handle value used with the other symbol APIs. Parameter two is the base address of the executable whose symbols you’re interested in. The third parameter is the address of a callback function that will be called once for each symbol in the symbol table. The fourth parameter can be whatever you’d like. It’s passed on to the callback function, unmodified. If I didn’t want to use the SymEnumerateSymbols API, I could use the SymGetSymNext API in a loop instead. Both APIs have their strengths and weaknesses, so I just picked one arbitrarily for EZPE to use.

Now let’s look at the EZPE program and its code. EZPE is a command-line program that accepts arguments. The source file EZPE.CPP is shown in Figure 1. You can see its usage by running EZPE with no arguments:
Syntax: EZPE [options] <filename>
-d Decorated C++ names
-n No symbol display

In the simplest case, you’d give EZPE the name of an executable file to display. EZPE writes to stdout, so its output can be redirected to a file. For example:
EZPE C:\WINNT\SYSTEM32\KERNEL32.DLL > results

Figure 2 shows the results of running EZPE on its own EXE. The -n option tells EZPE to not bother loading and displaying the symbols. If you were to use the -n option, everything after the “==== IMAGE_DEBUG_INFORMATION ====” line in Figure 2 would be omitted from the program output.

Figure 2 EZPE Output
Display of file EZPE.EXE
==== LOADED_IMAGE ====
ModuleName EZPE.EXE
hFile FFFFFFFF
MappedAddress 008B0000
FileHeader 008B0080
LastRvaSection 008B0178
NumberOfSections 00000004
Sections 008B0178
Characteristics 0000010B
fSystemImage 00000000
fDOSImage 00000000
Links 0012FFA4
SizeOfImage 00003E00
==== PE File Headers (LOADED_IMAGE.FileHeader) ====
Machine 014C
NumberOfSections 0004
TimeDateStamp 336CF53B
PointerToSymbolTable 00000000
NumberOfSymbols 00000000
SizeOfOptionalHeader 00E0
Characteristics 010B
Magic 010B
LinkerVersion 5.00
SizeOfCode 00002200
SizeOfInitializedData 00001800
SizeOfUninitializedData 00000000
AddressOfEntryPoint 00001E59
BaseOfCode 00001000
BaseOfData 00004000
ImageBase 00400000
SectionAlignment 00001000
FileAlignment 00000200
OperatingSystemVersion 4.00
ImageVersion 0.00
SubsystemVersion 4.00
Win32VersionValue 00000000
SizeOfImage 00007000
SizeOfHeaders 00000400
CheckSum 00000000
Subsystem 0003
DllCharacteristics 0000
SizeOfStackReserve 00100000
SizeOfStackCommit 00001000
SizeOfHeapReserve 00100000
SizeOfHeapCommit 00001000
LoaderFlags 00000000
NumberOfRvaAndSizes 00000010
IMAGE_DIRECTORY_ENTRY_EXPORT Address: 00000000 Size: 00000000
IMAGE_DIRECTORY_ENTRY_IMPORT Address: 00006000 Size: 00000050
IMAGE_DIRECTORY_ENTRY_RESOURCE Address: 00000000 Size: 00000000
IMAGE_DIRECTORY_ENTRY_EXCEPTION Address: 00000000 Size: 00000000
IMAGE_DIRECTORY_ENTRY_SECURITY Address: 00000000 Size: 00000000
IMAGE_DIRECTORY_ENTRY_BASERELOC Address: 00000000 Size: 00000000
IMAGE_DIRECTORY_ENTRY_DEBUG Address: 00004000 Size: 00000054
IMAGE_DIRECTORY_ENTRY_COPYRIGHT Address: 00000000 Size: 00000000
IMAGE_DIRECTORY_ENTRY_GLOBALPTR Address: 00000000 Size: 00000000
IMAGE_DIRECTORY_ENTRY_TLS Address: 00000000 Size: 00000000
IMAGE_DIRECTORY_ENTRY_LOAD_CONFIG Address: 00000000 Size: 00000000
IMAGE_DIRECTORY_ENTRY_BOUND_IMPORT Address: 00000000 Size: 00000000
IMAGE_DIRECTORY_ENTRY_IAT Address: 0000613C Size: 000000EC
==== Section Headers (LOADED_IMAGE.Sections) ====
# Name Address VirtSize RawSize
-- -------- -------- -------- --------
1 .text 00001000 0000206F 00002200
2 .rdata 00004000 00000164 00000200
3 .data 00005000 00000E53 00000c00
4 .idata 00006000 000004FC 00000600
==== IMAGE_DEBUG_INFORMATION ====
Size 0000012C
MappedBase 008B0000
Machine 014C
Characteristics 010B
CheckSum 00000000
ImageBase 00400000
SizeOfImage 00007000
NumberOfSections 00000004
Sections 008B0178
ExportedNamesSize 00000000
ExportedNames 00000000
NumberOfFunctionTableEntries 00000000
FunctionTableEntries 00000000
LowestFunctionStartingAddress 00000000
HighestFunctionEndingAddress 00000000
NumberOfFpoTableEntries 00000005
FpoTableEntries 008AE72C
SizeOfCoffSymbols 00000000
CoffSymbols 00000000
SizeOfCodeViewSymbols 00000029
CodeViewSymbols 008AE780
ImageFilePath E:\column\col47\EZPE.EXE
ImageFileName EZPE.EXE
DebugFilePath E:\column\col47\EZPE.EXE
TimeDateStamp 336CF53B
RomImage 00000000
DebugDirectory 008B2600
NumberOfDebugDirectories 00000003
==== IMAGEHLP_MODULE ====
CheckSum 00000000
NumSyms 00000000
SymType 00000003
SymPdb
ModuleName EZPE
ImageName EZPE.EXE
LoadedImageName E:\column\col47\EZPE.EXE
==== Symbols ====
RVA Name
-------- ----
000061BC _imp__ExitProcess
00001834 ShowImageFileHeaders
0000616C _imp__UnDecorateSymbolName
00001F92 UnmapDebugInformation
00001FEC GetStdHandle
00001F8C MapAndLoad
00006014 _IMPORT_DESCRIPTOR_IMAGEHLP
00001E14 printf
000061A8 _imp__lstrlenA
00001F80 wvsprintfA
00001D5F ParseCommandLine
00001E76 _ConvertCommandLineToArgcArgv
00005A78 g_szHelp
000061F8 _imp__wvsprintfA
0000614C _imp__UnmapDebugInformation
00001FE0 GetFullPathNameA
00006154 _imp__SymUnloadModule
000061C8 KERNEL32_NULL_THUNK_DATA
00006148 _imp__MapAndLoad
00001FDA lstrcpyA
000061B0 _imp__GetCommandLineA
00001F86 UnMapAndLoad
00001FA4 SymEnumerateSymbols
00005BE0 _ppszArgv
00005A70 g_fShowSymbols
00001FAA SymGetModuleInfo
00001292 ShowSymbols
0000613C _imp__MapDebugInformation
000061FC USER32_NULL_THUNK_DATA
00001D4A StartNewDisplaySection
00001046 main
00001FBC SymGetOptions
0000615C _imp__SymGetModuleInfo
00006140 _imp__SymInitialize
00006144 _imp__UnMapAndLoad
00002004 GetCommandLineA
000061C4 _imp__GetProcessHeap
000061A4 _imp__GetFullPathNameA
00006170 IMAGEHLP_NULL_THUNK_DATA
00006164 _imp__SymSetOptions
00005BC0 g_fDecoratedNames
00006160 _imp__SymLoadModule
00001FC8 MapDebugInformation
00001F9E SymUnloadModule
00006000 _IMPORT_DESCRIPTOR_USER32
00001FF8 HeapAlloc
00006158 _imp__SymEnumerateSymbols
00001F98 SymCleanup
00001FE6 WriteFile
000061AC _imp__lstrcpyA
00001767 EnumSymbolsCallback
00006150 _imp__SymCleanup
000061B8 _imp__GetStdHandle
00001FD4 lstrlenA
00001FB6 SymSetOptions
00006168 _imp__SymGetOptions
00001FFE GetProcessHeap
000061B4 _imp__WriteFile
00006028 _IMPORT_DESCRIPTOR_KERNEL32
00001E59 mainCRTStartup
00001FF2 ExitProcess
000061C0 _imp__HeapAlloc
00001FCE UnDecorateSymbolName
00001CEE ShowSectionHeaders
00001FC2 SymInitialize
00001FB0 SymLoadModule
0000603C _NULL_IMPORT_DESCRIPTOR

The -d option tells EZPE to display the decorated (mangled) names of any C++ symbols in the symbol table. By default, when SymLoadModule creates the symbol table, it undecorates any C++ symbols into human-readable form. The undecorated name consists solely of the class name and member function name, such as foo::bar. This is the default output mode that EZPE uses. With -d, EZPE emits the raw, decorated names instead, for example ?ParseCommandLine@@YAHHQAPADPAD1@Z.

While I was writing EZPE, it occurred to me that the default undecoration strips out lots of potentially useful information such as the parameters, calling convention, return type, and so forth. Therefore, when using the -d option, EZPE displays the decorated name as well as an undecorated version that contains much more information about the symbol.

Coming up with a better undecorated version of a symbol name turned out to be a bit of a challenge. The first requirement was to force SymLoadModule to leave the symbol names alone when loading the symbol table. Luckily, there’s another IMAGEHLP API that makes this easy: SymSetOptions takes a flag called SYMOPT_UNDNAME, which is on by default. Because I wanted to change only that option and leave the others alone, the code calls SymGetOptions to get the current options. It then clears the SYMOPT_UNDNAME flag and calls SymSetOptions with the result.

The remaining work of displaying a better undecorated symbol name is to call yet another IMAGEHLP API, UnDecorateSymbolName, for any name that appears to be decorated (decorated names begin with a “?”). UnDecorateSymbolName takes a whole slew of flags that tell it which parts of an undecorated name to include or leave out. The EZPE code uses the set of options that should produce the most information in the undecorated name. When I tested EZPE, UnDecorateSymbolName failed on certain symbol names.
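The option handling described above boils down to a read-modify-write on a bit mask. The following is a minimal sketch, not EZPE’s actual code: the helper names ClearUndname and LeaveSymbolNamesDecorated are hypothetical, and kSymoptUndname simply mirrors the value of SYMOPT_UNDNAME (0x00000002) from IMAGEHLP.H so that the masking logic can be compiled anywhere. The Windows-only part assumes you link against IMAGEHLP.LIB.

```cpp
#include <cstdint>

#ifdef _WIN32
#include <windows.h>
#include <imagehlp.h>
#endif

// Mirrors SYMOPT_UNDNAME from IMAGEHLP.H; repeated here only so the
// masking logic below can be exercised without the Windows headers.
constexpr std::uint32_t kSymoptUndname = 0x00000002;

// Hypothetical helper: returns the option mask with the "undecorate names"
// bit cleared and every other bit left exactly as it was.
constexpr std::uint32_t ClearUndname(std::uint32_t options)
{
    return options & ~kSymoptUndname;
}

#ifdef _WIN32
// Hypothetical wrapper: call before SymLoadModule so that the symbol table
// is built with the raw, decorated C++ names.
inline void LeaveSymbolNamesDecorated()
{
    DWORD current = SymGetOptions();        // read the current option mask
    SymSetOptions(ClearUndname(current));   // write it back minus SYMOPT_UNDNAME
}
#endif
```

Masking the bit off, rather than passing a hard-coded option value, is what keeps the other options at whatever IMAGEHLP had them set to.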
A little investigation proved that some symbols had garbage characters at the end of their names, whether the name was normal or decorated. Apparently, IMAGEHLP leaves garbage at the end of certain symbol names when operating with the SYMOPT_

Another interesting thing about the EZPE EnumSymbolsCallback function concerns the symbol address parameter. It turns out that when IMAGEHLP calls the function, the symbol address it passes is a linear address that depends on where the executable was mapped into memory. For a debugger operating on a live process, this is just fine. However, in a symbol display program, it’s worthless, because the executable could be mapped nearly anywhere. To resolve this situation, I made EZPE emit the RVA of the symbol rather than the value IMAGEHLP passes to the callback function. (An RVA is independent of the executable’s mapped address, and just makes more sense since PE files themselves store all addresses as RVAs.)

To calculate the RVA of each symbol, EnumSymbolsCallback has to know where the executable is mapped into memory. Luckily, SymEnumerateSymbols has a parameter that it passes on, unmodified, to the enumeration callback function. EZPE uses this parameter to convey the executable’s mapped address to the enumeration callback function. In the callback, the code subtracts this value from the symbol address to obtain the symbol’s RVA. You’ll see this in the portion of Figure 2 that begins with the header “==== Symbols ====”. In particular, note that the addresses for the symbols are relatively small and fall within the RVAs listed for the various PE file sections.

As a final wrap-up, let me first apologize for the macro madness at the beginning (for example, the DisplayPtrFieldD macro). When I was writing EZPE, I knew that it would display many fields from numerous structures. I wanted these fields to be formatted in a nice, consistent manner.
If I had used printf directly, I would need to modify each printf individually if I wanted to change any output formatting. Making EZPE into a GUI app would have been even more of a pain. By using nested macros and the preprocessor stringize feature, I was able to isolate all the details of how the structure fields should be displayed into one location.

If you’re ambitious and want to extend or customize EZPE, there are a number of things you can do. For example, you could remove the display of the various IMAGEHLP-specific data structures. I included them to show what sort of information IMAGEHLP gives you. The resulting output would be smaller and would include only information from the executable and symbol tables. Another nice feature would be to decode the various fields containing flags, such as the Characteristics field in the PE header, or the PE section attributes. Even with this extra code, you’d have a very compact program, which is a testament to the power that IMAGEHLP.DLL provides.

Have a question about programming in Windows? Send it to Matt at mpietrek@tiac.com

Under the Hood
http://www.tiac.com/users/mpietrek.
IMAGEHLP.DLL as part of a framework for reporting on unhandled exceptions. Since then I’ve received quite a bit of email about the use of those APIs, indicating that IMAGEHLP.DLL is an area of widespread interest. Unfortunately, in many ways the IMAGEHLP documentation assumes that you’re comfortable working with executable files and symbol tables. It’s also weak in explaining which APIs need to be used, and in what particular order to perform a given task. The result is that many developers who would benefit from using IMAGEHLP.DLL get lost in the documentation.

IMAGEHLP APIs
the necessary gyrations to make a memory-mapped file corresponding to the specified executable. Internally, MapAndLoad goes through the standard CreateFile, CreateFileMapping, MapViewOfFile sequence. Because these underlying APIs open up handles, it’s important that you call the matching UnMapAndLoad API when you’re done to close all the handles.
HEADERS structure, which is defined in WINNT.H. The IMAGE_NT_HEADERS structure is better known as the PE header, and contains all the vital values for the executable. This structure has been described in numerous articles (many of which are in the Microsoft KnowledgeBase), so I won’t dwell on it here. However, EZPE does a rudimentary printout of the PE header contents without putting too much effort into interpreting the fields.
SECTION_HEADER structure contains the name of a section, its size, its attributes, and its location within the executable file. The EZPE program prints out the important contents of each IMAGE_SECTION_HEADER in sequence, again without too much effort doing things such as breaking down the attributes into meaningful flags like PAGE_READONLY.
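Breaking the attributes down is mostly a table lookup. As a rough sketch of what that might look like (EZPE itself doesn’t do this), the function below tests a section’s Characteristics value against a handful of the IMAGE_SCN_* bits from WINNT.H. The constants are copied here so the example compiles without the Windows headers; DescribeSectionFlags and the short labels it emits are hypothetical.

```cpp
#include <cstdint>
#include <string>

// One entry per attribute bit; the values are copied from the
// IMAGE_SCN_* definitions in WINNT.H.
struct SectionFlagName {
    std::uint32_t bit;
    const char*   label;
};

const SectionFlagName kSectionFlags[] = {
    { 0x00000020u, "CODE" },                // IMAGE_SCN_CNT_CODE
    { 0x00000040u, "INITIALIZED_DATA" },    // IMAGE_SCN_CNT_INITIALIZED_DATA
    { 0x00000080u, "UNINITIALIZED_DATA" },  // IMAGE_SCN_CNT_UNINITIALIZED_DATA
    { 0x20000000u, "EXECUTE" },             // IMAGE_SCN_MEM_EXECUTE
    { 0x40000000u, "READ" },                // IMAGE_SCN_MEM_READ
    { 0x80000000u, "WRITE" },               // IMAGE_SCN_MEM_WRITE
};

// Hypothetical helper: returns a space-separated list of the labels whose
// bits are set in an IMAGE_SECTION_HEADER's Characteristics field.
std::string DescribeSectionFlags(std::uint32_t characteristics)
{
    std::string result;
    for (const SectionFlagName& flag : kSectionFlags) {
        if (characteristics & flag.bit) {
            if (!result.empty())
                result += ' ';
            result += flag.label;
        }
    }
    return result;
}
```

A typical code section carries the value 0x60000020, which this sketch renders as CODE EXECUTE READ.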
INFORMATION structure includes the preferred load address (the ImageBase field), and the size of the executable in memory (the SizeOfImage field). There are also pointers to the table of names for the exported functions, as well as the executable’s time/date stamp DWORD. You can convert this DWORD to a time_t and pass its address to the C runtime ctime function to get the time and date when the executable was built. For more information on the time/date stamp, see my February 1997 column.
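As a small illustration of decoding the stamp, here’s a sketch that treats the DWORD as a count of seconds since January 1, 1970 (UTC), which is how the C runtime’s time functions interpret it. It uses gmtime and strftime instead of ctime so that the result doesn’t depend on the local time zone; FormatTimeDateStamp is a hypothetical helper name, not something from EZPE or IMAGEHLP.

```cpp
#include <cstdint>
#include <ctime>
#include <string>

// Hypothetical helper: formats a PE time/date stamp as a UTC date and time.
std::string FormatTimeDateStamp(std::uint32_t stamp)
{
    // Widen the 32-bit stamp to whatever time_t is on this platform.
    std::time_t seconds = static_cast<std::time_t>(stamp);

    // gmtime returns a pointer to static storage (fine for a sketch,
    // but not thread-safe), or a null pointer on failure.
    const std::tm* utc = std::gmtime(&seconds);
    if (utc == nullptr)
        return std::string();

    char text[32];
    std::size_t length =
        std::strftime(text, sizeof text, "%Y-%m-%d %H:%M:%S", utc);
    return std::string(text, length);
}
```

Fed the TimeDateStamp value 336CF53B from the Ezpe.out listing, this produces 1997-05-04 20:44:43 (UTC).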
INFORMATION structure isn’t so obvious—like the pointers to Function and FPO tables. The Function table is data used by the structured exception handling code on the Alpha and MIPS platforms (it’s not encountered with Intel-based executables). FPO information is seen only on the Intel platform; it helps debuggers walk the call stack in the absence of standard EBP register stack frames.
DEBUG_INFORMATION structure. Remember though, the whole advantage of using IMAGEHLP is to avoid such low-level grunginess.
PROCESS_HANDLE.

The EZPE Code
ezpe.mak
Ezpe.out
UNDNAME option disabled. For normal names, I didn’t go to the trouble of trying to strip off the garbage characters. However, I did notice that most C++ symbol names end with a capital Z. In the EnumSymbolsCallback function from EZPE.CPP, you’ll find that the code works backwards from the end of C++ symbol names, stripping off characters until it encounters a Z. Not pretty, but it seems to work OK.

To obtain complete source code listings, see page 5.
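The two fix-ups that the callback performs, trimming a decorated name back to its final Z and turning the symbol address into an RVA, can be sketched without any IMAGEHLP calls. This is an illustrative reconstruction from the description in this column, not the code from EZPE.CPP; TrimDecoratedName and SymbolAddressToRva are hypothetical names, and the load address in the note that follows is just an assumed example.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: for a decorated C++ name (one starting with '?'),
// drop everything after the last capital 'Z'. Other names, and decorated
// names that contain no 'Z' at all, are returned unchanged.
std::string TrimDecoratedName(const std::string& name)
{
    if (name.empty() || name[0] != '?')
        return name;

    const std::string::size_type lastZ = name.rfind('Z');
    if (lastZ == std::string::npos)
        return name;

    return name.substr(0, lastZ + 1);
}

// Hypothetical helper: converts the linear address handed to the
// enumeration callback into an RVA by subtracting the base address that
// was passed through SymEnumerateSymbols' user-context parameter.
std::uint32_t SymbolAddressToRva(std::uint32_t symbolAddress,
                                 std::uint32_t mappedBase)
{
    return symbolAddress - mappedBase;
}
```

For instance, if the module were loaded at 0x00400000, a symbol reported at 0x00401046 would be displayed with RVA 00001046, the value shown for main in the Ezpe.out listing. Like the approach it imitates, the trimming can be fooled if the trailing garbage itself happens to contain a Z.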