Under the Hood

Matt Pietrek

Matt Pietrek is the author of Windows 95 System Programming Secrets (IDG Books, 1995). He works at NuMega Technologies Inc., and can be reached at mpietrek@tiac.com or at http://www.tiac.com/users/mpietrek.

In my May 1997 column, I used some APIs from IMAGEHLP.DLL as part of a framework for reporting on unhandled exceptions. Since then I've received quite a bit of email about the use of those APIs, indicating that IMAGEHLP.DLL is an area of widespread interest. Unfortunately, in many ways the IMAGEHLP documentation assumes that you're comfortable working with executable files and symbol tables. It's also weak in explaining which APIs need to be used, and in what order, to perform a given task. The result is that many developers who would benefit from using IMAGEHLP.DLL get lost in the documentation.

This month, I'll go over a different subset of the IMAGEHLP APIs to show how their powerful features can be put to use with a few simple lines of code. To demonstrate how easy it is to use the IMAGEHLP APIs, I wrote the EZPE program, a PE file-display program that also displays the debug symbols belonging to an executable (that is, EXEs, DLLs, OCXs, and so on). It displays information similar to programs like DUMPBIN from Visual C++® or PEDUMP from my book, Windows 95 System Programming Secrets (IDG Books, 1995). The key difference is that EZPE never touches the executable file directly, and it doesn't grovel through data structures like other PE file-display programs. Instead, EZPE lets the IMAGEHLP APIs do all the hard work and demonstrates the proper use of the APIs as a by-product.

Another nice feature that falls out of EZPE's use of the IMAGEHLP APIs is that you can see the symbol names and addresses contained within debug information, such as the DBG files that are provided for Windows NT components. You can also use EZPE to see the symbols contained within PDB files, something that even DUMPBIN can't do. All you have to do is make sure that the symbol table file (for example, PDB or DBG) is in the same directory as the executable it belongs to. Running EZPE on the executable file causes it to automatically find the symbols in the PDB or DBG file, as appropriate. The beauty is that EZPE kicks back and lets IMAGEHLP.DLL do the hard work of finding and loading the symbol tables. More on this later.

Before jumping into a description of the IMAGEHLP APIs, a quick review of IMAGEHLP's availability is worthwhile. IMAGEHLP is a standard component of Windows NT® 4.0. However, it's a redistributable DLL, so you can ship it with your app if it needs to run on Windows® 95. Be aware, though, that certain functions in IMAGEHLP don't work under Windows 95 (at least not in the IMAGEHLP.DLL that was available when I wrote this). The import library and header file for IMAGEHLP.DLL can be found in any Win32® SDK that shipped on or after the release date of Windows NT 4.0 (July 31, 1996). IMAGEHLP isn't specific to any one CPU platform. I built my EZPE program on a DEC Alpha, and it worked perfectly the first time.

IMAGEHLP APIs

The first IMAGEHLP API to look at is MapAndLoad. You'd use this if you were interested only in the contents of an executable and didn't care about any debug information that might be available. Although the IMAGEHLP documentation is vague about exactly what MapAndLoad does, it's really quite simple. First, the function goes through the necessary gyrations to make a memory mapped file corresponding to the specified executable. Internally, MapAndLoad goes through the standard CreateFile, CreateFileMapping, MapViewOfFile sequence. Because these underlying APIs open up handles, it's important that you call the matching UnMapAndLoad API when you're done to close all the handles.

After memory mapping the executable file, MapAndLoad fills in the LOADED_IMAGE structure that was passed in. There are a number of key fields in this structure that are likely to be valuable to you. The MappedAddress field is where the executable is mapped into memory (that is, it's what the internal call to MapViewOfFile returned). The FileHeader field contains a pointer to an IMAGE_NT_HEADERS structure, which is defined in WINNT.H. The IMAGE_NT_HEADERS structure is better known as the PE header, and contains all the vital values for the executable. This structure has been described in numerous articles (many of which are in the Microsoft Knowledge Base), so I won't dwell on it here. However, EZPE does a rudimentary printout of the PE header contents without putting too much effort into interpreting the fields.

The Sections field in the LOADED_IMAGE structure is a pointer to the PE section table, which is an array of IMAGE_SECTION_HEADER structures, also defined in WINNT.H. The number of sections in the array is given by the (you guessed it) NumberOfSections field. An IMAGE_SECTION_HEADER structure contains the name of a section, its size, its attributes, and its location within the executable file. The EZPE program prints out the important contents of each IMAGE_SECTION_HEADER in sequence, again without too much effort doing things such as breaking down the attributes into meaningful flags like PAGE_READONLY.
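EZPE leaves the section attributes undecoded, but breaking them down is straightforward. The following is a minimal sketch, not part of EZPE: the helper name DescribeSectionFlags is hypothetical, and the constants restate a few of the IMAGE_SCN_ values from WINNT.H so that the fragment builds without the Windows headers.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper (not in EZPE): turns the Characteristics DWORD of an
// IMAGE_SECTION_HEADER into a space-separated list of flag names.
// The bit values mirror the IMAGE_SCN_* definitions in WINNT.H.
std::string DescribeSectionFlags(std::uint32_t characteristics)
{
    struct FlagName { std::uint32_t bit; const char* name; };
    static const FlagName kFlags[] = {
        { 0x00000020u, "CODE" },               // IMAGE_SCN_CNT_CODE
        { 0x00000040u, "INITIALIZED_DATA" },   // IMAGE_SCN_CNT_INITIALIZED_DATA
        { 0x00000080u, "UNINITIALIZED_DATA" }, // IMAGE_SCN_CNT_UNINITIALIZED_DATA
        { 0x02000000u, "DISCARDABLE" },        // IMAGE_SCN_MEM_DISCARDABLE
        { 0x10000000u, "SHARED" },             // IMAGE_SCN_MEM_SHARED
        { 0x20000000u, "EXECUTE" },            // IMAGE_SCN_MEM_EXECUTE
        { 0x40000000u, "READ" },               // IMAGE_SCN_MEM_READ
        { 0x80000000u, "WRITE" },              // IMAGE_SCN_MEM_WRITE
    };

    std::string out;
    for (const FlagName& f : kFlags) {
        if (characteristics & f.bit) {
            if (!out.empty())
                out += ' ';
            out += f.name;
        }
    }
    return out;
}
```

A typical code section carries the value 0x60000020, which this helper renders as CODE EXECUTE READ.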

The final field in the LOADED_IMAGE structure worth mentioning here is the Characteristics field. Using this is just a shortcut to grabbing the Characteristics field out of the PE header. The characteristics flags are defined in WINNT.H and include values such as IMAGE_FILE_DLL, which means the executable is a DLL rather than a program (EXE) file.
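As a small illustration of testing these flags, here is a sketch with two hypothetical predicates (IsDll and IsExecutableImage are not IMAGEHLP functions). The two constants restate IMAGE_FILE_EXECUTABLE_IMAGE and IMAGE_FILE_DLL from WINNT.H so the fragment compiles on its own.

```cpp
#include <cstdint>

// Values restated from WINNT.H.
constexpr std::uint32_t kImageFileExecutableImage = 0x0002; // IMAGE_FILE_EXECUTABLE_IMAGE
constexpr std::uint32_t kImageFileDll             = 0x2000; // IMAGE_FILE_DLL

// Hypothetical helpers: pass LOADED_IMAGE.Characteristics (or the field of
// the same name from the PE header) to classify the file.
bool IsDll(std::uint32_t characteristics)
{
    return (characteristics & kImageFileDll) != 0;
}

bool IsExecutableImage(std::uint32_t characteristics)
{
    return (characteristics & kImageFileExecutableImage) != 0;
}
```

For the value 0x010B that EZPE reports for its own EXE, IsExecutableImage returns true and IsDll returns false.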

As the main function in EZPE.CPP shows (see Figure 1), the MapAndLoad and UnMapAndLoad APIs can be used without any advance preparation, unlike the symbol table APIs that I'll get to shortly. MapAndLoad is relatively lightweight and executes quickly. Using just MapAndLoad, and knowing the contents of various PE file data structures, you can quickly access nearly everything of importance in an executable file.

Figure 1 EZPE.CPP

//==========================================
// Matt Pietrek
// Microsoft Systems Journal, August 1997
// FILE: EZPE.CPP
//==========================================
#include <windows.h>
#include <stdio.h>
#include <imagehlp.h>

//=============================================================================
// Macros to make formatted output of numerous data structure fields easier
//=============================================================================
#define DISPLAY_COLUMNS "35"

#define DisplayPtrField( ptr, field, fmt ) \
    printf( "%-" DISPLAY_COLUMNS fmt, #field, ptr->field );

#define DisplayPtrFieldD( ptr, field ) DisplayPtrField( ptr, field, "s%08X\n" )
#define DisplayPtrFieldW( ptr, field ) DisplayPtrField( ptr, field, "s%04X\n" )
#define DisplayPtrFieldStr(ptr,field ) DisplayPtrField( ptr, field, "s%s\n" )

#define DisplayPtrVersionFields( name, ptr, field1, field2 ) \
    printf("%-" DISPLAY_COLUMNS "s%u.%02u\n", name,ptr->field1,ptr->field2);

#define DisplayField( struct, field, fmt ) \
    printf( "%-" DISPLAY_COLUMNS fmt, #field, struct.field );

#define DisplayFieldD( struct, field ) DisplayField( struct, field, "s%08X\n")
#define DisplayFieldW( struct, field ) DisplayField( struct, field, "s%04X\n")
#define DisplayFieldStr(struct,field ) DisplayField( struct, field, "s%s\n")

#define DisplayDataDir( ptr, x ) \
    printf( "%-" DISPLAY_COLUMNS "sAddress: %08X Size: %08X\n", \
            #x, ptr->DataDirectory[x].VirtualAddress, ptr->DataDirectory[x].Size );

//=============================================================================
// Prototypes for helper functions
//=============================================================================
BOOL CALLBACK EnumSymbolsCallback( LPSTR, ULONG, ULONG, PVOID );
void ShowImageFileHeaders( PIMAGE_NT_HEADERS pNTHdrs );
void ShowSectionHeaders( PIMAGE_SECTION_HEADER pSectionHdr, DWORD cSections );
void StartNewDisplaySection( PSTR pszSectionName );
BOOL ParseCommandLine(int argc, char * argv[], PSTR pszFilename, PSTR pszPath);
void ShowSymbols( PSTR pszFilename, PSTR pszPath );

//=============================================================================
// Global variables
//=============================================================================
BOOL g_fDecoratedNames = FALSE;
BOOL g_fShowSymbols = TRUE;

char g_szHelp[] =
    "EZPE - August 1997 Microsoft Systems Journal, by Matt Pietrek\n"
    "Syntax: EZPE [options] <filename>\n"
    " -d Decorated C++ names\n"
    " -n No symbol display\n";

//=============================================================================
// Main program - First display info about the executable, then shows symbols
//=============================================================================
int main( int argc, char * argv[] )
{
    char szFilename[MAX_PATH];
    char szPath[MAX_PATH];

    if ( !ParseCommandLine(argc, argv, szFilename, szPath) )
    {
        printf( "%s", g_szHelp );   // Never pass data as the format string
        return 1;
    }

    printf( "Display of file %s\n", szFilename );

    LOADED_IMAGE li;

    if ( MapAndLoad( szFilename, 0, &li, FALSE, TRUE) )
    {
        StartNewDisplaySection( " LOADED_IMAGE" );
        DisplayFieldStr( li, ModuleName )
        DisplayFieldD( li, hFile )
        DisplayFieldD( li, MappedAddress )
        DisplayFieldD( li, FileHeader )
        DisplayFieldD( li, LastRvaSection )
        DisplayFieldD( li, NumberOfSections )
        DisplayFieldD( li, Sections )
        DisplayFieldD( li, Characteristics )
        DisplayFieldD( li, fSystemImage )
        DisplayFieldD( li, fDOSImage )
        DisplayFieldD( li, Links )
        DisplayFieldD( li, SizeOfImage )

        StartNewDisplaySection( "PE File Headers (LOADED_IMAGE.FileHeader)" );
        ShowImageFileHeaders( li.FileHeader );

        StartNewDisplaySection( "Section Headers (LOADED_IMAGE.Sections)" );
        ShowSectionHeaders( li.Sections, li.NumberOfSections );
    }
    else
    {
        printf( "MapAndLoad failed - exiting\n" );
        return 1;
    }

    UnMapAndLoad( &li );    // Undo the MapAndLoad

    if ( g_fShowSymbols )
        ShowSymbols( szFilename, szPath );

    return 0;
}

#define MY_PROCESS_HANDLE 0

//=============================================================================
// Loads a symbol table and enumerates the symbols in it
//=============================================================================
void ShowSymbols( PSTR pszFilename, PSTR pszPath )
{
    PIMAGE_DEBUG_INFORMATION pidi;

    //
    // First, see if there's any debug information worth displaying. If
    // this function succeeds, the PE file and its debug information
    // are memory mapped in.
    //
    pidi = MapDebugInformation( 0, pszFilename, "", 0 );
    if ( pidi )
    {
        StartNewDisplaySection( "IMAGE_DEBUG_INFORMATION" );
        DisplayPtrFieldD( pidi, Size )
        DisplayPtrFieldD( pidi, MappedBase )
        DisplayPtrFieldW( pidi, Machine )
        DisplayPtrFieldW( pidi, Characteristics )
        DisplayPtrFieldD( pidi, CheckSum )
        DisplayPtrFieldD( pidi, ImageBase )
        DisplayPtrFieldD( pidi, SizeOfImage )
        DisplayPtrFieldD( pidi, NumberOfSections )
        DisplayPtrFieldD( pidi, Sections )
        DisplayPtrFieldD( pidi, ExportedNamesSize )
        DisplayPtrFieldD( pidi, ExportedNames )
        DisplayPtrFieldD( pidi, NumberOfFunctionTableEntries )
        DisplayPtrFieldD( pidi, FunctionTableEntries )
        DisplayPtrFieldD( pidi, LowestFunctionStartingAddress )
        DisplayPtrFieldD( pidi, HighestFunctionEndingAddress )
        DisplayPtrFieldD( pidi, NumberOfFpoTableEntries )
        DisplayPtrFieldD( pidi, FpoTableEntries )
        DisplayPtrFieldD( pidi, SizeOfCoffSymbols )
        DisplayPtrFieldD( pidi, CoffSymbols )
        DisplayPtrFieldD( pidi, SizeOfCodeViewSymbols )
        DisplayPtrFieldD( pidi, CodeViewSymbols )
        DisplayPtrFieldStr( pidi, ImageFilePath )
        DisplayPtrFieldStr( pidi, ImageFileName )
        DisplayPtrFieldStr( pidi, DebugFilePath )
        DisplayPtrFieldD( pidi, TimeDateStamp )
        DisplayPtrFieldD( pidi, RomImage )
        DisplayPtrFieldD( pidi, DebugDirectory )
        DisplayPtrFieldD( pidi, NumberOfDebugDirectories )
    }
    else
    {
        printf( "MapDebugInformation failed - exiting\n" );
        return;
    }

    //
    // Init symbol handler for "process" (in our case, "0" is the process)
    //
    if ( !SymInitialize( MY_PROCESS_HANDLE, pszPath, FALSE ) )
    {
        printf( "SymInitialize failed - exiting\n" );
        UnmapDebugInformation( pidi );  // Don't leak the mapping on failure
        return;
    }

    //
    // If the command line specified "decorated" names (-d), set the symbol
    // options before loading the symbol table below.
    //
    if ( g_fDecoratedNames )
    {
        DWORD symOptions = SymGetOptions();
        symOptions &= ~SYMOPT_UNDNAME;  // Turn off the SYMOPT_UNDNAME flag
        SymSetOptions( symOptions );
    }

    //
    // Load the symbol table for the specified file. A "MappedAddress" from
    // MapAndLoad, or a "MappedBase" from MapDebugInformation is needed as
    // parameter 5.
    //
    if ( !SymLoadModule( MY_PROCESS_HANDLE, 0, pszFilename, 0,
                         (DWORD)pidi->MappedBase, 0 ) )
    {
        printf( "SymLoadModule failed\n" );
        SymCleanup( MY_PROCESS_HANDLE );    // Undo the SymInitialize
        UnmapDebugInformation( pidi );      // Undo MapDebugInformation
        return;
    }

    IMAGEHLP_MODULE im = { sizeof(im) };

    if ( SymGetModuleInfo( MY_PROCESS_HANDLE, (DWORD)pidi->MappedBase, &im ) )
    {
        StartNewDisplaySection( "IMAGEHLP_MODULE" );
        // DisplayFieldD( im, SizeOfStruct ); // Not worth showing
        // DisplayFieldD( im, BaseOfImage );
        // DisplayFieldD( im, ImageSize );
        // DisplayFieldD( im, TimeDateStamp );
        DisplayFieldD( im, CheckSum );
        DisplayFieldD( im, NumSyms );
        DisplayFieldD( im, SymType );

        PSTR pszSymType;
        switch( im.SymType )
        {
            case SymNone:       pszSymType = "SymNone"; break;
            case SymCoff:       pszSymType = "SymCoff"; break;
            case SymCv:         pszSymType = "SymCv"; break;
            case SymPdb:        pszSymType = "SymPdb"; break;
            case SymExport:     pszSymType = "SymExport"; break;
            case SymDeferred:   pszSymType = "SymDeferred"; break;
            default:            pszSymType = "?";
        }
        printf( "%-" DISPLAY_COLUMNS "s%s\n", " ", pszSymType );

        DisplayFieldStr( im, ModuleName );
        DisplayFieldStr( im, ImageName );
        DisplayFieldStr( im, LoadedImageName );
    }

    StartNewDisplaySection( "Symbols" );
    printf( " RVA Name\n" );
    printf( "-------- ----\n" );

    //
    // Kick off the symbol enumeration. The EnumSymbolsCallback function
    // is called once for each symbol.
    //
    SymEnumerateSymbols( MY_PROCESS_HANDLE, (DWORD)pidi->MappedBase,
                         EnumSymbolsCallback, pidi->MappedBase );

    SymUnloadModule( MY_PROCESS_HANDLE, (DWORD)pidi->MappedBase ); // Undo SymLoadModule
    SymCleanup( MY_PROCESS_HANDLE );    // Undo the SymInitialize
    UnmapDebugInformation( pidi );      // Close the PE file and debug info mapping
}

//=============================================================================
// Callback function for use by the SymEnumerateSymbols API
//=============================================================================
BOOL CALLBACK EnumSymbolsCallback(
    LPSTR SymbolName,
    ULONG SymbolAddress,
    ULONG SymbolSize,
    PVOID UserContext )
{
    // User Context is whatever was passed to SymEnumerateSymbols. Here,
    // we passed the mapped address of the executable. This allows us
    // to convert the symbol addresses we get into RVAs, below.
    DWORD mappedBase = (DWORD)UserContext;

    // print out the RVA, and the symbol name passed to us.
    printf( "%08X %s\n", SymbolAddress - mappedBase, SymbolName );

    //
    // If "decorated" names were specified, and if the name is "decorated,"
    // undecorate it so that a human readable version can be displayed.
    //
    if ( g_fDecoratedNames && ('?' == *SymbolName) )
    {
        char szUndecoratedName[0x400];  // Make symbol name buffers for the
        char szDecoratedName[0x400];    // decorated & undecorated versions

        // Make a copy of the original SymbolName, so that we can modify it
        lstrcpy( szDecoratedName, SymbolName );

        PSTR pEnd = szDecoratedName + lstrlen( szDecoratedName );

        // Strip everything off the end until we reach a 'Z'
        while ( (pEnd > szDecoratedName) && (*pEnd != 'Z') )
            *pEnd-- = 0;

        // Call the IMAGEHLP function to undecorate the name
        if ( 0 != UnDecorateSymbolName( szDecoratedName, szUndecoratedName,
                                        sizeof(szUndecoratedName),
                                        UNDNAME_COMPLETE |
                                        UNDNAME_32_BIT_DECODE ) )
        {
            // End the output line with the undecorated name
            printf( " %s\n", szUndecoratedName );
        }
    }

    return TRUE;
}

//=============================================================================
// Shows the contents of a PE header. Called by main()
//=============================================================================
void ShowImageFileHeaders( PIMAGE_NT_HEADERS pNTHdrs )
{
    PIMAGE_FILE_HEADER pImageFileHeader = &pNTHdrs->FileHeader;

    DisplayPtrFieldW( pImageFileHeader, Machine )
    DisplayPtrFieldW( pImageFileHeader, NumberOfSections )
    DisplayPtrFieldD( pImageFileHeader, TimeDateStamp )
    DisplayPtrFieldD( pImageFileHeader, PointerToSymbolTable )
    DisplayPtrFieldD( pImageFileHeader, NumberOfSymbols )
    DisplayPtrFieldW( pImageFileHeader, SizeOfOptionalHeader )
    DisplayPtrFieldW( pImageFileHeader, Characteristics )

    PIMAGE_OPTIONAL_HEADER pImageOptHeader = &pNTHdrs->OptionalHeader;

    DisplayPtrFieldW( pImageOptHeader, Magic )
    DisplayPtrVersionFields("LinkerVersion", pImageOptHeader,
                            MajorLinkerVersion, MinorLinkerVersion );
    DisplayPtrFieldD( pImageOptHeader, SizeOfCode )
    DisplayPtrFieldD( pImageOptHeader, SizeOfInitializedData )
    DisplayPtrFieldD( pImageOptHeader, SizeOfUninitializedData )
    DisplayPtrFieldD( pImageOptHeader, AddressOfEntryPoint )
    DisplayPtrFieldD( pImageOptHeader, BaseOfCode )
    DisplayPtrFieldD( pImageOptHeader, BaseOfData )
    DisplayPtrFieldD( pImageOptHeader, ImageBase )
    DisplayPtrFieldD( pImageOptHeader, SectionAlignment )
    DisplayPtrFieldD( pImageOptHeader, FileAlignment )
    DisplayPtrVersionFields("OperatingSystemVersion", pImageOptHeader,
                            MajorOperatingSystemVersion,
                            MinorOperatingSystemVersion )
    DisplayPtrVersionFields("ImageVersion", pImageOptHeader, MajorImageVersion,
                            MinorImageVersion )
    DisplayPtrVersionFields("SubsystemVersion", pImageOptHeader,
                            MajorSubsystemVersion,
                            MinorSubsystemVersion )
    DisplayPtrFieldD( pImageOptHeader, Win32VersionValue )
    DisplayPtrFieldD( pImageOptHeader, SizeOfImage )
    DisplayPtrFieldD( pImageOptHeader, SizeOfHeaders )
    DisplayPtrFieldD( pImageOptHeader, CheckSum )
    DisplayPtrFieldW( pImageOptHeader, Subsystem )
    DisplayPtrFieldW( pImageOptHeader, DllCharacteristics )
    DisplayPtrFieldD( pImageOptHeader, SizeOfStackReserve )
    DisplayPtrFieldD( pImageOptHeader, SizeOfStackCommit )
    DisplayPtrFieldD( pImageOptHeader, SizeOfHeapReserve )
    DisplayPtrFieldD( pImageOptHeader, SizeOfHeapCommit )
    DisplayPtrFieldD( pImageOptHeader, LoaderFlags )
    DisplayPtrFieldD( pImageOptHeader, NumberOfRvaAndSizes )

    DisplayDataDir( pImageOptHeader, IMAGE_DIRECTORY_ENTRY_EXPORT )
    DisplayDataDir( pImageOptHeader, IMAGE_DIRECTORY_ENTRY_IMPORT )
    DisplayDataDir( pImageOptHeader, IMAGE_DIRECTORY_ENTRY_RESOURCE )
    DisplayDataDir( pImageOptHeader, IMAGE_DIRECTORY_ENTRY_EXCEPTION )
    DisplayDataDir( pImageOptHeader, IMAGE_DIRECTORY_ENTRY_SECURITY )
    DisplayDataDir( pImageOptHeader, IMAGE_DIRECTORY_ENTRY_BASERELOC )
    DisplayDataDir( pImageOptHeader, IMAGE_DIRECTORY_ENTRY_DEBUG )
    DisplayDataDir( pImageOptHeader, IMAGE_DIRECTORY_ENTRY_COPYRIGHT )
    DisplayDataDir( pImageOptHeader, IMAGE_DIRECTORY_ENTRY_GLOBALPTR )
    DisplayDataDir( pImageOptHeader, IMAGE_DIRECTORY_ENTRY_TLS )
    DisplayDataDir( pImageOptHeader, IMAGE_DIRECTORY_ENTRY_LOAD_CONFIG )
    DisplayDataDir( pImageOptHeader, IMAGE_DIRECTORY_ENTRY_BOUND_IMPORT )
    DisplayDataDir( pImageOptHeader, IMAGE_DIRECTORY_ENTRY_IAT )
}

//=============================================================================
// Enumerates through a section table and displays basic info for each section
//=============================================================================
void ShowSectionHeaders( PIMAGE_SECTION_HEADER pSectionHdr, DWORD cSections )
{
    printf( " # Name Address VirtSize RawSize\n" );
    printf( "-- -------- -------- -------- --------\n" );

    for ( unsigned i=1; i <= cSections; i++, pSectionHdr++ )
    {
        printf( "%2u %8.8s %08X %08X %08x\n", i, pSectionHdr->Name,
                pSectionHdr->VirtualAddress, pSectionHdr->Misc.VirtualSize,
                pSectionHdr->SizeOfRawData );
    }
}

//=============================================================================
// Helper function used to provide consistent delimitation of output sections
//=============================================================================
void StartNewDisplaySection( PSTR pszSectionName )
{
    printf( "\n==== %s ====\n", pszSectionName );
}

//=============================================================================
// Called by main() to extract any command-line arguments, as well as the
// filename to be used. Also attempts to come up with a complete path to
// the directory containing the named executable. Returns this value in the
// pszPath param buffer.
//=============================================================================
BOOL ParseCommandLine( int argc, char * argv[], PSTR pszFilename, PSTR pszPath )
{
    if ( argc < 2 )
        return FALSE;

    BOOL fSawFilename = FALSE;

    for ( int i = 1; i < argc; i++ )
    {
        PSTR pszArg = argv[i];

        if ( *pszArg == '-' )   // Is the first character a '-' ?
        {
            pszArg++;           // Skip past the '-' to the option letter

            if ( (*pszArg=='d') || (*pszArg=='D') )         // allow "d" or "D"
                g_fDecoratedNames = TRUE;                   // set global flag
            else if ( (*pszArg=='n') || (*pszArg=='N') )    // allow "n" or "N"
                g_fShowSymbols = FALSE;                     // set global flag
        }
        else
        {
            if ( fSawFilename ) // We should only get here once
                return FALSE;

            PSTR pszFilenamePart;

            if (GetFullPathName( argv[i], MAX_PATH, pszPath, &pszFilenamePart))
            {
                // Truncate the filename portion off the path
                if ( pszFilenamePart )
                    *pszFilenamePart = 0;

                // Copy the input argument to the passed in filename buffer
                lstrcpy( pszFilename, argv[i] );
                fSawFilename = TRUE;
            }
            else    // GetFullPathName failed. Ooops!
                return FALSE;
        }
    }

    return fSawFilename;
}

As a final note on MapAndLoad, it's important to remember that it creates a linear mapping of the entire file in one contiguous chunk. This is different from the Win32 loader bringing an executable module into memory, which creates a distinct mapping for each section so that it starts on a page boundary in memory. The result of this linear mapping is that any Relative Virtual Addresses (RVAs) that you might see in the PE header aren't directly usable with the image as loaded by MapAndLoad. To use an RVA in this situation, you'd have to adjust it to account for the difference between the section's file offset and its in-memory address. Luckily, IMAGEHLP.DLL provides an API, ImageRvaToVa, that will do this for you.
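To make the adjustment concrete, here is a minimal sketch of the arithmetic involved. It is an illustration of the idea, not the IMAGEHLP implementation: SectionInfo and RvaToFileOffset are hypothetical names, and the structure restates only the IMAGE_SECTION_HEADER fields that matter so the fragment builds without the Windows headers.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the relevant IMAGE_SECTION_HEADER fields.
struct SectionInfo {
    std::uint32_t virtualAddress;   // RVA at which the section starts in memory
    std::uint32_t virtualSize;      // Size of the section in memory
    std::uint32_t pointerToRawData; // File offset of the section's data
    std::uint32_t sizeOfRawData;    // Size of the section's data in the file
};

// Finds the section containing 'rva' and converts the RVA to an offset from
// the start of the file. Returns false if no section contains the RVA.
bool RvaToFileOffset(const SectionInfo* sections, std::size_t count,
                     std::uint32_t rva, std::uint32_t& fileOffset)
{
    for (std::size_t i = 0; i < count; ++i) {
        const SectionInfo& s = sections[i];
        // Some linkers leave the virtual size at zero; fall back to raw size.
        const std::uint32_t span = s.virtualSize ? s.virtualSize : s.sizeOfRawData;
        if (rva >= s.virtualAddress && rva - s.virtualAddress < span) {
            fileOffset = s.pointerToRawData + (rva - s.virtualAddress);
            return true;
        }
    }
    return false;
}
```

Adding the resulting offset to the MappedAddress field of the LOADED_IMAGE structure yields a pointer that is usable within the linear mapping. For example, if a section starts at RVA 0x1000 and its data sits at file offset 0x400 (assumed values), the RVA 0x1E59 corresponds to file offset 0x1259.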

If it's symbol table information you're after, the equivalent to MapAndLoad is the MapDebugInformation API. You can think of MapDebugInformation as a superset of the MapAndLoad API. Besides mapping the executable file into memory, this API also figures out what the best type of symbol information is, as well as some basic information about that symbol table. What exactly do I mean by "best"? It turns out that an executable can be built with more than one type of debug information. For example, you can create an executable with both CodeView (PDB) information and a COFF symbol table. IMAGEHLP knows how to read both formats, as well as a few others, and knows which one is optimal for your executable. More on this later. Just as the MapAndLoad API eventually needs to be followed by a call to UnMapAndLoad, MapDebugInformation also needs to be cleaned up by calling UnmapDebugInformation.

Because the symbols for an executable may be in a file other than the executable itself, the MapDebugInformation API takes a parameter not needed for the MapAndLoad API: the symbol search path. By default, IMAGEHLP searches for symbol files in a series of paths that I'll describe later. However, the MapDebugInformation API lets you override these paths. This is what I've done in the EZPE source where it calls MapDebugInformation.

Besides mapping and loading the executable and its symbols (if present), the MapDebugInformation API returns a pointer to an IMAGE_DEBUG_INFORMATION structure. This structure contains many more fields than a LOADED_IMAGE structure, although nearly every field in the LOADED_IMAGE structure can be found in the IMAGE_DEBUG_INFORMATION structure. For example, the MappedBase field contains the address where the executable was mapped, and is the same as the MappedAddress field in a LOADED_IMAGE structure. Similarly, the Sections field is a pointer to the executable's section table, and so forth.

More useful information found in the IMAGE_DEBUG_INFORMATION structure includes the preferred load address (the ImageBase field), and the size of the executable in memory (the SizeOfImage field). There are also pointers to the table of names for the exported functions, as well as the executable's time/date stamp DWORD. You can pass this DWORD to the C runtime ctime function to get the time and date when the executable was built. For more information on the time/date stamp, see my February 1997 column.
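The time/date stamp is a count of seconds since January 1, 1970 (UTC), which is why the C runtime time functions can decode it. The sketch below is one way to do that; FormatTimeDateStamp is a hypothetical helper name, and it uses gmtime with strftime rather than ctime so that the result doesn't depend on the local time zone.

```cpp
#include <cstdint>
#include <ctime>
#include <string>

// Hypothetical helper: formats a PE TimeDateStamp as a UTC date and time.
// Returns an empty string if the value can't be converted.
std::string FormatTimeDateStamp(std::uint32_t stamp)
{
    const std::time_t t = static_cast<std::time_t>(stamp);
    const std::tm* utc = std::gmtime(&t);   // Broken-down UTC time
    if (utc == nullptr)
        return std::string();

    char buf[32];
    const std::size_t n =
        std::strftime(buf, sizeof(buf), "%Y-%m-%d %H:%M:%S", utc);
    return std::string(buf, n);
}
```

Passing the TimeDateStamp value 0x336CF53B from the EZPE output in Figure 2 gives 1997-05-04 20:44:43.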

The meaning of some fields in the IMAGE_DEBUG_INFORMATION structure isn't so obvious, such as the pointers to the Function and FPO tables. The Function table is data used by the structured exception handling code on the Alpha and MIPS platforms (it's not encountered with Intel-based executables). FPO information is seen only on the Intel platform; it helps debuggers walk the call stack in the absence of standard EBP register stack frames.

Finally, the IMAGE_DEBUG_INFORMATION structure has a variety of fields that indicate if CodeView and COFF information are present, and if so, where. There's even a pointer to the debug directory. This is the data structure in the PE file that tells you what types of debug information are present and where. The MapDebugInformation API does a good job of extracting this information and presenting it in the IMAGE_DEBUG_INFORMATION structure. Still, if you're so inclined, you can go straight to the same raw data that IMAGEHLP uses to generate the IMAGE_DEBUG_INFORMATION structure. Remember though, the whole advantage of using IMAGEHLP is to avoid such low-level grunginess.

So far, the two APIs I've examined (MapAndLoad and MapDebugInformation) simply map an executable into memory and extract some useful information from it. Neither API loads a symbol table, although calling one of them is effectively a prerequisite to using the symbol table APIs. The key piece of data needed is the mapped address of the executable. The symbol table APIs work from the mapped executable to find and load the appropriate symbol table into memory.

The first IMAGEHLP symbol table API you should be aware of is SymInitialize, which sets up internal variables in IMAGEHLP so that the DLL is prepared to load symbol tables for the executable and possibly multiple DLLs within a process. As you might expect, there's a corresponding shutdown API, SymCleanup, that should be called when you're finished working with symbols.

The first parameter to SymInitialize is an identifier for a process that you want to use when working with symbols. If you were using IMAGEHLP as part of a real debugger, you'd want to pass a valid process handle. This allows IMAGEHLP to enumerate through all the loaded modules in a process address space and load the associated symbol tables. You can turn off this automatic module enumeration by passing FALSE as the third parameter to SymInitialize. If you're not a debugger process (EZPE isn't), you can pass whatever value you'd like as the process handle. Just remember to pass the same value to subsequent symbol APIs that expect a process handle. In the case of EZPE, I used the value zero through a #define called MY_PROCESS_HANDLE.

(As a side note to the automatic module enumeration I referred to, the Windows NT 4.0-supplied IMAGEHLP won't do this under Windows 95. The module enumeration APIs under Windows NT are different from those in Windows 95. However, these differences are slated to be resolved in a subsequent release.)

The second parameter to SymInitialize is the symbol search path. If you pass a valid string pointer in the form of a path (that is, directories separated by semicolons), IMAGEHLP searches those directories when looking for a symbol table that's in a different file than the executable. Passing zero causes IMAGEHLP to use three environment variables as the path: _NT_SYMBOL_PATH, _NT_ALTERNATE_SYMBOL_PATH, and SYSTEMROOT.
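If you'd rather pass an explicit path than rely on the default, a semicolon-separated string can be assembled from the same three variables. This is only a sketch of the idea: JoinSearchPath and DefaultSymbolSearchPath are hypothetical helpers, and the order and joining shown here are assumptions rather than a description of what IMAGEHLP does internally.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>

// Joins the non-empty entries with semicolons, the separator used for
// symbol search paths. Null or empty entries are skipped.
std::string JoinSearchPath(const char* const* parts, std::size_t count)
{
    std::string path;
    for (std::size_t i = 0; i < count; ++i) {
        if (parts[i] == nullptr || parts[i][0] == '\0')
            continue;
        if (!path.empty())
            path += ';';
        path += parts[i];
    }
    return path;
}

// Builds a search path from the three environment variables named in the
// text. Variables that aren't set are simply left out.
std::string DefaultSymbolSearchPath()
{
    const char* parts[] = {
        std::getenv("_NT_SYMBOL_PATH"),
        std::getenv("_NT_ALTERNATE_SYMBOL_PATH"),
        std::getenv("SYSTEMROOT"),
    };
    return JoinSearchPath(parts, sizeof(parts) / sizeof(parts[0]));
}
```

The string returned by DefaultSymbolSearchPath could then be combined with other directories, such as the one containing the executable, before being handed to SymInitialize as its second parameter.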

After you've called SymInitialize, the next step is to load the symbol tables you're interested in (that is, assuming you didn't pass a valid process handle to SymInitialize so that it enumerated and loaded all the symbol tables automatically). EZPE doesn't do this, so it's necessary to manually load the symbol table for the executable file that it's working with. The API that manually loads a symbol table is SymLoadModule. Not surprisingly, there's a SymUnloadModule to use when you're done with a given symbol table.

Although the SymLoadModule API takes six parameters, only three are required for a simple program like EZPE. The first parameter is the process handle value that was passed to SymInitialize earlier in the program. Parameter three is the name of the executable file whose symbols are to be loaded. Parameter five is the address where the executable is mapped into memory. As I alluded to earlier, this value can be obtained easily by calling MapAndLoad or MapDebugInformation. Assuming all goes well, SymLoadModule returns TRUE.

After loading a symbol table, there are a variety of actions available. For example, in my April 1997 column I used the SymGetSymFromAddr function to take an address and find the name of the nearest symbol. The end result was a stack trace containing symbolic function names. If I were writing a debugger, I could use the SymGetSymFromName API to look up the address of a function or variable name that the user requested.

With EZPE, the first action after loading a symbol table is to find out more about what was just loaded. This can be done with the SymGetModuleInfo API. The first parameter is the process handle value used with the other symbol APIs. The second parameter needs to be an address somewhere within the module that the symbol table belongs to. In the EZPE code, the easiest thing to use is the base address to which the executable was memory mapped. The third parameter to SymGetModuleInfo is a pointer to an IMAGEHLP_MODULE structure that the API fills with information about the module and its symbol table.

The first group of fields in an IMAGEHLP_MODULE structure is standard stuff that you can get in ways that I described earlier. More interesting are the SymType and NumSyms fields. The SymType field contains an enum that indicates what type of symbol table was loaded (for example, SymCoff, SymCv, SymPdb, or SymExport).

The SymExport type is worth a mention. Exports aren't formally considered to be debug information. However, the information stored for an exported function (its name and address) is the bare minimum required for inclusion in a symbol table. Therefore, IMAGEHLP can synthesize a symbol table out of an executable's exports. The upshot is that any executable that exports symbols can be considered to have at least a minimal symbol table available. (By the way, if you're a SoftICE user, the Load EXPORTS capability works along the same lines.)

Another, more useful action you can take with a loaded symbol table is to enumerate through all the symbols. For this purpose, IMAGEHLP has the SymEnumerateSymbols API. The first parameter is the process handle value used with the other symbol APIs. Parameter two is the base address of the executable whose symbols you're interested in. The third parameter is the address of a callback function that will be called once for each symbol in the symbol table. The fourth parameter can be whatever you'd like. It's passed on to the callback function, unmodified. If I didn't want to use the SymEnumerateSymbols API, I could use the SymGetSymNext API in a loop instead. Both APIs have their strengths and weaknesses, so I just picked one arbitrarily for EZPE to use.
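The callback-plus-context pattern is easy to model outside of IMAGEHLP. The sketch below is a portable stand-in, not the real API: EnumerateSymbols, SymbolEntry, RvaCollector, and CollectRva are all hypothetical names. It shows how an opaque context pointer lets the callback turn each symbol address into an RVA, the same trick EZPE uses by passing the mapped base address as the fourth parameter.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical symbol record and callback type, modeled on the shape of the
// SymEnumerateSymbols callback (name, address, user context).
struct SymbolEntry { const char* name; std::uint32_t address; };
using EnumCallback = bool (*)(const char* name, std::uint32_t address,
                              void* userContext);

// Stand-in enumerator: calls the callback once per symbol, passing the
// context pointer through untouched. Stops early if the callback returns false.
bool EnumerateSymbols(const SymbolEntry* symbols, std::size_t count,
                      EnumCallback callback, void* userContext)
{
    for (std::size_t i = 0; i < count; ++i) {
        if (!callback(symbols[i].name, symbols[i].address, userContext))
            return false;
    }
    return true;
}

// Context carried through the enumeration: the mapped base address plus a
// place to collect the formatted output lines.
struct RvaCollector {
    std::uint32_t mappedBase;
    std::vector<std::string> lines;
};

// Callback: recovers the context, converts the address to an RVA, and
// records one "RVA name" line.
bool CollectRva(const char* name, std::uint32_t address, void* userContext)
{
    RvaCollector* collector = static_cast<RvaCollector*>(userContext);
    char rva[16];
    std::snprintf(rva, sizeof(rva), "%08X",
                  static_cast<unsigned>(address - collector->mappedBase));
    collector->lines.push_back(std::string(rva) + " " + name);
    return true;    // Keep enumerating
}
```

Because the context is an ordinary pointer, the callback needs no global variables; anything from a single base address to a whole collector object can ride along.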

The EZPE Code

Now let's look at the EZPE program and its code. EZPE is a command-line program that accepts arguments. The source file EZPE.CPP is shown in Figure 1. You can see its usage by running EZPE with no arguments:

Syntax: EZPE [options] <filename>
 -d Decorated C++ names
 -n No symbol display

In the simplest case, you'd give EZPE the name of an executable file to display. EZPE writes to stdout, so its output can be redirected to a file. For example:

EZPE C:\WINNT\SYSTEM32\KERNEL32.DLL > results

Figure 2 shows the results of running EZPE on its own EXE. The -n option tells EZPE to not bother loading and displaying the symbols. If you were to use the -n option, everything after the "==== IMAGE_DEBUG_INFORMATION ====" line in Figure 2 would be omitted from the program output.
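The option handling described by the syntax help can be sketched in a few lines. This is not the ParseCommandLine function from EZPE.CPP. The EzpeOptions structure and ParseArgs function are hypothetical, and treating an unknown option as an error is my assumption.

```cpp
#include <string>
#include <vector>

// Hypothetical option holder for the two documented switches.
struct EzpeOptions {
    bool showSymbols = true;      // cleared by -n
    bool decoratedNames = false;  // set by -d
    std::string fileName;         // the executable to display
    bool valid = false;           // false means "print the syntax help"
};

// Parse the arguments that follow the program name.
EzpeOptions ParseArgs(const std::vector<std::string>& args)
{
    EzpeOptions opts;
    for (const std::string& arg : args) {
        if (arg == "-d")
            opts.decoratedNames = true;
        else if (arg == "-n")
            opts.showSymbols = false;
        else if (!arg.empty() && arg[0] == '-')
            return EzpeOptions();   // unknown option: report as invalid
        else
            opts.fileName = arg;    // anything else is taken as the file name
    }
    opts.valid = !opts.fileName.empty();
    return opts;
}
```

With no arguments at all, the result is marked invalid, which matches the behavior of printing the syntax help when EZPE is run with no arguments.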

Figure 2 EZPE Output

ezpe.mak

Ezpe.out

Display of file EZPE.EXE

==== LOADED_IMAGE ====

ModuleName EZPE.EXE

hFile FFFFFFFF

MappedAddress 008B0000

FileHeader 008B0080

LastRvaSection 008B0178

NumberOfSections 00000004

Sections 008B0178

Characteristics 0000010B

fSystemImage 00000000

fDOSImage 00000000

Links 0012FFA4

SizeOfImage 00003E00

==== PE File Headers (LOADED_IMAGE.FileHeader) ====

Machine 014C

NumberOfSections 0004

TimeDateStamp 336CF53B

PointerToSymbolTable 00000000

NumberOfSymbols 00000000

SizeOfOptionalHeader 00E0

Characteristics 010B

Magic 010B

LinkerVersion 5.00

SizeOfCode 00002200

SizeOfInitializedData 00001800

SizeOfUninitializedData 00000000

AddressOfEntryPoint 00001E59

BaseOfCode 00001000

BaseOfData 00004000

ImageBase 00400000

SectionAlignment 00001000

FileAlignment 00000200

OperatingSystemVersion 4.00

ImageVersion 0.00

SubsystemVersion 4.00

Win32VersionValue 00000000

SizeOfImage 00007000

SizeOfHeaders 00000400

CheckSum 00000000

Subsystem 0003

DllCharacteristics 0000

SizeOfStackReserve 00100000

SizeOfStackCommit 00001000

SizeOfHeapReserve 00100000

SizeOfHeapCommit 00001000

LoaderFlags 00000000

NumberOfRvaAndSizes 00000010

IMAGE_DIRECTORY_ENTRY_EXPORT Address: 00000000 Size: 00000000

IMAGE_DIRECTORY_ENTRY_IMPORT Address: 00006000 Size: 00000050

IMAGE_DIRECTORY_ENTRY_RESOURCE Address: 00000000 Size: 00000000

IMAGE_DIRECTORY_ENTRY_EXCEPTION Address: 00000000 Size: 00000000

IMAGE_DIRECTORY_ENTRY_SECURITY Address: 00000000 Size: 00000000

IMAGE_DIRECTORY_ENTRY_BASERELOC Address: 00000000 Size: 00000000

IMAGE_DIRECTORY_ENTRY_DEBUG Address: 00004000 Size: 00000054

IMAGE_DIRECTORY_ENTRY_COPYRIGHT Address: 00000000 Size: 00000000

IMAGE_DIRECTORY_ENTRY_GLOBALPTR Address: 00000000 Size: 00000000

IMAGE_DIRECTORY_ENTRY_TLS Address: 00000000 Size: 00000000

IMAGE_DIRECTORY_ENTRY_LOAD_CONFIG Address: 00000000 Size: 00000000

IMAGE_DIRECTORY_ENTRY_BOUND_IMPORT Address: 00000000 Size: 00000000

IMAGE_DIRECTORY_ENTRY_IAT Address: 0000613C Size: 000000EC

==== Section Headers (LOADED_IMAGE.Sections) ====

# Name Address VirtSize RawSize

-- -------- -------- -------- --------

1 .text 00001000 0000206F 00002200

2 .rdata 00004000 00000164 00000200

3 .data 00005000 00000E53 00000c00

4 .idata 00006000 000004FC 00000600

==== IMAGE_DEBUG_INFORMATION ====

Size 0000012C

MappedBase 008B0000

Machine 014C

Characteristics 010B

CheckSum 00000000

ImageBase 00400000

SizeOfImage 00007000

NumberOfSections 00000004

Sections 008B0178

ExportedNamesSize 00000000

ExportedNames 00000000

NumberOfFunctionTableEntries 00000000

FunctionTableEntries 00000000

LowestFunctionStartingAddress 00000000

HighestFunctionEndingAddress 00000000

NumberOfFpoTableEntries 00000005

FpoTableEntries 008AE72C

SizeOfCoffSymbols 00000000

CoffSymbols 00000000

SizeOfCodeViewSymbols 00000029

CodeViewSymbols 008AE780

ImageFilePath E:\column\col47\EZPE.EXE

ImageFileName EZPE.EXE

DebugFilePath E:\column\col47\EZPE.EXE

TimeDateStamp 336CF53B

RomImage 00000000

DebugDirectory 008B2600

NumberOfDebugDirectories 00000003

==== IMAGEHLP_MODULE ====

CheckSum 00000000

NumSyms 00000000

SymType 00000003

SymPdb

ModuleName EZPE

ImageName EZPE.EXE

LoadedImageName E:\column\col47\EZPE.EXE

==== Symbols ====

RVA Name

-------- ----

000061BC _imp__ExitProcess

00001834 ShowImageFileHeaders

0000616C _imp__UnDecorateSymbolName

00001F92 UnmapDebugInformation

00001FEC GetStdHandle

00001F8C MapAndLoad

00006014 _IMPORT_DESCRIPTOR_IMAGEHLP

00001E14 printf

000061A8 _imp__lstrlenA

00001F80 wvsprintfA

00001D5F ParseCommandLine

00001E76 _ConvertCommandLineToArgcArgv

00005A78 g_szHelp

000061F8 _imp__wvsprintfA

0000614C _imp__UnmapDebugInformation

00001FE0 GetFullPathNameA

00006154 _imp__SymUnloadModule

000061C8 KERNEL32_NULL_THUNK_DATA

00006148 _imp__MapAndLoad

00001FDA lstrcpyA

000061B0 _imp__GetCommandLineA

00001F86 UnMapAndLoad

00001FA4 SymEnumerateSymbols

00005BE0 _ppszArgv

00005A70 g_fShowSymbols

00001FAA SymGetModuleInfo

00001292 ShowSymbols

0000613C _imp__MapDebugInformation

000061FC USER32_NULL_THUNK_DATA

00001D4A StartNewDisplaySection

00001046 main

00001FBC SymGetOptions

0000615C _imp__SymGetModuleInfo

00006140 _imp__SymInitialize

00006144 _imp__UnMapAndLoad

00002004 GetCommandLineA

000061C4 _imp__GetProcessHeap

000061A4 _imp__GetFullPathNameA

00006170 IMAGEHLP_NULL_THUNK_DATA

00006164 _imp__SymSetOptions

00005BC0 g_fDecoratedNames

00006160 _imp__SymLoadModule

00001FC8 MapDebugInformation

00001F9E SymUnloadModule

00006000 _IMPORT_DESCRIPTOR_USER32

00001FF8 HeapAlloc

00006158 _imp__SymEnumerateSymbols

00001F98 SymCleanup

00001FE6 WriteFile

000061AC _imp__lstrcpyA

00001767 EnumSymbolsCallback

00006150 _imp__SymCleanup

000061B8 _imp__GetStdHandle

00001FD4 lstrlenA

00001FB6 SymSetOptions

00006168 _imp__SymGetOptions

00001FFE GetProcessHeap

000061B4 _imp__WriteFile

00006028 _IMPORT_DESCRIPTOR_KERNEL32

00001E59 mainCRTStartup

00001FF2 ExitProcess

000061C0 _imp__HeapAlloc

00001FCE UnDecorateSymbolName

00001CEE ShowSectionHeaders

00001FC2 SymInitialize

00001FB0 SymLoadModule

0000603C _NULL_IMPORT_DESCRIPTOR

The -d option tells EZPE to display the decorated (mangled) names of any C++ symbols in the symbol table. By default, when SymLoadModule creates the symbol table, it undecorates any C++ symbols into human-readable form. The undecorated name consists solely of the class name and member function name, such as foo::bar. This is the default output mode that EZPE uses. The -d option tells EZPE to emit the raw, decorated names instead, for example ?ParseCommandLine@@YAHHQAPADPAD1@Z.

While I was writing EZPE, it occurred to me that the default undecoration strips out lots of potentially useful information such as the parameters, calling convention, return type, and so forth. Therefore, when using the -d option, EZPE displays the decorated name as well as an undecorated version that contains much more information about the symbol.

Coming up with a better undecorated version of a symbol name turned out to be a bit of a challenge. The first requirement was to force SymLoadModule to leave the symbol names alone when loading the symbol table. Luckily, there's another IMAGEHLP API that makes this easy: the SymSetOptions API takes a flag called SYMOPT_UNDNAME, which is on by default. Because I wanted to change only that option and leave the others alone, the code calls SymGetOptions to get the current options. It then turns off the SYMOPT_UNDNAME flag and calls SymSetOptions with the result.
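The read-modify-write step on the options value is plain bit manipulation. The sketch below shows it without the Windows headers. The constant kSymoptUndname is a stand-in for SYMOPT_UNDNAME (I'm assuming the value 0x00000002 from imagehlp.h), and the two helper functions are hypothetical.

```cpp
#include <cstdint>

// Stand-in for SYMOPT_UNDNAME; the value 0x00000002 is an assumption.
constexpr std::uint32_t kSymoptUndname = 0x00000002;

// Turn the flag off and leave every other option bit untouched. With the
// flag off, symbol names are left in their decorated form.
constexpr std::uint32_t WithoutUndname(std::uint32_t currentOptions)
{
    return currentOptions & ~kSymoptUndname;
}

// Turn the flag on and leave every other option bit untouched.
constexpr std::uint32_t WithUndname(std::uint32_t currentOptions)
{
    return currentOptions | kSymoptUndname;
}
```

In a real program, the input would be the value returned by SymGetOptions, and the result would be handed to SymSetOptions before calling SymLoadModule.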

The remaining work of displaying a better undecorated symbol name is to call yet another IMAGEHLP API, UnDecorateSymbolName, for any name that appears to be decorated (decorated names begin with a "?"). UnDecorateSymbolName takes a flags parameter with a whole slew of options that tell it what parts of an undecorated name to include or not include. The EZPE code uses the set of options that should produce the most information in the undecorated name.

When I tested EZPE, UnDecorateSymbolName failed on certain symbol names. A little investigation proved that some symbols had garbage characters at the end of their names, whether the name was normal or decorated. Apparently, IMAGEHLP leaves garbage at the end of certain symbol names when operating with the SYMOPT_UNDNAME option disabled. For normal names, I didn't go to the trouble of trying to strip off the garbage characters. However, I did notice that most C++ symbol names end with a capital Z. In the EnumSymbolsCallback function from EZPE.CPP, you'll find that the code works backwards from the end of C++ symbol names, stripping off characters until it encounters a Z. Not pretty, but it seems to work OK.
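The backwards scan for a capital Z is easy to sketch on its own. This is an illustration of the heuristic, not the code from EnumSymbolsCallback. LooksDecorated and TrimToLastZ are hypothetical names, and leaving a name untouched when it contains no Z at all is my choice.

```cpp
#include <string>

// Decorated C++ names begin with '?'.
bool LooksDecorated(const std::string& name)
{
    return !name.empty() && name[0] == '?';
}

// Drop trailing characters until the name ends in a capital 'Z'. If the
// name contains no 'Z', return it unchanged rather than emptying it.
std::string TrimToLastZ(const std::string& name)
{
    std::string::size_type pos = name.rfind('Z');
    if (pos == std::string::npos)
        return name;
    return name.substr(0, pos + 1);
}
```

The heuristic has an obvious limit: if the trailing garbage itself happens to contain a Z, the scan stops too early and some garbage survives.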

Another interesting thing about the EZPE EnumSymbolsCallback function concerns the second parameter, the symbol address. It turns out that when IMAGEHLP calls the function, the symbol address it passes is a linear address and is connected to where the executable was mapped into memory. For a debugger operating on a live process, this is just fine. However, in a symbol display program, it's worthless. The executable could be mapped nearly anywhere.

To resolve this situation, I made EZPE emit the RVA of the symbol rather than the value IMAGEHLP passes to the callback function. (An RVA is independent of the executable's mapped address, and just makes more sense since PE files themselves store all addresses as RVAs.) To calculate the RVA of each symbol, the EnumSymbolsCallback has to know where the executable is mapped into memory. Luckily, SymEnumerateSymbols has a parameter that it passes on, unmodified, to the enumeration callback function. EZPE uses this parameter to convey the executable's mapped address to the enumeration callback function. In the callback, the code subtracts this value from the symbol address to obtain the symbol's RVA. You'll see this in the portion of Figure 2 that begins with the header "==== Symbols ====". In particular, note that the addresses for the symbols are relatively small and fall within the RVAs listed for the various PE file sections.
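Here is a minimal sketch of the pass-the-base-through-the-context trick. So that it runs without IMAGEHLP, it uses a hypothetical FakeEnumerate function as a stand-in for SymEnumerateSymbols. The Symbol and EnumContext types and the CollectRva callback are also hypothetical, and the callback is only loosely shaped like the real one (name, address, size, user context).

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical name/address pair. On input the address is a linear
// address; in the collected results it holds the RVA.
struct Symbol {
    std::string name;
    std::uint32_t address;
};

// Hypothetical context handed through the enumerator's user parameter.
struct EnumContext {
    std::uint32_t mappedBase;     // where the executable was mapped
    std::vector<Symbol> results;  // filled in by the callback
};

// Callback: converts the linear address into an RVA by subtracting the
// mapped base carried in the user context.
bool CollectRva(const char* name, std::uint32_t address,
                std::uint32_t /*size*/, void* userContext)
{
    EnumContext* ctx = static_cast<EnumContext*>(userContext);
    ctx->results.push_back({name, address - ctx->mappedBase});
    return true;  // keep enumerating
}

typedef bool (*EnumCallback)(const char*, std::uint32_t, std::uint32_t, void*);

// Stand-in for SymEnumerateSymbols: calls the callback once per symbol and
// passes the user context along unmodified.
void FakeEnumerate(const std::vector<Symbol>& linearSymbols,
                   EnumCallback callback, void* userContext)
{
    for (const Symbol& s : linearSymbols) {
        if (!callback(s.name.c_str(), s.address, 0, userContext))
            break;
    }
}
```

Using the numbers from Figure 2, a mapped base of 008B0000 and a linear address of 008B1046 yield the RVA 00001046 shown for main.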

As a final wrap-up, let me first apologize for the macro madness at the beginning of EZPE.CPP (for example, the DisplayPtrFieldD macro). When I was writing EZPE, I knew that it would display many fields from numerous structures. I wanted these fields to be formatted in a nice, consistent manner. If I had used printf directly, I would need to modify each printf individually if I wanted to change any output formatting. Making EZPE into a GUI app would have been even more of a pain. By using nested macros and the preprocessor stringize feature, I was able to isolate all the details of how the structure fields should be displayed into one location.
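For readers who haven't used the stringize operator, here is a small sketch of the idea. It is not one of the EZPE macros: FIELD_LINE and FormatField are hypothetical names, and the column width is arbitrary. The point is that the field name is typed once and the layout lives in a single function.

```cpp
#include <cstdio>
#include <string>

// The one place that decides how a field line is laid out: the name padded
// to 24 columns, a space, then the value as eight hex digits.
std::string FormatField(const char* fieldName, unsigned long value)
{
    char line[128];
    std::snprintf(line, sizeof(line), "%-24s %08lX", fieldName, value);
    return line;
}

// Hypothetical macro: #field turns the member name into a string literal,
// so the same token supplies both the label and the member access.
#define FIELD_LINE(ptr, field) \
    FormatField(#field, static_cast<unsigned long>((ptr)->field))
```

Changing the format string in FormatField changes every line of output at once, and replacing its body is the only work needed to send the text somewhere other than the console.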

If you're ambitious and want to extend or customize EZPE, there are a number of things you can do. For example, you could remove the display of the various IMAGEHLP-specific data structures. I included them to show what sort of information IMAGEHLP gives you. The resulting output would be smaller and would include only information from the executable and symbol tables. Another nice feature would be to decode the various fields containing flags, such as the Characteristics field in the PE header, or the PE section attributes. Even with this extra code, you'd have a very compact program, which is a testament to the power that IMAGEHLP.DLL provides.
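As a starting point for the flag-decoding idea, here is a rough sketch for the PE header's Characteristics field. It covers only a subset of the IMAGE_FILE_* bits from winnt.h, redeclared locally so the code builds without the Windows headers. The FlagName table and the DecodeCharacteristics function are hypothetical.

```cpp
#include <cstdint>
#include <string>

// Subset of the IMAGE_FILE_* characteristic bits from winnt.h, redeclared
// here so the sketch does not depend on the Windows headers.
struct FlagName {
    std::uint16_t bit;
    const char* name;
};

const FlagName kFileFlags[] = {
    {0x0001, "RELOCS_STRIPPED"},
    {0x0002, "EXECUTABLE_IMAGE"},
    {0x0004, "LINE_NUMS_STRIPPED"},
    {0x0008, "LOCAL_SYMS_STRIPPED"},
    {0x0100, "32BIT_MACHINE"},
    {0x0200, "DEBUG_STRIPPED"},
    {0x1000, "SYSTEM"},
    {0x2000, "DLL"},
};

// Build a space-separated list of the flag names set in the value.
std::string DecodeCharacteristics(std::uint16_t value)
{
    std::string out;
    for (const FlagName& f : kFileFlags) {
        if (value & f.bit) {
            if (!out.empty())
                out += " ";
            out += f.name;
        }
    }
    return out;
}
```

For the value 010B shown in Figure 2, this yields RELOCS_STRIPPED EXECUTABLE_IMAGE LOCAL_SYMS_STRIPPED 32BIT_MACHINE. The section attributes could be handled the same way with a second table.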

To obtain complete source code listings, see page 5.

Have a question about programming in Windows? Send it to Matt at mpietrek@tiac.com

© 1997 Microsoft Corporation. All rights reserved.