Code for this article: Hood0297.exe (21KB)
Matt Pietrek is the author of Windows 95 System Programming Secrets (IDG Books, 1995). He works at NuMega Technologies Inc., and can be reached at email@example.com.
In my October 1996 Liposuction article, I created a set of classes for accessing information in executable files. The article also used a class I called a MODULE_DEPENDENCY_LIST, which attempted to find all DLLs that were imported (either directly or indirectly) by a Win32® portable executable (PE). To keep that article on track with the subject matter, I didn't do full justice in describing these classes and their capabilities.
This month, I'll revisit these classes (which I've spiffed up), and create a new utility from them. I must confess to having an ulterior motive in doing this work. I've recently been bitten by several problems relating to DLLs and versioning. One problem involves multiple versions of a DLL in different directories. This inevitably leads to the wrong DLL being used, which can be a hair-pulling experience when trying to figure out why your program doesn't work.
Another problem occurs when the system complains that a DLL can't be found, and I can't determine who referenced it. Figure 1 shows the enormously helpful message you receive from Windows NT® 4.0 when a DLL isn't found. It sure would be nice to know who needs "ONION32.DLL."
Figure 1 Windows NT can't find a DLL
Yet a third problem is that a coworker of mine needs to know exactly when the executables in a particular beta build were actually created. It's common practice to mark all the executables with the same time and date when a product is released. This makes it difficult to determine which version of a file was actually shipped. It turns out that, in Win32 executables, you can (usually) determine exactly when the file was linked. This isn't a trivial task, however, and by the time I had finished writing my program, I knew more about dates and times in Win32 than I ever cared to know. Would you believe that there are four different ways that dates and times are expressed in various parts of Win32?
The DEPENDS Program
The program that I put together to help solve these problems is called DEPENDS (insert your own joke here). I wrote DEPENDS as a console-mode application so it's easy to collect the results into a file, and it's also easy to use it in automatic build processes. The syntax for using DEPENDS can be gleaned by running it without any arguments:
DEPENDS - Matt Pietrek, 1997, for MSJ
Syntax: DEPENDS [args] <executable filename>
/v show version information
/t show time & date information
/p show full path
/l show link time & date information
If you run DEPENDS with just the name of an executable file, you'll get a list of all DLLs used by the executable, along with the EXE's name. For example, running DEPENDS on MSDEV.EXE from Visual C++® gives the output shown in Figure 2. Of the 19 executables that are required to run MSDEV.EXE (18 DLLs and one EXE), only four of the DLLs are referenced directly: MSVCSHL.DLL, MFC40.DLL, MSVCRT40.DLL, and KERNEL32.DLL. The remaining DLLs are indirectly imported; that is, they're imported by one of the four DLLs used by MSDEV. Alternatively, the DLLs may be another level away and are imported by one of the DLLs imported by MSDEV's four DLLs. The DEPENDS program uses recursion to show you all of an executable's dependencies, much like the Win32 loader does when it loads the program. I'll have more to say on this later.
The four command-line switches to DEPENDS let you tailor the output to your liking. You can use either - or / as the switch character, and the options are not case-sensitive (/V is equivalent to -v).
The /v switch causes DEPENDS to emit any version information that it finds in an executable file. Figure 3 shows an example of the /v switch used on CLOCK.EXE.
The /p switch tells the program to append the complete path to each EXE or DLL in the dependency list. For example, this command line lists CLOCK.EXE's dependencies along with their full paths:
DEPENDS /p c:\WINNT\SYSTEM32\CLOCK.EXE
This capability lets you see exactly which DLLs are being used by a program. If you suspect a DLL mismatch (for instance, a DLL in multiple directories in the path), the /p switch can be helpful in tracking down the problem.
The remaining two command-line switches emit the time and date of each EXE or DLL in the dependency list. Using /t forces DEPENDS to emit the date and time of the file as recorded by the file system. This time and date information is what you'll see in the Explorer or by doing a DIR from the command line.
The other date and time information that DEPENDS can show is when the executable was created. This information is stored in the PE header and doesn't change even if you explicitly modify the traditional date and time by using tools like TOUCH. To see this information, use DEPENDS with the /l switch.
How I obtained this information is an interesting programming story, which I'll come back to later. As a side note, I was quite surprised when I ran DEPENDS on some Windows NT 4.0 executables. It seems that USER32.DLL, KERNEL32.DLL, and NTDLL.DLL were all created on different days, and those dates were about two weeks before the formal release date of 08/09/96 that the Windows Explorer shows. Give it a try and see for yourself!
The MODULE_DEPENDENCY_LIST Class
The heart of the DEPENDS code is the MODULE_DEPENDENCY_LIST class, implemented in DependencyList.h and DependencyList.cpp (see Figure 4). The constructor for this class takes one argument, the name of the executable to be searched for dependencies. When the constructor returns, the dependency list has been generated. There are querying methods to retrieve information from the list.
What if there's an error and a dependency list isn't generated? For instance, what if a nonexistent file name is passed to the constructor? The IsValid method returns a BOOL indicating if a dependency list was successfully created. If there was a problem, you can ascertain the reason via the GetErrorType method, which returns an enum indicating the cause of the problem. The possible problems are a file that doesn't exist, a file that's not a Win32 PE file, and "general." The last is a catchall for problems such as memory allocation failures. You can also get a descriptive string for the problem by calling the GetErrorString method.
If the dependency list was created successfully, there are two methods for finding out about a particular module in the list. The LookupModule method takes either the base file name or the full path to a module and returns information about the module, if found. The GetNextModule method is for iterating through each module in succession. To start the enumeration, pass in zero as the parameter. All subsequent calls should pass a pointer to the information returned by the previous call to GetNextModule.
The most interesting code in MODULE_DEPENDENCY_LIST occurs during the constructor call. This code takes the file name parameter and prepares to add it as the first entry in the dependency list. Next, the constructor saves the current directory away and switches to the directory where the file is located. This mimics the behavior of the operating system, which treats the executable's directory as implicitly part of the search path. After creating the entire dependency list, the constructor switches the current directory back to its original value. If you decide to use the MODULE_DEPENDENCY_LIST code in your own programs, be aware that this directory switching makes the class thread-unsafe. Remember, the current working directory is effectively global data for a program.
The workhorse of the MODULE_DEPENDENCY_LIST class is the private AddModule method, invoked from the class's constructor. AddModule takes a file name as a parameter and adds the file's information to the dependency list. AddModule next scans through the file's import table and looks for other files that aren't already in the dependency list. If AddModule finds such a module, it calls itself again, this time with the name of the imported module. This recursiveness is similar to what the Win32 loader does when it verifies that all required modules are present before starting a process. By the time the first call to AddModule returns, the entire dependency tree has been recursively searched and built.
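The recursion described above can be sketched in portable C++. This is only an illustration, not the DEPENDS source: the ImportTable map is a hypothetical stand-in for reading each file's import table, and the names AddModuleSketch and MissingImport are made up for this example. A module that is absent from the map plays the role of a DLL that can't be located.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the import tables: module name -> imported names.
using ImportTable = std::map<std::string, std::vector<std::string>>;

// (importer, missing module) pairs, so you can see who referenced what.
using MissingImport = std::pair<std::string, std::string>;

// Adds 'name' to 'list', then recurses into each import not already listed.
void AddModuleSketch(const ImportTable& imports,
                     const std::string& name,
                     std::vector<std::string>& list,
                     std::vector<MissingImport>& notFound)
{
    auto entry = imports.find(name);
    assert(entry != imports.end());   // callers only pass modules that exist

    list.push_back(name);             // record before recursing, so cycles end
    for (const std::string& dep : entry->second) {
        if (std::find(list.begin(), list.end(), dep) != list.end())
            continue;                 // already in the dependency list
        if (imports.count(dep) == 0) {
            notFound.push_back(MissingImport(name, dep));
            continue;                 // cannot recurse into a missing module
        }
        AddModuleSketch(imports, dep, list, notFound);
    }
}
```

Because a module is recorded before its imports are visited, two DLLs that import each other don't cause infinite recursion. The linear std::find keeps the sketch short; a set would be the obvious choice for long lists.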
Another way that the AddModule method imitates the system's behavior is in how it finds the complete path to imported DLLs. In the import table of a module, only the base name of the imported DLL appears (for example, "ONION32.DLL"). The system takes that base file name and searches the path for a file with that name. I didn't want to write my own path-searching code and, luckily, I didn't have to; the Win32 SearchPath API does exactly what I need.
With the AddModule method behind me, let's now return to the subject of extracting information about the dependency list. Both the LookupModule and GetNextModule methods of the MODULE_DEPENDENCY_LIST class return a pointer to a class known as MODULE_FILE_INFO. A MODULE_FILE_INFO class describes exactly one module in the dependency list, and is implemented in ModuleFileInfo.H and ModuleFileInfo.CPP. The primary public methods are GetBaseName and GetFullName, which return the base file name and full path to the module, respectively.
One slick new addition to the MODULE_DEPENDENCY_LIST code (relative to the version of this code from my Liposuction article) is the "not found" list. Each MODULE_FILE_INFO class contains a list of imported modules that the MODULE_DEPENDENCY_LIST::AddModule method was unable to locate. To enumerate this list, use the GetNextNotFoundModule method, which returns a pointer to a MODULE_FILE_INFO describing the unlocatable module. To start enumerating the unfound modules, pass in a NULL pointer. In subsequent calls, pass the previously returned MODULE_FILE_INFO pointer. I'll demonstrate this method later on.
The PE_EXE Class
While much of the action of DEPENDS occurs in MODULE_DEPENDENCY_LIST, this class relies heavily on the underlying PE_EXE class shown in Figure 5. The PE_EXE class is itself derived from the EXE_FILE class (see Figure 6), which is derived from the MEMORY_MAPPED_FILE class. Figure 7 shows the hierarchy. Let's start at the lowest level, and look at each successive class briefly.
The base class for the hierarchy is the MEMORY_MAPPED_FILE class. It simply provides a wrapper around the APIs necessary to use memory-mapped files: CreateFile, CreateFileMapping, and MapViewOfFile. The destructor for the class automatically undoes everything to clean up properly.
Figure 7 Class Hierarchy
After the MEMORY_MAPPED_FILE constructor returns, you can check that everything went OK by calling the IsValid method. For more detailed information in the event of an error, call the GetErrorType method. If everything succeeded, the GetBase method returns a pointer to the beginning of the mapped region.
Up a level in the hierarchy is the EXE_FILE class, which is derived from the MEMORY_MAPPED_FILE class. This is because an EXE_FILE is just a special case of a regular file. The EXE_FILE constructor also takes a file name as its only parameter, and passes the file name on to the MEMORY_MAPPED_FILE constructor. The guts of the EXE_FILE constructor check to make sure that the file is (at a minimum) an MS-DOS® MZ executable. Code using the EXE_FILE class can use the IsValid function to ensure that the specified file really is an executable.
If the file begins with an MS-DOS MZ executable, the executable may be just an MS-DOS stub for a newer type of executable. The file might really be a 16-bit Windows executable (NE), a Win32 executable (PE), an OS/2 executable (LX), or a VxD (LE). The EXE_FILE constructor examines the file and tries to determine what type of executable it is. The EXE_FILE::GetExeType method returns an enum indicating the kind of executable it is. All of the more modern executables contain a secondary header, so the EXE_FILE class also has the GetSecondaryHeaderOffset method, which does just what its name implies.
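To make the header check concrete, here is a rough sketch of how an executable might be classified from its raw bytes. It isn't the EXE_FILE code; ClassifyExe, ExeType, and ReadLE are hypothetical names, and the sketch assumes the usual layout in which the DWORD at offset 0x3C of the MZ header (e_lfanew) gives the offset of the secondary header. For simplicity it rejects files shorter than 0x40 bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

enum class ExeType { NotExe, DosOnly, NE, PE, LX, LE };

// Reads a little-endian value of 'bytes' bytes starting at offset 'off'.
static std::uint32_t ReadLE(const std::vector<std::uint8_t>& f,
                            std::size_t off, std::size_t bytes)
{
    assert(off + bytes <= f.size());
    std::uint32_t v = 0;
    for (std::size_t i = 0; i < bytes; ++i)
        v |= std::uint32_t(f[off + i]) << (8 * i);
    return v;
}

// Classifies the file; on success with a newer format, stores the offset
// of the secondary header through secondaryOffset (which may be null).
ExeType ClassifyExe(const std::vector<std::uint8_t>& file,
                    std::uint32_t* secondaryOffset)
{
    const std::size_t kDosHeaderSize = 0x40;          // size of the MZ header
    if (file.size() < kDosHeaderSize || ReadLE(file, 0, 2) != 0x5A4D)  // "MZ"
        return ExeType::NotExe;

    std::uint32_t off = ReadLE(file, 0x3C, 4);        // e_lfanew
    if (off > file.size() - 4)
        return ExeType::DosOnly;                      // no room for a signature

    ExeType type = ExeType::DosOnly;
    if (ReadLE(file, off, 4) == 0x00004550) {         // "PE\0\0"
        type = ExeType::PE;
    } else {
        switch (ReadLE(file, off, 2)) {
            case 0x454E: type = ExeType::NE; break;   // "NE"
            case 0x584C: type = ExeType::LX; break;   // "LX"
            case 0x454C: type = ExeType::LE; break;   // "LE"
        }
    }
    if (type != ExeType::DosOnly && secondaryOffset)
        *secondaryOffset = off;
    return type;
}
```

With a memory-mapped file you would point the same logic at the mapped base address rather than copying the file into a vector.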
Finally, the PE_EXE class derives from the EXE_FILE class. The PE_EXE constructor also takes a file name as the only parameter, and passes it down the chain to the EXE_FILE constructor. The PE_EXE class has specific knowledge about the IMAGE_NT_HEADERS and related structures defined in WINNT.H. After creating the class, call the IsValid method to make sure that everything went OK before using the other methods. PE_EXE doesn't define its own GetErrorType method. Rather, the same error codes returned by the base class EXE_FILE::GetErrorType method apply.
Once a valid PE_EXE exists, there are several different ways of accessing the data in the file. The GetIMAGE_NT_HEADERS method returns a pointer to the IMAGE_NT_HEADERS structure in the memory-mapped file, and you're free to pick through it however you want. For simpler access to the data, the PE_EXE class provides wrapper methods that return the values of individual fields in the PE header (for example, the GetAddressOfEntryPoint method). The class also provides easy access to information in the PE file's DataDirectory via the GetDataDirectoryEntryXXX methods. Finally, the GetReadablePointerFromRVA method takes a Relative Virtual Address (RVA) as input, and returns a pointer to the corresponding location in the underlying memory-mapped file.
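One common way to turn an RVA into a position in a flat file mapping is to walk the section table. The sketch below assumes that approach; it is not taken from the PE_EXE source. SectionSketch is a hypothetical, trimmed-down view of the three IMAGE_SECTION_HEADER fields the calculation needs, and RvaToFileOffset is a made-up name.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, trimmed-down view of an IMAGE_SECTION_HEADER.
struct SectionSketch {
    std::uint32_t virtualAddress;    // RVA at which the section starts
    std::uint32_t sizeOfRawData;     // bytes of the section stored in the file
    std::uint32_t pointerToRawData;  // file offset of the section's data
};

// Returns true and fills *fileOffset when the RVA lies inside the raw data
// of some section; returns false when no section covers it.
bool RvaToFileOffset(const std::vector<SectionSketch>& sections,
                     std::uint32_t rva, std::uint32_t* fileOffset)
{
    assert(fileOffset != nullptr);
    for (const SectionSketch& s : sections) {
        if (rva >= s.virtualAddress &&
            rva - s.virtualAddress < s.sizeOfRawData) {
            *fileOffset = s.pointerToRawData + (rva - s.virtualAddress);
            return true;
        }
    }
    return false;
}
```

Adding the resulting offset to the base address of the mapping would give a readable pointer. An RVA that falls in a section's uninitialized tail (beyond its raw data) has no bytes in the file, which is why the function can fail.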
In my Liposuction code, I went a step further and derived a PE_EXE2 class from the PE_EXE class. I don't need anything so fancy here. The PE_EXE class provides quick and easy access to information in a PE file with a minimum of overhead. I suspect that I'll be using the PE_EXE class in future projects because it's so handy.
Who's Got the Time?
My initial goals for DEPENDS were to just spit out the dependency list and then elaborate it with the ability to print out the full path to the module. Next on the list was to add date and time information. That's when I ran into trouble with Win32. As I mentioned earlier, there are at least four different ways that a time can be stored under Win32, and I ended up working with all four.
The first type of time that I encountered was FILETIME, which is a 64-bit structure returned by the GetFileTime function. Looking up the FILETIME structure in the SDK documentation, you'll come across this definition: "The FILETIME structure is a 64-bit value representing the number of 100-nanosecond intervals since January 1, 1601." I don't know about you, but I find that friends and relatives get testy when I specify the time in 100-nanosecond intervals. Luckily, the second Win32 time format comes to the rescue. This time format is a structure known as a SYSTEMTIME that has fields for the year, month, day, hour, minute, second, and millisecond. There's even a Win32 API, FileTimeToSystemTime, that does the conversion for you. Of course, if you're big into the whole Julian, Gregorian, leap year, leap century thing, you could do the conversion yourself.
Once I had coded up my calls to GetFileTime and FileTimeToSystemTime, I fired up the program and promptly discovered that all the times were off by several hours. Ooops! Remember when you installed Windows NT or Windows® 95 and you had to tell it where you live? There's a reason for that. It turns out that, under Win32, file times are specified in Coordinated Universal Time (UTC). Using UTC allows for the operating system to account for the fact that while it's 7PM in Greenwich, England, it's only 2PM in Nashua, New Hampshire.
Making every programmer responsible for checking time zones and compensating accordingly would be a bad thing. That's why Win32 has the FileTimeToLocalFileTime API. To make my file dates and times appear correct, I had to first call FileTimeToLocalFileTime before calling FileTimeToSystemTime.
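For the curious, here is roughly what the do-it-yourself route looks like in portable C++. This is a sketch, not a replacement for the Win32 APIs: CalendarTime and FileTimeTicksToCalendar are hypothetical names, and the offsetMinutes parameter is a simplified stand-in for the time zone adjustment (for example, -300 for five hours west of Greenwich). It ignores daylight saving time entirely.

```cpp
#include <cassert>
#include <cstdint>

struct CalendarTime { int year, month, day, hour, minute, second; };

static bool IsLeapYear(int y)
{
    return y % 4 == 0 && (y % 100 != 0 || y % 400 == 0);
}

// Converts 100-nanosecond ticks since January 1, 1601 (UTC) into calendar
// fields, after shifting by offsetMinutes to approximate a local time.
CalendarTime FileTimeTicksToCalendar(std::uint64_t ticks, int offsetMinutes)
{
    std::int64_t seconds = std::int64_t(ticks / 10000000ULL)
                         + std::int64_t(offsetMinutes) * 60;
    assert(seconds >= 0);             // the sketch does not handle pre-1601

    std::int64_t days = seconds / 86400;
    int secondOfDay = int(seconds % 86400);

    // Peel off whole years, starting from 1601.
    int year = 1601;
    for (;;) {
        int len = IsLeapYear(year) ? 366 : 365;
        if (days < len) break;
        days -= len;
        ++year;
    }

    // Peel off whole months within that year.
    static const int kMonthDays[12] = {31,28,31,30,31,30,31,31,30,31,30,31};
    int month = 0;
    for (;;) {
        int len = kMonthDays[month] + ((month == 1 && IsLeapYear(year)) ? 1 : 0);
        if (days < len) break;
        days -= len;
        ++month;
    }

    return { year, month + 1, int(days) + 1,
             secondOfDay / 3600, (secondOfDay / 60) % 60, secondOfDay % 60 };
}
```

The year-by-year loop is slow compared with a closed-form calculation, but it makes the leap year and leap century rules easy to check by eye.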
The third format for representing times in Win32 is the old MS-DOS way. In this format, the date and time are stored in separate WORDs. Because there are only 16 bits to play with, the year is stored relative to 1980. Likewise, the lowest time resolution is two seconds. If you choose to work in this time format, the FileTimeToDosDateTime API will be of interest. Why bring up this archaic time format? Silly me; when I started work on DEPENDS, I didn't immediately realize that the SYSTEMTIME format was what I should be using. The early versions of DEPENDS converted FILETIMEs to MS-DOS dates and times until I realized the error of my ways.
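The MS-DOS layout is easy to show with a little bit-twiddling. The sketch below assumes the conventional packing (date: 7 bits of year since 1980, 4 bits of month, 5 bits of day; time: 5 bits of hour, 6 bits of minute, 5 bits of seconds divided by two). The function and type names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

struct DosDateTime {
    std::uint16_t date;   // yyyyyyym mmmddddd
    std::uint16_t time;   // hhhhhmmm mmmsssss (seconds stored divided by 2)
};

// Packs calendar fields into the two 16-bit MS-DOS words.
DosDateTime PackDosDateTime(int year, int month, int day,
                            int hour, int minute, int second)
{
    assert(year >= 1980 && year <= 2107);   // only 7 bits for the year
    DosDateTime d;
    d.date = std::uint16_t(((year - 1980) << 9) | (month << 5) | day);
    d.time = std::uint16_t((hour << 11) | (minute << 5) | (second / 2));
    return d;
}

// Unpacking helpers for each field.
int DosYear(std::uint16_t date)   { return 1980 + (date >> 9); }
int DosMonth(std::uint16_t date)  { return (date >> 5) & 0x0F; }
int DosDay(std::uint16_t date)    { return date & 0x1F; }
int DosHour(std::uint16_t time)   { return time >> 11; }
int DosMinute(std::uint16_t time) { return (time >> 5) & 0x3F; }
int DosSecond(std::uint16_t time) { return (time & 0x1F) * 2; }
```

Packing 13:45:31 and unpacking it again yields 13:45:30, which is the two-second resolution in action.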
The fourth time format under Win32 is one you won't see in any of the API documentation. In Win32 executables, there's a DWORD in the IMAGE_FILE_HEADER portion of the PE header. This DWORD is called the TimeDateStamp, and represents the number of seconds since midnight on January 1, 1970, in Greenwich, England. The TimeDateStamp is set by the linker, and is actually used in other parts of a PE file.
At this point, I need to confess a small boo-boo. In my article, "Peering Inside the PE: A Tour of the Win32 Portable Executable File Format," (MSJ March 1994), as well as in my book, Windows 95 System Programming Secrets, I described the TimeDateStamp field as being the number of seconds since 4PM on December 31, 1969. I obtained this particular time by setting the TimeDateStamp on a file to the value zero and then running DUMPBIN on the file. What I didn't take into account was that DUMPBIN was adjusted for my time zone (which pretty obviously wasn't Greenwich Mean Time). So here's another one for all you errata collectors out there.
This TimeDateStamp field can be quite useful. For example, while you can change the file's date and time in the file system, the TimeDateStamp remains unaffected. Therefore, if you really want to know when an executable was created, the TimeDateStamp field is more accurate (assuming the linker set it properly). The only tricky part is figuring out how to get the number of seconds since 1970 into a format that the general population cares to work with.
After pondering this problem for a while, I came across the following trick. Both FILETIME and TimeDateStamp are values relative to some point in time. If I can somehow express a TimeDateStamp in terms of a FILETIME, I can then use the various Win32 time APIs to do whatever I desire. To start with, I need to know how January 1, 1970, is expressed as a 64-bit FILETIME. Next, I need to convert the TimeDateStamp (expressed in seconds) into 100-nanosecond units. Finally, add the two values together to make a FILETIME containing the desired time.
To convert January 1, 1970, into a FILETIME, I work backwards. First, I create a SYSTEMTIME structure and fill in the fields corresponding to January 1, 1970. Next, I pass this structure to the SystemTimeToFileTime API and print out the resulting 64-bit FILETIME value. You can see this value (0x019DB1DED53E8000) in use in the DEPENDS code. Converting seconds to 100-nanosecond units is easy. Just multiply by 10 million. The result could overflow a 32-bit DWORD, so I made sure to cast one of the multiplicands to a 64-bit integer (an __int64 in Visual C++). Of course, if you want to take the easy way out, you could just use the ctime function from the C runtime library.
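The arithmetic is short enough to show in full. This is a portable sketch of the idea rather than the code from Depends.cpp: it uses std::uint64_t in place of __int64, and the names TimeDateStampToFileTimeTicks and SplitTicks are hypothetical. SplitTicks produces the two 32-bit halves that a FILETIME structure stores.

```cpp
#include <cassert>
#include <cstdint>

// January 1, 1970 (UTC) expressed as 100-nanosecond ticks since 1601.
const std::uint64_t kFileTimeAt1970 = 0x019DB1DED53E8000ULL;

// Converts a PE TimeDateStamp (seconds since 1970) to FILETIME-style ticks.
std::uint64_t TimeDateStampToFileTimeTicks(std::uint32_t timeDateStamp)
{
    // Widen to 64 bits BEFORE multiplying; a 32-bit product would overflow
    // for any stamp later than about 429 seconds past the epoch.
    std::uint64_t ticks = std::uint64_t(timeDateStamp) * 10000000ULL;
    return kFileTimeAt1970 + ticks;
}

// Splits the 64-bit tick count into the low and high DWORDs of a FILETIME.
void SplitTicks(std::uint64_t ticks, std::uint32_t* low, std::uint32_t* high)
{
    assert(low != nullptr && high != nullptr);
    *low  = std::uint32_t(ticks & 0xFFFFFFFFULL);
    *high = std::uint32_t(ticks >> 32);
}
```

On Windows, the two halves would go into the dwLowDateTime and dwHighDateTime fields of a FILETIME, after which the usual time APIs can take over.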
The DEPENDS Code
The main code for Depends.exe is in Depends.cpp (see Figure 8). The main function first invokes the ProcessCommandLine function to parse the command-line arguments, including the name of the file to process. Next, function main declares a MODULE_DEPENDENCY_LIST class instance. The rest of function main is a while loop that iterates through every MODULE_FILE_INFO class in the dependency list. Each MODULE_FILE_INFO instance is passed to the DisplayFileInformation function, which emits the requested information about the file. Before continuing on to the next module, the while loop also uses the MODULE_FILE_INFO::GetNextNotFoundModule method to print out any imported modules that weren't located. By doing this, DEPENDS makes it easy to track down exactly who's referencing some DLL that the system isn't finding.
The DisplayFileInformation function, at a minimum, displays the base file name from the MODULE_FILE_INFO passed to it. The remainder of the function's output depends on the command-line switches. If /t is specified, the function uses GetFileTime and a pair of helper functions to display the file system's date and time for the file. Next, if /l is specified, the function creates a temporary instance of the PE_EXE class in order to retrieve the TimeDateStamp. The TimeDateStamp is then passed to a helper function, TimeDateStampToFileTime, and the returned FILETIME information is displayed.
The last bit of code in the DisplayFileInformation function is for the /v switch. If set, the file name is passed to the ShowVersionInfo function. ShowVersionInfo uses several of the version APIs: GetFileVersionInfoSize, GetFileVersionInfo, and VerQueryValue. After allocating space for the version information for a file and reading it in, the code uses VerQueryValue to look for the Win32 predefined version strings such as "CompanyName," "FileDescription," and so on. In writing this code, I found that even Microsoft is inconsistent in its use of a code page for the version strings. Most version resources use code page 1252 (Windows Multilingual), but a few use code page 1200 (Unicode). My code checks for both. In testing, I found that some executables used still other code pages. If you're looking to improve the DEPENDS code, this function is fertile ground.
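VerQueryValue locates a version string through a sub-block path of the form \StringFileInfo\lang-codepage\string-name, where lang-codepage is eight hexadecimal digits. As a sketch of how such a path could be assembled for each candidate code page, consider the helper below. BuildVersionQuery is a hypothetical name, not a function from the DEPENDS source, and the choice of uppercase hex digits is arbitrary.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Builds a sub-block path such as \StringFileInfo\040904B0\CompanyName.
// langId and codePage are each formatted as four hex digits.
std::string BuildVersionQuery(unsigned langId, unsigned codePage,
                              const std::string& stringName)
{
    assert(langId <= 0xFFFF && codePage <= 0xFFFF);
    char prefix[32];   // 16 chars of text + 8 hex digits + '\\' + NUL fits
    std::snprintf(prefix, sizeof(prefix),
                  "\\StringFileInfo\\%04X%04X\\", langId, codePage);
    return std::string(prefix) + stringName;
}
```

For U.S. English (language ID 0x0409), code page 1200 produces the familiar 040904B0 block name and code page 1252 produces 040904E4. A more robust tool could read the \VarFileInfo\Translation block to discover which language and code page pairs a file actually contains, instead of guessing.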
When I was all done, I figured that it was a perfect candidate for TINYCRT, which appeared in my October 1996 column. TINYCRT is a minimal replacement runtime library for the standard C++ RTL. Using TINYCRT (the Visual C++ version is called LIBCTINY.LIB) is as simple as including it in the linker's list of libraries. By using LIBCTINY.LIB, I was able to cut Depends.exe from 25KB down to 9KB. I've included LIBCTINY.LIB in the downloadable sources so that you can rebuild DEPENDS if necessary.
Have a question about programming in Windows? Send it to Matt at firstname.lastname@example.org
© 1997 Microsoft Corporation. All rights reserved.