Code for this article: X-ProcessOptex.exe (3KB)
Jeffrey Richter wrote Advanced Windows, Third Edition (Microsoft Press, 1998) and Windows 95: A Developer's Guide (M&T Books, 1995). Jeff is a consultant and teaches Win32 programming courses (www.solsem.com). He can be reached at www.JeffreyRichter.com.
I discussed how to write an unsetup program that was able to delete itself from the disk in my January 1996 Win32® Q&A column, and I presented three techniques for making an application delete itself. A reader recently suggested another technique that I'd like to share with you. I have tested this technique only on Windows NT®, but I believe that it will also work on Windows® 95. Figure 1 shows the code.
The technique works as follows: when the program wants to delete itself, it first copies the EXE file on the disk to the user's temporary directory. I call this copy the clone. Then the program calls CreateFile to open the cloned EXE file. The call to CreateFile passes the FILE_FLAG_DELETE_ON_CLOSE option. This tells the operating system to delete the file when it is closed (I'll come back to this later). Finally, the original EXE file spawns this cloned version of the EXE, passing it two command-line arguments. The first argument is the inheritable handle of the original process, and the second is the full path name of the original EXE file. After spawning the cloned version, the original EXE file simply terminates.
At this point, the cloned version of the EXE file (located in the user's temporary directory) starts running. This version examines its command-line arguments and detects that it is not the original version but the cloned version. The cloned version extracts the process handle of the original process from its command line and waits for the original process to terminate. Once the original process has terminated, the cloned process can delete the original EXE file image. This, of course, gives you the desired effect of deleting the EXE file that was running. But what about deleting the cloned EXE file that was copied to the user's temporary directory? Because the cloned EXE file was opened by the original EXE with the FILE_FLAG_DELETE_ON_CLOSE flag, the operating system will delete the cloned EXE file automatically when the cloned process terminates. Nothing could be easier.
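The sequence just described can be sketched roughly as follows. This is not the code from Figure 1; it is a minimal reconstruction from the description above. The function names (BuildCloneCommandLine, SpawnCloneToDeleteSelf, DeleteOriginalWhenItExits), the "Del" temporary-file prefix, and the exact command-line layout are assumptions made for illustration. The Win32 portion compiles only on Windows, and its behavior on current Windows versions has not been verified here.

```cpp
#include <cstdint>
#include <cwchar>
#include <string>

// Hypothetical helper: builds the command line the original EXE hands to its
// clone, in the form  "<clone path>" <handle value> "<original path>".
// The handle travels as a decimal integer; the clone converts it back.
std::wstring BuildCloneCommandLine(const std::wstring& clonePath,
                                   std::uintptr_t originalProcessHandle,
                                   const std::wstring& originalPath) {
    return L"\"" + clonePath + L"\" " +
           std::to_wstring(originalProcessHandle) +
           L" \"" + originalPath + L"\"";
}

#ifdef _WIN32
#include <windows.h>

// Original EXE: copy this image to the temp directory, open the copy with
// FILE_FLAG_DELETE_ON_CLOSE, spawn it, and return so the caller can exit.
bool SpawnCloneToDeleteSelf() {
    wchar_t orig[MAX_PATH], tempDir[MAX_PATH], clone[MAX_PATH];
    if (!GetModuleFileNameW(nullptr, orig, MAX_PATH)) return false;
    if (!GetTempPathW(MAX_PATH, tempDir)) return false;
    if (!GetTempFileNameW(tempDir, L"Del", 0, clone)) return false;
    if (!CopyFileW(orig, clone, FALSE)) return false;

    // Opening the clone this way asks the system to delete it on close.
    HANDLE hClone = CreateFileW(clone, 0, FILE_SHARE_READ, nullptr,
                                OPEN_EXISTING, FILE_FLAG_DELETE_ON_CLOSE,
                                nullptr);
    if (hClone == INVALID_HANDLE_VALUE) return false;

    // Inheritable handle to this process, for the clone to wait on.
    HANDLE hSelf = OpenProcess(SYNCHRONIZE, TRUE, GetCurrentProcessId());
    if (!hSelf) { CloseHandle(hClone); return false; }

    std::wstring cmd = BuildCloneCommandLine(
        clone, reinterpret_cast<std::uintptr_t>(hSelf), orig);

    STARTUPINFOW si = { sizeof(si) };
    PROCESS_INFORMATION pi = {};
    BOOL ok = CreateProcessW(nullptr, &cmd[0], nullptr, nullptr,
                             TRUE /* inherit handles */, 0,
                             nullptr, nullptr, &si, &pi);
    if (ok) { CloseHandle(pi.hThread); CloseHandle(pi.hProcess); }
    CloseHandle(hSelf);
    CloseHandle(hClone);
    return ok != FALSE;
}

// Clone EXE: wait for the original process to end, then delete its file.
bool DeleteOriginalWhenItExits(const wchar_t* handleArg,
                               const wchar_t* originalPath) {
    HANDLE hOrig = reinterpret_cast<HANDLE>(
        static_cast<std::uintptr_t>(std::wcstoull(handleArg, nullptr, 10)));
    WaitForSingleObject(hOrig, INFINITE);
    CloseHandle(hOrig);
    return DeleteFileW(originalPath) != FALSE;
}
#endif
```

A program using this sketch would call SpawnCloneToDeleteSelf and then exit when started with no arguments, and call DeleteOriginalWhenItExits with its first and second arguments when started as the clone.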
Q I need to synchronize threads running in multiple processes. I would like to use critical sections because it is my understanding that entering and leaving critical sections is significantly faster than waiting on and releasing mutex kernel objects. However, the Win32 documentation states that critical sections cannot be used to synchronize threads in multiple processes.
At first, I thought this was because the critical section is simply a data structure not accessible in the other processes' address spaces. So I decided to put the critical section data structure inside a memory-mapped file that all my processes had access to. This did not work at all. Is there any way to do really fast thread synchronization across process boundaries?
A First, let me say that you are quite correct that critical sections are significantly faster than mutex kernel objects, or any other kernel object, for that matter. The EnterCriticalSection and LeaveCriticalSection functions take advantage of the InterlockedXxx family of functions. The InterlockedXxx functions are implemented entirely in user-mode space and do not require your thread to go from user mode to kernel mode and back again. For this reason, entering a critical section typically requires only 10 or so CPU instructions to execute. On the flip side, when you call WaitForSingleObject and the like, you are forcing your thread to transition to kernel mode and back. This transition typically requires 600 CPU instructions on an x86 processor. That's a huge difference!
You're right to want to use a critical section instead of a mutex to increase your performance. Critical sections are fast, as long as there is no contention for the shared resource. As soon as a thread attempts to enter a critical section owned by another thread, critical sections degrade to using an event kernel object requiring approximately 600 CPU instructions to enter. Because contention is so rare, entering a critical section usually takes the high-speed, 10 CPU instruction path.
What does "critical sections degrade to an event kernel object" mean? Inside the CRITICAL_SECTION data structure, there is a handle to an event kernel object. (The undocumented member is called LockSemaphore, but it contains the handle of an event, not a semaphore.) As you know, kernel object handles are process-relative, not global throughout the system. So if you put the critical section structure inside a shared, memory-mapped file, the critical section will not work properly. The handle value inside the LockSemaphore member will be valid for one of your processes, but totally invalid for all of the other processes. This means it's not possible to use critical sections across process boundaries.
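The two paths described above, an interlocked fast path and an event-based slow path, can be illustrated with a small portable sketch. It is not the CRITICAL_SECTION implementation: std::atomic stands in for the InterlockedXxx functions, a mutex and condition variable stand in for the event kernel object, and the names AutoResetEvent, FastLock, and SlowPathEntries are invented for illustration. Recursion and spin counts are left out to keep the idea visible.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>

// Portable stand-in for an auto-reset event kernel object.
class AutoResetEvent {
public:
    void Set() {
        { std::lock_guard<std::mutex> g(m_); signaled_ = true; }
        cv_.notify_one();
    }
    void Wait() {
        std::unique_lock<std::mutex> g(m_);
        cv_.wait(g, [this] { return signaled_; });
        signaled_ = false;  // auto-reset: exactly one waiter is released
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    bool signaled_ = false;
};

// Non-recursive lock: a single atomic counter serves the uncontended case,
// and the event is touched only when threads collide.
class FastLock {
public:
    void Enter() {
        // fetch_add returns the previous value; 0 means no thread held or
        // wanted the lock, so the caller owns it without blocking.
        if (count_.fetch_add(1) == 0) return;
        slowPathEntries_.fetch_add(1);
        event_.Wait();  // contended: block until Leave() hands the lock over
    }
    void Leave() {
        long previous = count_.fetch_sub(1);
        assert(previous > 0);               // Leave without Enter is a bug
        if (previous > 1) event_.Set();     // someone is waiting: wake one
    }
    // How many Enter calls had to take the blocking path.
    long SlowPathEntries() const { return slowPathEntries_.load(); }
private:
    std::atomic<long> count_{0};            // holders plus waiters
    std::atomic<long> slowPathEntries_{0};
    AutoResetEvent event_;
};
```

With a single thread calling Enter and Leave in pairs, SlowPathEntries stays at zero; the event is never used, which is the inexpensive case the text describes.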
All is not lost. In the July 1996 issue of MSJ, I demonstrated how you can roll your own critical sections. I'm not going to repeat the explanation of how it all worked in this column; I'm going to use that code as my starting point and modify it so that it works across process boundaries. I've packaged this new version as a C++ class called COptex (see Figure 2). I have also taken the liberty of adding two new features to this class that were not in the original version. This new version offers a TryEnter member function that works like the TryEnterCriticalSection function introduced in Windows NT 4.0. I have also added support for spin counts as found in the InitializeCriticalSectionAndSpinCount and SetCriticalSectionSpinCount functions introduced in Windows NT 4.0 Service Pack 3.
Using COptex is extremely simple. All you have to do is construct an instance of this C++ class using either of its two constructors:
COptex(LPCSTR pszName, DWORD dwSpinCount = 4000);
COptex(LPCWSTR pszName, DWORD dwSpinCount = 4000);
When constructing a COptex, you must give it a string name. This is used for sharing across process boundaries. When another process constructs a COptex using the same name, the constructor code will see that the object already exists and simply perform the desired synchronization across process boundaries, if necessary. Having the two constructors allows you to pass either an ANSI or Unicode name for the object. The second parameter to the constructor allows you to specify a spin count or accept the default of 4000 (which is what the operating system functions use to serialize a heap). The remaining members of the COptex class have a one-to-one correspondence with the Win32 critical section functions, so I will not describe them here.
So how does the COptex work? Well, a COptex object actually consists of two data blocks: a local, private block and a global, shared block. When you create a COptex object, the local block consists of the data members inside the COptex class definition. The m_hevt and m_hfm members are initialized to a named event and a named memory-mapped file object, respectively. Since the objects are named, they can be shared across process boundaries. But since handles to these objects are process-relative, each COptex object must keep its own values for these handles.
The m_pSharedInfo member points to the memory-mapped file, which contains the global data that all COptex objects for a given name share. The data in the memory-mapped file is a SHAREDINFO structure as privately defined inside the COptex class. A complete description of these members and how they relate to one another can be found in the July 1996 issue of MSJ. Note that the m_dwSpinCount member is new for this version.
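To make the role of the shared block more concrete, here is a minimal portable sketch of how a TryEnter-style operation with a spin count might act on it. The layout of SharedInfo below is an assumption (the text above confirms only that a spin-count member exists), std::atomic stands in for the InterlockedXxx functions, and the thread ID is passed in explicitly where Win32 code would call GetCurrentThreadId. Blocking entry and the event signal in Leave are left out, so this is not the Figure 2 code.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Assumed shape of the global block that would live in the memory-mapped
// file. Zero-filled memory is a valid "unowned" state.
struct SharedInfo {
    std::uint32_t spinCount = 0;                   // retries before giving up
    std::atomic<long> lockCount{0};                // 0 means unowned
    std::atomic<std::uint32_t> ownerThreadId{0};   // 0 means no owner
    long recurseCount = 0;                         // touched only by the owner
};

// Try to take ownership without blocking. threadId must be nonzero.
bool TryEnter(SharedInfo& s, std::uint32_t threadId) {
    std::uint32_t spins = s.spinCount;
    for (;;) {
        long expected = 0;
        // Unowned: claim it by moving the lock count from 0 to 1.
        if (s.lockCount.compare_exchange_strong(expected, 1)) {
            s.ownerThreadId.store(threadId);
            s.recurseCount = 1;
            return true;
        }
        // Already owned by the caller: recursive entry.
        if (s.ownerThreadId.load() == threadId) {
            s.lockCount.fetch_add(1);
            ++s.recurseCount;
            return true;
        }
        // Owned by another thread: retry until the spin budget runs out.
        if (spins-- == 0) return false;
    }
}

// Release one level of ownership. A full version would also signal the
// event when other threads are waiting.
void Leave(SharedInfo& s, std::uint32_t threadId) {
    assert(s.ownerThreadId.load() == threadId && s.recurseCount > 0);
    (void)threadId;  // silence unused-parameter warnings when asserts are off
    if (--s.recurseCount == 0) s.ownerThreadId.store(0);
    s.lockCount.fetch_sub(1);
}
```

Because all of the state lives in SharedInfo, any process that maps the same block sees the same owner and counts; only the handles used for blocking would need to stay in each process's local data.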
Have a question about programming in Win32?
Send your questions via email to Jeffrey Richter from his website at http://www.jeffreyrichter.com.