This article shows how to extract metafiles from undocumented
spool files and how to use that data for debugging, for
understanding the operating system and its components, and as
building blocks for tools. To see these concepts in action,
check out PrintMirror, a GNU GPL printer-driver tool under
development at cvs.sourceforge.net:/cvsroot/printmirror.
Further development, which involves generating metafiles natively
from the driver, is in progress.
Note: All the explanations below are based on the author’s experiments
on Windows 2000 and Windows XP.
Spool File
This is one of Microsoft’s loosely documented features.
If you search through MSDN, you will find hardly any explanation
of it, so I tried building a logical view from the bits in a spool
file. Whenever something is printed in spool mode, two files
are created, with the extensions .spl and .shd. That much is
documented, but the file name prefix is not. On my Windows 2000 box,
the name is of the form nnnnn.spl, where nnnnn is the job ID padded
with leading zeros. If you print documents one after another, the
job IDs are assigned incrementally: if a print job has a spool
file named 00002.spl, the next job gets 00003.spl.
This shows that the spooler assigns job IDs incrementally and
that the job ID determines the name of the spool file. In
Windows XP, Microsoft changed the file naming convention to
FP*.spl, but experiments showed that if you enable the “keep
printed documents” option, the naming convention is the same as
on Windows 2000.
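The naming convention can also be read in the other direction, to recover a job ID from a spool file name. The following is a minimal sketch in portable C and is not taken from PrintMirror: the function name job_id_from_spool_name is hypothetical, and the fixed five-digit width is an assumption based on the 00002.spl example observed on Windows 2000. Names in the Windows XP FP*.spl form are simply rejected.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the job ID encoded in a name such as
 * "00002.spl", or -1 if the name does not match the assumed pattern of
 * exactly five decimal digits followed by ".spl" (any letter case). */
long job_id_from_spool_name(const char *name)
{
    static const char ext[] = ".spl";
    long id = 0;
    size_t i;

    if (name == NULL || strlen(name) != 9)
        return -1;

    /* The first five characters must all be digits. */
    for (i = 0; i < 5; i++) {
        if (!isdigit((unsigned char)name[i]))
            return -1;
        id = id * 10 + (name[i] - '0');
    }

    /* The remaining four characters must spell the extension. */
    for (i = 0; i < 4; i++) {
        if (tolower((unsigned char)name[5 + i]) != ext[i])
            return -1;
    }
    return id;
}
```

Calling job_id_from_spool_name("00002.spl") yields 2, while a name that does not fit the assumed pattern yields -1.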
The next question that might
come to your mind is: How do I get the job ID for a print job
inside a printer driver? There are two ways to do this in the
driver:
1. In the DOCUMENTEVENT_STARTDOCPOST event of DrvDocumentEvent(…)
in the user-mode user interface DLL.
2. As a parameter of DrvStartDoc(…) in the rendering DLL, which
can be a kernel-mode or user-mode DLL.
An exception occurs when ResetDC(…) is called before each
StartPage(…): the driver then sees multiple DrvStartDoc(…) calls
but still only one DrvEndDoc(…). The source code shows how to
resolve this dilemma; it is obtainable from
http://www.microsoft.com/india/msdn/articles/132.aspx
We now have a definite way of associating a spool file with
a print job. Listing 1 shows how to accomplish this in C,
following these steps:
1. Get the spooler directory.
2. Look for a filename of the form …jobId.spl
3. Return the expanded form of the filename, e.g., 00002.spl
where 2 is the job ID.
// hDriver and JobId come from the surrounding driver code.
// Assumes a Unicode build, as the wide-string calls below require.
DWORD cbNeeded = 0;
DWORD dwType = REG_SZ;                  // data type

// First call: pass no buffer to learn the required size.
GetPrinterData(
    hDriver,                            // handle to printer or print server
    SPLREG_DEFAULT_SPOOL_DIRECTORY,     // value to query
    &dwType,                            // data type
    NULL,                               // configuration data buffer
    0,                                  // size of configuration data buffer
    &cbNeeded                           // bytes received or required
);

// Second call: fetch the spool directory itself.
LPBYTE pSpoolDirectory = (LPBYTE)MALLOC(cbNeeded);
GetPrinterData(
    hDriver,                            // handle to printer or print server
    SPLREG_DEFAULT_SPOOL_DIRECTORY,     // value to query
    &dwType,                            // data type
    pSpoolDirectory,                    // configuration data buffer
    cbNeeded,                           // size of configuration data buffer
    &cbNeeded                           // bytes received or required
);

// Build <spool directory>\nnnnn.spl, padding the job ID with
// leading zeros to five digits (job 2 becomes 00002.spl).
WCHAR SpoolFileName[MAX_PATH];
wsprintfW(SpoolFileName, L"%s\\%05d.spl", (LPCWSTR)pSpoolDirectory, JobId);
Let’s take a look at the spool file contents. Here is the format
of the spool file’s header:
1. DWORD 00010000.
2. DWORD containing the size in bytes of the document info,
followed by the document info itself, which is the same as what
is sent in the StartDoc(…) Win32 call and what arrives in
DrvDocumentEvent(…). In Figure 1, 0x000004-0x000007 is
sizeof(DOCINFO) = 44, the only difference being that DOCINFO’s
cbSize is an int whereas it is a DWORD here. 0x000008-0x00000B
is the DocName offset in the spool file = 0x000010.
0x00000C-0x00000F is NULL, DOCINFO’s lpszOutput. There is also
a DWORD containing the marker 0x0000000c.
3. DWORD containing the size of the EMF following the header:
0x000030-0x000033 = 32104 bytes
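The header fields listed above can be read from an in-memory copy of the spool file roughly as follows. This is a hedged sketch in portable C rather than the author's code: the SpoolHeader struct and the parse_spool_header and read_le32 names are hypothetical, the field meanings are only those observed above on Windows 2000 and XP, and the values are assumed to be stored little-endian, as on x86.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of the observed header fields. */
typedef struct {
    uint32_t signature;      /* first DWORD of the file */
    uint32_t docinfo_bytes;  /* DWORD at offset 4 (44 in the sample) */
    uint32_t docname_offset; /* DWORD at offset 8 (0x10 in the sample) */
    size_t   emf_offset;     /* where the first page's EMF data starts */
    uint32_t emf_bytes;      /* size of the first page's EMF */
} SpoolHeader;

/* Bounds-checked little-endian DWORD read; returns 0 on success. */
static int read_le32(const uint8_t *buf, size_t len, size_t off, uint32_t *out)
{
    if (off > len || len - off < 4)
        return -1;
    *out = (uint32_t)buf[off]
         | ((uint32_t)buf[off + 1] << 8)
         | ((uint32_t)buf[off + 2] << 16)
         | ((uint32_t)buf[off + 3] << 24);
    return 0;
}

/* Fills *h from the start of a spool file image; returns 0 on success,
 * -1 if the buffer is too short. The caller still has to check that
 * emf_offset + emf_bytes lies inside the buffer before using the EMF. */
int parse_spool_header(const uint8_t *buf, size_t len, SpoolHeader *h)
{
    size_t size_field;

    if (read_le32(buf, len, 0, &h->signature) != 0 ||
        read_le32(buf, len, 4, &h->docinfo_bytes) != 0 ||
        read_le32(buf, len, 8, &h->docname_offset) != 0)
        return -1;

    /* The EMF size DWORD sits docinfo_bytes past offset 4. */
    if (h->docinfo_bytes > len - 4)
        return -1;
    size_field = 4 + (size_t)h->docinfo_bytes;
    if (read_le32(buf, len, size_field, &h->emf_bytes) != 0)
        return -1;

    h->emf_offset = size_field + 4;
    return 0;
}
```

With the sample values (a size of 44 at offset 4), the EMF size is read from 0x000030 and emf_offset comes out as 0x000034. Copying each DWORD byte by byte avoids the unaligned pointer casts that a memory-mapped walk relies on.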
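Figure 1 shows a sample spool file header from my system,
generated by printing from Corel Draw.
For the single-page Corel Draw document, the end of the spool
file looks like Figure 2.
The meaning of most of those bytes is unclear, but
0x007DA4-0x007DA7 holds the size of the page metafile plus 8 bytes.
The metafile data starts at 0x000034 and is 32104 (0x7D68) bytes
long, so it ends at 0x007D9B and this field begins 8 bytes past
its end.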
Certain bytes separate the metafiles of individual pages in the
spool file. I could see 20 bytes whose contents aren’t useful,
followed by four bytes that hold the size of the following
page’s metafile (see Figure 2). A variation occurs when an
application calls ResetDC before the start of each page; Winword
is a typical commercial example. In that case the separator
between page metafiles has the form
1 DWORD + 1 DWORD (sizeof DEVMODE rounded up to a DWORD boundary)
+ DEVMODE + 16 bytes [similar to the bytes shaded in Figure 2] +
4 bytes [start-of-page marker], followed by the DWORD that holds
the size of the next page’s metafile.
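The two separator layouts can be expressed as one bounds-checked step from a page's metafile to the next one. The sketch below is portable C written for illustration only: the names next_page, advance and read_le32 are hypothetical, the byte counts are the ones observed in this article, and the caller is assumed to know already whether a ResetDC DEVMODE follows the page.

```c
#include <stddef.h>
#include <stdint.h>

/* Bounds-checked little-endian DWORD read; returns 0 on success. */
static int read_le32(const uint8_t *buf, size_t len, size_t off, uint32_t *out)
{
    if (off > len || len - off < 4)
        return -1;
    *out = (uint32_t)buf[off]
         | ((uint32_t)buf[off + 1] << 8)
         | ((uint32_t)buf[off + 2] << 16)
         | ((uint32_t)buf[off + 3] << 24);
    return 0;
}

/* Moves *pos forward by n bytes unless that would leave the buffer. */
static int advance(size_t *pos, size_t len, size_t n)
{
    if (*pos > len || n > len - *pos)
        return -1;
    *pos += n;
    return 0;
}

/* Given where one page's EMF starts and how long it is, locate the next
 * page's EMF. reset_dc is nonzero when a DEVMODE record follows the page.
 * Returns 0 on success, -1 if the layout runs past the end of the buffer. */
int next_page(const uint8_t *buf, size_t len,
              size_t emf_offset, uint32_t emf_bytes, int reset_dc,
              size_t *next_offset, uint32_t *next_bytes)
{
    size_t pos = emf_offset;
    uint32_t devmode_bytes;

    if (advance(&pos, len, emf_bytes) != 0)          /* skip this page's EMF */
        return -1;

    if (reset_dc) {
        if (advance(&pos, len, 4) != 0 ||            /* marker DWORD */
            read_le32(buf, len, pos, &devmode_bytes) != 0 ||
            advance(&pos, len, 4) != 0 ||            /* DEVMODE length DWORD */
            advance(&pos, len, devmode_bytes) != 0)  /* DEVMODE itself */
            return -1;
    }

    if (advance(&pos, len, 20) != 0 ||               /* regular 20-byte separator */
        read_le32(buf, len, pos, next_bytes) != 0)   /* next page's EMF size */
        return -1;

    *next_offset = pos + 4;
    return 0;
}
```

Starting from the first page's offset and size, repeated calls reach any later page, and a truncated or unexpected file produces -1 instead of a read past the end of the buffer.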
The function GetMetaFileFromSpoolFile(…) in Listing 2 shows how
to extract a page’s metafile from the spool file. The parameters
to the function are:
1. SpoolFileName: the name of the file that contains the spool
file’s contents. [IN]
2. PageNbr: which page in the spool file is to be retrieved. [IN]
3. MetaFileName: the name of the file into which the function
extracts the metafile. [IN]
4. pPDev: the PDEV, whose pResetDC array records whether a ResetDC
DEVMODE follows each page. [IN]
5. pDevmode: the DEVMODE associated with the page, extracted by
the function. [OUT]
void
GetMetaFileFromSpoolFile(TCHAR *SpoolFileName, int PageNbr,
                         TCHAR *MetaFileName, PPDEV pPDev, LPBYTE *pDevmode)
{
    HANDLE hFile = CreateFile(SpoolFileName,
        GENERIC_READ,               // open for reading
        FILE_SHARE_READ,            // share for reading
        NULL,                       // no security
        OPEN_EXISTING,              // existing file only
        FILE_ATTRIBUTE_NORMAL,      // normal file
        NULL);                      // no attr. template
    if (hFile == INVALID_HANDLE_VALUE)
        return;

    HANDLE hMapFile = CreateFileMapping(
        hFile,                      // handle to file
        NULL,                       // security
        PAGE_READONLY,              // protection
        0,                          // high-order DWORD of size
        0,                          // low-order DWORD of size
        NULL);                      // object name
    if (hMapFile == NULL)
    {
        CloseHandle(hFile);
        return;
    }

    // Keep the base address: pMapFile is advanced below, but the view
    // has to be unmapped through the address MapViewOfFileEx returned.
    LPBYTE pMapBase = (LPBYTE)MapViewOfFileEx(
        hMapFile,                   // handle to file-mapping object
        FILE_MAP_READ,              // access mode
        0,                          // high-order DWORD of offset
        0,                          // low-order DWORD of offset
        0,                          // number of bytes to map
        NULL);                      // starting address
    if (pMapBase == NULL)
    {
        CloseHandle(hMapFile);
        CloseHandle(hFile);
        return;
    }
    LPBYTE pMapFile = pMapBase;

    DWORD granularity = *((DWORD *)pMapFile);   // leading DWORD, not used further
    pMapFile += sizeof(DWORD);
    DWORD splheader = *((DWORD *)pMapFile);     // size of the document info
    pMapFile += splheader;
    DWORD metafilelen = *((DWORD *)pMapFile);   // size of the first page's metafile
    pMapFile += sizeof(DWORD);  // This is a hack after comparison with Win9x.

    for (int Nbr = 1; Nbr < PageNbr; Nbr++)
    {
        pMapFile += metafilelen;
        if (pPDev->pResetDC[Nbr - 1] == FALSE)
        {
            pMapFile += 20;
        }
        else
        {
            /* skip the reset devmode here */
            pMapFile += 4;  // This marker is the same as the one after Devmode
            DWORD offset = *((DWORD *)pMapFile);    // this is a multiple of 4 bytes
            pMapFile += offset + 4;     // devmode + devmode-length
            pMapFile += 16 + 4;         // Regular 20-byte separator (marker, ...,
                                        // metalen-tillhere, 0000, startpagemarker)
        }
        metafilelen = *((DWORD *)pMapFile);     // This has the metafile length!!!
        pMapFile += 4;
        // Keep incrementing till we are on the requested page.
    }

    HANDLE hMetaFile = CreateFile(MetaFileName,
        GENERIC_READ | GENERIC_WRITE,   // open for reading and writing
        FILE_SHARE_READ,            // share for reading
        NULL,                       // no security
        CREATE_ALWAYS,              // create, overwriting any existing file
        FILE_ATTRIBUTE_NORMAL,      // normal file
        NULL);                      // no attr. template
    if (hMetaFile != INVALID_HANDLE_VALUE)
    {
        DWORD numWritten;
        WriteFile(
            hMetaFile,              // handle to output file
            pMapFile,               // data buffer
            metafilelen,            // number of bytes to write
            &numWritten,            // number of bytes written
            NULL);                  // overlapped buffer
        CloseHandle(hMetaFile);
    }

    if (pDevmode)
    {
        LPBYTE ptr = pMapFile + metafilelen;
        if (pPDev->pResetDC[PageNbr - 1] == TRUE)
        {
            ptr += 4;   // This marker is the same as the one after Devmode
            DWORD offset = *((DWORD *)ptr);     // this is a multiple of 4 bytes
            *pDevmode = (LPBYTE)MALLOC(offset);
            if (*pDevmode)
                memcpy(*pDevmode, ptr + 4, offset);
        }
    }

    // Release the mapping and the spool file.
    UnmapViewOfFile(pMapBase);
    CloseHandle(hMapFile);
    CloseHandle(hFile);
}
Listing 2 handles two cases: pages followed by a ResetDC DEVMODE
and pages without one. ResetDC(…) behaves differently again when
the application calls it before the StartDoc(…) Win32 call, in
which case the DEVMODE becomes part of the shadow file. A shadow
file, with the .shd extension, is created for each job along with
the spool file and contains the job’s characteristics and
specifications. Look at the JOB_INFO_2 structure to get an idea
of the layout of the .shd file.
Conclusion
GetMetaFileFromSpoolFile(…) is the gist of this article and is
essential reading for developers who want to understand the
contents of spool files. The source code of PrintMirror, which
shows how to extract metafiles page by page from a spool file
and also provides a preview (a display of the metafile's
contents), can be obtained from
http://www.microsoft.com/india/msdn/articles/132.aspx.