Chapter 11: Structured Exception Handling
Usually the exception handling model of a programming language is considered the domain of that particular language's runtime. Under the hood, each language has its own way of detecting exceptions and locating an appropriate exception handler. Some languages perform exception handling completely within the language runtime, whereas others rely on the structured exception handling (SEH) mechanism provided by the operating system, which in our case is Win32.

In the world of managed code, exception handling is a fundamental feature of the common language runtime execution engine. The execution engine is fully capable of handling exceptions without regard to language, allowing exceptions to be raised in one language and caught in another. At that, the runtime does not dictate any particular syntax for handling exceptions. The exception mechanism is language-neutral in that it is equally efficient for all languages. No special metadata is captured for exceptions other than the metadata for the exception classes themselves. No association exists between a method of a class and the exceptions that the method might throw. Any method is permitted to throw any exception at any time.

Although we talk about managed exceptions thrown and caught within managed code, a common scenario involves a mix of both managed and unmanaged code. Execution threads routinely traverse managed and unmanaged blocks of code through the use of the common language runtime's platform invocation mechanism (P/Invoke) and other interoperability mechanisms. (See Chapter 15, "Managed and Unmanaged Code Interoperation.") Consequently, during execution, exceptions can be thrown or caught in either managed or unmanaged code. The runtime exception handling mechanism integrates seamlessly with the Win32 SEH mechanism so that exceptions can be thrown and caught within and between the two exception handling systems.
SEH Clause Internal Representation

Structured exception handling tables are located immediately after a method's IL code, with the beginning of the table aligned on a double word boundary. It would be more accurate to say that "additional sections" are located after the method IL code, but the first release of the common language runtime allows only one kind of additional section: the exception handling section.

This additional section begins with the section header, which contains two entries, Kind and DataSize. In a small header, DataSize is represented by 1 byte, whereas in a fat header, DataSize is 3 bytes long. A Kind entry can contain the following binary flags:
The section header (padded with 2 bytes if small) is followed by a sequence of exception handling (EH) clauses, which can also have small or fat format. Each EH clause describes a single triad made up of a guarded block, an exception identification, and an exception handler. The entries of small and fat EH clauses have the same names and meanings but different sizes, as shown in Table 11-1.

Table 11-1 EH Clause Entries
Branching into or out of guarded blocks and handler blocks is illegal. A guarded block must be entered "through the top" (that is, through the instruction located at TryOffset), and handler blocks are entered only when they are engaged by the exception handling subsystem of the execution engine. To exit guarded and handler blocks, you must use the instruction leave (or leave.s). You might recall that in Chapter 2, "Enhancing the Code," this principle was formulated as "leave only by leave." Another way to leave any block is to throw an exception using the throw or rethrow instruction.
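To make the section layout described above more tangible, here is a minimal Python sketch that decodes an EH section from raw bytes. The flag values and the exact field widths are not given in this text; they are taken from the ECMA-335 specification (Partition II) and should be treated as assumptions here. The names `parse_eh_section` and `EHClause` are hypothetical helpers invented for this illustration.

```python
import struct
from typing import List, NamedTuple

# Kind flags, assumed from ECMA-335 rather than from this chapter.
SECT_EH_TABLE = 0x01    # the section is an exception handling table
SECT_FAT_FORMAT = 0x40  # the header and the clauses use the fat format


class EHClause(NamedTuple):
    flags: int
    try_offset: int
    try_length: int
    handler_offset: int
    handler_length: int
    class_token_or_filter_offset: int


def parse_eh_section(data: bytes) -> List[EHClause]:
    """Decode the header and the EH clauses that follow it."""
    kind = data[0]
    if not kind & SECT_EH_TABLE:
        raise ValueError("not an exception handling section")
    if kind & SECT_FAT_FORMAT:
        # Fat header: Kind (1 byte) + DataSize (3 bytes); 24-byte clauses.
        data_size = int.from_bytes(data[1:4], "little")
        clause_format, clause_size = "<IIIIII", 24
    else:
        # Small header: Kind (1 byte) + DataSize (1 byte) + 2 padding bytes;
        # 12-byte clauses.
        data_size = data[1]
        clause_format, clause_size = "<HHBHBI", 12
    # DataSize is assumed to count the 4-byte header as well as the clauses.
    count = (data_size - 4) // clause_size
    return [
        EHClause(*struct.unpack_from(clause_format, data, 4 + i * clause_size))
        for i in range(count)
    ]
```

Both header formats occupy 4 bytes, which is why the clauses always start at offset 4 in this sketch.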
Types of SEH Clauses

Exception handling clauses are classified by the algorithm of the handler engagement. Four mutually exclusive EH clause types are available, and because of that the Flags entry must hold one of the following values:
Figure 11-1 illustrates this process. If an exception of type A is thrown within the guarded block, it is caught and processed by the first handler (catch A), and the finally handler is engaged when the first handler invokes the leave instruction. If an exception of type B is thrown, it is caught by the third handler (catch B), and the finally handler is executed before the third handler. If no exception is thrown within the guarded block, the finally handler is engaged when the guarded block invokes the leave instruction.
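Because the four clause types are mutually exclusive, the Flags entry behaves like an enumeration rather than a bit mask. The short Python sketch below illustrates the idea. The numeric values are assumptions taken from the ECMA-335 specification, not from this text, and `ClauseKind` and `clause_kind` are hypothetical names.

```python
from enum import IntEnum


class ClauseKind(IntEnum):
    # Values assumed from ECMA-335 (Partition II).
    CATCH = 0x0000    # typed handler, engaged by exception type
    FILTER = 0x0001   # handler engaged by a user-supplied filter block
    FINALLY = 0x0002  # handler engaged whenever the guarded block is exited
    FAULT = 0x0004    # handler engaged only when an exception occurs


def clause_kind(flags: int) -> ClauseKind:
    """Return the clause type, rejecting any combination of flags."""
    try:
        return ClauseKind(flags)
    except ValueError:
        raise ValueError(f"invalid EH clause Flags value: {flags:#06x}") from None
```

A value such as 0x0003, which would combine the filter and finally types, is rejected rather than interpreted.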
Label Form of SEH Clause Declaration

The most generic form of IL assembly language (ILAsm) notation of an EH clause is as follows:
.try <label> to <label> <EH_type_specific> handler <label> to <label>

where <EH_type_specific> ::=

catch <class_ref> |

Take a look at this example:
BeginTry:

Figure 11-1 Engagement of the finally exception handler.

In the final lines of the example, the code .try <label> to <label> defines the guarded block, and handler <label> to <label> defines the handler block. In both cases, the second <label> is exclusive, pointing at the first instruction after the respective block. ILAsm imposes a limitation on the positioning of the EH clause declaration directives: all labels used in the directives must have already been defined. Thus, the best place for EH clause declarations in the label form is at the end of the method scope.

In the case just presented, the handler block immediately follows the guarded block, but we could put the handler block anywhere within the method, provided it does not overlap with the guarded block or other handlers:
A single guarded block can have several handlers:
In the case of multiple handlers (catch or filter, but not finally or fault), the guarded block declaration need not be repeated:
.try BeginTry to KeepGoing

The lexical order of handlers belonging to the same guarded block is the order in which the ILAsm compiler emits the EH clauses, and hence is the same order in which the execution engine of the runtime processes these clauses. We must be careful about ordering the handlers. For instance, if we swap the handlers in the preceding example, the handler for [mscorlib]System.Exception will always work and the handler for [mscorlib]System.StackOverflowException will never work. This is because all exceptions are derived, eventually, from [mscorlib]System.Exception, and hence all exceptions are caught by the first handler, leaving the other handlers unemployed.

The finally and fault handlers cannot peacefully coexist with other handlers, so if a guarded block has a finally or fault handler, it cannot have anything else. To combine a finally or fault handler with other handlers, we need to nest the guarded and handler blocks within other guarded blocks, as shown in Figure 11-1, so that each finally or fault handler has its own personal guarded block.
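The effect of handler ordering described above can be modeled in a few lines of Python. This is only an analogy: Python's `RecursionError` stands in for [mscorlib]System.StackOverflowException, Python's `Exception` stands in for [mscorlib]System.Exception, and `first_matching_handler` is a hypothetical helper that mimics walking the clauses in table order.

```python
from typing import Optional, Sequence, Type


def first_matching_handler(
    handlers: Sequence[Type[BaseException]], exc: BaseException
) -> Optional[int]:
    """Return the index of the first handler whose type matches exc."""
    for index, handled_type in enumerate(handlers):
        # A typed handler is engaged for the declared type and its descendants.
        if isinstance(exc, handled_type):
            return index
    return None


# Specific handler first: it gets a chance to run.
specific_first = [RecursionError, Exception]
# Base handler first: it intercepts everything, so index 1 is never reached.
base_first = [Exception, RecursionError]
```

With `specific_first`, a `RecursionError` selects handler 0 (the specific one) and any other exception selects handler 1. With `base_first`, every exception selects handler 0, so the specific handler is dead code.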
Scope Form of SEH Clause Declaration

The label form of the EH clause declaration is universal, ubiquitous, and close to the actual representation of the EH clauses in the EH table. The only quality the label form lacks is convenience. In view of that, ILAsm offers an alternative form of EH clause description: a scope form. You've already encountered the scope form in Chapter 2, which discussed protecting the code against possible surprises in the unmanaged code being invoked. Just to remind you, here's what the protected part of the method (from the sample file Simple2.il on the companion CD) looks like:
The scope form can be used only for a limited subset of all possible EH clause configurations: the handler blocks must immediately follow the previous handler block or the guarded block. If the EH clause configuration is different, we must resort to the label form or a mixed form:
The IL Disassembler by default outputs the EH clauses in the scope form, at least those clauses that can be represented in this form. However, we have the option to suppress the scope form and output all EH clauses in their generic label form. But let's suppose for the sake of convenience that we can shape the code in such a way that the contiguity condition is satisfied, allowing us to use the scope form. A single guarded block with multiple handlers in scope form will look like this:
.try {

Much more readable, isn't it? The nested EH configuration shown earlier in Figure 11-1 is easily understandable when written in scope form:
.try {

The filter EH clauses in scope form are subject to the same limitation: the handler block must immediately follow the guarded block. But because in a filter clause the handler block includes first the filter block and then, immediately following it, the actual handler, the scope form of a filter clause looks like this:
.try {

And, of course, we can easily switch between scope form and label form within a single EH clause declaration. The general ILAsm syntax for an EH clause declaration is as follows:
<EH_clause> ::= .try <guarded_block>

The nonterminals <label> and <class_ref> must be familiar by now, and the meaning of <scope> is obvious: "code enclosed in curly braces."
Processing the Exceptions

The execution engine of the runtime processes an exception in two passes. The first pass determines which, if any, of the managed handlers will process the exception. Starting at the top of the EH table for the current method frame, the execution engine compares the address where the exception occurred to the TryOffset and TryLength entries of each EH clause. If it finds that the exception happened in a guarded block, the execution engine checks to see whether the handler specified in this clause will process the exception. (The "rules of engagement" for catch and filter handlers were discussed in previous sections.) If this particular handler can't be engaged (for example, the wrong type of exception has been thrown), the execution engine continues traversing the EH table in search of other clauses that have guarded blocks covering the exception locus. The finally and fault handlers are ignored during the first pass.

If none of the clauses in the EH table for the current method are suited to handle the exception, the execution engine steps up the call stack and starts checking the exception against EH tables of the method that called the method where the exception occurred. In these checks, the call site address serves as the exception locus. This process continues from method frame to method frame up the call stack, until the execution engine finds a handler to be engaged or until it exhausts the call stack. The latter case is the end of the story: the execution engine cannot continue with an unhandled exception on its conscience, and the runtime either aborts the application execution or offers the user a choice between aborting the execution and invoking the debugger, depending on the runtime configuration.

If a suitable handler is found, the execution engine swings into the second pass. The execution engine again walks the EH tables it worked with during the first pass and invokes all relevant finally and fault handlers.
Each of these handlers ends with the endfinally instruction (or endfault, its synonym), signaling the execution engine that the handler has finished and that it can proceed browsing the EH tables. Once the execution engine reaches the catch or filter handler it found on its first pass, it engages the actual handler.

What happens to the method's evaluation stack? When a guarded block is exited in any way, the evaluation stack is discarded. If the guarded block is exited peacefully, without raising an exception, the leave instruction discards the stack; otherwise, the evaluation stack is discarded the moment the exception is thrown.

During the first pass, the execution engine puts the exception object on the evaluation stack every time it invokes a filter block. The filter block pops the exception object from the stack and analyzes it, deciding whether this is a job for its actual handler block. The decision, in the form of int32 having the value 1 or 0, is the only thing that must be on the evaluation stack when the endfilter instruction is reached; otherwise, the IL verification will fail. The endfilter instruction takes this value from the stack and passes it to the execution engine.

During the second pass, the finally and fault handlers are invoked with an empty evaluation stack. Because these handlers do nothing about the exception itself and work only with method arguments and local variables, the execution engine doesn't bother providing the exception object. If anything is left on the evaluation stack by the time the endfinally (or endfault) instruction is reached, it is discarded by endfinally (or endfault).

When the actual handler is invoked, the execution engine puts the exception object on the evaluation stack. The handler pops this object from the stack and handles it to the best of its abilities. When the handler is exited by using the leave instruction, the evaluation stack is discarded. Table 11-2 summarizes the stack evolutions.
Table 11-2 Changes in the Evaluation Stack
Two IL instructions are used for raising an exception explicitly: throw and rethrow. The throw instruction takes the exception object (ObjectRef) from the stack and raises the exception. This instruction can be used anywhere, within or outside any EH block. The rethrow instruction can be used within actual handlers only (not within the filter block), and it does not work with the evaluation stack. This instruction signals the execution engine that the handler that was supposed to take care of the caught exception has for some reason changed its mind and that the exception should therefore be offered to the higher-level EH clauses. The only blocks where the words "caught exception" mean something are the actual handler block and the filter block, but invoking rethrow within a filter block, though theoretically possible, is illegal. It is legal to throw the caught exception from the filter block, but it doesn't make much sense to do so: the effect is the same as if the filter simply refused to handle the exception, by loading 0 on the stack and invoking endfilter. Rethrowing an exception is not the same as throwing the caught exception, which we have on the evaluation stack upon entering an actual handler. The rethrow instruction preserves the call stack trace of the original exception so that the exception can be tracked down to its point of origin. The throw instruction starts the call stack trace anew, giving us no way to determine where the original exception came from.
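The two-pass processing described in this section can be modeled with a small Python simulation. This is a simplified sketch, not the runtime's implementation: `Clause`, `Frame`, and `dispatch` are hypothetical names, Python exception classes stand in for managed exception types, and the sketch assumes that clauses for inner guarded blocks precede those for outer blocks in each table.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Clause:
    kind: str            # "catch", "filter", "finally", or "fault"
    try_offset: int
    try_length: int
    handler: str         # a label used only for reporting
    catch_type: Optional[type] = None
    filter_fn: Optional[Callable[[BaseException], bool]] = None

    def covers(self, locus: int) -> bool:
        return self.try_offset <= locus < self.try_offset + self.try_length


@dataclass
class Frame:
    name: str
    locus: int           # exception address, or call site address in callers
    clauses: List[Clause]


def _accepts(clause: Clause, exc: BaseException) -> bool:
    if clause.kind == "catch":
        return isinstance(exc, clause.catch_type)
    if clause.kind == "filter":
        return bool(clause.filter_fn(exc))
    return False         # finally and fault are ignored during the first pass


def dispatch(frames: List[Frame], exc: BaseException) -> Optional[List[str]]:
    """frames[0] is the throwing method; later entries are its callers."""
    # First pass: find the catch or filter clause that will take the exception.
    target = None
    for fi, frame in enumerate(frames):
        for ci, clause in enumerate(frame.clauses):
            if clause.covers(frame.locus) and _accepts(clause, exc):
                target = (fi, ci)
                break
        if target is not None:
            break
    if target is None:
        return None      # unhandled: the call stack is exhausted
    # Second pass: run finally and fault handlers on the way to the target.
    tf, tc = target
    trace = []
    for fi in range(tf + 1):
        frame = frames[fi]
        limit = tc if fi == tf else len(frame.clauses)
        for clause in frame.clauses[:limit]:
            if clause.kind in ("finally", "fault") and clause.covers(frame.locus):
                trace.append(f"{frame.name}:{clause.handler}")
    trace.append(f"{frames[tf].name}:{frames[tf].clauses[tc].handler}")
    return trace
```

The returned list gives the order in which handlers would be engaged: first the finally and fault handlers encountered during the second pass, then the catch or filter handler selected during the first pass.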
Exception Types

Chapter 10 mentioned some of the exception types that can be thrown during the execution of IL instructions. Earlier chapters mentioned some of the exceptions thrown by the loader and the JIT (just-in-time) compiler. Now it's time to review all these exceptions in an orderly manner.

All managed exceptions defined in the .NET Framework classes are descendants of the [mscorlib]System.Exception class. This base exception type, however, is never thrown by the common language runtime. In the following sections, I've listed the exceptions the runtime does throw, classifying them by major runtime subsystems. Enjoying the monotonous repetition no more than you do, I've omitted the [mscorlib]System. part of the names, common to all exception types. As you can see, many of the exception type names are self-explanatory.
Loader Exceptions

The loader represents the first line of defense against erroneous applications, and the exceptions it throws are related to the file presence and integrity.
JIT Compiler Exceptions

The JIT compiler throws only two exceptions. The second one can be thrown only when the security services are engaged.
Execution Engine Exceptions

The execution engine throws a wide variety of exceptions, most of them related to the operations on the evaluation stack. A few exceptions are thrown by the thread control subsystem of the engine.
Interoperability Exceptions

The following exceptions are thrown by the interoperability services of the common language runtime, which are responsible for managed and unmanaged code interoperation:
Subclassing the Exceptions

In addition to the plethora of exception types already defined in the .NET Framework classes, you can always devise your own types tailored to your needs. The best way to do this is to derive your exception types from the "standard" types listed in the preceding sections.

The following exception types are sealed and can't be subclassed. Again, I've omitted the [mscorlib]System. portion of the names.
Unmanaged Exception Mapping

When an unmanaged Win32 exception occurs within a native code segment, the execution engine maps it to a managed exception that is thrown in its stead. The different types of unmanaged exceptions, identified by their status code, are mapped to the managed exceptions as described in Table 11-3.

Table 11-3 Mapping Between the Managed and Unmanaged Exceptions
SEH Clause Structuring Rules

The rules for structuring EH clauses within a method are neither numerous nor overly complex:

All the blocks (try, filter, handler, finally, and fault) of each EH clause must be fully contained within the method code. No block can protrude from the method.

Blocks belonging to the same EH clause or different EH clauses can't partially overlap. A block either is fully contained within another block or is completely outside it. If one guarded block (A) is contained within another guarded block (B) but is not equal to it, all handlers assigned to A must also be fully contained within B.

A handler block of an EH clause can't be contained within a guarded block of the same clause, and vice versa. Neither can a handler block be contained in another handler block that is assigned to the same guarded block.

A filter block can't contain any guarded blocks or handler blocks.

All blocks must start and end on instruction boundaries, that is, at offsets corresponding to the first byte of an instruction. Prefixed instructions must not be split, meaning that you can't have constructs such as tail. .try { call ... }.

A guarded block must start at a code point where the evaluation stack is empty.

The same handler block can't be associated with different guarded blocks:
.try Label1 to Label2 catch A handler Label3 to Label4

If the EH clause is a filter type, the filter's actual handler must immediately follow the filter block. Since the filter block must end with the endfilter instruction, this rule can be formulated as "the actual handler starts with the instruction after endfilter."

If a guarded block has a finally or fault handler, the same block can have no other handler. If you need other handlers, you must declare another guarded block, encompassing the original guarded block and the handler:
.try {
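The containment rules in this section lend themselves to a mechanical check. The Python sketch below classifies the relationship between two blocks, each given as an (offset, length) pair in the style of TryOffset and TryLength. The function name `block_relation` and its return strings are hypothetical, and the sketch covers only the "no partial overlap" rule, not the full rule set.

```python
from typing import Tuple

Block = Tuple[int, int]  # (offset, length), both in bytes of IL code


def block_relation(a: Block, b: Block) -> str:
    """Classify two blocks as 'disjoint', 'nested', or 'partial overlap'."""
    a_start, a_end = a[0], a[0] + a[1]   # the end offset is exclusive
    b_start, b_end = b[0], b[0] + b[1]
    if a_end <= b_start or b_end <= a_start:
        return "disjoint"
    if a_start <= b_start and b_end <= a_end:
        return "nested"                  # b lies within a
    if b_start <= a_start and a_end <= b_end:
        return "nested"                  # a lies within b
    return "partial overlap"             # illegal under the rules above
```

Only the "partial overlap" result signals a violation. Two blocks that merely touch, such as a guarded block immediately followed by its handler, count as disjoint because the end offset is exclusive.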