Chapter 4: Concurrency continued
Internal Object References

The FTM check box I just mentioned is the single biggest source of confusion and frustration among users new to COM and the Visual C++ development system. The user interface makes it appear like a common choice, somewhat akin to providing error support. (The check box appears next to that option in the object creation dialog box.) But context-neutral objects are rare, and at the very least, the FTM check box should have been hidden behind an Advanced button or some such user interface device. Instead, developers check the box innocently, later forgetting that they did. But the effects of aggregating the free-threaded marshaler are dramatic and, unbeknownst to the budding, point-and-shoot Visual C++ developer, turn the programming model of COM+ objects on its head.

The biggest change for COM+ object implementation involves storing interface pointers. You expect to be able to create an object and store it in a data member, or use that data member to store an interface pointer passed as an [in] or [in, out] parameter after increasing its reference count, and then access this stored interface pointer at any time before finally releasing it. But since every method invocation can occur under a different context for a context-neutral object, no interface pointer that was created, passed, or otherwise obtained during one call can be accessed directly in subsequent calls. Instead, the context-neutral object must ensure that such interface pointers are properly marshaled into each caller’s context before accessing them, even within the object’s own implementation. Since the standard COM+ marshaling APIs (CoMarshalXXX, CoUnmarshalInterface, CoGetInterfaceAndReleaseStream, and so on) require the participation of threads in both the exporting and importing contexts, context-neutral objects usually use the global interface table (GIT) to store interface pointers for access across method invocations.
This mechanism works by table-marshaling the pointer into a global location, and then returning a registration cookie to the caller. Since the pointer was table marshaled, the cookie can be used later to import the pointer into the retrieving context as often as desired. Context-neutral objects therefore never store interface pointers; instead, they store registration cookies. (This is what I meant when I said earlier that there would be dramatic changes to the programming model.) Specifically, you could accomplish this by using the following code:
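A minimal, portable sketch of the cookie pattern may help here. It does not call the real COM+ API: GlobalTable is a hypothetical in-memory stand-in for the global interface table (whose actual methods are IGlobalInterfaceTable::RegisterInterfaceInGlobal, GetInterfaceFromGlobal, and RevokeInterfaceFromGlobal), and Bar and Neutral are illustrative names. No marshaling takes place in this model; it only shows that the object keeps a cookie and re-imports the pointer on every call.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for an interface such as IBar.
struct Bar {
    std::string name;
};

// Hypothetical in-memory model of the global interface table. The real GIT
// table-marshals the pointer; this model only hands out shared_ptr copies.
class GlobalTable {
public:
    using Cookie = std::uint32_t;

    // Models RegisterInterfaceInGlobal: store the pointer, return a cookie.
    Cookie Register(std::shared_ptr<Bar> p) {
        std::lock_guard<std::mutex> lock(mutex_);
        Cookie c = next_++;
        entries_[c] = std::move(p);
        return c;
    }

    // Models GetInterfaceFromGlobal: may be called as often as desired.
    std::shared_ptr<Bar> Get(Cookie c) const {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = entries_.find(c);
        if (it == entries_.end()) return nullptr;
        return it->second;
    }

    // Models RevokeInterfaceFromGlobal: returns true if the cookie existed.
    bool Revoke(Cookie c) {
        std::lock_guard<std::mutex> lock(mutex_);
        return entries_.erase(c) == 1;
    }

private:
    mutable std::mutex mutex_;
    std::map<Cookie, std::shared_ptr<Bar>> entries_;
    Cookie next_ = 1;  // 0 is reserved here to mean "no cookie"
};

// Context-neutral object: it stores a cookie, never the pointer itself.
// (A real object would also guard cookie_ against concurrent callers.)
class Neutral {
public:
    explicit Neutral(GlobalTable& git) : git_(git) {}
    ~Neutral() {
        if (cookie_ != 0) git_.Revoke(cookie_);
    }

    // Models a method taking an [in] pointer: register it, keep the cookie.
    void Foo(std::shared_ptr<Bar> bar) {
        if (cookie_ != 0) git_.Revoke(cookie_);
        cookie_ = git_.Register(std::move(bar));
    }

    // A later call re-imports the pointer from the table before using it.
    std::string UseBar() const {
        std::shared_ptr<Bar> bar = git_.Get(cookie_);
        return bar ? bar->name : std::string();
    }

private:
    GlobalTable& git_;
    GlobalTable::Cookie cookie_ = 0;
};
```

In this sketch, Foo replaces any previously stored cookie and the destructor revokes the last one, which mirrors the rule that every registration should eventually be revoked.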
STDMETHODIMP CNeutral::Foo(/*[in]*/ IBar* piBar)

But Is It Fast?

I once had clients who told me that they always checked the FTM check box for all new COM objects they developed. When I asked them why they did this, they explained that they had read the free-threaded marshaler was supposed to make COM objects faster. The option therefore was much like the Turbo button on older PCs: you could go slow if you wanted to, but most people preferred to go fast, especially when there were no penalties.

Jokes aside, it is true that the performance gains derived from using the FTM can be significant, especially when you short-circuit cross-apartment interceptors that would otherwise cause thread switches, as opposed to lightweight proxies. It is typical for a direct call to execute up to 30 times faster than a thread-switched one. The duration of execution of your performance-critical, context-neutral code obviously plays a crucial role here, and the benefit of context neutrality will lessen as the ratio of actual code execution time to middleware overhead increases. In other words, don’t expect much gain from making a large prime-number generator context neutral.

There are situations, however, in which aggregating the FTM actually will make an object slower. To understand how this can happen, consider the fact that a context-neutral object always accesses other objects from the context of its caller. Suppose that these other objects reside in the MTA, while most callers execute on STA threads. In this case, we encounter no overhead during the initial call from an STA into our context-neutral object, but in the context-neutral code we face one expensive thread switch for every MTA object that we access, whether the objects’ interface pointers are arguments to our methods or we stored interface pointers to the MTA objects in the GIT. If we had assigned threading model Free to our object instead of making it context neutral, we would encounter only one thread switch total per call.
Determining the actual gain of context neutrality therefore can be a complicated matter and requires careful analysis of an object’s actions. For this reason, evaluating the performance impact of context neutrality is similar to assessing the impact of an object’s threading model.
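The STA-caller scenario above can be reduced to a counting exercise. The sketch below is a deliberately simplified model, not a measurement: it assumes that every call crossing from an STA thread into the MTA costs exactly one thread switch, that each MTA object is accessed once per method call, and that lightweight proxy overhead is zero. The function and type names are hypothetical.

```cpp
#include <cassert>

// Where the calling thread lives.
enum class Caller { STA, MTA };

// Thread switches per method call for a context-neutral (FTM) object that
// accesses `mtaObjects` MTA-resident objects during the call. The entry
// call is always direct; each MTA object reached from an STA thread costs
// one switch.
constexpr int SwitchesContextNeutral(Caller caller, int mtaObjects) {
    return caller == Caller::STA ? mtaObjects : 0;
}

// The same workload for an object with threading model Free, which lives
// in the MTA. An STA caller pays one switch to enter the MTA; the MTA
// objects are then reached without further switches.
constexpr int SwitchesFree(Caller caller, int /*mtaObjects*/) {
    return caller == Caller::STA ? 1 : 0;
}
```

Under these assumptions, an STA caller whose call touches five MTA objects pays five switches through the context-neutral object and one through the Free-threaded object, while a call that touches no MTA objects favors context neutrality (zero switches against one).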
FTM vs. TNA

When considering context neutrality, the designer’s goal is usually the elimination of the expensive kind of proxy: the thread-switching kind. Before COM+, the FTM was the only way to accomplish this. However, boosting performance by eliminating thread switches is precisely why the thread-neutral apartment was invented. And where applicable, the thread-neutral apartment is a far more elegant solution than the free-threaded marshaler: it’s like having COM+ solve your performance problem for you, instead of fighting the apartment model.

I would estimate that the TNA is now a better solution for at least 80 percent of the cases that used FTM before COM+. The chief advantage the thread-neutral apartment has over the free-threaded marshaler is that it does not alter the familiar COM+ programming model with respect to stored object references. No more cookies, no more GIT, no more worrying about keeping alive all apartments that ever imported an interface pointer or in whose scope an object was created and saved: just plain vanilla COM+ object implementation, easy and clean. And if you can manage to implement entire layers of your project in the thread-neutral apartment, the overhead of further method invocation once the calling thread has entered the first TNA object is likely very low to nonexistent. (The method invocation overhead is throttled only by the overhead of any extended, interceptor-based COM+ services you might be using.)

Of course, the factors that can cause a context-neutral object to be slow also can affect an object in the thread-neutral apartment. If, for example, such an object needs to access multiple MTA objects per method invocation but is mostly called on STA threads, the thread-neutral apartment might have been a poor choice for the sake of improving performance. The multithreaded apartment (threading model Free) remains the superior option in this scenario. Of course, the FTM still eliminates the overhead of even lightweight proxies.
For objects that stand to gain from this further optimization, the trade-off will be between actual performance gain on one hand and ease of implementation as well as maintainability on the other. And interception-based, extended COM+ services (such as transactions and synchronization) will remain unavailable to the FTM user. While the TNA gradually will emerge as an FTM replacement, expect context neutrality to remain a useful tool for a relatively small set of specialized system objects.
It’s the Object’s Choice

Context neutrality should be considered a detail of an object’s implementation, completely transparent to its clients. When manually marshaling interface pointers between contexts and apartments, clients should not rely on the context neutrality of the object and shouldn’t simply pass raw interface pointers around. Passing raw interface pointers is especially tempting in the implementation of the context-neutral object itself, which must use the GIT to transport interface pointers across method invocations. But coupling to this detail of object implementation violates one of the fundamental tenets of COM+ as well as object-oriented programming in general: the separation between interface and implementation.

Think of it this way: following the rules and using marshaling mechanisms instead of passing raw object references across contexts insulates you against a change in the object implementation, namely the details of how the object implements IMarshal. This change would require no new interfaces, no new class IDs, nothing that is externally visible. And the client’s invocation of the marshaling mechanism costs nothing as long as the object does aggregate the FTM; the object itself then short-circuits the marshaling.

You can make exceptions to this rule, but not many. Sometimes an object needs to be context neutral to perform its service. Perhaps the best-known example of this is the IStream pointer returned from CoMarshalInterThreadInterfaceInStream. This pointer is certified to be context neutral. If the pointer were not context neutral, there would be no reason to call the API, since its purpose is to marshal a given interface pointer to another context in the same process. Note, however, that the claim of context neutrality holds only for the particular IStream implementation returned from this API, and not for IStream in general.
Concurrency Design Guidelines

Concurrency is absolutely crucial to the scalability of any software system. Adding processors to a machine or adding hosts to a distributed architecture won’t help if your system already stalls at 40 percent processor utilization, with threads blocking for access to shared resources. Concurrency is vital and concurrency is hard, so much so that it is best to let someone else handle it. See Chapter 13 for a history of concurrency management by operating system services and for details on how to keep your own architecture free from concurrency management.
The Best Concurrency Is No Concurrency

The need for managing concurrency in the layers of your own code begins when object references are shared, when multiple clients contend for access to the same server objects and are stalled either by COM+ synchronization services or by code in the server object itself that now needs to manage concurrent access to its internal data structures. Try to avoid such stalls by not sharing access to server object instances in the first place. Try to keep concurrency management out of all layers of your project.
Breaking the Rules: The Case of ASP

Purposeful ignorance of an object’s threading model and marshaling implementation is an important design rule. Threading models and the path of object interactions are important considerations throughout a project’s life cycle to produce a well-performing product, yet nothing in the actual implementation should make assumptions about these internals in other objects. Violating this rule limits flexibility, makes maintenance difficult, and leads to brittle products. But at least one popular technology violates this rule on a number of occasions: Active Server Pages (ASP).

ASP places restrictions on objects that are stored in session or application state. If an object whose threading model is Apartment is stored in session state, ASP locks down the session to the single thread used to create that object. The intention behind this locking is clear: since access to the object will require switching to the creating thread, forcing that thread to be the one making those calls likely will improve system performance by eliminating expensive thread switches and stalled threads. On the other hand, users whose sessions are now bound to this thread must wait for the thread to become available to service their requests.

The rules for objects in application state are even more draconian: an object may be stored in application state only if its threading model is Both and if it aggregates the free-threaded marshaler (unless the AspTrackThreadingModel configuration property has been set to 1). This means that ASP actually queries your object for IMarshal, and then determines whether its implementation is provided by the free-threaded marshaler (perhaps by calling GetUnmarshalClass) before allowing it to be stored in application state. That’s getting rather cozy with your implementation details, isn’t it?
But again we recognize the reasoning: proxies and thread switches likely would affect every Web application on the server, a situation best avoided. ASP is ultimately justified in penetrating your object’s encapsulation to guarantee the performance of all Web applications on the server. Even so, it does not blindly rely on your objects’ context neutrality; it merely determines whether they are context neutral and chooses a course of action based on the result. Remember, ASP is a large and comprehensive framework for building Web applications, hardly reminiscent of a typical application project. Chances are that your own code will be best served by treating the marshaling details of server objects as a black box.
Exceptions: The Case of Client Notification

In typical software projects, you can apply the single-client-per-server-object design rule to about 95 to 100 percent of all objects. You cannot always apply this rule to 100 percent of the objects because certain types of problems cannot be handled efficiently without centralizing some transient data in memory. Client notification is a good example of such a problem. Databases offer pessimistic and optimistic locking models that inform clients of either concurrency in progress or a conflict at the time of the data update, respectively. But databases generally do not offer a mechanism by which clients can continually track the current state of changing data.

If clients in your project need to track changing data, you might find sharing event distribution objects unavoidable. Polling the database might be an alternative, but it tends to further reduce scalability for anything but the smallest number of clients and the longest polling intervals. Sharing an event distribution object does not mean your business logic must block against the concurrent notification mechanism. In fact, it is best to carry out notification asynchronously from the necessarily serial portions of business logic. This suggestion holds true whether notification is initiated directly by your objects or by artifacts you add to your database schema, such as stored procedures and triggers.

Note that it also is not necessary to dispatch all callbacks from a single context, apartment, process, or host. Even problems such as notification, which introduce scalability concerns into your architecture by forcing concurrency into it, can benefit from some amount of distribution. For example, you might allow clients to register for callbacks with a number of hosts. This eases the load on any single node but forces you to keep a network of distribution nodes in sync.
The client notification pattern shows that there are situations in which you cannot avoid managing some amount of concurrency yourself. The previous example contains a data structure of callback registrations at each notification distribution node that must be protected from concurrent access. In a C++ implementation, you might find a Standard Template Library (STL) container protected by a Win32 synchronization object. A Visual Basic implementation might use the Shared Property Manager (SPM). But all implementations will need to regulate concurrent access through locking mechanisms controlled by your code. You will be able to make the best use of the information in the upcoming section on locking when faced with situations like these. Such cases should be few and far between, but pay particular attention to your design when you encounter them. The scalability of your product might depend on it.
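For the C++ case, the protected registration container might look roughly like the following. This is a portable sketch under stated assumptions: it substitutes std::mutex for a Win32 synchronization object, uses std::function in place of COM callback interfaces, and NotificationHub with all of its members is a hypothetical name. The registrations are copied while the lock is held and the callbacks are invoked after it is released, so a slow subscriber does not stall registration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical notification distribution node: an STL container of
// callback registrations guarded by a lock that this code controls.
class NotificationHub {
public:
    using Callback = std::function<void(const std::string&)>;
    using Token = unsigned;

    // Adds a registration and returns a token for later removal.
    Token Subscribe(Callback cb) {
        std::lock_guard<std::mutex> lock(mutex_);
        Token t = next_++;
        callbacks_.emplace(t, std::move(cb));
        return t;
    }

    // Removes a registration; returns false if the token is unknown.
    bool Unsubscribe(Token t) {
        std::lock_guard<std::mutex> lock(mutex_);
        return callbacks_.erase(t) == 1;
    }

    // Delivers an event to all current registrations and returns how many
    // callbacks were invoked. The lock is held only while copying.
    std::size_t Notify(const std::string& event) const {
        std::vector<Callback> snapshot;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            for (const auto& entry : callbacks_) {
                snapshot.push_back(entry.second);
            }
        }
        for (const auto& cb : snapshot) {
            cb(event);
        }
        return snapshot.size();
    }

private:
    mutable std::mutex mutex_;
    std::map<Token, Callback> callbacks_;
    Token next_ = 1;
};
```

One consequence of the snapshot design is that a callback removed while a notification is in flight may still be invoked once; whether that is acceptable depends on the application.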
Standard Synchronization Settings

Following are some general statements that describe which synchronization settings apply to which object, depending on its functional category in your project. These observations serve as a good rule of thumb, but keep in mind that all rules were made to be broken.
Concurrency in Local Servers

In the days before MTS, the local server was a popular option for objects that had to be isolated from their clients for stability reasons, because clients in multiple processes could access objects that needed to share a process, or because an object might have to survive beyond the lifetime of the creating client process. The local server remains useful today for objects that are functionally tied to a particular client executable, especially regarding OLE and Automation of a client’s user interface. It makes little sense to have an Automation object intended to alter the number of columns on the currently visible spreadsheet loaded in process or in the process space of a surrogate, when the spreadsheet application itself runs in its own executable. Local servers are also the best option for objects that must run within a Windows NT or Windows 2000 service process, whether because the objects require the local system’s security context, because a service’s preloading behavior turns out to be beneficial, or perhaps because an IT organization prefers to use the administrative mechanisms of Windows NT and Windows 2000 services for your server.

These special cases aside, the advent of MTS and COM+ has made the local server almost obsolete. Two factors have contributed to the local server’s demise: lack of need and lack of new features. The lack of need stems from the MTS and COM+ "server application," which allows in-process servers to run in their own process space, eliminating what was previously the most common motivation for choosing a local server over an in-process server. The lack of new features is a result of the inability to install local server objects in the COM+ catalog. Therefore, these unconfigured local server objects cannot take advantage of COM+ transactions, synchronization domains, just-in-time activation, event subscription, and so on.
In addition, local servers always have exhibited a number of unpleasant idiosyncrasies with regard to concurrency management. Let’s examine them now.
Apartments in Local Servers

A ThreadingModel named value under the LocalServer32 key in the registry has no effect and normally is omitted. Instead, the apartment used for calls to each registered IClassFactory interface is determined by the apartment membership of the thread that makes the registration call via CoRegisterClassObject. Therefore, it is possible to force each class of object into a different apartment by registering each factory in a different apartment. Of course, each factory has the freedom to force each object instance into yet another apartment at instantiation time, as previously discussed.

A local server that contains more than one apartment type is referred to as a mixed-mode server. Multi-apartment local servers can be a challenge to implement, since each STA thread must be kept alive for as long as objects may remain in its apartment. Therefore, all these threads must be coordinated for startup and shutdown, introducing race conditions, which we will examine next.

The factory registration mechanism also implies that local servers cannot instantiate objects in the thread-neutral apartment, since threads cannot register themselves for TNA membership. The thread-neutral apartment would have held lesser appeal in any case, since clients encounter the heavy burden of a process switch on each call anyway. Nevertheless, the TNA could be useful for callback interfaces handed to internally created in-process servers. If you want this benefit, you must go with a COM+ server application instead of a local server, or you must use the free-threaded marshaler.
Local Server Pitfalls

At least two well-known race conditions are associated with multithreaded local servers. The first has to do with server startup, the second with server shutdown. When initializing, each thread representing an apartment calls CoRegisterClassObject multiple times, once for each object class to be instantiated in that apartment. Then the thread enters a message loop until it receives a quit message, which signals impending process termination and instructs the thread to revoke all registered factories and then terminate. But object instantiation requests can arrive as soon as a factory has been registered and, in the case of an STA, the thread has begun servicing its message loop. This means that process termination can be initiated before all threads even have completed their initialization sequence, as a result of the entire "early" set of objects being released. In turn, this can lead to a situation in which threads fulfill new instantiation requests as they complete their initialization, after the process already has decided to shut down. When the shutdown does occur, clients are then left with disconnected proxies, resulting in errors when they attempt to call methods on those proxies.

The solution to this problem consists of registering all class factories in a suspended state (REGCLS_SUSPENDED). As each thread finishes registering factories and enters its message loop, it decrements an interlocked counter, and when that counter reaches zero, one thread calls CoResumeClassObjects, allowing access to all factories by the system simultaneously.

The second race condition is similar, albeit associated with server shutdown. When the server’s last object instance is released and the server decides to shut down, it posts quit messages to all threads. But instantiation requests can arrive at factories before each thread processes its quit message and revokes its factories. Again, clients will be left with disconnected proxies.
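Returning to the startup race, the interlocked-counter fix can be modeled in a few lines. The sketch below is portable C++ rather than Win32 code: std::atomic stands in for the interlocked counter, the resume callable stands in for a call to CoResumeClassObjects, and StartupGate is a hypothetical name. It assumes every apartment thread has already registered its factories with REGCLS_SUSPENDED before calling ThreadReady.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical startup gate shared by all apartment threads of a server.
class StartupGate {
public:
    // threadCount: number of apartment threads that must finish registering.
    // resume: invoked exactly once; stands in for CoResumeClassObjects.
    StartupGate(int threadCount, std::function<void()> resume)
        : pending_(threadCount), resume_(std::move(resume)) {}

    // Called by each thread after it has registered its factories suspended.
    // Returns true only on the thread whose decrement reached zero.
    bool ThreadReady() {
        // fetch_sub returns the value held before the decrement.
        if (pending_.fetch_sub(1) == 1) {
            resume_();
            return true;
        }
        return false;
    }

private:
    std::atomic<int> pending_;
    std::function<void()> resume_;
};
```

Because the decrement is atomic, exactly one thread observes the transition to zero, so the factories are resumed once and only after every thread has finished registering.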
COM+ (and COM) provides the functions CoAddRefServerProcess and CoReleaseServerProcess, which assist multithreaded local servers in managing their lifetimes. The shutdown race condition is eliminated if all objects call CoAddRefServerProcess when they are initialized, and CoReleaseServerProcess when they are destroyed. These functions also should be called by the factories’ IClassFactory::LockServer method implementations. The server should begin the shutdown process and post quit messages to its threads when CoReleaseServerProcess returns zero. The race condition is avoided because the COM+ library suspends all registered class factories before returning zero from CoReleaseServerProcess.
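Why suspending the factories before returning zero closes the race can be seen in a small model. This is a portable sketch, not the COM+ library's implementation: ServerLifetime and its members are hypothetical, AddRef and Release model CoAddRefServerProcess and CoReleaseServerProcess, and TryCreateObject models an activation request reaching a class factory. The essential assumption is that the count and the suspended flag change under the same lock.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical model of a process-wide server reference count.
class ServerLifetime {
public:
    // Models CoAddRefServerProcess; returns the new count.
    unsigned long AddRef() {
        std::lock_guard<std::mutex> lock(mutex_);
        return ++count_;
    }

    // Models CoReleaseServerProcess. When the count reaches zero, activation
    // is suspended under the same lock, before zero is returned.
    unsigned long Release() {
        std::lock_guard<std::mutex> lock(mutex_);
        assert(count_ > 0);
        if (--count_ == 0) {
            suspended_ = true;
        }
        return count_;
    }

    // Models an activation request arriving at a factory. On success the new
    // object holds one reference on the server process.
    bool TryCreateObject() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (suspended_) {
            return false;  // too late: this process has committed to shutdown
        }
        ++count_;
        return true;
    }

private:
    std::mutex mutex_;
    unsigned long count_ = 0;
    bool suspended_ = false;
};
```

In this model an activation request either lands before the final release, in which case the count never reaches zero, or after it, in which case the request is refused instead of producing an object in a dying process.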
Last Updated: Friday, July 6, 2001