Click Here to Install Silverlight*
IndiaChange|All Microsoft Sites
MSDN
|Developer Centers|Library|Downloads|How To Buy|Subscribers|My MSDN
 
Thinking about the .NET Garbage Collector - aka make your own toy GC
By jroshan@blr.cognizant.com
 
Article Posted: January 29, 2003
 

This  implementation is based on the .NET GC as described in Jeffrey Richter's MSDN article. I started out writing the GC because (a) I had heard  that the .NET GC was one the best GCs ever written (b) I was mortally afraid of something like this because I had defined the limitations of my understanding to not be able to include something as complex as a GC (c) the article was very good and made me feel like I understood the whole thing. Eventually I began wondering if there are any unanswered questions left. There were many regarding the basic implementation, as described by Mr. Richter, and they have popped up in a more pertinent way now that I have tried to implement a GC of my own, they probably wouldn't have if I hadn't tried this. So largely writing the GC was out of my own curiosity and to possibly get some company of people at my level,  who would like to try a more childish implementation before they move onto the real thing.

Jeffrey Richter's GC Algorithm articles at MSDN Magazine
http://msdn.microsoft.com/msdnmag/issues/1100/GCI/
http://msdn.microsoft.com/msdnmag/issues/1200/GCI2/

Spirit

To avoid potential show stoppers let me say that at the end of this article you are not going to go home with a usable GC. Also this entire thing is just probably a piece of wasted engineering through which I wanted to stretch my own imagination and in the process get some questions about the .NET gc ironed out. Unless the GC interests you this is probably a waste of time to read.

This has been developed when I have had no formal education that covers GCs in any way and I haven't peeped into rotor to see how it is actually being done. I just wanted to keep it real simple and understandable from the non-compscience-proffessor perspective.

This implementation is a long way of asking some basic questions, that is why the questions index.
Please accept it in the spirit in which it is offered.

Questions Index

How does the GC locate the roots of the applications ?
How does the GC know the references that a given object will contain ?
How does back patching pointers work ?
How is the finalize queue implemented ?
How is weak reference table implemented ?
Generations : Is this on the right track?

Index

Disclaimer
Implementation
How objects are allocated on the heap
References to Objects and Object that hold references
The Roots of the application
Code
Mark and sweep
The ref table and fref stack
Finalizers - finalize queue and freachable queue
Weak references
Generations
Finally
Acknowledgements
 

Disclaimer

This is only meant to be my interpretation of the GC algo and is written in the spirit of the .NET GC. This is not meant to be factually correct about the actual GC in terms of algo, design choices or data structures.
This GC in its current form is not practically useful and cannot be plugged into a 'runtime'

Implementation

I will assume that you have read the articles in msdn magazine and continue from there. If you haven't, please do so now. If you have lets get straight into the matter.
In the first few sections I will describe how I have created the equivalent of a heap, objects, reference within objects and application roots.

The central entity in the gc is the managed heap. It is where the gc allocates space for the memory requests by the application. In the managed environment every block of memory requested by the program is used to hold some specific data type.

The heap represents the managed heap stack and heap_top is top pointer. In my GC the memory used for the heap is allocated by the function gc_init() dynamically, This function also initializes the other data structures of the gc. Of course in the real runtime the way space is initially allocated to the managed heap would be rather different.

How objects are allocated on the heap

The main structure that you will need to be familiar with is 'object'. An 'object' describes an instance of an object/a block of allocated memory on the managed heap. 

The important member of 'object' to note here is object::size. Whenever a memory block is to be allocated on the managed heap, the gc is to be told the number of bytes required for it. This size is the valued contained in the object::size. 

The GC creates an instance of object on the heap, immediately following which is the memory area of size bytes. Whenever the gc allocates memory on the managed heap for an application, the total memory used on managed heap (in bytes) will be:
    space allocated = sizeof(object) + size_requested


The 'object' class is thus a header information for each memory block, that defines certain characteristics of the memory region located immediately after it. On the managed heap the request of a memory allocation would look as shown above.

The managed heap consists of many such allocations - each allocation starting with a block defined by the 'object' structure that defines the memory region following it.
Notice from the diagram that the allocation of memory starts from the bottom of the heap and proceeds upwards. The first 'object' will have the same base address as the heap base. The next object will be at location sizeof(object)+object::size and so on.

References to Objects and Objects that hold references :

Now any object can contain references to other objects right ? Since this is meant to be a toy gc I decided to implement references within objects as follows. Every memory allocation is treated as an object. A reference to any object  is created through a root_entry. Every object (optionally) contains an array of root entries. A root_entry in physical terms is a structure that contains a pointer to an object and certain other members. The number of root entries that can be contained by an object is simply object::size/sizeof(root_entry). I use the memory allocated per object to (ie the dark blue area immediately above each red area in the above diagram) to be the region used for creating the objects root_entries.

A root_table is an array of root_entries. Each object can be asked to return a root_table which acts as an array of root_entries. The maximum number of root_entries that the object can contain is determined by its size. Each root entry can contain a short name string and a pointer to another object. Thus an object (object structure + its attached memory region) would now look as follows :

Now, I know that this is not how it works in a real gc at all. This is just to simulate the effect of an object that has references to other objects. This way I can add any number of pointers to an object (limited by the amount of space allocated to it) and reference other objects from this one.

In a real situation the references contained in an object will be defined by the kind of class that the object is an instance of. The runtime will have this knowledge according to the definition of the class.

If the gc is going to be used for some practical purpose I will have to remove this idea of creating root_entrys in the user requested space. I will need another mechanism to detect the presence of the references within an object. But we need not cross that bridge now, for our purpose of simulating objects that have references to other objects this will do.

The Roots of the application :

The roots of the applications are the set of all the pointers held by the application to various objects used by the application. The complete root set of the application are all pointers that reside in applications global area/ static heap/  stack - local to method calls .. etc.

Now the way I look at it, there is a mechanism by which the runtime can access all the roots of the application and thus decide which objects they reference on the managed heap. The set of all reachable objects on the managed heap would be the set of all objects reachable from the applications roots, and objects referenced by each of those objects, and so on recursively.

I just needed some mechanism to have set of pointers to some objects on the stack. In the toy gc I have chosen to allocate a rather large object outside of the heap and used that as a global main object. The root entries in the memory region of that 'main' object is treated as the applications roots. 
So if I am to simulate the effect of a method creating an instance of object on the managed heap, In my GC I would (1) call gc_alloc() to allocate the object on the managed heap and (2) use the address returned by gc_alloc as an entry in the root_table exposed by the main_object - in effect adding one more pointer to my application roots. 

The point is that the exact way in which the gc simulates .NET runtime's accessing the program's roots is not an important factor here, as long as I can simulate the effect of the process.

In reality each object on the heap would be referenced not just from the roots of the application but also from other objects, and might also contain references to other objects within itself. The following is more realistic :

The present setup is similar to what Mr. Richter had at the beginning of the article. This shall for the basis on which I build up the rest of the gc and questions and decisions of the rest of the implementation.

Question : How does the real GC keep track of the roots of the application? The roots are spread out over many memory areas of an application and further since the application has actually run through the JIT at the time of execution, these need to be location accessed by actual machine code and at times even parameters of operations. How are these tracked by the GC?

How are the references in a object of a particular type tracked by the GC? this is a known set unlike the roots of the application, and can be determined by looking at the definition of the class. However does the GC maintain a list of locations for each class, that are references, in the classes instances? How are these tracked?

Maintaining such lists are both expensive and memory hogging for the GC aren't they?

Code

These are the definitions of some of the important things mentioned here : possibly worth  a look.

    #object  
   
struct object
{
	int		flag_type_indicator:2;
	int		flag_used:1;
	int		flag_suppress_finalize:1;
	int		flag_modified:1;

	object **	pthis;
	object **	ppwref;
	int		size;
	int		rtab_count;
	int		rtab_max;
};
	
 
       
    #root_entry  
   
// the following structures are used for simulating 
// pointers/refrences/roots
//
//root_entry - denotes single root
struct root_entry
{
	char 		name[5];	//name of pointer
	object **	ppob;		//pointer to object
	int		flag_valid:1;	
};
//
// root_table - an array of roots 
//(usually located in the extra bytes after
// an object)abstracted as a table.
struct root_table
{
	object * parent_object;
	root_entry * rtab;
	int * rtab_max;
	int * rtab_count;
};
    
 
       
    #weak references  
   
//weak_reference[]s will form the wref tables
struct weak_reference
{
	int		flag_type_indicator:2;
	int		flag_used:1;

	weak_reference ** pthis;
	object ** ppob;
};    
 
       
    #gc calls  
   
// GC
//
extern void 		gc_collect(int generation=0);
extern void 		gc_init();
extern object **	gc_alloc(object *pob);
extern void		gc_register_for_finalize(object ** ppob);
extern void		gc_clear_freachable();
extern object **	gc_add_wref(object ** ppob,int flag_wref_type);

extern void			gc_reset();
//contains the root table (global) of the application
extern root_table  main_rtab;
    
 
       

Mark and sweep

The roots of the application and the references of objects to each other form a graph of objects in the computer memory. All these objects reside in the managed heap. Marking is basically the process by which the GC determines which objects are used. The basic approach is that the GC does a recursive depth first search of the all objects visitable from any given object starting at the roots of the application. Every time an object is encountered it is marked as used, by setting a flag in the 'object' structure.

Consider the above example. The graph formed among the objects will look like this :

A DFS (depth first search) of this graph would not visit the two unreferenced objects and these will not have their 'used' flag set. After the mark phase, the GC will have a colored graph of the heap, something like this :

Here the unreferenced objects have been shown as grey. The GC now goes for its compaction phase - the sweep of the managed heap. In the sweep phase, it collects toward the bottom of the heap, all the objects that have their used flag set. Cool ? Since I can traverse through all the objects on the heap by the simple formula :

    object * pob = (object *) heap;    //first object
    pob = (object *)((char *)pob + pob->size + sizeof(object)); //next object

Now that I can visit each object, sequentially starting from the bottom-most, I need to visit them in O(n) time and decide to push them down in memory by a simple memcpy() (lets just take that as atomic for now). Most of this however would be known to you if you have been through Mr. Richter's  article. After my compaction phase I would not have these grey objects left on the stack, instead of which I would have only the objects that have their used flag set during the 'mark' phase. Thus the GC has swept all garbage away !

 Now the only thing to do is to patch all the pointers in the application so that they point to the new locations of the objects..... !!! what ? In the above example its only one object that moved down in memory. But that is not the case in a real situation, there are many objects that have moved. How am I going to find all the pointers that point to each of those objects ?

Question : How are pointers/references back patched after the objects have been moved?
(1) Does each object maintain an array of all the addresses of all pointers that point to it? If this is so (a) this will have to dynamically grow as references to am object change (b) memory management of these lists per object would become a major issue (c) its an expensive task as the runtime has to constantly keep track of need to change these.

(2) Does the GC create such a structure temporarily during its mark phase ? per object, per reference? So when an object dies that list is to be  searched for all pointers to that object ? What is the efficiency of such a search algorithm, what are the data structures used ? Further the application will have to be frozen when this happens, because if the application creates references to an object after the references to it have been considered, the new reference will not get back patched...

(3) Does the GC search the entire applications memory graph to find references to each of the objects? Unlikely, the DFS is expensive, doing it twice is a rather sad thing.
How, what is the mechanism used ?

The following is the approach that I have chosen to work around these - now you know why I call this a toy GC. :)

The ref table and fref stack

I decided to route the pointers to objects through an indirect reference. Each object on the stack has only one unique location that knows about its actual address in memory. The table of all such pointers to objects on the heap is called the ref table (short for reference table). The following picture shows the relation of an object with its ref table entry.

Every object contains a pointer to its reference table entry called the 'pthis' (makes me smile). If the object is moved around in memory the object goes on to to change the value in *pthis to this and the reference is patched. So when the GC moves objects around in memory, it calls the back_patch_ref() method of each object when it moves. This method will modify the reference table entry and thus the ref table will always to point to the valid location of the object.

Now all references in the application will be actually references to the ref table, which will contain the valid address of the object in memory. So the above diagrams to pointers among objects and app roots will actually look like this, by adding the ref table to the picture.

I have not shown the links from the object to ref table (through pthis).

This solves the problem of back patching references, but brings along some baggage of its own.
Lets for a minute assume that the MS implementation actually used a ref table. Does that mean that every access to an object in .NET is through an indirect memory addressing scheme, two memory fetches per access, every access ? Isn't that a little expensive ? Lets argue this both ways, if this was actually the case, the ref table for a 32bit processor is 32bits per ref. That means that if an application has 10 thousand objects, the ref table would come to < 40k, this can reside in the processor cache, frequently used sections will, so this will be rather fast.
Of course, this is only as long as they have 'ref table'.
Another possibility is that during DFS traversal, they would have changed the addressing mechanism of the application to actually call a stub. The stub when invoked will go on to back patch the actual reference in the application with the latest address available in the ref table. This is simply empty fantasizing, they have to have an asm instruction that they can back patch safely in the context of the application so that the application doesn't break when the stub is invoked. Back patching wouldn't be a hassle in a Java style fully interpreted framework, but in an actual runtime demand compiled JIT its not a simple stunt.

That said, lets consider another aspect, when an object is being allocated by the gc using gc_alloc(). The gc return the address of the ref table entry for the object, ie the value of the objects 'pthis'. How does the entry in the ref table get created ?

The ref table entries cannot be moved around in the ref table because there are actual pointers in the application that point to these entries. As objects are created and they die the ref table will have used and unused entries scattered throughout. So I will have to search the ref table linearly till I find a free entry. In worst case this will be O(n). That is not good. So I seem to have moved the problem of compaction from one place to another and seem to have removed famous O(n) allocation.

Is there a workaround ? Here is mine. I decided that if MS is moving along these tracks then they would still have a O(1) allocation. I just need to find free locations in my ref table in O(1) time. Stacks are O(1) data structures. I decided to give the ref table a supporting stack to keep track of free locations in the ref table. This stack is called fref stack.

The grey areas are vacant slots in the ref table. So every time I need to get a free location in ref table, I need to pop from fref and use the value as an index in my ref table.

    void * free_ref = ref_table + fref_stack [ fref_top -- ];
    pob->pthis = (object **)free_ref; //get pthis to point to the free entry
    *(pob->pthis) = this;    //make it point to 'this'

Cool? So that's about my approach to the problem, that comes along with a serious performance hit regarding the indirect addressing for accessing objects.

Freeing an object's ref table entry : When an object is being released by the GC, the GC uses the object's pthis to locate the ref entry, calculates the offset of the entry in the ref table and pushes the offset in the fref stack, so that it can be used by other variables.

While not relevant here, what if the real GC also works like this ? what sizes will ref and fref have, after all these finite arrays. Well, I would say that multiple ref+fref combinations can be cascaded as a linklist. You could search the linklist serially to find a fref_top that is non zero to find a free entry somewhere. Optionally you can ensure that your first node in the linklist is always one that has a free node - this is simple to implement. Every time an object releases its entry the ref table can check if its the first node in the ref link list, else make itself the first. Similarly if a object is allocated and that causes fref top to be zero, then the node is moved to the end of the linklist, and if the next node has a zero fref (it means that all nodes between these have zero fref top) an empty node is added to the start of the linklist.

So at this point we have (to summarize) :
A method for allocating objects on the heap in O(1) time complexity
A method for marking the heap using DFS of the application's graphs = time complexity of the DFS
A compaction method of the heap that is O(n) for the heap, O(1) per object - assuming atomic memcpy()
A O(1) method for  backpatching references
A O(1) method for deallocating an object - actually to remove the ref table entry to the object
A indirect memory reference over head to access every object on the heap
Data overheads of the object structure per heap object, for the ref table and fref stack.

That includes pros and cons. Lets move on.

Finalizers - finalize queue and freachable queue

The finalizer queue is an array of pointers to objects on the heap (indirectly through the ref table of course, from this point on when I mean pointer to an abject I will mean a pointer to the ref table entry that points to the heap, if otherwise I will say so explicitly). Now the finalizer queue needs the following properties.

  • It must be efficient to add an entry to this list

  • It must easy to remove an entry from this list

  • It must be easy to move over all entries in this list easily.

  • It ok to move the position of entries in this list because relative/absolute positions are not important.

  • No one links to the entries in this list unlike in the case of the ref table.

The Finalize queue is implemented as follows. It fundamentally behaves like a stack, at least in terms of adding elements. It pushes new entries onto the stack, the  the finalize_top is the top pointer.
Unlike a conventional stack where elements are removed only from the top, here elements can be removed from anywhere - every time an element at a interior position is to be cleared, I pop from the stack and insert the popped element at the vacant middle position. When come to think about it, its again an O(1) operation - though here the cost of copying an entry makes a relatively expensive fixed cost step. Then again if you compare that with the overhead of copying 4 bytes to the cost implementing the finalize() for the object, The copy expense can probably be made up for easily if the finalize handling was a little optimized - this is open to debate... please do.

The freachable is implemented in the same way as the finalize is - I cant think of a case where the freachable needs to drop entries from random locations. That need arose in finalize because any object that needs a finalizer call can go out of use at any time. In comparison, all the freachable entries do need finalization - the .NET framework assures no order in which finalizers will be called among the various objects, so this can be a simple stack implementation. So I am not drawing anything here.

Question : How is the finalize queue implemented in the real framework? Is this how it happens in the real framework?

Weak references

This is one more thick thing, bear with me.

The thing about weak references is that there are actual pointers from the application that point to these entries, so they cannot be moved around in the weak reference table (which shall now on be called the wref table).

  • It must be cheap to create a new weak ref. Thus it must be easy to make an entry in wref - it must be cheap to find a free slot in wref.

  • It must be cheap to remove weak references - this happens when there are no references in the program to the wref entry.

  • It must be cheap to set a wref entry to 0 and still retain its value on the wref table (because there are pointers in the application to the wref table entry that have yet to check to see if their object is still in memory)

  • Finally it must be easy to search through all the wref entries efficiently (to see which objects are marked as used)

Now if I use a model like the finalize queue, I have no way of preserving the positions of the entries, those are important here - app roots refer to these. Further the entries in the finalize queue had no need of knowing if they were being referred by anybody, so it was cheap for them to die when they don't point to a valid object. Here after the object dies, the value 0 has to be maintained as long as the program has a reference to the wref entry.

I cant use a simple idea like ref table earlier - the problem is that I need to access all the used entries fast. In the ref table I will have the traverse the entire array to hit all the used entries. It didn't sound like a good idea to maintain two supporting stacks, one holding used entries and one holding free entries to make insertion and removal cheap in a array where I cant move around the entries. Further it still doesnt let me know how long I will have to preserve entries there.

When you come to think about it, wref table entries behave a lot like actual objects of the heap. They need to know their life time - ie how long they are referenced by the application. Further if I rephrase the 'cannot be moved around' to need to be accessed from a fixed location - a lot of problems seem to lift.

The weak ref entries are treated as objects, similar to objects on the heap. But they are slightly special objects in the sense that they reside on a separate region, not the heap - a region where only wref objects reside.
Life time of these objects are determined during the mark phase of the GC, same as for other objects. Weak references that are not marked as used are lost during compaction of the wref table. To keep it easy to access all the used entries in the wref table, I need to compact the entries. If I compact them, how can I have pointers to the entries ? By treating them similar to the way as I treat other objects - I reference them indirectly through the ref table. :)

Its the same diagram as before, the heap has been sidelined a bit and some entries are added to the ref table to point to the wref table entries. ref table entries to the wref table have been given a dark shade just to make them more noticeable - they are in effect not treated any differently from ref table entries to normal objects on the heap. Similarly, there can be any arbitrary number of references to a ref table entry, of a wref object.

Now what would a wref object look like ? It would essentially need a reference to the object that it is a weak reference of. Next it would need a 'pthis' so that it can back patch its address after it is compacted in the wref table. And it needs a flag that can be set during the mark phase.

The above picture shows the object and its ref table relation in grey lines (you are already familiar with that). Further it shows the same kind of relation for the wref object in blue lines. The pob member of the wref object is the pointer to the object. Similarly the object keeps a pwref back to the wref object, I will cover this later. These are shown in red.

How does this work ?

The wref table works as follows. During the mark phase wref objects are marked as used, same as normal objects on the managed heap. Next after the mark is over wref is scanned sequentially from 0 to wref top. This is a simple linear scan. Here objects that don't have their used flag set are overwritten by objects found higher up the stack that are moved down. This can be done in the heap object style 'compaction' or the finalizer queue style pop and overwrite (the later would be cheaper). Each used object has its pob checked to see if the heap object it is referring to is still valid. If not, pob is set to zero. Further, if the wref object is being moved down in memory it back patches the ref table using its 'pthis'.
After compaction is over the wref top is set to the new value. Or if the pop method (which I have implemented) is used, the top will be already be at its correct value.

Similarly inserting into the wref table is just a push operation.
The 'pwref' member for the object has been added because I see no need for a heap object to have multiple wref objects. In case the user requests a wref object, the system checks for nonzero pwref. If so it just returns to the user the value of pwref. If pwref is zero then a new wref object is created, pwref is set and then returned. Having this is at the cost of having an additional 4 bytes in the object. It gives us the benefit of not having multiple wref entries to the same heap object.

Thus all the operations required on wref table can complete in bounded fixed finite time per wref entry. Thus all operations are O(1) for each wref object.

Short and Long weak ref tables are essentially the same, the difference is only according to when the table scan is called. One table is scanned after the initial mark phase. The other table is scanned after the freachable also marks used objects. Both can use the same approach.

Question : How does the .NET GC handle weak references? How different is it from this?

Generations

The last part of the GC. Sorry I don't have any nice pictures on this section, I don't have access to Visio anymore, you will have to bear with my writing skills.

'Generations' is largely a construct for improving the performance of the GC. Like Mr. Richter points out its faster to do all the GC operations for a part of the heap than for the whole heap. Objects are classified to belong to various generations depending upon how long they have lived. How long they have lived in this context is the equivalent of how many sweeps of the GC they have survived. Since objects are created in time-space order on the heap-stack, all the objects of a generation are close together. The heap is partition into three areas, generation 0, 1 and 2.

In the implementation, one will find the pointers :

  • heap_base - which always has the base address of the heap

  • heap_top - which has the location of the next free byte on the heap

  • heap0 - all objects between heap_top and heap0 are gen 0 objects

  • heap1 - all objects between heap0 and heap1 are gen1 and those between heap1 and heap_base are gen2

So the task of implementing generations resolves down into 2 functions for the GC (1) how are heap0 and heap1 set ? and (2) how do you sweep a particular part of the heap only.

Lets cover 2 first : 
Sweep: Lets assume that the GC has just finished a sweep. All objects left have survived so they are at least generation 1 objects. Now, program execution continues and more objects are added to the heap in generation 0. Lets assume that none of the old objects are modified at all during this part of the run. 
When the GC sweeps generation 0 objects :
Remember none of the objects in gen1 or 2  have been modified. Therefore none of them will have any references to a gen0 object. So the reason for survival of a gen0 object is only due to a root reference to it or a reference for another object in gen0. Therefore the GC's DFS (recursive depth first search) can just ignore any references from a gen0 object to higher generation as these paths will never lead back to any gen0 object and its only gen0 objects that we mean to collect. So the DFS is pruned so to speak and is now faster. It will also mark only gen0 objects. Now all the above algorithms hold good as long as we just assume that the managed heap actually begins from the current address of heap0 (heap0 - heap_top is the address range for gen0 objects remember).
The same argument applies for generation 1 sweep. Here we assume that no generation 2 objects are modified, and so we don't DFS into them.

Now, what if older generation objects change ? Coming back to our gen0 sweep, if a gen1 object has been written into, it might contain a reference to a gen0 object. If this object is not marked as used through a traversal of only gen0 objects it would have been collected by the GC. Hence the GC does separate DFS on each of the modified older generation objects. From each of these objects only paths through gen0 objects (if any will be traversed, because our DFS will not traverse down any references to a gen1 or gen2 object ). These individual DFS from these objects are usually short tree spans - the size of which is purely determined by the region of the graph in the gen0 region. Thus now even if objects of any generation are changed we have a method for doing a gen0 sweep.

In steps:

  • Do a DFS from the application roots, ignore all references to gen1,2 objects

  • Do DFS from each modified gen1 and gen2 object, ignore all member references to gen1,2 objects

Similar logic can be applied to a generation 1 sweep. A generation 2 sweep will do a full DFS of the entire heap and a compaction of the entire heap.

Just a minute, how do we know which objects are modified ? Good question, there seems to be some magic happening here - Mr. Richter mention the GetWriteWatch() kernel call. Let me just say that the msdn reference about the function says that it returns a list of modified memory pages of a region set up for memory watch. Well what I think that means is that the OS API can help in tracking changes :) I didn't use that in my implementation because (a) I wanted it to be portable to places where people can just see the algo in effect even is they didnt have OS support for it and (b) I just didn't understand how to use the function (we all (some of us) have our memory blocks about what we are capable of doing, and this just frightened me off), probably when I find a sample usage of the function I will be more comfortable.

In the toy GC when a root_entry in an object is updated I mark the object as modified. If the objects flag is newly set I add a direct reference to this object in of two arrays depending on (and if) it belongs to gen1 or gen2. 
If a gen0 sweep is done, then the modified flag and entries are reset for all gen1 objects. If a gen1 sweep occurs modified flags and entries of gen1 and gen2 objects are reset. If a gen2 sweep occurs, modified objects aren't important, these flags are unset during the compaction phase.

Come to think of it, the way the GC is implemented as 3 generations, how rarely they expect gen1 sweeps to occur in comparison to gen0 sweeps, the three generation architecture is nice. More generations would have added to the overhead of keep the modification information, multiple DFSs etc and would probably not have added much to performance. These are not things that can be vacantly claimed, statistics and algorithmic analysis should prove it.

Question : Is this on the right track?

Finally

That was a lot of typing and I feel tired. I usually have to be dragged around to write documents - but this was somehow a pleasure to work on. Basically because I am curious about the GC and secondly because it was easy to draw these diagrams using Visio. Amazing tool, as each diagram kept coming out, I felt I had to write about it.

The GC implementation isn't complete in the sense that  I haven't thought much about optimizing the DFS by not traversing unmodified sub-branches. I haven't thought threading and synchronization issues. I haven't thought about references to actual objects that might only be a part of a allocated memory block ... etc, to name a few.

I am hoping to get a few answers through this effort and possibly consolidate that into an addendum to the MSDN article and get to know the GC better.

Acknowledgements

Thanks, Sid for the weak reference brainstorming. Thanks Pooj for the finalize queue thoughts. Thanks Don and Arun Dev for letting me have someone to talk to, in the initial days when I was up to this.
Thanks Pooj for the pile of errors you pointed out and the patience.


Some Garbage Collector related urls

A Real-time Garbage Collector based on lifetime of objects - MIT
    http://lieber.www.media.mit.edu/people/lieber/Lieberary/GC/Realtime/Realtime.html
Boehm-Demers-Weiser Conservative GC for C++
    http://www.hpl.hp.com/personal/Hans_Boehm/gc/
Reference Objects and Garbage collection - feature description - Sun/Java
    http://developer.java.sun.com/developer/technical
A garbage collection framework for c++
    http://www.codeproject.com/cpp/garbage_collect.asp
Gorik's Garbage collection page
    http://www.cs.rpi.edu/~gorik/garbage
The Garbage Collection page - Richard Jones
    http://www.cs.ukc.ac.uk/people/staff/rej/gc.html

 

©2009 Microsoft Corporation. All rights reserved. Contact Us |Terms of Use |Trademarks |Privacy Statement
Microsoft