Garbage Collection: Automatic Memory Management in the Microsoft .NET Framework
Host: Rahul Chitale – Consultant, Microsoft Consultancy Services, India
May 30, 2002
KunalS_MS: Good Afternoon to all present
KunalS_MS: Thank you for joining us in today's expert chat on
KunalS_MS: Garbage Collection: Automatic Memory Management in the
Microsoft .NET Framework
KunalS_MS: and the Expert who will help us through our queries is
KunalS_MS: Rahul Chitale - Consultant, Microsoft Consultancy Services,
India
KunalS_MS: Hello Rahul and thank you for joining us
RahulCh_MS: Hello everybody.
KunalS_MS: Rahul, why don't you start off by explaining this altogether
new concept of Garbage Collection in .NET
KunalS_MS: What does it imply for the developer and how does it make
his/her life easier?
RahulCh_MS: Garbage collection is a 'supporting' process
for the new managed memory heap in the .NET Framework.
RahulCh_MS: Basically, the managed heap gives us flexibility
around several issues in normal heap programming like splitting of
the heap etc
RahulCh_MS: But it assumes the presence of an infinite
memory space.
RahulCh_MS: And that memory space is guaranteed by the
Garbage collector
RahulCh_MS: So together, they give us a lightning-fast
memory allocation technique with ‘fast enough’ free algorithms.
Guest: (madhu) How does the GC decide which memory to free and when
to free?
RahulCh_MS: Big answer to that Q. I'll keep it simple
for now. Unlike 'normal' garbage collection, which really tracks which
objects have been allocated, the GC here assumes that all objects
are garbage... unless
RahulCh_MS: the objects are directly or indirectly part
of the root of the application.
RahulCh_MS: By the root we mean all data being accessed
by pointers on globals, statics, registers and the stack.
RahulCh_MS: So the GC cleans up everything which is not
part of the tree of the objects leading from the root.
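The reachability rule described above can be sketched with a toy model. This is a language-neutral illustration in Python, not CLR code: the `heap` dictionary, the object names and the `reachable` helper are all hypothetical, and the single `roots` list stands in for globals, statics, registers and the stack.

```python
from collections import deque

def reachable(heap, roots):
    """Return the set of object ids reachable from the roots (the 'mark' step)."""
    marked = set()
    pending = deque(roots)
    while pending:
        obj = pending.popleft()
        if obj in marked:
            continue                  # already visited; this also handles cycles
        marked.add(obj)
        pending.extend(heap[obj])     # follow outgoing references
    return marked

# Toy heap: object id -> ids it references (all names are hypothetical).
heap = {
    "app": ["config", "window"],
    "config": [],
    "window": ["button"],
    "button": ["window"],   # cycle back to its parent
    "temp1": ["temp2"],     # the two temps reference each other,
    "temp2": ["temp1"],     # but nothing reachable from a root points at them
}
roots = ["app"]             # stands in for globals, statics, registers, stack

live = reachable(heap, roots)
garbage = set(heap) - live  # everything not in the tree leading from the roots
```

Note that `temp1` and `temp2` end up in `garbage` even though they reference each other, which is the practical difference from a reference-counted scheme.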
Guest: (Kumar) Does the GC employ different algorithms for "fast"
garbage collection? If so, what makes them fast? Why aren't the objects
promoted beyond generation 2?
RahulCh_MS: The GC in .NET uses a 'mark and sweep' algorithm.
There are others - JScript uses a Mark and Collect. There are reference
counted variations as well.
RahulCh_MS: However, the .NET GC uses only mark and sweep,
which performs pretty well in the scenarios intended.
RahulCh_MS: The speed is for the most part attributable
to the memory allocation. Still, our tests show that typical Gen 0
cleanups can occur in less than 1 ms of time on a Pentium.
RahulCh_MS: As regards promotion beyond gen 2, 80%
of the objects can be found to live in generation 0. It actually becomes
inefficient to have anything beyond 3 generations.
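A rough way to picture the three generations is the toy model below. It is a Python sketch under simplifying assumptions, not the CLR's actual policy: the `collect` helper, its `up_to` parameter and the object names are hypothetical. Survivors of a collection move up one generation, and generation 2 is the cap.

```python
MAX_GENERATION = 2  # generations 0, 1 and 2

def collect(generation_of, live, up_to):
    """Collect generations 0..up_to; promote survivors by one, capped at gen 2."""
    survivors = {}
    for obj, gen in generation_of.items():
        if gen > up_to:
            survivors[obj] = gen                        # older generation: not examined
        elif obj in live:
            survivors[obj] = min(gen + 1, MAX_GENERATION)
        # otherwise the object is dead and is simply dropped
    return survivors

objects = {"cache": 0, "t1": 0, "t2": 0}
objects = collect(objects, live={"cache"}, up_to=0)   # short-lived temps die in gen 0
objects.update({"t3": 0, "t4": 0})                    # new allocations start in gen 0
objects = collect(objects, live={"cache"}, up_to=0)   # "cache" is not even examined
after_gen0 = dict(objects)
objects = collect(objects, live={"cache"}, up_to=1)   # promoted to gen 2
objects = collect(objects, live={"cache"}, up_to=2)   # stays at the cap
```

Because most objects die young, the frequent gen 0 pass touches only a small part of the heap, which is where the cheap cleanups come from.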
Guest: (Kumar) From what is documented, the GC comes into action when
there is a lack of memory for allocating more objects. Please comment.
RahulCh_MS: The GC is actually invoked in three circumstances
- a) If there is 'pressure' on the managed heap b) if all threads
are suspended c) if a thread invokes GC.Collect
RahulCh_MS: (a) pretty much talks about the 'low memory'
situations. (b) is self-evident.
RahulCh_MS: (c) is where things get interesting and
the GC can actually hijack threads into suspend mode for performing
garbage collection.
Guest: (Kumar) What's the rationale behind invoking GC when threads
are suspended? There could be no "pressure", then why run GC?
RahulCh_MS: Even if there is no pressure, the GC still
needs to 'finalize' objects which are no longer in use. Here enters
the 2nd hidden thread of the GC.
RahulCh_MS: This is responsible for cleaning up objects
which require some kind of post-use cleanup and which are no longer
referenced in any of the roots.
RahulCh_MS: So even though there may not be pressure
on the managed heap the app could still be holding onto precious resources.
Guest: (Kumar) And isn't hijacking related to the GC's behaviour in
multithreaded situations?
RahulCh_MS: Exactly. The GC will only hijack the thread,
by redirecting function calls into a stub, if the threads don't suspend
themselves.
Guest: (Gurneet) What is the difference between the garbage collectors
of Java and .NET?
RahulCh_MS: Both use Mark and Sweep, but Java doesn't
use a secondary queue for post-use finalization. There are also differences
in the way the 'sweep' is actually done.
Guest: (madhu) I had a program which had a timer, and in the timer
we create an object; this object is a local object in the timer function.
What happens is that after some time the system has run out of memory and
the GC still doesn't do a collection.
Guest: (madhu) Why does this happen?
RahulCh_MS: Unless you've put a perf counter on the GC
clean op, I don't think you can say that the GC didn't collect.
RahulCh_MS: So perhaps you could elaborate on that.
Guest: (madhu) But as soon as I closed the program the memory was
released.
RahulCh_MS: Then there must have been adequate memory
and no pressure existed on the managed heap to fire the GC.
RahulCh_MS: Again, I'll stress that you can't really
'make out' a garbage collection unless you have added the counters.
Guest: (madhu) No, the system was running on virtual memory, and the
system had really gone unstable.
Guest: (madhu) Still nothing happened. I even tried GC.Collect, but
nothing happened.
RahulCh_MS: The GC will tune itself on virtual memory.
There is no way to distinguish physical and virtual memory so I would
hazard that your cache settings were probably a bit too high.
RahulCh_MS: Also a GC.Collect doesn't necessarily mean
that the collect will happen on the same thread.
RahulCh_MS: Would suggest that you monitor it using the
CLR memory counters. You should definitely see the collection happening
within a reasonable time of firing GC.Collect
Guest: (Kumar) Assume an application allocates a couple of hundred objects
(an array). There is no shortage of memory. And then, after allocation,
it exits. When will the GC run, and mark the objects as not having
a valid root?
RahulCh_MS: There is a GC per app-domain, so the GC will
be pulled down when the app domain exits. Before it does exit, the
GC will explicitly dispose all root references still being held.
Guest: (BhuvanMisra) Do I have a choice of garbage collector implementations
that would be optimized for a particular kind of environment? E.g. a
separate implementation for client, server ... as the needs of both
are different.
RahulCh_MS: Yes. mscorwks.dll and mscorsvr.dll are actually
the two different variations of the GC which are used. The .NET Framework
for Win2k Prof and other workstations uses the lighter workstation
version
RahulCh_MS: and the other is a multithreaded, multiproc
sensitive variation for servers.
RahulCh_MS: Thirdly, the .NET Compact Framework version
is also different.
Guest: (Kumar) If the GC will collect the objects long after application
termination, then doesn't this leave memory blocked?
RahulCh_MS: Actually - that's an application decision
- if you as a developer have not implemented IDisposable and choose to
clean up resource wrappers in Finalize,
RahulCh_MS: you do run the risk of these resources not
getting released until the object gets collected, either as a result
of pressure or an explicit Collect invocation.
Guest: (Kumar) The documentation says that the JIT compiler
can insert safe points within the code. How and when does this insertion
happen? Can this behaviour be controlled?
RahulCh_MS: Safe points are used by the compiler to actually
pinpoint the instruction at which the thread can be safely suspended.
RahulCh_MS: In many cases, the compiler may make the
decision to not issue a safe point immediately
RahulCh_MS: say if a Direct Memory access is happening
RahulCh_MS: Safe point insertion cannot be controlled.
That is the compiler's decision based on the state of the thread.
RahulCh_MS: By DMA, I do not mean the programmer
touching memory but rather, say, the hard disk writing into RAM as a
result of a request.
Guest: (Kumar) But, under .NET, unless we are doing unsafe programming,
a DMA can't happen! So, what are the other conditions that may result
in safe-point insertion?
RahulCh_MS: Other scenarios - basically anyplace where
the thread is still waiting for the instruction to conclude. Say a
resource contention getting resolved.
Guest: (net2jee) What makes the .NET GC better and more efficient compared
to the Java GC?
RahulCh_MS: Holy Q here - three parts to this - the GC's
exact structure of a root differs. Secondly the mark and sweep here
really depends on a compact operation. Thirdly, objects requiring
finalization are treated on a different thread
RahulCh_MS: In the second part, Java actually does a
'sweep'.
RahulCh_MS: In the end, the roots are pretty much the
same - an algo from 1988 called the Boehm-Weiser collector
Guest: (BhuvanMisra) When must one use GC.Collect( )? What is the
implication on system performance if a forced GC is done?
RahulCh_MS: Basically - as an app developer, you do know
your application the best. So your timing might be more relevant.
RahulCh_MS: Say if you are going to open 100 files soon,
it makes sense to fire a collect beforehand.
RahulCh_MS: It’s a design decision.
Guest: (Kumar) Also, how can we use weak references to our advantage?
RahulCh_MS: Scenario could be something like this - if
your app is accessing something like a database for object persistence
RahulCh_MS: it could make sense to instead hold onto
a weak reference to a cache object which is holding the object state.
RahulCh_MS: So referring to the weak referenced object
is still more efficient than making a database de-serialization. In
case memory pressure is high and the cache object has been collected,
RahulCh_MS: then you can still fall back on the database,
but in the optimal scenario, there is your cache object to work against.
RahulCh_MS: You also need to decide then if you want
to make it a 'short' or 'long' weak reference
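The cache-with-fallback idea can be sketched as follows. This uses Python's standard `weakref` module as a stand-in, not the .NET `WeakReference` class, and it does not model the 'short' versus 'long' distinction. `Customer`, `DATABASE`, `loads` and `get_customer` are hypothetical names for illustration.

```python
import gc
import weakref

class Customer:
    """Hypothetical persisted object (a plain class, so it can be weakly referenced)."""
    def __init__(self, key, name):
        self.key = key
        self.name = name

DATABASE = {1: "Asha"}   # stand-in for the real database
loads = 0                # counts the expensive reloads
_cache = weakref.WeakValueDictionary()   # entries vanish once the value is collected

def get_customer(key):
    """Return the cached object if it is still alive, else reload it."""
    global loads
    customer = _cache.get(key)
    if customer is None:                 # collected, or never loaded
        loads += 1
        customer = Customer(key, DATABASE[key])
        _cache[key] = customer           # the cache holds only a weak reference
    return customer

first = get_customer(1)
again = get_customer(1)   # served from the cache while `first` keeps it alive
same_object = again is first

del first, again          # drop the last strong references
gc.collect()              # request a collection
reloaded = get_customer(1)   # cache entry is gone, so fall back to the "database"
```

The caller never has to know whether the object came from the cache or from the fallback path; the weak reference only decides how often the expensive path is taken.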
Guest: (piyush) What is that compact operation in case of mark & sweep
process?
RahulCh_MS: It’s pretty simple. Some algorithms just
'sweep' - that is, free the unused object space - whereas the .NET GC
frees space by compacting the used objects together, thus overwriting
any previous state.
RahulCh_MS: So it accomplishes the sweep by doing a compact.
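A minimal sketch of that compaction step, again as a Python toy model rather than anything the CLR exposes. The heap layout as a list of `(object_id, size)` pairs and the `compact` helper are assumptions made for illustration.

```python
def compact(heap, live):
    """Slide live objects to the start of the heap.

    heap is a list of (object_id, size) pairs in address order; live is the
    set of ids that survived marking. Returns (new_heap, forwarding,
    next_free), where forwarding maps each survivor to its
    (old_address, new_address) and next_free is the next allocation address.
    """
    new_heap = []
    forwarding = {}
    next_free = 0
    old_address = 0
    for obj, size in heap:
        if obj in live:
            forwarding[obj] = (old_address, next_free)
            new_heap.append((obj, size))
            next_free += size         # survivors are packed back to back
        old_address += size           # dead objects are skipped, then overwritten
    return new_heap, forwarding, next_free

heap = [("A", 16), ("B", 32), ("C", 8), ("D", 24)]
new_heap, forwarding, next_free = compact(heap, live={"A", "C"})
```

After compaction all free space sits in one block starting at `next_free`, so the next allocation is just a pointer bump; that is why the allocation side can be so fast.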
Guest: (netj2ee) I heard there is "Mark and Copy" scheme also in Java!
Is there anything like that in .NET?
RahulCh_MS: The mark is many times referred to as copy.
Am not sure if Java actually has two schemes.
RahulCh_MS: Do post this into the forums. We'll get back
on this.
RahulCh_MS: also called color sometimes.
KunalS_MS: Very well, thank you all very much for joining today's
session
RahulCh_MS: Just a note. I was expecting some questions
regarding the usage of Dispose
RahulCh_MS: Dispose is the preferred method for object
hierarchy cleanups.
Guest: (HelloWorldDotNET) Can you explain when to use Dispose as opposed
to Finalize?
RahulCh_MS: Since the finalization is non-deterministic
in .Net, if an object A contains object B,
RahulCh_MS: you cannot be sure when you are inside A's
finalizer that B's finalizer has not already been invoked
RahulCh_MS: Therefore, a) You cannot use finalize as
a means of cleaning up resources inside a hierarchy of objects
RahulCh_MS: b) You cannot invoke another object's finalizer
inside your own
RahulCh_MS: c) As a good behavior, all objects which
contain resources as well as other objects should be split into two
RahulCh_MS: one part containing just the resources -
which is the one which should have dispose but no finalizer
RahulCh_MS: and the other containing the object references.
RahulCh_MS: Therefore the Dispose method is a way
of doing a standardized 'early' cleanup of your resource wrappers,
which wrap things like HWNDs and HFILEs.
RahulCh_MS: As a design decision, if your object contains
a resource wrapper (e.g. FileStream), it should also have a Dispose.
If it just contains other managed object references, it need not have
a Dispose.
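The split described above can be modelled roughly as below. This is a Python analogy, not the .NET `IDisposable` interface: `FileHandleWrapper`, `Document`, the `dispose` methods and the `released` list are hypothetical, and Python's `with` statement plays the role of deterministic, early cleanup.

```python
released = []   # records which raw handles have been released (for illustration)

class FileHandleWrapper:
    """Hypothetical resource wrapper: owns exactly one raw handle."""
    def __init__(self, handle):
        self.handle = handle

    def dispose(self):
        if self.handle is not None:      # safe to call more than once
            released.append(self.handle)
            self.handle = None

class Document:
    """Contains a resource wrapper, so it exposes dispose as well."""
    def __init__(self, handle):
        self.file = FileHandleWrapper(handle)
        self.lines = []                  # ordinary references need no cleanup

    def dispose(self):
        self.file.dispose()              # forward to the wrapper it owns

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.dispose()                   # runs even if the block raised
        return False                     # do not swallow exceptions

with Document(handle=42) as doc:
    doc.lines.append("hello")
doc.dispose()                            # a second call is a no-op
```

The point of the design is that cleanup order is decided by the caller at a known moment, instead of depending on the order in which finalizers happen to run.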
KunalS_MS: Very well, we can now Finalize this session!
KunalS_MS: Thank you very much for taking time out for this session
and clarifying our doubts