Managed Memory Model in the .NET Framework

  • 2/18/2009

Garbage Collection

Natural garbage collection in the managed environment is non-deterministic. It occurs at some point in time and is not entirely predictable. Natural garbage collection is not forced with a call to GC.Collect.

When does natural garbage collection occur? Allocations for new objects are added to Generation 0. If that addition exceeds the threshold for Generation 0, garbage collection occurs. The Garbage Collector will attempt to reclaim enough memory from Generation 0 to support the new allocation. If enough memory is not reclaimed, Generation 1 is collected, and then, if necessary, Generation 2. Objects surviving a garbage collection are promoted to the next generation. For example, surviving objects on Generation 0 are then promoted to Generation 1 after garbage collection. This means that older objects tend to migrate to Generation 2. This furthers the policy of grouping objects by age.
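The promotion described above can be observed with GC.GetGeneration. The following is a minimal sketch, not part of the original text: the PromotionDemo class and its method names are hypothetical, and GC.Collect is called only to make promotion visible. The object usually moves from Generation 0 to 1 to 2, but the runtime does not guarantee that exact sequence.

```csharp
using System;

public static class PromotionDemo
{
    // Returns the generation of one object before any collection,
    // after one forced collection, and after two forced collections.
    public static int[] ObserveGenerations()
    {
        object survivor = new object();
        int[] generations = new int[3];

        generations[0] = GC.GetGeneration(survivor);
        GC.Collect();                        // forced only to make promotion observable
        generations[1] = GC.GetGeneration(survivor);
        GC.Collect();
        generations[2] = GC.GetGeneration(survivor);

        GC.KeepAlive(survivor);              // keep the object rooted until this point
        return generations;
    }

    public static void Main()
    {
        int[] g = ObserveGenerations();
        Console.WriteLine("Before collection:     Generation {0}", g[0]);
        Console.WriteLine("After one collection:  Generation {0}", g[1]);
        Console.WriteLine("After two collections: Generation {0}", g[2]);
    }
}
```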

The Garbage Collector manages each generation similar to a stack. This makes allocations both quick and efficient. Each generation has an allocation pointer, which delineates the end of the last object and the beginning of the free space. This is where the next object will be allocated. At that time, the new object is stacked upon the previous object, and the allocation pointer is adjusted. The allocation pointer will now point to the end of the new object. For this reason, the oldest objects are at the base of the generation, while the newest objects are toward the top. See Figure 7-1.
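The stack-like allocation can be modeled in a few lines. This is only a toy sketch under stated assumptions: GenerationSketch is a hypothetical class, offsets stand in for addresses, and a return value of -1 marks the point at which a real collector would run a garbage collection. It is not how the CLR is implemented.

```csharp
using System;

// Hypothetical model of one generation; not a CLR type.
public class GenerationSketch
{
    private readonly int capacity;
    private int allocationPointer;   // offset of the first free byte

    public GenerationSketch(int capacity)
    {
        this.capacity = capacity;
    }

    public int AllocationPointer
    {
        get { return allocationPointer; }
    }

    // Returns the offset at which the new object starts, or -1 when the
    // object does not fit in the remaining free space.
    public int Allocate(int size)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException("size");
        if (size > capacity - allocationPointer)
        {
            return -1;
        }
        int start = allocationPointer;   // new object is stacked on the previous one
        allocationPointer += size;       // pointer now marks the end of the new object
        return start;
    }
}

public static class AllocationDemo
{
    public static void Main()
    {
        GenerationSketch gen0 = new GenerationSketch(64);
        Console.WriteLine(gen0.Allocate(24));       // 0
        Console.WriteLine(gen0.Allocate(16));       // 24
        Console.WriteLine(gen0.AllocationPointer);  // 40
        Console.WriteLine(gen0.Allocate(32));       // -1
    }
}
```

Allocation costs one comparison and one addition, which is why the stack model is quick.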


Figure 7-1. An example layout of Generation 0 after a new allocation.

When garbage collection occurs, objects on the affected generations are invalidated. A memory tree is rebuilt beginning with the root objects and their object graphs. The root objects are composed of the global, static, and local variables. The object graph includes all the other objects that are referenced either directly or indirectly by the root object. Creating the memory tree marks those objects that are reachable. Objects not in the tree are considered unreachable and available for collection. No reference variable or field refers to an unreachable object. The Garbage Collector compacts the reachable objects on the managed heap. Compacting the heap prevents fragmentation and maintains the stack model.
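The marking step can be illustrated with a small graph traversal. The sketch below is a simplified model, not the CLR's algorithm: Node and MarkSketch are hypothetical names, and a list of references stands in for an object's fields.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a heap object and its outgoing references.
public class Node
{
    public readonly string Name;
    public readonly List<Node> References = new List<Node>();

    public Node(string name)
    {
        Name = name;
    }
}

public static class MarkSketch
{
    // Returns every node reachable from the roots, directly or indirectly.
    public static HashSet<Node> Mark(IEnumerable<Node> roots)
    {
        HashSet<Node> reachable = new HashSet<Node>();
        Stack<Node> pending = new Stack<Node>(roots);

        while (pending.Count > 0)
        {
            Node current = pending.Pop();
            if (!reachable.Add(current))
            {
                continue;   // already marked; this also handles cycles
            }
            foreach (Node child in current.References)
            {
                pending.Push(child);
            }
        }
        return reachable;
    }

    public static void Main()
    {
        Node root = new Node("root");
        Node a = new Node("a");
        Node b = new Node("b");
        Node orphan = new Node("orphan");

        root.References.Add(a);
        a.References.Add(b);
        b.References.Add(root);      // cycle back to the root
        orphan.References.Add(a);    // pointing at a live object does not make orphan reachable

        HashSet<Node> reachable = Mark(new[] { root });
        Console.WriteLine(reachable.Count);             // 3
        Console.WriteLine(reachable.Contains(orphan));  // False
    }
}
```

Anything absent from the returned set corresponds to an object that is available for collection.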

Although inadvisable, the GC.Collect method of the .NET Framework Class Library (FCL) can be used to force garbage collection. The parameterless version of the method performs a full collection. The single-argument version targets a specific generation, which is identified by the parameter. GC.Collect can interfere with the normal practice of the Garbage Collector. First, forced garbage collection is expensive. Second, calling GC.Collect frequently can harm the performance of your application.
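To see the difference between the two overloads, you can compare GC.CollectionCount before and after each call. This is a minimal sketch for experimentation only; ForcedCollectionDemo and CountCollections are hypothetical names, and the exact counts can vary because the runtime may also collect on its own.

```csharp
using System;

public static class ForcedCollectionDemo
{
    // Returns how many Generation 0 and Generation 2 collections were
    // recorded while the supplied action ran.
    public static int[] CountCollections(Action action)
    {
        int gen0Before = GC.CollectionCount(0);
        int gen2Before = GC.CollectionCount(2);
        action();
        return new int[]
        {
            GC.CollectionCount(0) - gen0Before,
            GC.CollectionCount(2) - gen2Before
        };
    }

    public static void Main()
    {
        int[] partial = CountCollections(delegate { GC.Collect(0); });
        Console.WriteLine("GC.Collect(0): gen0 +{0}, gen2 +{1}", partial[0], partial[1]);

        int[] full = CountCollections(delegate { GC.Collect(); });
        Console.WriteLine("GC.Collect():  gen0 +{0}, gen2 +{1}", full[0], full[1]);
    }
}
```

A full collection also collects the younger generations, so it raises the Generation 0 count as well.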

Managed Wrappers for Native Objects

Managed classes sometimes wrap native objects. The managed class is an interface between the managed application and the native resource. In this way, the managed class abstracts the native resource. There are plenty of examples of this in the .NET Framework Class Library: the FileStream class abstracts a native file, the Socket class abstracts the Berkeley sockets interface, the Bitmap class abstracts a bitmap, and so on.

Problems can occur when there is a disparity between the size of the managed class and the native resource that it represents. For example, a managed wrapper could be a few kilobytes in size, while the native resource represented by the wrapper is several megabytes in size. The Garbage Collector will track the memory for the managed wrapper. However, the memory for the native resource is unseen. You could have plenty of managed memory available, while unknowingly running out of native memory. This creates a situation where an application crashes for lack of memory, while the Garbage Collector believes there is plenty. Native memory is the invisible elephant in the room. As instances of the managed wrapper are allocated, the elephant is getting bigger, while the room appears nearly empty.

The GC.AddMemoryPressure and GC.RemoveMemoryPressure methods help the Garbage Collector account for native memory. This is especially useful for classes that wrap heavy native resources. GC.AddMemoryPressure applies artificial memory pressure to the managed heap, while GC.RemoveMemoryPressure reduces memory pressure. Each method has a single parameter, which is the amount (bytes) of pressure to apply or relieve. In the constructor for the wrapper class, call GC.AddMemoryPressure and apply memory pressure equal to the amount of native memory required for the native resource. This will force additional garbage collections, where instances of the wrapper object and native resource can be released. In the Finalize or Dispose method, call GC.RemoveMemoryPressure to remove the additional pressure.

The following class demonstrates the proper way to implement a managed wrapper for a native resource that uses a disproportionate amount of native memory.

public class Elephant
{
    public Elephant()
    {
      // Obtain native resource and allocate native memory

      GC.AddMemoryPressure(100000);
    }

    ~Elephant()
    {
      // Release native resource and associated memory

      GC.RemoveMemoryPressure(100000);
    }
}
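Because the pressure can be removed in either the Finalize or the Dispose method, a wrapper can also release its resource deterministically. The variation below is a sketch under stated assumptions: DisposableElephant is a hypothetical class, Marshal.AllocHGlobal stands in for whatever native resource a real wrapper would own, and the 100,000-byte figure simply mirrors the Elephant listing.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper; the native block is a stand-in for a real resource.
public sealed class DisposableElephant : IDisposable
{
    private const int NativeSize = 100000;
    private IntPtr nativeBlock;

    public DisposableElephant()
    {
        nativeBlock = Marshal.AllocHGlobal(NativeSize);
        GC.AddMemoryPressure(NativeSize);
    }

    public bool IsDisposed
    {
        get { return nativeBlock == IntPtr.Zero; }
    }

    public void Dispose()
    {
        Release();
        GC.SuppressFinalize(this);   // the finalizer has nothing left to do
    }

    ~DisposableElephant()
    {
        Release();
    }

    // Frees the native block and removes the matching pressure exactly once.
    private void Release()
    {
        if (nativeBlock == IntPtr.Zero)
        {
            return;
        }
        Marshal.FreeHGlobal(nativeBlock);
        nativeBlock = IntPtr.Zero;
        GC.RemoveMemoryPressure(NativeSize);
    }
}

public static class ElephantDemo
{
    public static void Main()
    {
        using (DisposableElephant elephant = new DisposableElephant())
        {
            Console.WriteLine(elephant.IsDisposed);   // False
        }
    }
}
```

Funneling both paths through one Release method keeps each AddMemoryPressure call paired with exactly one RemoveMemoryPressure call for the same amount.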

GC Class

The GC class, which is in the System namespace, is an interface between the user and the Garbage Collector. Table 7-1 lists each method with a description.

Table 7-1. GC Methods

| GC Method | Description |
|---|---|
| GC.Collect | Forces a garbage collection cycle. The default GC.Collect forces a full garbage collection, which is essentially Generation 2. For a more granular garbage collection, use the one-parameter GC.Collect method. The parameter stipulates the generation that should be collected (i.e., 0, 1, or 2). |
| GC.WaitForPendingFinalizers | Suspends the current thread until the finalization thread has called the finalizers of the objects waiting on the FReachable queue. Call this method after GC.Collect to provide ample time for the finalization thread to finish its work before the current thread resumes. |
| GC.KeepAlive | Keeps an otherwise unreachable object from being collected during the next garbage collection cycle. |
| GC.SuppressFinalize | Removes a reference to a finalizable object from the Finalization queue. Remaining overhead related to the finalizer is avoided. GC.SuppressFinalize is usually called in the Dispose method. Because the object has been disposed, finalization is no longer required. |
| GC.AddMemoryPressure | Applies additional memory pressure to the managed heap. This is typically used to compensate for native resources in managed code. |
| GC.RemoveMemoryPressure | Removes memory pressure from the managed heap. Like GC.AddMemoryPressure, this is typically used to compensate for native resources in managed code. |
| GC.CollectionCount | Returns the number of times garbage collection has occurred for the specified generation. |
| GC.GetGeneration | Returns the generation of the provided object. |
| GC.GetTotalMemory | Returns the number of bytes allocated on the managed heap. |
| GC.ReRegisterForFinalize | Reattaches a finalizer to an object. This is usually called on objects that have been resurrected to assure proper finalization. |
| GC.RegisterForFullGCNotification | Registers the application to be notified when a full collection is likely to happen and after it has occurred. |
| GC.CancelFullGCNotification | Unregisters the application from receiving notifications about impending full garbage collections. |
| GC.WaitForFullGCApproach | Notifies an application if a full garbage collection is impending. |
| GC.WaitForFullGCComplete | Notifies an application that a full garbage collection has completed. |
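Several of these methods are typically used together. The following sketch, with the hypothetical names Tracked and FinalizationDemo, pairs GC.Collect with GC.WaitForPendingFinalizers and shows the effect of GC.SuppressFinalize. When a finalizer actually runs is up to the runtime, so the first count is usually 1 but is not guaranteed.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

public class Tracked
{
    public static int FinalizedCount;

    ~Tracked()
    {
        Interlocked.Increment(ref FinalizedCount);
    }
}

public static class FinalizationDemo
{
    // Allocation happens in a separate, non-inlined method so that no local
    // variable in the caller keeps the object reachable.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void Allocate(bool suppress)
    {
        Tracked t = new Tracked();
        if (suppress)
        {
            GC.SuppressFinalize(t);   // the finalizer will not be called for t
        }
    }

    // Returns the number of finalizers that ran after dropping one object.
    public static int Run(bool suppress)
    {
        int before = Tracked.FinalizedCount;
        Allocate(suppress);
        GC.Collect();
        GC.WaitForPendingFinalizers();   // block until the finalization thread is done
        return Tracked.FinalizedCount - before;
    }

    public static void Main()
    {
        Console.WriteLine("Without SuppressFinalize: {0}", Run(false));
        Console.WriteLine("With SuppressFinalize:    {0}", Run(true));
    }
}
```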

Large Object Heap

The Large Object Heap holds large objects. Most large objects are arrays rather than the assemblage of non-array members of a class. Larger objects are longer lived and typically migrate to Generation 2. Promoting large objects from Generation 0 and eventually to Generation 2 is expensive. Placing really large objects immediately on the Large Object Heap is much more efficient. The Large Object Heap is collected during a full garbage collection, which is Generation 2. During garbage collection, memory for large objects on the Large Object Heap is freed. However, the Large Object Heap is never compacted. Sweeping and consolidating large objects on the Large Object Heap would be expensive. Therefore, that step is skipped. Garbage collection for the Large Object Heap entails these steps:

  1. Memory for unreachable objects is released.

  2. Memory from adjacent and unreachable objects is combined into a free block.

  3. Memory for unreachable objects at the end of the Large Object Heap is released back to Windows.
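You can check where an array was placed by asking for its generation immediately after allocation. This is a minimal sketch with hypothetical names (LargeObjectDemo, GenerationOfNewArray). It relies on two details not stated in the text above: the runtime treats objects of roughly 85,000 bytes or more as large, and GC.GetGeneration reports Large Object Heap objects as the highest generation.

```csharp
using System;

public static class LargeObjectDemo
{
    // Allocates a byte array of the given length and reports its generation.
    public static int GenerationOfNewArray(int length)
    {
        byte[] data = new byte[length];
        return GC.GetGeneration(data);
    }

    public static void Main()
    {
        Console.WriteLine("1,000-byte array:   Generation {0}", GenerationOfNewArray(1000));
        Console.WriteLine("200,000-byte array: Generation {0}", GenerationOfNewArray(200000));
    }
}
```

The small array normally reports Generation 0, while the large array reports Generation 2 without ever having survived a collection.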

Because the Large Object Heap cannot be compacted, it can become fragmented. Allocating and releasing disparate-sized large objects on the Large Object Heap makes fragmentation more likely. You are unable to place large objects in the free space from unreachable smaller large objects—unless combined with contiguous space from another free object. The Garbage Collector is forced to search the individual free spans for holes large enough for the pending allocation. Collectively, the free spaces of the Large Object Heap may have enough memory to honor the request but not in a contiguous area.

If you use disparate-sized objects, one possible solution is a buffer of like-sized large objects that can be reused. This keeps the large objects in contiguous memory and could prove to be more efficient. When the number of instances requested would otherwise exceed the size of the pool, reuse conserves memory, minimizes fragmentation, and reduces the number of full collections, which are especially expensive. The downside appears when the number of simultaneous instances is consistently smaller than the pool. The unused objects then waste memory, and the pool size needs fine-tuning.

The following code demonstrates how to create and manage a buffer of large objects. In our example, the buffer contains 10 large objects, as shown below.

static BigObject[] bigobjects = {  new BigObject(),
                            new BigObject(),
                            new BigObject(),
                            new BigObject(),
                            new BigObject(),
                            new BigObject(),
                            new BigObject(),
                            new BigObject(),
                            new BigObject(),
                            new BigObject()};

The BigObject class below contains a byte array of 200,000 elements. For this reason, the byte array but not the BigObject class is placed on the Large Object Heap. The code for the class is minimally implemented because the concepts are simple. If an object in the buffer is available for use, the bAvailable field is set to true. The Initialize method initializes an object and makes the status available. The Reset method is called to reset an object from the object pool that is already being used. The reinitialized object is then returned.

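public class BigObject
{
    // other data

    // Prepares the object and marks it as available for use.
    public void Initialize()
    {
        // perform initialization
        bAvailable = true;
    }

    // Reinitializes a pooled object, marks it as in use, and returns it.
    public BigObject Reset()
    {
        Initialize();
        bAvailable = false;
        return this;
    }

    public void Update()
    {
    }

    public bool bAvailable = true;

    // The 200,000-byte array, not the BigObject itself, lands on the Large Object Heap.
    byte[] data = new byte[200000];
}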
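The code that hands out objects from the buffer is not shown here. One way it might look is sketched below, under stated assumptions: PooledBigObject is a stripped-down stand-in for BigObject so that the sketch compiles on its own, and BigObjectPool, Acquire, and the round-robin reuse policy are hypothetical rather than taken from the application.

```csharp
using System;
using System.Collections.Generic;

// Minimal stand-in for the BigObject class.
public class PooledBigObject
{
    public bool bAvailable = true;
    private byte[] data = new byte[200000];

    public PooledBigObject Reset()
    {
        // perform reinitialization here
        bAvailable = false;
        return this;
    }
}

// Hypothetical pool; Acquire and its reuse policy are assumptions.
public class BigObjectPool
{
    private readonly PooledBigObject[] pool;
    private int nextToReuse;

    public BigObjectPool(int size)
    {
        pool = new PooledBigObject[size];
        for (int i = 0; i < size; i++)
        {
            pool[i] = new PooledBigObject();
        }
    }

    public PooledBigObject Acquire()
    {
        foreach (PooledBigObject candidate in pool)
        {
            if (candidate.bAvailable)
            {
                return candidate.Reset();   // hand out an unused object
            }
        }
        // Pool exhausted: recycle an object that is already in use.
        PooledBigObject reused = pool[nextToReuse];
        nextToReuse = (nextToReuse + 1) % pool.Length;
        return reused.Reset();
    }
}

public static class PoolDemo
{
    // Requests objects from a fresh pool and counts the distinct instances returned.
    public static int CountDistinct(int poolSize, int requests)
    {
        BigObjectPool pool = new BigObjectPool(poolSize);
        HashSet<PooledBigObject> distinct = new HashSet<PooledBigObject>();
        for (int i = 0; i < requests; i++)
        {
            distinct.Add(pool.Acquire());
        }
        return distinct.Count;
    }

    public static void Main()
    {
        Console.WriteLine(CountDistinct(10, 15));   // 10
    }
}
```

With a pool of 10 and 15 requests, only 10 distinct objects are ever handed out, so no additional large arrays are allocated.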

I run the application and request 15 objects. This exceeds the pool limit, so only 10 objects are actually created. The additional five requests reuse objects that are already in the pool. Using WinDbg, I have listed the instances of the byte array. WinDbg is a debugging tool that is discussed more thoroughly in Chapter 9. In the following listing, MT refers to the method table of a class. A method table is an array of methods that belong to a particular class. Instances of the same type share the same method table. For this reason, you can list all instances of a type from the address of its method table. In this way, the method table is more of a cookie for a particular type of object than an address. The large byte arrays are the entries with a size of 200016 bytes; the 1036-byte entries are other byte arrays that share the same method table. As expected, there are exactly 10 instances of the large byte array, not 15.

!dumpheap -mt 7912dae8
 Address       MT     Size
014aad34 7912dae8     1036
014ab140 7912dae8     1036
014ab54c 7912dae8     1036
014ab958 7912dae8     1036
02486bc0 7912dae8   200016
024b7920 7912dae8   200016
024e8680 7912dae8   200016
025193e0 7912dae8   200016
0254a140 7912dae8   200016
0257aea0 7912dae8   200016
025abc00 7912dae8   200016
025dc960 7912dae8   200016
0260d6c0 7912dae8   200016
0263e420 7912dae8   200016