Garbage Collection in .NET

26 Nov 2024

In the common language runtime (CLR), the garbage collector (GC) serves as an automatic memory manager that manages the allocation and release of memory for an application. [1]

All processes on the same computer share the same physical memory and the page file, if there’s one.
Each process has its own, separate virtual address space.
By default, on 32-bit computers, each process has a 2-GB user-mode virtual address space.
The garbage collector allocates and frees virtual memory for an application on the managed heap.
Virtual memory can be in three states:
- Free: The block of memory has no references to it and is available for allocation.
- Reserved: The block of memory is available for use and can’t be used for any other allocation request. However, data can’t be stored in this memory block until it’s committed.
- Committed: The block of memory is assigned to physical storage.
Virtual address space can get fragmented, which means that there are free blocks known as holes in the address space.
When a virtual memory allocation is requested, the virtual memory manager has to find a single free block that is large enough to satisfy the allocation request.
An application can run out of memory if there isn’t enough virtual address space to reserve or physical space to commit.

1. Memory allocation, release, and compaction
2. Generations
3. Unmanaged resources
4. Dispose patterns
- 4.1. System.IAsyncDisposable
- 4.2. System.Object.Finalize
5. Workstation and server GC
- 5.1. Background GC
- 5.2. Background workstation vs. server GC
References

1. Memory allocation, release, and compaction

When initializing a new process, the runtime reserves a contiguous region of address space for the process. This reserved address space is called the managed heap.

The managed heap maintains a pointer to the address where the next object in the heap will be allocated. Initially, this pointer is set to the managed heap’s base address.
All reference types are allocated on the managed heap.
- When an application creates the first reference type, memory is allocated for the type at the base address of the managed heap.
- When the application creates the next object, the runtime allocates memory for it in the address space immediately following the first object.
- Allocating memory from the managed heap is faster than unmanaged memory allocation.
  - Because the runtime allocates memory for an object by adding a value to a pointer, it’s almost as fast as allocating memory from the stack.
  - In addition, because new objects that are allocated consecutively are stored contiguously in the managed heap, an application can access the objects quickly.
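
The pointer-bump allocation described above can be illustrated with a toy model. This is only a sketch of the idea, not the CLR's actual allocator: the `BumpAllocator` class, its addresses, and its capacity are all hypothetical.

```csharp
using System;

// Toy model of bump-pointer allocation (hypothetical; not the CLR's allocator).
public sealed class BumpAllocator
{
    private readonly long _limit; // first address past the reserved region
    private long _next;           // address where the next object will be placed

    public BumpAllocator(long baseAddress, long capacity)
    {
        _next = baseAddress;      // initially points at the heap's base address
        _limit = baseAddress + capacity;
    }

    // Returns the address of the new object, or -1 if the region is full
    // (the point at which a real runtime would trigger a collection).
    public long Allocate(int size)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));
        if (_next + size > _limit) return -1;

        long address = _next;
        _next += size;            // the whole cost of allocating: one addition
        return address;
    }
}

public static class BumpAllocatorDemo
{
    public static void Main()
    {
        var heap = new BumpAllocator(baseAddress: 0x1000, capacity: 64);
        Console.WriteLine(heap.Allocate(24)); // 4096 (the base address)
        Console.WriteLine(heap.Allocate(24)); // 4120 (immediately after the first object)
        Console.WriteLine(heap.Allocate(24)); // -1   (only 16 bytes remain)
    }
}
```

Consecutive allocations land next to each other, which is the locality benefit the list above mentions.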
When the garbage collector performs a collection, it releases the memory for objects that are no longer being used by the application by examining the application’s roots.
- An application’s roots include static fields, local variables on a thread’s stack, CPU registers, GC handles, and the finalize queue.
- Each root either refers to an object on the managed heap or is set to null.
- The garbage collector uses this list to create a graph that contains all the objects that are reachable from the roots.
- Objects that aren’t in the graph are unreachable from the application’s roots.
- The garbage collector considers unreachable objects garbage and releases the memory allocated for them.
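
Reachability from a root can be observed with a small experiment. The following is a sketch under stated assumptions: `ReachabilityDemo` and its members are hypothetical names, and whether the unrooted object is actually reclaimed depends on the build configuration and JIT behavior, so only the rooted case is reliable.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class ReachabilityDemo
{
    // A static field is a root: whatever it references stays reachable.
    private static object? s_rooted;

    // Allocate in a separate, non-inlined method so that no local variable
    // or register in the caller keeps the temporary object alive.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static (WeakReference Rooted, WeakReference Unrooted) Allocate()
    {
        s_rooted = new object();
        var temp = new object(); // referenced only by this method's local

        // A WeakReference tracks an object without acting as a root.
        return (new WeakReference(s_rooted), new WeakReference(temp));
    }

    public static void Main()
    {
        var (rooted, unrooted) = Allocate();

        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        // The rooted object must survive; the unrooted one is usually reclaimed.
        Console.WriteLine($"Rooted object alive:   {rooted.IsAlive}");
        Console.WriteLine($"Unrooted object alive: {unrooted.IsAlive}");
    }
}
```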
The garbage collector examines the managed heap, and uses a memory-copying function to compact the reachable objects in memory, freeing up the blocks of address spaces allocated to unreachable objects.
- Memory is compacted only if a collection discovers a significant number of unreachable objects.
- If all the objects in the managed heap survive a collection, then there’s no need for memory compaction.
To improve performance, the runtime allocates memory for large objects in a separate heap, the large object heap (LOH).
- The garbage collector automatically releases the memory for large objects.
- However, to avoid moving large objects in memory, this memory is usually not compacted.
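
One way to see the separate heap in action is to ask the runtime which generation a freshly allocated array belongs to. This sketch assumes the default large-object threshold of 85,000 bytes, which is not stated above; `LargeObjectDemo` is a hypothetical name.

```csharp
using System;

public static class LargeObjectDemo
{
    // Allocates a byte array and reports the generation the GC assigns to it.
    public static int GenerationOfNewArray(int lengthInBytes) =>
        GC.GetGeneration(new byte[lengthInBytes]);

    public static void Main()
    {
        // A small array starts in generation 0 like any other new object.
        Console.WriteLine(GenerationOfNewArray(1_000));

        // Assumption: the default 85,000-byte threshold is in effect, so this
        // array goes on the LOH, which GC.GetGeneration reports as generation 2.
        Console.WriteLine(GenerationOfNewArray(100_000));
    }
}
```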

Before a garbage collection starts, all managed threads are suspended except for the thread that triggered the garbage collection.

Screenshot of how a thread triggers a Garbage Collection.

2. Generations

The GC algorithm is based on several considerations:

- It’s faster to compact the memory for a portion of the managed heap than for the entire managed heap.
- Newer objects have shorter lifetimes, and older objects have longer lifetimes.
- Newer objects tend to be related to each other and accessed by the application around the same time.

Garbage collection primarily occurs with the reclamation of short-lived objects. To optimize the performance of the garbage collector, the managed heap is divided into three generations, 0, 1, and 2, so it can handle long-lived and short-lived objects separately.

The garbage collector stores newly allocated objects in generation 0, unless they’re large objects, in which case they go directly on the large object heap (LOH), which is sometimes referred to as generation 3.
Objects created early in the application’s lifetime that survive collections are promoted and stored in generations 1 and 2.
Most objects are reclaimed by a generation 0 collection and don’t survive to the next generation.

If an application attempts to create a new object when generation 0 is full, the garbage collector performs a collection to free address space for the object.

After the garbage collector performs a collection of generation 0, it compacts the memory for the reachable objects and promotes them to generation 1.
If a collection of generation 0 doesn’t reclaim enough memory for the application to create a new object, the garbage collector can perform a collection of generation 1 and then generation 2.
Objects in generation 2 that survive a collection remain in generation 2 until they’re determined to be unreachable in a future collection.
Objects on the large object heap (which is sometimes referred to as generation 3) are also collected during a generation 2 collection.
A generation 2 garbage collection is also known as a full garbage collection because it reclaims objects in all generations (that is, all objects in the managed heap).
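
The claim that a generation 2 collection covers every generation can be checked with `GC.CollectionCount`. This is a minimal sketch; `CollectionCountDemo` and `Snapshot` are hypothetical names, and forcing collections with `GC.Collect` is done here only for demonstration.

```csharp
using System;

public static class CollectionCountDemo
{
    // Number of collections that have included generation 0, 1, and 2 so far.
    public static int[] Snapshot() =>
        new[] { GC.CollectionCount(0), GC.CollectionCount(1), GC.CollectionCount(2) };

    public static void Main()
    {
        int[] before = Snapshot();

        GC.Collect(0);               // collect generation 0 only
        int[] afterGen0 = Snapshot();

        GC.Collect(2);               // full collection
        int[] afterFull = Snapshot();

        Console.WriteLine($"Before:          {string.Join(", ", before)}");
        Console.WriteLine($"After gen 0 GC:  {string.Join(", ", afterGen0)}");
        Console.WriteLine($"After full GC:   {string.Join(", ", afterFull)}");
    }
}
```

After the full collection, all three counters advance, because collecting generation 2 also collects generations 0 and 1.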

Objects that aren’t reclaimed in a garbage collection are known as survivors and are promoted to the next generation:

- Objects that survive a generation 0 garbage collection are promoted to generation 1.
- Objects that survive a generation 1 garbage collection are promoted to generation 2.
- Objects that survive a generation 2 garbage collection remain in generation 2.
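
Promotion can be watched from code by holding a reference to an object and forcing collections. This is a sketch only; `PromotionDemo` is a hypothetical name, and the exact generations observed can vary with runtime settings, so the typical result is described in prose rather than asserted in the code.

```csharp
using System;

public static class PromotionDemo
{
    // Records the generation of one surviving object before and after
    // two forced collections.
    public static int[] ObserveGenerations()
    {
        var survivor = new object();
        var seen = new int[3];

        seen[0] = GC.GetGeneration(survivor); // freshly allocated
        GC.Collect();
        seen[1] = GC.GetGeneration(survivor); // survived one collection
        GC.Collect();
        seen[2] = GC.GetGeneration(survivor); // survived two collections

        GC.KeepAlive(survivor); // keep the object reachable until this point
        return seen;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", ObserveGenerations()));
    }
}
```

With default settings the object typically moves from generation 0 to 1 to 2 and then stays there.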

3. Unmanaged resources

The .NET garbage collector doesn’t allocate or release unmanaged memory.

Unmanaged resources require explicit cleanup.
The most common type of unmanaged resource is an object that wraps an operating system resource, such as a file handle, window handle, network connection, or database connection.

Although the garbage collector is able to track the lifetime of an object that encapsulates an unmanaged resource, it doesn’t know how to release and clean up the unmanaged resource.
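
As a minimal sketch of what explicit cleanup looks like, the following allocates unmanaged memory with `Marshal.AllocHGlobal` and releases it in a `finally` block. The `UnmanagedMemoryDemo` class and `RoundTrip` method are hypothetical names chosen for illustration.

```csharp
using System;
using System.Runtime.InteropServices;

public static class UnmanagedMemoryDemo
{
    // Writes a value into unmanaged memory and reads it back.
    public static int RoundTrip(int value)
    {
        // This memory lives outside the managed heap; the GC never frees it.
        IntPtr buffer = Marshal.AllocHGlobal(sizeof(int));
        try
        {
            Marshal.WriteInt32(buffer, value);
            return Marshal.ReadInt32(buffer);
        }
        finally
        {
            // Explicit release; skipping this call would leak the allocation.
            Marshal.FreeHGlobal(buffer);
        }
    }

    public static void Main()
    {
        Console.WriteLine(RoundTrip(42)); // 42
    }
}
```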

4. Dispose patterns

The Dispose method in .NET is primarily for releasing unmanaged resources (like file handles, network connections, memory allocated outside .NET).

Dispose is often used to cascade dispose calls to IDisposable members so that they release their unmanaged resources, but it can also perform other cleanup tasks, for example, freeing memory that was allocated, removing an item that was added to a collection, or signaling the release of a lock that was acquired.
Because the .NET garbage collector doesn’t handle unmanaged memory, the dispose pattern is crucial for types that hold such resources and implement the IDisposable interface.

A well-written Dispose method should be idempotent (callable multiple times without errors), with subsequent calls doing nothing.

using System.IO;

string filePath = "example.txt";
string textToWrite = "Hello, this is a test message!";

// Use the using statement to ensure the StreamWriter is properly disposed of
using (StreamWriter writer = new StreamWriter(filePath))
{
    writer.WriteLine(textToWrite);
}
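
The idempotency requirement mentioned above can be sketched with a small type that counts how often its cleanup actually runs. `CountingResource` and `ReleaseCount` are hypothetical names used only for this illustration.

```csharp
using System;

public sealed class CountingResource : IDisposable
{
    private bool _disposed;

    // How many times the cleanup logic has actually executed.
    public int ReleaseCount { get; private set; }

    public void Dispose()
    {
        if (_disposed) return; // subsequent calls do nothing

        ReleaseCount++;        // stands in for the real cleanup work
        _disposed = true;
    }
}

public static class IdempotentDisposeDemo
{
    public static void Main()
    {
        var resource = new CountingResource();

        using (resource)
        {
            // Work with the resource; Dispose runs when the block exits.
        }

        resource.Dispose(); // second call is a harmless no-op

        Console.WriteLine(resource.ReleaseCount); // 1
    }
}
```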

To enable the deterministic release of unmanaged resources, implement the dispose pattern around IDisposable.Dispose.

public void Dispose()
{
    // Dispose of unmanaged resources.
    Dispose(true);
    // Suppress finalization.
    GC.SuppressFinalize(this);
}

// Any non-sealed class should have a Dispose(bool) overload method.
// If the method call comes from a finalizer, only the code that frees
// unmanaged resources should execute.
protected virtual void Dispose(bool disposing)
{
    if (_disposed)
    {
        return;
    }

    if (disposing)
    {
        // TODO: dispose managed state (managed objects).
    }

    // TODO: free unmanaged resources (unmanaged objects) and override a finalizer below.
    // TODO: set large fields to null.

    _disposed = true;
}

~Disposable() => Dispose(false); // finalizer

The disposing parameter should be false when called from a finalizer, and true when called from the IDisposable.Dispose method. In other words, it is true when deterministically called and false when non-deterministically called.

protected virtual void Dispose(bool disposing)
{
    if (_disposed) return;

    if (disposing) // Deterministic call (from IDisposable.Dispose or using)
    {
        // Dispose managed resources:
        if (_managedResource != null)
        {
            _managedResource.Dispose(); // Safe to access managed objects
            _managedResource = null;
        }
    }

    // Free unmanaged resources:
    if (_unmanagedResource != IntPtr.Zero)
    {
        CloseHandle(_unmanagedResource); // Always release unmanaged resources
        _unmanagedResource = IntPtr.Zero;
    }

    _disposed = true;
}

~Disposable() => Dispose(false); // Non-deterministic call (from finalizer)

To ensure that unmanaged resources are still released, non-deterministically, when the consumer of a type fails to call IDisposable.Dispose, either wrap the unmanaged resource in a safe handle or define a finalizer.

Use a safe handle to wrap the unmanaged resource.

using Microsoft.Win32.SafeHandles;
using System;
using System.Runtime.InteropServices;

public class BaseClassWithSafeHandle : IDisposable
{
    // To detect redundant calls
    private bool _disposed;

    // Instantiate a SafeHandle instance.
    private SafeHandle? _safeHandle = new SafeFileHandle(IntPtr.Zero, true);
    //  FileStream fs = new FileStream(filename, FileMode.Open, FileAccess.ReadWrite);
    //  _safeHandle = new SafeFileHandle(fs.SafeFileHandle.DangerousGetHandle(), true);

    // Public implementation of Dispose pattern callable by consumers.
    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }

    // Protected implementation of Dispose pattern.
    protected virtual void Dispose(bool disposing)
    {
        if (!_disposed)
        {
            if (disposing)
            {
                _safeHandle?.Dispose();
                _safeHandle = null;
            }

            _disposed = true;
        }
    }
}

Or, define a finalizer.

using System;

public class BaseClassWithFinalizer : IDisposable
{
    // To detect redundant calls
    private bool _disposed;

    ~BaseClassWithFinalizer() => Dispose(false);

    // Public implementation of Dispose pattern callable by consumers.
    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }

    // Protected implementation of Dispose pattern.
    protected virtual void Dispose(bool disposing)
    {
        if (!_disposed)
        {
            if (disposing)
            {
                // TODO: dispose managed state (managed objects)
            }

            // TODO: free unmanaged resources (unmanaged objects) and override finalizer
            // TODO: set large fields to null
            _disposed = true;
        }
    }
}

A finalizer is only required if you directly reference unmanaged resources.

Object finalization can be a complex and error-prone operation, so it’s recommended to use a safe handle instead of providing a finalizer.
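
As a sketch of that recommendation, a custom safe handle can own a block of unmanaged memory so that the wrapping class needs no finalizer of its own. `HGlobalSafeHandle` and `NativeBuffer` are hypothetical types written for this illustration; only the base class and the `Marshal` calls are real APIs.

```csharp
using System;
using System.Runtime.InteropServices;
using Microsoft.Win32.SafeHandles;

// Hypothetical safe handle that owns memory from Marshal.AllocHGlobal.
public sealed class HGlobalSafeHandle : SafeHandleZeroOrMinusOneIsInvalid
{
    public HGlobalSafeHandle(int byteCount) : base(ownsHandle: true)
    {
        SetHandle(Marshal.AllocHGlobal(byteCount));
    }

    // Invoked once by the SafeHandle machinery, whether release is triggered
    // by Dispose or by the base class's own finalizer.
    protected override bool ReleaseHandle()
    {
        Marshal.FreeHGlobal(handle);
        return true;
    }
}

// The owning type is sealed and holds only a managed SafeHandle,
// so it needs neither a finalizer nor a Dispose(bool) overload.
public sealed class NativeBuffer : IDisposable
{
    private readonly HGlobalSafeHandle _handle;

    public NativeBuffer(int byteCount) => _handle = new HGlobalSafeHandle(byteCount);

    public bool IsReleased => _handle.IsClosed;

    public void Dispose() => _handle.Dispose();
}

public static class SafeHandleDemo
{
    public static void Main()
    {
        var buffer = new NativeBuffer(256);
        Console.WriteLine(buffer.IsReleased); // False
        buffer.Dispose();
        Console.WriteLine(buffer.IsReleased); // True
    }
}
```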

4.1. System.IAsyncDisposable

IAsyncDisposable.DisposeAsync asynchronously closes or releases unmanaged resources such as files, streams, and handles. Use it instead of IDisposable.Dispose when a dispose operation is resource-intensive, so that it doesn’t block the main thread of a GUI application for a long time. [2]

Classes that implement the IAsyncDisposable interface typically also implement the IDisposable interface, so that they support both synchronous and asynchronous disposal; however, it’s not a requirement.

If a class implements IAsyncDisposable, but not IDisposable, and a consumer only calls Dispose, the implementation would never call DisposeAsync, which would result in a resource leak.

Any nonsealed class should define a DisposeAsyncCore() method that also returns a ValueTask.

public async ValueTask DisposeAsync()
{
    // Perform async cleanup.
    await DisposeAsyncCore();

    // Dispose of unmanaged resources.
    Dispose(false);

    // Suppress finalization.
    GC.SuppressFinalize(this);
}

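protected virtual ValueTask DisposeAsyncCore()
{
    // TODO: asynchronously dispose managed resources here.
    return default; // an already-completed ValueTask
}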

If an implementation of IAsyncDisposable is sealed, the DisposeAsyncCore() method is not needed.

public sealed class SealedExampleAsyncDisposable : IAsyncDisposable
{
    private readonly IAsyncDisposable _example;

    public SealedExampleAsyncDisposable() =>
        _example = new NoopAsyncDisposable();

    // the asynchronous cleanup can be performed directly
    public ValueTask DisposeAsync() => _example.DisposeAsync();
}

An example that implements both the dispose and async dispose patterns:

class ExampleConjunctiveDisposableusing : IDisposable, IAsyncDisposable
{
    IDisposable? _disposableResource = new MemoryStream();
    IAsyncDisposable? _asyncDisposableResource = new MemoryStream();

    public void Dispose()
    {
        Dispose(disposing: true);
        GC.SuppressFinalize(this);
    }

    public async ValueTask DisposeAsync()
    {
        await DisposeAsyncCore().ConfigureAwait(false);

        Dispose(disposing: false);
        GC.SuppressFinalize(this);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (disposing)
        {
            _disposableResource?.Dispose();
            _disposableResource = null;

            if (_asyncDisposableResource is IDisposable disposable)
            {
                disposable.Dispose();
                _asyncDisposableResource = null;
            }
        }
    }

    protected virtual async ValueTask DisposeAsyncCore()
    {
        if (_asyncDisposableResource is not null)
        {
            await _asyncDisposableResource.DisposeAsync().ConfigureAwait(false);
        }

        if (_disposableResource is IAsyncDisposable disposable)
        {
            await disposable.DisposeAsync().ConfigureAwait(false);
        }
        else
        {
            _disposableResource?.Dispose();
        }

        _asyncDisposableResource = null;
        _disposableResource = null;
    }
}

To properly consume an object that implements the IAsyncDisposable interface, use the await and using keywords together.

await using (var writer = new StreamWriter("./hello"))
{
    await writer.WriteAsync("Hello, World!");
}

using var reader = new StreamReader("./hello");
var text = await reader.ReadToEndAsync();
Console.Write(text); // Hello, World!

4.2. System.Object.Finalize

The Finalize method is used to allow an object to try to free resources and perform other cleanup operations before it is reclaimed by garbage collection.

~Object ();

If a type does override the Finalize method, the garbage collector adds an entry for each instance of the type to an internal structure called the finalization queue. The finalization queue contains entries for all the objects in the managed heap whose finalization code must run before the garbage collector can reclaim their memory.

The garbage collector calls the Finalize method automatically under the following conditions:

- After the garbage collector has discovered that an object is inaccessible, unless the object has been exempted from finalization by a call to the GC.SuppressFinalize method.
- On .NET Framework only, during shutdown of an application domain, unless the object is exempt from finalization. During shutdown, even objects that are still accessible are finalized.

Finalize is automatically called only once on a given instance, unless the object is re-registered by using a mechanism such as GC.ReRegisterForFinalize and the GC.SuppressFinalize method has not been subsequently called.

Finalize should be overridden for a class that uses unmanaged resources, such as file handles or database connections that must be released when the managed object that uses them is discarded during garbage collection. It shouldn’t be implemented for managed objects because the garbage collector releases managed resources automatically.

public void Dispose()
{
    // Dispose of unmanaged resources.
    Dispose(true);
    // Suppress finalization.
    GC.SuppressFinalize(this);
}
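
The interaction between Finalize and GC.SuppressFinalize described above can be observed with a small experiment. This is a sketch only: `Tracked` and `FinalizerDemo` are hypothetical names, and because finalization timing is non-deterministic, the count for the undisposed object is typical behavior rather than a guarantee.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

public sealed class Tracked : IDisposable
{
    // Counts how many finalizers have actually run.
    public static int FinalizerRuns;

    ~Tracked() => Interlocked.Increment(ref FinalizerRuns);

    // A real type would release its resources here before suppressing finalization.
    public void Dispose() => GC.SuppressFinalize(this);
}

public static class FinalizerDemo
{
    // Non-inlined so the instance is unreachable once the method returns.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static void CreateGarbage(bool dispose)
    {
        var tracked = new Tracked();
        if (dispose) tracked.Dispose();
    }

    public static void Main()
    {
        CreateGarbage(dispose: true);  // finalizer suppressed, never runs
        CreateGarbage(dispose: false); // finalizer expected to run after a collection

        GC.Collect();
        GC.WaitForPendingFinalizers();

        // Typically 1: only the undisposed instance is finalized.
        Console.WriteLine(Tracked.FinalizerRuns);
    }
}
```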

5. Workstation and server GC

The garbage collector is self-tuning and can work in a wide variety of scenarios. However, the CLR also provides the following types of garbage collection to be set based on the characteristics of the workload: [4]

Workstation garbage collection (GC), which is designed for client apps.
- It’s the default GC flavor for standalone apps.
- For hosted apps, for example, those hosted by ASP.NET, the host determines the default GC flavor.
- Workstation garbage collection can be concurrent or non-concurrent.
  - Concurrent (or background) garbage collection enables managed threads to continue operations during a garbage collection.
  - Background garbage collection replaces concurrent garbage collection in .NET Framework 4 and later versions.
- Workstation garbage collection is always used on a computer that has only one logical CPU, regardless of the configuration setting.
- The collection occurs on the user thread that triggered the garbage collection and remains at the same priority.
Server garbage collection, which is intended for server applications that need high throughput and scalability.
- In .NET Core, server garbage collection can be non-concurrent or background.
- In .NET Framework 4.5 and later versions, server garbage collection can be non-concurrent or background. In .NET Framework 4 and previous versions, server garbage collection is non-concurrent.
  
  Figure 1. Server Garbage Collection Threads
- The collection occurs on multiple dedicated threads. On Windows, these threads run at THREAD_PRIORITY_HIGHEST priority level.
- A heap and a dedicated thread to perform garbage collection are provided for each logical CPU, and the heaps are collected at the same time. Each heap contains a small object heap and a large object heap, and all heaps can be accessed by user code. Objects on different heaps can refer to each other.
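
To check which flavor a running process actually uses, the `GCSettings` class exposes the current mode. This is a minimal sketch; `GcFlavorDemo` and `Describe` are hypothetical names.

```csharp
using System;
using System.Runtime;

public static class GcFlavorDemo
{
    // Summarizes the GC configuration of the current process.
    public static string Describe() =>
        $"Server GC: {GCSettings.IsServerGC}, " +
        $"latency mode: {GCSettings.LatencyMode}, " +
        $"logical CPUs: {Environment.ProcessorCount}";

    public static void Main()
    {
        // The values depend on the host, the machine, and the app's configuration.
        Console.WriteLine(Describe());
    }
}
```

The flavor itself is selected through configuration rather than code, for example with the `System.GC.Server` and `System.GC.Concurrent` settings in runtimeconfig.json.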

5.1. Background GC

In background garbage collection (GC), ephemeral generations (0 and 1) are collected as needed while the collection of generation 2 is in progress. [5]

Background garbage collection is performed on one or more dedicated threads, depending on whether it’s workstation or server GC, and applies only to generation 2 collections.
Background garbage collection is enabled by default.
Background garbage collection removes allocation restrictions imposed by concurrent garbage collection, because ephemeral garbage collections can occur during background garbage collection.

A collection on ephemeral generations during background garbage collection is known as foreground garbage collection.

When foreground garbage collections occur, all managed threads (both dedicated background garbage collection threads and user threads) are suspended.

5.2. Background workstation vs. server GC

Background workstation garbage collection uses one dedicated background garbage collection thread, whereas background server garbage collection uses multiple threads. Typically, there’s a dedicated thread for each logical processor.

Figure 2. Background workstation garbage collection

Unlike the workstation background garbage collection thread, the background server GC threads do not time out.

Figure 3. Background server garbage collection

Starting with .NET Framework 4.5, background garbage collection is available for server GC. Background GC is the default mode for server garbage collection.

Concurrent GC

Concurrent garbage collection is replaced by background garbage collection in later .NET Framework versions. This section applies only to:

- .NET Framework 3.5 and earlier for workstation garbage collection
- .NET Framework 4 and earlier for server garbage collection

Concurrent garbage collection enables interactive applications to be more responsive by minimizing pauses for a collection.
- Managed threads can continue to run most of the time while the concurrent garbage collection thread is running.
Concurrent garbage collection is performed on a dedicated thread.
- By default, the CLR runs workstation garbage collection with concurrent garbage collection enabled on both single-processor and multi-processor computers.
  
  Figure 4. Concurrent Garbage Collection Threads


References