Generics in .NET

26 Dec 2023

Generics in .NET tailor methods, classes, structures, or interfaces to the precise data type they act upon, which increases code reusability and type safety. [1]

For example, List<T> provides type safety and avoids boxing and unboxing overhead compared to using the non-generic ArrayList collection, and Dictionary<TKey, TValue> provides type safety and avoids casting compared to the non-generic Hashtable.

1. Define and use generics
2. Terminology
3. Advantages
4. Covariance and contravariance
5. Generics in the runtime
6. Reflection and Generic Types
Appendix A: FAQs
- A.1. Why do value types NOT need to be boxed in a generic collection in .NET?
- A.2. With reified generics, is the memory still allocated on the managed heap?
References

1. Define and use generics

Generics are classes, structures, interfaces, and methods that have placeholders (type parameters) for one or more of the types that they store or use. [1]

A generic collection class might use a type parameter as a placeholder for the type of objects that it stores, which appears as the types of its fields and the parameter types of its methods.
```
public class SimpleGenericClass<T>
{
    public T Field;
}
```
A generic method might use its type parameter as the type of its return value or as the type of one of its formal parameters.
```
T MyGenericMethod<T>(T arg)
{
    T temp = arg;
    //...
    return temp;
}
```

To create an instance of a generic class, specify the actual types to substitute for the type parameters. This establishes a new generic class, referred to as a constructed generic class, with the chosen types substituted everywhere that the type parameters appear.

```
SimpleGenericClass<string> g = new SimpleGenericClass<string>();
g.Field = "A string";

// SimpleGenericClass.Field           = "A string"
Console.WriteLine("SimpleGenericClass.Field           = \"{0}\"", g.Field);
// SimpleGenericClass.Field.GetType() = System.String
Console.WriteLine("SimpleGenericClass.Field.GetType() = {0}", g.Field.GetType().FullName);
```

2. Terminology

A generic type definition is a class, structure, or interface declaration that functions as a template, with placeholders for the types that it can contain or use.
- For example, the System.Collections.Generic.Dictionary<TKey,TValue> class can contain two types: keys and values.
Generic type parameters, or type parameters, are the placeholders in a generic type or method definition.
- The System.Collections.Generic.Dictionary<TKey,TValue> generic type has two type parameters, TKey and TValue, that represent the types of its keys and values.
A constructed generic type, or constructed type, is the result of specifying types for the generic type parameters of a generic type definition.
A generic type argument is any type that is substituted for a generic type parameter.
The general term generic type includes both constructed types and generic type definitions.
Covariance and contravariance (collectively referred to as variance) of generic type parameters enable using constructed generic types whose type arguments are more derived (covariance) or less derived (contravariance) than a target constructed type.
Generic type constraints limit the types that can be used as type arguments.
A generic method definition is a method with two parameter lists: a list of generic type parameters and a list of formal parameters.
- Type parameters can appear as the return type or as the types of the formal parameters.
```
T MyGenericMethod<T>(T arg)
{
    T temp = arg;
    //...
    return temp;
}
```
- Generic methods can appear on generic or nongeneric types.
  - A method is generic only if it has its own list of type parameters.
  - A method within a generic class isn’t automatically generic, even if it uses the class’s generic type parameters.
```
// G is generic: it declares its own type parameter T.
class A
{
    T G<T>(T arg)
    {
        T temp = arg;
        //...
        return temp;
    }
}

// M is not generic: it only uses the type parameter T of its class.
class MyGenericClass<T>
{
    T M(T arg)
    {
        T temp = arg;
        //...
        return temp;
    }
}
```
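The terms above (type parameter, type argument, constraint, generic method definition) can be seen together in a small sketch. The method name `Max` is a hypothetical helper chosen for illustration, and the snippet assumes a C# program with top-level statements.

```csharp
using System;

// Generic method definition: <T> is the type parameter list and (T a, T b)
// is the formal parameter list. The constraint limits the type arguments
// to types that can compare themselves with other T values.
static T Max<T>(T a, T b) where T : IComparable<T>
{
    return a.CompareTo(b) >= 0 ? a : b;
}

int larger = Max<int>(3, 7);          // explicit type argument: int
string later = Max("apple", "pear");  // type argument inferred as string

Console.WriteLine(larger); // 7
Console.WriteLine(later);  // pear

// Max(new object(), new object());   // compile-time error: object does not
//                                    // satisfy the IComparable<object> constraint
```

The constraint is what allows the body to call `CompareTo`; without it, the compiler would only know that `T` is some type with the members of `object`.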

3. Advantages

Type safety: generics shift the burden of type safety to the compiler.
- There is no need to write code to test for the correct data type because it is enforced at compile time.
- The need for type casting and the possibility of run-time errors are reduced.
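As a rough illustration of the two points above, the following sketch contrasts the non-generic ArrayList with List<T>. The variable names are made up for the example, and it assumes a program with top-level statements.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Non-generic collection: elements are stored as object, so a wrong element
// type is only discovered when a cast fails at run time.
ArrayList untyped = new ArrayList();
untyped.Add(1);
untyped.Add("two");           // compiles, although it is probably a mistake

bool castFailed = false;
try
{
    int total = 0;
    foreach (object item in untyped)
    {
        total += (int)item;   // throws InvalidCastException on "two"
    }
}
catch (InvalidCastException)
{
    castFailed = true;
}
Console.WriteLine($"Run-time cast failed: {castFailed}"); // Run-time cast failed: True

// Generic collection: the compiler rejects the wrong element type up front.
List<int> typed = new List<int>();
typed.Add(1);
// typed.Add("two");          // compile-time error: cannot convert string to int
typed.Add(2);

int sum = 0;
foreach (int item in typed)   // no cast needed
{
    sum += item;
}
Console.WriteLine($"Sum: {sum}"); // Sum: 3
```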

Code reusability: no need to inherit from a base type and override members.

```
// LinkedList<T> is ready for immediate use.
LinkedList<string> llist = new LinkedList<string>();
```

Better performance: generic collection types generally perform better for storing and manipulating value types because there is no need to box the value types.

Boxing and Unboxing (C# Programming Guide)

Boxing is the process of converting a value type to the type object or to any interface type implemented by the value type.

When the common language runtime (CLR) boxes a value type, it wraps the value inside a System.Object instance and stores it on the managed heap.

Unboxing extracts the value type from the object.

Boxing is implicit; unboxing is explicit.

The concept of boxing and unboxing underlies the C# unified view of the type system in which a value of any type can be treated as an object.

Boxing is a specific operation that involves converting a value type to an object reference type. Storing a value type on the heap can happen in various ways, but it’s not always considered boxing.
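A minimal sketch of the boxing and unboxing behavior described above, assuming a program with top-level statements; the variable names are illustrative only.

```csharp
using System;

int i = 123;
object boxed = i;          // boxing: implicit, copies the value into a heap object
i = 456;                   // changing the original does not affect the boxed copy
int unboxed = (int)boxed;  // unboxing: requires an explicit cast

Console.WriteLine(boxed);   // 123
Console.WriteLine(unboxed); // 123

// Unboxing must name the exact value type that was boxed, even when a
// numeric conversion exists between the two types.
bool unboxFailed = false;
try
{
    _ = (long)boxed;        // the box holds an int, not a long
}
catch (InvalidCastException)
{
    unboxFailed = true;
}
Console.WriteLine(unboxFailed); // True
```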

Generic delegates enable type-safe callbacks without the need to create multiple delegate classes.

For example, the Predicate<T> generic delegate makes it possible to write a method that implements search criteria for a particular type and to use that method with methods of the Array type such as Find, FindLast, and FindAll.
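The following sketch shows one Predicate<string> instance reused with the three Array methods mentioned above. The sample data and variable names are invented for illustration, and the snippet assumes top-level statements with nullable reference types enabled.

```csharp
using System;

string[] names = { "Ada", "Grace", "Alan", "Linus", "Anders" };

// One generic delegate type serves any element type; here T is string.
Predicate<string> startsWithA = name => name.StartsWith("A", StringComparison.Ordinal);

string? first = Array.Find(names, startsWithA);
string? last = Array.FindLast(names, startsWithA);
string[] all = Array.FindAll(names, startsWithA);

Console.WriteLine(first);      // Ada
Console.WriteLine(last);       // Anders
Console.WriteLine(all.Length); // 3
```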

4. Covariance and contravariance

Covariance and contravariance are terms that refer to the ability to use a more derived type (more specific) or a less derived type (less specific) than originally specified. [2]

A covariant type parameter is marked with the out keyword (Out keyword in Visual Basic).

```
interface ICovariant<out R>
{
    void DoSomething(Action<R> callback);
}
```

```
interface ICovariant<out R>
{
    // only contravariant or invariant types can be used in generic constraints.
    void DoSomething<T>() where T : R; // compiler error!
}
```

A contravariant type parameter is marked with the in keyword (In keyword in Visual Basic).

```
interface IContravariant<in A>
{
    void SetSomething(A sampleArg);
    // a contravariant type parameter can be used as a type constraint for an interface method.
    void DoSomething<T>() where T : A;
    A GetSomething(); // compiler error!
}
```

An interface or delegate type can have both covariant and contravariant type parameters.
```
public delegate TResult Func<in T, out TResult>(T arg);
```

Only interface types and delegate types can have variant type parameters.

```
// NOT allowed!
// public class MyClass<out T> { ... }  // Error: Variance is invalid for classes
```

In general, a covariant type parameter can be used as the return type of a delegate or method, and contravariant type parameters can be used as parameter types.

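```
// Func<in T, out TResult> is declared in the System namespace as:
// public delegate TResult Func<in T, out TResult>(T arg);

// Covariance (out): Return type can be more specific (Dog?)
// Contravariance (in): Parameter type can be less specific (Animal)
Func<Animal, Dog?> animalToDog = (Animal animal) =>
{
    if (animal is Dog dog)
    {
        return dog;
    }
    return null;
};

// Variance makes this assignment legal: the target expects a Dog parameter
// and an Animal? result, and animalToDog accepts more and returns less.
Func<Dog, Animal?> dogToAnimal = animalToDog;

public class Animal { }
public class Dog : Animal { }
```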

Liskov’s notion of a behavioural subtype defines a notion of substitutability for objects; that is, if S is a subtype of T, then objects of type T in a program may be replaced with objects of type S without altering any of the desirable properties of that program (e.g. correctness). [3]

The Liskov substitution principle imposes some standard requirements on signatures that have been adopted in newer object-oriented programming languages (usually at the level of classes rather than types):
- Contravariance of method parameter types in the subtype.
- Covariance of method return types in the subtype.
- New exceptions cannot be thrown by the methods in the subtype, except if they are subtypes of exceptions thrown by the methods of the supertype.

When referring to a type system, covariance, contravariance, and invariance have the following definitions.

Covariance enables using a more derived type than originally specified.
- An instance of IEnumerable<Derived> can be assigned to a variable of type IEnumerable<Base>.
- Covariant type parameters enable making assignments that look much like ordinary polymorphism.
```
IEnumerable<Derived> d = new List<Derived>();
IEnumerable<Base> b = d;
```
Contravariance enables using a more generic (less derived) type than originally specified.
- An instance of Action<Base> can be assigned to a variable of type Action<Derived>.
- Contravariance, compared to covariance, seems counterintuitive at first, but it is safe: a delegate that accepts any Base can always be called with a Derived argument.
```
Action<Base> b = (target) => { Console.WriteLine(target.GetType().Name); };
Action<Derived> d = b;
d(new Derived());
```
Invariance, neither covariant nor contravariant, can use only the type originally specified.
- An instance of List<Base> cannot be assigned to a variable of type List<Derived> or vice versa.
```
// public class List<T> : ...
List<Base> bases = new List<Derived>(); // compiler error
```

Covariance and contravariance are collectively referred to as variance.

A generic type parameter that is not marked covariant or contravariant is referred to as invariant.
A brief summary of facts about variance in the common language runtime:
- Variant type parameters are restricted to generic interface and generic delegate types.
- A generic interface or generic delegate type can have both covariant and contravariant type parameters.
- Variance applies only to reference types; if you specify a value type for a variant type parameter, that type parameter is invariant for the resulting constructed type.
- Variance does not apply to delegate combination.
  
  Variance allows the Action<Base> delegate to be assigned to a variable of type Action<Derived>, but delegates can combine only if their types match exactly.
- An overriding method can declare a more derived return type than the method it overrides, and an overriding, read-only property can declare a more derived type.
```
abstract class Animal
{
    public abstract Food GetFood();
    ...
}

class Tiger : Animal
{
    public override Meat GetFood() => ...;
}
```
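The rule in the summary that variance applies only to reference types can be checked with a short sketch. It uses string and object from the base class library in place of the Derived and Base placeholders, assumes top-level statements, and uses Type.IsAssignableFrom only to make the result visible at run time.

```csharp
using System;
using System.Collections.Generic;

// Reference type argument: covariance applies.
IEnumerable<string> strings = new List<string> { "a", "b" };
IEnumerable<object> objects = strings;            // OK
foreach (object o in objects)
{
    Console.WriteLine(o);                         // a, then b
}

// Value type argument: the constructed type is invariant.
// IEnumerable<object> boxedNumbers = new List<int> { 1, 2 }; // compile-time error

// The same rule is visible through reflection.
bool referenceOk = typeof(IEnumerable<object>).IsAssignableFrom(typeof(IEnumerable<string>));
bool valueOk = typeof(IEnumerable<object>).IsAssignableFrom(typeof(IEnumerable<int>));
Console.WriteLine(referenceOk); // True
Console.WriteLine(valueOk);     // False
```

A value type element would have to be boxed to be viewed as an object, which changes its representation, so the runtime does not treat IEnumerable<int> as an IEnumerable<object>.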

5. Generics in the runtime

When compiled to MSIL, generic types and methods include metadata indicating their type parameters. MSIL usage then varies depending on whether the provided type argument is a value or reference type. [4]

When a generic type is first constructed with a value type as a parameter, the runtime creates a specialized generic type with the supplied parameter or parameters substituted in the appropriate locations in the MSIL.
- Specialized generic types are created one time for each unique value type that is used as a parameter.
- If the same generic type is later constructed with a different value type, the runtime generates another specialized version of the generic type and substitutes the type arguments in the appropriate locations in the MSIL.
- Conversions are no longer necessary because each specialized generic class natively contains the value type.
The first time a generic type is constructed with any reference type, the runtime creates a specialized generic type with object references substituted for the parameters in the MSIL.
- Then, every time that a constructed type is instantiated with a reference type as its parameter, regardless of what type it is, the runtime reuses the previously created specialized version of the generic type, because all references are the same size.
- Because the number of reference types can vary wildly from program to program, the C# implementation of generics greatly reduces the amount of code by reducing to one the number of specialized classes created by the compiler for generic classes of reference types.

Moreover, when a generic C# class is instantiated by using a value type or reference type parameter, reflection can query it at run time and both its actual type and its type parameter can be ascertained.

The runtime creates specific versions of the generic type based on the actual types used to instantiate the generic type. For example, if you have a List<T> and you create a List<int> and a List<double>, the CLR will create two separate versions of the List class, one for each of those value types.

When you instantiate the generic type with a reference type, like List<string> or List<object>, the CLR reuses the same version of the List class that it has already created for reference types.

However, the .NET CLR maintains type safety by treating these as separate types at the type system level, even though the underlying implementation is the same.
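The sharing of generated code is not directly observable from C#, but the type-system side of the statement above is. The sketch below assumes top-level statements and only uses reflection and the `is` operator to show that constructed types stay distinct while sharing one generic type definition.

```csharp
using System;
using System.Collections.Generic;

Type ofString = typeof(List<string>);
Type ofObject = typeof(List<object>);
Type ofInt = typeof(List<int>);

// Distinct constructed types at the type-system level...
bool sameType = ofString == ofObject;
Console.WriteLine(sameType);            // False

// ...that all come from the same generic type definition.
bool sameDefinition =
    ofString.GetGenericTypeDefinition() == ofObject.GetGenericTypeDefinition()
    && ofInt.GetGenericTypeDefinition() == typeof(List<>);
Console.WriteLine(sameDefinition);      // True

// Type safety is preserved at run time: a List<string> is not a List<object>.
object list = new List<string>();
bool isListOfObject = list is List<object>;
bool isListOfString = list is List<string>;
Console.WriteLine(isListOfObject);      // False
Console.WriteLine(isListOfString);      // True
```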

6. Reflection and Generic Types

From the point of view of reflection, the difference between a generic type and an ordinary type is that a generic type has associated with it a set of type parameters (if it is a generic type definition) or type arguments (if it is a constructed type). A generic method differs from an ordinary method in the same way. [5]

There are two keys to understanding how reflection handles generic types and methods:

- The type parameters of generic type definitions and generic method definitions are represented by instances of the Type class.
- If an instance of Type represents a generic type, then it includes an array of types that represent the type parameters (for generic type definitions) or the type arguments (for constructed types). The same is true of an instance of the MethodInfo class that represents a generic method.
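Both points can be sketched with Type.GetGenericArguments and MethodInfo.GetGenericArguments. The example assumes top-level statements and picks Dictionary<TKey,TValue> and Array.Empty<T> purely as convenient library types to inspect.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Generic type definition: the array holds the type parameters themselves.
Type[] typeParameters = typeof(Dictionary<,>).GetGenericArguments();
foreach (Type p in typeParameters)
{
    Console.WriteLine($"{p.Name}: IsGenericParameter = {p.IsGenericParameter}");
}
// TKey: IsGenericParameter = True
// TValue: IsGenericParameter = True

// Constructed type: the array holds the type arguments.
Type[] typeArguments = typeof(Dictionary<string, int>).GetGenericArguments();
foreach (Type a in typeArguments)
{
    Console.WriteLine($"{a.Name}: IsGenericParameter = {a.IsGenericParameter}");
}
// String: IsGenericParameter = False
// Int32: IsGenericParameter = False

// A generic method definition exposes its type parameters the same way.
MethodInfo empty = typeof(Array).GetMethod("Empty")!;
Type[] methodParameters = empty.GetGenericArguments();
Console.WriteLine($"{empty.IsGenericMethodDefinition} {methodParameters[0].Name}"); // True T
```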

A generic type or method is closed if instantiable types have been substituted for all its type parameters, including all the type parameters of all enclosing types. You can only create an instance of a generic type if it is closed.

```
WriteLine(typeof(Dictionary<,>));
WriteLine(typeof(MySet<>));
WriteLine(typeof(MySet<int>));
MySet<int> mySet = (MySet<int>)typeof(MySet<int>).GetConstructor([])!.Invoke(null);
mySet.AddRange([1, 2, 1, 1, 2, 1, 1, 2, 3, 4]);
mySet.Keys.ToList().ForEach(Console.WriteLine);

static void WriteLine(Type type)
{
    string message = string.Format("{0} {1,-31} {2}",
        $"IsGenericType: {type.IsGenericType}.",
        $"IsGenericTypeDefinition: {type.IsGenericTypeDefinition}.",
        $"Open or Closed: {(type.ContainsGenericParameters ? "Open" : "Closed")}.");
    Console.WriteLine(message);
}

class MySet<E> : Dictionary<E, bool> where E : notnull
{
    public void AddRange(params E[] items) => Array.ForEach(items, Add);

    public void Add(E e) { if (ContainsKey(e)) return; Add(e, true); }
}
// $ dotnet run
// IsGenericType: True. IsGenericTypeDefinition: True.  Open or Closed: Open.
// IsGenericType: True. IsGenericTypeDefinition: True.  Open or Closed: Open.
// IsGenericType: True. IsGenericTypeDefinition: False. Open or Closed: Closed.
// 1
// 2
// 3
// 4
```
Appendix A: FAQs

A.1. Why do value types NOT need to be boxed in a generic collection in .NET?

> Why do value types NOT need to be boxed in a generic collection in .NET?

* In .NET, boxing is the process of converting a value type to a reference type. It involves creating a new object on the heap and copying the value into that object. Unboxing is the reverse process. Both can be costly in terms of performance.

When using a generic collection, like List<T>, there’s no need to box the value types because the collection’s storage is typed as T rather than object, so the values are stored as they are, without being converted to an object.

Generics in .NET use a mechanism called 'reified generics', which means that the system knows the actual types a generic class was instantiated with, even at runtime. The runtime uses this information to create a specialized version of the generic type for each value type, so the values are stored directly in the collection and never need to be boxed or unboxed.

That’s one of the key benefits of utilizing generics in .NET: they allow you to enforce type safety without incurring the performance cost of boxing and unboxing for value types.
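One rough way to see the cost difference is to compare how many bytes each collection allocates while being filled. The sketch below assumes .NET Core 3.0 or later (for GC.GetAllocatedBytesForCurrentThread) and top-level statements; the element count is arbitrary, and the exact byte counts depend on the runtime and platform, so none are shown.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

const int Count = 100_000;

long before = GC.GetAllocatedBytesForCurrentThread();
var boxedList = new ArrayList(Count);
for (int n = 0; n < Count; n++)
{
    boxedList.Add(n);      // each int is boxed into its own heap object
}
long boxedBytes = GC.GetAllocatedBytesForCurrentThread() - before;

before = GC.GetAllocatedBytesForCurrentThread();
var genericList = new List<int>(Count);
for (int n = 0; n < Count; n++)
{
    genericList.Add(n);    // the int is copied straight into the backing int[]
}
long genericBytes = GC.GetAllocatedBytesForCurrentThread() - before;

Console.WriteLine($"ArrayList: {boxedBytes:N0} bytes allocated");
Console.WriteLine($"List<int>: {genericBytes:N0} bytes allocated");

// Keep both collections reachable until the measurements are done.
GC.KeepAlive(boxedList);
GC.KeepAlive(genericList);
```

The ArrayList figure should come out several times larger, because it pays for one reference slot plus one boxed object per element, while List<int> pays only for the int values themselves.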

A.2. With reified generics, is the memory still allocated on the managed heap?

> With reified generics, is the memory still allocated on the managed heap?

* Yes, the memory for any object, including those constructed using generic types, is still allocated on the managed heap in .NET. The difference lies in how the elements are stored.

With reference types, an object is created on the heap and a reference to this object is used to access it. Value types, on the other hand, hold their value directly. When value types are used in a generic collection such as List<T>, a contiguous block of memory is allocated on the heap for the collection to store the actual values directly rather than references to the values. This improves speed and reduces pressure on the garbage collector.

Reified generics preserve the information about the specific type that a generic type was instantiated with, which enables efficient memory allocation and type safety at runtime.


References