Type variance


Type variance is the relationship between subtypes of a composite type and the subtypes of its components. A language's chosen variance determines the relationship between, for example, a list of s and a list of s, or a function returning and a function returning.
If the type is a subtype of, then an expression of type should be substitutable wherever an expression of type is used. Depending on the variance of the type constructor, the subtyping relation of the simple types may be either preserved, reversed, or ignored for the respective complex types. In many programming languages, for example, "list of Cat" will be a subtype of "list of Animal", because the list type constructor is covariant. This means that the subtyping relation of the simple types is preserved for the complex types. On the other hand, "function from Animal to String" is a subtype of "function from Cat to String" because the function type constructor is contravariant in the parameter type. Here, the subtyping relation of the simple types is reversed for the complex types.
A programming language designer will consider variance when devising typing rules for language features such as arrays, inheritance, and generic datatypes. By making type constructors covariant or contravariant instead of invariant, more programs will be accepted as well-typed. On the other hand, programmers often find contravariance unintuitive, and accurately tracking variance to avoid runtime type errors can lead to complex typing rules.
In order to keep the type system simple and allow useful programs, a language may treat a type constructor as invariant even if it would be safe to consider it variant, or treat it as covariant even though that could violate type safety.

Formal definition

Suppose A and B are types, and I denotes application of a type constructor I with type argument U.
Within the type system of a programming language, a typing rule for a type constructor I is:
The article considers how this applies to some common type constructors.

C# examples

For example, in C#, if is a subtype of, then:
  • is a subtype of. The subtyping is preserved because is covariant on.
  • is a subtype of. The subtyping is reversed because is contravariant on.
  • Neither nor is a subtype of the other, because is invariant on.
The variance of a C# generic interface is declared by placing the or attribute on its type parameters. The above interfaces are declared as,, and. Types with more than one type parameter may specify different variances on each type parameter. For example, the delegate type represents a function with a contravariant input parameter of type and a covariant return value of type. The compiler checks that all types are defined and used consistently with their annotations, and otherwise signals a compilation error.
The [|typing rules for interface variance] ensure type safety. For example, an represents a first-class function expecting an argument of type, and a function that can handle any type of animal can always be used instead of one that can only handle cats.

Arrays

Read-only data types can be covariant; write-only data types can be contravariant. Mutable data types which act as both sources and sinks should be invariant. To illustrate this general phenomenon, consider the array type. For the type we can make the type, which is an "array of animals". For the purposes of this example, this array supports both reading and writing elements.
We have the option to treat this as either:
  • covariant: a is an ;
  • contravariant: an is a ;
  • invariant: an is not a and a is not an.
If we wish to avoid type errors, then only the third choice is safe. Clearly, not every can be treated as if it were a, since a client reading from the array will expect a, but an may contain e.g. a. So, the contravariant rule is not safe.
Conversely, a cannot be treated as an. It should always be possible to put a into an. With covariant arrays this cannot be guaranteed to be safe, since the backing store might actually be an array of cats. So, the covariant rule is also not safe—the array constructor should be invariant. Note that this is only an issue for mutable arrays; the covariant rule is safe for immutable arrays. Likewise, the contravariant rule would be safe for write-only arrays.

Covariant arrays in Java and C#

Early versions of Java and C# did not include generics, also termed parametric polymorphism. In such a setting, making arrays invariant rules out useful polymorphic programs.
For example, consider writing a function to shuffle an array, or a function that tests two arrays for equality using the. method on the elements. The implementation does not depend on the exact type of element stored in the array, so it should be possible to write a single function that works on all types of arrays. It is easy to implement functions of type:

boolean equalArrays;
void shuffleArray;

However, if array types were treated as invariant, it would only be possible to call these functions on an array of exactly the type. One could not, for example, shuffle an array of strings.
Therefore, both Java and C# treat array types covariantly.
For instance, in Java is a subtype of, and in C# is a subtype of.
As discussed above, covariant arrays lead to problems with writes into the array. Java and C# deal with this by marking each array object with a type when it is created. Each time a value is stored into an array, the execution environment will check that the run-time type of the value is equal to the run-time type of the array. If there is a mismatch, an or is thrown:

// a is a single-element array of String
String a = new String;
// b is an array of Object
Object b = a;
// Assign an Integer to b. This would be possible if b were actually
// an array of Object, but since it really is an array of String,
// we will get a java.lang.ArrayStoreException at runtime.
b = 1;

In the above example, one can read from the array safely. It is only trying to write to the array that can lead to trouble.
One drawback to this approach is that it leaves the possibility of a run-time error that a stricter type system could have caught at compile-time. Also, it hurts performance because each write into an array requires an additional run-time check.
With the addition of generics, Java and C# now offer ways to write this kind of polymorphic function without relying on covariance. The array comparison and shuffling functions can be given the parameterized types

boolean equalArrays;
void shuffleArray;

Alternatively, to enforce that a C# method accesses a collection in a read-only way, one can use the interface instead of passing it an array.

Function types

Languages with first-class functions have function types like "a function expecting a Cat and returning an Animal".
Those languages also need to specify when one function type is a subtype of another—that is, when it is safe to use a function of one type in a context that expects a function of a different type.
It is safe to substitute a function f for a function g if f accepts a more general type of argument and returns a more specific type than g. For example, functions of type, , and can be used wherever a was expected. The general rule is:
Using inference rule notation the same rule can be written as:
In other words, the → type constructor is contravariant in the parameter type and covariant in the return type. This rule was first stated formally by John C. Reynolds, and further popularized in a paper by Luca Cardelli.
When dealing with functions that take functions as arguments, this rule can be applied several times. For example, by applying the rule twice, we see that if. In other words, the type is covariant in the position of. For complicated types it can be confusing to mentally trace why a given type specialization is or isn't type-safe, but it is easy to calculate which positions are co- and contravariant: a position is covariant if it is on the left side of an even number of arrows applying to it.

Inheritance in object-oriented languages

When a subclass overrides a method in a superclass, the compiler must check that the overriding method has the right type. While some languages require that the type exactly matches the type in the superclass, it is also type safe to allow the overriding method to have a "better" type. By the usual subtyping rule for function types, this means that the overriding method should return a more specific type and accept a more general argument. In UML notation, the possibilities are as follows :
For a concrete example, suppose we are writing a class to model an animal shelter. We assume that is a subclass of, and that we have a base class

class AnimalShelter

Now the question is: if we subclass, what types are we allowed to give to and ?

Covariant method return type

In a language which allows covariant return types, a derived class can override the method to return a more specific type:

class CatShelter extends AnimalShelter

Among mainstream OO languages, Java, C++ and C# support covariant return types. Adding the covariant return type was one of the first modifications of the C++ language approved by the standards committee in 1998. Scala and D also support covariant return types.