Generics and Delegates in C#
This article was originally published on the Encodo Blogs. Browse on over to see more!
The term DRY—Don’t Repeat Yourself—has become more and more popular lately as a design principle. This is nothing new and is the main principle underlying object-oriented programming. As OO programmers, we’ve gotten used to using inheritance and polymorphism to encapsulate concepts. Until recently, languages like C# and Java have had only very limited support for re-using functionality across larger swathes of code.[1] To illustrate this, let’s take a look at a simple class with a descendent as well as some code that deals with lists of these objects and their properties.
Let’s start with some basic definitions[2]:
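using System.Collections.Generic;

// A pet with a read-only name and house-training flag.
class Pet
{
    public string Name
    {
        get { return _Name; }
    }
    public bool IsHouseTrained
    {
        get { return _IsHouseTrained; }
    }
    private string _Name;
    private bool _IsHouseTrained = true;
}
// A descendant of Pet with one extra operation.
class Dog : Pet
{
    public void Bark() {}
}
// An owner holds a list of pets.
class Owner
{
    public IList<Pet> Pets
    {
        get { return _Pets; }
    }
    private IList<Pet> _Pets = new List<Pet>();
}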
This is basically boilerplate for articles about inheritance, so let’s move on to working with these classes. Imagine that the Owner
wants to find all pets named “Fido”:
IList<Pet> FindPetsNamedFido()
{
    IList<Pet> result = new List<Pet>();
    foreach (Pet p in Pets)
    {
        if (p.Name == "Fido")
        {
            result.Add(p);
        }
    }
    return result;
}
Again, no surprises yet. This is a standard loop in C#, using the foreach
construct and generics to loop through the list in a type-safe manner. Applying the DRY principle, however, we see that we’re going to end up writing a lot of these loops, especially if we offer many different ways of analyzing the data in the list of pets. Essentially, the code above is a completely standard loop except for the condition: the (p.Name == "Fido")
part. We can then imagine a function with the following form:
IList<Pet> FindPets(??? condition)
{
    IList<Pet> result = new List<Pet>();
    foreach (Pet p in Pets)
    {
        if (condition(p))
        {
            result.Add(p);
        }
    }
    return result;
}
Introducing Delegates
Now we need to figure out what type condition
has. From the function body, we see that it takes a parameter of type Pet
and returns a bool
value. In C#, a type that describes the signature of a function is called a delegate
, which is also the keyword used to declare one; for the signature above, we write:
delegate bool MatchesCondition(Pet item);
As mentioned above, the return type is a bool
, the single parameter is of type Pet
, and the delegate is identified by the name MatchesCondition. The name of the parameter is purely for documentation. We can then rewrite the function signature above using the delegate we just defined:
IList<Pet> FindPets(MatchesCondition condition) {…}
We’ve managed to move the looping code for many common situations into a shared method. Now, how do we use it? We originally wanted to find all pets named “Fido”, so we need to define a function that does just that, matching the function signature defined by MatchesCondition
:
bool IsNamedFido(Pet p)
{
    return p.Name == "Fido";
}
In this fashion, we can write any number of methods, which check various conditions on Pet
s. To use this method, we simply pass it to the shared FindPets
method, like this:
IList<Pet> petsNamedFido = FindPets(IsNamedFido);
IList<Pet> petsNamedRex = FindPets(IsNamedRex);
IList<Pet> houseTrainedPets = FindPets(IsHouseTrained);
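Put together, the pieces so far might look like the following minimal sketch. The Pet constructor, the static modifier on the condition methods, the sample data and the Program class are assumptions added here so that the example runs on its own; they are not part of the definitions above.

```csharp
using System;
using System.Collections.Generic;

class Pet
{
    // Constructor added for this sketch so that pets can be given names.
    public Pet(string name, bool isHouseTrained)
    {
        _Name = name;
        _IsHouseTrained = isHouseTrained;
    }

    public string Name { get { return _Name; } }
    public bool IsHouseTrained { get { return _IsHouseTrained; } }

    private readonly string _Name;
    private readonly bool _IsHouseTrained;
}

// Delegate type describing "takes a Pet, returns a bool".
delegate bool MatchesCondition(Pet item);

class Owner
{
    public IList<Pet> Pets { get { return _Pets; } }

    // The shared loop: only the condition varies between callers.
    public IList<Pet> FindPets(MatchesCondition condition)
    {
        IList<Pet> result = new List<Pet>();
        foreach (Pet p in Pets)
        {
            if (condition(p))
            {
                result.Add(p);
            }
        }
        return result;
    }

    // Named methods whose signatures match MatchesCondition.
    public static bool IsNamedFido(Pet p) { return p.Name == "Fido"; }
    public static bool IsHouseTrained(Pet p) { return p.IsHouseTrained; }

    private readonly IList<Pet> _Pets = new List<Pet>();
}

static class Program
{
    // Hypothetical sample data: three pets, one named Fido, two house-trained.
    public static Owner CreateSampleOwner()
    {
        Owner owner = new Owner();
        owner.Pets.Add(new Pet("Fido", true));
        owner.Pets.Add(new Pet("Rex", false));
        owner.Pets.Add(new Pet("Bella", true));
        return owner;
    }

    static void Main()
    {
        Owner owner = CreateSampleOwner();
        Console.WriteLine(owner.FindPets(Owner.IsNamedFido).Count);    // prints 1
        Console.WriteLine(owner.FindPets(Owner.IsHouseTrained).Count); // prints 2
    }
}
```

Passing the bare method name is enough; the compiler creates the delegate instance from it.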
Anonymous Methods
This is better than the previous situation—in which we would have repeated the loop again and again—but we can do better. The problem with this solution is that it tends to clutter the class (Owner
in this case) with many little methods that are useful only in conjunction with FindPets
. Even if the methods are private, it’s a shame to have to use a full-fledged method as a kludge for instancing a piece of code to be called. The C# designers thought so too, so they added anonymous methods, which have a parameter list and a body, but no name. Using anonymous methods, we can replace the methods, IsNamedFido
, IsNamedRex
and IsHouseTrained
, with the following code:
IList<Pet> petsNamedFido = FindPets(delegate(Pet p) { return p.Name == "Fido"; });
IList<Pet> petsNamedRex = FindPets(delegate(Pet p) { return p.Name == "Rex"; });
IList<Pet> houseTrainedPets = FindPets(delegate(Pet p) { return p.IsHouseTrained; });
Again, the keyword delegate
introduces a parameter list and body for the anonymous method.
Generic Functions
All of the code above uses the generic IList
and List
classes. None of the looping code in FindPets
is dependent on the type of the list element except for the condition
. It would be really nice if we could re-use this code not just for Pet
s, but for any collection of elements. Generic functions to the rescue. A generic function has one or more generic parameters, which can be used throughout the parameter list and implementation body. The first step in making FindPets
fully generic is to change the definition of MatchesCondition
:
delegate bool MatchesCondition<T>(T item);
As with a generic class, the function’s generic arguments appear within pointy brackets after the identifier—in this case, the single generic parameter is named T
. Pet
has been replaced as the type of the parameter as well. In order to finish making FindPets
fully generic, we’ll have to pass it a list to work with (right now it always uses Pets
) and change the name, so as to avoid confusion:
IList<T> FindItems<T>(IList<T> list, MatchesCondition<T> condition)
{
    IList<T> result = new List<T>();
    foreach (T item in list)
    {
        if (condition(item))
        {
            result.Add(item);
        }
    }
    return result;
}
We’re not quite done yet, though. If you look closely at the function body, all it does is enumerate the items in the parameter list
. Therefore, we can loosen the type-constraint of the parameter from IList
to IEnumerable
, so that it can be called with any collection from all of .NET.
IList<T> FindItems<T>(IEnumerable<T> list, MatchesCondition<T> condition) {…}
And … we’re done. Fully generic! Let’s see how that looks using the examples from above:
IList<Pet> petsNamedFido = FindItems<Pet>(Pets, delegate(Pet p) { return p.Name == "Fido"; });
IList<Pet> petsNamedRex = FindItems<Pet>(Pets, delegate(Pet p) { return p.Name == "Rex"; });
IList<Pet> houseTrainedPets = FindItems<Pet>(Pets, delegate(Pet p) { return p.IsHouseTrained; });
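To see that the generic version really is independent of Pet, here is a small self-contained sketch that applies it to an array of integers and a queue of strings. The Finder and Program classes and the sample data are assumptions made for illustration only.

```csharp
using System;
using System.Collections.Generic;

delegate bool MatchesCondition<T>(T item);

static class Finder
{
    // Works for anything enumerable; only the condition knows about T.
    public static IList<T> FindItems<T>(IEnumerable<T> list, MatchesCondition<T> condition)
    {
        IList<T> result = new List<T>();
        foreach (T item in list)
        {
            if (condition(item))
            {
                result.Add(item);
            }
        }
        return result;
    }
}

static class Program
{
    static void Main()
    {
        // An array is enumerable, so it can be searched directly.
        int[] numbers = { 1, 2, 3, 4, 5, 6 };
        IList<int> evens = Finder.FindItems<int>(numbers, delegate(int n) { return n % 2 == 0; });

        // So is a queue, which is not an IList at all.
        Queue<string> names = new Queue<string>();
        names.Enqueue("Fido");
        names.Enqueue("Rex");
        names.Enqueue("Bella");
        IList<string> shortNames = Finder.FindItems<string>(names, delegate(string s) { return s.Length <= 3; });

        Console.WriteLine(evens.Count);      // prints 3
        Console.WriteLine(shortNames.Count); // prints 1
    }
}
```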
Though we’ve lost something in legibility, we’ve gained quite a bit in re-use. Imagine now that an Owner
also has a list of Vehicle
s, a list of Properties
and a list of Relative
s. You only have to write the conditions themselves and you can search any type of container for items matching any condition … all in a statically type-safe manner:
IList<Pet> petsNamedFido = FindItems<Pet>(Pets, delegate(Pet p) { return p.Name == "Fido"; });
IList<Vehicle> redCars = FindItems<Vehicle>(Vehicles, delegate(Vehicle v) { return (v is Car) && (((Car)v).Color == Red); });
IList<Property> bigLand = FindItems<Property>(Properties, delegate(Property p) { return p.Acreage >= 1000; });
IList<Relative> deadBeats = FindItems<Relative>(Relatives, delegate(Relative r) { return r.MoneyOwed > 0; });
Note: .NET 2.0 offers this functionality in the library for both the List<T>
and Array
classes. In the official version, MatchesCondition<T>
is called Predicate<T>
and FindItems
is called FindAll
. It is not known why these functions don’t apply to all collections, as illustrated in our example.
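For comparison, the library versions can be used as in the following short sketch. List<T>.FindAll and Array.FindAll are the real framework methods; the Program class, the helper method names and the sample data are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;

static class Program
{
    // List<T>.FindAll takes a Predicate<T> and returns a new List<T>.
    public static List<string> NamesUpToFourLetters(List<string> names)
    {
        return names.FindAll(delegate(string s) { return s.Length <= 4; });
    }

    // Array.FindAll is a static method that returns a new array.
    public static int[] Evens(int[] numbers)
    {
        return Array.FindAll<int>(numbers, delegate(int n) { return n % 2 == 0; });
    }

    static void Main()
    {
        List<string> names = new List<string>(new string[] { "Fido", "Rex", "Bella" });
        Console.WriteLine(NamesUpToFourLetters(names).Count);          // prints 2
        Console.WriteLine(Evens(new int[] { 1, 2, 3, 4, 5 }).Length);  // prints 2
    }
}
```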
Extension Methods
Can we do something about the legibility of the solution from the last section? In C# 2.0, we’ve reached the end of the line. If you’ve been following the development of “Orcas” (C# 3.0 and .NET 3.5), you might have heard of extension methods[3], which allow you to extend existing classes with new functions without inheriting from them. Let’s extend any IEnumerable
with our find function:
public static class MyVeryOwnExtensions
{
    public static IList<T> FindItems<T>(this IEnumerable<T> list, MatchesCondition<T> condition)
    {
        // implementation from above
    }
}
The keyword this
in the parameter list above indicates to the compiler that FindItems
is an extension method for the type following it: IEnumerable<T>
. Now, we can call FindItems
with a bit more legibility and clarity, dropping both the generic parameter and the actual argument (Pet
and Pets
, respectively) and replacing them with a method call on Pets
directly.
IList<Pet> petsNamedFido = Pets.FindItems(delegate(Pet p) { return p.Name == "Fido"; });
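A complete, compilable version of the extension method could look like the sketch below. It uses a list of strings instead of pets to stay short; the Program class and the sample data are assumptions, and the extension class is left non-public here so that it can use the non-public delegate type.

```csharp
using System;
using System.Collections.Generic;

delegate bool MatchesCondition<T>(T item);

static class MyVeryOwnExtensions
{
    // "this" on the first parameter makes FindItems callable as list.FindItems(...).
    public static IList<T> FindItems<T>(this IEnumerable<T> list, MatchesCondition<T> condition)
    {
        IList<T> result = new List<T>();
        foreach (T item in list)
        {
            if (condition(item))
            {
                result.Add(item);
            }
        }
        return result;
    }
}

static class Program
{
    static void Main()
    {
        List<string> names = new List<string>(new string[] { "Fido", "Rex", "Fido" });

        // T is inferred from the list, so no explicit <string> is needed.
        IList<string> fidos = names.FindItems(delegate(string s) { return s == "Fido"; });

        Console.WriteLine(fidos.Count); // prints 2
    }
}
```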
Contravariance
For brevity’s sake, the examples in this section assume use of the extension method defined above. To use the examples with C# 2.0, simply rewrite them to use the non-extended syntax.
We use anonymous methods to avoid declaring methods that will be used for one-off calculations. However, larger methods or methods that are reused throughout a class properly belong to the class as full-fledged methods. At the top, we defined a descendent of the Pet
class called Dog
. Imagine that each Owner
has not only a list of Pet
s, but also a list of Dog
s. Then we’d like to bring back our IsNamedFido
method in order to be able to apply it against both lists (copied from above):
bool IsNamedFido(Pet p)
{
    return p.Name == "Fido";
}
Now we can use this method to test against lists of pets or lists of dogs:
IList<Pet> petsNamedFido = Pets.FindItems(IsNamedFido);
IList<Dog> dogsNamedFido = Dogs.FindItems(IsNamedFido);
The example above illustrates an interesting property of delegates, called contravariance. Because of this property, we can use IsNamedFido
—which takes a parameter of type Pet
—when calling FindItems<Dog>
. That means that IsNamedFido
can be used with any list containing objects descended from Pet
. Unfortunately, contravariance only applies in this very special case; the type of dogsNamedFido
cannot be IList<Pet>
because IList<Dog>
does not conform to IList<Pet>
.[4]
However, this courtesy extends only to named methods. If we wanted to replace the call to IsNamedFido
with a call to an anonymous method, we’d be forced to specify the exact type for the parameter, as shown below:
IList<Dog> dogsNamedFido = Dogs.FindItems(delegate(Dog d) { return d.Name == "Fido"; });
Using Pet
as the parameter type does not compile even though it is simply an in-place reformulation of the previous example. Enforcing the constraint here does not restrict the expressiveness of the language in any way, but it’s interesting to note that the compiler relaxes the rule against contravariance only when it absolutely has to.
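The difference between the two cases can be sketched as follows. The Pet constructor, the non-extension form of FindItems and the Program class are assumptions made to keep the example self-contained; the line that does not compile is left as a comment.

```csharp
using System;
using System.Collections.Generic;

class Pet
{
    // Constructor added for this sketch so that pets can be given names.
    public Pet(string name) { _Name = name; }
    public string Name { get { return _Name; } }
    private readonly string _Name;
}

class Dog : Pet
{
    public Dog(string name) : base(name) { }
}

delegate bool MatchesCondition<T>(T item);

static class Program
{
    public static bool IsNamedFido(Pet p) { return p.Name == "Fido"; }

    public static IList<T> FindItems<T>(IEnumerable<T> list, MatchesCondition<T> condition)
    {
        IList<T> result = new List<T>();
        foreach (T item in list)
        {
            if (condition(item))
            {
                result.Add(item);
            }
        }
        return result;
    }

    public static IList<Dog> FindDogsNamedFido(IEnumerable<Dog> dogs)
    {
        // A named method taking a Pet is accepted where a delegate taking
        // a Dog is expected: every Dog passed in is also a Pet.
        MatchesCondition<Dog> condition = IsNamedFido;

        // The equivalent anonymous method is rejected by the compiler
        // because its parameter type must match the delegate exactly:
        // MatchesCondition<Dog> bad = delegate(Pet p) { return p.Name == "Fido"; };

        return FindItems<Dog>(dogs, condition);
    }

    static void Main()
    {
        List<Dog> dogs = new List<Dog>();
        dogs.Add(new Dog("Fido"));
        dogs.Add(new Dog("Rex"));
        Console.WriteLine(FindDogsNamedFido(dogs).Count); // prints 1
    }
}
```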
Closures
In the previous section, we created a method, IsNamedFido
, instead of using an anonymous method, to avoid duplicate code. In that spirit, suppose we further believe that a name-checking function that checks against a constant is also not general enough[5]. Suppose we write the following function instead:
bool IsNamed(Pet p, string name)
{
    return p.Name == name;
}
Unfortunately, there is no way to call this method directly because it takes two parameters and doesn’t match the signature of MatchesCondition
(and even contravariance won’t save us). You can, however, drop back to using a combination of the defined method and an anonymous method:
IList<Pet> petsNamedFido = Pets.FindItems(delegate (Pet p) { return IsNamed(p, "Fido"); });
This version is a good deal less legible, but serves to show how you can at least pack most of the functionality away into an anonymous method, repeating as little as possible. If the anonymous method uses local or instance variables, those variables are captured along with the delegate. Note that C# captures the variables themselves rather than copies of their values, so the delegate sees whatever value a captured variable holds at the time the delegate is invoked, not the value it held when the delegate was created.
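The following sketch shows how an anonymous method in C# treats a local variable it uses: the variable itself is captured, so a later assignment is visible the next time the delegate runs. The Program class, the method name and the sample data are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;

delegate bool MatchesCondition<T>(T item);

static class Program
{
    public static IList<T> FindItems<T>(IEnumerable<T> list, MatchesCondition<T> condition)
    {
        IList<T> result = new List<T>();
        foreach (T item in list)
        {
            if (condition(item))
            {
                result.Add(item);
            }
        }
        return result;
    }

    // Returns the number of matches before and after the captured
    // variable is reassigned.
    public static int[] CountMatchesBeforeAndAfterChange()
    {
        List<string> names = new List<string>(new string[] { "Fido", "Rex", "Fido" });

        string wanted = "Fido";
        MatchesCondition<string> isWanted = delegate(string s) { return s == wanted; };

        int before = FindItems<string>(names, isWanted).Count; // matches "Fido"

        // The delegate shares the variable, so it now matches "Rex".
        wanted = "Rex";
        int after = FindItems<string>(names, isWanted).Count;

        return new int[] { before, after };
    }

    static void Main()
    {
        int[] counts = CountMatchesBeforeAndAfterChange();
        Console.WriteLine(counts[0] + ", " + counts[1]); // prints 2, 1
    }
}
```

If a snapshot of the value is wanted instead, copy it into a fresh local variable that is never reassigned and use that copy inside the anonymous method.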
For comparison, Java does not support proper closures, requiring final
hacks and creation of anonymous classes in order to perform the task outlined above. Various proposals aim to extend Java in this direction, but, as of version 6, none have yet found their way into the language specification.
Agents
On a final note, it would be nice to have a cleaner notation for formulating the method call above—in which additional parameters to a function must be collected manually into an anonymous method. The Eiffel programming language offers such an alternative, calling their delegates agents instead[6]. The conformance rules for agents for a method signature like MatchesCondition<T>
are different, requiring not that the signature match perfectly, but only that all non-conforming parameters be provided at the time the agent is created.
Eiffel uses question marks to indicate where actual arguments are to be mapped to the agent, so in pseudo-C# syntax, the method call above would be written as:
IList<Pet> petsNamedFido = Pets.FindItems(agent IsNamed(?, "Fido"));
This is much more concise and expressive than the C# version. It differs enough from an actual function call—through the rather obvious and syntax-highlightable keyword, agent—but not so much as to suggest an entirely different mechanism. The developer is made aware that it’s not a regular method call, but a delayed one. C# could easily implement such a feature as pure syntactic sugar, compiling the agent expression to the previous formulation automatically. Perhaps in C# 4.0?
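Until such syntax exists, something in the same spirit can be approximated with a small helper, as in the sketch below. The delegate MatchesConditionWith, the helper BindSecond and the classes Agents and Program are hypothetical names invented for this example; they are not part of C#, .NET or Eiffel.

```csharp
using System;

class Pet
{
    // Constructor added for this sketch so that pets can be given names.
    public Pet(string name) { _Name = name; }
    public string Name { get { return _Name; } }
    private readonly string _Name;
}

delegate bool MatchesCondition<T>(T item);

// Hypothetical two-argument counterpart, introduced only for this sketch.
delegate bool MatchesConditionWith<T, A>(T item, A argument);

static class Agents
{
    // Hypothetical helper: fixes the second argument now and leaves the
    // first one open, roughly what agent IsNamed(?, "Fido") expresses.
    public static MatchesCondition<T> BindSecond<T, A>(MatchesConditionWith<T, A> condition, A argument)
    {
        return delegate(T item) { return condition(item, argument); };
    }
}

static class Program
{
    public static bool IsNamed(Pet p, string name) { return p.Name == name; }

    static void Main()
    {
        MatchesCondition<Pet> isNamedFido = Agents.BindSecond<Pet, string>(IsNamed, "Fido");

        Console.WriteLine(isNamedFido(new Pet("Fido"))); // prints True
        Console.WriteLine(isNamedFido(new Pet("Rex")));  // prints False
    }
}
```

The helper only covers the case of binding the second of two parameters; a language-level agent syntax would handle any position and any number of arguments.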
All in all, though, C#’s support for generics, closures and DRY programming is eminently useful and looks only to improve in upcoming versions like C# 3.0 with LINQ, which introduces inferred typing, a mechanism that will improve legibility and expressiveness dramatically.
[4] This restriction reduces the expressiveness of the language, but C# imposes it because it cannot statically prevent incorrect objects from being added to the resulting list. Building on the example above, if we assume a class Cat
also descended from Pet
, it would then be possible to do the following:
IList<Pet> dogsNamedFido = Dogs.FindItems(IsNamedFido);
dogsNamedFido.Add(new Cat());
This would cause a run-time error because the actual instance attached to dogsNamedFido
can only contain Dog
s. Instead of adding run-time checking for this special case and enhancing the expressiveness of the language—as Eiffel or Scala, for example, do—C# forbids it entirely, as does Java.
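The hazard itself can be observed in C# with arrays, which, unlike the generic list interfaces, do allow an array of Dog to be treated as an array of Pet and check every store at run time. The sketch below uses empty placeholder classes and a hypothetical Program class purely for illustration.

```csharp
using System;

class Pet { }
class Dog : Pet { }
class Cat : Pet { }

static class Program
{
    // Returns true if storing a Cat into a Dog array fails at run time.
    public static bool AddingCatToDogArrayFails()
    {
        // Allowed by the compiler: arrays of reference types are covariant.
        Pet[] pets = new Dog[1];
        try
        {
            // Compiles, but the underlying array can only hold Dogs.
            pets[0] = new Cat();
            return false;
        }
        catch (ArrayTypeMismatchException)
        {
            return true;
        }
    }

    static void Main()
    {
        Console.WriteLine(AddingCatToDogArrayFails()); // prints True
    }
}
```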
For further information, the articles, Generic type parameter variance in the CLR and Using ConvertAll to Imitate Native Covariance/Contravariance in C# Generics, are also useful. For more information on closures in C#, see C#: Anonymous methods are not closures and The Power of Closures in C#.