Wildcard Generics

Published by marco on

Updated by marco on

This article was originally published on the Encodo Blogs. Browse on over to see more!


As of version 1.5, Java has blessed its developers with generics, which increase expressiveness through improved static typing. With generics, Java programmers should be able to get away from the “casting orgy” to which Java programming heretofore typically devolved. The implementation in 1.5 does not affect the JVM at all and is restricted to syntactic sugar wherein the compiler simply performs the casts for you.
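
To make the “syntactic sugar” claim concrete, here is a minimal sketch (the class ErasureDemo and its helper are made up for illustration): two lists with different type arguments turn out to share a single runtime class, because the type arguments exist only at compile time.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
  // Returns true when both lists have the same runtime class, i.e. when
  // the type arguments have left no trace after compilation.
  static boolean sameRuntimeClass() {
    List<String> strings = new ArrayList<String>();
    List<Integer> numbers = new ArrayList<Integer>();
    return strings.getClass() == numbers.getClass();
  }

  public static void main(String[] args) {
    System.out.println(sameRuntimeClass()); // prints "true"
  }
}
```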

Let’s build a class hierarchy and see how much casting Java saves us. Assume that you have defined a generic hierarchy using the following class:

import java.util.ArrayList;
import java.util.List;

public class DataObject {
  private String name;
  private List<DataObject> subObjects = new ArrayList<DataObject>();

  public String getName() {
    return name;
  }

  public List<DataObject> getSubObjects() {
    return subObjects;
  }
}

Well, now that’s an improvement! The class can express its intent in a relatively clear syntax without creating a specialized list class for the private field and result type. Assume further that there are various sub-classes of this DataObject, which want to provide type-specific helper functions for their sub-lists. For example:

public class A extends DataObject {
}

public class B extends DataObject {
  public List<A> getAs() {
    return getSubObjects();
  }
}

Though this is exactly what we would like, it won’t compile. Instead, the compiler reports the error:

Type mismatch: Cannot convert from List<DataObject> to List<A>

In the next section, we’ll find out why.

Covariance and Catcalls

For some reason, List<A> does not conform to List<DataObject>, even though A inherits from DataObject. The Generics Tutorial (PDF) Section 3 explains:

“In general, if Foo is a subtype (subclass or subinterface) of Bar, and G is some generic type declaration, it is not the case that G<Foo> is a subtype of G<Bar>. This is probably the hardest thing you need to learn about generics, because it goes against our deeply held intuitions.”

Indeed it is hard to learn and indeed it does go against intuitions. Is there a more specific reason why generics are implemented in this way in Java? Java’s competitor, C#, is limited in exactly the same way and the C# Version 2.0 Specification (DOC) or the Google HTML version offers the following explanation:

“No special conversions exist between constructed reference types other than those described in §6. In particular, unlike array types, constructed reference types do not exhibit “covariant” conversions. This means that a type List<B> has no conversion (either implicit or explicit) to List<A> even if B is derived from A. Likewise, no conversion exists from List<B> to List<object>.

“The rationale for this is simple: if a conversion to List<A> is permitted, then apparently one can store values of type A into the list. But this would break the invariant that every object in a list of type List<B> is always a value of type B, or else unexpected failures may occur when assigning into collection classes.”

The key word here is covariance. Neither Java nor C# supports it (except for return types, where there are no dangers involved) because of function calls that, in the Eiffel world, have long been called “catcalls”. Suffice it to say that both Java and C# have elected to limit expressiveness and legibility in order to prevent this type of error from happening.[1]
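
The quoted specification contrasts generics with array types, which are covariant in Java as well. The following sketch (class and method names are invented for this example) shows the kind of run-time failure that the invariance of generics is meant to rule out at compile time:

```java
class ArrayCovarianceDemo {
  // Returns true if storing an Integer through an Object[] reference
  // to a String[] is rejected at run time.
  static boolean storeFailsAtRuntime() {
    Object[] objects = new String[1]; // legal: arrays are covariant
    try {
      objects[0] = Integer.valueOf(42); // compiles, but the array is really a String[]
      return false;
    } catch (ArrayStoreException e) {
      return true; // the JVM checks every store into an array of references
    }
  }

  public static void main(String[] args) {
    System.out.println(storeFailsAtRuntime());
  }
}
```

Because type arguments are erased, the JVM could not perform the equivalent check for a List, which is one way to read the decision to forbid the conversion outright.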

Making It Work in Java

Since Java has clearly stated that it neither condones nor supports what we would like to do, we can choose one of two options:

  1. Be happy with the List<DataObject> and just go back to casting to get the desired <A> when needed
  2. Figure out a way of getting Java to return the desired List<A> without complaining

Since we’re stubborn, we’ll go with (2) above and dig a little deeper into generics. One solution is to create the list on-the-fly and transfer all the elements over to it.

  public List<A> getAs() {
    List<A> result = new ArrayList<A>();
    for (DataObject obj : getSubObjects()) {
      result.add((A) obj);
    }
    return result;
  }

Mmmmm…lovely. It does the soul good and makes the heart swell with pride to write code like this. So clear and understandable, and such a lovely mix of new-style iteration with old-style casting! Methinks we’ll try again. In the first attempt, we returned List<DataObject> from getSubObjects(). Is there another result type we could use?

Wildcards Explained

Java’s generics include something called wildcards, which allow a restricted form of covariance, in which the character ? acts as a placeholder for any class type at all. Wildcards are especially useful for function arguments, where they allow any list of elements to be passed. Imagine we wanted to pass in a list of DataObjects to a function to be printed. Using wildcards, we can write the following:

public void printCollection(Collection<?> _objects) {
  for (Object o : _objects) {
    System.out.println(o);
  }
}

The example above takes any collection at all and prints all of its elements. It only works because the compiler knows that any class that replaces ? must inherit from java.lang.Object, so it can access any methods of that class from within the function. This is extremely limited since we can’t access any DataObject-specific functions, so Java also includes bounded wildcards, which allow a wildcard to restrict the types of objects that may be used as the generic argument. Let’s rewrite printCollection so that we can access DataObject’s members without casting:

public void printCollection(List<? extends DataObject> _objects) {
  for (DataObject o : _objects) {
    System.out.println(o.getName());
  }
}

Whereas this mechanism suffices for the example above, wildcards exact a hidden price: they do not conform to anything. That is, though List<A> conforms to the formal parameter, List<? extends DataObject>, you cannot then call add() on it. The following code doesn’t work:

public void extendCollection(List<? extends DataObject> _objects) {
  _objects.add(new DataObject());
}

The parameter of _objects.add() is of type ? extends DataObject, which is completely unknown to the Java compiler. Therefore, nothing (other than null) conforms to it … not even DataObject itself!

Using the example above, we can recap the different approaches to using generics in Java:

  • Using List<DataObject> as the formal argument doesn’t allow us to pass a List<A>.
  • Using List<?> as the formal argument allows us to use only those functions defined in java.lang.Object on elements of the list.
  • Using List<? extends DataObject> allows us to pass any list of elements whose type conforms to DataObject, but prevents us from calling methods, such as add(), that take the element type as a parameter.
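
The third case can be sketched in a self-contained form. The trimmed-down DataObject below takes its name through a constructor, which is an assumption made only so that the example has something to print; joinNames() and demo() are likewise invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class WildcardDemo {
  // Trimmed-down stand-ins for the classes in the article; the
  // name-taking constructor is an assumption for this sketch.
  static class DataObject {
    private final String name;
    DataObject(String name) { this.name = name; }
    String getName() { return name; }
  }

  static class A extends DataObject {
    A(String name) { super(name); }
  }

  // Reading through a bounded wildcard is fine: whatever the actual
  // element type is, every element is at least a DataObject.
  static String joinNames(List<? extends DataObject> objects) {
    StringBuilder names = new StringBuilder();
    for (DataObject o : objects) {
      if (names.length() > 0) {
        names.append(",");
      }
      names.append(o.getName());
    }
    // objects.add(new DataObject("x")); // would not compile
    return names.toString();
  }

  // Passes a List<A> where a List<DataObject> parameter would reject it.
  static String demo() {
    List<A> as = new ArrayList<A>();
    as.add(new A("first"));
    as.add(new A("second"));
    return joinNames(as);
  }

  public static void main(String[] args) {
    System.out.println(demo()); // prints "first,second"
  }
}
```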

Making It Work

Let’s return now to our original example and see if we can’t apply our new-found knowledge to find a solution. Let’s redefine the result type of the getSubObjects() function to use a wildcard, while leaving the result type of the getAs() function, defined in B, as it was.

  public List<? extends DataObject> getSubObjects() {
    return subObjects;
  }

However, as we saw in the third case above, this return type uses an unknown (unknowable) generic type, so the returned list cannot be extended using add(). Not exactly what we were looking for. Let’s instead put it back the way it was and concentrate on using our newfound knowledge to cast (Yay! Casting! I knew you’d be back!) our result to the correct type. Here’s a naive attempt:

  public List<A> getAs() {
    return (List<A>) getSubObjects();
  }

Ok. From the discussion above, it’s clear this won’t work and the compiler rewards us with the following error message:

Cannot cast from List<DataObject> to List<A>

Fine, let’s try again, this time throwing a wildcard into the mix:

  public List<A> getAs() {
    return (List<A>) (List<? extends DataObject>) getSubObjects();
  }

Sweet! It compiles! We’re definitely on the home stretch now, but there’s still a warning from the compiler:

Type safety: the cast from List<capture-of ? extends DataObject> to List<A> is actually checking against the erased type list.

This is Java’s way of saying that you have done a complete end-run around its type-checking. The “erased type list” is actually List because the compiler uses a strategy called erasure[2] to resolve generic references. The double cast in the example above compiles (and will run), but cannot be statically checked. At this point, there’s nothing more we can do, so we admit defeat the Java way and slap a SuppressWarnings annotation on the function and continue on our way.

  @SuppressWarnings("unchecked")
  public List<A> getAs() {
    return (List<A>) (List<? extends DataObject>) getSubObjects();
  }
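
What the suppressed warning costs can be shown with a short sketch. The classes below are pared down to the bare minimum and failsOnRead() is a made-up helper; the point is only that the double cast itself never fails, while a later read of an element that is not an A does.

```java
import java.util.ArrayList;
import java.util.List;

class UncheckedCastDemo {
  // Minimal stand-ins for the classes in the article.
  static class DataObject {}

  static class A extends DataObject {}

  static class B extends DataObject {
    private final List<DataObject> subObjects = new ArrayList<DataObject>();

    List<DataObject> getSubObjects() {
      return subObjects;
    }

    @SuppressWarnings("unchecked")
    List<A> getAs() {
      return (List<A>) (List<? extends DataObject>) getSubObjects();
    }
  }

  // Returns true if reading a non-A element through getAs() fails.
  static boolean failsOnRead() {
    B parent = new B();
    parent.getSubObjects().add(new B()); // a sub-object that is not an A
    List<A> as = parent.getAs();         // no exception: the cast only checks List
    try {
      A first = as.get(0);               // the compiler-inserted cast to A fails here
      return false;
    } catch (ClassCastException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(failsOnRead());
  }
}
```

The error surfaces at the point of use rather than at the cast, which is exactly the kind of failure that generics were supposed to move to compile time.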

It’s clear that the decision to avoid covariance at all costs has cost the language dearly in terms of expressiveness (and, as a result, type-safety, as evidenced by the casting in the final example). It takes rather a lot of illegible code to express what, at the beginning of the article, seemed a rather simple concept.


[1] Since the Pascal days, it seems that popular, mainstream languages almost always opt for compiler simplicity over programmer expressiveness. Static-typing for languages with covariant parameters offers a more in-depth example of covariance. For more information on this issue and other ways of addressing it without putting the burden on the programmer, see the paper, Type-safe covariance, which offers both an in-depth look at the “problem” of covariance and a concrete solution (which has since been implemented in Eiffel).
[2] This quick overview on Type Erasure (Sun) explains the concept. The reason for this relatively naive implementation of generics is, as almost always in Java, backwards compatibility: “so that new code may continue to interface with legacy code”.

Using Java 1.5

Comments

2 Replies

#1 − just to get sure

Marc

hi marco,

just to get things the right way (as i don’t know java very well):

is this wildcard-thing the same as the “WHERE” in C# 2?

For example:

public class A<MYTYPE>
where MYTYPE : MYBASETYPE, new()
{

}

have to try your examples above in the c# and see if i get this working as well as it looks like a nice exercise for rainy weekends like this ;-)

cheers, marc

#2

marco

Wildcards are a way to provide partial support for proper generics, in which, if B inherits from A, List<B> also inherits from List<A>. In both Java and C#, this is not the case, which makes passing generic parameters all the more difficult.

The where keyword in C# corresponds to the extends keyword in Java. It indicates that the actual generic parameter must conform to the given base type (which can be a class or an interface). In your example, the implementation of the generic class may call any features defined in MYBASETYPE on MYTYPE.
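
As a rough sketch of that correspondence, the Java counterpart of the C# declaration might look as follows. MyBaseType, MyDerivedType and Holder are hypothetical names standing in for the ones in the question, and Java has no equivalent of the new() constraint, so that part is simply omitted.

```java
class BoundedTypeDemo {
  // Hypothetical stand-in for MYBASETYPE.
  static class MyBaseType {
    String describe() {
      return "base";
    }
  }

  static class MyDerivedType extends MyBaseType {
    @Override
    String describe() {
      return "derived";
    }
  }

  // Java counterpart of: class A<MYTYPE> where MYTYPE : MYBASETYPE
  static class Holder<T extends MyBaseType> {
    private final T value;

    Holder(T value) {
      this.value = value;
    }

    String describe() {
      return value.describe(); // allowed because T is at least a MyBaseType
    }
  }

  static String demo() {
    return new Holder<MyDerivedType>(new MyDerivedType()).describe();
  }

  public static void main(String[] args) {
    System.out.println(demo()); // prints "derived"
  }
}
```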