Friday, June 12, 2009

Hacking a C# Covariant Generic Cast Solution

So I get onto the phone for my first job interview and everything is going great until they hit me with this question.  Could the following pseudo code compile?

class A { } 
class B: A { }
... 
List<A> a = new List<B>(); 

I had no idea what that would do... I am a recovering VB developer who was a recovering C/C++ developer who was a recovering assembly developer, so details about C# generics are not my strong suit.  I knew it wouldn’t compile in C++, and I guessed that you could probably write code in List<> that runs into problems, so I muttered something along the lines of "uh, I don't think it will compile… that doesn't look very good."  The interviewer told me it would not compile and was an example of the lack of support for covariance/contravariance in generics in C#.  I didn’t get the job. 

Covariance... contravariance... what the heck are they?  If you are like I was and don’t know what those words mean, try Eric Lippert's series on the subject.  Read all the installments. 

The gist is that this situation occurs enough that making it work would be handy.  Imagine if you had a grid that displayed a List<DataRow> and you had defined your own MyDataRow class that inherited from DataRow.  Wouldn’t you want to pass your List<MyDataRow> to the grid? 

Back to the interview question.  Since class B inherits from class A, at first glance it looks like it should work.  A little research shows that it can work in some other languages.  A neat aside is that C# supports covariant arrays: 

A[] a = new B[1000];  // this compiles 

Although, it is at the expense of runtime checking:

A[] aa = new B[1000];
aa[0] = new B();  // looking good...
aa[1] = new A();  // runtime ArrayTypeMismatchException

Unfortunately, the developers of C# decided to omit support for covariance and contravariance in generic type parameters for now.  The cost of runtime checking is a good enough reason for me (I like performance!  Faster, please!)  In all fairness, C# isn’t meant to be the highest performing language in the world.  So why not do a quick runtime check on assignments and when passing parameters?  Why not allow inherited classes to pass the check?  The answer is they had more reasons than performance.  Consider this:

class B : A {
  int _SomeVar;  // B is larger than A now...  
}

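Now B is larger than A.  Suppose List<T> had a function that 1) created a new T and 2) added the new T to its list.  Called through a List<A> reference, T would be A, so a plain A would land in what is really a List<B>.  Anything that later read _SomeVar from that element would run off the end of the smaller A object, so you would need a runtime exception to prevent memory corruption, even though B is an inherited class: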

public static class Extension {
  // runtime exception or memory corruption… you pick!
  public static void AddNew<T>(this List<T> list) where T : new() {
    T item = new T();
    list.Add(item);
  }
}

class Program {
  static void Main(string[] args) {
    List<A> a = new List<B>();  // if this compiled..
    a.AddNew();                 // boom here!
  }
}

What’s the use of covariance if it causes runtime exceptions all the time, even for inherited classes?  Needless to say covariant/contravariant generics didn’t make it into the language yet. 
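Since the List<A> a = new List<B>() line never compiles, the "boom" above can't be observed directly. Arrays are covariant, though, so the same experiment can be run against them. The following is only a sketch of that analogy: FillNew and ArrayAnalogDemo are made-up names for illustration, and the array stands in for the hypothetical covariant list.

```csharp
using System;

class A { }
class B : A { int _SomeVar; }  // B carries an extra field, as in the example above

static class ArrayAnalogDemo
{
    // Array counterpart of AddNew<T>: creates a new T for every slot.
    public static void FillNew<T>(T[] items) where T : new()
    {
        for (int i = 0; i < items.Length; i++)
            items[i] = new T();  // element type is checked by the CLR on each store
    }

    // Returns true if filling a B[] through an A[] reference is rejected at run time.
    public static bool FillIsRejected()
    {
        A[] a = new B[1];        // compiles: reference-type arrays are covariant
        try
        {
            FillNew(a);          // T is inferred from the static type, so T is A
            return false;
        }
        catch (ArrayTypeMismatchException)
        {
            return true;         // the runtime refuses to put an A into a B[]
        }
    }

    static void Main()
    {
        Console.WriteLine(FillIsRejected());  // prints True
    }
}
```

The type argument is inferred from the compile-time type of the reference (A[]), not from the object's runtime type (B[]). A covariant List<T> would have needed the same kind of runtime check to stay safe.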

But what if we have a situation where we need to make the cast and we know it's safe?  All is not lost!  I won’t bore you with all the work I did to try to hack the exact layout of the internal MethodTable structure that is used to hold type information.  I got off track assuming that since there were definitions for covariant generic parameters, all I had to do was use a memory hack to turn on the GenericParameterAttributes.Covariant flag in the T parameter for List<T>.  I almost got to the point where I could do that, except there is an internal handle-based system for storing certain things, and the attributes were stored in there.  I was unable to break into the handle storage… yet.  But that is for another day. 
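For reference, the flag in question can at least be read (not written) through ordinary reflection. The sketch below uses a hypothetical helper, VarianceOf, in a made-up class, VarianceFlagDemo. It reports the variance bits of the first type parameter of an open generic type.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

static class VarianceFlagDemo
{
    // Returns only the variance bits (Covariant / Contravariant) of the
    // first generic type parameter of an open generic type such as List<>.
    public static GenericParameterAttributes VarianceOf(Type openGeneric)
    {
        Type typeParameter = openGeneric.GetGenericArguments()[0];
        return typeParameter.GenericParameterAttributes
             & GenericParameterAttributes.VarianceMask;
    }

    static void Main()
    {
        // T in List<T> is declared without any variance, so this prints None.
        Console.WriteLine(VarianceOf(typeof(List<>)));
    }
}
```

Reflection exposes this value as read-only metadata, which is why flipping it would take the kind of memory hack described above.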

I ended up with this.  Note, unsafe code follows:

delegate List<A> CCastDelegate(List<B> b);

...

DynamicMethod dynamicMethod = new DynamicMethod(
"foo1",
typeof(List<A>),
new[] { typeof(List<B>) },
typeof(void));

ILGenerator il = dynamicMethod.GetILGenerator();
il.Emit(OpCodes.Ldarg_0);  // copy first argument to stack
il.Emit(OpCodes.Ret);      // return the item on the stack

CCastDelegate HopeThisWorks = (CCastDelegate)
  dynamicMethod.CreateDelegate(typeof(CCastDelegate));

...

List<A> a = HopeThisWorks(new List<B>());

Basically, I created a very simple function using IL that takes a single parameter and returns it.  The parameter is defined as a List<B> and the return value is defined as a List<A>.  This simply bypasses the type-safety checking that C# works so hard to maintain.

My full test application code is here:

using System;
using System.Collections.Generic;
using System.Text;
using System.Reflection.Emit;

namespace Covariant
{
  class A
  {
    public virtual string Name() { return "A"; }
  }

  class B : A
  {
    public override string Name() { return "B"; }
  }

  delegate List<A> CCastDelegate(List<B> b);

  class Program
  {
    static unsafe List<A> CastBasAIL(List<B> bIn)
    {
      // This creates a simple IL function that takes a
      // parameter and returns it.
      // Since it takes it as one type and returns it as
      // another, it bypasses the type checking
      DynamicMethod dynamicMethod = new DynamicMethod(
        "foo1",
        typeof(List<A>),
        new[] { typeof(List<B>) },
        typeof(void));

      ILGenerator il = dynamicMethod.GetILGenerator();
      il.Emit(OpCodes.Ldarg_0);  // copy first arg to stack
      il.Emit(OpCodes.Ret);      // return item on the stack

      CCastDelegate HopeThisWorks = (CCastDelegate)
        dynamicMethod.CreateDelegate(typeof(CCastDelegate));

      return HopeThisWorks(bIn);
    }

    static void Main(string[] args)
    {
      // make a list<B>
      List<B> b = new List<B>();
      b.Add(new B());
      b.Add(new B());

      // set list<A> = the list b using the work around
      List<A> a = CastBasAIL(b);

      // at this point the debugger is miffed with a, but
      // code executing methods of a work just fine.
      // It may be that the debugger simply checks that type
      // of the generic argument matches the
      // signature of the type, or it may be that something
      // is really screwed up.  Nothing ever crashes.

      // prove the cast really worked
      TestA(a);

      // add some more elements to B
      b.Add(new B());

      // element added to B shows up in A like we expected
      TestA(a);

      return;
    }

    static void TestA(List<A> a)
    {
      Console.WriteLine("Input type: {0}",
        typeof(List<A>).ToString());
      Console.WriteLine("Passed in type: {0}\n",
        a.GetType().ToString());

      // Prove that A is B
      Console.WriteLine("Count = {0}", a.Count);
      Console.WriteLine("Item.Name = {0}", a[0].Name());

      // see if more complicated methods of List<A> still work
      int i = a.FindIndex(
        delegate(A item) {
          return item.Name() == "A";
        }
      );
      Console.WriteLine(
        "Index of 1st A in List<A> = {0}", i);

      i = a.FindIndex(
        delegate(A item) {
          return item.Name() == "B";
        }
      );
      Console.WriteLine(
        "Index of 1st B in List<A> = {0}\n", i);

      // can we convert a to an array still?
      Console.WriteLine(
        "Iterate through a, after converting a to an array");
      foreach (var x in a.ToArray())
        Console.WriteLine("{0}", x.Name());
    }
  }
}

Too bad I figured this all out after the interview! 

Bear in mind, I developed and checked this on framework 3.5, under 32-bit Windows XP.  I think it will work on other operating systems and address sizes, but I leave that exercise up to the reader.  Since this is an unsafe cast, the compiler is not protecting you from yourself.  You will get runtime errors and/or crashes if you are not careful.
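When a live, shared view of the original list isn't actually required, a copy avoids the risk entirely. This is a minimal sketch of that alternative, not the technique described in this post: CopyAsBase and SafeCopyDemo are hypothetical names, and the copy is built with the standard List<T>.ConvertAll method.

```csharp
using System;
using System.Collections.Generic;

class A { public virtual string Name() { return "A"; } }
class B : A { public override string Name() { return "B"; } }

static class SafeCopyDemo
{
    // Builds a genuine List<A> holding the same element references as the List<B>.
    public static List<A> CopyAsBase(List<B> source)
    {
        return source.ConvertAll<A>(delegate(B item) { return item; });
    }

    static void Main()
    {
        List<B> b = new List<B>();
        b.Add(new B());

        List<A> a = CopyAsBase(b);

        // Unlike the IL cast, the copy does not track later changes to b.
        b.Add(new B());
        Console.WriteLine("a.Count = {0}, b.Count = {1}", a.Count, b.Count);  // a.Count = 1, b.Count = 2

        // Adding a plain A is safe here because a really is a List<A>.
        a.Add(new A());
        Console.WriteLine(a[1].Name());  // A
    }
}
```

The trade-off is an extra allocation and the loss of the shared view: elements added to b afterwards do not show up in a.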

2 comments:

Jaskaran said...

Can you post the MethodTable structure in .net 3.5.

Goff said...

Wow what an interview question..