U2U Consult TechDays 2011 CD Available for Download

At TechDays 2011 in Antwerp, U2U Consult distributed a CD-ROM with two free tools. I’m happy to announce that the CD-ROM contents are now also available for download from our web site.

The U2U Consult SQL Database Analyzer is a tool for SQL Server database administrators and developers. It displays diagnostic information about a database and its hosting instance that is hard to collect when you only use the standard SQL Server tools. Just point the tool at a SQL Server database of your choice, and have a look at the reports generated by the tool.

The U2U Consult Code Analysis Rules for Visual Studio 2010 are a series of additional code analysis rules for Visual Studio 2010 Premium or Ultimate, and two rule sets with recommended rules for libraries and applications. The rules include additional general performance and design rules, as well as a series of rules specifically for WCF. All rules are documented on the CD-ROM. Obviously, they are applicable to all .NET languages, including C# and VB.

With this CD, for the first time we make two of our own tools available to you. These are only two small components out of the U2U Consult Framework, but we hope they are as useful to you as they are to us and our clients. Enjoy.

Farewell Visitor

The Visitor design pattern was first documented in 1995 by the Gang of Four. It’s a workaround for the fact that most strongly typed object oriented languages only support single dispatch, even when sometimes double dispatch is required. With C# 4, we no longer need this workaround. We now have something better; more on that below. Let’s look at an example.

The traditional Visitor pattern

The System.Linq.Expressions namespace contains types that enable code expressions to be represented as objects in the form of expression trees. For example, the following C# statement creates an expression tree:

Expression<Func<double, double>> f = x => Math.Sin(1 + 2 * x);

 

There are many kinds of expressions. The above example creates several objects of different types, including ParameterExpression, ConstantExpression, BinaryExpression and MethodCallExpression, all of which inherit from Expression.

There are many ways to represent expressions as text. For example, we can use an infix notation (as most programming languages do), but we can also use prefix or postfix notation. Anyone who has ever worked with an HP scientific calculator, or a programming language such as Forth, will appreciate postfix notation, also known as Reverse Polish notation.

As a result, the way to translate a particular expression into a text representation depends on two things: the kind of expression and the kind of notation. More precisely, the method to execute to translate an expression object into a string object depends on the type of the expression, and on the type of the translation algorithm. Using a virtual function would allow the system to choose a method based on one of these dimensions, e.g. the type of expression, but not on both. Virtual functions provide single dispatch, but we need double dispatch.

In fact, the expression class has a virtual ToString() method, inherited from object. Every type inheriting from Expression has its own version, making the algorithm depend on the type of expression. But it’s a hardcoded implementation, using an infix notation. What if we want a postfix ToString? Or Prefix? Or a C# or F# syntax? This is where the Visitor pattern can help us, and luckily the Expression class has support for it. The class ExpressionVisitor is an abstract base class for algorithms working on expressions. Now when I say algorithms, you probably think of methods with parameters and return values, but that’s not how a Visitor works. A Visitor is an object of some class, and parameters must be passed in, typically via the constructor. The return value, i.e. the result of the algorithm must be read back from a property. Let’s create a base class for our ToString visitors:

public abstract class ToStringVisitor : ExpressionVisitor
{
    protected readonly StringBuilder resultAccumulator = new StringBuilder();

    public string Result
    {
        get { return resultAccumulator.ToString(); }
    }
}

This provides us with a base class for Visitors that have a string Result property. There are some issues with it, such as when to Clear() the StringBuilder when the Visitor is reused, but let’s not get into those. We can now create a ToPostfixStringVisitor, and encapsulate it behind a static method:

public static class ExpressionExtensions
{
    public static string ToPostfixString(this Expression<Func<double, double>> function)
    {
        var visitor = new ToPostfixStringVisitor();

        visitor.Visit(function);

        return visitor.Result;
    }

    private class ToPostfixStringVisitor : ToStringVisitor
    {
        protected override Expression VisitLambda<T>(Expression<T> node)
        {
            // enables reusing the visitor; not absolutely required here as the only
            // place where an instance can be created is in the ToPostfixString method.
            this.resultAccumulator.Clear();

            foreach (var parameter in node.Parameters)
            {
                this.Visit(parameter);
            }

            this.resultAccumulator.Append("-> ");

            this.Visit(node.Body);

            return node;
        }

        protected override Expression VisitParameter(ParameterExpression node)
        {
            this.resultAccumulator.Append(node.Name);
            this.resultAccumulator.Append(' ');

            return node;
        }

        protected override Expression VisitBinary(BinaryExpression node)
        {
            this.Visit(node.Left);
            this.Visit(node.Right);

            switch (node.NodeType)
            {
                case ExpressionType.Add:
                case ExpressionType.AddChecked:
                    this.resultAccumulator.Append('+');
                    break;
                case ExpressionType.Multiply:
                case ExpressionType.MultiplyChecked:
                    this.resultAccumulator.Append('*');
                    break;
                case ExpressionType.Subtract:
                case ExpressionType.SubtractChecked:
                    this.resultAccumulator.Append('-');
                    break;
                case ExpressionType.Divide:
                    this.resultAccumulator.Append('/');
                    break;
                case ExpressionType.Modulo:
                    this.resultAccumulator.Append('%');
                    break;
                default:
                    throw new NotSupportedException();
            }

            this.resultAccumulator.Append(' ');

            return node;
        }

        protected override Expression VisitMethodCall(MethodCallExpression node)
        {
            foreach (var arg in node.Arguments)
            {
                this.Visit(arg);
            }

            this.resultAccumulator.Append(node.Method.Name);
            this.resultAccumulator.Append(' ');

            return node;
        }

        protected override Expression VisitConstant(ConstantExpression node)
        {
            this.resultAccumulator.Append(node.Value);

            this.resultAccumulator.Append(' ');

            return node;
        }
    }
}

 

For example, the following line outputs x -> 1 2 x * + Sin

Console.WriteLine(ExpressionExtensions.ToPostfixString(x => Math.Sin(1 + 2 * x)));

 

It works, even though for the sake of example it only supports a very small subset of all expressions. At least in the case of binary operators, it throws a NotSupportedException for operators that are, well, not supported. I really should add a bunch of other methods, for example:


       
        protected override Expression VisitConditional(ConditionalExpression node)
        {
            throw new NotSupportedException();
        }

        protected override Expression VisitBlock(BlockExpression node)
        {
            throw new NotSupportedException();
        }

 

Anyway, how does it work? The Visit() method calls an internal virtual method on Expression called Accept. Being virtual, this chooses what kind of expression to work on and it calls the appropriate VisitX method in the visitor. This one is virtual as well, and it chooses the correct algorithm, our ToPostfixStringVisitor in this case.

So we have double dispatch, via a combination of two single dispatch calls.
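To make the two single dispatch calls easier to see in isolation, here is a minimal sketch of the same mechanism on a made-up expression model. The types Node, NumberNode, AddNode, INodeVisitor and PostfixVisitor are hypothetical and exist only for this illustration; they are not part of System.Linq.Expressions.

```csharp
using System;
using System.Text;

// Hypothetical mini expression model, invented for this sketch only.
public interface INodeVisitor
{
    void VisitNumber(NumberNode node);
    void VisitAdd(AddNode node);
}

public abstract class Node
{
    // First dispatch: the virtual call selects the override by node type.
    public abstract void Accept(INodeVisitor visitor);
}

public sealed class NumberNode : Node
{
    public readonly double Value;
    public NumberNode(double value) { Value = value; }

    // Second dispatch: the interface call selects the algorithm by visitor type.
    public override void Accept(INodeVisitor visitor) { visitor.VisitNumber(this); }
}

public sealed class AddNode : Node
{
    public readonly Node Left;
    public readonly Node Right;
    public AddNode(Node left, Node right) { Left = left; Right = right; }

    public override void Accept(INodeVisitor visitor) { visitor.VisitAdd(this); }
}

public sealed class PostfixVisitor : INodeVisitor
{
    // State lives in a field and is read back through a property.
    private readonly StringBuilder result = new StringBuilder();

    public string Result { get { return result.ToString(); } }

    public void VisitNumber(NumberNode node)
    {
        result.Append(node.Value).Append(' ');
    }

    public void VisitAdd(AddNode node)
    {
        node.Left.Accept(this);
        node.Right.Accept(this);
        result.Append("+ ");
    }
}

public static class VisitorSketch
{
    public static string ToPostfix(Node node)
    {
        var visitor = new PostfixVisitor();
        node.Accept(visitor);
        return visitor.Result;
    }

    public static void Main()
    {
        Node tree = new AddNode(new NumberNode(1), new AddNode(new NumberNode(2), new NumberNode(3)));
        Console.WriteLine(ToPostfix(tree));
    }
}
```

The call to node.Accept(visitor) picks the code path by node type, and the call to VisitNumber or VisitAdd inside Accept picks it by visitor type. Neither call alone depends on both types.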

Dynamic dispatch to the rescue

As of C# 4, we are no longer restricted to single dispatch: we now have dynamic dispatch. Let’s see what this example looks like using dynamic:

public static class ExpressionExtensions
{
    public static string ToPostfixString(this Expression<Func<double, double>> function)
    {
        StringBuilder resultAccumulator = new StringBuilder();

        Visit(function, resultAccumulator);

        return resultAccumulator.ToString();
    }

    private static void Visit(Expression expression, StringBuilder resultAccumulator)
    {
        dynamic dynamicExpression = expression;

        VisitCore(dynamicExpression, resultAccumulator);
    }

    private static void VisitCore(LambdaExpression node, StringBuilder resultAccumulator)
    {
        foreach (var parameter in node.Parameters)
        {
            Visit(parameter, resultAccumulator);
        }

        resultAccumulator.Append("-> ");

        Visit(node.Body, resultAccumulator);
    }

    private static void VisitCore(ParameterExpression node, StringBuilder resultAccumulator)
    {
        resultAccumulator.Append(node.Name);
        resultAccumulator.Append(' ');
    }

    private static void VisitCore(BinaryExpression node, StringBuilder resultAccumulator)
    {
        Visit(node.Left, resultAccumulator);
        Visit(node.Right, resultAccumulator);

        switch (node.NodeType)
        {
            case ExpressionType.Add:
            case ExpressionType.AddChecked:
                resultAccumulator.Append('+');
                break;
            case ExpressionType.Multiply:
            case ExpressionType.MultiplyChecked:
                resultAccumulator.Append('*');
                break;
            case ExpressionType.Subtract:
            case ExpressionType.SubtractChecked:
                resultAccumulator.Append('-');
                break;
            case ExpressionType.Divide:
                resultAccumulator.Append('/');
                break;
            case ExpressionType.Modulo:
                resultAccumulator.Append('%');
                break;
            default:
                throw new NotSupportedException();
        }

        resultAccumulator.Append(' ');
    }

    private static void VisitCore(MethodCallExpression node, StringBuilder resultAccumulator)
    {
        foreach (var arg in node.Arguments)
        {
            Visit(arg, resultAccumulator);
        }

        resultAccumulator.Append(node.Method.Name);
        resultAccumulator.Append(' ');
    }

    private static void VisitCore(ConstantExpression node, StringBuilder resultAccumulator)
    {
        resultAccumulator.Append(node.Value);
        resultAccumulator.Append(' ');
    }

    private static void VisitCore(Expression node, StringBuilder resultAccumulator)
    {
        throw new NotSupportedException();
    }
}

 

The dynamic dispatch is achieved by the Visit method. To learn more about how this works, see http://blogs.msdn.com/b/samng/archive/2008/11/06/dynamic-in-c-iii-a-slight-twist.aspx.

So how is this better?

First of all, this approach works with all classes. The Expression class does have visitor support baked in, but most classes don’t. The dynamic approach also works if the target classes don’t support the visitor pattern. That also means that you don’t have to do anything special with your own classes to enable this technique.

The dynamic approach is also simpler. I’ve noticed that most people don’t immediately “see” the visitor pattern, but the dynamic approach is easier to understand.

Visitor implementations typically have methods that return void, and producing a result must be accomplished via fields and properties (the ExpressionVisitor is a notable exception here, it is optimized for rewriting expressions, i.e. calculating a new expression based on an existing one). With dynamic methods, you choose your parameter and return types (the StringBuilder in this example). Not only is that much simpler to code, the entire thing has no state in fields, only in stack local variables. As a result, it’s completely reentrant and thread-safe.

Note also that the last VisitCore method specifies what to do with expressions that aren’t handled by any of the other methods. Much more convenient than with a Visitor, where you always have to specify a method for each concrete type, unless it just so happens that the behavior you want is the default behavior from the base class.
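The points above can be sketched on classes that know nothing about visitors. The Shape, Circle, Square and Triangle types and the ShapeDescriber class are hypothetical names made up for this illustration. Note that the algorithm returns its result directly, keeps no state in fields, and has a catch-all overload for types it does not handle.

```csharp
using System;

// Hypothetical classes with no Accept method and no visitor support at all.
public abstract class Shape { }
public sealed class Circle : Shape { public double Radius; }
public sealed class Square : Shape { public double Side; }
public sealed class Triangle : Shape { }

public static class ShapeDescriber
{
    public static string Describe(Shape shape)
    {
        if (shape == null)
        {
            throw new ArgumentNullException("shape");
        }

        // Casting to dynamic defers overload resolution to run time,
        // so the overload is chosen based on the runtime type of the shape.
        return DescribeCore((dynamic)shape);
    }

    private static string DescribeCore(Circle circle)
    {
        return "circle with radius " + circle.Radius;
    }

    private static string DescribeCore(Square square)
    {
        return "square with side " + square.Side;
    }

    // Catch-all for every shape that has no more specific overload.
    private static string DescribeCore(Shape shape)
    {
        return "unsupported shape";
    }

    public static void Main()
    {
        Shape[] shapes = { new Circle { Radius = 2 }, new Square { Side = 3 }, new Triangle() };

        foreach (Shape shape in shapes)
        {
            Console.WriteLine(Describe(shape));
        }
    }
}
```

The null check matters in this sketch: a null argument carries no runtime type, so the runtime binder cannot choose between the overloads.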

Conclusion

The dynamic keyword was introduced to facilitate interoperability with dynamic languages and systems, including COM. C# remains primarily a strongly typed language. As such, some people suggested that dynamic has no place in plain C# programs that don’t require such interoperability. However, the above example shows that dynamic dispatch can be very useful in the context of strongly typed C# programs, as an alternative to the Visitor pattern.

Lambda Curry in F#

Bart De Smet commented on my post about Lambda Curry in C#, saying (amongst other things) that F# supports currying out of the box.

That’s true, and it’s a nice feature of the language. However, it is a mechanical operation, almost identical to what the following C# extension method does:

public static class FunctionalExtensions
{
    public static Func<T2, TResult> Curry<T1, T2, TResult>(this Func<T1, T2, TResult> func, T1 value)
    {
        return value2 => func(value, value2);
    }
}

The important point to note is that F# does not perform partial evaluation automatically, which is where in my mind most of the benefit comes from.
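The same is true for the mechanical C# Curry method, and a call counter makes that visible. The sketch below repeats the extension method under a different class name (CurrySketchExtensions) so it is self-contained; CurryDemo, CountedSin and CountSinCalls are hypothetical helpers added only for this illustration.

```csharp
using System;

public static class CurrySketchExtensions
{
    // Same shape as the Curry method above: it only fixes the first argument.
    public static Func<T2, TResult> Curry<T1, T2, TResult>(this Func<T1, T2, TResult> func, T1 value)
    {
        return value2 => func(value, value2);
    }
}

public static class CurryDemo
{
    private static int sinCalls;

    // Wraps Math.Sin so that we can count how often it is evaluated.
    private static double CountedSin(double a)
    {
        sinCalls++;
        return Math.Sin(a);
    }

    // Curries f once, invokes the result the given number of times,
    // and returns how many times the sine was evaluated in total.
    public static int CountSinCalls(int invocations)
    {
        sinCalls = 0;

        Func<double, double, double> f = (x, y) => CountedSin(x) * CountedSin(y);
        Func<double, double> fx = f.Curry(0.5);

        for (int i = 0; i < invocations; i++)
        {
            fx(i);
        }

        return sinCalls;
    }

    public static void Main()
    {
        // Prints 6: the sine of the fixed argument is recomputed on each of the 3 calls.
        Console.WriteLine(CountSinCalls(3));
    }
}
```

Currying alone saves nothing here: the sine of the fixed first argument is still evaluated on every call of the curried function.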

To illustrate, consider the following function definition in F#:

open System

let compute x y = Math.Sin(float x) * Math.Sin(float y)

This is exactly the same as the following, illustrating the automatic currying:

let compute x = fun y -> Math.Sin(float x) * Math.Sin(float y)

And when I say “exactly the same”, I do mean just that: they compile to the exact same IL.

If you want the partial evaluation, and the performance benefit of it, you’ll have to do it manually, also in F#:

let compute' x =
    let sinx = Math.Sin(float x)
    fun y -> sinx * Math.Sin(float y)

To illustrate, consider the following program, which is more or less analogous to my previous example:

open System
open System.Diagnostics

let compute x y = Math.Sin(float x) * Math.Sin(float y)

let compute' x =
    let sinx = Math.Sin(float x)
    fun y -> sinx * Math.Sin(float y)

let sum f =
    let mutable sum = 0.0
    for x = -1000 to 1000 do
        let f' = f x
        for y = -1000 to 1000 do
            sum <- sum + f' y
    sum

let measureTime f =
    let sw = Stopwatch.StartNew()
    let _ = sum f
    sw.ElapsedMilliseconds

printfn "%d" (measureTime compute)
printfn "%d" (measureTime compute')

On the machine I’m testing this on, it prints 329 milliseconds for the compute function, and 137 for the compute' function.

To be honest, this should not come as a surprise. Even if F# wanted to perform a partial evaluation, how could it? It does not know that Math.Sin is a pure function. So it has no choice but to play safe. It does what the developer tells it to do. So if you want partial evaluation, do it yourself, explicitly, no matter what language you’re using.

Lambda Curry

Note: if you’re looking for lamb curry, you came to the wrong place. This post is about C# programming techniques.

Currying a function is a technique named after Haskell Curry, to transform a function with multiple parameters into a series of functions having one parameter each. The technique is important, because it opens the door to an optimization technique called partial evaluation. Let’s look at an example.

Let’s say you need to write a program that sums a two-dimensional function f(x,y) over a two-dimensional range, e.g. –1000 ≤ x ≤ 1000 and –1000 ≤ y ≤ 1000.

Such a two-dimensional function, assuming double parameters and result, can be represented by Func<double, double, double>, and we can sum it using the following method:

        private static double Sum(Func<double, double, double> f)
        {
            double sum = 0;
            for (int x = -1000; x <= 1000; x++)
            {
                for (int y = -1000; y <= 1000; y++)
                {
                    sum += f(x, y);
                }
            }

            return sum;
        }
 

We can apply this to an arbitrary function, for example:

            Func<double, double, double> f = (x, y) => Math.Sin(x) * Math.Sin(y);

            double result = Sum(f);
 

Currying this is now a simple textual transformation. Instead of defining f as Func<double, double, double>, we define it as Func<double, Func<double, double>>.

    using System;

    internal static class Curried
    {
        public static void Main()
        {
            Func<double, Func<double, double>> f = x => y => Math.Sin(x) * Math.Sin(y);

            double result = Sum(f);
        }

        private static double Sum(Func<double, Func<double, double>> f)
        {
            double sum = 0;
            for (int x = -1000; x <= 1000; x++)
            {
                for (int y = -1000; y <= 1000; y++)
                {
                    sum += f(x)(y);
                }
            }

            return sum;
        }
    }
 

Effectively, a function that took two parameters and returned a result is replaced by a function that takes one parameter and returns a function that takes the second parameter and returns the result. It looks a lot simpler than it sounds. Instead of writing f = (x, y) => Math.Sin(x) * Math.Sin(y), we write f = x => y => Math.Sin(x) * Math.Sin(y). And when calling it, instead of writing f(x, y), we write f(x)(y). Simple.

Unfortunately, every call to f(x) now allocates a new Func<double, double> object, and that can become quite expensive. But that can be fixed easily, so here is a smarter solution:

    using System;

    internal static class Smarter
    {
        public static void Main()
        {
            Func<double, Func<double, double>> f = x => y => Math.Sin(x) * Math.Sin(y);

            double result = Sum(f);
        }

        private static double Sum(Func<double, Func<double, double>> f)
        {
            double sum = 0;
            for (int x = -1000; x <= 1000; x++)
            {
                var fx = f(x);
                for (int y = -1000; y <= 1000; y++)
                {
                    sum += fx(y);
                }
            }

            return sum;
        }
    }
 

I ran a little benchmark on this code. The benchmark executes the main function 20 times, and measures the shortest execution time. It also measures the number of generation 0 garbage collections. This is the result:

Naive: 261 msec, 0 collections
Curried: 340 msec, 733 collections
Smarter: 254 msec, 0 collections
 

As we can see, the curried version was initially slower due to all the memory allocations, but when we fixed that, the smarter version was as fast as the original. In fact it was just a little bit faster, though nothing to get excited about.
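For readers who want to reproduce this kind of measurement, here is a minimal sketch of a harness along the lines described above. It is not the code that produced the numbers in this post: the BenchmarkSketch class and its Measure method are hypothetical, and reporting the generation 0 collections of the fastest run (rather than a total over all runs) is an assumption.

```csharp
using System;
using System.Diagnostics;

public static class BenchmarkSketch
{
    // Runs the action 'runs' times and reports the shortest elapsed time,
    // plus the generation 0 collections observed during that fastest run.
    // The name and the reporting policy are assumptions made for this sketch.
    public static void Measure(Action action, int runs, out long bestMilliseconds, out int gen0Collections)
    {
        if (action == null) throw new ArgumentNullException("action");
        if (runs < 1) throw new ArgumentOutOfRangeException("runs");

        bestMilliseconds = long.MaxValue;
        gen0Collections = 0;

        for (int i = 0; i < runs; i++)
        {
            // Collect before each run so that garbage left by one run
            // is less likely to be charged to the next one.
            GC.Collect();
            GC.WaitForPendingFinalizers();

            int before = GC.CollectionCount(0);
            Stopwatch sw = Stopwatch.StartNew();
            action();
            sw.Stop();
            int after = GC.CollectionCount(0);

            if (sw.ElapsedMilliseconds < bestMilliseconds)
            {
                bestMilliseconds = sw.ElapsedMilliseconds;
                gen0Collections = after - before;
            }
        }
    }

    public static void Main()
    {
        Func<double, Func<double, double>> f = x => y => Math.Sin(x) * Math.Sin(y);

        long ms;
        int collections;

        Measure(() =>
        {
            double sum = 0;
            for (int x = -1000; x <= 1000; x++)
            {
                for (int y = -1000; y <= 1000; y++)
                {
                    sum += f(x)(y); // allocates a new delegate on every call to f(x)
                }
            }
        }, 20, out ms, out collections);

        Console.WriteLine("Curried: {0} msec, {1} collections", ms, collections);
    }
}
```

Taking the shortest of several runs filters out most noise from JIT compilation and other processes. The absolute numbers depend on the machine and the runtime.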

However, this is where partial evaluation kicks in. Currently, we are calculating the sine of x about four million times (once for each of the 2001 * 2001 combinations of x and y), not taking advantage of the fact that we could reuse each calculated value 2001 times! So let’s change our definition of f as follows:

            Func<double, Func<double, double>> f = x => { var sinx = Math.Sin(x); return y => sinx * Math.Sin(y); };
 

Now the benchmark shows a completely different result:

Optimized: 143 msec, 0 collections
 

We went from 261 milliseconds in the original version to 143 milliseconds in this version, in fact almost dividing execution time by two! That’s because, to be precise, in the original version we had two times 2001 * 2001 = 8,008,002 Math.Sin calls, and in the optimized version we have 1 time 2001 plus 1 time 2001 * 2001 = 4,006,002 Math.Sin calls. That is a division by a factor of 1.999, yielding a total execution time reduction by a factor of 1.825 (there is some overhead of course).
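The call counts above are easy to check by routing every sine evaluation through a counter. This is only a sketch: the SinCallCounter class and its members are hypothetical names, and the Sum method is the "smarter" version from above, repeated so the example is self-contained.

```csharp
using System;

public static class SinCallCounter
{
    private static long calls;

    // Wraps Math.Sin so that every evaluation is counted.
    private static double Sin(double a)
    {
        calls++;
        return Math.Sin(a);
    }

    // The curried function without partial evaluation.
    public static readonly Func<double, Func<double, double>> Plain =
        x => y => Sin(x) * Sin(y);

    // The curried function with Sin(x) evaluated once per value of x.
    public static readonly Func<double, Func<double, double>> Optimized =
        x => { var sinx = Sin(x); return y => sinx * Sin(y); };

    private static double Sum(Func<double, Func<double, double>> f)
    {
        double sum = 0;
        for (int x = -1000; x <= 1000; x++)
        {
            var fx = f(x);
            for (int y = -1000; y <= 1000; y++)
            {
                sum += fx(y);
            }
        }

        return sum;
    }

    // Sums f over the full range and returns the number of sine evaluations.
    public static long CountCalls(Func<double, Func<double, double>> f)
    {
        calls = 0;
        Sum(f);
        return calls;
    }

    public static void Main()
    {
        Console.WriteLine(CountCalls(Plain));     // prints 8008002
        Console.WriteLine(CountCalls(Optimized)); // prints 4006002
    }
}
```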

Of course, the technique is very much related to loop invariant code motion in imperative programming. For example, imagine a Sum function hardcoded for Sin(x) * Sin(y). Would you write it like this?

        private static double SumSinSin()
        {
            double sum = 0;
            for (int x = -1000; x <= 1000; x++)
            {
                for (int y = -1000; y <= 1000; y++)
                {
                    sum += Math.Sin(x) * Math.Sin(y);
                }
            }

            return sum;
        }

Of course not! At least you would move the calculation of Sin(x) out of the loop over y:

        private static double SumSinSin()
        {
            double sum = 0;
            for (int x = -1000; x <= 1000; x++)
            {
                var sinx = Math.Sin(x);
                for (int y = -1000; y <= 1000; y++)
                {
                    sum += sinx * Math.Sin(y);
                }
            }

            return sum;
        }
 

And that is exactly what we did, except of course that the sum function is parameterized and not hardcoded.

So when would you apply this technique? You would apply it when performance matters, and you have a function that you need to call a lot, that takes more than one parameter, where one parameter varies more than another one (in our example, x remained the same for a long time, while y was different on every call), and part of the function can be evaluated knowing only the value of the parameter that varies the least.

In our example above, we could go even further. For example, we could eliminate the multiplication and even the Sin(y) calculation completely in case Sin(x) is 0 (which would be the case in our example only for x == 0).

            Func<double, Func<double, double>> f = x => 
            {
                if (x != 0.0)
                {
                    var sinx = Math.Sin(x);

                    return y => sinx * Math.Sin(y);
                }
                else
                {
                    return y => 0.0;
                }
            };
 

That is not worth it in this scenario (because the special case applies to less than 0.05 % of all cases), but in some scenarios runtime algorithm specialization can be very significant.

Static Reflection in .NET, part 2

A few weeks ago, I talked about static reflection and its advantages. You’ll remember that the main advantages, compared to the normal reflection APIs, are the compile time checking of parameters and IntelliSense support.

How does it compare at other levels, performance for example? Before we dive into that question, let me state that performance may or may not be important to you. A program that is fast enough is, well, fast enough. It’s unlikely that a (single) reflection call will have a significant impact on, say, the response time of a graphical user interface, and so performance doesn’t matter. If your algorithm requires millions of reflection operations, I’m sure you can rewrite it somehow to reduce that number significantly, and then performance again probably doesn’t matter anymore. That being said, we still want to know, right?

First of all, let’s compare code.

Take this line (using the Example class from the last post):

PropertyInfo pi = typeof(Example).GetProperty("Description");

This line compiles to the following IL (simplified for readability):

ldtoken Example 
call class Type Type::GetTypeFromHandle(valuetype RuntimeTypeHandle) 
ldstr "Description" 
call instance class PropertyInfo Type::GetProperty(string)

Compare that to the following line:

PropertyInfo pi = StaticReflector.Create<Example>().PropertyInfo(e => e.Description);

Which compiles to:

call class IStaticReflector`1<!!0> StaticReflector::Create<class Example>()
ldtoken Example
call class Type Type::GetTypeFromHandle(valuetype RuntimeTypeHandle)
ldstr "e"
call class ParameterExpression Expression::Parameter(class Type, string)
stloc.0 
ldloc.0 
ldtoken instance string Example::get_Description()
call class MethodBase MethodBase::GetMethodFromHandle(valuetype RuntimeMethodHandle)
castclass MethodInfo
call class MemberExpression Expression::Property(class Expression, class MethodInfo)
ldc.i4.1 
newarr ParameterExpression
stloc.1 
ldloc.1 
ldc.i4.0 
ldloc.0 
stelem.ref 
ldloc.1 
call class Expression`1<!!0> Expression::Lambda<class System.Func`2<class Example, string>>(class Expression, class ParameterExpression[])
call class PropertyInfo StaticReflectorExtensions::PropertyInfo<class Example, string>(class IStaticReflector`1<!!0>, class Expression`1<class System.Func`2<!!0, !!1>>)

As you can see, this code doesn’t load the “Description” string, it uses the ldtoken instruction instead. Some bloggers have suggested that this would make it more efficient. Unfortunately, even if the ldtoken instruction is efficient, it is largely offset by the construction of the lambda expression. I ran a little benchmark, in which I compare execution time (in ticks) and memory usage (in generation 0 garbage collection runs) of both approaches, executing each one a million times. This is the result (on my laptop):

Using Reflection       Time:    1089308 Collections:    45
Using StaticReflection Time:   13513777 Collections:   264

As you can see, the Static Reflection approach is about 12.4 times slower than the good old dynamic reflection, and it uses a lot more memory. That should be no surprise either: both cases allocate a PropertyInfo object, but the static case also allocates the expression, which is nothing but food for the garbage collector.

So, one approach seems good at compile time, and the other is good at run time. It seems we’re stuck between a rock and a hard place. But the situation isn’t so bad: we have two options to choose from, each with their pros and cons. What the best one is depends on your requirements, and what you value the most: compile time checking (which may result in productivity and maintainability benefits), or performance.

And who knows, maybe there is a third option, giving the best of both worlds. But that’s for next time.

String.Trim() fixed in .NET 4.0

A long time ago, I wrote a blog post about the problems with String.Trim(). I’m happy to see that all three issues have been addressed in the .NET Framework 4.0.

To start with, Trim() will now be consistent with Char.IsWhiteSpace(). Theoretically, this is a breaking change, but I don’t expect many programs to have a problem with this change. Note that the change is very well documented in the online help.

Secondly, the code of Trim() has been cleaned up considerably. A string that consists entirely of whitespace is no longer scanned twice. I haven’t done any benchmarks, but I expect the performance to be at least as good as for the same function in .NET 2.0 – 3.5.

Last but not least, the frequent abuse of the Trim() function to simply validate strings will greatly decrease with the introduction of the static String.IsNullOrWhiteSpace(string value) function, which is much faster than calling Trim().
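As a small illustration of the difference, the sketch below puts the old Trim()-based validation idiom next to the new method. The BlankCheck class and its two helper methods are hypothetical names used only for this comparison.

```csharp
using System;

public static class BlankCheck
{
    // The older idiom: Trim() may have to build a new string
    // only so that we can look at its length.
    public static bool IsBlankUsingTrim(string value)
    {
        return value == null || value.Trim().Length == 0;
    }

    // The .NET 4.0 way: a scan that stops at the first non-whitespace character.
    public static bool IsBlank(string value)
    {
        return string.IsNullOrWhiteSpace(value);
    }

    public static void Main()
    {
        string[] samples = { null, "", "  \t\r\n", " x " };

        foreach (string sample in samples)
        {
            Console.WriteLine("{0} {1}", IsBlankUsingTrim(sample), IsBlank(sample));
        }
    }
}
```

Both methods agree on these samples: the first three count as blank, the last one does not.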

It’s a small detail, compared to all the other goodies .NET 4.0 brings, but a good addition to the toolbox nonetheless.

Static Reflection in .NET

LINQ expressions have proven to be extremely versatile, popping up in all sorts of areas. “Static Reflection” seems to be the latest hype. But what is static reflection anyway, and why is it good or why is it bad?

Reflection is used to obtain information about the code you are executing, and to use that information to interact with the code dynamically. Sometimes reflection is used to interact dynamically with code that is statically known by a program already. For example, data binding heavily relies on reflection to dynamically read and write properties. The calling program knows about those properties statically, but the data binding libraries do not. In data binding, object properties are often identified by their name, expressed as a string. That string is then used by the libraries to construct a PropertyInfo object.

Time for an example. Given this class:

public class Example
{
    public string Description { get; set; }
}


You can obtain a PropertyInfo object describing the Description property as follows:

PropertyInfo pi = typeof(Example).GetProperty("Description");


We may have an issue here. If I make a typing mistake in the GetProperty call, I don’t get a compiler error. At runtime, the call will return null, probably leading to a NullReferenceException down the road. And of course, Visual Studio IntelliSense will not help me to type it right. Also, if I rename the property, for example to “Summary”, the GetProperty call will be broken, without a compile-time error. Static Reflection is one technique to avoid these issues.
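The silent failure is easy to demonstrate. In the sketch below, the Example class is repeated so the snippet is self-contained, and StringLookupPitfall with its PropertyExists helper is a hypothetical name added for illustration.

```csharp
using System;
using System.Reflection;

public class Example
{
    public string Description { get; set; }
}

public static class StringLookupPitfall
{
    // Returns true if Example has a public property with the given name.
    public static bool PropertyExists(string name)
    {
        PropertyInfo pi = typeof(Example).GetProperty(name);

        // GetProperty does not throw for an unknown name; it returns null.
        return pi != null;
    }

    public static void Main()
    {
        Console.WriteLine(PropertyExists("Description")); // True
        Console.WriteLine(PropertyExists("Desciption"));  // False: the typo compiles just fine
    }
}
```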

Using LINQ expressions, we could create an API that allows us to do something like the following:

PropertyInfo pi = StaticReflector.GetProperty((Example e) => e.Description);


The downside of this approach is that it doesn’t work with anonymous types. So I propose a different mechanism. What we need is something that statically gives us access to a type. Any generic interface will do. I propose the following:

public interface IStaticReflector<T>
{
}


Given this interface, we can define a series of extension methods, for example:

public static class StaticReflectorExtensions
{
    public static PropertyInfo PropertyInfo<T, U>(this IStaticReflector<T> obj, Expression<Func<T, U>> selector)
    {
        var body = selector.Body as MemberExpression;
        return body.Member as PropertyInfo;
    }
}


Notice how the obj parameter is not really used in the PropertyInfo method. It does serve a purpose however: it allows us to use type inference on the type T, and I get full IntelliSense. For example:

IStaticReflector<Example> reflector = null;
PropertyInfo pi = reflector.PropertyInfo(e => e.Description);


Granted, initializing a variable to null and then calling a method on it is a bit weird. We need a more elegant way to create these things:

public static class StaticReflector
{
    public static IStaticReflector<T> Create<T>()
    {
        return null;
    }
}


Now we can write:

PropertyInfo pi = StaticReflector.Create<Example>().PropertyInfo(e => e.Description);


This still doesn’t work on anonymous types though. For those, we could use the following:

public static class ObjectExtensions
{
    public static IStaticReflector<T> GetReflector<T>(this T obj)
    {
        return null;
    }
}


Now we can write things such as:

var anonymous = new { Description = "Example" };

PropertyInfo pi = anonymous.GetReflector().PropertyInfo(e => e.Description);


I do prefer the StaticReflector.Create<T>() method in case the type name is known though.

Are we done? Not really. Let’s go back to dynamic reflection using string names. Lots of things could go wrong there, and we don’t get any warnings. The situation has not gotten worse, but still lots of things can go wrong. So the PropertyInfo method needs some parameter validation. Also, properties certainly aren’t the only thing we can reflect upon. What about fields, methods and constructors? Here’s a full implementation:

using System;
using System.Linq.Expressions;
using System.Reflection;

public static class StaticReflectorExtensions
{
    public static PropertyInfo PropertyInfo<T, U>(this IStaticReflector<T> obj, Expression<Func<T, U>> selector)
    {
        if (selector == null)
        {
            throw new ArgumentNullException(Strings.Selector);
        }

        PropertyInfo pi = obj.MemberInfo(selector) as PropertyInfo;

        if (pi == null)
        {
            throw new ArgumentException(Strings.InvalidPropertySelector, Strings.Selector);
        }

        return pi;
    }

    public static FieldInfo FieldInfo<T, U>(this IStaticReflector<T> obj, Expression<Func<T, U>> selector)
    {
        if (selector == null)
        {
            throw new ArgumentNullException(Strings.Selector);
        }

        FieldInfo fi = obj.MemberInfo(selector) as FieldInfo;

        if (fi == null)
        {
            throw new ArgumentException(Strings.InvalidFieldSelector, Strings.Selector);
        }

        return fi;
    }

    public static MemberInfo MemberInfo<T, U>(this IStaticReflector<T> obj, Expression<Func<T, U>> selector)
    {
        if (selector == null)
        {
            throw new ArgumentNullException(Strings.Selector);
        }

        var body = selector.Body as MemberExpression;

        if (body == null)
        {
            throw new ArgumentException(Strings.InvalidMemberSelector, Strings.Selector);
        }

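        // static members have no instance expression, so check for null before reading NodeType
        if (body.Expression == null || body.Expression.NodeType != ExpressionType.Parameter)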
        {
            throw new ArgumentException(Strings.InvalidMemberSelector, Strings.Selector);
        }

        return body.Member;
    }

    public static MethodInfo MethodInfo<T, U>(this IStaticReflector<T> obj, Expression<Func<T, U>> selector)
    {
        if (selector == null)
        {
            throw new ArgumentNullException(Strings.Selector);
        }

        var body = selector.Body as MethodCallExpression;

        if (body == null)
        {
            throw new ArgumentException(Strings.InvalidMethodSelector, Strings.Selector);
        }

        // instance methods must be called on the parameter
        if (body.Object != null && body.Object.NodeType != ExpressionType.Parameter)
        {
            throw new ArgumentException(Strings.InvalidMethodSelector, Strings.Selector);
        }

        // static methods must be defined in the type of the parameter or a base type
        if (body.Object == null && !body.Method.DeclaringType.IsAssignableFrom(typeof(T)))
        {
            throw new ArgumentException(Strings.InvalidMethodSelector, Strings.Selector);
        }

        return body.Method;
    }

    public static MethodInfo MethodInfo<T>(this IStaticReflector<T> obj, Expression<Action<T>> selector)
    {
        if (selector == null)
        {
            throw new ArgumentNullException(Strings.Selector);
        }

        var body = selector.Body as MethodCallExpression;

        if (body == null)
        {
            throw new ArgumentException(Strings.InvalidMethodSelector, Strings.Selector);
        }

        // instance methods must be called on the parameter
        if (body.Object != null && body.Object.NodeType != ExpressionType.Parameter)
        {
            throw new ArgumentException(Strings.InvalidMethodSelector, Strings.Selector);
        }

        // static methods must be defined in the type of the parameter or a base type
        if (body.Object == null && !body.Method.DeclaringType.IsAssignableFrom(typeof(T)))
        {
            throw new ArgumentException(Strings.InvalidMethodSelector, Strings.Selector);
        }

        return body.Method;
    }

    public static ConstructorInfo ConstructorInfo<T>(this IStaticReflector<T> obj, Expression<Func<T>> selector)
    {
        if (selector == null)
        {
            throw new ArgumentNullException(Strings.Selector);
        }

        var body = selector.Body as NewExpression;

        if (body == null)
        {
            throw new ArgumentException(Strings.InvalidConstructorSelector, Strings.Selector);
        }

        return body.Constructor;
    }

    private static class Strings
    {
        internal const string InvalidFieldSelector = "Invalid field selector";
        internal const string InvalidPropertySelector = "Invalid property selector";
        internal const string InvalidMemberSelector = "Invalid member selector";
        internal const string InvalidMethodSelector = "Invalid method selector";
        internal const string InvalidConstructorSelector = "Invalid constructor selector";
        internal const string Selector = "selector";
    }
}
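
To see how the field, method and constructor selectors read at the call site, here is a minimal, self-contained sketch. It is not the listing above: the Example class is hypothetical, IStaticReflector<T> is assumed to be an empty marker interface, and StaticReflectorSketch is a condensed stand-in for StaticReflectorExtensions with most of the validation trimmed to keep it short.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

// Assumption: an empty marker interface, used only to carry the type T.
public interface IStaticReflector<T> { }

public static class StaticReflector
{
    public static IStaticReflector<T> Create<T>() { return null; }
}

// Hypothetical sample type, used only for this illustration.
public class Example
{
    public int Count;
    public string Description { get; set; }
    public Example(string description) { Description = description; }
    public void Reset() { Count = 0; }
    public string Describe(int indent) { return new string(' ', indent) + Description; }
}

// Condensed stand-ins for the extension methods in the full listing.
public static class StaticReflectorSketch
{
    public static MemberInfo MemberInfo<T, U>(this IStaticReflector<T> obj, Expression<Func<T, U>> selector)
    {
        var body = selector.Body as MemberExpression;
        if (body == null) throw new ArgumentException("Invalid member selector", "selector");
        return body.Member;
    }

    // Selects methods that return a value.
    public static MethodInfo MethodInfo<T, U>(this IStaticReflector<T> obj, Expression<Func<T, U>> selector)
    {
        var body = selector.Body as MethodCallExpression;
        if (body == null) throw new ArgumentException("Invalid method selector", "selector");
        return body.Method;
    }

    // Selects methods that return void.
    public static MethodInfo MethodInfo<T>(this IStaticReflector<T> obj, Expression<Action<T>> selector)
    {
        var body = selector.Body as MethodCallExpression;
        if (body == null) throw new ArgumentException("Invalid method selector", "selector");
        return body.Method;
    }

    public static ConstructorInfo ConstructorInfo<T>(this IStaticReflector<T> obj, Expression<Func<T>> selector)
    {
        var body = selector.Body as NewExpression;
        if (body == null) throw new ArgumentException("Invalid constructor selector", "selector");
        return body.Constructor;
    }
}

public static class Program
{
    public static void Main()
    {
        var reflector = StaticReflector.Create<Example>();

        // The arguments (0, null) are never evaluated; they only pick the overload.
        MemberInfo field = reflector.MemberInfo(e => e.Count);
        MethodInfo describe = reflector.MethodInfo(e => e.Describe(0));
        MethodInfo reset = reflector.MethodInfo(e => e.Reset());
        ConstructorInfo ctor = reflector.ConstructorInfo(() => new Example(null));

        Console.WriteLine(field is FieldInfo);            // True
        Console.WriteLine(describe.Name);                 // Describe
        Console.WriteLine(reset.Name);                    // Reset
        Console.WriteLine(ctor.GetParameters().Length);   // 1

        try
        {
            // A constant is not a member access, so the selector is rejected.
            reflector.MemberInfo(e => 42);
        }
        catch (ArgumentException ex)
        {
            Console.WriteLine(ex.ParamName);              // selector
        }
    }
}
```

Because the selectors are only ever inspected as expression trees, nothing in them is executed. That is why passing dummy arguments such as 0 or null is harmless, and why calling an extension method on a null reflector does not throw.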


Next time, we’ll talk about the disadvantages of this approach, and we’ll look at an alternative.

.NET 3.5 SP1 Beta available

I have mentioned before that a CLR update is due to be released this summer.

Scott Guthrie just announced that a beta is now available. On the CLR, he says:

.NET 3.5 SP1 includes significant performance improvements to the CLR that enable much faster application startup times - in particular with "cold start" scenarios (where no .NET application is already running).  Much of these gains were achieved by changing the layout of blocks within CLR NGEN images, and by significantly optimizing disk IO access patterns.  We also made some nice optimizations to our JIT code generator that allow much better inlining of methods that utilize structs.

We are today measuring up to 40% faster application startup improvements for large .NET client applications with SP1 installed.  These optimizations also have the nice side-effect of improving ASP.NET application request per second throughput by up to 10% in some cases.

It's not just an update to the CLR though; it's a significant service pack for both the .NET Framework and Visual Studio 2008. Another novelty I'm definitely going to take a look at is LINQ to Entities:

.NET 3.5 SP1 includes the new ADO.NET Entity Framework, which allows developers to define a higher-level Entity Data Model over their relational data, and then program in terms of this model.  Concepts like inheritance, complex types and relationships (including M:M support) can be modeled using it.

The ADO.NET Entity Framework and the VS 2008 Entity Framework Designer both support a pluggable provider model that allows them to be used with any database (including Oracle, DB2, MySql, PostgreSQL, SQLite, VistaDB, Informix, Sybase, and others).

Developers can then use LINQ and LINQ to Entities to query, manipulate, and update these entity objects.

CLR Update this summer

On his blog, Scott Guthrie announced that an update for the .NET CLR will be released this summer:

This summer we are going to ship a servicing update to the CLR that makes some significant internal optimizations in how we optimize our data structures to cut down on disk IO and improve memory layout when loading and running applications. Among many other benefits, this work will significantly improve the working set and cold startup performance of .NET 2.0, 3.0 and 3.5 applications and will dramatically improve end-user experiences with .NET-based client applications.

Depending on the size of the application, we expect .NET applications to realize a cold startup performance improvement of between 25-40%. Applications do not need to change any code, nor be recompiled, in order to take advantage of these improvements so the benefits are automatic.

Free improvements are always good improvements. I do hope this update will include the optimizations on value types the JIT team has been blogging about:

Code generation for value types in .NET 2.0 has several inefficiencies.

1) All value type local variables live entirely on the stack.

2) No assertion propagation optimization is ever performed on value type local variables.

3) Methods with value type arguments, local variables, or return values are never inlined.

[...]

Over the past year or so, the JIT team has been working on significant improvements to value type code generation, as well as the inlining algorithm. In summary, all of the above limitations are being eliminated.
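
To make the quoted limitations a bit more concrete, here is a small sketch of the kind of code they describe: a tiny helper that takes and returns a struct, called in a tight loop. The Vector2 type and the Add method are hypothetical names made up for this illustration, and the timing is only a rough way to compare runtimes on your own machine, not a proper benchmark.

```csharp
using System;
using System.Diagnostics;

// Hypothetical value type, used only for this illustration.
public struct Vector2
{
    public double X;
    public double Y;

    public Vector2(double x, double y)
    {
        X = x;
        Y = y;
    }
}

public static class Program
{
    // Value type arguments and a value type return value: according to the
    // quote above, the .NET 2.0 JIT never inlines a method like this one.
    public static Vector2 Add(Vector2 a, Vector2 b)
    {
        return new Vector2(a.X + b.X, a.Y + b.Y);
    }

    public static void Main()
    {
        var sum = new Vector2(0, 0);
        var step = new Vector2(1, 2);

        var watch = Stopwatch.StartNew();
        for (int i = 0; i < 100000000; i++)
        {
            sum = Add(sum, step);
        }
        watch.Stop();

        Console.WriteLine("Sum = ({0}, {1})", sum.X, sum.Y);
        Console.WriteLine("Elapsed: {0} ms", watch.ElapsedMilliseconds);
    }
}
```

Running the same build before and after installing the update should give a rough idea of whether the improved code generation applies to code like this.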