- Posted by Kris Vandermotten on December 24, 2007
In C#, the following function compiles:
static double? Add(double? x, double? y)
{
return x + y;
}
It compiles even though Nullable<T> itself defines no addition operator, for double? or for any other T. The C# compiler lifts the operator and translates the above to:
static double? Add(double? x, double? y)
{
return x.HasValue && y.HasValue ? new double?(x.GetValueOrDefault() + y.GetValueOrDefault()) : null;
}
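A minimal sketch of what that translation means in practice: a missing operand makes the whole result missing. The class name LiftedAddition is made up for this illustration.

```csharp
using System;

public static class LiftedAddition
{
    // Same function as above: the compiler lifts + from double to double?.
    public static double? Add(double? x, double? y)
    {
        return x + y;
    }

    public static void Main()
    {
        Console.WriteLine(Add(1.5, 2.5).HasValue);  // True: both operands have a value
        Console.WriteLine(Add(1.5, null).HasValue); // False: one operand is null
    }
}
```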
But I cannot write the following:
static double? Abs(double? x)
{
return Math.Abs(x); // won't compile
}
With a bit of Linq though, I can write the following:
static double? Abs(double? x)
{
return from d in x select Math.Abs(d);
}
All I need to make this work is the following extension method:
public static T? Select<T>(this T? x, Func<T, T> selector)
where T : struct
{
return x.HasValue ? new T?(selector(x.Value)) : null;
}
I could even add a Where extension method:
public static T? Where<T>(this T? source, Func<T, bool> predicate)
where T : struct
{
return source.HasValue && predicate(source.Value) ? source : null;
}
Which would allow weird (or cool?) stuff such as:
static double? Asin(double? x)
{
return from d in x where d >= -1 && d <= 1 select Math.Asin(d);
}
Toss in a SelectMany:
public static V? SelectMany<T, U, V>(this T? source, Func<T, U?> k, Func<T, U, V> resultSelector)
where T : struct
where U : struct
where V : struct
{
if (k == null)
{
throw new ArgumentNullException("k");
}
if (resultSelector == null)
{
throw new ArgumentNullException("resultSelector");
}
if (!source.HasValue)
{
return null;
}
U? inner = k(source.Value); // evaluate k only once
return inner.HasValue ? new V?(resultSelector(source.Value, inner.Value)) : null;
}
And now I can write:
static double? SinCos(double? x, double? y)
{
return from xd in x
from yd in y
select Math.Sin(xd) * Math.Cos(yd);
}
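Query expressions are only syntax: the compiler rewrites them into ordinary method calls and then looks for matching Select, Where and SelectMany methods. The sketch below shows roughly what the Asin and SinCos queries turn into. It repeats the three extension methods so that it compiles on its own; the class names NullableLinq and Desugared are made up for this illustration.

```csharp
using System;

public static class NullableLinq
{
    public static T? Select<T>(this T? source, Func<T, T> selector) where T : struct
    {
        return source.HasValue ? new T?(selector(source.Value)) : null;
    }

    public static T? Where<T>(this T? source, Func<T, bool> predicate) where T : struct
    {
        return source.HasValue && predicate(source.Value) ? source : null;
    }

    public static V? SelectMany<T, U, V>(this T? source, Func<T, U?> k, Func<T, U, V> resultSelector)
        where T : struct where U : struct where V : struct
    {
        if (!source.HasValue) return null;
        U? inner = k(source.Value);
        return inner.HasValue ? new V?(resultSelector(source.Value, inner.Value)) : null;
    }
}

public static class Desugared
{
    // from d in x where d >= -1 && d <= 1 select Math.Asin(d)
    public static double? Asin(double? x)
    {
        return x.Where(d => d >= -1 && d <= 1).Select(d => Math.Asin(d));
    }

    // from xd in x from yd in y select Math.Sin(xd) * Math.Cos(yd)
    public static double? SinCos(double? x, double? y)
    {
        return x.SelectMany(xd => y, (xd, yd) => Math.Sin(xd) * Math.Cos(yd));
    }

    public static void Main()
    {
        Console.WriteLine(Asin(2.0).HasValue);         // False: filtered out by Where
        Console.WriteLine(SinCos(0.0, null).HasValue); // False: y has no value
    }
}
```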
Who said Linq was about object-relational mappers? Or was it functional programming perhaps?
- Posted by Kris Vandermotten on November 16, 2007
Linq expressions are the key ingredient behind the IQueryable<T> interface. One aspect of them, largely underestimated if you ask me, is the fact they can be compiled into executable code at runtime.
Let's look at a concrete example: a generic comparer, which compares objects based on one of their properties. Take a Person class, with a Name property (amongst others), and you want to sort a List<Person> by Name. Sure, if you statically know, at compile time that is, the property on which to sort, you can simply use the Linq orderby operator and pass it the Lambda to extract the property value. But what if you don't know the property statically? What if you need to be able to sort any type on any of its properties?
In that case you need a way to extract the property dynamically. Reflection can do the job, but it's slow. With Reflection.Emit you can generate a DynamicMethod, but Expressions provide an easier alternative.
The function I need will have three parameters:
var x = Expression.Parameter(typeof(TObject), "x");
var y = Expression.Parameter(typeof(TObject), "y");
var c = Expression.Parameter(typeof(IComparer<TProperty>), "c");
Given a PropertyInfo (or just a name) for the property, I can generate an expression for the function and compile it:
compare = Expression.Lambda<Func<TObject, TObject, IComparer<TProperty>, int>>(
    Expression.Call(c, typeof(IComparer<TProperty>).GetMethod("Compare"),
        Expression.Property(x, propertyInfo),
        Expression.Property(y, propertyInfo)),
    x, y, c).Compile();
That's really all there is to it. I can then wrap this function into an IComparer<TObject>, as shown below.
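To try the technique in isolation, here is a minimal sketch. The Person class, the CompareDemo class and its Build helper are made up for this illustration; only the expression calls mirror the technique described above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;
using System.Reflection;

public class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public static class CompareDemo
{
    // Builds (x, y, c) => c.Compare(x.<property>, y.<property>) and compiles it.
    // TProperty must match the declared type of the property.
    public static Func<TObject, TObject, IComparer<TProperty>, int> Build<TObject, TProperty>(string propertyName)
    {
        PropertyInfo propertyInfo = typeof(TObject).GetProperty(propertyName);
        MethodInfo compareMethod = typeof(IComparer<TProperty>).GetMethod("Compare");
        var x = Expression.Parameter(typeof(TObject), "x");
        var y = Expression.Parameter(typeof(TObject), "y");
        var c = Expression.Parameter(typeof(IComparer<TProperty>), "c");
        return Expression.Lambda<Func<TObject, TObject, IComparer<TProperty>, int>>(
            Expression.Call(c, compareMethod,
                Expression.Property(x, propertyInfo),
                Expression.Property(y, propertyInfo)),
            x, y, c).Compile();
    }

    public static void Main()
    {
        var people = new List<Person>
        {
            new Person { Name = "Carol", Age = 41 },
            new Person { Name = "Alice", Age = 35 },
            new Person { Name = "Bob", Age = 29 },
        };
        var byAge = Build<Person, int>("Age");
        people.Sort((a, b) => byAge(a, b, Comparer<int>.Default));
        foreach (var p in people)
            Console.WriteLine(p.Name + " " + p.Age); // Bob 29, Alice 35, Carol 41
    }
}
```

The delegate is built once and reused for every comparison, which is where the speedup over plain reflection is expected to come from.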
Why bother? Sure, generation of the comparer will take its fair share of CPU cycles. But execution of the comparer will be a lot faster. When I need to sort, say, 10 objects, it doesn't matter, that will be done quickly enough either way. But when I need to sort thousands of objects, this technique can offer a significant speedup.
And what alternatives do I have? I could use DynamicMethod, which is what I used to do before .NET 3.5. In fact, that's exactly what the Compile() method on an expression is doing. The generated function would be no different, but the generation code definitely is more difficult and error prone to write. Both options share the fact that I'm doing a delegate call for every comparison. I could get rid of that delegate call by generating a full assembly using Reflection.Emit. But then I need to worry about memory leaks, because I can't unload the assembly.
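For comparison, here is a rough sketch of the DynamicMethod route, reduced to a property getter rather than a full comparer. The Person class and the BuildGetter helper are made up for this illustration. The sketch assumes TObject is a reference type and that the property is declared with exactly the type TProperty, since no conversion is emitted.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public class Person
{
    public string Name { get; set; }
}

public static class DynamicMethodDemo
{
    // Emits the IL equivalent of: static TProperty Get(TObject x) { return x.<property>; }
    public static Func<TObject, TProperty> BuildGetter<TObject, TProperty>(string propertyName)
        where TObject : class
    {
        MethodInfo getter = typeof(TObject).GetProperty(propertyName).GetGetMethod();
        var method = new DynamicMethod(
            "Get" + propertyName, typeof(TProperty), new[] { typeof(TObject) }, typeof(TObject).Module);
        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);          // push x
        il.Emit(OpCodes.Callvirt, getter); // call the property getter on x
        il.Emit(OpCodes.Ret);              // return the getter's result
        return (Func<TObject, TProperty>)method.CreateDelegate(typeof(Func<TObject, TProperty>));
    }

    public static void Main()
    {
        var getName = BuildGetter<Person, string>("Name");
        Console.WriteLine(getName(new Person { Name = "Alice" })); // Alice
    }
}
```

Even for a single property read, the IL version has to get the stack and the call opcode right by hand. The expression version states the same thing as Expression.Property.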
Expressions aren't as powerful as the raw DynamicMethod, but they're easier to use. That's why I prefer them over DynamicMethod whenever I can.
Full source code for the example, including a SortedBindingList<T> that uses PropertyComparer<T>:
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Linq.Expressions;
using System.Reflection;
namespace Vandermotten.Collections.Generic
{
    public class PropertyComparer<TObject>
    {
        public static IComparer<TObject> Create(string propertyName)
        {
            return Create(propertyName, ListSortDirection.Ascending);
        }
        public static IComparer<TObject> Create(string propertyName, ListSortDirection direction)
        {
            return Create(typeof(TObject).GetProperty(propertyName), direction);
        }
        public static IComparer<TObject> Create(PropertyInfo propertyInfo)
        {
            return Create(propertyInfo, ListSortDirection.Ascending);
        }
        public static IComparer<TObject> Create(PropertyInfo propertyInfo, ListSortDirection direction)
        {
            if (propertyInfo == null)
                throw new ArgumentNullException("propertyInfo");
            if (!propertyInfo.CanRead)
                throw new ArgumentException("Cannot read property " + propertyInfo.Name);
            Type cT = typeof(Comp<>).MakeGenericType(typeof(TObject), propertyInfo.PropertyType);
            return (IComparer<TObject>)Activator.CreateInstance(cT, propertyInfo, direction);
        }
        private class Comp<TProperty> : IComparer<TObject>
        {
            private Comparer<TProperty> comparer;
            private Func<TObject, TObject, IComparer<TProperty>, int> compare;
            public Comp(PropertyInfo propertyInfo, ListSortDirection direction)
            {
                comparer = Comparer<TProperty>.Default;
                var method = typeof(IComparer<TProperty>).GetMethod("Compare");
                var x = Expression.Parameter(typeof(TObject), "x");
                var y = Expression.Parameter(typeof(TObject), "y");
                var c = Expression.Parameter(typeof(IComparer<TProperty>), "c");
                // (x, y, c) => c.Compare(x.Property, y.Property), operands swapped for descending order
                if (direction == ListSortDirection.Ascending)
                    compare = Expression.Lambda<Func<TObject, TObject, IComparer<TProperty>, int>>(
                        Expression.Call(c, method,
                            Expression.Property(x, propertyInfo),
                            Expression.Property(y, propertyInfo)),
                        x, y, c).Compile();
                else
                    compare = Expression.Lambda<Func<TObject, TObject, IComparer<TProperty>, int>>(
                        Expression.Call(c, method,
                            Expression.Property(y, propertyInfo),
                            Expression.Property(x, propertyInfo)),
                        x, y, c).Compile();
            }
            public int Compare(TObject x, TObject y)
            {
                return compare(x, y, comparer);
            }
        }
    }
    // Several member bodies of this class were lost from the post; the ones marked
    // "reconstructed" follow the usual BindingList<T> sorting pattern and are assumptions.
    public class SortedBindingList<T> : BindingList<T>
    {
        private bool isSorted; // reconstructed
        private ListSortDirection sortDirection = ListSortDirection.Ascending;
        private PropertyDescriptor sortProperty;
        public SortedBindingList() { }
        public SortedBindingList(List<T> list) : base(list) { }
        protected override void ApplySortCore(PropertyDescriptor prop, ListSortDirection direction)
        {
            List<T> items = Items as List<T>;
            if (items != null)
            {
                sortProperty = prop;
                sortDirection = direction;
                IComparer<T> pc = PropertyComparer<T>.Create(prop.Name, direction);
                items.Sort(pc); // reconstructed
                isSorted = true; // reconstructed
            }
            OnListChanged(new ListChangedEventArgs(ListChangedType.Reset, -1)); // reconstructed
        }
        protected override void RemoveSortCore()
        {
            isSorted = false; // reconstructed
        }
        protected override bool SupportsSortingCore { get { return true; } } // reconstructed
        protected override bool IsSortedCore { get { return isSorted; } } // reconstructed
        protected override ListSortDirection SortDirectionCore
        {
            get { return sortDirection; }
        }
        protected override PropertyDescriptor SortPropertyCore
        {
            get { return sortProperty; }
        }
    }
}
- Posted by Kris Vandermotten on March 14, 2007
I was playing with LINQ To SQL (again) this evening, and I thought: why can't I see the generated SQL statements in my debugger output window?
The DataContext.Log property is a great help when learning to write LINQ To SQL queries. All you need to do is assign a TextWriter to it, and you get to see all the SQL statements generated by the data context. But where do you send them to? Console.Out is an option, but you probably don't want that in a release build. Furthermore, it doesn't quite work for Windows services or ASP.NET applications. That's why I wrote DebuggerWriter, an implementation of TextWriter that writes to the debugger log.
All you need to do to use it is:
MyDataContext db = new MyDataContext();
db.Log = new DebuggerWriter();
Here's the code:
using System;
using System.Diagnostics;
using System.Globalization;
using System.IO;
using System.Text;
namespace Vandermotten.Diagnostics {
/// <summary>
/// Implements a <see cref="TextWriter"/> for writing information to the debugger log.
/// </summary>
/// <seealso cref="Debugger.Log"/>
public class DebuggerWriter : TextWriter
{
private bool isOpen;
private static UnicodeEncoding encoding;
private readonly int level;
private readonly string category;
/// <summary>
/// Initializes a new instance of the <see cref="DebuggerWriter"/> class.
/// </summary>
public DebuggerWriter()
: this(0, Debugger.DefaultCategory)
{
}
/// <summary>
/// Initializes a new instance of the <see cref="DebuggerWriter"/> class with the specified level and category.
/// </summary>
/// <param name="level">A description of the importance of the messages.</param>
/// <param name="category">The category of the messages.</param>
public DebuggerWriter(int level, string category)
: this(level, category, CultureInfo.CurrentCulture)
{
}
/// <summary>
/// Initializes a new instance of the <see cref="DebuggerWriter"/> class with the specified level, category and format provider.
/// </summary>
/// <param name="level">A description of the importance of the messages.</param>
/// <param name="category">The category of the messages.</param>
/// <param name="formatProvider">An <see cref="IFormatProvider"/> object that controls formatting.</param>
public DebuggerWriter(int level, string category, IFormatProvider formatProvider)
: base(formatProvider)
{
this.level = level;
this.category = category;
this.isOpen = true;
}
protected override void Dispose(bool disposing)
{
isOpen = false;
base.Dispose(disposing);
}
public override void Write(char value)
{
if (!isOpen)
{
throw new ObjectDisposedException(null);
}
Debugger.Log(level, category, value.ToString());
}
public override void Write(string value)
{
if (!isOpen)
{
throw new ObjectDisposedException(null);
}
if (value != null)
{
Debugger.Log(level, category, value);
}
}
public override void Write(char[] buffer, int index, int count)
{
if (!isOpen)
{
throw new ObjectDisposedException(null);
}
if (buffer == null || index < 0 || count < 0 || buffer.Length - index < count)
{
base.Write(buffer, index, count); // let the base class throw the appropriate exception
}
Debugger.Log(level, category, new string(buffer, index, count));
}
public override Encoding Encoding
{
get
{
if (encoding == null)
{
encoding = new UnicodeEncoding(false, false);
}
return encoding;
}
}
public int Level
{
get { return level; }
}
public string Category
{
get { return category; }
}
}
}
Enjoy!
- Posted by Kris Vandermotten on January 1, 2007
Last time, we looked at how Linq To SQL might impact how we think about what a Data Access Layer (DAL) is, based on the dependencies between assemblies. This time, we'll take a different approach: let's look at typical Linq to SQL code, and try to decide where to put it. I'll use a code sample from the "DLinq Overview for CSharp Developers" document included in the Linq May CTP (in C# 3.0, but the same applies to VB9).
A simple start
Let's take a look at the following code:
Northwind db = new Northwind(@"c:\northwind\northwnd.mdf");
var q = from c in db.Customers
where c.City == "London"
select c;
foreach (var cust in q)
Console.WriteLine("id = {0}, City = {1}", cust.CustomerID, cust.City);
It should be clear that the first line belongs in the DAL. The DataContext encapsulates a database connection, and knows about the physical location of the database. That is not something that higher layers should know about.
Let's say the actual query definition belongs in the DAL too, but clearly, the foreach loop sits in some higher layer. That means the two first statements need to be encapsulated in some function in the DAL, for example as follows (sticking with the "Entity Access Layer" terminology introduced before):
public class CustomersEal
{
private Northwind db = new Northwind(@"c:\northwind\northwnd.mdf");
public IQueryable<Customer> GetCustomersByCity(string city)
{
return from c in db.Customers
where c.City == city
select c;
}
}
The business layer then contains the following code:
CustomersEal customersEal = new CustomersEal();
foreach (var cust in customersEal.GetCustomersByCity("London"))
Console.WriteLine("id = {0}, City = {1}", cust.CustomerID, cust.City);
Looks good, doesn't it? All the business layer knows about the database is that it can return Customer objects.
Problems
But wait, what if I write the following in my business layer:
CustomersEal customersEal = new CustomersEal();
var q = from c in customersEal.GetCustomersByCity("London")
orderby c.ContactName
select new { c.CustomerID, c.City };
foreach (var cust in q)
Console.WriteLine("id = {0}, City = {1}", cust.CustomerID, cust.City);
This code highlights a few interesting facts.
First of all, it wasn't the DAL that executed the query, at least not in the traditional sense of the word. The DAL (CustomersEal to be precise) merely supplied the definition for the query. The query got executed when the foreach statement started looping over the result! In a traditional DAL, a call to a method like GetCustomersByCity would have executed the query, but not with Linq, at least not if we implement our code like this.
Secondly, the business layer can refine the query definition. This definitely has some advantages, but I realize some might argue that this is really bad. Note though, that the business layer cannot redefine the query, or execute just any query it wants. Or can it? You need the DataContext to start the process, and only the DAL has access to that, right? In fact, the Entity Layer generated by SQLMetal is referenced by the business layer too; it needs it to get to the definitions of the entities!
Thirdly, it is absolutely not clear where a developer should draw the line between what's business logic, and what belongs in the DAL. I could have moved the orderby into the DAL (especially if I always want customers to be ordered by their ContactName). But likewise, I could have moved the where clause to the business layer! How do I decide what to do?
I hate it when developers have to make choices like that during routine development. Choosing takes time, and that's not likely to improve productivity. But much worse is the fact that different developers will make different choices. Even a single developer may make different choices from one day to the next. That leads to inconsistencies in the code. Developers will spend more time trying to understand the code they're reading, because it doesn't always follow the same pattern. That's bad for productivity. In the worst case scenario, developers start rewriting each other's code, just so it matches their choice of the day. That kills productivity. (Wasn't Linq all about improving productivity?)
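The first two observations can be reproduced without a database. The following is a minimal sketch, assuming an in-memory list wrapped with AsQueryable() as a stand-in for the DataContext. The Table field, the DeferredDemo class and the sample data are made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Customer
{
    public string CustomerID { get; set; }
    public string ContactName { get; set; }
    public string City { get; set; }
}

// In-memory stand-in for the EAL class; the real one would query db.Customers.
public class CustomersEal
{
    public readonly List<Customer> Table = new List<Customer>();

    public IQueryable<Customer> GetCustomersByCity(string city)
    {
        return from c in Table.AsQueryable()
               where c.City == city
               select c;
    }
}

public static class DeferredDemo
{
    public static List<string> Run()
    {
        var eal = new CustomersEal();
        eal.Table.Add(new Customer { CustomerID = "B", ContactName = "Zoe", City = "London" });

        // The "business layer" refines the query. Nothing has been enumerated yet.
        var q = from c in eal.GetCustomersByCity("London")
                orderby c.ContactName
                select new { c.CustomerID, c.City };

        // Added after the query was defined, yet visible once it is enumerated.
        eal.Table.Add(new Customer { CustomerID = "A", ContactName = "Adam", City = "London" });
        eal.Table.Add(new Customer { CustomerID = "C", ContactName = "Eve", City = "Paris" });

        return q.Select(r => r.CustomerID).ToList(); // the query runs here
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run())); // A, B
    }
}
```

Calling GetCustomersByCity only hands back a query definition. The caller both extends it and decides when it runs.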
The solution?
We need a clear and simple criterion to decide which code goes where.
Note that the absolute minimum for a DAL is the following:
public class CustomersEal
{
private Northwind db = new Northwind(@"c:\northwind\northwnd.mdf");
public IQueryable<Customer> GetCustomers()
{
return db.Customers;
}
}
It's a bit silly of course: if that's all this layer does, we might just as well skip it. (The connection string should be externalized in a configuration file anyway, and a default constructor that reads the connection string from the config file should be added to the Northwind DataContext in a partial class.) Silly or not, it is a "lower bound" to an EAL as we have defined it here. I believe there's an "upper bound" too: I think the DAL shouldn't do projections (well, it definitely should not expose anonymous types). But that still leaves us with a very broad range. How to make a choice?
I'm inclined to say that the only way to make a clear and simple choice once and for all is to go with the minimalist approach. And indeed, that means we don't need/write/use an Entity Access Layer. The business logic directly accesses the one assembly generated by SQLMetal, one assembly per database that is.
How's that for a DAL?
- Posted by Kris Vandermotten on December 31, 2006
There is no doubt that Linq to SQL will have an enormous impact on the way we write data access layers. I wouldn't be surprised to find out that the impact is so profound, that we might even have to reconsider the very nature of a data access layer. In fact, what is a data access layer (DAL) anyway?
Let's start by trying to create a (working) definition of a DAL. Wikipedia is usually a good place to start, but you'll find that the Wikipedia article on DALs doesn't exactly contain all the answers. So let's give it a try ourselves.
A DAL is a layer. That means it is part of a layered architecture. Other layers use the DAL to do data access. Indeed, the DAL is the layer accessing the data (and in the context of Linq to SQL, that's relational data), and no other layers access the data directly.
That's a good start, but what is a layer? Is that a special kind of component? Not in my mind it isn't. To me, a layer can contain multiple components, and that applies to a DAL as well. Let's say I have a simple banking system. It contains functionality on clients, their accounts, and the operations (such as money transfers) they do on those accounts. That might result in a vertical partitioning of the application in three modules, "Clients", "Accounts" and "Operations". Each of those would be layered, and you'd find components at the intersections of the vertical modules and the horizontal layers. So you'd have "Clients DAL", "Accounts DAL" and "Operations DAL" components. Obviously, these components are related: they have dependencies between them. The Operations DAL depends upon the Accounts DAL (and possibly the Clients DAL as well), and the Accounts DAL depends on the Clients DAL.
In .NET, components like these correspond to assemblies. So our DAL would consist of several assemblies, with (non-circular) references (dependencies) to each other. Which part of the functionality do we put where?
The Clients DAL doesn't know about accounts, that's the responsibility of the Accounts DAL. That one knows about accounts, and about clients as well. After all, accounts are owned by clients. So that means the function to retrieve the list of accounts belonging to a given client sits in the Accounts DAL, not in the Clients DAL. Since this function has a client as a parameter (or at least a client id), the Accounts DAL may indeed need a reference to the Clients DAL. Each object returned by this function has a reference to the client owning the account, or at least the id of that client.
But wait, what's the impact of Linq to SQL on what I said so far? If I have a database with Clients and Accounts tables (amongst many others) with a foreign key between them, the typical Client and Account entity classes will have a relationship between them as well. The Account class will have a delay-loaded Client property, and the Client class will have an Accounts property. That's a mutual dependency, so these two classes need to sit in the same assembly. But where does that lead us to? Typically, all tables in a database are somehow related to each other, i.e. there are no disconnected islands of tables with relations between them, but no relations to other islands in the database. But that leads us to just one DAL per database! Is that what we want?
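The mutual dependency described above can be sketched with two plain classes. This is a simplified illustration and not what SQLMetal emits: the generated entities use EntitySet<T> and EntityRef<T> and carry mapping attributes, and the property names here are assumptions.

```csharp
using System;
using System.Collections.Generic;

public class Client
{
    public int Id { get; set; }
    // Client refers to Account...
    public List<Account> Accounts { get; private set; }

    public Client()
    {
        Accounts = new List<Account>();
    }
}

public class Account
{
    public int Id { get; set; }
    // ...and Account refers back to Client, so neither class compiles without the other.
    public Client Client { get; set; }
}

public static class MutualDependencyDemo
{
    // Wires up both directions of the relationship.
    public static Account Open(Client owner, int accountId)
    {
        var account = new Account { Id = accountId, Client = owner };
        owner.Accounts.Add(account);
        return account;
    }

    public static void Main()
    {
        var client = new Client { Id = 1 };
        var account = Open(client, 100);
        Console.WriteLine(account.Client.Id);     // 1
        Console.WriteLine(client.Accounts.Count); // 1
    }
}
```

Because .NET assemblies cannot reference each other circularly, two classes shaped like this have to be compiled into the same assembly.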
Well, SQLMetal, the Linq to SQL tool that generates an entity model based on a database schema, definitely pushes us in that direction. Typically, it generates one source code file containing one DataContext and all the entities in your data model. But that's just the entities though, that code doesn't do any data access! To actually access the data, you need to write queries! And those queries are the responsibility of our DAL components.
In our example above, we had three DAL components, and all three of them would access the same DataContext. That implies that the DataContext should exist in an assembly of its own, an assembly underlying all DAL assemblies. But that's an additional layer, isn't it?
Well maybe it is. Maybe we need to split our traditional Data Access Layer into two distinct sublayers. For lack of better terms, I'll call them the "Entity Layer" and the "Entity Access Layer".
The Entity Layer has just one assembly in it, so we might just as well refer to that assembly as the Entity Layer as well. The entire assembly is compiled from just one code file (and some housekeeping stuff perhaps, like an AssemblyInfo.cs file), generated by SQLMetal.
The Entity Access Layer (EAL) has several assemblies (three in our example), all using the Entity Layer. The EAL assemblies contain the actual queries.

Next time, we'll look at the interface between the EAL assemblies and the business layer: what parameters are used, what results are returned? Do we expose the Entity Layer types? Do we expose query expressions or query results only?
That's enough food for thought right now, and comments are more than welcome.