Creating a Data Access Layer with Linq to SQL, part 2

Last time, we looked at how Linq To SQL might impact how we think about what a Data Access Layer (DAL) is, based on the dependencies between assemblies. This time, we'll take a different approach: let's look at typical Linq to SQL code, and try to decide where to put it. I'll use a code sample from the "DLinq Overview for CSharp Developers" document included in the Linq May CTP (in C# 3.0, but the same applies to VB9).

A simple start

Let's take a look at the following code:

Northwind db = new Northwind(@"c:\northwind\northwnd.mdf"); 

var q = from c in db.Customers
where c.City == "London"
select c;

foreach (var cust in q)
Console.WriteLine("id = {0}, City = {1}", cust.CustomerID, cust.City);
 

It should be clear that the first line belongs in the DAL. The DataContext encapsulates a database connection, and knows about the physical location of the database. That is not something that higher layers should know about.

Let's say the actual query definition belongs in the DAL too, but clearly, the foreach loop sits in some higher layer. That means the two first statements need to be encapsulated in some function in the DAL, for example as follows (sticking with the "Entity Access Layer" terminology introduced before):

public class CustomersEal 
{
private Northwind db = new Northwind(@"c:\northwind\northwnd.mdf");
    public IQueryable<Customer> GetCustomersByCity(string city) 
{
return from c in db.Customers
where c.City == city
select c;
}
}
 

The business layer then contains the following code:

CustomersEal customersEal = new CustomersEal();
 
foreach (var cust in customersEal.GetCustomersByCity("London"))
    Console.WriteLine("id = {0}, City = {1}", cust.CustomerID, cust.City);
 

Looks good, doesn't it? All the business layer knows about the database, is that it can return Customer objects.

Problems

But wait, what if I write the following in my business layer:

CustomersEal customersEal = new CustomersEal();
 
var q = from c in customersEal.GetCustomersByCity("London")
        orderby c.ContactNamer
        select new { c.CustomerID, c.City };
 
foreach (var cust in q)
    Console.WriteLine("id = {0}, City = {1}", cust.CustomerID, cust.City);
 

This code highlights a few interesting facts.

First of all, it wasn't the DAL that executed the query, at least not in the traditional sense of the word. The DAL (CustomersEal to be precise) merely supplied the definition for the query. The query got executed when the foreach statement started looping over the result! In a traditional DAL, a call to a method like GetCustomersByCity would have executed the query, but not with Linq, at least not if we implement our code like this.

Secondly, the business layer can refine the query definition. This definitely has some advantages, but I realize some might argue that this is really bad. Note though, that the business layer cannot redefine the query, or execute just any query it wants. Or can it? You need the DataContext to start the process, and only the DAL has access to that, right? In fact, the Entity Layer generated by SQLMetal is referenced by the business layer too; it needs it to get to the definitions of the entities!

Thirdly, it is absolutely not clear where a developer should draw the line between what's business logic, and what belongs in the DAL. I could have moved the orderby into the DAL (especially if I always want customers to be ordered by their ContactName). But likewise, I could have moved the where clause to the business layer! How do I decide what to do?

I hate it when developers have to make choices like that during routine development. Choosing takes time, and that's not likely to improve productivity. But much worse is the fact that different developers will make different choices. Even a single developer may make different choices from one day to the next. That leads to inconsistencies in the code. Developers will spend more time trying to understand the code they're reading, because it doesn't always follow the same pattern. That's bad for productivity. In the worst case scenario, developers start rewriting each other's code, just so it matches their choice of the day. That kills productivity. (Wasn't Linq all about improving productivity?)

The solution?

We need a clear and simple criterion to decide which code goes where.

Note that the absolute minimum for a DAL is the following:

public class CustomersEal  
{
    private Northwind db = new Northwind(@"c:\northwind\northwnd.mdf");
 
    public IQueryable<Customer> GetCustomers()
    {
        return db.Customers;
    }
}
 

It's a bit silly of course, if that's all this layer does, we might just as well skip it (the connection string should be externalized in a configuration file anyway, and a default constructor that reads the connection string from the config file should be added to the Northwind DataContext in a partial class). Silly or not, it is a "lower bound" to an EAL as we have defined it here. I believe there's an "upper bound" too: I think the DAL shouldn't do projections (well, it definitely should not expose anonymous types). But that still leaves us with a very broad range. How to make a choice?

I'm inclined to say that the only way to make a clear and simple choice once and for all, it to go with the minimalist approach. And indeed, that means we don't need/write/use an Entity Access Layer. The business logic directly accesses the one assembly generated by SQLMetal, one assembly per database that is.

How's that for a DAL?