Building an Entity Framework 4.0 model on views: practical tips

Many development teams and database administrators use views to create an abstraction layer on top of a physical data model. In this article I'll show you how to build an Entity Framework 4.0 (EF4) model on top of such a set of views. I'll create a couple of views in SQL Server -some of them indexed-, import them into a model using the Visual Studio designer, decorate them with different types of associations (foreign key and mapped table), and finally attack the model with some Linq queries. This is what the model looks like:

Creating the views

Views that are directly consumed by your application should be stable. That's why I prefer such views to be declared with the SCHEMABINDING option:

CREATE VIEW U2UConsult.Person WITH SCHEMABINDING
AS
SELECT BusinessEntityID AS PersonId,
       Title,
       FirstName,
       LastName
  FROM Person.Person

The SCHEMABINDING option protects a view against modifications to the schema of the underlying tables, at least modifications that would invalidate the view. E.g. it becomes impossible to drop a column on which the view relies:

Optional step: indexing the views

Sometimes it makes sense to persist the view on disk and let SQL Server make sure that its content remains in sync with the underlying tables. This is very useful for complex views (with lots of joins and calculations) on stable tables. All we need to do is create a clustered index on the view:

CREATE UNIQUE CLUSTERED INDEX [IUX_Person]
ON [U2UConsult].[Person]([PersonId])
WITH (FILLFACTOR = 100)

Importing the views

You import a view into an entity model just like you import a table. But views -even indexed ones- cannot have primary or foreign keys in the database, so there's no metadata to import. The visual designer overcompensates for this by inferring a key composed of all non-nullable columns. This is not a good idea: the first thing you need to do is define the primary key of the view (see the before and after screenshots). Now do the same steps for the Address view:

CREATE VIEW U2UConsult.Address WITH SCHEMABINDING
AS
SELECT AddressID AS AddressId,
       AddressLine1,
       AddressLine2,
       PostalCode,
       City
FROM Person.Address
GO

CREATE UNIQUE CLUSTERED INDEX [IUX_Address]
ON [U2UConsult].[Address]([AddressId])
WITH (FILLFACTOR = 100)
GO

Defining 1-to-1 or 1-to-many relationships

In a table, you would express 1-to-1 or 1-to-many relationships by creating a foreign key relationship. In an entity model, you can do the same with views. For starters, define a new association between two views. The model looks good now, and IntelliSense will help you while building Linq queries against it. However, you're missing an important ingredient: the physical model doesn't know how to resolve the association. When creating the association, don't forget to check the 'add foreign key properties' box. If necessary, you can refine the data store binding by clicking on the ellipsis next to 'referential constraint' in the properties window. After that, you need to remove the view from the designer: the view is used as an entity and as an association, and EF4 does not like that.
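With the association in place, you can navigate from one view to the other in Linq, just like between two tables. The following is only a sketch: the Addresses navigation property and the Person entity set names depend on how you named the association ends in the designer, so treat them as assumptions:

using (AdventureWorks2008Entities entities = new AdventureWorks2008Entities())
{
    // Navigate the association between the two views,
    // exactly like you would navigate between two tables.
    var result = (from p in entities.Person
                  from a in p.Addresses
                  where a.City == "Brussels"
                  select new { p.FirstName, p.LastName, a.AddressLine1 }).ToList();
}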
Defining many-to-many relationships

Many-to-many relations are generally implemented through an intermediate table. A many-to-many relationship between two views is built in exactly the same way. The AdventureWorks2008 database has an intermediate table between Address and BusinessEntity (= Person): BusinessEntityAddress. Unfortunately we can't use this table to carry the association. Strangely enough, the Entity Framework requires that all of the physical (SSDL) primary key fields are mapped. Using that table as the glue between Persons and Addresses yields the following error:

As a workaround, you could define the link table as a view:

CREATE VIEW U2UConsult.PersonAddress WITH SCHEMABINDING
AS
SELECT DISTINCT BusinessEntityID AS PersonId,
       AddressID AS AddressId
FROM Person.BusinessEntityAddress

Then add it to the entity model, and map the association to it. The foreign key checkbox will be disabled for many-to-many associations.

Both types of associations were created using the designer. I didn't need to manually tweak the SSDL part of the model, so when we update the model from the database they will remain intact.

Querying the views

For Linq it doesn't matter where the data comes from, so you use the views like you would use a table:

Person person = (from p in entities.Person.Include("Phones")
                 where p.PersonId == 1
                 select p).FirstOrDefault();

This gives the following result:

If EF4 performance matters to you, you might want to (re-)read this article.

Source

Here's the sample project, all SQL queries are included: U2UConsult.EF4.Views.Sample.zip (14,33 kb)

Enjoy!

A fistful of Entity Framework 4.0 Tips

This article presents some useful tips for building the data access layer of an enterprise application on top of Entity Framework 4.0 (EF40). For those who can't wait, here they are: 1. Only project the columns you really need, 2. Stay away from the Include syntax, 3. Consider alternatives, 4. But then come back and try harder, and 5. Always return custom data transfer objects. The focus of EF40 lies on developer productivity, not database performance. So there are a couple of caveats you should be aware of if you don't want your data access to become the bottleneck. I'll illustrate my tips by showing you some different ways to issue a left outer join on a small entity model with just the Person and PersonPhone entities:

Tip 1: Only project the columns you really need

You should never return full-fledged entities from your queries. If you're only interested in the FirstName and LastName of Person entities, then the following EF40 query is definitely a bad idea:

var query = from p in model.People
            select p;

A query like this selects all the columns from the underlying table. It most probably has only one covering index in the database: the clustered index. This query will suffer from all kinds of locks on the table. Just execute the following SQL (from Visual Studio or SQL Server Management Studio), and then start the Linq query:

BEGIN TRANSACTION
    UPDATE Person.Person
       SET Title = NULL
     WHERE BusinessEntityID = 1

If default isolation levels are applied to the database, the Linq query will be blocked and eventually time out. [Don't forget to roll back the transaction.] You should only project (that's just a fancy word for 'select') the needed columns, like this:

var query = from p in model.People
            select new PersonDto() { LastName = p.LastName, FirstName = p.FirstName };

With a little luck -and help from your database administrator- there might be a suitable covering index in the database that swiftly produces your result set, unhindered by locks. Here's a screenshot of SQL Server Management Studio displaying the generated query plans, and their corresponding costs: The second query runs 24 times faster than the first one. I don't know about you, but I would call this a significant improvement!

Tip 2: Stay away from the Include syntax

The Include syntax in EF 4.0 is the successor of the LoadOptions from Linq-to-SQL. It allows you to eagerly load associated entities.
Here's a sample query, returning persons and their phones: var query = from p in model.People.Include("PersonPhones")             select p; Although it looks like a declarative outer join, it generates weird T-SQL:  SELECT [Project1].[BusinessEntityID] AS [BusinessEntityID],         [Project1].[PersonType] AS [PersonType],         [Project1].[NameStyle] AS [NameStyle],         [Project1].[Title] AS [Title],         [Project1].[FirstName] AS [FirstName],         [Project1].[MiddleName] AS [MiddleName],         [Project1].[LastName] AS [LastName],         [Project1].[Suffix] AS [Suffix],         [Project1].[EmailPromotion] AS [EmailPromotion],         [Project1].[AdditionalContactInfo] AS [AdditionalContactInfo],         [Project1].[Demographics] AS [Demographics],         [Project1].[rowguid] AS [rowguid],         [Project1].[ModifiedDate] AS [ModifiedDate],         [Project1].[C1] AS [C1],         [Project1].[BusinessEntityID1] AS [BusinessEntityID1],         [Project1].[PhoneNumber] AS [PhoneNumber],         [Project1].[PhoneNumberTypeID] AS [PhoneNumberTypeID],         [Project1].[ModifiedDate1] AS [ModifiedDate1]    FROM (SELECT [Extent1].[BusinessEntityID] AS [BusinessEntityID],                 [Extent1].[PersonType] AS [PersonType],                 [Extent1].[NameStyle] AS [NameStyle],                 [Extent1].[Title] AS [Title],                 [Extent1].[FirstName] AS [FirstName],                 [Extent1].[MiddleName] AS [MiddleName],                 [Extent1].[LastName] AS [LastName],                 [Extent1].[Suffix] AS [Suffix],                 [Extent1].[EmailPromotion] AS [EmailPromotion],                 [Extent1].[AdditionalContactInfo] AS [AdditionalContactInfo],                 [Extent1].[Demographics] AS [Demographics],                 [Extent1].[rowguid] AS [rowguid],                 [Extent1].[ModifiedDate] AS [ModifiedDate],                 [Extent2].[BusinessEntityID] AS [BusinessEntityID1],                 [Extent2].[PhoneNumber] AS [PhoneNumber],                 [Extent2].[PhoneNumberTypeID] AS [PhoneNumberTypeID],                 [Extent2].[ModifiedDate] AS [ModifiedDate1],                 CASE WHEN ([Extent2].[BusinessEntityID] IS NULL) THEN CAST(NULL AS int) ELSE 1 END AS [C1]            FROM [Person].[Person] AS [Extent1]          LEFT OUTER JOIN [Person].[PersonPhone] AS [Extent2]              ON [Extent1].[BusinessEntityID] = [Extent2].[BusinessEntityID]         )  AS [Project1] ORDER BY [Project1].[BusinessEntityID] ASC, [Project1].[C1] ASC   As an alternative, you could explicitly code a Linq outer join, or project the association. 
The database doesn't care, the two join flavors yield the same T-SQL query: var query = from p in model.People             join pp in model.PersonPhones             on p.BusinessEntityID equals pp.BusinessEntityID             into phones             select new PersonWithPhonesDto() { LastName = p.LastName, PersonPhones = phones }; var query = from p in model.People             select new PersonWithPhonesDto() { LastName = p.LastName, PersonPhones = p.PersonPhones }; This is the resulting T-SQL query:   SELECT [Project1].[BusinessEntityID] AS [BusinessEntityID],        [Project1].[LastName] AS [LastName],        [Project1].[C1] AS [C1],        [Project1].[BusinessEntityID1] AS [BusinessEntityID1],        [Project1].[PhoneNumber] AS [PhoneNumber],        [Project1].[PhoneNumberTypeID] AS [PhoneNumberTypeID],        [Project1].[ModifiedDate] AS [ModifiedDate]   FROM (SELECT [Extent1].[BusinessEntityID] AS [BusinessEntityID],                [Extent1].[LastName] AS [LastName],                [Extent2].[BusinessEntityID] AS [BusinessEntityID1],                [Extent2].[PhoneNumber] AS [PhoneNumber],                [Extent2].[PhoneNumberTypeID] AS [PhoneNumberTypeID],                [Extent2].[ModifiedDate] AS [ModifiedDate],                CASE                   WHEN ([Extent2].[BusinessEntityID] IS NULL) THEN CAST(NULL AS int)                   ELSE 1                END AS [C1]     FROM  [Person].[Person] AS [Extent1]     LEFT OUTER JOIN [Person].[PersonPhone] AS [Extent2] ON [Extent1].[BusinessEntityID] = [Extent2].[BusinessEntityID] )  AS [Project1] ORDER BY [Project1].[BusinessEntityID] ASC, [Project1].[C1] ASC   It's still a weird query, but thanks to the 'LastName' projection it runs twice as fast. Here's the proof: Neither the Include nor the standard Linq outer join allow to project selected columns from the PersonPhones table. And by the way: it would be nice if we could get rid of the unwanted sort operation that takes almost 60 % of the processing. Tip 3: Consider alternatives In the methods that require the best possible SQL queries, you might be tempted to abandon the Entity Framework and use another option. Linq to SQL If you're only targeting SQL Server, then Linq-to-SQL (L2S) provides a nice alternative for EF. According to the rumors, Linq-to-SQL still generally produces a higher quality of T-SQL. So let's check it out. Here's the outer join in L2S (it's the same as in EF40) : var query = from p in model.Persons             select new PersonWithPhonesDto() { LastName = p.LastName, PersonPhones = p.PersonPhones }; Here's the resulting query: SELECT [t0].[LastName], [t1].[BusinessEntityID], [t1].[PhoneNumber], [t1].[PhoneNumberTypeID], [t1].[ModifiedDate],(     SELECT COUNT(*)       FROM [Person].[PersonPhone] AS [t2]      WHERE [t2].[BusinessEntityID] = [t0].[BusinessEntityID]       ) AS [value]     FROM [Person].[Person] AS [t0] LEFT OUTER JOIN [Person].[PersonPhone] AS [t1]       ON [t1].[BusinessEntityID] = [t0].[BusinessEntityID] ORDER BY [t0].[BusinessEntityID], [t1].[PhoneNumber], [t1].[PhoneNumberTypeID] Here's a comparison of the query plans in SQL Server Management Studio: As you observe, the L2S query takes more resources than the EF40 version. This is not a real surprise: as opposed to L2S, EF40 is continuously improving. Its bugs get fixed and the Linq provider gets smarter with every iteration. So sticking -or returning- to L2S might give you only a short term advantage. 
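By the way, the queries in this article project into small custom classes (PersonDto, PersonWithPhonesDto, PersonWithPhoneDto) that aren't listed here. Their exact shape is an assumption, but a minimal sketch like this is enough to make the samples compile:

public class PersonDto
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public class PersonWithPhoneDto
{
    public string LastName { get; set; }
    public string PhoneNumber { get; set; }
}

public class PersonWithPhonesDto
{
    public string LastName { get; set; }
    // PersonPhone is the entity type generated from the model.
    public IEnumerable<PersonPhone> PersonPhones { get; set; }
}

Note that ExecuteStoreQuery<PersonWithPhoneDto>, used further down, maps result set columns to these properties by name, so the property names must match the column aliases in the hand-written SQL.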
Brew your own query

Where query-generating technologies fail, you should build the T-SQL queries yourself. It's the only way to get full access to the database syntax: ranking functions, cross apply calls, common table expressions, optimization and locking hints, etc. Fortunately you can do this while still standing on the shoulders of EF40. You don't have to programmatically instantiate the whole underlying ADO.NET object stack (Connection, Adapter, Command, DataReader) yourself. EF40 will do it for you:

string joinStatement = @"SELECT [t0].[LastName], [t1].[PhoneNumber]
                        FROM [Person].[Person] AS [t0]
            LEFT OUTER JOIN [Person].[PersonPhone] AS [t1]
                            ON [t1].[BusinessEntityID] = [t0].[BusinessEntityID]";

var query = model.ExecuteStoreQuery<PersonWithPhoneDto>(joinStatement).ToList();

Here's the comparison between the Linq outer join and the T-SQL outer join: The home-made query runs three times faster. This is because we're now able to select only the needed columns from both tables.

Tip 4: But then come back to EF40, and try harder

If an alternative technology produces much better results than EF40, then you must have done something wrong. After all, EF40 is Microsoft's main data access technology. You observed that simple EF40 Linq queries yield weird T-SQL. Well, the opposite is also true. Consider the following query:

var query = from p in model.People
            from pp in
                (from zz in model.PersonPhones
                    where p.BusinessEntityID == zz.BusinessEntityID
                    select new { zz.PhoneNumber }).DefaultIfEmpty()
            select new PersonWithPhoneDto() { LastName = p.LastName, PhoneNumber = pp.PhoneNumber };

It's an inner join between 'all persons' and 'all person phones or a default value'. In other words: it's a left outer join. On top of that, it only fetches the needed columns from both tables. And indeed, this query yields a T-SQL query that is spot on the ideal version. I forgive EF40 for returning the primary key column. This takes no extra database resources -just bandwidth- and you probably need the value anyway in your business layer:

         SELECT [Extent1].[BusinessEntityID] AS [BusinessEntityID],
                [Extent1].[LastName] AS [LastName],
                [Extent2].[PhoneNumber] AS [PhoneNumber]
           FROM [Person].[Person] AS [Extent1]
LEFT OUTER JOIN [Person].[PersonPhone] AS [Extent2]
             ON [Extent1].[BusinessEntityID] = [Extent2].[BusinessEntityID]

The same query also runs in L2S, but does not return the BusinessEntityId column. And neither of the queries causes an internal sort!

Tip 5: Always return custom data transfer objects

You should avoid returning Self-Tracking Entities from your data access layer methods. Use lightweight custom data transfer objects. Only this technique will help you put all the previous tips into practice.

Source code

Here's the full source code of the sample project. I put the L2S code in a separate assembly to avoid namespace collisions: U2UConsult.EF4.Linq.Tips.zip (29,37 kb)

Enjoy!

Getting and setting the Transaction Isolation Level on a SQL Entity Connection

This article explains how to get, set, and reset the transaction isolation level on a SQL and Entity connection. In a previous article I already explained how important it is to explicitly set the appropriate isolation level when using transactions. I'm sure you're not going to wrap each and every database call in an explicit transaction (TransactionScope is simply too heavy to wrap around a simple SELECT statement). Nevertheless you should realize that every single T-SQL statement that you launch from your application will run in a transaction, and hence will behave according to a transaction isolation level. If you don't explicitly start a transaction or use a transaction scope, then SQL Server will run the statement as a transaction on its own. You don't want your SQL commands to read uncommitted data (in the READ UNCOMMITTED level) or to apply overly heavy locks (in the SERIALIZABLE level) on the database, so you want to make sure that you're running with the correct isolation level. This article explains how to do this.

Setting the Transaction Isolation Level

Setting the appropriate isolation level on a session/connection is done with a standard T-SQL statement, like the following:

T-SQL
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ

It's easy to write an extension method on the SqlConnection and EntityConnection classes that allows you to make this call from C#, like this:

C#
using (AdventureWorks2008Entities model = new AdventureWorks2008Entities())
{
    model.Connection.Open();

    // Explicitly set isolation level
    model.Connection.SetIsolationLevel(IsolationLevel.ReadUncommitted);

    // Your stuff here ...
}

Here's the whole extension method. I implemented it against IDbConnection to cover all connection types. The attached Visual Studio solution contains the full code:

C#
public static void SetIsolationLevel(this IDbConnection connection, IsolationLevel isolationLevel)
{
    if (isolationLevel == IsolationLevel.Unspecified || isolationLevel == IsolationLevel.Chaos)
    {
        throw new Exception(string.Format("Isolation Level '{0}' can not be set.", isolationLevel.ToString()));
    }

    if (connection is EntityConnection)
    {
        SqlConnection sqlConnection = (connection as EntityConnection).StoreConnection as SqlConnection;
        sqlConnection.SetIsolationLevel(isolationLevel);
    }
    else if (connection is SqlConnection)
    {
        IDbCommand command = connection.CreateCommand();
        command.CommandText = string.Format("SET TRANSACTION ISOLATION LEVEL {0}", isolationLevels[isolationLevel]);
        command.ExecuteNonQuery();
    }
}

Getting the Transaction Isolation Level

If you want to retrieve the current isolation level on your connection, you first have to figure out how to do this in T-SQL. Unfortunately there is no standard @@ISOLATIONLEVEL function or the like. Here's how to do it:

T-SQL
SELECT CASE transaction_isolation_level
          WHEN 0 THEN 'Unspecified'
          WHEN 1 THEN 'Read Uncommitted'
          WHEN 2 THEN 'Read Committed'
          WHEN 3 THEN 'Repeatable Read'
          WHEN 4 THEN 'Serializable'
          WHEN 5 THEN 'Snapshot'
       END AS [Transaction Isolation Level]
  FROM sys.dm_exec_sessions
 WHERE session_id = @@SPID

Although you're querying a dynamic management view, the code requires no extra SQL permissions (not even VIEW SERVER STATE). A user can always query his own sessions.
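If you just want a quick, ad-hoc look at the current level from an EF4 context (before wrapping anything in an extension method), you can run the query directly through ObjectContext.ExecuteStoreQuery. This is only a sketch; it returns the raw smallint value instead of the readable CASE labels:

C#
using (AdventureWorks2008Entities model = new AdventureWorks2008Entities())
{
    model.Connection.Open();

    // 0 = Unspecified, 1 = Read Uncommitted, 2 = Read Committed,
    // 3 = Repeatable Read, 4 = Serializable, 5 = Snapshot
    short level = model.ExecuteStoreQuery<short>(
        "SELECT transaction_isolation_level FROM sys.dm_exec_sessions WHERE session_id = @@SPID")
        .Single();
}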
Again, you can easily wrap this query into an extension method, which can be called like this:

C#
using (AdventureWorks2008Entities model = new AdventureWorks2008Entities())
{
    model.Connection.Open();

    // Get isolation level
    // Probably returns 'ReadUncommitted' (due to connection pooling)
    MessageBox.Show(model.Connection.GetIsolationLevel().ToString());
}

If you run the same code from the attached project, you'll indeed notice that the returned isolation level will most probably be READ UNCOMMITTED. This is because we most probably reuse the connection from the SetIsolationLevel() sample. As I already mentioned in a previous article, the transaction isolation level is NOT reset on pooled connections. So even if you're not explicitly using transactions, you should still set the appropriate transaction isolation level: there's no default you can rely on. OK, here's the corresponding extension method:

C#
public static IsolationLevel GetIsolationLevel(this IDbConnection connection)
{
    string query =
        @"SELECT CASE transaction_isolation_level
                    WHEN 0 THEN 'Unspecified'
                    WHEN 1 THEN 'ReadUncommitted'
                    WHEN 2 THEN 'ReadCommitted'
                    WHEN 3 THEN 'RepeatableRead'
                    WHEN 4 THEN 'Serializable'
                    WHEN 5 THEN 'Snapshot'
                    END AS [Transaction Isolation Level]
            FROM sys.dm_exec_sessions
            WHERE session_id = @@SPID";

    if (connection is EntityConnection)
    {
        return (connection as EntityConnection).StoreConnection.GetIsolationLevel();
    }
    else if (connection is SqlConnection)
    {
        IDbCommand command = connection.CreateCommand();
        command.CommandText = query;
        string result = command.ExecuteScalar().ToString();

        return (IsolationLevel)Enum.Parse(typeof(IsolationLevel), result);
    }

    return IsolationLevel.Unspecified;
}

Simple and powerful, isn't it? Stuff like this should ship with the framework!

Temporarily using a Transaction Isolation Level

With the new GetIsolationLevel() and SetIsolationLevel() methods it becomes easy to set the isolation level, execute some commands, and then reset the level to its original value. I wrapped these calls in a class implementing IDisposable so you can apply the using statement, like this:

C#
using (AdventureWorks2008Entities model = new AdventureWorks2008Entities())
{
    model.Connection.Open();

    // Set and reset isolation level
    using (TransactionIsolationLevel inner = new TransactionIsolationLevel(model.Connection, IsolationLevel.Snapshot))
    {
        // Your stuff here ...
    }
}

Again, the code is very straightforward. All you need is a constructor, a Dispose method, and a variable to store the original isolation level:

C#
/// <summary>
/// Transaction Isolation Level.
/// </summary>
public class TransactionIsolationLevel : IDisposable
{
    /// <summary>
    /// The database connection.
    /// </summary>
    private IDbConnection connection;

    /// <summary>
    /// Original isolation level of the connection.
    /// </summary>
    private IsolationLevel originalIsolationLevel;

    /// <summary>
    /// Initializes a new instance of the TransactionIsolationLevel class.
/// </summary>     /// <param name="connection">Database connection.</param>     /// <param name="isolationLevel">Required isolation level.</param>     public TransactionIsolationLevel(IDbConnection connection, IsolationLevel isolationLevel)     {         this.connection = connection;         this.originalIsolationLevel = this.connection.GetIsolationLevel();         this.connection.SetIsolationLevel(isolationLevel);     }       /// <summary>     /// Resets the isolation level back to the original value.     /// </summary>     public void Dispose()     {         this.connection.SetIsolationLevel(this.originalIsolationLevel);     } }   Source Code The attached project contains the extension methods, the IDisposable class, and some demo calls against a local AdventureWorks database: Here it is: U2UConsult.SQL.TransactionIsolationLevel.Sample.zip (87,53 kb) Enjoy!

Transactions and Connections in Entity Framework 4.0

This article describes where, why, and how to use TransactionScope in the Entity Framework 4.0. The proposed best practices apply to a medium to large data access layer, e.g. a DAL that is implemented as one or more WCF services with lots of internal calls. Allow me to start with the conclusions: always execute all database access inside a TransactionScope, always explicitly specify an isolation level, and always explicitly open the entities' connection. If you trust me, implement these rules in all your projects. If you don't trust me, continue reading to figure out why.

Why you should open the connection explicitly

So you decided to continue reading. Thanks for the confidence.

Default data adapter behavior

Under the hood of the newest 4.0 Entity Framework, the real work is still done by ye olde ADO.NET 1.* data adapter. Such a data adapter needs a connection and a command to do its work. If the connection is open, the adapter executes the command and leaves the connection open. If the connection is closed, the adapter opens it, executes the command, and then politely closes the connection again. When using the Entity Framework, you can open the connection explicitly as follows:

using (AdventureWorks2008Entities entities = new AdventureWorks2008Entities())
{
    entities.Connection.Open();

    // Your stuff here ...
}

If you don't open the connection explicitly, it will be opened and closed for you. That is convenient, but in a busy, complex data access layer the same physical connection will be opened and closed over and over again. I now hear you saying 'What's wrong with that? The connection is pooled anyway, so there's no overhead in opening and closing.' Well, actually the data adapter's behavior comes with a price, and I'm sure you will not always want to pay that price.

Performance impact

Database connections are not created each time you need one. In most cases a connection is fetched from the connection pool. A pooled connection is always first cleaned up by the .NET data client with a call to the sp_reset_connection system procedure. A complete list of that procedure's actions can be found here. The list includes the following: it resets all error states and numbers (like @@error), it stops all execution contexts (EC) that are child threads of a parent EC executing a parallel query, it waits for any outstanding I/O operations, it frees any buffers held on the server by the connection, it unlocks any buffer resources that are used by the connection, it releases all memory allocated to the connection, it clears any work or temporary tables that are created by the connection, it kills all global cursors owned by the connection, it closes any open SQL-XML handles that are opened by the connection, it deletes any open SQL-XML related work tables, it closes all system tables, it closes all user tables, it drops all temporary objects, it aborts open transactions, it defects from a distributed transaction when enlisted, it decrements the reference count for users in the current database, it frees acquired locks, it resets all SET options to the default values, it resets the @@rowcount value, it resets the @@identity value, it resets any session level trace options using dbcc traceon(), and it fires Audit Login and Audit Logout events. If you don't explicitly open the connection yourself, then every other call will be a call to sp_reset_connection. Please don't start panicking now: SQL Server does all of this extremely fast.
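One small practical note that is not in the original sample: Connection.Open() throws an InvalidOperationException when the connection is already open. If a DAL method can be called both with and without an already opened connection, you may want to guard the call; a minimal sketch:

using (AdventureWorks2008Entities entities = new AdventureWorks2008Entities())
{
    // Only open the connection if nobody opened it before us.
    if (entities.Connection.State != ConnectionState.Open)
    {
        entities.Connection.Open();
    }

    // Your stuff here ...
}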
The demo application that you find at the end of this article clearly shows that even a high number of calls to sp_reset_connection incurs no noticeable performance impact. These numerous calls only disturb your monitoring experience with SQL Profiler. The following images show a profiler session where the same workload is executed with and without explicitly opening the connection (default behavior versus explicit open). Although every other SQL command is a call to sp_reset_connection, there's hardly any impact on the response time. But anyway there is a performance penalty, and you can avoid it.

Distributed transaction escalation

The continuous opening and closing of a (pooled) connection has more than just a small performance impact when you're running inside a TransactionScope. It can cause a local transaction to escalate to a distributed transaction unnecessarily. When a transaction spans more than one resource manager (database or queuing system) or involves too many connections, the .NET client decides to call for help from the Distributed Transaction Coordinator service (MSDTC) in the operating system. When that escalation exactly takes place depends largely on the version of SQL Server you're working against. The more recent versions sustain local mode longer. I only have SQL Server 2008 R2 instances in my network. These instances even allow multiple connections to share the same transaction without escalating to MSDTC. The only way to force an escalation on my machine is disabling connection pooling in the connection string (by adding Pooling=False). Escalating to a distributed transaction takes a lot of resources: different services need to initialize and communicate. Here's a screenshot of the demo application; compare the response time of a regular escalated call to the rest (the very first escalation -when stuff needs to be initialized- takes between 4 and 8 seconds). Of course some of the observed overhead comes from the lack of pooling. Anyway, I believe that the difference in response time is high enough to catch your attention. You can strongly reduce the escalation risk by minimizing the number of physical connections per transaction: explicitly open the connection and keep it open until you commit. Here's again a comparison of profiler traces, with and without explicit opening (default behavior versus explicit open). Opening the connection explicitly brings the response time of the sample workload back to normal.

Why you should use TransactionScope

Local transactions can only escalate to distributed transactions when you're running in a TransactionScope. So you might decide not to use TransactionScope after all. Unfortunately this is not an option. One of the few things that sp_reset_connection doesn't do is resetting the transaction isolation level. I personally consider this a frakking bug. According to Microsoft, the behavior is by design, so this bug will never be solved. If you want tight control over the performance and scalability of your data access, then managing the isolation level is absolutely crucial. There are two ways to control the isolation level in a DAL method: explicitly start a transaction, or set the isolation level on the connection programmatically by issuing a SET TRANSACTION ISOLATION LEVEL statement.

Explicitly starting a transaction

I see three ways to start a transaction: using the SQL statements BEGIN-COMMIT-ROLLBACK, starting a transaction on the connection using .NET code, or using TransactionScope.
The good thing about the first two techniques is that their default isolation level is READ COMMITTED, which is a nice trade-off between performance (minimal locking overhead) and reliability (no dirty reads possible). For this reason, READ COMMITTED is the default isolation level in the ANSI and ISO standards. Of course you could specify another level, if required. I will not elaborate on T-SQL transactions: you should only use these inside stored procedures and batch scripts, never in C# programs. Here's an example of using C# to start a transaction on the connection. Inside the transaction, you always know the isolation level (because you set it, or because there is a reliable default):

using (AdventureWorks2008Entities entities = new AdventureWorks2008Entities(entitiesConnectionstring))
{
    entities.Connection.Open();

    // Always returns 'Read Committed'
    DbTransaction trx = entities.Connection.BeginTransaction();
    this.ErrorLabel.Content = trx.IsolationLevel.ToString();

    // Your stuff here ...

    trx.Commit();
}

In the context of a larger application, these first two techniques are too tightly connected to the database: you need a reference to the physical connection to control the transaction and its isolation level. So these techniques are only applicable inside the data access layer. But the DAL is not the place where we define what 'a logical unit of work', i.e. 'a transaction', is. That's a decision that should be taken in the business layer, and that layer doesn't own the connection. So there are reasons enough to use TransactionScope. Here's how to use TransactionScope. As you can see, the scope requires no access to the physical connection:

// In the Business Layer
using (System.Transactions.TransactionScope scope = new System.Transactions.TransactionScope(TransactionScopeOption.RequiresNew))
{
    // In the Data Access Layer
    using (AdventureWorks2008Entities entities = new AdventureWorks2008Entities(entitiesConnectionstring))
    {
        // Always returns 'Serializable'
        this.ErrorLabel.Content = Transaction.Current.IsolationLevel.ToString();

        // Your stuff here ...
    }
    scope.Complete();
}

For reasons that I don't understand, Microsoft has chosen the second worst default for the TransactionScope isolation level: SERIALIZABLE. [In case you were having doubts: READ UNCOMMITTED would be the worst choice.] This means that if you don't specify a better isolation level yourself, you place a lock on everything you read, and due to lock escalation you'll probably even lock rows that you don't read at all. Needless to say, you'll end up with scalability issues.

Setting the isolation level in T-SQL

When starting a transaction on the connection, or when opening a TransactionScope, you can (and should) specify the appropriate isolation level. There's a third way to control the isolation level on a connection: T-SQL. This is again a low-level technique that requires access to the physical database connection. Here's an example of how to set the isolation level when using the Entity Framework:

using (AdventureWorks2008Entities entities = new AdventureWorks2008Entities())
{
    entities.Connection.Open();
    entities.ExecuteStoreCommand("SET TRANSACTION ISOLATION LEVEL READ COMMITTED");

    // Your stuff here ...
}

The isolation level and its corresponding locking strategy will be applied to all your queries on the connection, whether or not you wrap them in a transaction.
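As mentioned above, you should always specify an isolation level explicitly when you open a TransactionScope, so you don't inherit the SERIALIZABLE default. Here's a minimal sketch of how to do that with the standard System.Transactions API:

// In the Business Layer: an explicit isolation level on the scope.
TransactionOptions options = new TransactionOptions
{
    IsolationLevel = System.Transactions.IsolationLevel.ReadCommitted,
    Timeout = TransactionManager.DefaultTimeout
};

using (TransactionScope scope = new TransactionScope(TransactionScopeOption.RequiresNew, options))
{
    // Your data access calls here ...

    scope.Complete();
}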
If you use the T-SQL technique, you should not forget to reset the isolation level to its original value at the end of your method. Unfortunately there's no straightforward way to determine the current isolation level of a connection. [Keep reading this blog: I'm working on a solution.] If you don't explicitly set the isolation level back to its previous value, the new value will remain active on the connection. Neither .NET nor SQL Server will reset it, not even when your code was called from inside a transaction!

Ignoring the isolation level

If you don't take control of it, you have no idea in which transaction isolation level your queries will be running. After all, you don't know where the connection that you got from the pool has been: it could have come from an SSIS package doing some bulk maintenance (where SERIALIZABLE is the default), or it could have been returned by an application that monitors that same SSIS package (through a READ UNCOMMITTED transaction allowing dirty reads). You simply inherit the last used isolation level on the connection, so you have no idea which type of locks are taken (or worse: ignored) by your queries and for how long these locks will be held. On a busy database, this will definitely lead to random errors, time-outs, and deadlocks.

Demo

To illustrate the arbitrary isolation level, the sample application contains the following code snippet that is executed after the transaction. It executes the T-SQL version of Thread.Sleep, which gives you some time to monitor it from the outside:

using (AdventureWorks2008Entities entities = new AdventureWorks2008Entities(entitiesConnectionstring))
{
    // Script will reveal different values, depending on connection pooling
    entities.ExecuteStoreCommand("WAITFOR DELAY '00:00:10'");
}

While you run the main application, you can monitor the connections with the following SQL script:

SELECT CASE transaction_isolation_level
       WHEN 0 THEN 'Unspecified'
       WHEN 1 THEN 'Read Uncommitted'
       WHEN 2 THEN 'Read Committed'
       WHEN 3 THEN 'Repeatable Read'
       WHEN 4 THEN 'Serializable'
       WHEN 5 THEN 'Snapshot'
       ELSE 'Bazinga'
       END AS [Transaction Isolation Level]
      ,session_id AS [Session Id]
      ,login_time AS [Login Time]
  FROM sys.dm_exec_sessions
 WHERE program_name = '.Net SqlClient Data Provider'

For the record: Bazinga is not an official isolation level. You'll see that the isolation level for the connection is not stable. So you should always specify an isolation level yourself. According to SQL Server Books Online, setting the isolation level requires a solid understanding of transaction processing theory and the semantics of the transaction itself, the concurrency issues involved, and the consequences for system consistency. I couldn't agree more.

How to detect escalation

When using TransactionScope, there's always the risk of escalation to distributed transactions, so you should keep an eye on these. There are a couple of techniques to detect and monitor distributed transactions. Here are three of them:

Component Services

Via Control Panel -> Administrative Tools -> Component Services you reach the management console snap-in that allows you to monitor MSDTC. That same tool also allows you to configure MSDTC and set up logging.
SQL Server Profiler

You can also trace distributed transactions with SQL Server Profiler:

Stop MSDTC and watch the exceptions in your program

Letting the application crash by stopping the MSDTC service is my favorite solution (well, at least in the developer or test environment). The trick only works with newer versions of .NET/SQL Server: in the past, MSDTC had to run even for local transactions, but that bug was fixed. Here's a way to stop MSDTC from C# (must run as administrator):

using (ServiceController sc = new ServiceController("Distributed Transaction Coordinator"))
{
    if (sc.Status == ServiceControllerStatus.Running)
    {
        sc.Stop(); // Admins only !!
    }
}

Here's the result in the test application:

Source code

Here's the full test project: U2UConsult.EF40.TransactionScope.Sample.zip (73,24 kb) [You might want to change the AdventureWorks2008 connection strings in app.config.]

Enjoy!

Optimistic concurrency using a SQL DateTime in Entity Framework 4.0

This article explains how to implement optimistic concurrency checking using a SQL Server DateTime or DateTime2 column. It's a follow-up to my previous article on using a TimeStamp column for that same purpose. In most -if not all- concurrency checking cases it actually makes more sense to use a DateTime column instead of a TimeStamp. The DateTime data types occupy the same storage (8 bytes) as a TimeStamp, or even less: DateTime2 with 'low' precision takes only 6 bytes. On top of that, their content makes sense to the end user. Unfortunately the DateTime data types are 'a little bit' less evident to use for concurrency checking: you need to declare a trigger (or a stored procedure) on the table, and you need to hack the entity model.

A sample table

Sample time! Let's start with creating a table to hold some data.

Table definition

Here's what the table looks like (the solution at the end of the article contains a full T-SQL script). The LastModified column will be used for optimistic concurrency checking:

CREATE TABLE [dbo].[Hero](
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [Name] [nvarchar](50) NOT NULL,
    [Brand] [nvarchar](50) NULL,
    [LastModified] [datetime] NULL,
 CONSTRAINT [PK_Hero] PRIMARY KEY CLUSTERED
(
    [Id] ASC
))

Trigger definition

Unlike an Identity or a TimeStamp value, a DateTime value is not automatically generated and/or updated. So we have to give the database a little help, e.g. by creating a trigger for insert and update on that table:

CREATE TRIGGER [dbo].[trg_iu_Hero] ON [dbo].[Hero]
AFTER INSERT, UPDATE
AS
BEGIN
   SET NOCOUNT ON;

   UPDATE [dbo].[Hero]
      SET LastModified = GETDATE()
    WHERE Id IN (SELECT Id FROM inserted)
END

Alternatively, you could insert through a stored procedure.

A sample application

I already prepared a small sample application for you. Here's what the main window looks like: Cute, isn't it?

The problem

In the entity model, you have to make sure that the LastModified column has the correct settings (Concurrency Mode 'Fixed' and StoreGeneratedPattern 'Computed'). Run the application with just the generated code. You will observe that when you update a record, the entity's LastModified property will NOT be updated. SQL Server Profiler will reveal that only an update statement is issued. The new value of LastModified is assigned by the trigger but NOT fetched.

The solution

In order to let the Entity Framework fetch the new value of the DateTime column -or whatever column is modified by a trigger-, you need to hack the model's XML and manually add the StoreGeneratedPattern="Computed" attribute to the column in the SSDL part of the model. Somewhere in Redmond there will certainly be an architect who can provide an excuse for this behavior. To us developers, this sure smells like a bug. Anyway, if you re-run the application with the modified SSDL, the new DateTime value will appear after insert or update. SQL Server Profiler reveals the extra select statement.

Source Code

Here's the source code, the whole source code, and nothing but the source code: U2UConsult.EF40.DateTimeConcurrency.Sample.zip (616,27 kb)

Enjoy!

Thank you

This article is dedicated to my 3-year old daughter Merel. Last week she briefly turned into a real angel, but then decided to come back. I want to thank from the bottom of my heart everybody who helped save her life: her mama, her mammie, the MUG, and the emergency, reanimation, intensive care, and pediatric departments of the UZA hospital.

Self-Tracking Entities with Validation and Tracking State Change Notification

This article explains how to extend Self-Tracking Entities (STE) from Entity Framework (EF) 4.0 with validation logic and (tracking) state change notification, with just minimal impact on the T4 files. We'll build a two-tier application that submits local changes in a WPF application via a WCF service to a database table. The STE are extended with validation logic that is reusable on client and server. The client is notified when the change tracker of an entity changes its state. The tracking state is displayed to the end user as an icon. Here's the client application in action: For more details on the foundations of building N-Tier apps with EF 4.0, please read Peter Himschoots article. Source Code For the fans of the source-code-first approach, here it is: U2UConsult.SelfTrackingEntities.Sample.zip (622,23 kb) The structure of the solution is as follows: Preparation Database table First you need a SQL Server table. The provided source code contains a script to generate a working copy of the SalesReason table in the AdventureWorks2008 sample database. This is its initial content: Data Access Layer When you have it, it's time to fire up Visual Studio.NET. Create a WCF web service project with an ADO.NET Entity Model. Add the SalesReason2 table to the model (I renamed the entity and entity set to SalesReason and SalesReasons respectively). While you're in the designer, generate the code for the ObjectContext and the Self-Tracking Entities (right click in the designer, select "Add Code Generation Item", select "ADO.NET Self-Tracking Entity Generator"). Add the canonical service methods to fetch the full list of SalesReasons, and to add, delete, and update an individual SalesReason. Here's an example (I personally like to combine Add and Update operations in a Save method): public List<SalesReason> GetSalesReasons() {     using (AdventureWorks2008Entities model = new AdventureWorks2008Entities())     {         List<SalesReason> result = new List<SalesReason>();         result.AddRange(model.SalesReasons);         return result;     } }   public void DeleteSalesReason(SalesReason reason) {     using (AdventureWorks2008Entities model = new AdventureWorks2008Entities())     {         model.SalesReasons.Attach(reason);         model.SalesReasons.DeleteObject(reason);         model.SaveChanges();     } }   public SalesReason SaveSalesReason(SalesReason reason) {     using (AdventureWorks2008Entities model = new AdventureWorks2008Entities())     {         reason.ModifiedDate = DateTime.Now;         if (reason.ChangeTracker.State == ObjectState.Added)         {             model.SalesReasons.AddObject(reason);             model.SaveChanges();             reason.AcceptChanges();             return reason;         }         else if (reason.ChangeTracker.State == ObjectState.Modified)         {             model.SalesReasons.ApplyChanges(reason);             model.SaveChanges();             return reason;         }         else         {             return null; // or an exception         }     } }   Self Tracking Entities Add a new class library to the project, call it STE. Drag the Model.tt T4 template from the DAL to the STE project. Add a reference to serialization in the STE project. Add a reference to the STE in the DAL project. Everything should compile again now. WPF Client Add a WPF application to the solution. In this client project, add a reference to the STE, and a service reference to the DAL. 
Add a ListBox and some buttons, with straightforward code behind: private void RefreshSalesReasons() {     this.salesReasons = this.GetSalesReasons();       this.SalesReasonsListBox.ItemsSource = this.salesReasons; }   private ObservableCollection<SalesReason> GetSalesReasons() {     using (DAL.SalesReasonServiceClient client = new DAL.SalesReasonServiceClient())     {         ObservableCollection<SalesReason> result = new ObservableCollection<SalesReason>();         foreach (var item in client.GetSalesReasons())         {             result.Add(item);         }           return result;     } }   private void Update_Click(object sender, RoutedEventArgs e) {     SalesReason reason = this.SalesReasonsListBox.SelectedItem as SalesReason;     if (reason != null)     {         reason.Name += " (updated)";     } }   private void Insert_Click(object sender, RoutedEventArgs e) {     SalesReason reason = new SalesReason()     {         Name = "Inserted Reason",         ReasonType = "Promotion"     };     reason.MarkAsAdded();       this.salesReasons.Add(reason);       this.SalesReasonsListBox.ScrollIntoView(reason); }   private void Delete_Click(object sender, RoutedEventArgs e) {     SalesReason reason = this.SalesReasonsListBox.SelectedItem as SalesReason;     if (reason != null)     {         reason.MarkAsDeleted();     } }   private void Commit_Click(object sender, RoutedEventArgs e) {     using (DAL.SalesReasonServiceClient client = new DAL.SalesReasonServiceClient())     {         foreach (var item in this.salesReasons)         {             switch (item.ChangeTracker.State)             {                 case ObjectState.Unchanged:                     break;                 case ObjectState.Added:                     client.SaveSalesReason(item);                     break;                 case ObjectState.Modified:                     client.SaveSalesReason(item);                     break;                 case ObjectState.Deleted:                     client.DeleteSalesReason(item);                     break;                 default:                     break;             }         }           this.RefreshSalesReasons();     } } Now you're ready to extend the STE with some extra functionality. Validation It's nice to have some business rules that may be checked on the client (to provide immediate feedback to the user) as well as on the server (to prevent corrupt data in the database). This can be accomplished by letting the self-tracking entities implement the IDataErrorInfo interface. This interface just contains an indexer (this[]) to validate an individual property, and an Error property that returns the validation state of the whole instance. Letting the STE implement this interface can be easily done by adding a partial class file. 
The following example lets the entity complain if its name gets shorter than 5 characters: public partial class SalesReason : IDataErrorInfo {     public string Error     {         get         {             return this["Name"];         }     }       public string this[string columnName]     {         get         {             if (columnName == "Name")             {                 if (string.IsNullOrWhiteSpace(this.Name) || this.Name.Length < 5)                 {                     return "Name should have at least 5 characters.";                 }             }               return string.Empty;         }     } } If you add a data template to the XAML with ValidatesOnDataErrors=true in the binding, then the GUI will respond immediately if a business rule is broken. XAML: <TextBox    Width="180"    Margin="0 0 10 0">     <Binding        Path="Name"        Mode="TwoWay"        UpdateSourceTrigger="PropertyChanged"        NotifyOnSourceUpdated="True"        NotifyOnTargetUpdated="True"        ValidatesOnDataErrors="True"        ValidatesOnExceptions="True"/> </TextBox> Result: The same rule can also be checked on the server side, to prevent persisting invalid data in the underlying table: public SalesReason SaveSalesReason(SalesReason reason) {     if (!string.IsNullOrEmpty(reason.Error))     {         return null; // or an exception     }       ...   Notification of Tracking State Change By default, an STE's tracking state can be fetched by instance.ChangeTracker.State. This is NOT a dependency property, and its setter doesn't call PropertyChanged. Clients can hook an event handler to the ObjectStateChanging event that is raised just before the state changes (there is no ObjectStateChanged event out of the box). You're free to register even handlers in your client, but then you need to continuously keep track of which change tracker belongs to which entity: assignment, lazy loading, and (de)serialization will make this a cumbersome and error prone endeavour. To me, it seems more logical that an entity would expose its state as a direct property, with change notification through INotifyPropertyChanged. This can be achieved -again- by adding a partial class file: public partial class SalesReason {     private string trackingState;       [DataMember]     public string TrackingState     {         get         {             return this.trackingState;         }           set         {             if (this.trackingState != value)             {                 this.trackingState = value;                 this.OnTrackingStateChanged();             }         }     }       partial void SetTrackingState(string newTrackingState)     {         this.TrackingState = newTrackingState;     }       protected virtual void OnTrackingStateChanged()     {         if (_propertyChanged != null)         {             _propertyChanged(this, new PropertyChangedEventArgs("TrackingState"));         }     } } The only thing you need to do now, is to make sure that the SetTrackingState method is called at the right moment. The end of the HandleObjectStateChanging looks like a nice candidate. Unfortunately this requires a modification of the code that was generated by the T4 template. For performance reasons I used a partial method for this. 
This is the extract from the SalesReason.cs file:

// This is a new definition
partial void SetTrackingState(string trackingState);
//

private void HandleObjectStateChanging(object sender, ObjectStateChangingEventArgs e)
{
    //
    this.SetTrackingState(e.NewState.ToString()); // This is a new line
    //

    if (e.NewState == ObjectState.Deleted)
    {
        ClearNavigationProperties();
    }
}

Just modifying the generated code is probably not good enough: if later on you need to update your STE (e.g. after adding a column to the underlying table), the modifications will get overwritten again. So you might want to modify the source code of the T4 template itself (search for the HandleObjectStateChanging method and adapt the source code there). Fortunately this is a no-brainer: most of the T4 template is just like a C# source code file, but without IntelliSense. The rest of the file looks more like classic ASP - ugh. Anyway, you end up with a TrackingState property to which you can bind user interface elements, with or without a converter in between. In the sample application I bound an image to the tracking state:

<Image
   Source="{Binding
       Path=TrackingState,
       Converter={StaticResource ImageConverter}}"
   Width="24" Height="24" />

Here's what it looks like: In general, I think there are not enough partial methods defined and called in the Entity Framework T4 templates. To be frank: 'not enough' is an understatement: I didn't find a single partial method.
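The ImageConverter that the binding above refers to is not shown in the article. A minimal sketch could look like this; the icon file names and their location are just assumptions, the only point is to map each tracking state string to an image:

public class ImageConverter : IValueConverter
{
    public object Convert(object value, Type targetType, object parameter, System.Globalization.CultureInfo culture)
    {
        // The bound value is the TrackingState string: Unchanged, Added, Modified, or Deleted.
        string state = value as string ?? "Unchanged";

        // Assumes icons like Images/Added.png are included in the client project.
        return new BitmapImage(new Uri(string.Format("Images/{0}.png", state), UriKind.Relative));
    }

    public object ConvertBack(object value, Type targetType, object parameter, System.Globalization.CultureInfo culture)
    {
        throw new NotImplementedException();
    }
}

Register it as a resource (e.g. an ImageConverter instance with x:Key="ImageConverter" in the Window resources) so the StaticResource lookup in the data template can find it.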

Optimistic concurrency using a SQL Timestamp in Entity Framework 4.0

This article explains how to implement optimistic concurrency checking in the Entity Framework 4.0, using a SQL Server Timestamp column. But you could have derived that from its title.

What is a Timestamp?

Despite its name, the SQL Server Timestamp data type has nothing to do with time. DateTime2 on the other hand, is DateTime too [sorry, I couldn't resist that]. A timestamp is just an eight-byte binary number that indicates the relative sequence in which data modifications took place in a database. The value in a column of the type Timestamp is always provided by SQL Server: it is calculated when a row is inserted, and augmented with each update to the row, to an ever-increasing value that is unique in the whole database. The Timestamp data type was initially conceived for assisting in recovery operations, and you can also use it to synchronize distributed databases: based on the timestamp you can detect the order in which data was added or updated, and replay a sequence of modifications. But the most common usage of timestamp is optimistic concurrency checking: when updating a row you can compare its current timestamp with the one you fetched originally. If the values are different, you know that someone else updated the row behind your back. And you know this without holding any locks on the server while you were busy, so it's a very scalable solution. This type of concurrency checking is called 'optimistic': you assume that in most cases you will be able to successfully update without the need for conflict resolution.

A sample of the geeky kind

Let's create a table with such a column (the scripts folder in the provided solution at the end of this article contains a full script):

CREATE TABLE [dbo].[FerengiRule](
    [ID] [int] NOT NULL,
    [Text] [nvarchar](100) NOT NULL,
    [Source] [nvarchar](100) NOT NULL,
    [Timestamp] [timestamp] NOT NULL,
 CONSTRAINT [PK_FerengiRule] PRIMARY KEY CLUSTERED
(
    [ID] ASC
))

Populate it with your favorite 'Ferengi Rules of Acquisition'. You find all of these here. In a WPF solution, create an entity model, and add the table to it. You see that the timestamp column has Fixed as value for the Concurrency Mode property, so the column will appear in the WHERE-clause of any update or delete query, and Computed as value for the StoreGeneratedPattern property, so a new value is expected from the server after insert or update. Next, build an application with a fancy transparent startup screen that allows you to open multiple edit windows on the same data. The startup screen could look like this: The main window of the application contains just an editable grid on the table's contents. It allows you to set the concurrency resolution mode, upload local modifications to the database, get the current server data, and last but not least restore the table to its original contents (that's an extremely useful feature in this type of application). Here's what the window looks like:

Visualizing a Timestamp value

Just like GUIDs and technical keys, you should avoid showing timestamp values on the user interface. This demo is an exception to this general rule, so I built a TimestampToDouble converter to translate the eight-byte binary number to something more readable.
I don't guarantee a readable output on an old active database where the timestamp value is very high, but it works fine for a demo on a fresh database: public class SqlTimestampToDoubleConverter: IValueConverter {     public object Convert(object value, Type targetType, object parameter, System.Globalization.CultureInfo culture)     {         if (value == null)         {             return null;         }           byte[] bytes = value as byte[];         double result = 0;         double inter = 0;         for (int i = 0; i < bytes.Length; i++)         {             inter = System.Convert.ToDouble(bytes[i]);             inter = inter * Math.Pow(2, ((7 - i) * 8));             result += inter;         }         return result;     }       public object ConvertBack(object value, Type targetType, object parameter, System.Globalization.CultureInfo culture)     {         throw new NotImplementedException();     } }   Concurrency Generated SQL Queries If you run your SQL Profiler while modifying data through via the main window, you'll see that the Entity Framework query's WHERE-clause contains the timestamp, and that after the update the new value is fetched for you: If there's no row to be updated, then someone else must have modified (or deleted) it. You can test that very easily with the sample application by opening multiple edit windows and playing around with the data. Client side conflict resolution When the entity framework discovers a concurrency violation when saving your changes, it appropriately throws an OptimisticConcurrencyException. It's then time for you to solve the conflict. In most cases, that means fetching the current values, sending these to the GUI, and let the end user decide what should happen. The data access layer code for inserts and updates will look like this: foreach (var rule in this.rules) {     try     {         switch (rule.ChangeTracker.State)         {             case ObjectState.Added:                 entities.FerengiRules.AddObject(rule);                 entities.SaveChanges();                 break;             case ObjectState.Modified:                 entities.FerengiRules.ApplyChanges(rule);                 entities.SaveChanges();                 break;             default:                 break;         }     }     catch (OptimisticConcurrencyException)     {         // Return Current Values         // ...     } }   Server side conflict resolution The Entity Framework also provides automated conflict resolution strategies that might be useful in some scenarios (although to be honest: I can't think of any). There's a Refresh method that you can use to decide whether it's the client version or the server (store) version that should be persisted when there's a conflict. Here's how the catch-block could look like: catch (OptimisticConcurrencyException) {     switch (this.conflictResolution)     {         case ConflictResolution.Default:             throw;         case ConflictResolution.ServerWins:             entities.Refresh(RefreshMode.StoreWins, rule);             entities.SaveChanges();             break;         case ConflictResolution.ClientWins:             entities.Refresh(RefreshMode.ClientWins, rule);             entities.SaveChanges();             break;         default:             break;     } }   Source Code Oops ... almost forgot: here's the source code of the whole thing: U2UConsult.EF40.OptimisticConcurrency.Sample.zip (470,96 kb) Enjoy!

WCF Data Services 4.0 in less than 5 minutes

WCF Data Services 4.0 (formerly known as ADO.NET Data Services, formerly known as Astoria) is one of the ways to expose an Entity Data Model from Entity Framework 4.0 in a RESTful / OData way. This article explains how to create such a data service and how to consume it with a browser and with a WPF client.

The Data Service

Start with an empty ASP.NET Application:

Add a WCF Data Service to it:

Also add an Entity Data Model to the ASP.NET project:

Follow the Model Wizard to create a model containing entities on top of the Employee and Person tables from the AdventureWorks2008 database:

In the designer, you should have something like this:

A lot of code was generated; let's add our own 50 cents in the service's code-behind. First let it inherit from DataService<AdventureWorks2008Entities>:

public class WcfDataService : DataService<AdventureWorks2008Entities>
{
    ...
}

Then modify the InitializeService method as follows. This exposes all operations and grants all access rights (not really a production setting):

public static void InitializeService(DataServiceConfiguration config)
{
    config.SetEntitySetAccessRule("*", EntitySetRights.All);
    config.SetServiceOperationAccessRule("*", ServiceOperationRights.All);
    config.DataServiceBehavior.MaxProtocolVersion = DataServiceProtocolVersion.V2;
}

Believe it or not, we're done (for the first part): the entity model is now exposed in a RESTful way. At the root URL you get an overview of the exposed entities. In the attached sample the root URL is "http://localhost:1544/WcfDataService.svc", but you may of course end up with another port number:

At the "/Employees" address you find all employees:

In your browser this list of employees may appear like this:

This means it's time to -at least temporarily- disable your RSS feed reading view. Here's how to do this in IE:

To reach an individual entity, just type its primary key value in parentheses at the end of the URL, like "http://localhost:1544/WcfDataService.svc/Employees(1)":

You can navigate via the relationships between entities. This is how to reach the Person entity connected to the first Employee. The URL is "http://localhost:1544/WcfDataService.svc/Employees(1)/Person":

Other OData URI options can be found here, including:

Filtering: http://localhost:1544/WcfDataService.svc/Employees?$filter=JobTitle eq 'Chief Executive Officer'
Projection: http://localhost:1544/WcfDataService.svc/Employees?$select=JobTitle,Gender
Client-side paging: http://localhost:1544/WcfDataService.svc/Employees?$skip=5&$top=2

Version 4.0 also includes support for server-side paging, which gives you some control over server resources. Add the following line to the InitializeService method:

config.SetEntitySetPageSize("Employees", 3);

Only 3 employees will be returned now, even if the client requested all:
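These URIs are just as easy to exercise from code as from a browser. Here's a minimal sketch -not part of the original sample- that downloads one of the feeds with a plain WebClient; the address and port number are the ones assumed throughout this article, so adjust them to whatever Visual Studio assigned to you:

using System;
using System.Net;

class ODataSmokeTest
{
    static void Main()
    {
        // A plain HTTP GET against the service; the Atom XML comes back as a string.
        using (WebClient client = new WebClient())
        {
            string atom = client.DownloadString(
                "http://localhost:1544/WcfDataService.svc/Employees?$skip=5&$top=2");
            Console.WriteLine(atom);
        }
    }
}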
A Client

Enough XML for now. WCF Data Services also expose a client-side model that allows you to use LINQ. Create a new WPF application:

Add a Service Reference to the WCF Data Service:

Decorate the window with two buttons and a ListBox. It should look more or less like this:

The ListBox will display Employee entities through a data template (OK, that's XML again):

<ListBox
   Name="employeesListBox"
   ItemTemplate="{StaticResource EmployeeTemplate}"
   Margin="4"
   Grid.Row="1"/>

Here's the template. It not only binds to Employee properties, but also to Person attributes:

<DataTemplate
   x:Key="EmployeeTemplate">
    <StackPanel>
        <StackPanel Orientation="Horizontal">
            <TextBlock
               Text="{Binding Path=Person.FirstName}"
               FontWeight="Bold"
               Padding="0 0 2 0"/>
            <TextBlock
               Text="{Binding Path=Person.LastName}"
               FontWeight="Bold"/>
        </StackPanel>
        <StackPanel Orientation="Horizontal">
            <TextBlock
               Text="{Binding Path=JobTitle}"
               Width="180"/>
            <TextBlock
               Text="{Binding Path=VacationHours}"
               Width="60"
               TextAlignment="Right" />
            <TextBlock
               Text=" vacation hours taken." />
        </StackPanel>
    </StackPanel>
</DataTemplate>

The Populate button fetches some Employee entities together with their related Person entities, and binds the collection to the ListBox (in version 4.0, two-way bindings are supported for WPF):

private void Populate_Click(object sender, RoutedEventArgs e)
{
    AdventureWorks2008Entities svc =
        new AdventureWorks2008Entities(
            new Uri("http://localhost:1544/WcfDataService.svc"));

    this.employeesListBox.ItemsSource =
        svc.Employees.Expand("Person").Where(emp => emp.BusinessEntityID < 100);
}

Here's the result:

The Update button updates the number of vacation hours of the company's CEO. It fetches the Employee, updates its VacationHours property, tells the state manager to mark the entity as modified, and finally persists the change:

private void Update_Click(object sender, RoutedEventArgs e)
{
    AdventureWorks2008Entities svc =
        new AdventureWorks2008Entities(
            new Uri("http://localhost:1544/WcfDataService.svc"));

    Employee employee =
        svc.Employees.Where(emp => emp.BusinessEntityID == 1).First();

    employee.VacationHours++;

    svc.UpdateObject(employee);
    svc.SaveChanges();
}

If you now repopulate the ListBox, you will see the increased value:

Source Code

Here's the full source code of this sample (just requires VS2010 with no extra downloads): U2UConsult.WcfDataServices.Sample.zip (96,59 kb)

Enjoy!