Transactions and Connections in Entity Framework 4.0

29 June 2010 | Diederik-Krols | Entity Framework 4.0, SQL Server

This article describes where, why, and how to use TransactionScope in Entity Framework 4.0. The proposed best practices apply to a medium to large data access layer, e.g. a DAL that is implemented as one or more WCF services with lots of internal calls. Allow me to start with the conclusions: always execute all database access inside a TransactionScope, always explicitly specify an isolation level, and always explicitly open the entities' connection. If you trust me, implement these rules in all your projects. If you don't trust me, continue reading to figure out why.

Why you should open the connection explicitly

So you decided to continue reading. Thanks for the confidence.

Default data adapter behavior

Under the hood of the newest 4.0 Entity Framework, the real work is still done by ye olde ADO.NET 1.* data adapter. Such a data adapter needs a connection and a command to do its work. If the connection is open, the adapter executes the command and leaves the connection open. If the connection is closed, the adapter opens it, executes the command, and then politely closes the connection again. When using the Entity Framework, you can open the connection explicitly as follows:

using (AdventureWorks2008Entities entities = new AdventureWorks2008Entities())
{
    entities.Connection.Open();

    // Your stuff here ...
}

If you don't open the connection explicitly, it will be opened and closed for you. That is convenient, but in a busy, complex data access layer the same physical connection will be opened and closed over and over again. I hear you saying: 'What's wrong with that? The connection is pooled anyway, so there's no overhead in opening and closing.' Well, actually the data adapter's behavior comes with a price, and I'm sure you will not always want to pay that price.

Performance impact

Database connections are not created each time you need one. In most cases a connection is fetched from the connection pool. A pooled connection is always first cleaned up by the .NET data client with a call to the sp_reset_connection system procedure. A complete list of that procedure's actions can be found here. The list includes the following:

- it resets all error states and numbers (like @@error),
- it stops all execution contexts (EC) that are child threads of a parent EC executing a parallel query,
- it waits for any outstanding I/O operations,
- it frees any held buffers on the server by the connection,
- it unlocks any buffer resources that are used by the connection,
- it releases all memory allocated to the connection,
- it clears any work or temporary tables that are created by the connection,
- it kills all global cursors owned by the connection,
- it closes any open SQL-XML handles that are opened by the connection,
- it deletes any open SQL-XML related work tables,
- it closes all system tables,
- it closes all user tables,
- it drops all temporary objects,
- it aborts open transactions,
- it defects from a distributed transaction when enlisted,
- it decrements the reference count for users in the current database,
- it frees acquired locks,
- it resets all SET options to the default values,
- it resets the @@rowcount value,
- it resets the @@identity value,
- it resets any session level trace options using dbcc traceon(), and
- it fires Audit Login and Audit Logout events.

If you don't explicitly open the connection yourself, then every other call will be a call to sp_reset_connection.
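To make that concrete, here's a minimal sketch of the default behavior (the Products and Customers entity sets are placeholders for whatever your model exposes):

using (AdventureWorks2008Entities entities = new AdventureWorks2008Entities())
{
    // No explicit Open(): each query checks a connection out of the
    // pool (triggering sp_reset_connection) and returns it afterwards.
    var products = entities.Products.Take(10).ToList();
    var customers = entities.Customers.Take(10).ToList();

    // With entities.Connection.Open() as the first statement, both
    // queries would run on one physical open, and the cleanup would
    // happen only once: at the connection's next checkout.
}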
Please don't start panicking now: SQL Server does all of this extremely fast. The demo application that you can find at the end of this article clearly shows that even a high number of calls to the procedure incurs no noticeable performance impact. These numerous calls only disturb your monitoring experience with SQL Profiler. The following images show a profiler session where the same workload is executed with and without explicitly opening the connection:

[Screenshots: default behavior vs. explicit open]

Although every other SQL command is a call to sp_reset_connection, there's hardly any impact on the response time. But there is a performance penalty nonetheless, and you can avoid it.

Distributed transaction escalation

The continuous opening and closing of a (pooled) connection has more than just a small performance impact when you're running inside a TransactionScope: it can cause a local transaction to escalate to a distributed transaction unnecessarily. When a transaction spans more than one resource manager (database or queuing system) or involves too many connections, the .NET client calls in the help of the Distributed Transaction Coordinator service (MSDTC) of the operating system. Exactly when that escalation takes place depends largely on the version of SQL Server you're working against: the more recent versions sustain local mode longer. I only have SQL Server 2008 R2 instances in my network. These instances even allow multiple connections to share the same transaction without escalating to MSDTC. The only way to force an escalation on my machine is disabling connection pooling in the connection string (by adding Pooling=False).

Escalating to a distributed transaction takes a lot of resources: different services need to initialize and communicate. Here's a screenshot of the demo application; compare the response time of a regular escalated call to the rest (the very first escalation, when everything still needs to be initialized, takes between 4 and 8 seconds):

Of course some of the observed overhead comes from the lack of pooling. Still, I believe the difference in response time is high enough to catch your attention. You can strongly reduce the escalation risk by minimizing the number of physical connections per transaction, i.e. by explicitly opening the connection and keeping it open until you commit. Here's again a comparison of profiler traces, with and without explicit opening:

[Screenshots: default behavior vs. explicit open]

Opening the connection explicitly brings the response time of the sample workload back to normal:

Why you should use TransactionScope

Local transactions can only escalate to distributed transactions when you're running in a TransactionScope, so you might decide not to use TransactionScope after all. Unfortunately this is not an option. One of the few things that sp_reset_connection doesn't do is reset the transaction isolation level. I personally consider this a frakking bug. According to Microsoft, the behavior is by design, so this bug will never be solved. If you want tight control over the performance and scalability of your data access, then managing the isolation level is absolutely crucial. There are two ways to control the isolation level in a DAL method: explicitly start a transaction, or set the isolation level on the connection programmatically by issuing a SET TRANSACTION ISOLATION LEVEL statement.
Explicitly starting a transaction

I see three ways to start a transaction:

- using the T-SQL statements BEGIN-COMMIT-ROLLBACK,
- starting a transaction on the connection using .NET code, or
- using TransactionScope.

The good thing about the first two techniques is that their default isolation level is READ COMMITTED, which is a nice trade-off between performance (minimal locking overhead) and reliability (no dirty reads possible). For this reason, READ COMMITTED is the default isolation level in the ANSI and ISO standards. Of course you could specify another level, if required. I will not elaborate on T-SQL transactions: you should only use these inside stored procedures and batch scripts, never in C# programs. Here's an example of using C# to start a transaction on the connection. Inside the transaction, you always know the isolation level (because you set it, or because there is a reliable default):

using (AdventureWorks2008Entities entities = new AdventureWorks2008Entities(entitiesConnectionstring))
{
    entities.Connection.Open();
    DbTransaction trx = entities.Connection.BeginTransaction();

    // Always returns 'Read Committed'
    this.ErrorLabel.Content = trx.IsolationLevel.ToString();

    // Your stuff here ...

    trx.Commit();
}

In the context of a larger application, these first two techniques are too tightly coupled to the database: you need a reference to the physical connection to control the transaction and its isolation level. So these techniques are only applicable inside the data access layer. But the DAL is not the place where we define what 'a logical unit of work', i.e. 'a transaction', is. That's a decision that should be taken in the business layer, and that layer doesn't own the connection. So there are reasons enough to use TransactionScope. Here's how to use it; as you can see, the scope requires no access to the physical connection:

// In the Business Layer
using (System.Transactions.TransactionScope scope =
       new System.Transactions.TransactionScope(TransactionScopeOption.RequiresNew))
{
    // In the Data Access Layer
    using (AdventureWorks2008Entities entities = new AdventureWorks2008Entities(entitiesConnectionstring))
    {
        // Always returns 'Serializable'
        this.ErrorLabel.Content = Transaction.Current.IsolationLevel.ToString();

        // Your stuff here ...
    }

    scope.Complete();
}

For reasons that I don't understand, Microsoft has chosen the second worst default isolation level for TransactionScope: SERIALIZABLE. [In case you were having doubts: READ UNCOMMITTED would be the worst choice.] This means that if you don't specify a better isolation level yourself, you place a lock on everything you read, and due to lock escalation you'll probably even lock rows that you don't read at all. Needless to say, you'll end up with scalability issues.
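Fortunately, a better level is easy to specify through TransactionOptions. Here's a minimal sketch that combines the three rules from the introduction: a TransactionScope, an explicit isolation level, and an explicitly opened connection:

// In the Business Layer
System.Transactions.TransactionOptions options = new System.Transactions.TransactionOptions()
{
    IsolationLevel = System.Transactions.IsolationLevel.ReadCommitted
};

using (System.Transactions.TransactionScope scope =
       new System.Transactions.TransactionScope(TransactionScopeOption.RequiresNew, options))
{
    // In the Data Access Layer
    using (AdventureWorks2008Entities entities = new AdventureWorks2008Entities(entitiesConnectionstring))
    {
        entities.Connection.Open();

        // Now returns 'ReadCommitted'
        this.ErrorLabel.Content = Transaction.Current.IsolationLevel.ToString();

        // Your stuff here ...
    }

    scope.Complete();
}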
Setting the isolation level in T-SQL

When starting a transaction on the connection, or when opening a TransactionScope, you can (and should) specify the appropriate isolation level. There's a third way to control the isolation level on a connection: T-SQL. This is again a low-level technique that requires access to the physical database connection. Here's an example of how to set the isolation level when using Entity Framework:

using (AdventureWorks2008Entities entities = new AdventureWorks2008Entities())
{
    entities.Connection.Open();
    entities.ExecuteStoreCommand("SET TRANSACTION ISOLATION LEVEL READ COMMITTED");

    // Your stuff here ...
}

The isolation level and its corresponding locking strategy will be applied to all your queries on the connection, whether or not you wrap them in a transaction. If you use this technique, you should not forget to reset the isolation level to its original value at the end of your method. Unfortunately there's no straightforward way to determine the current isolation level for a connection. [Keep reading this blog: I'm working on a solution.] If you don't explicitly set the isolation level back to its previous value, the new value will remain active on the connection. Neither .NET nor SQL Server will reset it, not even when your code was called from inside a transaction!

Ignoring the isolation level

If you don't take control of it, you have no idea at which transaction isolation level your queries will be running. After all, you don't know where the connection that you got from the pool has been: it could have come from an SSIS package doing some bulk maintenance (where SERIALIZABLE is the default), or it could have been returned by an application that monitors that same SSIS package (through a READ UNCOMMITTED transaction allowing dirty reads). You simply inherit the last used isolation level on the connection, so you have no idea which type of locks are taken (or worse: ignored) by your queries, and for how long those locks will be held. On a busy database, this will definitely lead to random errors, time-outs, and deadlocks.

Demo

To illustrate the arbitrary isolation level, the sample application contains the following code snippet, which is executed after the transaction. It executes the T-SQL version of Thread.Sleep, which gives you some time to monitor it from the outside:

using (AdventureWorks2008Entities entities = new AdventureWorks2008Entities(entitiesConnectionstring))
{
    // Script will reveal different values, depending on connection pooling
    entities.ExecuteStoreCommand("WAITFOR DELAY '00:00:10'");
}

While you run the main application, you can monitor the connections with the following SQL script:

SELECT CASE transaction_isolation_level
            WHEN 0 THEN 'Unspecified'
            WHEN 1 THEN 'Read Uncommitted'
            WHEN 2 THEN 'Read Committed'
            WHEN 3 THEN 'Repeatable Read'
            WHEN 4 THEN 'Serializable'
            WHEN 5 THEN 'Snapshot'
            ELSE 'Bazinga'
       END AS [Transaction Isolation Level]
      ,session_id AS [Session Id]
      ,login_time AS [Login Time]
  FROM sys.dm_exec_sessions
 WHERE program_name = '.Net SqlClient Data Provider'

For the record: Bazinga is not an official isolation level. You'll see that the isolation level for the connection is not stable, so you should always specify an isolation level yourself. According to SQL Server Books Online, setting the isolation level requires a solid understanding of transaction processing theory and the semantics of the transaction itself, the concurrency issues involved, and the consequences for system consistency. I couldn't agree more.
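By the way: you can query that same DMV from inside the DAL to inspect your own session, since a login sees at least its own row in sys.dm_exec_sessions even without VIEW SERVER STATE permission. A minimal sketch using EF4's ExecuteStoreQuery (mapping the smallint column to a short):

using (AdventureWorks2008Entities entities = new AdventureWorks2008Entities())
{
    entities.Connection.Open();

    // 0 = Unspecified, 1 = Read Uncommitted, 2 = Read Committed,
    // 3 = Repeatable Read, 4 = Serializable, 5 = Snapshot
    short isolationLevel = entities.ExecuteStoreQuery<short>(
        "SELECT transaction_isolation_level FROM sys.dm_exec_sessions WHERE session_id = @@SPID")
        .First();
}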
How to detect escalation

When using TransactionScope, there's always the risk of escalation to distributed transactions, so you should keep an eye on these. There are a couple of techniques to detect and monitor distributed transactions. Here are three of them:

Component Services

Via Control Panel -> Administrative Tools -> Component Services you reach the management console snap-in that allows you to monitor MSDTC. That same tool also allows you to configure MSDTC and set up logging.

SQL Server Profiler

You can also trace distributed transactions with SQL Server Profiler.

Stop MSDTC and watch the exceptions in your program

Letting the application crash by stopping the MSDTC service is my favorite solution (well, at least in the developer or test environment). The trick only works with newer versions of .NET/SQL Server: in the past, MSDTC had to run even for local transactions, but that bug was fixed. Here's a way to stop MSDTC from C# (must run as administrator):

using (ServiceController sc = new ServiceController("Distributed Transaction Coordinator"))
{
    if (sc.Status == ServiceControllerStatus.Running)
    {
        sc.Stop(); // Admins only !!
    }
}

Here's the result in the test application:

Source code

Here's the full test project: U2UConsult.EF40.TransactionScope.Sample.zip (73,24 kb) [You might want to change the AdventureWorks2008 connection strings in app.config.]

Enjoy!
Optimistic concurrency using a SQL DateTime in Entity Framework 4.0

31 May 2010 | Diederik-Krols | .NET 4.0, Entity Framework 4.0, SQL Server, WPF

This article explains how to implement optimistic concurrency checking using a SQL Server DateTime or DateTime2 column. It's a follow-up to my previous article on using a TimeStamp column for that same purpose. In most -if not all- concurrency checking cases it actually makes more sense to use a DateTime column instead of a TimeStamp. The DateTime data types occupy the same storage (8 bytes) as a TimeStamp, or even less: DateTime2 with 'low' precision takes only 6 bytes. On top of that, their content makes sense to the end user. Unfortunately the DateTime data types are 'a little bit' less evident to use for concurrency checking: you need to declare a trigger (or a stored procedure) on the table, and you need to hack the entity model.

A sample table

Sample time! Let's start with creating a table to hold some data.

Table definition

Here's what the table looks like (the solution at the end of the article contains a full T-SQL script). The LastModified column will be used for optimistic concurrency checking:

CREATE TABLE [dbo].[Hero](
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [Name] [nvarchar](50) NOT NULL,
    [Brand] [nvarchar](50) NULL,
    [LastModified] [datetime] NULL,
    CONSTRAINT [PK_Hero] PRIMARY KEY CLUSTERED ([Id] ASC))

Trigger definition

Unlike an Identity or a TimeStamp value, a DateTime value is not automatically generated and/or updated. So we have to give the database a little help, e.g. by creating a trigger for insert and update on that table:

CREATE TRIGGER [dbo].[trg_iu_Hero]
ON [dbo].[Hero]
AFTER INSERT, UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    UPDATE [dbo].[Hero]
    SET LastModified = GETDATE()
    WHERE Id IN (SELECT Id FROM inserted)
END

Alternatively, you could insert through a stored procedure.

A sample application

I already prepared a small sample application for you. Here's what the main window looks like:

Cute, isn't it?

The problem

In the entity model, you have to make sure that the LastModified column has the correct settings (Concurrency Mode 'Fixed' and StoreGeneratedPattern 'Computed'). Run the application with just the generated code. You will observe that when you update a record, the entity's LastModified property will NOT be updated. SQL Server Profiler will reveal that only an update statement is issued. The new value of LastModified is assigned by the trigger but NOT fetched:

The solution

In order to let the Entity Framework fetch the new value of the DateTime column -or whatever column that is modified by a trigger-, you need to hack the model's XML and manually add the StoreGeneratedPattern="Computed" attribute to the LastModified property in the SSDL. Somewhere in Redmond there will certainly be an architect who will provide an excuse for this behavior. To us developers, this sure smells like a bug. Anyway, if you re-run the application with the modified SSDL, the new DateTime value will appear after insert or update. SQL Server Profiler reveals the extra select statement:

Source code

Here's the source code, the whole source code, and nothing but the source code: U2UConsult.EF40.DateTimeConcurrency.Sample.zip (616,27 kb)

Enjoy!

Thank you

This article is dedicated to my 3-year-old daughter Merel. Last week she briefly turned into a real angel, but then decided to come back. I want to thank from the bottom of my heart everybody who helped save her life: her mama, her mammie, the MUG, and the emergency, resuscitation, intensive care, and pediatric departments of the UZA hospital.
The Missing Linq to SQL Spatial

03 January 2010 | Diederik-Krols | SQL Server, Spatial

This article provides hints and hacks on how to use the SQL Server spatial data types -Geography and Geometry- in Linq to SQL. The data provider has a problem with serializing and deserializing the SQL UDTs. If you ever tried to use these data types in a Linq to SQL (or Entity Framework) project, then you certainly encountered the following error: "One or more selected items contain a data type that is not supported by the designer".

Setting up a test environment

I created a small database called 'Spatial' with a schema called 'Europe', and a table called 'Countries'. Here's the structure of the table:

CREATE TABLE [Europe].[Countries](
    [CountryID] [int] IDENTITY(1,1) NOT NULL,
    [CountryName] [nvarchar](50) NOT NULL,
    [Shape] [geography] NOT NULL,
    CONSTRAINT [PK_Countries] PRIMARY KEY CLUSTERED ([CountryID] ASC))

The source code that accompanies this article contains the scripts to create the assets and populate the table. Here's an indication of the table contents:

I created a Visual Studio 2008 console application, and added a reference to Microsoft.SqlServer.Types. Then I added Linq to SQL classes (Europe.dbml). When I dropped the Europe.Countries table onto the designer, the result was the expected error message.

Hint 1: Use views that cast the data type

I decided to create a view in the schema: vw_Countries. It exposes the same signature as the table, but with one difference: the Shape column is returned as Varbinary(MAX) instead of Geography:

CREATE VIEW [Europe].[vw_Countries]
AS
SELECT CountryID
      ,CountryName
      ,CONVERT(VARBINARY(MAX), Shape) AS Shape
  FROM Europe.Countries

The Geography UDT is physically stored as Varbinary(MAX) in SQL Server, so the table and the view are basically the same. There's one important difference though: Visual Studio is not allergic to the view. You can easily drag and drop it onto the data model designer. The Linq data provider delivers the varbinary data as System.Data.Linq.Binary, but is -strangely enough- unable to convert this back to SqlGeography. It took me a while to figure out the conversion myself, but here's a working version:

Binary belgiumb = (from c in db.vw_Countries
                   where c.CountryID == 4
                   select c.Shape).FirstOrDefault();

SqlGeography geo = new SqlGeography();
geo.Read(new System.IO.BinaryReader(new System.IO.MemoryStream(belgiumb.ToArray())));

Hint 2: Package your conversion code

We're going to be stuck with this type of conversion for a while, so it makes sense to wrap the calculations, e.g. in extension methods. Here's an example:

/// <summary>
/// Converts a Linq Binary to a SQL Server Geography.
/// </summary>
/// <remarks>Throws an Exception if the Binary contains invalid data.</remarks>
public static SqlGeography AsGeography(this Binary binary)
{
    if (binary == null)
    {
        return SqlGeography.Null;
    }

    SqlGeography result = new SqlGeography();
    result.Read(
        new System.IO.BinaryReader(
            new System.IO.MemoryStream(
                binary.ToArray())));

    return result;
}

So now I can call it like this:

SqlGeography b = (from c in db.vw_Countries
                  where c.CountryID == 4
                  select c.Shape).FirstOrDefault().AsGeography();

Console.WriteLine(" The area of Belgium is {0} m²", b.STArea().Value);
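The write direction can be wrapped the same way. Here's a sketch of the inverse extension method; the name AsBinary and the round-trip through SqlGeography's Write method are my own choice, not part of the sample project, so test it before trusting it with production data:

/// <summary>
/// Converts a SQL Server Geography to a Linq Binary (sketch).
/// </summary>
public static Binary AsBinary(this SqlGeography sqlGeography)
{
    if (sqlGeography == null || sqlGeography.IsNull)
    {
        return null;
    }

    using (System.IO.MemoryStream stream = new System.IO.MemoryStream())
    using (System.IO.BinaryWriter writer = new System.IO.BinaryWriter(stream))
    {
        // SqlGeography implements IBinarySerialize, so Write produces
        // the same wire format that Read consumes in AsGeography.
        sqlGeography.Write(writer);
        return new Binary(stream.ToArray());
    }
}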
Hack 1: Change source mappings

I wanted to execute my queries directly against the table, so I copy/pasted the view in the dbml and modified the source mapping. Now I can call the queries like this:

belgium = SqlGeography.Parse((from c in db.Countries
                              where c.CountryID == 4
                              select c.Shape.ToString()).FirstOrDefault());

Hack 2: Change data type mappings

Since I wanted IntelliSense, I copy/pasted the Country table again, and changed the .NET type of the Shape property to SqlGeography. The good news is: I now have IntelliSense at design time. The bad news is: I will have exceptions all over the place. If you return a spatial data type in the select, then a query is successfully constructed and sent to SQL Server, but the return values cannot be deserialized. Here's such a query:

belgium = (from c in db.Countries2
           where c.CountryID == 4
           select c.Shape).FirstOrDefault();

It results in a runtime exception. If you call a method -apart from ToString()- on a spatial data type in the where clause, then you'll bump into a NotSupportedException as soon as the provider tries to translate the query. Here's such a query:

var query = (from c in db.Countries2
             where (c.Shape.STArea() > 10).Value
             select c.CountryName).FirstOrDefault();

Intermezzo

I tried to use Varbinary as the data type for the _Shape backing variable, and SqlGeography as the data type for the Shape property, and to call the conversions in the getter and setter. It didn't work: the getters and setters seem to be bypassed by the Linq provider. Conclusion: hack 2 was not a good idea. We'll stop using the Countries2 entity...

Hint 3: Use scalar functions

Since STIntersects cannot be used, I created a scalar function that takes two Well-Known Texts, and returns whether or not the shapes intersect:

CREATE FUNCTION [Europe].[Intersects]
(
    @Shape1 NVarchar(MAX),
    @Shape2 NVarchar(MAX)
)
RETURNS integer
AS
BEGIN
    DECLARE @Geo1 Geography = geography::Parse(@Shape1)
    DECLARE @Geo2 Geography = geography::Parse(@Shape2)
    RETURN @Geo1.STIntersects(@Geo2)
END

Again, Visual Studio's designer has no problem with this, so you can drag and drop the function into the DataContext and use it in your queries:

var query = from c in db.Countries
            where db.Intersects(c.Shape.ToString(), belgium.ToString()) == 1
            select c.CountryName;

The response time seems to be good enough against such a small table. But I know that it is never a good idea to call functions in the where clause of a query: it forces a table scan and an evaluation of the function for each row in the table. If you can live with the performance, then I suggest you just stick to this type of function: they're highly reusable, since they don't contain any hard-coded table names.

Hint 4: Use table-valued functions

If your geographical tables get bigger, it makes sense to define table-valued functions or stored procedures to do the heavy lifting. The following example doesn't mask the usage of STIntersects in the where clause, and may let SQL Server decide to use a spatial index (or you may even use a hint):

CREATE FUNCTION [Europe].[Intersectors]
(
    @Shape NVarchar(MAX)
)
RETURNS TABLE
AS
RETURN
(
    SELECT CountryID, CountryName
    FROM Europe.Countries
    WHERE Shape.STIntersects(geography::Parse(@Shape)) = 1
)

And yes: I do realize that 'Intersectors' probably doesn't appear in a regular dictionary... Here's what a call to it looks like in Linq; we still have IntelliSense:

var query = from c in db.Intersectors(belgium.ToString())
            select c.CountryName;
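A nice property of this approach is that the function's result is still composable: Linq to SQL wraps additional operators around the SELECT ... FROM Europe.Intersectors(@shape) call on the server. A quick sketch (the extra filter is purely illustrative):

// The where and orderby end up in the SQL sent to the server,
// wrapped around the table-valued function call.
var bigNeighbours = from c in db.Intersectors(belgium.ToString())
                    where c.CountryName.StartsWith("G")
                    orderby c.CountryName
                    select c.CountryName;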
I couldn't keep myself from comparing the performance of the two functions. When I checked the actual query plans, my jaw hit the ground. Ouch! Completely against my expectations, the cost of the table-valued function was higher than the cost of the scalar function. And not just a little bit, but 5 times higher:

Anyway, you should never blindly base your conclusions on just the query plans, so I started SQL Profiler to inspect reality. After thorough testing, I was relieved: I observed that the table-valued function easily outruns the scalar version. It consumes between 3 and 8 times less CPU and returns the results 5 to 10 times faster. This is actually an impressive difference for such a small table. Here's the result from a test session (scripts are included in the source code):

A negative aspect of this table-valued function is its reusability: the function only works against the Europe.Countries table. If you need to query more tables with spatial data, then you need to add complexity to it, or start copy/pasting.

Conclusions

The current Linq to SQL provider doesn't support the SQL Server spatial data types properly, so you have to be creative if you want or need to use them. It's not a waste of time to invest some effort in implementing the workarounds I suggested, and in optimizing and performance tuning them. Your solution will last for a while, since there seems to be no Microsoft solution on the horizon: the problems are not solved in SQL 2008 R2, Visual Studio 2010, or .NET 4.0.

Source code

Here's the source code for the test program. It also contains the necessary SQL scripts for creating and populating the entities, as well as for testing and performance tuning: U2UConsult.DockOfTheBay.LinqToSpatialSample.zip (153,76 kb)

Here's how its output should look:

Credits

I would like to thank my colleague Kris Vandermotten for his input during the research.
Tuning SQL Server Lookups to a Linked Server

31 December 2009 | Diederik-Krols | SQL Server

In SQL Server, if you join a local table with a table on a linked server (e.g. a remote Oracle instance), you should be prepared for horrible performance. In a lot of scenarios it makes a lot more sense to tell the remote server exactly what you need, store that data in a temporary table, and join locally with it. A couple of weeks ago I used this technique to bring the response time of some queries from SQL Server to a linked Oracle instance down from 500 seconds to less than one second.

Let's say we want to enumerate the countries that use the Euro as currency, like this (in AdventureWorks2008):

SELECT c.CurrencyCode, r.Name
  FROM Sales.CountryRegionCurrency c
 INNER JOIN Person.CountryRegion r
    ON c.CountryRegionCode = r.CountryRegionCode
 WHERE c.CurrencyCode = 'EUR'

Here's the result:

Let's suppose that the names of the countries are in a separate database (e.g. an Oracle on an AS/400 box), and there's no replication in place. From SQL Server, we can get access to that source by defining a linked server. For demonstration and practical purposes -I don't have a portable AS/400 with Oracle on it- I'll create a linked server to the local SQL instance:

execute sp_addlinkedserver '.'

The distributed query will now look like this:

SELECT c.CurrencyCode, r.Name
  FROM Sales.CountryRegionCurrency c
 INNER JOIN [.].AdventureWorks2008.Person.CountryRegion r
    ON c.CountryRegionCode = r.CountryRegionCode
 WHERE c.CurrencyCode = 'EUR'

For this particular query in this particular configuration, the response time is actually still nice (it's already five times slower, but you don't really notice that). In real-life queries -and with a real remote Oracle- you'll notice a dramatic decrease in performance. For this demo configuration, you can use SQL Profiler to reveal the query that was sent to the linked server. Instead of performing a selective lookup, SQL Server selected ALL of the rows, and even forced a SORT on them:

SELECT "Tbl1003"."CountryRegionCode" "Col1011","Tbl1003"."Name" "Col1012"
FROM "AdventureWorks2008"."Person"."CountryRegion" "Tbl1003"
ORDER BY "Col1011" ASC

Here's a small part of the result of the query:

You can imagine what will happen if your lookup target is not a small local table but a large complex view. This is bad for the remote machine, the local machine, and the network between the two. All of this happens because SQL Server tries to optimize its own workload, and considers the linked server as a black box (which -in the case of an AS/400- it actually is). What we should send to the linked server is a request for a -limited- number of key-value pairs, such as SELECT id, name FROM blablabla WHERE id IN ('id1', 'id2', ...). We should send this query via the OPENQUERY function, so we can use the native SQL syntax of the remote DBMS. A classic way to create a short comma-separated list in T-SQL is with a variable and the COALESCE function. If the key is not numeric, then you need to wrap each value in quotes. OPENQUERY uses OLE DB under the hood, and OLE DB doesn't like double quotes. So you have to wrap each value in two single quotes, which you in turn have to wrap in single quotes during the concatenation. Oops, you're lost?
Just look at the code:

DECLARE @Countries VARCHAR(MAX)

-- Create comma-separated list of lookup values
;WITH Countries AS
(
    SELECT CountryRegionCode
    FROM Sales.CountryRegionCurrency
    WHERE CurrencyCode = 'EUR'
)
SELECT @Countries = COALESCE(@Countries + ',', '') + '''''' + CountryRegionCode + ''''''
FROM Countries

After these calls, the @Countries variable holds a comma-separated list of country codes:

Unfortunately OPENQUERY doesn't take parameters, so we need to construct the whole query dynamically and call it via EXECUTE. To store the result, we need to create a temporary table, because table variables unfortunately go out of scope with EXECUTE:

CREATE TABLE #Countries (CountryRegionCode nvarchar(3), Name nvarchar(50))

DECLARE @Query VARCHAR(MAX)
SET @Query =
    'INSERT #Countries ' +
    'SELECT * FROM OPENQUERY ([.], ' +
    '''SELECT CountryRegionCode, Name ' +
    'FROM AdventureWorks2008.Person.CountryRegion ' +
    'WHERE CountryRegionCode IN (' + @Countries + ')'')'

EXECUTE (@Query)

This is the value of the @Query variable:

After these calls, the #Countries table contains the remote data (at least the fraction we're interested in). So we can now join locally:

SELECT c.CurrencyCode, r.Name
  FROM Sales.CountryRegionCurrency c
 INNER JOIN #Countries r
    ON c.CountryRegionCode = r.CountryRegionCode
 WHERE c.CurrencyCode = 'EUR'

And while the complexity of the code dramatically increased, the response time went down equally dramatically...

For the sake of completeness, here's the whole demo script:

/***************/
/* Preparation */
/***************/

/* Add linked server to local instance */
execute sp_addlinkedserver '.'

USE AdventureWorks2008
GO

/********/
/* Test */
/********/

-- Local query
SELECT c.CurrencyCode, r.Name
  FROM Sales.CountryRegionCurrency c
 INNER JOIN Person.CountryRegion r
    ON c.CountryRegionCode = r.CountryRegionCode
 WHERE c.CurrencyCode = 'EUR'

-- Query through linked server
SELECT c.CurrencyCode, r.Name
  FROM Sales.CountryRegionCurrency c
 INNER JOIN [.].AdventureWorks2008.Person.CountryRegion r
    ON c.CountryRegionCode = r.CountryRegionCode
 WHERE c.CurrencyCode = 'EUR'

-- The query sent to the linked server (from SQL Profiler)
SELECT "Tbl1003"."CountryRegionCode" "Col1011","Tbl1003"."Name" "Col1012"
FROM "AdventureWorks2008"."Person"."CountryRegion" "Tbl1003"
ORDER BY "Col1011" ASC

/**************/
/* Workaround */
/**************/

DECLARE @Countries VARCHAR(MAX)

-- Create comma-separated list of lookup values
;WITH Countries AS
(
    SELECT CountryRegionCode
    FROM Sales.CountryRegionCurrency
    WHERE CurrencyCode = 'EUR'
)
SELECT @Countries = COALESCE(@Countries + ',', '') + '''''' + CountryRegionCode + ''''''
FROM Countries
-- OLE DB Drivers don't like double quotes, so we have to hexuplicate ;-))

-- Uncomment next line for testing
-- SELECT @Countries

-- Create temporary table to hold results from query to linked server
CREATE TABLE #Countries (CountryRegionCode nvarchar(3), Name nvarchar(50))

DECLARE @Query VARCHAR(MAX)

-- Build query to linked server
SET @Query =
    'INSERT #Countries ' +
    'SELECT * FROM OPENQUERY ([.], ' +
    '''SELECT CountryRegionCode, Name ' +
    'FROM AdventureWorks2008.Person.CountryRegion ' +
    'WHERE CountryRegionCode IN (' + @Countries + ')'')'

-- Uncomment next line for testing
-- SELECT @Query

-- Execute query to linked server
EXECUTE (@Query)

-- Uncomment next line for testing
-- SELECT * FROM #Countries

-- Execute query entirely locally
SELECT c.CurrencyCode, r.Name
  FROM Sales.CountryRegionCurrency c
 INNER JOIN #Countries r
    ON c.CountryRegionCode = r.CountryRegionCode
 WHERE c.CurrencyCode = 'EUR'

DROP TABLE #Countries

/************/
/* Teardown */
/************/

/* Remove linked server */
execute sp_dropserver '.'
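If the lookup has to run from a .NET data access layer instead of Management Studio, the workaround section of the script (everything from the DECLARE down to the DROP TABLE, which contains no GO separators) can be executed as a single ADO.NET batch. A minimal sketch; the connection string and the Workaround.sql file name are assumptions:

using System;
using System.Data.SqlClient;
using System.IO;

class LinkedServerLookup
{
    static void Main()
    {
        // Assumption: point this at the server that owns the linked server definition.
        string connectionString = "Data Source=.;Initial Catalog=AdventureWorks2008;Integrated Security=True";

        using (SqlConnection connection = new SqlConnection(connectionString))
        // Assumption: Workaround.sql holds the workaround section of the script above.
        using (SqlCommand command = new SqlCommand(File.ReadAllText("Workaround.sql"), connection))
        {
            connection.Open();

            // The last statement of the batch is the local join;
            // its rows are the first (and only) result set.
            using (SqlDataReader reader = command.ExecuteReader())
            {
                while (reader.Read())
                {
                    Console.WriteLine("{0} {1}", reader.GetString(0), reader.GetString(1));
                }
            }
        }
    }
}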
Displaying spatial data in WPF: from SqlGeometry to PathGeometry

26 December 2009 | Diederik-Krols | WPF, SQL Server

This article explains how to visualize spatial data (e.g. from SQL Server 2008) in Windows Presentation Foundation, without using 3rd party components or proprietary formats. We'll build a little form that allows us to enter test data in the Well-Known Text format (WKT), manually or via copy/paste, e.g. from SQL Management Studio. A Draw button will convert our input to WPF graphics and display it. Here's what the application looks like:

It's probably a waste of time to do the validation of the input text and the parsing of its structure ourselves, since the native spatial SQL Server .NET UDTs -SqlGeometry and SqlGeography- are specialized in exactly that. These types are defined in the Microsoft.SqlServer.Types assembly, so we should add a reference to it in our project. On the user interface side, the best candidate type to visualize spatial data in WPF is without any doubt the Geometry class, which represents a composite 2D shape. To create a WPF version of spatial data, we read the WKT format and use it to initialize a SqlGeometry instance. Then we call some of the OGC functions to break the SqlGeometry object model down into a PathGeometry structure. For ease of use, it makes sense to wrap this functionality in extension methods for SqlGeometry. Here's the class:

namespace U2UConsult.DockOfTheBay
{
    using System.Windows;
    using System.Windows.Media;
    using System.Windows.Shapes;
    using Microsoft.SqlServer.Types;

    /// <summary>
    /// Extension Methods for SqlGeometry.
    /// </summary>
    public static class SqlGeometryExtensions
    {
        /// <summary>
        /// Translates a SqlGeometry into a System.Windows.Media.Geometry.
        /// </summary>
        public static Geometry AsWpfGeometry(this SqlGeometry sqlGeometry)
        {
            PathGeometry result = new PathGeometry();

            switch (sqlGeometry.STGeometryType().Value.ToLower())
            {
                case "point":
                    // Return a little 'X'
                    // (well: 'little' depends on the coordinate system ...)
                    PathFigure pointFigure = new PathFigure();
                    pointFigure.StartPoint = new Point(
                        sqlGeometry.STX.Value - .1,
                        sqlGeometry.STY.Value - .1);
                    LineSegment line = new LineSegment(
                        new Point(
                            sqlGeometry.STX.Value + .1,
                            sqlGeometry.STY.Value + .1),
                        true);
                    pointFigure.Segments.Add(line);
                    result.Figures.Add(pointFigure);

                    pointFigure = new PathFigure();
                    pointFigure.StartPoint = new Point(
                        sqlGeometry.STX.Value - .1,
                        sqlGeometry.STY.Value + .1);
                    line = new LineSegment(
                        new Point(
                            sqlGeometry.STX.Value + .1,
                            sqlGeometry.STY.Value - .1),
                        true);
                    pointFigure.Segments.Add(line);
                    result.Figures.Add(pointFigure);

                    return result;

                case "polygon":
                    // A Spatial Polygon is a collection of Rings
                    // A Ring is a closed LineString
                    // So, return a PathFigure for each Ring

                    // Outer Ring
                    result.Figures.Add(LineStringToWpfGeometry(sqlGeometry.STExteriorRing()));

                    // Inner Rings
                    for (int i = 1; i <= sqlGeometry.STNumInteriorRing(); i++)
                    {
                        result.Figures.Add(LineStringToWpfGeometry(sqlGeometry.STInteriorRingN(i)));
                    }

                    return result;

                case "linestring":
                    // Return a PathFigure
                    result.Figures.Add(LineStringToWpfGeometry(sqlGeometry));
                    return result;

                case "multipoint":
                case "multilinestring":
                case "multipolygon":
                case "geometrycollection":
                    // Return a Group of Geometries
                    GeometryGroup geometryGroup = new GeometryGroup();
                    for (int i = 1; i <= sqlGeometry.STNumGeometries().Value; i++)
                    {
                        geometryGroup.Children.Add(sqlGeometry.STGeometryN(i).AsWpfGeometry());
                    }

                    return geometryGroup;

                default:
                    // Unrecognized Type
                    // Return an empty Geometry
                    return Geometry.Empty;
            }
        }

        /// <summary>
        /// Translates a SqlGeometry into a System.Windows.Shapes.Path.
        /// </summary>
        public static Path AsPath(this SqlGeometry sqlGeometry)
        {
            Path path = new Path();
            path.Data = sqlGeometry.AsWpfGeometry();
            return path;
        }

        /// <summary>
        /// Translates a LineString or a single Polygon Ring to a PathFigure.
        /// </summary>
        private static PathFigure LineStringToWpfGeometry(SqlGeometry sqlGeometry)
        {
            PathFigure result = new PathFigure();

            result.StartPoint = new Point(
                sqlGeometry.STPointN(1).STX.Value,
                sqlGeometry.STPointN(1).STY.Value);

            for (int i = 1; i <= sqlGeometry.STNumPoints(); i++)
            {
                LineSegment lineSegment = new LineSegment();
                lineSegment.Point = new Point(
                    sqlGeometry.STPointN(i).STX.Value,
                    sqlGeometry.STPointN(i).STY.Value);
                result.Segments.Add(lineSegment);
            }

            return result;
        }
    }
}

To use these extension methods, all we need to do is create a SqlGeometry instance with some data. Then we need to ensure it's valid against the OGC standards, so that the OGC-compliant SQL methods behave properly. Finally we call the conversion, like this:

// Make OGC Compliant
if (!sqlGeometry.STIsValid())
{
    sqlGeometry = sqlGeometry.MakeValid();
}

// Transformation Samples
Path path = sqlGeometry.AsPath();
Geometry geometry = sqlGeometry.AsWpfGeometry();

We end up with a Geometry that is expressed in the original spatial coordinates: latitude/longitude or X/Y against a specific SRID. So we need to translate and scale it to project it to window coordinates. Since we only visualize one shape, it suffices to let it stretch automatically in its container. By the way: don't forget to draw upside-down, because the origin of a Control is its upper left corner, while the origin of a map is generally at the bottom left:

// Flip Y-coordinate
// Origin of a map is usually at the bottom left
path.LayoutTransform = new ScaleTransform(1, -1);

// Automate Translation & Inflation
path.Stretch = Stretch.Uniform;
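Note that these extensions target SqlGeometry. If your data lives in a geography column, you can reuse them for display purposes by treating longitude/latitude as planar X/Y. Here's a sketch of such an adapter; the WKT round-trip and the overload are my own choice, not part of the sample project:

/// <summary>
/// Sketch: reuses the SqlGeometry extensions for geography data by
/// round-tripping through WKT. Geography WKT lists longitude before
/// latitude, so X/Y come out in the expected order for display.
/// </summary>
public static Geometry AsWpfGeometry(this SqlGeography sqlGeography)
{
    SqlGeometry planar = SqlGeometry.STGeomFromText(
        sqlGeography.STAsText(), sqlGeography.STSrid.Value);

    return planar.AsWpfGeometry();
}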
Samples

I tested the code against a high number of possible (mostly Belgian) shapes. Here are some examples:

- The river Zenne at the point where it leaves Brussels, heading north: an example of a LineString.
- The province of Antwerp: an example of a MultiPolygon.
- The province of Flemish Brabant: an example of a Polygon with an inner Ring.
- The arrondissement Halle-Vilvoorde: an example of a Polygon.

Here's the full code for the WPF window. To bring you up to speed immediately, it starts with a reduced shape (24 points) of Belgium.

XAML:

<Window x:Class="U2UConsult.DockOfTheBay.SpatialSample"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        Title="SQL Spatial to WPF Sample"
        Icon="/U2UConsult.DockOfTheBay;component/dotbay.png">
    <Grid>
        <Grid.RowDefinitions>
            <RowDefinition Height="*" />
        </Grid.RowDefinitions>
        <Grid.ColumnDefinitions>
            <ColumnDefinition Width="*" />
            <ColumnDefinition Width="2*" />
        </Grid.ColumnDefinitions>
        <Button x:Name="DrawButton"
                Click="DrawButton_Click"
                Content="Draw"
                Grid.Row="0" Grid.Column="1"
                Margin="15" Height="32" Width="64"
                HorizontalAlignment="Left"
                VerticalAlignment="Bottom" />
        <!-- Reduced shape of Belgium -->
        <TextBox x:Name="GeometryTextBox"
                 Margin="5"
                 Grid.Row="0" Grid.Column="0"
                 TextWrapping="Wrap"
                 AcceptsReturn="True"
                 AcceptsTab="True"
                 Text="POLYGON (( 5.4695768356323242 49.499450206756592, 5.8744573593139648 49.576767921447754, 5.7810144424438477 49.959678173065186, 6.4083404541015625 50.333068847656250, 6.1721935272216800 50.550515174865723, 6.2783794403076172 50.616397857666016, 5.6911020278930664 50.761138916015625, 5.8341245651245117 51.168460845947266, 5.2405147552490234 51.261853218078613, 5.0372371673583984 51.485389232635500, 4.4786109924316406 51.480998992919922, 3.9020967483520508 51.198946952819824, 3.1825008392333984 51.361250877380371, 2.5581483840942383 51.093193054199219, 2.6278944015502930 50.814075946807861, 3.1747541427612305 50.752677917480469, 3.2816371917724609 50.526985168457031, 3.6048288345336914 50.489061832427979, 3.7020025253295900 50.300303936004639, 4.2115697860717773 50.269905090332031, 4.1991643905639648 49.960120201110840, 4.6723327636718750 49.985515117645264, 4.8746204376220700 50.151000022888184, 4.8553209304809570 49.794033050537109, 5.4695768356323242 49.499450206756592))"
                 ToolTip="Place your Well-Known Text here ..." />
        <Border x:Name="DrawingCanvas"
                Padding="15"
                Grid.Row="0" Grid.Column="1" />
    </Grid>
</Window>

C#:

namespace U2UConsult.DockOfTheBay
{
    using System;
    using System.Windows;
    using System.Windows.Controls;
    using System.Windows.Media;
    using System.Windows.Shapes;
    using Microsoft.SqlServer.Types; // Add Reference !!!

    /// <summary>
    /// Demonstrates displaying SQL Server Spatial data in WPF.
    /// </summary>
    public partial class SpatialSample : Window
    {
        public SpatialSample()
        {
            InitializeComponent();
        }

        private void DrawButton_Click(object sender, RoutedEventArgs e)
        {
            try
            {
                // Read Well-Known Text
                SqlGeometry sqlGeometry = SqlGeometry.Parse(this.GeometryTextBox.Text);

                // Make OGC Compliant
                if (!sqlGeometry.STIsValid())
                {
                    sqlGeometry = sqlGeometry.MakeValid();
                }

                // Transform to Path
                Path path = sqlGeometry.AsPath();

                // Basic Properties
                path.Stroke = Brushes.Black;
                path.StrokeThickness = 1;

                // Polygons only ...
                //path.Effect = new DropShadowEffect() { Direction = 225 };
                //path.Fill = Brushes.DarkGreen;

                // Flip Y-coordinate
                // Origin of a map is usually at the bottom left
                path.LayoutTransform = new ScaleTransform(1, -1);

                // Automate Translation & Inflation
                path.Stretch = Stretch.Uniform;

                this.DrawingCanvas.Child = path;
            }
            catch (Exception ex)
            {
                // Input not valid
                this.DrawingCanvas.Child = new TextBlock()
                {
                    TextWrapping = TextWrapping.Wrap,
                    MaxHeight = 128,
                    VerticalAlignment = VerticalAlignment.Top,
                    Foreground = Brushes.Red,
                    Text = ex.Message
                };
            }
        }
    }
}
SQL Spatial Tools: Map Projections

25 December 2009 | Diederik-Krols | SQL Server, WPF

SQL Server Spatial Tools on CodePlex contains useful extra functions for the SqlGeometry and SqlGeography data types, as well as a new data type for affine transformations (to scale, translate, and rotate) and a handful of map projections. This article describes how to use these projections and visualize the result in Windows Presentation Foundation.

All projections are instantiated through static method calls against the SqlProjection class, with one to five parameters. SQL Spatial Tools contains sample T-SQL scripts, but here's what it looks like in C# (for the inverse projection, don't forget to first assign a Spatial Reference System Identifier to the geometry):

Projection:

SqlGeography shape3D = SqlGeography.Parse("some valid WKT"); // Or read from DB
SqlProjection proj = SqlProjection.LambertConformalConic(0, 90, 12, 36);
SqlGeometry shape2D = proj.Project(shape3D);

Inverse projection:

SqlGeometry shape2D = SqlGeometry.Parse("some valid WKT");
shape2D.STSrid = 4326; // WGS 84
SqlProjection proj = SqlProjection.AlbersEqualArea(0, 0, 15, 30);
SqlGeography shape3D = proj.Unproject(shape2D);

That's all there is to it! I added this code to an improved version of my SqlGeometry extension methods for WPF visualization. Here's what the resulting application looks like, displaying a reduced shape of Belgium, and the exact location of my office @ U2U Consult:

OK, I admit: projecting Belgium is not that spectacular. It will have roughly the same shape in virtually every projection: it's a small surface at an average latitude. So let's apply some transformations to a more representative victim, such as the Tropic of Cancer. This will generally be projected as a straight line, but if you stand on the North Pole -e.g. via a Gnomonic projection- it looks like a circle. And with a Transverse Mercator projection it should look like an ellipse.

Unfortunately the SQL Spatial Tools code only implements the spherical version of the Transverse Mercator projection, and not (yet?) the ellipsoidal version. Otherwise SQL Spatial Tools would have all the ingredients for a latitude/longitude WGS84 to UTM conversion. After all, you only need to project, then scale (to convert to meters), and finally translate (for false easting). This is a horrible calculation, but don't worry: Proj.NET on CodePlex should be able to handle it (this feels like a topic for a later article).

Anyway, here's the full solution: U2UConsult.DockOfTheBay.SpatialProjectionsSample.zip (144,07 kb).