The Missing Linq to SQL Spatial

This article provides hints and hacks on how to use the SQL Server spatial data types -Geography and Geometry- in Linq to SQL. The data provider has a problem with serializing and deserializing the SQL UDT's. If you ever tried to use these data types in a Linq to SQL (or Entity Framework) project then you certainly encountered the following error: “One or more selected items contain a data type that is not supported by the designer”.

  • Setting up a test environment

I created a small database called 'Spatial' with a schema called 'Europe', and a table called 'Countries'. Here's the structure of the table:

CREATE TABLE [Europe].[Countries](

    [CountryID] [int] IDENTITY(1,1) NOT NULL,

    [CountryName] [nvarchar](50) NOT NULL,

    [Shape] [geography] NOT NULL,

 CONSTRAINT [PK_Countries] PRIMARY KEY CLUSTERED

(

    [CountryID] ASC

)


The source code that accompanies this article contains the scripts to create the assets and populate the table. Here's an indication of the table contents:

I created a Visual Studio 2008 Console application, and added a reference to Microsoft.SqlServer.Types. Then I added Linq to SQL classes (Europe.dbml). When I dropped the Europe.Counties table into the designer, the result was the expected error message:

  • Hint 1: Use views that cast the data type

I decided to create a view in the schema: vw_Countries. It exposes the same signature as the table, but with one difference: the shape column is returned as Varbinary(MAX) instead of Geography:

CREATE VIEW [Europe].[vw_Countries]

AS

SELECT CountryID

      ,CountryName

      ,CONVERT(VARBINARY(MAX), Shape) AS Shape

FROM Europe.Countries

 

The Geography UDT is physically stored as Varbinary(MAX) in SQL Server, so the table and the view are basically the same. There's one important difference though: Visual Studio is not allergic to the view. You can easily drag and drop it into the data model designer:

The Linq data provider delivers the varbinary data as System.Data.Linq.Binary, but is -strangly enough- unable to convert this back to SqlGeography. It took me a while to figure out the conversion myself, but here's a working version:

Binary belgiumb = (from c in db.vw_Countries

                   where c.CountryID == 4

                   select c.Shape).FirstOrDefault();

SqlGeography geo = new SqlGeography();

geo.Read(new System.IO.BinaryReader(new System.IO.MemoryStream(belgiumb.ToArray())));

 

  • Hint 2: Package your conversion code

We're going to get stuck with this type of conversions for a while, so it makes sense to wrap the calculations e.g. in extension methods. Here's an example:

/// <summary>

/// Converts a Linq Binary to a SQL Server Geograpy.

/// </summary>

/// <remarks>Throws an Exception if the Binary contains invalid data.</remarks>

public static SqlGeography AsGeography(this Binary binary)

{

    if (binary == null)

    {

        return SqlGeography.Null;

    }

 

    SqlGeography result = new SqlGeography();

 

    result.Read(

        new System.IO.BinaryReader(

            new System.IO.MemoryStream(

                binary.ToArray())));

 

    return result;

}

 

So now I can call it like this:

SqlGeography b = (from c in db.vw_Countries

                  where c.CountryID == 4

                  select c.Shape).FirstOrDefault().AsGeography();

Console.WriteLine(" The area of Belgium is {0} m²", b.STArea().Value);

 

  • Hack 1: Change source mappings

I wanted to execute my queries directly against the table. So I copy/pasted the view in the dbml, and modified the source mapping:

Now I can call the queries like this:

belgium =

    SqlGeography.Parse((from c in db.Countries

                        where c.CountryID == 4

                        select c.Shape.ToString()).FirstOrDefault());

 

  • Hack 2: Change data type mappings

Since I wanted IntelliSense, I copy/pasted the Country table again, and changed the .NET type of the Shape property to SqlGeography:

The good news is: I now have IntelliSense at design time:

The bad news is: I will have exceptions all over the place.

If you return a spatial data type in the select, then a query is successfully constructed and sent to SQL Server, but the return values cannot be deserialized. Here's such a query:

belgium = (from c in db.Countries2

           where c.CountryID == 4

           select c.Shape).FirstOrDefault();

 

It results in a runtime exception:

If you call a method -apart from ToString()- on a spatial data type in the where-clause, then you'll bump into a NotSupportedException already at compile time. Here's such a query:

var query = (from c in db.Countries2

             where (c.Shape.STArea() > 10).Value

             select c.CountryName).FirstOrDefault();

 

And its result at compile time:

  • Intermezzo

I tried to use Varbinary as data type for the _Shape backing variable, and SqlGeography as data type for the Shape property, and call conversions in the getter and setter. It didn't work: the getters and setters seem to be bypassed by the Linq provider.

Conclusion: hack 2 was not a good idea. We'll stop using the Countries2 entity...

  • Hint 3: Use scalar functions

Since STIntersects cannot be used, I created a scalar function that takes two Well-Known Texts, and returns whether or not the shapes intersect:

CREATE FUNCTION [Europe].[Intersects]

(

    @Shape1 NVarchar(MAX),

    @Shape2 NVarchar(MAX)

)

RETURNS integer

AS

BEGIN

 

    DECLARE @Geo1 Geography = geography::Parse(@Shape1)

    DECLARE @Geo2 Geography = geography::Parse(@Shape2)

 

    RETURN @Geo1.STIntersects(@Geo2)

 

END

 

Again, Visual Studio's designer has no problem with this, so you can drag and drop the function into the DataContext:

So you can use it in your queries:

var query = from c in db.Countries

            where db.Intersects(c.Shape.ToString(), belgium.ToString()) == 1

            select c.CountryName;

 

The response time seems to be good enough against such a small table. But I know that it is never a good idea to call functions in the where-clause of a query. It will force a table scan and an evaluation of the function for each row in the table. If you can live with the performance, then I suggest you just stick to this type of functions: they're highly reusable since they don't contain any hard-code table names.

  • Hint 4: Use table valued functions

If your geographical tables get bigger, it makes sense to define table valued functions or stored procedures to do the heavy lifting. The following example doesn't mask the usage of STIntersects in the where-clause, and may let SQL Server decide to use a spatial index (or you may even use a hint):

CREATE FUNCTION [Europe].[Intersectors]

(   

    @Shape NVarchar(MAX)

)

RETURNS TABLE

AS

RETURN

(

    SELECT CountryID, CountryName

    FROM Europe.Countries

    WHERE Shape.STIntersects(geography::Parse(@Shape)) = 1

)

 

And yes: I do realize that Intersectors probable doesn't appear in a regular dictionary...

Here's how a call to it looks like in Linq - we still have IntelliSense:

var query = from c in db.Intersectors(belgium.ToString())

             select c.CountryName;

 

I couldn't keep myself from comparing the performance of the two functions. When I checked the actual query plans,my jaw hit the ground Surprised, ouch! Completely against my expectation, the cost of the table value function was higher than the cost of the scalar function. And not just a little bit, but 5 times higher:

Anyway, you should never blindly base your conclusions on just the query plans, so I started the SQL Profiler to inspect reality. After thorough testing I was relieved Tongue out. I observed that the tabled valued function easily outruns the scalar version: it consumes between 3 and 8 times less CPU and returns the results 5 to 10 times faster. This is actually an impressive difference for such a small table. Here's the result from a test session (scripts are included in the source code):

A negative aspect of this table valued function is its reusability: the function only works against the Europe.Countries table. If you need to query more tables with spatial data, then you need to add complexity to it, or start copy/pasting.

  • Conclusions

The current Linq to Sql provider doesn't support the SQL Server spatial data types properly, so you have to be creative if you want or need to use these. It's not a waste of time to invest some effort in implementing the work arounds I suggested, and in optimizing and performance tuning these. Your solution will last for a while, since there seems to be no Microsoft solution on the horizon. The problems are not solved in SQL 2008 R2, Visual Studio 2010, or .NET 4.0.

  • Source code

Here's the source code for the test program. It also contains the necessary SQL scripts for creating and populating the entities, as well as for testing and performance tuning: U2UConsult.DockOfTheBay.LinqToSpatialSample.zip (153,76 kb)

Here's how its output should look like:

  • Credits

I would like to thank my colleague Kris Vandermotten for his input during the research.