Comparing the Cosmos DB SQL API with EF Core for Cosmos DB

In our Azure Development Course, we speak about Cosmos DB, and the different datamodels that can be stored in it. One of the models is SQL which, very confusing, has little to do with SQL. This model allows you store objects as "documents", being as JSON.

{
    "id": "CD1",
    "Publisher": {
        "PublisherName": "Enigma"
    },
    "CDTitle": "DayDream Nation",
    "Band": "Sonic Youth",
    "Tracks": [
        {
            "TrackName": "Teen Age Riot",
            "Duration": "00:06:57"
        },
        {
            "TrackName": "Silver Rocket",
            "Duration": "00:03:47"
        },
        {
            "TrackName": "The Sprawl",
            "Duration": "00:07:42"
        },
        {
            "TrackName": "Cross the Breeze",
            "Duration": "00:07:00"
        },
        {
            "TrackName": "Eric's Trip",
            "Duration": "00:03:48"
        },
        {
            "TrackName": "Total Trash",
            "Duration": "00:07:33"
        },
        {
            "TrackName": "Hey Joni",
            "Duration": "00:04:23"
        },
        {
            "TrackName": "Providence",
            "Duration": "00:02:41"
        },
        {
            "TrackName": "Candle",
            "Duration": "00:04:58"
        },
        {
            "TrackName": "Rain King",
            "Duration": "00:04:39"
        },
        {
            "TrackName": "Kissability",
            "Duration": "00:03:08"
        },
        {
            "TrackName": "Trilogy",
            "Duration": "00:14:02"
        }
    ],
    "_rid": "dSw9ANvJbDkBAAAAAAAAAA==",
    "_self": "dbs/dSw9AA==/colls/dSw9ANvJbDk=/docs/dSw9ANvJbDkBAAAAAAAAAA==/",
    "_etag": "\"01005107-0000-0000-0000-5c45f6c70000\"",
    "_attachments": "attachments/",
    "_ts": 1548089031
}

The API we use for talking to the SQL API is Microsoft.Azure.Documents.Client which can be found in Nuget. There has been a prerelease version of an EF Dataprovider for Cosmos DB SQL API for some time now. So, time for comparing these two.

Connecting to the database

With the SQL API you need a DocumentClient and and a DocumentCollection to talk to. This can be done with following code:

private static DocumentClient GetDocumentClient()
{
	return new DocumentClient(new Uri("https://mvwcosmos.documents.azure.com:443/"), 
        "yourkey");
}

private static DocumentCollection GetCollection(DocumentClient client)
{
	Database db = client.CreateDatabaseQuery().Where(d => d.Id == "mvwDb").AsEnumerable().FirstOrDefault();
	DocumentCollection collection = client.CreateDocumentCollectionQuery(db.CollectionsLink)
		.Where(c => c.Id == "CDs").AsEnumerable().FirstOrDefault();
	return collection;
}

Once you have these you can start storing, retrieving,... data.

If you use EF, you'll need Microsoft.EntityFrameworkCore.Cosmos from Nuget (I'm using version 3.0.0-preview.18572.1). Connecting here is, somewhat simpler. Typical for EF you'll need to create a datacontext-class and specify the connection in the OnConfiguring.

public class CDContext : DbContext
{
    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
    {
        base.OnConfiguring(optionsBuilder);

        optionsBuilder.UseCosmos("https://mvwcosmos.documents.azure.com:443/", 
            "yourkey",
            "CDCollection");
    }
}

Saving data, and how your data looks like

In my comparison I'm saving some information about music-CDs, with info about the Publisher and the Tracks.

public class CD
{
	[JsonProperty(PropertyName = "id")]
	public string CDId { get; set; }
	public Publisher Publisher { get; set; }
	public string CDTitle { get; set; }
	public string Band { get; set; }

	public Track[] Tracks { get; set; }
}

public class Track
{
	public string TrackName { get; set; }
	public TimeSpan Duration { get; set; }
}

public class Publisher
{
	public string PublisherName { get; set; }
}

Saving the data goes something like this with the SQL API:

DocumentClient client = GetDocumentClient();
DocumentCollection collection = GetCollection(client);

CD cd1 = new CD()
{
	Band = "Sonic Youth",
	CDId = "CD1",
	CDTitle = "DayDream Nation",
	Publisher = new Publisher() { PublisherName = "Enigma" },
	Tracks = new Track[]
	 {
		 new Track() { TrackName = "Teen Age Riot", Duration = new TimeSpan(0,6,57)},
		 new Track() { TrackName = "Silver Rocket", Duration = new TimeSpan(0,3,47)},
		 new Track() { TrackName = "The Sprawl", Duration = new TimeSpan(0,7,42)},
		 new Track() { TrackName = "Cross the Breeze", Duration = new TimeSpan(0,7,0)},
		 new Track() { TrackName = "Eric's Trip", Duration = new TimeSpan(0,3,48)},
		 new Track() { TrackName = "Total Trash", Duration = new TimeSpan(0,7,33)},
		 new Track() { TrackName = "Hey Joni", Duration = new TimeSpan(0,4,23)},
		 new Track() { TrackName = "Providence", Duration = new TimeSpan(0,2,41)},
		 new Track() { TrackName = "Candle", Duration = new TimeSpan(0,4,58)},
		 new Track() { TrackName = "Rain King", Duration = new TimeSpan(0,4,39)},
		 new Track() { TrackName = "Kissability", Duration = new TimeSpan(0,3,8)},
		 new Track() { TrackName = "Trilogy", Duration = new TimeSpan(0,14,2)}
	 }
};

await client.CreateDocumentAsync(collection.DocumentsLink, cd1);

The data is being stored as a nice JSON-document, as you would expect:

/2019/1/2CDs.jpg

Now for EF. First of all: some changes were needed on my business-classes. First of all: the JsonProperty-attribute is no longer needed on the CDId, but the key needs to be a GUID now. This one will automatically get a value on insert. Arrays are also not possible, so my array of Tracks is now a generic list of Tracks. Inserting of data now goes like this :

Publisher pub = new Publisher() { PublisherName = "Neurot Recordings" };
CD cd = new CD()
{
    Band = "Amenra",
    CDTitle = "Mass VI",
    Publisher = pub,
    Tracks = new List<Track>()
    {
        new Track { TrackName = "Children of the eye", Duration = new TimeSpan(0,9,41)},
        new Track { TrackName = "Edelkroone", Duration = new TimeSpan(0,0,23)},
        new Track { TrackName = "Plus près de toi", Duration = new TimeSpan(0,8,40)},
        new Track { TrackName = "Spijt", Duration = new TimeSpan(0,1,59)},
        new Track { TrackName = "A Solitary Reign", Duration = new TimeSpan(0,9,4)},
        new Track { TrackName = "Diaken", Duration = new TimeSpan(0,11,5)}
    }
};

CDContext ctx = new CDContext();
ctx.Database.EnsureCreated();

ctx.CDs.Add(cd);
ctx.SaveChanges();

Again, easier to do. But surprise: the structure used for saving the data is different. When using the SQL API you get a JSON-structure in which each CD gets a collection of Tracks. Using EF Core everything is saved as a separate entity. So with the code above you get one CD, one Publisher and six Tracks.

/2019/1/Track.jpg

Each piece of JSON get a Discriminator-field to specify what type is represents, and has relational references. The CD has a PublisherID, each Track has a CDId. (I still have to test what it does with many-to-many relations).

So, EF uses a kind of relational model for storing JSON documents. Makes sense, because of the nature of EF. But it also means that data created with SQL API cannot be read with EF Core, and vice versa. Bummer.

But, it also gives me some advantages. E.g. : Getting Publishers, and all CD published by them is easier now:

CDContext ctx = new CDContext();
var pub = ctx.Publishers.Include(p => p.CDs).FirstOrDefault();

Console.WriteLine(pub.PublisherName);
foreach (var cd in pub.CDs)
{
    Console.WriteLine(cd.CDTitle);
}

Some more differences: Getting all CDs, with the Publisher. Using SQL API:

var query = client.CreateDocumentQuery<CD>(collection.DocumentsLink).AsEnumerable();
foreach (var cd in query)
{
    Console.WriteLine(cd.CDTitle);
    foreach (var track in cd.Tracks)
    {
        Console.WriteLine(track.TrackName);
    }
}

With EF Core, you need to use the Include-function for also loading the Tracks :

CDContext ctx = new CDContext();

var cds = ctx.CDs.Include(cd=>cd.Tracks);

foreach (var cd in cds)
{
    Console.WriteLine(cd.CDTitle);
    foreach (var track in cd.Tracks)
    {
        Console.WriteLine(track.TrackName);
    }
}

So, quit some differences between the two API's. SQL API starts from a nested JSON-structure, while EF Core forces a relational way of working to the data you save.

By the way: these cds rock !