Choosing the right serialization engine for your Windows Store app

Most Windows 8 Store apps spend a significant amount of time saving and retrieving local data. The local file system is often used as the main storage. But even apps that come with server-side storage often need local storage: to host a cache for when there’s no network available or while the server data is still downloading. If you don’t use a third-party local database (like SQLite), then you have to manage the persistence (i.e. the serialization and deserialization of your objects) yourself. This article introduces you to the 4 main serialization engines that are available to Windows 8 Store apps: the 3 native ones (XmlSerializer, DataContractSerializer, and DataContractJsonSerializer, which is called JsonSerializer in the rest of this article), together with one popular serializer that’s not in the framework: the Json.NET serializer.
I built a little benchmarking app to compare these serializers. It measures the duration and validates the result of a workflow that consists of serializing, locally saving, and deserializing a list of business objects. The app lets you select the number of objects to be processed, the location where the file should be saved (Local Folder or Roaming Folder), and whether or not to compare the deserialized objects with the original ones.

Here’s what the app looks like:
[Screenshot: the serializers benchmark app]
The Item class represents the business object. It has properties of different types: strings, datetimes, an enumeration, a link to another business object (called SubItem), and a calculated property that we don’t want to serialize. My intention is to use this app for testing real app models, by replacing this Item class with the appropriate model class(es). An extension class contains methods to generate test data for the instances, and to compare the content of two instances (overriding Equals might be an alternative).
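
As a rough illustration, a model along the following lines would fit that description. This is only a sketch: the ItemStatus enumeration, the Status property, and the HasSameContentAs method are assumed names, and the real classes in the sample app may look different.

```csharp
using System;

// Hypothetical enumeration; the article does not name its values.
public enum ItemStatus { Draft, Active, Closed }

public class SubItem
{
    public string Name { get; set; }
}

public class Item
{
    public string Name { get; set; }
    public DateTime Start { get; set; }
    public DateTime End { get; set; }
    public ItemStatus Status { get; set; }   // assumed property name
    public SubItem SubItem { get; set; }

    // Calculated - should not be serialized
    public TimeSpan Duration
    {
        get { return this.End - this.Start; }
    }
}

// Hypothetical extension class that compares the content of two instances.
public static class ItemExtensions
{
    public static bool HasSameContentAs(this Item left, Item right)
    {
        if (left == null || right == null)
        {
            return left == null && right == null;
        }

        string leftSub = left.SubItem == null ? null : left.SubItem.Name;
        string rightSub = right.SubItem == null ? null : right.SubItem.Name;

        return left.Name == right.Name
            && left.Start == right.Start
            && left.End == right.End
            && left.Status == right.Status
            && leftSub == rightSub;
    }
}
```

Comparing through an extension method keeps the model class free of test-only code, which matters when the class is swapped for a real app model.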

We’re testing 4 candidate serialization engines here, but maybe I’m missing some. So to be able to extend the list of engines in the benchmark, I hooked them into a framework. The app talks to the serializers through an abstract base class:

Abstract Serialization Base
    using System.Threading.Tasks;
    using Windows.Storage;

    /// <summary>
    /// Abstract base class for serializers.
    /// </summary>
    /// <typeparam name="T">The type of the instance.</typeparam>
    public abstract class AbstractSerializationBase<T> where T : class
    {
        /// <summary>
        /// Serializes the specified instance.
        /// </summary>
        /// <param name="instance">The instance.</param>
        /// <returns>The size of the serialized instance, in KB.</returns>
        public abstract Task<int> Serialize(T instance);

        /// <summary>
        /// Deserializes the instance.
        /// </summary>
        /// <returns>The instance.</returns>
        public abstract Task<T> Deserialize();

        /// <summary>
        /// Gets or sets the name of the file.
        /// </summary>
        public string FileName { get; set; }

        /// <summary>
        /// Gets or sets the folder.
        /// </summary>
        public StorageFolder Folder { get; set; }
    }
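
Timing a step against this base class could be done roughly as follows. This is a minimal sketch, not the code from the sample app: the Benchmark class and the MeasureAsync method are hypothetical names, and the helper only relies on Stopwatch and Task.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

// Hypothetical helper; not part of the sample app.
public static class Benchmark
{
    // Runs one asynchronous step and returns how long it took.
    public static async Task<TimeSpan> MeasureAsync(Func<Task> step)
    {
        Stopwatch stopwatch = Stopwatch.StartNew();
        await step();
        stopwatch.Stop();

        return stopwatch.Elapsed;
    }
}

// Possible usage with any AbstractSerializationBase<T> subclass:
//   TimeSpan writeTime = await Benchmark.MeasureAsync(() => serializer.Serialize(items));
//   TimeSpan readTime = await Benchmark.MeasureAsync(async () => { copy = await serializer.Deserialize(); });
```

Because the helper only takes a delegate, the same measurement works for every engine that is hooked into the framework.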

Each engine gets a concrete subclass. Here’s a short overview of the technologies:

XmlSerializer

Here’s how to serialize and deserialize with the XmlSerializer:

XmlSerializer
    public override async Task<int> Serialize(T instance)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(T));
        string content;
        using (StringWriter stringWriter = new StringWriter())
        {
            serializer.Serialize(stringWriter, instance);
            content = stringWriter.ToString();
        }

        StorageFile file = await this.Folder.CreateFileAsync(this.FileName, CreationCollisionOption.ReplaceExisting);
        await FileIO.WriteTextAsync(file, content);

        // Approximate size in KB, based on the character count.
        return content.Length / 1024;
    }

    public override async Task<T> Deserialize()
    {
        StorageFile file = await this.Folder.GetFileAsync(this.FileName);
        string content = await FileIO.ReadTextAsync(file);
        XmlSerializer serializer = new XmlSerializer(typeof(T));

        using (StringReader reader = new StringReader(content))
        {
            return (T)serializer.Deserialize(reader);
        }
    }


The XmlSerializer saves all public properties in the object graph, except the ones that are decorated with XmlIgnore:

XmlIgnore Attribute
    // Calculated - should not be serialized
    [XmlIgnore]
    public TimeSpan Duration
    {
        get
        {
            return this.End - this.Start;
        }
    }

 

The XML that is produced by this serializer can be customized through attributes, but that is outside the scope of this article. The XmlSerializer is the only one of the four that does NOT fire the methods that are flagged with the OnSerializing, OnSerialized, OnDeserializing and/or OnDeserialized attributes. That may be a showstopper in some scenarios. Also notice that the XmlSerializer by default removes insignificant white space, even in element content. If you don’t like that, you can change the setting. It’s not a configuration of the serializer itself: you have to add an extra field to each class to be serialized. Here’s a snippet from the Item and SubItem classes:

Setting 'xml:space' value
    // Tell the XmlSerializer to preserve white space.
    [XmlAttribute("xml:space")]
    public string Space = "preserve";


The XmlSerializer is not the fastest, nor does it generate the smallest files. But in every scenario that I went through, it has beaten all the other engines hands down when it came to deserialization. On average, the XmlSerializer deserializes twice as fast as the competition. So when your app needs to read a large amount of data from local storage at startup, then you should choose this one.

DataContractSerializer

The DataContractSerializer is the one used by WCF. Here’s how to serialize and deserialize with it:

DataContractSerializer
    public override async Task<int> Serialize(T instance)
    {
        DataContractSerializer serializer = new DataContractSerializer(typeof(T));
        string content;
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, instance);
            stream.Position = 0;
            using (var reader = new StreamReader(stream))
            {
                content = reader.ReadToEnd();
            }
        }

        StorageFile file = await this.Folder.CreateFileAsync(this.FileName, CreationCollisionOption.ReplaceExisting);
        await FileIO.WriteTextAsync(file, content);

        // Approximate size in KB, based on the character count.
        return content.Length / 1024;
    }

    public override async Task<T> Deserialize()
    {
        StorageFile file = await this.Folder.GetFileAsync(this.FileName);
        DataContractSerializer serializer = new DataContractSerializer(typeof(T));

        // Dispose the file stream as soon as the object has been read.
        using (var inputStream = await file.OpenReadAsync())
        using (Stream stream = inputStream.AsStreamForRead())
        {
            return (T)serializer.ReadObject(stream);
        }
    }

 

It will serialize all public properties that are decorated with the DataMember attribute, in classes that are decorated with the DataContract attribute:

DataMember Attribute
    [DataMember]
    public string Name { get; set; }


During the process, it will fire the OnSerializing, OnSerialized, OnDeserializing and OnDeserialized methods. It serializes the fastest. The files are smaller than the ones produced by the XmlSerializer, but not significantly.
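
To illustrate what such a callback can do, here is a minimal sketch that recalculates a non-serialized value after deserialization. The Meeting class, the Restore method, and the CallbackDemo helper are hypothetical and do not come from the sample app; the round trip goes through a MemoryStream instead of a file to keep the example self-contained.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;

[DataContract]
public class Meeting   // hypothetical type
{
    [DataMember]
    public DateTime Start { get; set; }

    [DataMember]
    public DateTime End { get; set; }

    // Not a DataMember: recalculated after each deserialization.
    public TimeSpan Duration { get; private set; }

    // Called by the DataContractSerializer once all members are restored.
    [OnDeserialized]
    public void Restore(StreamingContext context)
    {
        this.Duration = this.End - this.Start;
    }
}

public static class CallbackDemo
{
    // Serializes to memory and reads the object back.
    public static Meeting RoundTrip(Meeting meeting)
    {
        var serializer = new DataContractSerializer(typeof(Meeting));
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, meeting);
            stream.Position = 0;

            return (Meeting)serializer.ReadObject(stream);
        }
    }
}
```

With the XmlSerializer the Restore method would never run, so Duration would stay at its default value after loading.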

JsonSerializer

Here’s how to serialize and deserialize with the native JsonSerializer (the DataContractJsonSerializer class):

Native JsonSerializer
    public override async Task<int> Serialize(T instance)
    {
        var serializer = new DataContractJsonSerializer(instance.GetType());
        string content;
        using (MemoryStream stream = new MemoryStream())
        {
            serializer.WriteObject(stream, instance);
            stream.Position = 0;
            using (var reader = new StreamReader(stream))
            {
                content = reader.ReadToEnd();
            }
        }

        StorageFile file = await this.Folder.CreateFileAsync(this.FileName, CreationCollisionOption.ReplaceExisting);
        await FileIO.WriteTextAsync(file, content);

        // Approximate size in KB, based on the character count.
        return content.Length / 1024;
    }

    public override async Task<T> Deserialize()
    {
        StorageFile file = await this.Folder.GetFileAsync(this.FileName);
        string content = await FileIO.ReadTextAsync(file);
        var bytes = Encoding.Unicode.GetBytes(content);
        var serializer = new DataContractJsonSerializer(typeof(T));

        using (var stream = new MemoryStream(bytes))
        {
            return (T)serializer.ReadObject(stream);
        }
    }


It uses the same DataMember attribute as the DataContractSerializer to flag the properties to be serialized. During the process, it will fire the OnSerializing, OnSerialized, OnDeserializing and OnDeserialized methods. It serializes a bit faster than the XmlSerializer, and the saved files are smaller (although not always significantly). If you save in the Roaming Folder (with its limited storage), you should consider using Json (but keep on reading: there’s another Json serializer in the benchmark). Unfortunately the JsonSerializer is by far the slowest when it comes to deserialization, and it throws an exception on uninitialized DateTime values. The .NET default of DateTime.MinValue is beyond its range: the serializer converts non-UTC values to UTC first, and in time zones ahead of UTC that conversion pushes DateTime.MinValue below the smallest value it can represent.
[Screenshot: the exception thrown for an uninitialized DateTime]

So one way or another you have to make sure that your DateTime values remain in the Json range. This is what I did in the constructor of the business class:

Json DateTime Range
    public Item()
    {
        // To support Json serialization
        this.Start = DateTime.MinValue.ToUniversalTime();
        this.End = DateTime.MinValue.ToUniversalTime();
    }

I’m definitely not feeling comfortable with this.
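
A possible alternative is to mark the default value explicitly as UTC, so that the serializer has no time zone conversion to perform. The sketch below assumes that the exception only occurs for values whose Kind is not Utc; the Appointment class, the JsonDateDemo helper, and the SafeMinValue name are hypothetical.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Json;
using System.Text;

[DataContract]
public class Appointment   // hypothetical type
{
    [DataMember]
    public DateTime Start { get; set; }
}

public static class JsonDateDemo
{
    // DateTime.MinValue, explicitly marked as UTC (assumed to be safe for the Json serializer).
    public static readonly DateTime SafeMinValue =
        DateTime.SpecifyKind(DateTime.MinValue, DateTimeKind.Utc);

    // Serializes the instance to a Json string in memory.
    public static string ToJson(Appointment instance)
    {
        var serializer = new DataContractJsonSerializer(typeof(Appointment));
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, instance);
            byte[] bytes = stream.ToArray();

            return Encoding.UTF8.GetString(bytes, 0, bytes.Length);
        }
    }
}

// Possible usage:
//   string json = JsonDateDemo.ToJson(new Appointment { Start = JsonDateDemo.SafeMinValue });
```

This still requires every DateTime property to be initialized somewhere, so it is more a variation on the constructor approach than a real fix.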

Json.NET Serializer

Before you can use this serializer, you have to add a reference to the Json.NET assembly (e.g. through NuGet). Here’s how to serialize and deserialize with it:

Newtonsoft Json.NET Serializer
    public override async Task<int> Serialize(T instance)
    {
        // Lots of possible configurations, e.g.:
        // var settings = new JsonSerializerSettings { PreserveReferencesHandling = PreserveReferencesHandling.All };
        // Nice for debugging:
        // string content = JsonConvert.SerializeObject(instance, Formatting.Indented);
        string content = JsonConvert.SerializeObject(instance);
        StorageFile file = await this.Folder.CreateFileAsync(this.FileName, CreationCollisionOption.ReplaceExisting);
        await FileIO.WriteTextAsync(file, content);

        // Approximate size in KB, based on the character count.
        return content.Length / 1024;
    }

    public override async Task<T> Deserialize()
    {
        StorageFile file = await this.Folder.GetFileAsync(this.FileName);
        string content = await FileIO.ReadTextAsync(file);

        return JsonConvert.DeserializeObject<T>(content);
    }

When defining the properties to be serialized, you have the choice between opting in (serialized properties must be tagged with JsonProperty OR DataMember) or opting out (all public properties are serialized, except the ones flagged with JsonIgnore). Here’s an extract from the Item class again:

Json Attributes
    [JsonObject(MemberSerialization.OptIn)]
    public class Item
    {
        [JsonProperty]
        public DateTime Start { get; set; }

        // Calculated - should not be serialized
        [JsonIgnore]
        public TimeSpan Duration
        {
            get
            {
                return this.End - this.Start;
            }
        }

        // ...
    }

During the process, the Json.NET serializer will fire the OnSerializing, OnSerialized, OnDeserializing and OnDeserialized methods. It’s fast and it generates the smallest files. It’s also the most configurable: it comes with a lot of extra attributes and configuration settings. All of that comes with a price of course: the Newtonsoft.Json.dll adds more than 400KB to your app package. If you ask me, that’s a small price…

Conclusions

Of course you have to test these serialization engines against your own data and workload. But here are at least some general observations:

  • For small amounts of data, it doesn’t matter which technology you use. But adding Json.NET would just make your package larger.
  • Try to stick to one technology in your app. If you’re already using Json to fetch your server data, then use the same serializer to save locally.
  • If you’re dealing with large amounts of data, prepare to handle OutOfMemory exceptions. Unsurprisingly these are thrown when you run out of memory:
    [Screenshot: OutOfMemory exception when running out of memory]
  • But an OutOfMemory exception is also thrown when you run out of storage. I didn’t find any documentation of Local Storage limitations, but I do get exceptions when trying to allocate more than 100MB:
    [Screenshot: OutOfMemory exception when running out of storage]
  • If you need to deserialize a large amount of data on startup, prefer the XmlSerializer.
  • If you need one or more of the On[De]Serializ[ing][ed] methods, then don’t use the XmlSerializer.
  • If you need to store and retrieve local data, but none of the serialization engines covers your requirements, then you probably need normalization and indexing. In that case it’s time for a real database.
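
The advice about OutOfMemory exceptions in the list above could be applied with a small wrapper around the save operation. This is only a sketch under assumptions: the SafePersistence class, the TrySaveAsync method, and the -1 return convention are hypothetical, and a real app would probably also want to inform the user or free up some storage.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper; not part of the sample app.
public static class SafePersistence
{
    // Returns the size reported by the save operation,
    // or -1 when it ran out of memory (or storage).
    public static async Task<int> TrySaveAsync(Func<Task<int>> save)
    {
        try
        {
            return await save();
        }
        catch (OutOfMemoryException)
        {
            return -1;
        }
    }
}

// Possible usage with any of the serializers in this article:
//   int sizeInKb = await SafePersistence.TrySaveAsync(() => serializer.Serialize(items));
```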

Code

Here’s the full code for the sample app. It was written in Visual Studio 2013, for Windows 8.1: U2UConsult.WinRT.SerializationSample.zip (3MB)

Enjoy,
Diederik