When being lazy is (finally) good

In this blog post I want to talk about the new Lazy<T> class in .NET 4. First of all, why would you need something called Lazy?

You can use it for data access, for example. When you load a row from a parent table in a database, should you load the child rows right away, or delay until they’re required? Some systems delay loading automatically; others load everything they can (but what happens when the child rows have relations to grandchild rows, and so on?). This kind of delayed loading of data is just what Lazy<T> (or Lazy(Of T) when using VB.NET) supports.

It’s a great type to use when you have an object which is very expensive to create, and you only want to create it on first use.
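
To make the parent/child idea concrete, here is a minimal sketch. The Customer class, its Orders property and the LoadOrders method are hypothetical names invented for this illustration, and LoadOrders only stands in for a real database query. It uses the Lazy<T> constructor that takes a delegate.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical parent row; all names here are illustrative only.
class Customer
{
    private readonly Lazy<List<string>> orders;

    public Customer(int id)
    {
        Id = id;
        // Nothing is loaded here; the delegate runs on first access to Orders.
        orders = new Lazy<List<string>>(() => LoadOrders(id));
    }

    public int Id { get; private set; }

    public List<string> Orders
    {
        get { return orders.Value; }
    }

    // Stand-in for a real database query.
    private static List<string> LoadOrders(int customerId)
    {
        Console.WriteLine("Loading orders for customer {0}...", customerId);
        return new List<string> { "Order A", "Order B" };
    }
}

class Program
{
    static void Main()
    {
        Customer customer = new Customer(42);       // no child rows loaded yet
        Console.WriteLine(customer.Orders.Count);   // first access triggers the load
        Console.WriteLine(customer.Orders.Count);   // cached, no second load
    }
}
```

The child rows are only fetched if somebody actually asks for them, and only once.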

Let’s start with an example; let’s say you have this big-ass class:

    using System;

    class BigAndExpensive
    {
      string s = "";

      public string GetTheData()
      {
        return s;
      }

      public BigAndExpensive()
      {
        Console.WriteLine("BigAndExpensive is being created...");
        // deliberately wasteful: every concatenation allocates a new string
        for (int i = 0; i < 10000; i++)
          s = s + ".";
        Console.WriteLine("BigAndExpensive is finally created...");
      }
    }

As you can see, creating an instance is very expensive: every concatenation allocates a brand-new string, so the loop churns through roughly 100 MB of temporary strings in total, triggering a lot of garbage collections.

Let’s create an instance of this class without, then with Lazy<T> and look at the performance:

    BigAndExpensive be;
    Lazy<BigAndExpensive> lbe;

    using (new MeasureDuration("Not using Lazy evaluation"))
    {
      be = new BigAndExpensive();
    }
    using (new MeasureDuration("Accessing non-lazy object's method"))
    {
      string s = be.GetTheData();
    }
    using (new MeasureDuration("Using Lazy evaluation"))
    {
      lbe = new Lazy<BigAndExpensive>(false);
    }
    using (new MeasureDuration("Accessing lazy object's method"))
    {
      string s = lbe.Value.GetTheData();
    }
    using (new MeasureDuration("Again accessing lazy object's method"))
    {
      string s = lbe.Value.GetTheData();
    }

To use the object wrapped by a Lazy<T> you have to read its Value property. If the lazily loaded value hasn’t been created yet, accessing Value will create it.
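
If you want to check whether the value exists without triggering its creation, Lazy<T> also has an IsValueCreated property. A small sketch, using a plain string instead of the expensive class to keep it self-contained:

```csharp
using System;

class Program
{
    static void Main()
    {
        Lazy<string> lazyText = new Lazy<string>(() => "created on demand");

        Console.WriteLine(lazyText.IsValueCreated); // False: nothing created yet
        string text = lazyText.Value;               // first access runs the delegate
        Console.WriteLine(lazyText.IsValueCreated); // True: the value is now cached
        Console.WriteLine(text);
    }
}
```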

The MeasureDuration class is a little timer taking advantage of the using statement:

    using System;
    using System.Diagnostics;

    class MeasureDuration : IDisposable
    {
      Stopwatch sw;
      string what;

      // starts timing as soon as the object is created
      public MeasureDuration(string what)
      {
        this.what = what;
        sw = new Stopwatch();
        sw.Start();
      }

      // called at the end of the using block: stop and report
      public void Dispose()
      {
        sw.Stop();
        Console.WriteLine("Measured duration of -{0}- took {1} ticks ({2} ms)"
                         , what, sw.ElapsedTicks, sw.ElapsedMilliseconds);
      }

    }

The output I get on my machine looks like this:

[image: console output of the timing measurements]

As you can see, creating a Lazy object is very fast, but as you would expect, using it the first time is just as expensive as creating the object directly, because that is when the real instance gets built. Using it the second time is again very fast.

Now go back to the code, and look for the Lazy<T> constructor. Change the false argument to true:

    lbe = new Lazy<BigAndExpensive>(true);

This makes the creation of the actual instance thread-safe (passing true is the same as choosing LazyThreadSafetyMode.ExecutionAndPublication). It will be a little slower, but only during construction. Is it worth the price? If you’re using multiple threads: YES YES YES!

Now let’s try to see what happens when many threads access an unprotected Lazy object (never be lazy AND unprotected :))

This is the code:

    private static void UsingLazyObjectsFromMultipleThreads()
    {
      const int threadCount = 20;
      Lazy<BigAndExpensive> createMeOncePlease = new Lazy<BigAndExpensive>(isThreadSafe:false);

      ManualResetEvent youMayBegin = new ManualResetEvent(false);
      // CountdownEvent counts every signal; an AutoResetEvent can lose signals
      // when two threads call Set() before the main thread wakes up
      CountdownEvent done = new CountdownEvent(threadCount);

      // create a lot of threads that will use our object all at once
      for (int i = 0; i < threadCount; i++)
      {
        Thread t = new Thread(() =>
          {
            youMayBegin.WaitOne();
            Console.WriteLine("Thread {0} getting data", Thread.CurrentThread.ManagedThreadId);
            using (new MeasureDuration("Multithreading"))
              createMeOncePlease.Value.GetTheData();
            done.Signal();
          });
        t.Start();
      }
      youMayBegin.Set();
      // wait for all threads to complete
      done.Wait();
    }

I’ve now used the named argument feature of C# 4.0. In this case it makes the code a lot clearer, doesn’t it?

So what does the code do? It creates 20 threads which all first wait for the “youMayBegin” event. This way all threads start running at the same time. Then they each access the “createMeOncePlease” lazy instance, so some of them will start to create the instance (because it hasn’t been created yet). Finally they all signal that they’re done, so the main thread can stop too.

So let’s run the code (making sure the isThreadSafe is set to false). I get this:

[image: console output with isThreadSafe set to false]

This is bad. Very bad. Instead of calling the constructor of my very expensive object once, it calls it several times. Why?

Think about what a thread-unsafe implementation of Lazy<T> might look like:

    class Lazy<T> where T : class, new()
    {
      T instance = null;

      public T Value
      {
        get
        {
          if (instance == null)
            instance = new T();
          return instance;
        }
      }
    }

When several threads run the if statement at the same time, each of them can see instance == null before any of the others has assigned it. Each of those threads then creates its own object and overwrites instance’s value.
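
For comparison, here is how a thread-safe variant could be sketched with a lock and a second null check (double-checked locking). SimpleThreadSafeLazy is a made-up name, and this is only an illustration of the idea; the real Lazy<T> is more elaborate than this.

```csharp
using System;

// Hypothetical sketch, not the actual framework implementation.
class SimpleThreadSafeLazy<T> where T : class, new()
{
    private readonly object sync = new object();
    private volatile T instance;

    public T Value
    {
        get
        {
            if (instance == null)           // fast path: no lock once created
            {
                lock (sync)
                {
                    if (instance == null)   // re-check: another thread may have won
                        instance = new T();
                }
            }
            return instance;
        }
    }
}

class Program
{
    static void Main()
    {
        var lazy = new SimpleThreadSafeLazy<object>();
        Console.WriteLine(ReferenceEquals(lazy.Value, lazy.Value)); // True
    }
}
```

Only the thread that gets the lock first creates the object; the others wait at the lock and then find the instance already there.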

So what is the solution? Simply pass true for the isThreadSafe argument.

Running this code once more looks like this on my machine:

[image: console output with isThreadSafe set to true]

Good. My expensive object only gets created once. But why are all the calls still so expensive? That is because when we access Value, only one thread is allowed to create the instance, and the other Value calls have to wait for that first one to complete. If you insert another call to Value afterwards, you’ll see that it is very fast.

If you need finer control over thread safety, you can also use the constructor taking a LazyThreadSafetyMode enumeration. None gives you no thread safety at all, PublicationOnly lets several threads race to create the value but publishes only the first one that finishes, and ExecutionAndPublication uses a lock so that only one thread ever runs the initialization:

    None = 0,
    PublicationOnly = 1,
    ExecutionAndPublication = 2
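
To get a feeling for PublicationOnly, here is a small sketch. The created counter and the 50 ms sleep are my own additions to make a race more likely; how often the delegate runs will differ from run to run, so I’m not showing any output.

```csharp
using System;
using System.Threading;

class Program
{
    static int created; // counts how often the factory delegate ran

    static void Main()
    {
        var lazy = new Lazy<object>(
            () =>
            {
                Interlocked.Increment(ref created);
                Thread.Sleep(50); // pretend construction takes a while
                return new object();
            },
            LazyThreadSafetyMode.PublicationOnly);

        object[] seen = new object[4];
        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.Length; i++)
        {
            int index = i; // copy for the closure
            threads[i] = new Thread(() => seen[index] = lazy.Value);
            threads[i].Start();
        }
        foreach (Thread t in threads)
            t.Join();

        // The factory may have run more than once, but every thread got the same object.
        Console.WriteLine("Factory ran {0} time(s)", created);
        Console.WriteLine("Same instance everywhere: {0}",
            Array.TrueForAll(seen, o => ReferenceEquals(o, seen[0])));
    }
}
```

So PublicationOnly trades possible duplicate work for not holding a lock while the object is being created.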

What if your expensive class requires special construction, for example a constructor that takes arguments? Then you can use another constructor of Lazy<T>, one that takes a delegate (Func<T>), so you can create your object your way.

    Lazy<BigAndExpensive> createMeOncePlease =
      new Lazy<BigAndExpensive>(() => new BigAndExpensive());
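
The delegate becomes more interesting when the constructor actually needs arguments. A minimal sketch, where Report is a hypothetical class I made up because it has no parameterless constructor; the delegate is combined with the isThreadSafe flag:

```csharp
using System;

// Hypothetical class without a parameterless constructor.
class Report
{
    public Report(string title)
    {
        Title = title;
        Console.WriteLine("Report created");
    }

    public string Title { get; private set; }
}

class Program
{
    static void Main()
    {
        // Factory delegate plus the thread-safety flag.
        Lazy<Report> report =
            new Lazy<Report>(() => new Report("Monthly"), isThreadSafe: true);

        Console.WriteLine(report.Value.Title); // prints "Report created", then "Monthly"
        Console.WriteLine(report.Value.Title); // prints "Monthly" only
    }
}
```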