Properties with property changed event, part 3

A while ago, I talked about how to write basic events for changed properties, and about the INotifyPropertyChanged interface. There is a third way to manage events, which is especially useful when your class has many events, but you expect a very low number of them to be actually handled.

As discussed before, every EventHandler stored in your object takes up space, which is a bit of a waste if most of those will be null. But events allow you to implement the add and remove methods yourself, so you can choose where to store the EventHandler delegate.

One way to do that is through the System.ComponentModel.EventHandlerList class. System.ComponentModel.Component exposes an Events property of this type, which is used by all classes inheriting from Component, including all Windows Forms controls.

The following example inherits from System.Windows.Forms.TextBox, adding a CueBanner property (like the Internet Explorer 7 search box) and a CueBannerChanged event:

using System;
using System.ComponentModel;
using System.Windows.Forms;

namespace U2U.Framework.Windows.Forms
{
    /// <summary>
    /// Textbox that displays a CueBanner when the Text is empty.
    /// </summary>
    public class CueBannerTextBox : TextBox
    {
        private string cueBanner = string.Empty;

        /// <summary>
        /// Gets or sets the prompt text to display when there is nothing in the Text property.
        /// </summary>
        [Browsable(true)]
        [EditorBrowsable(EditorBrowsableState.Always)]
        [Category("Appearance")]
        [Description("The prompt text to display when there is nothing in the Text property.")]
        [DefaultValue("")]
        public string CueBanner
        {
            get { return cueBanner; }
            set
            {
                if (value == null)
                {
                    value = string.Empty;
                }
                if (value != cueBanner)
                {
                    cueBanner = value;
                    NativeMethods.SendMessage(Handle, EM_SETCUEBANNER, IntPtr.Zero, cueBanner);
                    OnCueBannerChanged(EventArgs.Empty);
                }
            }
        }

        private const int EM_SETCUEBANNER = 0x1501;
        private static readonly object EVENT_CUEBANNERCHANGED = new object();

        /// <summary>
        /// Occurs when the value of the <see cref="CueBanner"/> property has changed.
        /// </summary>
        [Category("Property Changed")]
        [Description("Event raised when the value of the CueBanner property changed.")]
        public event EventHandler CueBannerChanged
        {
            add { base.Events.AddHandler(EVENT_CUEBANNERCHANGED, value); }
            remove { base.Events.RemoveHandler(EVENT_CUEBANNERCHANGED, value); }
        }

        /// <summary>
        /// Raises the CueBannerChanged event.
        /// </summary>
        /// <param name="e">An <see cref="EventArgs"/> that contains the event data.</param>
        protected virtual void OnCueBannerChanged(EventArgs e)
        {
            EventHandler handler = base.Events[EVENT_CUEBANNERCHANGED] as EventHandler;
            if (handler != null)
            {
                handler(this, e);
            }
        }
    }
}


You'll also need this:

using System;
using System.Runtime.InteropServices;

namespace U2U.Framework.Windows.Forms
{
    internal static class NativeMethods
    {
        [DllImport("user32", CharSet = CharSet.Unicode)]
        internal static extern IntPtr SendMessage(IntPtr hWnd, int message, IntPtr wParam, string lParam);
    }
}


Notice how the CueBannerChanged event provides an explicit implementation of the add and remove methods. The add method adds the delegate to the inherited Events collection, using a private static object as the key, and the remove method removes it from that same collection.

The OnCueBannerChanged method retrieves the delegate from the same collection, using the same key. Notice how this collection can store any type of delegate, not just EventHandlers, so we need to cast it back to EventHandler before we can use it.
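
The same pattern works outside Windows Forms too. Here is a minimal sketch for a class that does not inherit from Component and therefore has to own its EventHandlerList itself. The Sensor class, its two properties and the Raise helper are made up for this illustration; only EventHandlerList and its AddHandler, RemoveHandler and indexer members come from the framework.

```csharp
using System;
using System.ComponentModel;

// Hypothetical class for illustration: it owns its EventHandlerList
// instead of inheriting the Events property from Component.
public class Sensor : IDisposable
{
    // One key object per event; keys are static, so they cost nothing per instance.
    private static readonly object EventNameChanged = new object();
    private static readonly object EventReadingChanged = new object();

    // A single list per instance, however many events the class declares.
    private readonly EventHandlerList events = new EventHandlerList();

    private string name = string.Empty;
    private double reading;

    public event EventHandler NameChanged
    {
        add { events.AddHandler(EventNameChanged, value); }
        remove { events.RemoveHandler(EventNameChanged, value); }
    }

    public event EventHandler ReadingChanged
    {
        add { events.AddHandler(EventReadingChanged, value); }
        remove { events.RemoveHandler(EventReadingChanged, value); }
    }

    public string Name
    {
        get { return name; }
        set
        {
            if (value == null) { value = string.Empty; }
            if (value != name)
            {
                name = value;
                Raise(EventNameChanged, EventArgs.Empty);
            }
        }
    }

    public double Reading
    {
        get { return reading; }
        set
        {
            if (value != reading)
            {
                reading = value;
                Raise(EventReadingChanged, EventArgs.Empty);
            }
        }
    }

    // The list stores plain Delegate references, so cast back before invoking.
    private void Raise(object key, EventArgs e)
    {
        EventHandler handler = events[key] as EventHandler;
        if (handler != null)
        {
            handler(this, e);
        }
    }

    public void Dispose()
    {
        events.Dispose();
    }
}
```

Subscribing looks exactly like it does for a field-like event (sensor.NameChanged += handler;), so callers never see where the delegate is stored.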

Enjoy.


Orcas: February 2008 it will be

Microsoft named the date. In the latest MSDN Flash, they said: "[...] February 2008 is shaping up to be Microsoft's largest launch month - ever - with RTMs of Visual Studio 2008, Windows Server 2008, and SQL Server 2008 all on tap."

Update: In case you haven't heard, Visual Studio 2008 was released on Monday 11/19. The big launch party is still planned for February in Las Vegas.

In the meantime, Rico Mariani concludes his series on LINQ To SQL performance. It looks like it will be easy to get performance virtually on par with manually optimized code using SqlDataReader directly. That's great news, especially if you realize that most code today isn't manually optimized that way. For example, based on the figures he gives, I expect that LINQ To SQL will outperform the TableAdapters and typed DataSets you generate in Visual Studio 2005.

Unfortunately, we can't verify this yet, since the current Orcas Beta 1 doesn't have the optimizations they did to get to this result. But Beta 2 should have them, and that's due to be available "later this summer". That means it might be around when I'm back from my holidays!


Versioning .NET Assemblies

Some questions seem to come and go in waves. Recently, several people asked me about versioning .NET assemblies. There are at least four attributes in the BCL that allow you to specify version information for a .NET assembly: AssemblyVersionAttribute, AssemblyFileVersionAttribute, ComCompatibleVersionAttribute and AssemblyInformationalVersionAttribute. How should you use them?

The Assembly Version (AssemblyVersionAttribute) reflects the version of the specification of the assembly. It changes when the API changes (types or methods added, modified or removed), or when the semantics of the API change (a method now does something functionally different). When neither of these conditions is met, existing clients will be compatible with the new “version”, and the Assembly Version should not change.

The File Version (AssemblyFileVersionAttribute) reflects the distribution. It changes when the binary image of the Assembly changes, even when the Assembly Version does not. Typically, this is the result of bug fixes or internal optimizations.

While in theory the File Version allows any string to be used as a value, it is highly recommended to use a four-number version string, following the same syntax and semantics as the Assembly Version.

These version numbers consist of four numbers in the range 0 to 65534. The four values indicate:

  • Major Version: change when features have been modified or removed.
  • Minor Version: change when features have been added or Major version changed.
  • Build number: change when bugs have been fixed or Minor Version changed.
  • Revision number: change when non-functional improvements were made or the Build number changed.

When using a correct numbering scheme, compatibility between versions is as follows:

  • A change in major version: the new version is not compatible with the old version.
  • A change in minor version (but not in major version): the new version is backwards compatible with the old version, but not forward compatible. Applications using the new features don’t work with the old version, but old applications do work with the new version.
  • A change in build number (but not in major or minor version): the new version is binary compatible with the old version, both forward and backward. A change in behavior may be observed as a result of a bug fix.
  • A change in revision number only: the new version is binary compatible with the old version, both forward and backward. Only non-functional changes in behavior, such as changed performance characteristics, may be observed as a result of non-functional changes.

When one of these numbers is modified, typically all lower level numbers are reset to zero.
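
To make the reset rule concrete, here is a small sketch built on System.Version. The ChangeKind enum and the VersionBumper class are names I made up for this example; they are not part of the framework.

```csharp
using System;

// Hypothetical classification of a change, one value per version component.
public enum ChangeKind
{
    FeaturesChangedOrRemoved, // bumps the major version
    FeaturesAdded,            // bumps the minor version
    BugFix,                   // bumps the build number
    NonFunctional             // bumps the revision number
}

public static class VersionBumper
{
    // Increments the component that matches the change and resets
    // all lower-level components to zero.
    public static Version Bump(Version current, ChangeKind change)
    {
        if (current == null) throw new ArgumentNullException("current");

        // Version reports -1 for a build or revision that was never specified.
        int build = Math.Max(current.Build, 0);
        int revision = Math.Max(current.Revision, 0);

        switch (change)
        {
            case ChangeKind.FeaturesChangedOrRemoved:
                return new Version(current.Major + 1, 0, 0, 0);
            case ChangeKind.FeaturesAdded:
                return new Version(current.Major, current.Minor + 1, 0, 0);
            case ChangeKind.BugFix:
                return new Version(current.Major, current.Minor, build + 1, 0);
            case ChangeKind.NonFunctional:
                return new Version(current.Major, current.Minor, build, revision + 1);
            default:
                throw new ArgumentOutOfRangeException("change");
        }
    }
}
```

For example, VersionBumper.Bump(new Version(2, 3, 4, 5), ChangeKind.FeaturesAdded) yields 2.4.0.0.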

If an assembly exposes types defined in another assembly in its public API, and the other assembly’s Assembly version changes, then the Assembly version of this assembly should change as well. If the other assembly has increased its major version, increase this assembly’s major version as well. Avoid exposing types defined in third-party assemblies, in order to limit this problem.

Summary:

Changed version number   Reason                        Compatibility
Major version            Features changed or removed   None
Minor version            Features added                Backward only
Build number             Bugs fixed                    Forward and backward
Revision number          Non-functional changes        Forward and backward

Typically, the Major Version and Minor Version are the same in the Assembly Version and in the File Version. The Assembly Version will have the build number and revision number equal to zero, while the File Version is updated with every bug fix or non-functional improvement.
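
In AssemblyInfo.cs, that convention could look like the fragment below. The version numbers are invented for illustration; the point is only that both attributes share the major and minor numbers while the build and revision numbers differ.

```csharp
using System.Reflection;

// Specification version: major and minor only, build and revision stay at zero.
[assembly: AssemblyVersion("2.1.0.0")]

// Distribution version: same major and minor, while build and revision
// change with every bug fix or non-functional improvement.
[assembly: AssemblyFileVersion("2.1.3.1")]
```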

[Screenshot: Assembly Information dialog]

When the Assembly Version includes the Minor version, strong named assemblies compiled against the previous version will not pick up the new version from the GAC. You need to recompile the client assemblies against the new version, configure the client applications, or create a publisher policy. Likewise, assemblies compiled against the new (minor) version won't accidentally pick up the old version.

When you leave the build and revision numbers equal to zero in the Assembly version, you don't need to do anything when you want to deploy a bug fix or internal optimization.

The VB and C# compilers generate an operating system resource in the assembly file, such that several of these attributes show up in the Version tab of the file properties dialog box. Please note that the C++ compiler does not do this. In C++, a version resource needs to be added manually by the developer, and the developer must manually synchronize its contents with the assembly level attributes.

[Screenshot: U2U.Framework.ApplicationLayer.dll Properties dialog]

Typically, there should be no reason to include a ComCompatibleVersionAttribute or AssemblyInformationalVersionAttribute.

String.Trim() has problems

String.Trim() has some problems:

  1. It has bugs
  2. It is slow
  3. It is too often abused

String.Trim() has bugs

Yep, I'm not joking. The documentation says: "Removes all leading and trailing white-space characters from the current String object". So what do you expect the following program to do?

for (int i = (int)char.MinValue; i <= (int)char.MaxValue; i++) 
{ 
    char c = (char)i; 
    string s = c.ToString(); 
    bool charIsWhiteSpace = char.IsWhiteSpace(c); 
    bool trimTreatsCharAsWhiteSpace = s.Trim() == ""; 
    if (charIsWhiteSpace != trimTreatsCharAsWhiteSpace) 
    { 
        Console.WriteLine("Problem with char {0:X}: charIsWhiteSpace == {1}, trimTreatsCharAsWhiteSpace == {2}.", 
             (int) c, charIsWhiteSpace, trimTreatsCharAsWhiteSpace); 
    } 
}


According to the documentation, I would expect this to write nothing to the console, but here you go:

Problem with char 180E: charIsWhiteSpace == True, trimTreatsCharAsWhiteSpace == False.
Problem with char 200B: charIsWhiteSpace == False, trimTreatsCharAsWhiteSpace == True.
Problem with char 202F: charIsWhiteSpace == True, trimTreatsCharAsWhiteSpace == False.
Problem with char 205F: charIsWhiteSpace == True, trimTreatsCharAsWhiteSpace == False.
Problem with char FEFF: charIsWhiteSpace == False, trimTreatsCharAsWhiteSpace == True.
 

I looked them up:

180E: Mongolian vowel separator
200B: Zero width space
202F: Narrow no-break space
205F: Medium mathematical space
FEFF: Zero width no-break space

I'm not sure about the Mongolian thing ;-), but the others do look like white space to me.

String.Trim() is slow

Granted, this function has quite a bit of work to do. Basically, it needs to:

  1. Find the first non-white-space character in the string
  2. If there is none, return the empty string
  3. Find the last non-white-space character in the string
  4. If you get the entire string, just return it, otherwise return an appropriate substring.

And basically, that's what the function actually does. Well, in fact it swaps steps 2 and 3, so all-space strings are actually scanned twice: once from the front and once from the back. Much worse is how the searching is done, or in fact, how a character is determined to be white-space or not. Reflector shows a pretty weird loop construct there. Unrolling that inner loop could speed up things considerably.
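
For reference, the four steps in their listed order could be sketched like this. This is not the framework's implementation: the TrimSketch class is hypothetical, and it uses char.IsWhiteSpace as the white-space test, which (as shown above) does not agree with String.Trim() for every character.

```csharp
using System;

public static class TrimSketch
{
    public static string Trim(string s)
    {
        if (s == null) throw new ArgumentNullException("s");

        // Step 1: find the first non-white-space character.
        int start = 0;
        while (start < s.Length && char.IsWhiteSpace(s[start]))
        {
            start++;
        }

        // Step 2: there is none, so the result is the empty string.
        // An all-space string is scanned only once this way.
        if (start == s.Length)
        {
            return string.Empty;
        }

        // Step 3: find the last non-white-space character. The loop stops
        // at 'start' at the latest, because s[start] is not white space.
        int end = s.Length - 1;
        while (char.IsWhiteSpace(s[end]))
        {
            end--;
        }

        // Step 4: nothing to trim means no allocation; otherwise cut a substring.
        if (start == 0 && end == s.Length - 1)
        {
            return s;
        }
        return s.Substring(start, end - start + 1);
    }
}
```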

String.Trim() is too often abused

Have you ever written code like this:

if (s != null && s.Trim() != "")
{ 
    // ... 
}


Don't! In a test like that, you're not interested in the actual result, so don't calculate it! Look at the algorithm again: of the four steps mentioned, we only need the first one executed. We don't need the actual substring!

Now take a look at this function:

static bool IsEmptyOrWhiteSpace(string s) 
{ 
    foreach (char c in s) 
    { 
        if ((c < (char)9 || c > (char)13) && 
            c != ' ' && 
            c != (char)133 && 
            c != (char)160 && 
            c != (char)5760 && 
            (c < (char)8192 || c > (char)8203) && 
            c != (char)8232 && 
            c != (char)8233 && 
            c != (char)12288 && 
            c != (char)65279) 
        { 
            return false; 
        } 
    } 
    return true;
}


It's compatible with String.Trim(), so it has the same bugs mentioned above. But that makes it a perfect replacement in s.Trim() != "" tests: using !IsEmptyOrWhiteSpace(s) instead won't change the semantics of your code.

I did a small benchmark, comparing its speed and memory usage with this function:

static bool Naive(string s) 
{ 
    return s.Trim() == ""; 
}


And these are the results:

Length:   0 WhiteSpace at start:   0 WhiteSpace at end:   0 
 Naive                      Time:     1,65 GC:      0
 IsEmptyOrWhiteSpace        Time:     0,17 GC:      0 
 
Length:   1 WhiteSpace at start:   1 WhiteSpace at end:   1 
 Naive                      Time:     5,27 GC:      0
 IsEmptyOrWhiteSpace        Time:     0,27 GC:      0 
 
Length: 100 WhiteSpace at start:   0 WhiteSpace at end:   0 
 Naive                      Time:    12,44 GC:      0
 IsEmptyOrWhiteSpace        Time:     0,72 GC:      0 
 
Length: 100 WhiteSpace at start:   1 WhiteSpace at end:   1 
 Naive                      Time:    37,23 GC:  20733
 IsEmptyOrWhiteSpace        Time:     1,08 GC:      0 
 
Length: 100 WhiteSpace at start:  44 WhiteSpace at end:  55 
 Naive                      Time:   177,77 GC:   1920
 IsEmptyOrWhiteSpace        Time:    20,46 GC:      0 
 
Length: 100 WhiteSpace at start: 100 WhiteSpace at end: 100 
 Naive                      Time:   169,46 GC:      0
 IsEmptyOrWhiteSpace        Time:    38,50 GC:      0
 

I called each function millions of times for different strings. For each string, I display its length and the number of leading and trailing spaces (ASCII 32). The first column shows actual time, second column shows the number of generation 0 garbage collection runs. Both functions were called through a delegate, which allowed me to take loop and call overhead into account. Obviously, I compiled with optimizations turned on.
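
A harness along those lines could look like the sketch below. It is not the code I used for the numbers above: the MiniBenchmark class, the warm-up call and the idea of subtracting the time of an empty delegate as the overhead baseline are assumptions made for this illustration. Stopwatch and GC.CollectionCount are the real framework APIs.

```csharp
using System;
using System.Diagnostics;

public static class MiniBenchmark
{
    // Times 'iterations' calls of f(input) and counts generation 0 collections.
    private static double Time(Predicate<string> f, string input, int iterations, out int gen0)
    {
        // Start from a clean heap so earlier garbage is not attributed to f.
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        int before = GC.CollectionCount(0);
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            f(input);
        }
        watch.Stop();

        gen0 = GC.CollectionCount(0) - before;
        return watch.Elapsed.TotalMilliseconds;
    }

    // Returns the elapsed milliseconds minus the loop and delegate-call overhead.
    // The result can come out slightly negative for very cheap functions (timer noise).
    public static double Measure(Predicate<string> candidate, string input, int iterations, out int gen0)
    {
        int ignored;

        // Warm-up call, so JIT compilation is not part of the measurement.
        Time(candidate, input, 1, out ignored);

        // Baseline: a delegate that does nothing useful.
        double overhead = Time(delegate(string s) { return false; }, input, iterations, out ignored);

        double total = Time(candidate, input, iterations, out gen0);
        return total - overhead;
    }
}
```

A call such as MiniBenchmark.Measure(Naive, new string(' ', 100), 1000000, out gen0) then gives one row of a table like the one above.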

As you can see, the IsEmptyOrWhiteSpace function is always faster than the naive approach, sometimes by more than an order of magnitude! A common case in many applications is a long string, e.g. an HTML document, ending with a CR/LF combination. That's actually the worst case for the naive approach, while being the best case for the IsEmptyOrWhiteSpace function.

Obviously, IsEmptyOrWhiteSpace doesn't do any memory allocations, while the naive approach in certain cases does.

Oh, and in case you wondered: I tried for loops and an unsafe version as well, both were slower than the simple foreach loop used above.
