U2U Consult TechDays 2011 CD Available for Download

At TechDays 2011 in Antwerp, U2U Consult distributed a CD-ROM with two free tools. I’m happy to announce that the contents of the CD-ROM are now also available for download from our web site.

The U2U Consult SQL Database Analyzer is a tool for SQL Server database administrators and developers. It displays diagnostic information about a database and its hosting instance that is hard to collect when you only use the standard SQL Server tools. Just point the tool at a SQL Server database of your choice, and have a look at the reports generated by the tool.

The U2U Consult Code Analysis Rules for Visual Studio 2010 are a series of additional code analysis rules for Visual Studio 2010 Premium or Ultimate, and two rule sets with recommended rules for libraries and applications. The rules include additional general performance and design rules, as well as a series of rules specifically for WCF. All rules are documented on the CD-ROM. Obviously, they are applicable to all .NET languages, including C# and VB.

With this CD, for the first time we make two of our own tools available to you. These are only two small components out of the U2U Consult Framework, but we hope they are as useful to you as they are to us and our clients. Enjoy.

StyleCop for C# released

There have been rumors for years about a tool called StyleCop, used internally within Microsoft. According to the rumors, it was comparable to FxCop (Code Analysis), but would do its job at the source level (instead of the IL level used by FxCop). That way it would be able to check consistency of code style, you know, where to put spaces and comments and line breaks and stuff.

StyleCop has finally been released, and it turns out the rumors were true. It's a Visual Studio Add-In that sits nicely in the Project menu, right below Code Analysis.

[Screenshot: Source Analysis]

As you can imagine, the rules caused a lot of debate. The thing is, everybody can understand that it's a good idea not to declare protected members in sealed types (for example), but matters of style can't be debated rationally. After all, it's a matter of taste, or is it not?

As I've said before, "I hate it when developers have to make choices like that during routine development. Choosing takes time, and that's not likely to improve productivity. But much worse is the fact that different developers will make different choices. Even a single developer may make different choices from one day to the next. That leads to inconsistencies in the code. Developers will spend more time trying to understand the code they're reading, because it doesn't always follow the same pattern. That's bad for productivity. In the worst case scenario, developers start rewriting each other's code, just so it matches their choice of the day. That kills productivity."

So no, it's not a matter of style, it's all about productivity. What your standard is doesn't matter, what matters is that you have a standard, and that people follow it without wasting time.

So naturally, I took Source Analysis for a test drive on a bunch of code I have written. First impression: lots and lots of warnings! But many of them are the same warning repeated over and over, so I made just a few changes to the settings:

[Screenshot: Microsoft Source Analysis Project Settings]

Only a handful of warnings remained, and to be honest, they had a point. It wasn't much, but my code improved thanks to this tool. And this was just my own code. The real value of a tool like this lies in the consistency it can bring to team projects, ending all pointless debates and holy wars about personal preferences.

The XML based file headers are clearly a Microsoft internal thing. But hey, if you want a copyright notice in every file, you might just as well do it this way. Remember, having a standard is important, which one it is doesn't matter (much).

Conclusion: very good addition to the toolbox, highly recommended.

New Reflector add-in: AssemblyInfo

I'm happy to announce the availability of AssemblyInfo version 2.0. AssemblyInfo is now a Reflector add-in that adds a new language to Reflector.

In addition to what Reflector already shows you, it will show:

  • General file information, such as size, creation date and last modified date
  • All file version information embedded in the exe or dll
  • The Authenticode X509 certificate used to sign the file, if any
  • All useful information from the COFF and PE headers
  • The sections and their characteristics
  • The native dll imports
  • All useful information from the CLR header, if any (it works on unmanaged files too)

AssemblyInfo will also show you all information on other .NET metadata, such as types and their members, in a way as close as possible to the internal .NET structures. It's like looking at metadata the way the CLR looks at it, at least that was my intention ;-).

Download AssemblyInfo, unzip it to a location of your choice, and add it to Reflector via the View menu, Add-ins option. Select the AssemblyInfo language from the language dropdown and you're ready to go.

For example, here's the output on the AssemblyInfo.exe module:

// Module AssemblyInfo.exe  
 Location            : C:\Program Files\Reflector\AssemblyInfo.exe 
 Size                : 49,5 KB
 Created             : 15/12/2007 13:20:24
 Modified            : 15/01/2008 23:12:42
File Version Info:
 Type:               : Executable (EXE)
 File Version:       :
 Product Version:    :
 Flags:              : None
Language: Language Neutral, Codepage 1200:
 CompanyName         : Kris Vandermotten
 FileDescription     : AssemblyInfo
 FileVersion         :
 InternalName        : AssemblyInfo.exe
 LegalCopyright      : Copyright (c) 2007, 2008 Kris Vandermotten
 OriginalFilename    : AssemblyInfo.exe
 ProductName         : AssemblyInfo
 ProductVersion      :
 Assembly Version    : 
Authenticode X509 Certificate:
COFF Header:
 Type                : Executable (EXE)
 Memory              : 32 bit
 Target machine      : I386
 Timestamp           : 15/01/2008 23:12:42
 Characteristics     : ExecutableImage, LineNumsStripped, LocalSymsStripped, Machine32Bit
PE Header:
 Kind                : 32 bit PE file
 Linker Version      : 8.0
 Image Base Address  : 0x00400000
 Section Alignment   : 8192
 File Alignment      : 512
 OS Version          : 4.0
 Image Version       : 0.0
 Subsystem           : WINDOWS_CUI
 Subsystem Version   : 4.0
 Stack Reserve       : 0x00100000 (1 MB)
 Stack Commit        : 0x00001000 (4 KB)
 Heap Reserve        : 0x00100000 (1 MB)
 Heap Commit         : 0x00001000 (4 KB)
 .text               : 47 KB (ContainsCode, MemExecutable, MemReadable)
 .rsrc               : 1,5 KB (ContainsInitializedData, MemReadable)
 .reloc              : 512 bytes (ContainsInitializedData, MemDiscardable, MemReadable)
Native imports:
CLR header:
 Runtime version     : 2.5 (v2.0.50727)
 Flags               : IlOnly, StrongNameSigned
 MetaData            : 30,37 KB
 Managed Resources   : 0 bytes
 VTableFixups        : 0 bytes
 Native Export Thunks: 0 bytes 
 Version GUID        : 3a1cd2b7-7cb5-4c65-8256-bd1e5a783e56 

I could not have written this without the book Expert .NET 2.0 IL Assembler by Serge Lidin, a book that I can highly recommend if you want to know more about the inner workings of .NET.

Enjoy, and let me know what you think.

Free tool: AssemblyInfo

Have you ever come across an assembly file you wanted to know as much as possible about, without running it? Have you ever had problems with deployment, because you weren't sure what version of a DLL you copied to a machine? How do you tell the difference between a 32 bit and 64 bit assembly anyway?

I created a little command line tool, which dumps a lot of information about an assembly. For example, here is the output of the tool when run against itself:

AssemblyInfo (c) 2007 Kris Vandermotten 
COFF Header: 
 File is executable (EXE). 
 File is 32 bit. 
 Target machine:     I386 
PE Header: 
 File is a 32 bit PE file. 
 Linker Version:     8.0 
 Image Base Address: 0x00400000 
 Section Alignment:  8192 
 File Alignment:     4096 
 OS Version:         4.0 
 Image Version:      0.0 
 Subsystem Version:  4.0 
 Subsystem:          WINDOWS_CUI 
Version information: 
 File Version: 
 Product Version: 
 Language: Language Neutral, Codepage 1200 
  OriginalFilename:  AssemblyInfo.exe 
  InternalName:      AssemblyInfo.exe 
  CompanyName:       Kris Vandermotten 
  LegalCopyright:    Copyright (c) 2007 Kris Vandermotten 
  ProductName:       AssemblyInfo 
 Name:               AssemblyInfo, Version=, Culture=neutral, PublicKeyToken=e56d967875f629d3 
 Hash Algorithm:     SHA1 
 Processor:          MSIL 
 Runtime Version:    v2.0.50727 
 EntryPoint:         Void Main(System.String[]) in AssemblyInfo.Program
  PE Kind:           ILOnly 
  Machine:           I386 
  X509 Certificate:  none 
Referenced Assemblies: 
 mscorlib, Version=, Culture=neutral, PublicKeyToken=b77a5c561934e089 
Assembly Attributes: 
 [System.Runtime.CompilerServices.RuntimeCompatibilityAttribute(WrapNonExceptionThrows = True)] 
 [System.Reflection.AssemblyCopyrightAttribute("Copyright (c) 2007 Kris Vandermotten")] 
 [System.Reflection.AssemblyCompanyAttribute("Kris Vandermotten")] 

Download AssemblyInfo and enjoy.
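
To give an idea of where the "32 bit or 64 bit" answer comes from, here is a minimal sketch that reads the machine type and the PE kind straight from the file headers. It is not the source of AssemblyInfo; the class name PeInspector and the method Describe are made up for this illustration, and only three machine types are recognized. Keep in mind that a pure IL assembly (PE Kind ILOnly, as in the output above) reports I386 and PE32 even though it can run in a 64 bit process; for those files the flags in the CLR header decide, and this sketch does not read them.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration only, not part of AssemblyInfo.
public static class PeInspector
{
    // Returns the target machine and PE kind of an EXE or DLL, e.g. "I386, PE32 (32 bit)".
    public static string Describe(Stream stream)
    {
        BinaryReader reader = new BinaryReader(stream);

        // DOS header: the file starts with "MZ" (0x5A4D when read as little endian).
        if (reader.ReadUInt16() != 0x5A4D)
            throw new InvalidDataException("Missing MZ signature.");

        // Offset 0x3C of the DOS header holds the file offset of the PE signature.
        stream.Seek(0x3C, SeekOrigin.Begin);
        uint peOffset = reader.ReadUInt32();
        stream.Seek(peOffset, SeekOrigin.Begin);
        if (reader.ReadUInt32() != 0x00004550) // "PE\0\0"
            throw new InvalidDataException("Missing PE signature.");

        // COFF header (20 bytes): the first field is the target machine.
        ushort machine = reader.ReadUInt16();
        stream.Seek(18, SeekOrigin.Current); // skip the rest of the COFF header

        // Optional (PE) header: the magic number tells PE32 from PE32+.
        ushort magic = reader.ReadUInt16();

        string machineName =
            machine == 0x014C ? "I386" :
            machine == 0x8664 ? "AMD64" :
            machine == 0x0200 ? "IA64" :
            "0x" + machine.ToString("X4");
        string kind =
            magic == 0x010B ? "PE32 (32 bit)" :
            magic == 0x020B ? "PE32+ (64 bit)" :
            "unknown";
        return machineName + ", " + kind;
    }
}
```

To try it, open a file with File.OpenRead and pass the stream to PeInspector.Describe.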

Generate documentation for your FxCop rules

I must admit that I am a big fan of FxCop (or Visual Studio "Code Analysis"). I believe the power of FxCop is underestimated, and the FxCop team more than deserved the award they got.

Many organizations develop their own frameworks, to support the applications they are building, to increase the productivity of their developers, to enforce standards and reduce maintenance costs. Building custom Code Analysis rules is a powerful way to augment those frameworks. You can verify that the frameworks are used the way they are supposed to be used. And if not, you can politely assist the developer to use the framework correctly.

Developing custom Code Analysis rules is not hard at all, and you can expect some posts in the future about the subject. But first I want to talk about something else: those rules need to be documented too!

As it turns out, every rules assembly contains an XML resource with a description for all the rules in the assembly. So why would you want to duplicate that information in separate documentation? Well, for the same reasons that you want to generate proper documentation from your XML comments in C#, C++ or VB source code: make it human (=user) readable!

So I created a little tool to generate an HTML file based on the XML resource in the rules assembly. I intend to build on this in the future, but a 1.0 version is here, ready to be downloaded. To get an idea of the output it produces, here's a screen shot of the HTML for one of the standard rules assemblies (the smallest one):


Obviously, there isn't much of a point in generating this for the standard rules, since they're well documented on MSDN. But this may come in handy for your own rules. Let me know if you like the idea, or which additional features you need.
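
The core of such a generator can be quite small. The following is only a sketch of the idea, not the code of the tool: it assumes the rules XML has a Rules root element with a FriendlyName attribute, and Rule elements with a CheckId attribute and Name and Description child elements. The class name RuleDocGenerator is made up for this example. In a real tool, the XML would first be read from the rules assembly with Assembly.GetManifestResourceStream.

```csharp
using System.Net;
using System.Text;
using System.Xml.Linq;

// Hypothetical sketch; the element and attribute names are assumptions
// about the layout of the rules XML resource.
public static class RuleDocGenerator
{
    // Turns the rules XML into a very plain HTML page.
    public static string ToHtml(string rulesXml)
    {
        XElement rules = XElement.Parse(rulesXml);
        StringBuilder html = new StringBuilder();
        html.Append("<html><body>");
        html.AppendFormat("<h1>{0}</h1>",
            WebUtility.HtmlEncode((string)rules.Attribute("FriendlyName")));

        foreach (XElement rule in rules.Elements("Rule"))
        {
            // One heading per rule, followed by its description.
            html.AppendFormat("<h2>{0}: {1}</h2>",
                WebUtility.HtmlEncode((string)rule.Attribute("CheckId")),
                WebUtility.HtmlEncode((string)rule.Element("Name")));
            html.AppendFormat("<p>{0}</p>",
                WebUtility.HtmlEncode((string)rule.Element("Description")));
        }

        html.Append("</body></html>");
        return html.ToString();
    }
}
```

All text is HTML-encoded before it is written, so characters such as < and & in rule descriptions don't break the page.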

Tools I use for C# development

I'm often asked what tools I recommend for (general) .NET development, or at least which tools I use on a regular basis. Here's a list:

  • Visual Studio Team Suite. If you can't get Team Suite, get the Professional version. If you can't get that one, use the Express version.
  • Code Analysis, which is built into Visual Studio Team Suite. FxCop does the same thing, and is free to download.
  • .NET Reflector. After MSDN Help (and possibly Google), this tool delivers the best documentation on the .NET Framework. Another free download.
  • The unit testing framework that is built into Visual Studio Team Suite. If you don't have Team Suite, NUnit is a very good, free alternative, possibly in combination with NCover. TestDriven.NET integrates them into Visual Studio. All free.
  • If I need to edit graphics such as icons or other bitmaps, I use Paint.NET with some additional plug-ins installed. Free as well.
  • Hardly needed at home, but for team projects at U2U we use Team Foundation Server.

There definitely are some other tools I should take a look at, for example WiX, or SandCastle and the SandCastle Help File Builder.

If you know of any must-have tools I didn't mention, drop a comment.

Compiling regular expressions

Regular expressions are a very powerful tool for text processing, and they're well supported by the .NET Framework. Of course, you want your regular expressions to run as fast as possible. You can do two things to speed up regex processing: optimizing the regular expression itself, and optimizing the execution environment. I won't be talking about regular expression optimization itself; there is plenty of information on the net about that. One very good source is www.regular-expressions.info, but I'm sure there are others. What I do want to talk about is how the .NET Framework executes your regular expressions.

Your regular expression typically enters your program as a string. This string is first decoded into an internal form that is more easily processed. Typically, the regular expression is then interpreted. This interpretation is reasonably fast, especially when processing relatively small texts. When the text is large, or the same expression is executed many times, your program may benefit from compilation to IL. You enable compilation by setting the RegexOptions.Compiled flag. As the docs say, this yields faster execution but increases startup time. Obviously, this IL needs to be further compiled by the just-in-time compiler before your CPU can execute it.

Let's look at the following function:

static bool IsValidEmail(string text)
{
    // A new Regex is constructed (and compiled) on every call.
    Regex regex =
        new Regex(@"^(?!\.)[a-zA-Z0-9!#\$%&'\*\+-/=\?\^_`\|~\.]+(?<!\.)@(\w+[\-\.])*\w{1,63}\.[a-zA-Z]{2,6}$",
            RegexOptions.ExplicitCapture | RegexOptions.Compiled);
    Match m = regex.Match(text);
    return m.Success;
}

What happens when you call this function?

  1. The pattern is decoded into an internal form
  2. The internal form is compiled into IL
  3. The IL is compiled to machine code
  4. The machine code is executed

Worse, these steps are repeated on every call! That's right, nothing is cached: all four steps above run again each time the function is called. The following is much better:

// Constructed once, when the containing type is initialized.
static readonly Regex regex =
    new Regex(@"^(?!\.)[a-zA-Z0-9!#\$%&'\*\+-/=\?\^_`\|~\.]+(?<!\.)@(\w+[\-\.])*\w{1,63}\.[a-zA-Z]{2,6}$",
        RegexOptions.ExplicitCapture | RegexOptions.Compiled);

static bool IsValidEmail(string text)
{
    Match m = regex.Match(text);
    return m.Success;
}

In this case, steps 1, 2 and 3 are executed just once. Only the actual execution of the machine code (step 4) is done in each call.

To get a feeling for the performance impact, I've run a few tests. Obviously, test results depend on your hardware configuration, the actual regular expression and the (length of the) input, so results may vary significantly. Anyway, in my test, interpreted execution took more than twice as long as compiled execution. The decoding step (step 1) took as long as 11 compiled executions, compilation to IL (step 2) took as long as 300 compiled executions and JIT compilation to machine code (step 3) took as long as 1000 compiled executions.

What does this mean in practice? Compilation speeds up execution significantly, and it's worth doing if you'll execute the compiled regular expression many times (at least about 500 times in my test). It also means that you should avoid repeating steps 1 to 3. You can do so by caching the result of these steps (see above), or by doing them at compile time or deployment time instead of at runtime.
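
If you want to measure this on your own machine, a rough sketch along the following lines will do. It is not the test harness I used for the numbers above; the class name RegexTiming and its methods are made up for this example, and a simple Stopwatch loop like this only gives ballpark figures.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

// Hypothetical timing helper, for rough comparisons only.
public static class RegexTiming
{
    // The pattern from the IsValidEmail example.
    public const string Pattern =
        @"^(?!\.)[a-zA-Z0-9!#\$%&'\*\+-/=\?\^_`\|~\.]+(?<!\.)@(\w+[\-\.])*\w{1,63}\.[a-zA-Z]{2,6}$";

    // Time to construct a Regex and run its first match.
    // With RegexOptions.Compiled this includes compilation to IL and JIT compilation.
    public static TimeSpan TimeFirstUse(RegexOptions options, string input)
    {
        Stopwatch watch = Stopwatch.StartNew();
        new Regex(Pattern, options).IsMatch(input);
        watch.Stop();
        return watch.Elapsed;
    }

    // Time for repeated matches on a Regex that was constructed and used once before.
    public static TimeSpan TimeRepeatedUse(RegexOptions options, string input, int iterations)
    {
        Regex regex = new Regex(Pattern, options);
        regex.IsMatch(input); // warm up, so one-time costs stay out of the measurement

        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            regex.IsMatch(input);
        }
        watch.Stop();
        return watch.Elapsed;
    }
}
```

Call both methods once with RegexOptions.ExplicitCapture and once with RegexOptions.ExplicitCapture | RegexOptions.Compiled, and compare the results for your own pattern and input.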

The Regex class has a method called CompileToAssembly, which allows you to execute steps 1 and 2 at compile time. It generates an assembly on disk (a DLL) containing strongly typed Regex classes. Unfortunately this is just a method, and the .NET Framework does not come with a tool to execute this function (unlike sgen.exe, which does a similar thing for XML serialization).
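
Calling the method yourself looks roughly like this. It is a minimal sketch, not the code of my tool: the class name EmailRegexBuilder and the generated class name EmailRegex are made up for this example. Note that CompileToAssembly is a .NET Framework feature; newer .NET runtimes do not support it.

```csharp
using System.Reflection;
using System.Text.RegularExpressions;

// Hypothetical example class; EmailRegex is an assumed name for the generated type.
public static class EmailRegexBuilder
{
    // Describes one strongly typed Regex class to generate.
    public static RegexCompilationInfo CreateInfo()
    {
        return new RegexCompilationInfo(
            @"^(?!\.)[a-zA-Z0-9!#\$%&'\*\+-/=\?\^_`\|~\.]+(?<!\.)@(\w+[\-\.])*\w{1,63}\.[a-zA-Z]{2,6}$",
            RegexOptions.ExplicitCapture,
            "EmailRegex",                 // name of the generated class
            "Example.RegularExpressions", // namespace of the generated class
            true);                        // make the class public
    }

    // Writes Example.RegularExpressions.dll to the current directory (.NET Framework only).
    public static void Build()
    {
        AssemblyName assemblyName = new AssemblyName("Example.RegularExpressions");
        Regex.CompileToAssembly(new[] { CreateInfo() }, assemblyName);
    }
}
```

After referencing the generated DLL, you would use the class like any other Regex: create an instance of Example.RegularExpressions.EmailRegex and call IsMatch or Match on it.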

I've built a command line tool around this function. It takes an XML file as input, containing a number of settings for the assembly to build, and the definition of all the regular expression classes you want to include in that assembly. The output is the assembly itself and an XML documentation file, which provides IntelliSense in Visual Studio (or it can be used to build a help file with SandCastle). To build an assembly for the above regular expression, the minimal input file would contain the following:

<?xml version="1.0" encoding="utf-8" ?>
<project name="Example.RegularExpressions">
  <!-- one regex element per regular expression goes here -->
</project>

The root element, project, has one mandatory attribute, name, containing the (simple) name of the assembly to build. Optional elements add more information. For example:

<project name="Example.RegularExpressions">
  <copyright>© 2007 Kris Vandermotten</copyright>
</project>

The following elements are supported: version, title, description, configuration, company, product, copyright, trademark, culture and strongNameKeyFile. Except for the last one, they all translate to standard attributes at the assembly level. Combined with the name attribute, the version, culture and strongNameKeyFile elements allow specifying a strong name for the assembly. The location of the key file is relative to the location of the source file.

The project element then contains any number of regex elements. Each regex element contains at least the name, namespace and pattern elements. Optionally, it contains options, ispublic, and doc elements. In the doc element you can include the very same XML documentation you would use in C#, C++/CLI or VB.NET, for example:

<doc>
  <summary>Regular expression class to validate email addresses</summary>
  <remarks>
    According to RFC 2822, the local-part of the address may use any of these ASCII characters:
    <list type="bullet">
      <item>Uppercase and lowercase letters (case sensitive)</item>
      <item>The digits 0 through 9</item>
      <item>The characters ! # $ % &amp; ' * + - / = ? ^ _ ` { | } ~</item>
      <item>The character . provided that it is not the first or last character in the local part.</item>
    </list>
    .NET doesn't like { and }
  </remarks>
</doc>

If you don't include a doc element, the pattern is included in a summary element. The result is that Visual Studio IntelliSense will show the pattern.
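
To make the mapping from the input file to the framework concrete, here is a sketch of how the regex elements could be turned into RegexCompilationInfo objects. This is an illustration, not the source code of the tool. The class name ProjectFileReader is made up, and two details are assumptions: that the options element holds a comma-separated list of RegexOptions names, and that ispublic defaults to true when it is missing.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;
using System.Xml.Linq;

// Hypothetical sketch; the options format and the ispublic default are assumptions.
public static class ProjectFileReader
{
    // Reads all regex elements below the project root element.
    public static List<RegexCompilationInfo> Read(string projectXml)
    {
        XElement project = XElement.Parse(projectXml);
        List<RegexCompilationInfo> result = new List<RegexCompilationInfo>();

        foreach (XElement regex in project.Elements("regex"))
        {
            // Assumed format: names of RegexOptions values, separated by commas.
            string optionsText = (string)regex.Element("options");
            RegexOptions options = string.IsNullOrEmpty(optionsText)
                ? RegexOptions.None
                : (RegexOptions)Enum.Parse(typeof(RegexOptions), optionsText);

            // Assumed default: classes are public unless stated otherwise.
            bool isPublic = (bool?)regex.Element("ispublic") ?? true;

            result.Add(new RegexCompilationInfo(
                (string)regex.Element("pattern"),
                options,
                (string)regex.Element("name"),
                (string)regex.Element("namespace"),
                isPublic));
        }

        return result;
    }
}
```

The resulting list, converted with ToArray, is what Regex.CompileToAssembly expects as its first argument.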

Since strongly named assemblies can be built, you can deploy them in the GAC and compile them to native machine code at deployment time with ngen.exe if you want to. That way, even step 3 above can be eliminated.

Download RegexCompiler here. Source code is available upon request.