Sunday, May 10, 2009

An (F) Sharp Pain in the Butt

Recently we did a major refactoring at work, which involved extending System.Data.DataSet. As we were still using version 3 of the framework, we decided to derive from DataSet rather than use an extension method. After finishing, we asked ourselves whether the refactoring had been useful or not.

This brought up the question of how we could find all the locations in our program where we used DataSet and all the locations where we used our new derived class. "No problem," I said, "I think we can use FxCop to scan the assemblies and locate all our problems." This post is about how I got from that statement to where I am now.

First, while I had heard that you could easily write rules for FxCop, I had never really investigated how you might go about doing this. Fortunately I found a great tutorial at binary coder. So after a quick perusal of this document, I thought it might be useful to write a rule that would show me all my methods that are never called. The tutorial showed how to do this very easy task, and within a few minutes I had something that was spitting out rule violations left, right and center. Here is my code:

using System;
using Microsoft.FxCop.Sdk;

sealed class MethodCalledRule : BaseIntrospectionRule {
    public MethodCalledRule ()
        : base ("MethodCalledRule", "Vigis.Fxcop.Rules", typeof (MethodCalledRule).Assembly) {
    }

    public override TargetVisibilities TargetVisibility {
        get {
            return TargetVisibilities.All;
        }
    }

    public override ProblemCollection Check (Member member) {
        Method m = member as Method;
        if (m != null) {
            // Ask the call graph for every method that calls m.
            MethodCollection mc = CallGraph.CallersFor (m);
            if (mc == null || mc.Count < 1) {
                // Skip event handlers (underscore in the name), Main and Dispose.
                if (!m.Name.Name.Contains ("_") && !("Main".Equals (m.Name.Name)) && !("Dispose".Equals (m.Name.Name))) {
                    this.Problems.Add (new Problem (this.GetNamedResolution ("DeleteMethod", m.Name.Name)));
                }
            }
        }
        return this.Problems;
    }
}
Basically, we walk through all the members of all the classes. When we find a method, we ask the CallGraph for all the methods that call it (the callers). If there are none, then we have a method that is potentially never called (I am not writing API code, so nothing outside these assemblies should be calling it). I found that the Main and Dispose methods of this WinForms application were never explicitly called, so I decided to explicitly ignore them. I also determined that methods that implement event logic, while referenced by the event, were never called either. Note that I decided not to check for methods referenced by events, but rather to just ignore all methods with an underscore in their name (a standard for WinForms event handlers).

This was all well and good and I was happy with the result. My total investment in this affair was about an hour at this point (including reading the tutorial).

Now, I have been investigating the use of F# as an alternative language to C# for about a year. I have read "Expert F#" about twice and I had twiddled with the compiler, but never really written anything with a purpose. I decided that this was a project that I could write in F# (only 30 lines of C# code and it does something). Now, I must say that I am hesitant about F#. I don't think that it has tool support (compiler and Visual Studio integration) sufficient for commercial software development. I have strong reservations about the syntax, because I don't believe that a competent C# programmer can even read the code without a significant amount of training. Thus, in the near term, I don't think that F# is an appropriate choice for commercial software development. However, with the prevalence of multi-core architectures, it is clear that multi-processing must become part of our everyday way of writing programs, and I do believe that functional languages, and F# in particular, have their place in this world. Maybe not today, but soon. For this reason, I think an investment in learning F# is justified.

So, armed with my book and a good internet connection for googling any problem, I set out to write my FxCop rule in F#. First I have to say that the assembly created worked perfectly, so hats off to Microsoft. There is nothing strange about an F# assembly at the byte code level. My problem is at the source code level.

In truth, it took me about 4 hours to implement my simple FxCop rule:

namespace Vigis.Fxcop.FSharp

    open System
    open Microsoft.FxCop.Sdk

    type MethodCalledRule () = 
        inherit BaseIntrospectionRule ("MethodCalledRule", "Vigis.Fxcop.FSharp.Rules", typeof<MethodCalledRule>.Assembly)
        
        override this.TargetVisibility with get() = TargetVisibilities.All
        
        override this.Check(m : Member) = 
            match m with
            | :? Method as me -> 
                match CallGraph.CallersFor me with
                | v when v.Count <> 0 -> this.Problems
                | _ ->
                    match me.Name.Name with
                    | "Main" | "Dispose" -> this.Problems
                    | n when n.Contains ("_") -> this.Problems
                    | e ->
                        this.Problems.Add (new Problem (this.GetNamedResolution ("DeleteMethod", [| box e |] )))
                        this.Problems
            | _ -> this.Problems
            
Now, in the unlikely event that someone ever reads this article, and in the even more unlikely event that this person knows something about F#, please don't laugh at my code. The FxCop rule in C# is standard code: we are basically extending a base class and overriding a couple of do-nothing methods. This is not the way F# is typically written. It took me a long time to find the right override syntax, and I have the impression that there are several ways the simple extend-and-override object might be implemented. If we are going to use F# in an OO world, we will have to have real idioms for doing OO operations, and these will have to be explained in the standard documentation. I don't believe that this is the case today. I suffered leafing through books and googling useless sites before coming up with this implementation. If it is not good, it's not for lack of trying.
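For reference, here is a minimal sketch of the extend-and-override pattern on its own, without FxCop. RuleBase and NoUnderscoreRule are hypothetical stand-ins invented for this illustration and are not part of the FxCop SDK. The sketch only shows the inherit and override keywords at work on a base class with one virtual property and one abstract method.

```fsharp
// Hypothetical stand-in for a framework base class such as BaseIntrospectionRule.
[<AbstractClass>]
type RuleBase (name : string) =
    member this.Name = name
    // Virtual property with a default implementation.
    abstract Visibility : string
    default this.Visibility = "Public"
    // Abstract method that every derived rule must implement.
    abstract Check : string -> string list

// "inherit" passes constructor arguments to the base class;
// "override" replaces the virtual and abstract members.
type NoUnderscoreRule () =
    inherit RuleBase ("NoUnderscoreRule")
    override this.Visibility = "All"
    override this.Check (methodName : string) =
        if methodName.Contains("_") then [ "Rename " + methodName ] else []

let rule = NoUnderscoreRule ()
let problems = rule.Check "button1_Click"
```

The same two keywords are all that the FxCop rule above needs; everything else in that rule is the body of Check.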

Next, once I had the basic skeleton of what I wanted to implement, I still ran into problems at every turn. First, and this is nothing against F# or other functional languages, I am an OO programmer. Before that I was a procedural programmer. My functional programming is limited to a couple of exercises in university about twenty years ago (Lisp and Prolog). So, as you can imagine, the idea that I always have to return the right value at the end of an expression comes hard (but the compiler eventually quit complaining).

I still don't know the correct idiom to use when implementing these things. An FxCop rule basically means you have to override a Check method that returns a ProblemCollection, specifically this.Problems. So, we have a Check where the last line has to read this.Problems in all cases. Now, I kind of like the match pattern expression, especially when we have to try object casts. I do this a lot in real life (Describe for those of you in the club). Seeing language support pleases me. But how do you finish with all branches emitting the same result? My implementation does not please me. Maybe I will go back and change it later, once I know what I am doing.
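One possible way around the repetition, offered as a sketch rather than as the accepted idiom, is to let the match compute only a yes/no decision and to return the collection exactly once at the end. The code below runs without FxCop: problems is a hypothetical stand-in for this.Problems, and the caller count is passed in as a plain integer instead of coming from CallGraph.CallersFor.

```fsharp
// Hypothetical stand-in for this.Problems: a mutable collection owned by the rule.
let problems = ResizeArray<string>()

// The match only decides whether a method should be reported.
// The exclusions mirror the rule above: called methods, Main, Dispose, event handlers.
let shouldReport (name : string) (callerCount : int) =
    match name with
    | _ when callerCount <> 0 -> false
    | "Main" | "Dispose" -> false
    | n when n.Contains("_") -> false
    | _ -> true

// The side effect happens in one place and the collection is returned once.
let check (name : string) (callerCount : int) =
    if shouldReport name callerCount then
        problems.Add("DeleteMethod: " + name)
    problems

check "Main" 0 |> ignore
check "button1_Click" 0 |> ignore
check "Used" 3 |> ignore
check "Unused" 0 |> ignore
```

After these four calls only "Unused" has been reported. In the real rule the same shape would put the type test and the CallersFor lookup inside the decision function and leave this.Problems as the single last line of Check.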

The worst thing was when I had to add a problem.  This basically involves giving the name of a resolution resource (that was defined in an XML file) and then passing an array of objects which will be applied to the resource string (using something like a string.Format).  After much head shaking and grinding of teeth, I came up with this concoction.

[| box e |]

Basically e is a string, and [| something; something-else |] says create an array with the arguments. The problem is that when you create an array with a string, it creates a string array. I tried a number of workarounds, but I did not come across the correct incantation. Thus I finished by saying: box the string as an object, then create an object array. I was unable to cast a string array to an object array and I don't know why. I consider this to be a serious problem, because .NET programming is full of passing arguments as object arrays, so we need a good idiom to get this done. I probably spent more than an hour on this issue, and I didn't find anything better than this.
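For comparison, here is a small sketch of a few ways to end up with an object array. It runs without FxCop; countArgs is a hypothetical stand-in for any API that takes an object array, such as GetNamedResolution. The comments also note the likely reason the direct cast is refused: the F# type checker does not treat string[] as a subtype of obj[], even though the CLR accepts that conversion at run time.

```fsharp
let e = "SomeMethod"

// Boxing the element makes the array literal an obj[] from the start.
let boxed : obj[] = [| box e |]

// An explicit upcast on the element has the same effect.
let upcasted : obj[] = [| e :> obj |]

// An existing string[] can be converted element by element.
let names : string[] = [| "A"; "B" |]
let converted : obj[] = Array.map box names

// "names :> obj[]" is rejected at compile time because F# arrays are not
// covariant in the type checker. The CLR does allow the conversion at run
// time, so a dynamic cast that goes through obj succeeds.
let viaRuntimeCast = (box names) :?> obj[]

// Hypothetical stand-in for an API that expects an object array.
let countArgs (args : obj[]) = args.Length
let n = countArgs boxed
```

Of these, boxing the elements where the array is built is probably the least surprising, which is what [| box e |] already does.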

Now, the standard analysis done by F# aficionados is "look how few lines of code I used". In this case F# is clearly better - 18 lines against 28 for C#. If we don't count lines containing a single curly brace, the contest turns out to be 18 to 18. Note that I used a match structure in F# to do what was done in a single if test in C#, so I don't think that the number of lines is really meaningful. Obviously I wrote the C# much more quickly than the F#. I will likely continue to write C# much more quickly than F# for some time to come, so this is really not a good measure either.

In the end, I don't need no stinking measures to justify this experiment.  I think this was a useful exercise and I am very pleased to have done it.  And I think this is what is important.  I should really be competent in both OO and functional development so that I can bring either tool to bear if the situation arises.