Sunday, May 10, 2009

An (F) Sharp Pain in the Butt

Recently we did a major refactoring at work, which involved extending System.Data.DataSet.  As we were still using the version 3 framework, we decided to extend DataSet rather than use an extension method.  After finishing the refactoring, we were asking ourselves if the refactoring was useful or not.

This brought up the subject of how we could find all the locations in our program where we used DataSet and all the locations where we used our new derived class. "No problem," I said, "I think we can use FxCop to scan the assemblies and locate all our problems." This post is about how I got from that statement to where I am now.

First, while I had heard that you could easily write rules for FxCop, I had never really investigated how you might go about doing this. Fortunately I found a great tutorial at binary coder. So after a quick perusal of this document, I thought it might be useful to write a rule that would show me all my methods that are never called. The tutorial showed how to do this very easy task and within a few minutes I had something that was spitting out rule violations left, right and center. Here is my code:

using System;
using Microsoft.FxCop.Sdk;

sealed class MethodCalledRule : BaseIntrospectionRule {
    public MethodCalledRule ()
        : base ("MethodCalledRule", "Vigis.Fxcop.Rules", typeof (MethodCalledRule).Assembly) {
    }

    public override TargetVisibilities TargetVisibility {
        get {
            return TargetVisibilities.All;
        }
    }

    public override ProblemCollection Check (Member member) {
        Method m = member as Method;
        if (m != null) {
            // Ask the call graph for every method that calls this one.
            MethodCollection mc = CallGraph.CallersFor (m);
            if (mc == null || mc.Count < 1) {
                // Skip event handlers (underscore in the name), Main and Dispose.
                if (!m.Name.Name.Contains ("_") && !("Main".Equals (m.Name.Name)) && !("Dispose".Equals (m.Name.Name))) {
                    this.Problems.Add (new Problem (this.GetNamedResolution ("DeleteMethod", m.Name.Name)));
                }
            }
        }
        return this.Problems;
    }
}

Basically, we walk through all the members of all the classes. When we find a method, we ask the CallGraph for all the methods that call it (the callers). If there are none, then I have a method that is potentially never called (I am not writing API code). I found that the Main and Dispose methods of this program (a WinForms application) were never explicitly called, so I decided to explicitly ignore them. I also determined that methods that implement event logic, while referenced by the event, were never called either. Note, I decided not to check for methods referenced by events, but rather to just ignore all methods with an underscore in their name (a standard for WinForms event handlers).
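
The name heuristic described above can be pulled out of the rule and looked at on its own. This is only a sketch: UncalledMethodFilter and its two methods are made-up names, not part of the FxCop SDK, and the caller count is passed in as a plain integer instead of coming from CallGraph.

```csharp
using System;

// Hypothetical helper, not part of the FxCop SDK: it isolates the name
// filter from the rule so the heuristic can be exercised on plain strings.
public static class UncalledMethodFilter
{
    // True when a method with no recorded callers should still be skipped:
    // the entry point, Dispose, and WinForms-style event handlers.
    public static bool IsExpectedToHaveNoCallers(string methodName)
    {
        if (methodName == null) throw new ArgumentNullException("methodName");
        return methodName == "Main"
            || methodName == "Dispose"
            || methodName.Contains("_");
    }

    // True when the rule would report the method as a deletion candidate.
    public static bool ShouldReport(string methodName, int callerCount)
    {
        return callerCount == 0 && !IsExpectedToHaveNoCallers(methodName);
    }
}
```

With this split, the Check override would only have to fetch the callers and hand the name and count to ShouldReport. The underscore test is as blunt here as it is in the rule: any method with an underscore in its name is skipped, handler or not.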

This was all well and good and I was happy with the result. My total investment in this affair was about an hour at this time (including reading the tutorial).

Now, I have been investigating the use of F# as an alternative language to C# for about a year now. I have read "Expert F#" about twice and I have twiddled with the compiler, but never really written anything with a purpose. I decided that this was a project that I could write in F# (only 30 lines of C# code and it does something). Now, I must say that I am hesitant about F#. I don't think that it has tool support (compiler and Visual Studio integration) sufficient for commercial software development. I have strong reservations about the syntax, because I don't believe that a competent C# programmer can even read the code without a significant amount of training. Thus, in the near term, I don't think that F# is an appropriate choice for commercial software development. However, with the prevalence of multi-core architectures, it is clear that multi-processing must become part of our everyday way of writing programs, and I do believe that functional languages and F# in particular have their place in this world. Maybe not today, but soon. For this reason, I think an investment in learning F# is justified.

So, armed with my book and a good internet connection for googling any problem, I set out to write my FxCop rule in F#. First I have to say that the assembly created worked perfectly, so hats off to Microsoft. There is nothing strange about an F# assembly at the byte code level. My problem is at the source code level.

In truth, it took me about 4 hours to implement my simple FxCop rule:

namespace Vigis.Fxcop.FSharp

    open System
    open Microsoft.FxCop.Sdk

    type MethodCalledRule () = 
        inherit BaseIntrospectionRule ("MethodCalledRule", "Vigis.Fxcop.FSharp.Rules", typeof<MethodCalledRule>.Assembly)
        
        override this.TargetVisibility with get() = TargetVisibilities.All
        
        override this.Check(m : Member) = 
            match m with
            | :? Method as me -> 
                match CallGraph.CallersFor me with
                | v when v.Count <> 0 -> this.Problems
                | _ ->
                    match me.Name.Name with
                    | "Main" | "Dispose" -> this.Problems
                    | n when n.Contains ("_") -> this.Problems
                    | e ->
                        this.Problems.Add (new Problem (this.GetNamedResolution ("DeleteMethod", [| box e |] )))
                        this.Problems
            | _ -> this.Problems
            
Now, in the unlikely event that someone ever reads this article and in the even more unlikely event that person knows something about F#, please don't laugh at my code. The FxCop rule in C# is standard code; we are basically extending a base class and overriding a couple of do-nothing methods. This is not the way F# is typically written. It took me a long time to find the right override syntax and I have the impression that there are several ways the simple extend-and-override object might be implemented. If we are going to use F# in an OO world, we will have to have real idioms for doing OO operations and these will have to be explained in the standard documentation. I don't believe that this is the case today. I suffered leafing through books and googling useless sites before coming up with this implementation. If it is not good, it's not for lack of trying.

Next, once I had the basic skeleton of what I wanted to implement, I still ran into problems at every turn. First, and this is nothing against F# or other functional languages, I am an OO programmer. Before that I was a procedural programmer. My functional programming is limited to a couple of exercises in university about twenty years ago (Lisp and Prolog). So, as you can imagine, the idea that I always have to return the right value at the end of an expression comes hard (but the compiler eventually quit complaining).

I still don't know the correct idiom to use when implementing these things. An FxCop rule basically means you have to override a Check method that returns a ProblemCollection, specifically this.Problems. So, we have a Check where the last line has to read this.Problems in all cases. Now, I kinda like the match pattern expression, especially when we have to try object casts. I do this a lot in real life (Describe for those of you in the club). Seeing language support pleases me. But how do you finish with all groups emitting the same result? My implementation does not please me. Maybe I will go back and change it later, once I know what I am doing.

The worst thing was when I had to add a problem.  This basically involves giving the name of a resolution resource (that was defined in an XML file) and then passing an array of objects which will be applied to the resource string (using something like a string.Format).  After much head shaking and grinding of teeth, I came up with this concoction.

[| box e |]

Basically e is a string and [| something; something-else |] says create an array with the arguments. The problem is that when you create an array with a string, it creates a string array. I tried a number of workarounds, but I did not come across the correct incantation. Thus I finished by saying: box the string as an object, then create an object array. I was unable to cast a string array to an object array and I don't know why. I consider this to be hard, because .NET programming is full of passing arguments as object arrays, so we need a good idiom to get this done. I probably spent more than an hour on this issue, and I didn't find anything better than this.
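
For comparison, here is a minimal C# sketch of the same hand-off. C# arrays of reference types are covariant, so a string[] can be assigned to an object[] directly, which is presumably why the C# version of the rule never needed anything like box. ResolutionFormatting and the format text are invented for the illustration; the real resolution text lives in the rule's XML resource and goes through GetNamedResolution.

```csharp
using System;

// Illustration only: a stand-in for an API that takes a resource name's
// format string plus an object[] of arguments.
public static class ResolutionFormatting
{
    // Assumed text; the real string would come from the rule's XML file.
    private const string DeleteMethodFormat =
        "Method '{0}' is never called; consider deleting it.";

    // A params object[] parameter accepts a lone string and wraps it.
    public static string Format(params object[] args)
    {
        return string.Format(DeleteMethodFormat, args);
    }

    public static string FromStringArray(string[] names)
    {
        // Array covariance: string[] converts implicitly to object[].
        object[] args = names;
        return Format(args);
    }
}
```

Calling ResolutionFormatting.Format("Foo") and ResolutionFormatting.FromStringArray(new string[] { "Foo" }) both produce the same message, with no boxing step in sight.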

Now, the standard analysis done by F# aficionados is "look how few lines of code I used". In this case F# is clearly better - 18 lines against 28 for C#. If we don't count lines containing a single curly brace, the contest turns out to be 18 to 18. Note, I used a match structure in F# to do what was done in a single if test in C#, so I don't think that the number of lines is really meaningful. Obviously I wrote the C# much more quickly than the F#. I will likely continue to write C# much more quickly than F# for some time to come, so this is really not a good measure either.

In the end, I don't need no stinking measures to justify this experiment.  I think this was a useful exercise and I am very pleased to have done it.  And I think this is what is important.  I should really be competent in both OO and functional development so that I can bring either tool to bear if the situation arises.

Saturday, March 07, 2009

I Must Have Dozed Off for a Year

Wow... what just happened? I think I must have dozed off for a year there... Just kidding. I have been a bit busy and I recently thought I might start blogging again. First an update. I looked at my last post... it was a depressing entry about writing our license server with Xheo. Well, just by way of an update, here is how the story plays out.

In fact, in the end I did get my license server working. Basically it allows us to install a single application that manages all licenses. When a customer buys a product we give him a license server that he uses on his site to manage all his applications. A license administration client uploads licenses to the server, which stores them in a secure way. The license server also talks to my local license activation server every once in a while so I can have an idea of how many purchased licenses are actually being deployed. Licensed applications don't have to worry about installing licenses. They simply ask my server for a license by user name and machine id. If they own a license, they are authorized; otherwise they will acquire a license if one is available. If none are available, the application stops.
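
The checkout rule in that last part is simple enough to sketch. This is a toy in-memory model, not the real server and not anything from the Xheo API: LicensePool, TryAuthorize and the way the user name and machine id are combined into a key are all assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory model of the checkout rule; no Xheo API is used.
public sealed class LicensePool
{
    private readonly int capacity;
    private readonly HashSet<string> holders = new HashSet<string>();

    public LicensePool(int capacity)
    {
        if (capacity < 0) throw new ArgumentOutOfRangeException("capacity");
        this.capacity = capacity;
    }

    // Returns true if the caller may run: it either already holds a license
    // or a free one was assigned to it. False means the application stops.
    public bool TryAuthorize(string userName, string machineId)
    {
        string key = userName + "@" + machineId;   // assumed key format
        if (holders.Contains(key)) return true;    // already owns a license
        if (holders.Count >= capacity) return false; // none available
        holders.Add(key);                           // acquire a free license
        return true;
    }
}
```

With a pool of one license, the first user and machine pair is authorized on every call, and a second pair is turned away until the pool grows.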

In the coming weeks we will be running a beta test with a new product. The product uses a project server hosted on our site that licenses our remote beta testers by accessing a local license server (the product is a desktop application). The beta testers don't even know that they are using a licensed product.

Overall, the licensing system works pretty well. I don't think I understand everything about Xheo and sometimes trying to resolve problems is like playing baseball in the dark (a swing and a miss), but Xheo does provide a lot of features and it can be coerced into working.

Funny thing is, a year after I implemented and tested this, I was forced to reactivate a license service. Unfortunately someone had changed the network settings of my server (we have an internet-facing project server accessing a license server behind the firewall). It took us almost a week to unravel the network mess before finally figuring out what was wrong. The only thing we knew was that we didn't have a valid license...


Thursday, January 10, 2008

Xheo-du

Well, I think I have enough Xheo to get me started working on other aspects of my license server. It was not easy and the documentation did not really help. In the end I had to show the license registration dialog or the registration information just was not sent to the server (even though there were default values and the registration method was called). I don't pretend to understand everything about DeployX, but I have enough to continue.

I think this underscores the problem with Microsoft's help format. Help is cut up into a hierarchical collection of factoids and it is very difficult to read linearly (i.e. there is no start and there is no end, so in general you can only read a snippet of the available documentation). Why can't someone produce a nice PDF from an existing VS2005 help file (or maybe somebody can?...).

Wednesday, January 09, 2008

Xheo Shmeo

I have found a name for my pain. I call it Xheo DeployX. I have been working on a new license server for about the past month and I am really at my wits' end. It is rare that I come to such a point in a project, but I would like to give up.

For those of you who don't know, DeployX is a software license package for .NET. It is very complete with a lot of features and is also very expensive. We would like to use it to write our own centralized licensing server so that corporate users could simplify deployment (and we could get paid for what we are doing - a big change). Unfortunately it is a black box and a brick sXXt house. I don't know how it works, I don't know how it is supposed to work if it did work. I don't know when it is working and when it is not working. All I know is that it is not doing what I hoped it would do.

The documentation is minimal and the support staff surly (when they answer at all). If there were any other product that might be able to do my job I would have bought it.

Depression... depression... depression.

Tuesday, September 19, 2006

What about AOP?

I have read a lot of stuff on AOP recently. In the past couple years both the CACM and IEEE Software dedicated entire issues to AOP. However, after reading both of these and a host of other stuff I remained largely unconvinced.

However, when coding tricky projects we have been faced with times when we wished that we could inject code at compile time - a typical thing that AOP will allow. In addition, I can see the utility of AOP for a lot of developer tools; for example, TypeMock, a C# mock object framework, uses AOP extensively. This brought me back to AOP.

I decided I would try (again) to do something with AspectJ. It seems to be one of the big guns of the AOP world (in CACM and IEEE Software, there were articles that originated with this project). In addition, there are very good AspectJ tools for Eclipse. This time, to help me, I bought the Addison-Wesley book on AspectJ with Eclipse (I forget the name just now) and decided I would try to read it over the summer holidays.

To my surprise, the book offered a new picture of AspectJ. In its introductory chapter it actually solved a real problem (serialization of objects) with something that seemed to be a useful AOP application (not just a developer tool). Many of the author's claims made sense. That is, use aspects to modularize objects that are in different inheritance hierarchies. The serialization example was very convincing.

The idea is that a lot of business objects should not mix their business logic with their "book keeping" code (for example how the object is stored). The result was that similar code that cut across a number of unrelated objects could be treated in a unified way.

The more I looked at the examples presented by the author and the more I thought about it, the less excited I became. I am, however, convinced that if used as another programming tool, AOP can be useful, but if you abuse the AOP hammer, I think you will end up with a big, complex, distributed mess. Here are my main beefs:

  1. In Holub's Pattern book he makes a convincing pitch that objects should be able to save themselves in context. According to Holub, getters and setters (or properties if you come from C# land) are evil. I believe Holub in this case. The serialization aspect can not just save everyone by looking for set and get methods and using them to serialize the object. Object serialization is a task the object must participate in. This is made clear in Holub's book. Otherwise, we trade modularity of the serialization code for a tangle of aspect and object code, with the guts of the objects exposed in the wild, rather than carefully concealed.
  2. I note that in AspectJ you can write point cuts on private join points. For example, we could execute a bit of code every time the program accessed a private field. Is this really a good idea? Is it not really fragile to use a private point cut? The implementing private method has made no contract to exist and may be changed at any time. This type of aspect might be good as a hack for something like unit testing with mocks, but we can not write application code like this.
  3. Aspect interpretation is hard... too hard. When I read my first paper on AspectJ the authors emphasized that tool support would be required if the framework was to be adopted. I think that tool support is good, but tool requirement is bad. That is, if when I write a point cut I need a tool to show me which objects might be affected, then I am in trouble. I think that writing point cuts is like writing regular expressions. It seems easy until you have to do something really hard. Sometimes you get more hits than you think you will and it is very difficult to see the difference. With this, the advantage of AOP (localized code for dispersed objects) becomes a disadvantage. You never know which objects will be hit by an AOP bug, and the object and aspect code are dispersed, so you might have a hard time locating the bad line of code.
  4. Your object can not count on aspects (usually). Here is a philosophical point. Should the object know there will be an aspect attached to it? If not, why does it not implement all the code required to do its business (rather than stuffing some code in an aspect)? If it does know, why is it not more specific about its contract with the aspect? I think this is where I am now. I would like to use aspects, but I would like my objects to expose a public point cut interface that could be used by the aspect for stuff the object would have had to do. Then and only then can the object remove "non-business" code. Otherwise, the only thing we have done is create a dependence between the object and the aspect that is implicit, required but very weak (can you ship the object without the aspect? AspectJ says go ahead).

So there is my 2 cents worth on AOP. I think when we do AOP the objects would do well to define a set of exposed point cuts. I don't think we should bend over backwards looking for aspects - aspects are supposed to deal with cross-cutting issues. If there are no cross-cutting issues, using an aspect (or inventing an abstract cross-cutting issue) is just making things more difficult than they need to be (I have the impression I read an article like this in IEEE Software). Finally, I don't think that an aspect should ever point cut a private or protected or package join point. Maybe for a developer tool, but not for an application. After all, an OO programmer would not use reflection to execute a private method just because he could, so an aspect should not do it either.

Thursday, April 06, 2006

On 10 Things Jim Mischel Hates About C#

The other day I read Jim Mischel expound on the 10 things he hates about C#. I was a bit disappointed because most of the things he didn't like about C# he wouldn't like about C or any of its descendants. I think he missed all the meat. Take a look here before I add my two cents worth.

Jim's Top Ten

1. No Anonymous Inner Classes

Anyone who says they don't like the syntax of anonymous inner classes misses the point of C-like languages. That is, minimizing the amount of text minimizes the number of errors and using common idioms eliminates common errors. Having no anonymous inner classes ensures that there is always a large amount of boilerplate infrastructure to hack up every time you want to implement a class that will only ever be created in one place (do IEnumerable and IEnumerator mean anything to you?). I hate creating private named IEnumerables when I could easily do the same thing with an anonymous class. The syntax might be something Pascal programmers are not used to, but we are after all highly paid programmers. Besides, putting the code right beside where it is used rather than in some dusty corner *always* makes it easier to maintain. I can not see why anyone would prefer not to use anonymous inner classes at every opportunity... must be a holdover from our BASIC days.

2. Interfaces Can't have Inner Objects

How many times do you implement an exception that is only thrown by an interface method, or an enum that is only used in an interface? Why can these not be inner classes and enums on the interface itself? This would allow the internals to be encapsulated within the interface and reduce namespace pollution by seldom used items.

3. Enums aren't Objects

Enums can not have any body. But this is not the way they are used. Enums often have services they have to perform, like showing themselves as a translated string, saving themselves in a persistence framework or just displaying themselves in the UI (as a drop down menu for example). These are real needs (google whack "enum C# ToString" and you will see what I mean). Enums should not be just a short list of ints. In an object oriented language they should be full class objects.
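
One partial workaround, assuming a compiler with extension methods (C# 3.0 or later), is to hang the "service" on a static class next to the enum. OrderStatus and ToDisplayString are made-up names for the sketch, and the display strings are hard-coded where a real application would probably look up a translated resource. The enum is still just a list of ints underneath; the method only reads as if it belonged to it.

```csharp
using System;

// Hypothetical enum used only for this illustration.
public enum OrderStatus { Pending, Shipped, Cancelled }

public static class OrderStatusExtensions
{
    // Extension method: callable as status.ToDisplayString(), although the
    // enum type itself still has no body of its own.
    public static string ToDisplayString(this OrderStatus status)
    {
        switch (status)
        {
            case OrderStatus.Pending: return "Waiting to ship";
            case OrderStatus.Shipped: return "On its way";
            case OrderStatus.Cancelled: return "Cancelled";
            default: return status.ToString(); // fall back to the member name
        }
    }
}
```

A call such as OrderStatus.Shipped.ToDisplayString() then reads like a method on the enum, which covers the display case but not inheritance, state or interfaces.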

4. I Don't Like Delegates (and I like Events even less).

The syntax of delegates still confuses me, and their place in the object hierarchy confuses me (e.g. what is "this" when you are in a delegate method?). I don't think there is a need for a function wrapper in an object oriented language. And no, there is nothing object-oriented about a delegate.
I can say the same for events, except the syntax is even worse and more confusing. How can this:

button.Click -= new EventHandler (myEvent);

remove an event listener when I just created a new delegate? I think the entire event system in .NET was made so that the people who implemented Visual Studio had an easier time (and to hell with the people who have to use the generated code).
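
For what it is worth, the removal works because delegates are compared by value: two delegate objects are equal when they wrap the same method on the same target object, and -= removes the entry that equals the one passed in. The sketch below shows this with stand-in types (FakeButton and Listener are invented for the example, not WinForms classes).

```csharp
using System;

// Stand-in for a control that exposes a Click event.
public sealed class FakeButton
{
    public event EventHandler Click;

    public void RaiseClick()
    {
        EventHandler handler = Click;
        if (handler != null) handler(this, EventArgs.Empty);
    }
}

// Stand-in for a form that listens to the button.
public sealed class Listener
{
    public int Calls;
    public void OnClick(object sender, EventArgs e) { Calls++; }
}

public static class EventRemovalDemo
{
    // Subscribes with one delegate object, unsubscribes with another,
    // then raises the event and reports how often the handler ran.
    public static int CallsAfterAddAndRemove()
    {
        FakeButton button = new FakeButton();
        Listener listener = new Listener();
        button.Click += new EventHandler(listener.OnClick);
        button.Click -= new EventHandler(listener.OnClick); // a second delegate object
        button.RaiseClick();
        return listener.Calls;
    }

    // Two separately created delegates over the same target and method.
    public static bool SeparateDelegatesAreEqual()
    {
        Listener listener = new Listener();
        EventHandler first = new EventHandler(listener.OnClick);
        EventHandler second = new EventHandler(listener.OnClick);
        return first.Equals(second);
    }
}
```

CallsAfterAddAndRemove returns 0 and SeparateDelegatesAreEqual returns true. Whether that makes the syntax any less confusing is another matter.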

I don't think I have to go to 10. I am disappointed Jim just came up with the same "don't like C conditionals" ... blah, blah, blah. C is about minimizing typing, using standard implementation idioms and putting the code near to where the action is taking place. I think if you don't like these ideas, then, effectively, you won't like C or any of its descendants (no sweat). Having said that, there are lots of things about C# that we don't have to like (assuming we like C). I listed a few. I think if you have not felt the same need, you aren't really trying. Almost every day I code C# I miss anonymous inner classes and inner classes on interfaces.

Sunday, March 19, 2006

XML, Strings and "Simple" Text Processing

Last Friday I was hacking up a simple system for maintaining a persistent local parameter store. I did not want to waste a lot of time on it, so I decided I would persist my parameters in an XML file and store the file every time a parameter was added or changed. After finishing the implementation, the first thing I wanted to do was persist an XML document, represented as a string (a log4net configuration document). No problem, I thought, it's just text - I had a key node with the parameter key and a value node where I persisted the XML document.

When I went to implement a simple parameter editor, I thought... after editing, maybe my XML document is non-conforming. I think I will reload it into an XML document after editing to test its validity. Much to my disappointment, when I did this everything broke. All my XML elements were XML encoded as &lt;node-name&gt; and when I reloaded the document I got lots of errors. What went wrong?

In fact, when I loaded the document into the XML document and later stored the document inner XML (in .NET C#) the new text contained a node giving the encoding of the document. Since this was being stored as an XML node in my parameters document, the XML framework thought it must be text and encoded it for me. This shows why XML sucks. You can not just blindly add text to a document. If I had added XML that was not well-formed in a single node, this could have broken my containing document.
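
The encoding step is easy to reproduce in isolation. The sketch below is not the parameter store itself; XmlTextEscaping and the "value" element name are assumptions for the illustration. It stores a string through InnerText, which treats it as character data and escapes the markup, and then reads it back the same way.

```csharp
using System.Xml;

// Illustration only: shows what happens to markup stored as element text.
public static class XmlTextEscaping
{
    // Stores arbitrary text (here, another XML document) as the value of
    // a single element and returns the serialized container.
    public static string StoreAsText(string payload)
    {
        XmlDocument doc = new XmlDocument();
        XmlElement value = doc.CreateElement("value");
        value.InnerText = payload;   // character data, so markup gets escaped
        doc.AppendChild(value);
        return doc.OuterXml;
    }

    // Reads the text back; the parser undoes the escaping.
    public static string ReadBack(string containerXml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(containerXml);
        return doc.DocumentElement.InnerText;
    }
}
```

The serialized container holds the payload with its angle brackets escaped, and reading the element back through InnerText returns the original string. Reading it through InnerXml instead would hand back the escaped form.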

To counter this I decided to base64 encode the XML text and push it into a CDATA section of my parameters document. Okay, the text of my parameters document would not be as easy to hack, but I could guarantee that any parameter could be put in the document without breaking the container. This is the source of today's rant. I find the C# API for such a task is crappy.

First, what I expected to do was to create a string object that was the base64 representation of my initial string object. But the Convert method for creating a base64 string takes a byte array (okay, that is logical), and to create a byte array from a string you have to get a text encoding object and call its GetBytes method. One would expect that string could expose a constructor that takes an encoding and a byte array, and a GetBytes method that could create a byte array. I spent 1/2 hour looking through the crappy .NET documentation trying to find the objects collaborating in the string to byte array sequence before stumbling upon the correct incantation. Then I spent another 1/2 hour doing the same thing for converting a byte array back to a string.
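
For the record, the incantation comes down to two calls in each direction. The sketch assumes UTF-8 as the text encoding (the choice is mine for the example; any Encoding would do as long as both directions agree), and Base64Text is a made-up wrapper name.

```csharp
using System;
using System.Text;

// Thin wrapper around the two-step conversion in each direction.
public static class Base64Text
{
    // string -> bytes (via an Encoding) -> base64 string
    public static string Encode(string text)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(text);
        return Convert.ToBase64String(bytes);
    }

    // base64 string -> bytes -> string (via the same Encoding)
    public static string Decode(string base64)
    {
        byte[] bytes = Convert.FromBase64String(base64);
        return Encoding.UTF8.GetString(bytes);
    }
}
```

Base64Text.Encode("hi") gives "aGk=", and Decode turns that back into "hi". The base64 alphabet contains no angle brackets or ampersands, so the encoded value can sit inside the parameters document without any XML escaping.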

The basic problem here is that you should be able to say, "string, give me a byte array representation of yourself - here is my encoding scheme" - I think this is the Java strategy and I think it is the logical one. What .NET has is "encoding, here is a string, take a peek inside its current implementation and representation and give me a byte array representation of it". I think it breaks encapsulation and you really have to have a lot of experience working in domains where character encodings are important (i.e. where 99% of all North American programmers like myself don't work) before you can even guess where the solution might be located.

The second problem is that all the .NET documentation (no... make that all Microsoft documentation) is broken into 1/2 screen factoids with circular references between every three pages. Examples are designed to make people quit trying. For example, I recently looked at the documentation for getting directory information and an example program was presented showing how to print out the various parts of a file path without showing what the various method calls should return... I suppose they expect you to create a project, copy and paste the example code and run the example if you are really interested in knowing what the method calls do, rather than just clearly explaining it in the description of the methods.