Wednesday, February 4, 2009

Useful Extension Methods 1 through 3 of N

Quite often when I'm writing code I'll notice a very small bit of logic that keeps popping up all over the place. It's usually something so trivial that most people barely notice it. But it's also usually something that shows up so often that, despite being small and ignorable, it constitutes a fairly constant level of noise in the code. By noise I simply mean something that takes up more characters than it needs to, obscuring the real meat of the logic of your code. Common functionality like this that everyone knows and understands should just get out of the way, fade into the background, and let the unique logic stand out.

You may have also heard of this noise idea by another name: accidental complexity.

As Reg Braithwaite has pointed out, looping syntax is one of these things. With the ubiquity of IEnumerable and the advent of extension methods in C#, there is almost never a good reason to write an explicit for loop anymore. Looping is a ubiquitous bit of logic that nonetheless takes up quite a lot of characters. Even the vaunted foreach loop is now officially more verbose than it very often needs to be.

Microsoft made it easy to get rid of the explicit loop when what you are doing is essentially a mapping operation, with the inclusion of the Select extension method for IEnumerable<T> (Enumerable.Select in System.Linq).

Say your Foo class has a static function taking a Bar and returning a Foo, and you want to use this function to take a collection of Bars and create a collection of Foos.

You could do this:
IEnumerable<Bar> bars = GetAllBars();
List<Foo> foos = new List<Foo>();

foreach (var bar in bars)
    foos.Add(Foo.FromBar(bar));

Or you could do this, which is obviously much more concise:
IEnumerable<Bar> bars = GetAllBars();
IEnumerable<Foo> foos = bars.Select(Foo.FromBar).ToList();
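
To make the comparison concrete, here is a minimal, self-contained sketch of both versions side by side. The Bar and Foo bodies, the Id and Name properties, and the contents of GetAllBars are stand-ins I made up for illustration; only the names Foo.FromBar and GetAllBars come from the snippets above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the Bar and Foo types used above.
public class Bar
{
    public int Id { get; set; }
}

public class Foo
{
    public string Name { get; set; } = "";

    // Static conversion function: takes a Bar, returns a Foo.
    public static Foo FromBar(Bar bar)
    {
        return new Foo { Name = "Foo" + bar.Id };
    }
}

public static class SelectDemo
{
    // Assumed data source; the real GetAllBars is not shown in the post.
    public static IEnumerable<Bar> GetAllBars()
    {
        return new[] { new Bar { Id = 1 }, new Bar { Id = 2 }, new Bar { Id = 3 } };
    }

    // The explicit-loop version.
    public static List<Foo> ViaLoop()
    {
        List<Foo> foos = new List<Foo>();
        foreach (var bar in GetAllBars())
            foos.Add(Foo.FromBar(bar));
        return foos;
    }

    // The Select version: the method group Foo.FromBar is the mapping.
    public static List<Foo> ViaSelect()
    {
        return GetAllBars().Select(Foo.FromBar).ToList();
    }

    public static void Main()
    {
        // Both versions produce the same sequence of names.
        Console.WriteLine(string.Join(", ", ViaLoop().Select(f => f.Name)));
        Console.WriteLine(string.Join(", ", ViaSelect().Select(f => f.Name)));
    }
}
```

Both methods build the same list; the second simply leaves the iteration to Select.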

Note that the Select function completely takes care of the looping logic. Once you know that, this code reveals itself as being extremely elegant. But what if the action you're taking doesn't return anything? You're "stuck" writing an explicit loop, right? Not at all.

Take this code:
IEnumerable<Foo> foos = bar.GetAllItems();

foreach (var foo in foos)
    PrintToScreen(foo);

To start removing the noise, first define an extension method for IEnumerable<T> called ForEach (like every extension method, it has to be declared inside a static class). This is extension method #1.
public static IEnumerable<X> ForEach<X>(this IEnumerable<X> lhs, Action<X> func)
{
    foreach (X x in lhs)
    {
        func(x);
        yield return x;
    }
}

Then rewrite:
bar.GetAllItems()
    .ForEach(PrintToScreen)
    .ToList();

Now there's just the issue of that nasty little ToList call. Right now, we need it in order to force the collection to be iterated, because the yield return syntax causes the function's execution to be deferred until an element is actually requested. That deferral is potentially useful even when none of your actions return values: you can chain together a bunch of actions on the collection by chain-calling ForEach with different delegates. But it's still silly to create and throw away a List just to do this.
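
If the deferral is hard to picture, the following sketch may help. It chains two ForEach calls over a small array and records what happens into a log; the DeferredDemo class, the log list, and the sample numbers are my own illustrative assumptions, while ForEach is the method defined above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumerableExtensions
{
    // Extension method #1 from the post, unchanged.
    public static IEnumerable<X> ForEach<X>(this IEnumerable<X> lhs, Action<X> func)
    {
        foreach (X x in lhs)
        {
            func(x);
            yield return x;
        }
    }
}

public static class DeferredDemo
{
    public static List<string> Run()
    {
        var log = new List<string>();
        var numbers = new[] { 1, 2 };

        // Building the chain runs neither delegate yet.
        IEnumerable<int> pipeline = numbers
            .ForEach(n => log.Add("first " + n))
            .ForEach(n => log.Add("second " + n));

        log.Add("pipeline built");

        // Enumerating pulls each element through both actions in turn.
        pipeline.ToList();

        return log;
    }

    public static void Main()
    {
        // Prints: pipeline built, first 1, second 1, first 2, second 2
        foreach (string line in Run())
            Console.WriteLine(line);
    }
}
```

Note that each element passes through every action before the next element is touched, rather than the first action running over the whole collection before the second starts. One caveat, as far as I can tell: List<T> already has its own instance method named ForEach that returns void, and instance methods win over extension methods, so a chain like this only works on something typed as IEnumerable<T> or an array.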

So we create an Evaluate function that does a simple explicit iteration and nothing more, to force the iterator to be evaluated. This is extension method #2.
public static void Evaluate<X>(this IEnumerable<X> lhs)
{
    // Empty body on purpose: enumerating is the whole point.
    foreach (X x in lhs) { }
}

And now you can replace ToList with Evaluate, which will iterate the collection without allocating a new List.
bar.GetAllItems()
    .ForEach(PrintToScreen)
    .Evaluate();

This is nice, and is going to be useful if we need to chain ForEach calls. But when we don't need that there's still that Evaluate call at the end that's going to be repeated every time we want this functionality, which could be an awful lot. So, let's get rid of that too.

To do that we define a Visit function (named for the Visitor Pattern), that will call ForEach with the given delegate, and then Evaluate as well. This is extension method #3.
public static void Visit<X>(this IEnumerable<X> lhs, Action<X> func)
{
    lhs
        .ForEach(func)
        .Evaluate();
}

Now we can finally get all this done in a single function call:
bar.GetAllItems()
    .Visit(PrintToScreen);
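
For reference, here is a sketch of all three extension methods gathered into one static class, with a small usage example. The class names, the sample items, and the printToScreen delegate (a stand-in for PrintToScreen that records instead of printing) are assumptions of mine, not part of the original code.

```csharp
using System;
using System.Collections.Generic;

public static class EnumerableExtensions
{
    // #1: run an action on each element, lazily, passing the element along.
    public static IEnumerable<X> ForEach<X>(this IEnumerable<X> lhs, Action<X> func)
    {
        foreach (X x in lhs)
        {
            func(x);
            yield return x;
        }
    }

    // #2: force enumeration without building a List.
    public static void Evaluate<X>(this IEnumerable<X> lhs)
    {
        foreach (X x in lhs) { }
    }

    // #3: ForEach followed immediately by Evaluate.
    public static void Visit<X>(this IEnumerable<X> lhs, Action<X> func)
    {
        lhs
            .ForEach(func)
            .Evaluate();
    }
}

public static class VisitDemo
{
    public static List<string> Run()
    {
        var printed = new List<string>();

        // Hypothetical stand-in for PrintToScreen.
        Action<string> printToScreen = item => printed.Add(item);

        IEnumerable<string> items = new[] { "a", "b", "c" };
        items.Visit(printToScreen);

        return printed;
    }

    public static void Main()
    {
        // Prints: a, b, c
        Console.WriteLine(string.Join(", ", Run()));
    }
}
```

Unlike ForEach, Visit is eager: by the time it returns, the action has already run on every element.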

This takes some getting used to, but it really has the potential to condense your code. The benefit isn't readily apparent when you look at one bit of code, but once you start talking about nested loops or consecutive loops, you'll see the difference, not to mention the total effect it will have across your codebase. Loops are everywhere. Shave off 50 characters from each one and you're talking about a lot of characters in aggregate.

For my part, after I determined to avoid explicit loops whenever possible, the comparatively verbose explicit looping syntax became almost painfully extraneous to my eyes. I feel like function calls are much more elegant.

3 comments:

mwatts said...

I agree but I think you might have left off explaining what the .Evaluate() function does and how you implemented it.

Chris Ammerman said...

You're absolutely right, mwatts. Fixed!

Anonymous said...

Was very handy. Thanks a lot!
--Sufian