Friday, May 25, 2012

Next, Please! - A Closer Look at IEnumerable (Part 5 - Usage Tips & Wrap-up)

This is a continuation of "Next, Please! - A Closer Look at IEnumerable".  The articles in the series are collected together here: JeremyBytes - Downloads.

Last time, we took a look at the Strategy pattern and then used that pattern to create a StrategicSequence that allows us to pass in whatever algorithm we want to generate the sequence.  This gives us a class that can be extended without modifying the class directly, and it gives the application (client) the opportunity to select which algorithm to use.

Today, we'll explore some tips for using classes that implement IEnumerable<T> (whether they are classes we create ourselves or classes from the .NET Framework).

IEnumerable<T> Tips
As we've seen, IEnumerable<T> is a fairly simple interface that gives us access to a lot of functionality.  But, there are a number of things we should take into consideration when we decide on how to use that functionality in our applications.

Tip: Don't Modify Collection Items in a foreach Loop
This is actually a "don't ever do this" rather than a "tip".  When using a foreach loop (or using the enumerator more directly), it is very tempting to modify the items as you are iterating them.  However, this only causes problems (some more obvious than others).

First, what happens if we try to add or remove an item in a foreach loop?  Consider the following code:
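The original listing is not reproduced here, so what follows is only a minimal sketch of the kind of code being discussed.  The Person class, the method name, and the "output" list (standing in for the UI list box) are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the article's person type.
public class Person
{
    public string FirstName { get; set; } = "";
}

public static class RemoveInsideForeach
{
    // Returns true if the loop threw InvalidOperationException.
    public static bool TryRemoveJohns(List<Person> people, List<Person> output)
    {
        try
        {
            foreach (var person in people)
            {
                if (person.FirstName == "John")
                    people.Remove(person);   // changes the list mid-enumeration

                output.Add(person);          // stand-in for adding to the list box
            }
            return false;
        }
        catch (InvalidOperationException)
        {
            // The enumerator notices the change on its next MoveNext() call.
            return true;
        }
    }
}
```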


The idea behind this code is that we iterate through our people collection (with a foreach loop).  If the FirstName property of the person is "John", then we want to remove that person from the collection.  Then we want to add that person object to the list box of our UI.

But we are not allowed to add or remove items from a List<T> while we are enumerating it.  This code will compile just fine, but if we try to run it, we get the following Exception:
System.InvalidOperationException: "Collection was modified; enumeration operation may not execute."
So adding and removing items is not allowed.  What about modifying items?  That's where things get a bit interesting.  Let's look at the following code:
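Again, the original listing is not reproduced here.  A minimal sketch of the code described in the next paragraph might look like this, assuming a hypothetical Person class and a plain list named "output" in place of the UI list box:

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for the article's person type.
public class Person
{
    public string FirstName { get; set; } = "";
}

public static class ModifyInsideForeach
{
    public static List<Person> AddThenRename(List<Person> people)
    {
        var output = new List<Person>();   // stand-in for the list box
        foreach (var person in people)
        {
            output.Add(person);            // add first...

            if (person.FirstName == "John")
                person.FirstName = "Test"; // ...then change the item
        }
        return output;
    }
}
```

In this sketch, reading FirstName from the "output" entries after the loop finishes gives "Test" for every person who started out as "John".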


Inside the foreach loop, we first add the "person" to the list box of our UI.  Then if the FirstName property is "John", we change it to "Test".  We would expect that we would see the "John" items in the list box (since we add them before changing them).  But here is the output:


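As you can see, "Test" is showing up in our output!  This is not the output we would expect from this code.  The strangeness is a result of updating items while we are iterating the collection.  Assuming the person type is a class (a reference type), the list box holds a reference to the same object that we change, so it displays the new FirstName value.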

Based on these problems, we never want to modify a collection as we are iterating it.  In most situations, it is fairly easy to come up with another solution that accomplishes our same goals by handling the modification outside of the iteration.  Note: "for" loops do not exhibit this same issue since they do not use the enumerator; this is sometimes a good solution (but not always).
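Here is one possible way to move the modification outside of the iteration.  This is a sketch, not the only approach: the Person class and method names are hypothetical.  The first method takes a snapshot of the matching items and then lets List<T>.RemoveAll() do the removal; the second uses a "for" loop that walks backward so that removing an item does not shift the indexes we have yet to visit.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the article's person type.
public class Person
{
    public string FirstName { get; set; } = "";
}

public static class SafeRemoval
{
    // Snapshot the matches, then remove them in a separate step.
    public static List<Person> RemoveJohns(List<Person> people)
    {
        List<Person> removed = people.Where(p => p.FirstName == "John").ToList();
        people.RemoveAll(p => p.FirstName == "John");
        return removed;
    }

    // "for" loop alternative: iterate from the end toward the start.
    public static void RemoveJohnsWithFor(List<Person> people)
    {
        for (int i = people.Count - 1; i >= 0; i--)
        {
            if (people[i].FirstName == "John")
                people.RemoveAt(i);
        }
    }
}
```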

Tip: Avoid Creating A Custom Class
In our samples from Part 3 and Part 4, we created classes that return a sequence of numbers.  I purposely chose this type of sequence because we could base our "MoveNext()" on a calculation of some type.  In most situations, though, there is no need to create a custom class that implements IEnumerable<T>.

As mentioned in Part 1, almost every .NET collection implements the IEnumerable interface.  This includes arrays, generic lists, linked lists, queues, stacks, and dictionaries.  Generally speaking, if you find yourself in a situation where you need an IEnumerable implementation (because you need to iterate through a set of items), you will probably also need the more advanced collection functionality that you get with one of these framework classes.

In most scenarios, we should start by looking at the .NET collection classes.  One of these classes will most likely fulfill our needs.

Tip: Use a Custom Class to Hide Functionality
Sometimes the .NET collections have more functionality than we want to expose in our application.  For example, let's say that we need a collection that we can iterate through, but we don't want to use "List<T>" because we don't want the collection items to be directly modifiable.

Here's an example of how we can use a custom class to wrap a List<T> object and hide its functionality:
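The original listing is not reproduced here.  A minimal sketch of such a wrapper, matching the description that follows, could look like this (the class name ReadOnlySequence<T> and the constructor parameter are assumptions for illustration):

```csharp
using System.Collections;
using System.Collections.Generic;

// Hypothetical wrapper: exposes iteration only, nothing else from List<T>.
public class ReadOnlySequence<T> : IEnumerable<T>
{
    // The wrapped list is private, so callers cannot Add, Remove, or index into it.
    private readonly List<T> _items;

    public ReadOnlySequence(IEnumerable<T> items)
    {
        // Copy the items so later changes to the source do not leak in.
        _items = new List<T>(items);
    }

    // Delegate to the enumerator that List<T> already provides.
    public IEnumerator<T> GetEnumerator()
    {
        return _items.GetEnumerator();
    }

    // Non-generic version required by IEnumerable.
    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}
```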


We have a custom class that implements IEnumerable<T> (so we'll get our iteration functionality).  The class contains a private list that is not visible externally.  The list gets initialized by the class constructor.

Notice the GetEnumerator() method.  Instead of implementing our own IEnumerator<T> class or using "yield return", we are simply calling the internal list's GetEnumerator() method.  This is perfectly valid.  And it makes sense to do it in this situation -- the IEnumerator is already implemented by an object in our class, so there is no reason for us to create a custom implementation.

The result of this class is that we have a list of objects that we can iterate through, but we have no way to modify the collection externally.  Note that we are also hiding the other List<T> members of the private list (such as IndexOf), but we still have access to all of the IEnumerable<T> extension methods, such as Single() and Where(), if we are searching for particular items in the collection.

Tip: Use a Custom Class to Reduce Overhead
Sometimes we don't need all of the functionality provided by a collection object.  In that case, we can create a custom class that implements IEnumerable<T> that only has the functionality that we need.  This is exactly what we did with our sequence classes: IntegerSequence, FibonacciSequence, and StrategicSequence.

We did not need any of the overhead associated with a collection since our values are calculated as we need them.  Granted, our example is a rather contrived one.  But it demonstrates that it is possible and practical to have a class that implements IEnumerable<T> without other collection-type functionality.

IEnumerable<T> Review
The last several articles have shown a lot of different aspects of IEnumerable<T>.  Let's do a quick review:

IEnumerable<T> Interface Members
We started by looking at the IEnumerable<T> and IEnumerator<T> interfaces. We saw the methods and properties (such as GetEnumerator(), Current, and MoveNext()) that are specified in each.

Iterator Pattern
The Iterator pattern is described by the Gang of Four as a way to get items one-by-one from a collection -- "Next, Please!"  The IEnumerable<T> interface describes an implementation of this pattern.

foreach Loop
We can use a "foreach" loop to iterate through the items contained in any class that implements IEnumerable<T>.  This is an easier way to interact with the class than if we were to explicitly call the GetEnumerator() method along with MoveNext() and Current.
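To make the comparison concrete, here is a small sketch that sums a sequence both ways.  The class and method names are made up for this example; the second method is roughly the work that "foreach" does for us.

```csharp
using System.Collections.Generic;

public static class IterationDemo
{
    // The easy way: let foreach drive the enumerator.
    public static int SumWithForeach(IEnumerable<int> source)
    {
        int total = 0;
        foreach (int value in source)
            total += value;
        return total;
    }

    // The explicit way: call GetEnumerator(), MoveNext(), and Current ourselves.
    public static int SumWithEnumerator(IEnumerable<int> source)
    {
        int total = 0;
        using (IEnumerator<int> enumerator = source.GetEnumerator())
        {
            while (enumerator.MoveNext())
                total += enumerator.Current;
        }
        return total;
    }
}
```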

Extension Methods
Extension methods give us a way to add functionality to a class without modifying the class directly.  From a syntactic standpoint, extension methods behave as if they are native methods of the class.
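As a quick reminder of the syntax, here is a sketch of a small extension method.  CountEvens() is a hypothetical helper written for this example, not something from the framework.

```csharp
using System.Collections.Generic;

public static class SequenceExtensions
{
    // The "this" modifier on the first parameter makes this an extension method.
    public static int CountEvens(this IEnumerable<int> source)
    {
        int count = 0;
        foreach (int value in source)
        {
            if (value % 2 == 0)
                count++;
        }
        return count;
    }
}
```

With this in scope, a call such as "new[] { 1, 2, 3, 4 }.CountEvens()" reads as though CountEvens() were a member of the array itself.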

IEnumerable<T> and LINQ
LINQ (Language INtegrated Query) provides us with a myriad of extension methods on the IEnumerable<T> interface.  This gives us the ability to write queries and perform all types of functions against any class that implements IEnumerable<T>, including (but not limited to) filtering, sorting, aggregation, and grouping.
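A short sketch of filtering, sorting, grouping, and aggregation with the standard LINQ extension methods.  The LinqDemo class and its method names are assumptions for this example; Where(), OrderByDescending(), GroupBy(), Sum(), ToList(), and ToDictionary() come from System.Linq.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class LinqDemo
{
    // Filter to the even values, then sort from largest to smallest.
    public static List<int> EvenValuesDescending(IEnumerable<int> source)
    {
        return source.Where(n => n % 2 == 0)
                     .OrderByDescending(n => n)
                     .ToList();
    }

    // Group by even/odd and total each group (key true = even, false = odd).
    public static Dictionary<bool, int> SumByParity(IEnumerable<int> source)
    {
        return source.GroupBy(n => n % 2 == 0)
                     .ToDictionary(g => g.Key, g => g.Sum());
    }
}
```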

Implementing IEnumerable<T>: IntegerSequence
Our first custom implementation of IEnumerable<T> let us take a look at the members of IEnumerable<T> and IEnumerator<T>.  We saw what each member is designed for, and we used the members to create a class that returns a series of consecutive positive integers.

yield return
"yield return" provides us with shortcut syntax when implementing an enumerator.  It lets us create a method that retains its state between calls.  Using this, we can mimic a full implementation of the IEnumerator<T> interface with a single method.

Implementing IEnumerable<T>: FibonacciSequence
Our second custom implementation of IEnumerable<T> was a bit more complex.  We used what we learned from the simple IntegerSequence and created a class that returns the Fibonacci sequence, where each value is the sum of the previous two values.
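For reference, here is a minimal sketch of a Fibonacci sequence.  It is written with "yield return" rather than a full enumerator class, so it is not the implementation from Part 3; the method name, the "count" parameter, and the choice to start the sequence at 1, 1 are assumptions.

```csharp
using System.Collections.Generic;

public static class FibonacciSketch
{
    // Yields the first "count" Fibonacci values, starting 1, 1, 2, 3...
    // Uses int, so large counts will overflow.
    public static IEnumerable<int> Fibonacci(int count)
    {
        int previous = 0;
        int current = 1;
        for (int i = 0; i < count; i++)
        {
            yield return current;

            // Each new value is the sum of the previous two.
            int next = previous + current;
            previous = current;
            current = next;
        }
    }
}
```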

Strategy Pattern
The Strategy pattern is described by the Gang of Four as a way to create a set of encapsulated, interchangeable algorithms.  We took a look at the advantages of using the pattern as well as some of the negative consequences.

The StrategicSequence Class
Our final custom implementation of IEnumerable<T> let us explore the Strategy pattern in a bit more detail.  The StrategicSequence class accepts a strategy (the algorithm for calculating the sequence) as a constructor parameter.  This lets us externalize the algorithms into separate classes.  We created the IntegerStrategy, FibonacciStrategy, and SquareStrategy that implemented these algorithms.  And we saw the advantages that this gave to our particular application.
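As a rough sketch of the idea (not the code from Part 4), the shape might look like the following.  The class names are borrowed from the series, but the ISequenceStrategy interface, its ValueAt() member, and the "count" constructor parameter are assumptions made for this example.

```csharp
using System.Collections;
using System.Collections.Generic;

// Hypothetical strategy interface: calculates the value at a 1-based position.
public interface ISequenceStrategy
{
    int ValueAt(int position);
}

public class IntegerStrategy : ISequenceStrategy
{
    public int ValueAt(int position) { return position; }
}

public class SquareStrategy : ISequenceStrategy
{
    public int ValueAt(int position) { return position * position; }
}

public class StrategicSequence : IEnumerable<int>
{
    private readonly ISequenceStrategy _strategy;
    private readonly int _count;

    // The algorithm is passed in, so new strategies need no change to this class.
    public StrategicSequence(ISequenceStrategy strategy, int count)
    {
        _strategy = strategy;
        _count = count;
    }

    public IEnumerator<int> GetEnumerator()
    {
        for (int position = 1; position <= _count; position++)
            yield return _strategy.ValueAt(position);
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}
```

Swapping "new SquareStrategy()" for "new IntegerStrategy()" in the constructor call changes the values produced without touching StrategicSequence.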

Wrap Up
In summary: IEnumerable<T> is awesome!  Okay, I might be a little too excited about an interface, but it offers us a ton of functionality with very little effort.  Implementing the interface ourselves is not that difficult, and even better, there are lots of implementations already provided for us in the .NET Framework.

I find myself using the LINQ extension methods all of the time.  It's so easy to just drop in a Where() method to do a filter of a collection, or an OrderBy() to sort the list, or a Single() to pick out a specific item. And all we need to take advantage of this functionality is a class that implements IEnumerable<T> and a good understanding of lambda expressions (but we've already got that, right?).

Hopefully you are as excited about IEnumerable<T> as I am (or at least a bit more interested than you were when we started).  Learning the features included in the .NET Framework lets us take advantage of extremely powerful functionality that is already there just waiting for us.

Happy Coding!
