The Internals of C# foreach

Many moons ago I discussed the foreach loop. Iexpand on that post here as I continue my series for the MSDN C#Developer Center.

The advantage of a foreach loop over a forloop is the fact that it is unnecessary to know the number of itemswithin the collection when an iteration starts. This avoids iteratingoff the end of the collection using an index that is not available. AsBill Wagner pointed out in his Custom Iterators article last week, aforeach loop also allows code to iterate over a collection withoutfirst loading the collection in entirety into memory. In this article Iam going to explore how the foreach statement works under the covers.This sets the stage for a follow on discussion of how the yieldstatement works.

foreach with Arrays

Consider the foreach code listing shown in Listing 1:

Listing 1: foreach with Arrays

int[] array = new int[]{1, 2, 3, 4, 5, 6};
foreach(int item in array)
{
Console.WriteLine(item);
}

From this code, the C# compiler creates CIL equivalent of a for loop like this (Listing 2):

Listing 2: Compiled Implementation of foreach with Arrays

int[] tempArray;
int[] array = new int[]{1, 2, 3, 4, 5, 6};
tempArray = array;
for (int counter=0; (counter < tempArray.Length); counter++)
{
readonly int item = tempArray[counter];
Console.WriteLine(item);
}

Since the collection is an array and indexingthe array is fast, the foreach loop implementation relies on supportfor the Length property and the index operator ([]). However, these twoarray features are not available on all collections. Many collectionsdo not have a known number of elements and many collection classes donot support retrieving elements by index.

foreach with IEnumerable<T>

To address this, the foreach loop uses the System.Collections.Generic.IEnumerator<t>.(Given C# 2.0.s generics support, there is very little reason toconsider using the non-generic equivalent collections so I will ignorethem from this discussion except to say their behavior is almostidentical.) IEnumerator<t> is designed to enable the iterator patternfor iterating over collections of elements, rather than thelength-index pattern shown in earlier. IEnumerator<t> includes threemembers. The first is bool MoveNext(). Using this method, we can movefrom one element within the collection to the next while at the sametime detecting when we have enumerated through every item using theBoolean return. The second member, a read-only property called Current,returns the element currently in process. With these two members on thecollection class, it is possible to iterate over the collection simplyusing a while loop as demonstrated in the code listing below (Listing3): </t></t></t>

Listing 3: Iterating over a collection using while

System.Collections.Generic.Stack<int> stack =
new System.Collections.Generic.Stack<int>();
int number;
// …
// This code is conceptual, not that the actual code.
while(stack.MoveNext())
{
number = stack.Current;
Console.WriteLine(number);
}
</int></int>

The MoveNext() method in this listing returnsfalse when it moves past the end of the collection. This replaces theneed to count elements while looping. (The last member on IEnumerator<t>,Reset(), will reset the enumeration.) This while listing shows the gistof the C# compiler output, but it doesn.t actually work this waybecause it omits two important details about the actual implementation:interleaving and error handling. </t>

Interleaving

The problem with is that if there are two (ormore) interleaving loops over the same collection, one foreach insideanother, then the collection must maintain a state indicator of thecurrent element so that when MoveNext() is called, the next element canbe determined. The problem is that one interleaving loop can affect theother. (The same is true of loops executed by multiple threads.)


Figure 1: IEnumerable class diagram

To overcome this problem, the IEnumerator<t>interfaces are not supported by the collection classes directly. Asshown in Figure 1, there is a second interface called IEnumerable<t> whose only method is GetEnumerator(). The purpose of this method is to return an object that supports IEnumerator<t>.Rather than the collection class maintaining the state itself, adifferent class, usually a nested class so that it has access to theinternals of the collection, will support the IEnumerator<t> interfaceand keep the state of the iteration loop. Using this pattern, the C#equivalent of a foreach loop will look like what's shown in Listing 4. </t></t></t></t>

Listing 4: A separate enumerator maintains state during an iteration

System.Collections.Generic.Stack<int> stack =
new System.Collections.Generic.Stack<int>();
int number;
System.Collections.Generic.Stack<int>.IEnumerator<int> enumerator;

// …
// If IEnumerable<t> is implemented explicitly,
// then a cast is required.
// ((IEnumerable)stack).GetEnumeraor();
enumerator = stack.GetEnumerator();
while (enumerator.MoveNext())
{
number = enumerator.Current;
Console.WriteLine(number);
}
</t></int></int></int></int>

Error Handling

Since the classes that implement the IEnumerator<t>interface maintain the state, there are occasions when the state needscleaning up after all iterations have completed. To achieve this, theIEnumerator<t> interface derives from IDisposable. Enumerators thatimplement IEnumerator do not necessarily implement IDisposable, but ifthey do, Dispose() will be called as well. This enables the calling ofDispose() after the foreach loop exits. The C# equivalent of the finalCIL code, therefore, looks like Listing 5.
Listing 5: Compiled Result of foreach on Collections </t></t>

System.Collections.Generic.Stack<int> stack =
new System.Collections.Generic.Stack<int>();
int number;
System.Collections.Generic.Stack<int>.Enumerator<int> enumerator;
IDisposable disposable;
enumerator = stack.GetEnumerator();

try
{
while (enumerator.MoveNext())
{
number = enumerator.Current;
Console.WriteLine(number);
}
}
finally
{
// Explicit cast used for IEnumerator<t>.
disposable = (IDisposable) enumerator;
disposable.Dispose();

// IEnumerator will use the as operator unless IDisposable
// support determinable at compile time.
// disposable = (enumerator as IDisposable);
// if (disposable != null)
// {
// disposable.Dispose();
// }
}
</t></int></int></int></int>

Notice that because the IDisposable interface is supported by I
Enumerat
or<t>, the C# code can be simplified with the using keyword as shown in Listing 6. </t>

Listing 6: Error handling and resource cleanup with using

System.Collections.Generic.Stack<int> stack =
new System.Collections.Generic.Stack<int>();
int number;

using(
System.Collections.Generic.Stack<int>.Enumerator<int> enumerator =
stack.GetEnumerator())
{
while (enumerator.MoveNext())
{
number = enumerator.Current;
Console.WriteLine(number);
}
}
</int></int></int></int>

However, recall that the using keyword is notdirectly supported by CIL either, so in reality the former code is amore accurate C# representation of the foreach CIL code.

Readers may recall that the compiler preventsassignment of the foreach variable identifier (number). As isdemonstrated in Listing 6, an assignment of number would not be achange to the collection element itself so rather than mistakenlyassume that, the C# compiler prevents such an assignment altogether.

In addition, the element count within acollection cannot be modified during the execution of a foreach loop.If, for example, we called stack.Push(42) inside the foreach loop, itwould be ambiguous whether the iterator should ignore or incorporatethe change to stack . should iterator iterate over the newly added itemor ignore it and assume the exact same state as when it wasinstantiated.

Because of this ambiguity, an exception oftype System.InvalidOperationException is thrown if the collection ismodified within a foreach loop, reporting that the collection wasmodified after the enumerator was instantiated.

(This content was largely taken from the Collections chapter of my book, Essential C# 2.0 [Addison-Wesley])

Twitter Digg Delicious Stumbleupon Technorati Facebook Email

No comments yet... Be the first to leave a reply!