Patterns: Iterator And .NET - Yield I Say!

by Rasmus Kromann-Larsen November 11, 2008 23:24

Introduction

In this second post of my patterns and principles series, I aim to give an overview of the Iterator pattern, a pattern most of us .NET people have so integrated in our languages that we don't even think about it. But it is still useful to know the theory of the pattern and how it is integrated into the framework - the solution baked in allows for more variation than you'd think.

The Theory

According to the Gang of Four, the Iterator pattern's intent is to:

Provide a way to access the elements of an aggregate object sequentially without exposing its underlying representation.

Basically what we want to do is abstract the traversal so we don't have to worry about it. The Iterator will provide us with a nice interface for getting the next object in our data structure, maintain state about how far we've already progressed in our traversal and tell us when we're done. Abstracting the traversal also makes it easier to change the actual traversal - like if you want to iterate over your data structure in reverse order.

Furthermore, encapsulating the traversal logic in an Iterator will often result in higher cohesion and lower coupling for client code. Higher cohesion because they can more clearly express their intent instead of worrying about iteration state and order - and lower coupling because they are not as tied to the actual implementation of the data structure being iterated. As a result, it is possible with Iterators to provide a uniform interface for traversing different data structures.
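To make the "uniform interface" point a bit more concrete, here is a minimal sketch. The class and method names (UniformTraversal, Sum) are made up for illustration; the only thing the method relies on is that each collection can hand out an iterator through IEnumerable<int>.

```csharp
using System;
using System.Collections.Generic;

static class UniformTraversal
{
    // Depends only on IEnumerable<int>, not on any concrete collection type.
    public static int Sum(IEnumerable<int> numbers)
    {
        var total = 0;
        foreach (var number in numbers)
        {
            total += number;
        }
        return total;
    }

    static void Main()
    {
        // Four very different data structures, one traversal routine.
        Console.WriteLine(Sum(new[] { 1, 2, 3 }));                 // 6
        Console.WriteLine(Sum(new List<int> { 1, 2, 3 }));         // 6
        Console.WriteLine(Sum(new Stack<int>(new[] { 1, 2, 3 }))); // 6
        Console.WriteLine(Sum(new HashSet<int> { 1, 2, 3 }));      // 6
    }
}
```

The client code never learns whether it is walking an array, a linked structure or a hash table, which is the lower coupling described above.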

Iterators are also sometimes used as Generators, where they generate a series of values instead of actually iterating over an object structure. If implemented lazily, these can generate potentially infinite series like a never-ending stream of numbers or primes or whatever.
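As a rough sketch of the generator idea, the class below produces the natural numbers on demand. It is written by hand against the IEnumerator<T> interface covered in the next section; the names NaturalNumbers and GeneratorDemo are hypothetical, and the "never ends" claim ignores int overflow.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical generator: produces 1, 2, 3, ... one value per MoveNext call.
sealed class NaturalNumbers : IEnumerator<int>
{
    private int current; // 0 means "positioned before the first value"

    public int Current { get { return current; } }
    object IEnumerator.Current { get { return current; } }

    public bool MoveNext()
    {
        current += 1; // the next value is computed only when somebody asks for it
        return true;  // the series never ends (ignoring int overflow)
    }

    public void Reset() { current = 0; }
    public void Dispose() { }
}

static class GeneratorDemo
{
    static void Main()
    {
        var numbers = new NaturalNumbers();
        for (var i = 0; i < 3; i++)
        {
            numbers.MoveNext();
            Console.WriteLine(numbers.Current); // prints 1, then 2, then 3
        }
    }
}
```

Nothing is stored anywhere: the "collection" exists only as the state kept inside the iterator.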

How Does It Work In .NET?

One of the reasons we rarely think about the Iterator pattern is that it's so embedded into our languages. In the .NET world, an Iterator is actually called an Enumerator - and if we look in the framework documentation, we find an interface named IEnumerator that looks something like this (the generic version, simplified):

public interface IEnumerator<T>
{
	bool MoveNext();
	void Reset();
	T Current { get; }
}

This looks a lot like the abstraction described in the Gang of Four book. But how often do you actually see the IEnumerator interface in your code? Not too often, I bet. This is because the pattern is even more tightly integrated into the framework. Digging deeper, we find the IEnumerable interface, which looks like this:

public interface IEnumerable<T>
{
	IEnumerator<T> GetEnumerator();
}

So any class that implements the IEnumerable interface is able to supply you with an Iterator. Lots of classes in the .NET framework implement IEnumerable - and a naive usage of it might look something like this:

void NaiveEnumeration()
{
	var list = new List<int> { 1, 2, 3, 4, 5 };
	
	var enumerator = list.GetEnumerator();

	while(enumerator.MoveNext()) 
	{
		var number = enumerator.Current;
		Console.WriteLine(number);
	}
}

But iteration is something we do often - so often that the pattern has earned its own keyword, foreach - so when you go like this:

void NormalEnumeration()
{
	var list = new List<int> { 1, 2, 3, 4, 5 };
	
	foreach(var number in list)
	{
		Console.WriteLine(number);
	}
}

You're actually using the IEnumerable and IEnumerator interfaces; you just don't see them. Simply put, foreach is really just syntactic sugar for the above construction - conceptually at least.
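A slightly fuller sketch of that "conceptually" is shown here. For a source typed as IEnumerable<int>, a foreach loop behaves roughly like the hand-written method below, including the cleanup of the enumerator. The names ForeachExpansion and PrintAll are invented for the example, and the code the compiler really emits differs in the details (arrays and collections with their own enumerator types are handled specially).

```csharp
using System;
using System.Collections.Generic;

static class ForeachExpansion
{
    // Hand-written equivalent of:
    //     foreach (var number in source) { Console.WriteLine(number); }
    // Returns how many elements were visited.
    public static int PrintAll(IEnumerable<int> source)
    {
        var visited = 0;
        IEnumerator<int> enumerator = source.GetEnumerator();
        try
        {
            while (enumerator.MoveNext())
            {
                int number = enumerator.Current;
                Console.WriteLine(number); // the loop body
                visited += 1;
            }
        }
        finally
        {
            // foreach also disposes the enumerator, even if the body throws.
            enumerator.Dispose();
        }
        return visited;
    }

    static void Main()
    {
        PrintAll(new List<int> { 1, 2, 3, 4, 5 });
    }
}
```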

But this isn't all. Since C# 2.0, there has also been the yield keyword. Yield can be somewhat tricky to wrap your head around at first, but once you've used it a few times, you really appreciate its power. It provides a nice and clean way of implementing the Iterator pattern without worrying too much about managing state. It basically allows you to point out the values and have the compiler create an Iterator for you. The reason it can be somewhat confusing is that it messes with the normal semantics of executing a method. Let's take an example:

IEnumerable<int> GetNumbers()
{
	var number = 1;

	while(true)
	{
		yield return number;
		number += 1;
	}
}

At first sight, this method looks kind of broken. Notice that the function returns an IEnumerable<int> - that is: an object that provides an IEnumerator<int>. The compiler creates the IEnumerator for us behind the scenes, and calling GetNumbers does not run any of the method body by itself. The body only starts running on the first call to MoveNext, and whenever execution reaches a yield return statement, the method is "frozen" and the value is handed back as Current. When MoveNext is called the next time (explicitly or through a foreach loop), the code picks up exactly where it stopped last time - in this case adding 1 to number and yielding once again. Note that even though calling GetNumbers won't loop forever, a foreach statement over GetNumbers will - as expected - unless you break out of it.
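Here is a small sketch of how a client can consume such an endless sequence safely. GetNumbers is the method from above; the wrapper class LazyNumbers and the helper TakeUpTo are hypothetical names added for the example.

```csharp
using System;
using System.Collections.Generic;

static class LazyNumbers
{
    // The generator from the post: an endless sequence 1, 2, 3, ...
    public static IEnumerable<int> GetNumbers()
    {
        var number = 1;

        while (true)
        {
            yield return number;
            number += 1;
        }
    }

    // Pulls values one at a time and stops asking once the limit is passed.
    public static List<int> TakeUpTo(int limit)
    {
        var result = new List<int>();
        foreach (var number in GetNumbers())
        {
            if (number > limit)
            {
                break; // no further values are ever computed
            }
            result.Add(number);
        }
        return result;
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", TakeUpTo(5))); // 1, 2, 3, 4, 5
    }
}
```

Because values are produced only when MoveNext asks for them, the infinite loop inside GetNumbers never gets the chance to run away.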

There's also a yield break statement that you can use when implementing an Iterator with yield - it just returns nothing and stops the iteration, much like the break statement in a for loop.
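A minimal sketch of yield break in action, using a made-up helper (UntilNegative) that passes items through until it meets the first negative value:

```csharp
using System;
using System.Collections.Generic;

static class YieldBreakDemo
{
    // Hypothetical helper: yields items until the first negative value is seen.
    public static IEnumerable<int> UntilNegative(IEnumerable<int> source)
    {
        foreach (var item in source)
        {
            if (item < 0)
            {
                yield break; // ends the sequence; MoveNext returns false from here on
            }
            yield return item;
        }
    }

    static void Main()
    {
        foreach (var item in UntilNegative(new[] { 3, 1, -1, 7 }))
        {
            Console.WriteLine(item); // prints 3, then 1
        }
    }
}
```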

Variation Point (And More .NET)

As with most patterns, the Iterator has variation points. One of them is who controls the iteration. The Gang of Four distinguish between an internal iterator and an external iterator.

With an external iterator the client of the iterator has the responsibility for advancing the iterator explicitly and to request the item that is being processed. The examples we saw above using the various constructs are all examples of external iterators.

An internal iterator, on the other hand, is more declarative. With an internal iterator we don't actually see the iterator itself; instead, we provide an operation to be performed on each of the iterated elements. An example of this is the ForEach method defined on List<T>, which allows you to pass a delegate that is executed on each element in the list. In C# 3.0, using lambdas, the example above could look something like this:

void InternalEnumeration()
{
	var list = new List<int> { 1, 2, 3, 4, 5 };
	
	list.ForEach(number => Console.WriteLine(number));
}

In this case we no longer control the iterator and can't stop after 2 elements if that's what we wanted.
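The difference in control can be sketched as follows. The class and method names are invented for the example; the point is that a return inside the lambda only ends the current call, while a break in a foreach loop ends the traversal.

```csharp
using System;
using System.Collections.Generic;

static class WhoControlsIteration
{
    // Internal iterator: List<T>.ForEach calls the delegate for every element.
    public static int VisitWithForEach(List<int> list)
    {
        var visited = 0;
        list.ForEach(number =>
        {
            visited += 1;
            if (number > 2)
            {
                return; // only ends this one call; ForEach moves on to the next element
            }
            Console.WriteLine(number);
        });
        return visited;
    }

    // External iterator: the client drives the loop and can leave it at any point.
    public static int VisitWithForeach(List<int> list)
    {
        var visited = 0;
        foreach (var number in list)
        {
            if (number > 2)
            {
                break; // stops the traversal itself
            }
            visited += 1;
            Console.WriteLine(number);
        }
        return visited;
    }

    static void Main()
    {
        var list = new List<int> { 1, 2, 3, 4, 5 };
        Console.WriteLine(VisitWithForEach(list)); // 5: every element reached the delegate
        Console.WriteLine(VisitWithForeach(list)); // 2: the loop ended after two elements
    }
}
```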

Considering LINQ in this light is left as an exercise to the reader.

Conclusion

In this post I've given a short introduction to the Iterator pattern and shown how we use it every day without even thinking about it. But iterators can be used for much more than the everyday scenario of iterating over ready-made collections. And be sure to play around with the yield statement.

Design Patterns

Patterns and Principles (Part I) - Getting Started

by Rasmus Kromann-Larsen October 24, 2008 15:54

Introduction

Design patterns and principles are a fundamental thing in software development. Yet they can be quite elusive and difficult to get into. As one of my goals with this blog is to further my own knowledge, as well as share it with others, I've wanted to do posts on basic object-oriented principles and patterns. I believe that patterns are one of those things you grasp best when actively thinking about them - and thus, to improve my own skill set on patterns, writing blog posts and thinking up good examples is a great way to go.

While this post is mostly an introduction to the series, my approach will definitely be a pragmatic one. I'm aiming to have two types of posts in this series: basic posts introducing specific design patterns or principles with up-to-date .NET examples, and more advanced posts on the variations of the patterns and my own crazy experimentation with them. In any case, both types of posts should be brimming with examples if all goes as planned.

Design Principles

Design principles are drops of distilled wisdom and experience. Most of them deal with increasing maintainability, testability and flexibility, reducing (unneeded) complexity and attaining high cohesion and low coupling - allowing you to mitigate at least one of the inevitable three (death, taxes and changing requirements).

I personally see them as guidelines for thought, not golden rules. You might encounter situations where fewer of these principles apply, as there are often trade-offs involved. However, keeping them in mind is a way to open your eyes to other solutions. In my experience, some of them are easier to be adamant about (DRY springs to mind), while some are more subjective considerations and best practices. I guess my point is that you should avoid following anything blindly without thought. Broaden your horizon, don't narrow it.

I'll not dig too deep into these, but rather give a short introduction in this.. uh, introduction post. Note that this is not an exhaustive list.

  • Single Responsibility Principle
    • Separation of Concerns. An object should have one and only one reason to change, thus increasing cohesion and avoiding coupled responsibilities. It ties into many of the other principles.
  • Open/Closed Principle
    • Your software entities should be closed for modification, but open for extension. Hard to explain briefly, but the gist is to be able to extend the system without modifying existing code (save for bugs). Examples could be: Avoiding dependencies on internal workings and down-casts to specific types.
  • The Interface Segregation Principle
    • Prefer cohesive, responsibility-based interfaces (think roles) over huge general interfaces. Your clients will then only depend on a minimal subset of your methods, instead of potentially depending on methods they're not using.
  • DRY
    • Don't Repeat Yourself. Duplication is bad, mkay? A good example of this is duplicate code: you'll always miss at least one spot when making changes later.
  • Dependency Inversion Principle
    • "High level modules should not depend upon low level modules. Both should depend upon abstractions." Seeks to lower coupling in the system and increase testability. Applied through dependency injection and often IoC (Inversion of Control) containers.
  • Liskov Substitution Principle
    • Informally: When defining an interface or contract, the system should be able to use any (correct) implementation of it. That is, clients of the contract should not have to know the implementation details (or depend on them). Ties into Design by Contract.
  • Law of Demeter
    • Also known as the Principle of Least Knowledge. Don't talk to strangers. The law states that a method on an object should only call methods on the object itself, on parameters passed in to the method, on objects it created and on its direct component objects. This means don't go dot, dot, dotting your way through the entire object tree.
  • Tell, Don't Ask
    • Aim to tell objects what to do instead of asking them about their state and then deciding what to do yourself. The idea is that the object probably knows better than you. It also forces you to think about responsibilities.
  • YAGNI
    • You Aren't Gonna Need It. From Extreme Programming. Don't waste your time adding functionality based on what you think the future might bring. You will (most likely) be wrong. In addition, you will have to maintain this extra code and complexity. Variation of KISS (Keep It Simple, Stupid).
  • Favor Composition Over Inheritance
    • Inheritance is often over- and misused. Inheritance is an 'is-a relation', but it is often used where a 'has-a relation' (composition) is what's really needed. An advantage of composition is that composed objects can be replaced dynamically - and they can vary independently. Inheritance still has its place in some cases though (Hint: When you have an 'is-a relation').
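To make one of the principles above a little more tangible, here is a minimal sketch of the Dependency Inversion Principle using plain constructor injection (no IoC container). All the type names (IMessageSender, ConsoleSender, RecordingSender, OrderNotifier) are invented for the example and are not taken from any framework.

```csharp
using System;
using System.Collections.Generic;

// The abstraction that both the high-level and the low-level code depend on.
interface IMessageSender
{
    void Send(string message);
}

// Low-level detail: writes messages to the console.
sealed class ConsoleSender : IMessageSender
{
    public void Send(string message) { Console.WriteLine(message); }
}

// Another low-level detail: remembers messages, which is handy in tests.
sealed class RecordingSender : IMessageSender
{
    public readonly List<string> Sent = new List<string>();
    public void Send(string message) { Sent.Add(message); }
}

// High-level policy: knows what to say, not how it gets delivered.
sealed class OrderNotifier
{
    private readonly IMessageSender sender;

    public OrderNotifier(IMessageSender sender) { this.sender = sender; }

    public void OrderShipped(int orderId)
    {
        sender.Send("Order " + orderId + " has shipped.");
    }
}

static class DipDemo
{
    static void Main()
    {
        new OrderNotifier(new ConsoleSender()).OrderShipped(42); // prints: Order 42 has shipped.
    }
}
```

Since OrderNotifier only knows the interface, swapping ConsoleSender for RecordingSender requires no change to it, which is where the testability gain comes from.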

Design Patterns

Design patterns are reusable solutions to recurring problems in software development. One of the best points about design patterns is that they allow developers to talk on a higher level, since they have a shared vocabulary of design techniques. Seeing pattern names in code can also communicate an intent that can otherwise be hard to see.

A lot of the same things from the design principles section apply here too. A UML diagram describing a design pattern is just one instance of the pattern - such diagrams are meant for inspiration, and almost all patterns have several variation points.

I'll not even try to list design patterns yet. They come in all shapes and sizes, and it's better to save something for next time.

Literature

While blog posts and other online sources are good for quick answers, nothing beats sitting down with a well-written book on a subject.

 

Design Patterns: Elements of Reusable Object-Oriented Software

Erich Gamma, Richard Helm, Ralph Johnson, John M. Vlissides

If you've read anything about patterns, you will undoubtedly have heard of the GoF (Gang of Four) book, the often proclaimed Bible of Design Patterns. While this is a great book, especially as a reference catalogue of patterns when you want to look something up, I was kind of lost when I read it the first time. The book is rather abstract and it can be rather confusing for someone starting out with design patterns. I really think you should make it part of your book collection, but if you're starting out with patterns, I would recommend starting out with this book instead:

Design Patterns Explained: A New Perspective on Object-Oriented Design

Alan Shalloway, James Trott

This book was my personal eye-opener. It is somewhat more chatty than the GoF book, slightly less catalogue, slightly more "getting into object-oriented design". It's a great introduction to design patterns and the authors go to great lengths to not only describe the patterns, but to discover them by examining different solutions and quantifying the strengths and weaknesses. This is a great book for bridging the gap before GoF.

 

Refactoring: Improving the Design of Existing Code

Martin Fowler

While refactoring is not design patterns per se, refactoring is a method to mold your code (or others' code) towards some of the same goals as the ones presented by design patterns. It's all about improving the maintainability and flexibility of your software. Fowler does a fine job of explaining the reasons for the different refactorings, describing code smells and which tools to use to get rid of them. Another reason knowledge about refactoring is good is that often, you won't have the luxury (or curse) of working solely with your own code. Refactoring can be a great tool to unravel spaghetti code and gain insight while (hopefully) adding tests to support it.

Online Resources

If you want to get started reading more about patterns and principles, here's a few good links for getting started.

Conclusion

In this post, the first in a series of N, I gave a short introduction to design patterns and principles. I've outlined some recommended getting-started literature and hope to have sparked your interest. Next part will be a basic post describing the first pattern.

Note: I could have more sources in this post, but most of this post is tidbits from experience, opinion and a compilation of snippets from way too many sources. I'll list them in following posts, when we dig into the detail.

Development | Design Patterns

About Me

I am a Danish .NET developer blogging about the technical side of my life, mostly .NET stuff, but also fundamental topics like design patterns, principles and productivity boosters.

In addition, I am a core group member of Aarhus .NET User Group.