Why AAA style mocking is better than Record / Playback

by Rasmus Kromann-Larsen May 24, 2009 21:06

If you've followed me on twitter for more than a couple of days, you will most probably have heard my grumbling each time I run into issues using record / playback mocking - so I thought I'd write a short post on my experiences with both and why I think that I keep bumping into issues with record / playback.

Phases of a test

If you take a look at the normal flow of a test method without mocking, it will usually perform some kind of setup, then perform some action that invokes the code under test - and then at the end, make some assertions about the state of some component that you want verified. While some tests continue after this point, this is where you should stop if you're following the "one assert per test" rule - this is sort of the single responsibility principle for tests. This flow is also the inspiration for the AAA name - first you Arrange your test setup, then you Act upon the class under test - and then you Assert something about the state. Here's a simple example without mocking:

[Test]
public void CanRemoveCategories()
{
    // Arrange
    var collection = new CategoryCollection("test");

    // Act
    collection.RemoveCategory("test");

    // Assert
    Assert.That(collection.Count, Is.EqualTo(0));
}

This test is chronologically sound, it makes sense and it is easy to read - but then again, this is a state-based test, I said I was going to talk about behavioral tests - mock tests.

Phase confusion

Many people find that mocking is rather difficult to understand and that it often very hard to read and understand. Since our tests act as an API-description for our code - and since we want to be able to figure out how to fix our failed tests - readability is important. Now, let's look at one of my tests from a Record / Playback point of view. I'm using Rhino Mocks as my mocking framework in this test:

[Test]
public void ShouldIgnoreRespondentsThatDoesNotExistRecordPlayback()
{
    var guid = Guid.NewGuid();
    IEventRaiser executeRaiser;

    using(_mocks.Record())
    {
        Expect.Call(_view.Respondents).Return(new[] {guid.ToString()});
        Expect.Call(_repository.GetById(guid)).Return(null);

        _view.ExecuteOperation += null;
        executeRaiser = LastCall.IgnoreArguments()
            .Repeat.Any()
            .GetEventRaiser();

        Expect.Call(_view.OperationErrors = null)
            .IgnoreArguments()
            .Constraints(List.IsIn("Non-existant respondent: " + guid));
    }

    using(_mocks.Playback())
    {
        new BulkRespondentPresenter(_view, _repository);
        executeRaiser.Raise(null, EventArgs.Empty);
    }
}

Now, at a glance, can you tell me what this test is really doing? There's something with a view and a repository and we can probably deduce quite a bit from the test name. But it's rather hard to separate the different phases I talked about before Arrange, Act and Assert. Below, I've tried to annotate the test with the phases:

[Test]
 public void ShouldIgnoreRespondentsThatDoesNotExistRecordPlayback()
 {
     // Arrange
     var guid = Guid.NewGuid();
     // Part of Act
     IEventRaiser executeRaiser;

     using(_mocks.Record())
     {
         // Arrange (or Assert?)
         Expect.Call(_view.Respondents).Return(new[] {guid.ToString()});
         Expect.Call(_repository.GetById(guid)).Return(null);

         // Part of Act
         _view.ExecuteOperation += null;
         executeRaiser = LastCall.IgnoreArguments()
             .Repeat.Any()
             .GetEventRaiser();

         // Assert
         Expect.Call(_view.OperationErrors = null)
             .IgnoreArguments()
             .Constraints(List.IsIn("Non-existant respondent: " + guid));
     }

     using(_mocks.Playback())
     {
         // Arrange
         new BulkRespondentPresenter(_view, _repository);
         // Act
         executeRaiser.Raise(null, EventArgs.Empty);
     }
 }

No wonder it's hard to read and understand. The phases are mixed all over - and the Asserts are in the middle of the test - this is nothing like the natural flow of the previous test without mocking. I usually like to have the phases separated in my test with comments as well, but it's just not possible in this test. I wrote this test up rather quickly, so there might be a better way of doing it that I am missing - if there is, please yell at me :-)

Sorting out the confusion

AAA mocking, as the name suggests, is all about clearing out the confusion in that last test - it's about maintaining the original test flow. It just so happens also to have some other benefits, that I will get into later in the post. I've written the same test as above in an AAA style, this time with Moq, since I'm trying it out at the moment, but Rhino Mocks has similar syntax. Moq is pretty heavy on lambda expressions, but even if you haven't worked with those yet, I'm sure you will grasp the idea. If you want a general introduction to mocking with Moq, Justin Etheredge has a small series about it.

[Test]
public void ShouldIgnoreRespondentsThatDoesNotExist()
{
    // Arrange
    var guid = Guid.NewGuid();
    _viewMock.Setup(x => x.Respondents).Returns(new[] { guid.ToString() });
    _repositoryMock.Setup(x => x.GetById(guid)).Returns(() => null);

    // Act
    _viewMock.Raise(x => x.ExecuteOperation += null, EventArgs.Empty);

    // Assert
    _viewMock.VerifySet(x => x.OperationErrors =
        It.Is<IList<string>>(l => l.Contains("Non-existant respondent: "+guid)));
}

Those comments are actually in my original test as well - and in my test live template I generate all my tests with. If you compare this test to the one above, you will see that it has more or less the same components, but this time, they're arranged in a way that makes sense for the next reader of the test. The fact that the test is shorter is also slightly unfair, since my first test used an event raiser, which involves "many" lines of code. Also the separation of the phases allowed me to move the actual construction of the presenter our of the actual test and into shared setup code.

So what techniques did AAA mocking introduce to help alleviate the pains? First of all, the mocks no longer has states - that's what record and playback really refer to: A mock in record state will record calls made on it and then expect them to be called again during the playback state. Furthermore, it cleanly separates mock setup from mock expectations.

What is gained?

So what did we gain with AAA style mocking over the traditional record / playback style?

  • The main selling point for me is readability and test simplicity - it is much easier for me to explain mocking to someone else with AAA.
  • If you have done any fairly advanced record / playback mocking, you will find that when the mocks have states, it will often result in subtle test failures.
  • Clean separation of test phases.
  • Greatly improved ability to move shared code out of tests. Since you have the first part of your test handling setup, extending this part to start before the actual test (in a setup method) is no problem. With record / playback mocking, you will often run into state failures if you attempt it.

kick it on DotNetKicks.com

Tags:

Development | Testing

Code Quality and Software Entropy

by Rasmus Kromann-Larsen March 31, 2009 22:05

complexityCode is an odd thing, it can be beautiful, ugly, horrible, elegant, it can smell - some people even compare code to poetry. If you are a developer, you will know that there are an almost infinite number of ways that you can write a piece of code with the same functionality. The way chosen will affect the different qualities that the code - and thus the program as a whole - will exhibit. These qualities are things like performance, testability, robustness, security, scalability etc.

However, when developers talk about code quality (at least when I do) the main focus is often maintainability and flexibility. From my point of view, any software project is always decaying, especially if you have multiple people working on it, hacks are made, designs are twisted into fulfilling new responsibilities that their original creator didn't foresee or intend. Even with a clean design and a focus on maintaining high quality, the amount of code is almost always increasing with new functionality, as is the programs complexity. This is sometimes known as software entropy - chaos. If this entropy is ignored over longer periods of time, the technical debt incurred will keep increasing until the interest is so high that the project will grind to a halt - creating new features takes a long time and introduces many new and interesting bugs.

Maintainability is important on most projects, it often depends on the complexity of the program and the expected lifetime - so if you are doing prototypes or mock-ups, focusing on maintainability may not be beneficial, although in my experience, when management sees a really cool prototype, they often feel like building a project on top of it. Most software projects will have a rather long lifetime, often with multiple versions and a need to maintain the project even after it has shipped.

Obtaining high code quality is a craft, it requires discipline and continuous refactoring and improvement. This will be the main topic of my blog for a while, save the random posts about ReSharper or other intriguing things that may come up. Topics within this field off the top of my head: Unit testing, design principles (SOLID and others), design patterns, understanding and taming dependencies, inversion of control containers, simplicity and many more - ideas are welcome.

kick it on DotNetKicks.com

Tags:

Craftsmanship | Development

Craftsmanship In The Wild

by Rasmus Kromann-Larsen March 07, 2009 22:38

imageI was out dining today and had an experience I simply had to share.  It was a moderately expensive restaurant and they had cocktails as part of their menu.

As my after dinner cocktail, I chose a Mojito, which is actually a fairly difficult cocktail to make properly - at least if you want it to be as strong as it ought to be, while still masking the taste with the proper levels of sugar and mint.

I watched as the bartender mixed the drink - he didn't measure, but he was focused on the task at hand - he even tasted the drink elegantly with a straw to check its quality. The Mojito was extraordinary - perfect - it was so good that I felt like I really needed to order another one. However, this time, a girl, clearly an apprentice bartender wanted to make my next drink. She used the same ingredients but her focus was all around, she didn't sample the drink, just mixed everything approximately as she had been told. Watching the process, I wasn't surprised when the drink was a disappointment - it was too sweet and kind of watery.

As I asked for the bill, at the first bartender, I complimented his craftsmanship and mentioned that the second Mojito had not quite been what his had been - he immediately cut the cost of the second drink in half. He didn't even blink.

What kind of bartender do you want to be?

kick it on DotNetKicks.com

Tags: , , ,

Development | Craftsmanship

Code Coverage – What You Need To Know

by Rasmus Kromann-Larsen February 23, 2009 00:34

People often talk about the percentage of code coverage they have from unit tests, whether it be actual coverage or the goal they or their project have set. Having a high coverage percentage is often seen as a quality, but code coverage is not really a metric that gives much value in itself. In this post I want to investigate the different types of code coverage that exists and address some of the problems with code coverage.

Types of Code Coverage

There are various forms of code coverage, not just the one we seem to always talk about. A short introduction to some of the existing coverage types:

Statement coverage – How many percent of statements / lines have been covered?

Branch coverage – How many percent of branches (control structures) have been covered? In the case of an if statement, has the expression evaluated to both true and false. There are variations of branch coverage that talk about more detailed decision coverage, like covering all permutations of true and false in an expression that contain stuff like AND and OR.

Path coverage – How many percent of all possible paths have been covered? This may not sound too different from the two above, but it is actually much more complicated. An example could be a function with two simple if statements (A and B) after each other. To obtain full path coverage, you would have to get all four permutations of true and false in the two if statements (A B - !A B - A !B - !A !B) whereas in statement coverage you would only have to exercise the if statements in isolation. Mind you that this is the simple case – if statements with multiple sub expressions, nested ifs and loops are much worse.

The coverage type we usually talk about is actually statement coverage – but as you can see from just looking at branch and path coverage, much is left to be desired. Coverage types like path coverage are also hard to measure, as many loops will have a near-infinite number of paths.

Another thing that is very hard to measure with code coverage is multi-threaded behavior. It is definitely not caught in our statement coverage and if getting full path coverage is hard, consider getting full path coverage with multiple threads.

One Problem – Test Quality

The problem with using code coverage on its own is that code coverage tells you nothing about the quality of the tests that cover the code. Code coverage tells you how much of your code has actually been executed, but it tells you nothing about the asserts that were made in the tests – that is, it says nothing about the correctness of the code.

The perfect example of this is writing a test with zero asserts (state or interaction). This test will produce a certain percentage of coverage, while ensuring nothing except that the code can successfully execute. While such sanity tests can be useful in some tricky cases with exceptions, this is often a worthless test – it has no quality whatsoever, it doesn’t verify any intent of the programmer who wrote the code it is testing.

The thing that is easy to sell about code coverage is that is is reasonably easy to measure. It is easy to set a percentage, a goal and then try to obtain this goal. 100% code coverage is a great goal – it might be unrealistic in most situations, but it is a good thing to strive towards. But if the tests suck or the programmers who write the tests become lazy the code coverage will be nothing but misleading.

The problem with test quality is that measuring it is really really hard – at least programmatically. How can you even begin to measure how much this code matches the intent of the programmer. Usually the number of asserts that make sense for a given test match only a very low number of the values in the system. And even if we could measure test quality using a program, we might as well do away with the tests and just measure programmer intent versus the real program.

The solution, to me, is discipline and good engineering principles. Code coverage can be really valuable to a team that treat their test code like production code, sharing ownership of the code and doing regular inspections / code reviews to ensure that the test quality is high.

The Inverse Property

One of the things I like best about code coverage is it’s quality as an inverse property – that is, a tool that can specifically tell me which parts of my code that I have not tested. This is a clear signal that you need to be more careful when touching this code.

This is also one of the reasons that I actually like to remove tests that either are very low quality or haven’t kept up to date with the intent of the code they are testing. The ideal solution is to rewrite the tests to match the intent of the code / desired quality, but this is not always realistic. To me, such tests are more harmful than no tests at all – they give a false sense of security and will only confuse if anyone look at them. At least if there is no tests my coverage tool can tell me there is a problem in this part of the system.

And while we are at it, it is actually amusing that some people treat test code like it is nowhere near production code and then the same people seem to think that it is blasphemy to delete even a single test. Treat your test code like your normal code, delete / rewrite it if it is obsolete or doesn’t make sense. Maintaining code that doesn’t add value to the system makes no sense.

Conclusion

Code coverage is a useful metric, but often not in isolation. If you use it then be aware of the implications of the way you’re using it, increasing test quality is much more valuable to me than reaching a specific coverage percentage, although it is good to have some sort of goal. Write tests to verify intent, not to increase the coverage percent – use coverage to find the intents not covered.

kick it on DotNetKicks.com

Tags:

Development | Testing

Testing a Visitor - Mocking and Test Readability

by Rasmus Kromann-Larsen January 22, 2009 21:53

The other day I was using TDD to write a visitor for an object graph at work. I often use mocks a lot and was using mocks in this particular batch of tests as well. However, in the end creating my own fake class turned out to be much better (in my opinion). Favoring state-based testing over interaction-based testing (sometimes) can really simplify the noise within a test and provide clarity.

What I wanted to test was that objects of the correct types were visited from the parent object and was thus producing a number of tests that look something like this:

private MockRepository mocks;

[SetUp]
public void Setup()
{
    mocks = new MockRepository();
}

[Test]
public void ShouldVisitObject1BelowRootObject()
{
    var rootObject = ObjectMother.Build<RootObject>();

    var object1 = ObjectMother.Build<Object1>();
    rootObject.AddObject(object1);

    var visitor = mocks.DynamicMock<IVisitor>();

    using (mocks.Record())
    {
        Expect.Call(delegate { visitor.Visit(object1); });
    }

    using (mocks.Playback())
    {
        rootObject.Accept(visitor);
    }
}

Now this was a pretty big object graph, so there were a lot of tests that looked very much like this one. What I am doing here is interaction-based testing, using a mock to verify that my class under test calls a particular method. There is quite a bit of mock noise in this test, stuff that is related to setting up mocks and our expectations. Noise like this can be much worse - and it was worse in my production code.

In this particular model, all the objects have a collection of sub items - and this collection conveniently contained some the various object types to visit. So I had a lot of tests that very almost identical to the one above. I thought I'd be clever and refactor my tests so I wouldn't have to duplicate as much code for each of these particular cases. After wrestling with it for a bit I came up with this:

private MockRepository mocks;

[SetUp]
public void Setup()
{
    mocks = new MockRepository();
}

[Test]
public void ShouldVisitObject1BelowRootObject()
{
    TestVisitor<RootObject, Object1>(
        delegate(IVisitor visitor, Object1 obj)
        {
            visitor.Visit(obj);
        });
}

It still looks kind of weird, with the delegate floating around (using C# 2.0 at work for now, so no lambda syntax). The test is still arguably even less readable. The delegate is actually used in the helper method to add the expectation to the mock, since the visitor has overload methods for concrete classes (this is the whole idea of visitor, implementing double dispatch - promise I will post about it soon), you can't really abuse generics.

The helper method is shown below and is also kind of hairy.

private delegate void VisitExpectDelegate<T>(IVisitor visitor, T child);

private void TestVisitor<TParent, TChild>(VisitExpectDelegate<TChild> visitExpectDelegate)
    where TParent : AbstractObject
    where TChild : AbstractObject
{
    var parent = ObjectMother.Build<TParent>();
    var child = ObjectMother.Build<TChild>();
    parent.AddObject(child);

    var visitor = mocks.DynamicMock<IVisitor>();

    using (mocks.Record())
    {
        visitExpectDelegate(visitor, child);
    }

    using (mocks.Playback())
    {
        parent.Accept(visitor);
    }
}

Another small thing that annoyed me besides the readability issue is that the code only works for a very specific case, checking that sub items are visited below a parent object. Furthermore, trying to mock visitor behavior of a deeper class hierarchy can turn into a mocking headache real fast.

I asked a colleague about the readability of my test and if he thought it was acceptable. He suggested that I tried faking (creating my own class) the visitor instead of mocking it. In essence, he suggested that I'd use state-based testing over my current interaction-based method.

The result was this:

private CountingVisitor visited;

[SetUp]
public void Setup()
{
    visited = new CountingVisitor();
}

[Test]
public void ShouldVisitObject1BelowRootObject()
{
    var rootObject = ObjectMother.Build<RootObject>();
    rootObject.AddObject(ObjectMother.Build<Object1>());

    rootObject.Accept(visited);

    Assert.That(visited.TotalCount, Is.EqualTo(2));
    Assert.That(visited.GetCount<Object1>(), Is.EqualTo(1));
}

My fake was basically just a visitor that would count how many times it had visited a given type of object and the total count of visits it had made. For me this test is so much more readable - build an object graph, give it to the visitor, make asserts about the visit count using NUnit's "Assert.That" syntax. The cool thing about this is that it makes no assumptions about sub items and can actually be used for any visitor that visits any object graph. It could also test deeper object graphs with ease. I am aware that it doesn't test which particular instances are visited, but I didn't feel that it would add much value to add this to the visitor, although it is possible.

The fake visitor looks like this and is really just a few generic tricks and a dictionary.

private class CountingVisitor : IVisitor
{
    private readonly Dictionary<Type, int> _count = new Dictionary<Type, int>();
    private int _totalCount = 0;

    private void Add<T>(T obj)
    {
        if (!_count.ContainsKey(typeof(T))) _count[typeof(T)] = 0;
        _count[typeof (T)] += 1;
        _totalCount += 1;
    }

    public int TotalCount
    {
        get { return _totalCount; }
    }

    public int GetCount<T>()
    {
        if(!_count.ContainsKey(typeof(T))) return 0;
        return _count[typeof (T)];
    }

    public void Visit(RootObject obj) { Add(obj); }
    public void Visit(Object1 obj) { Add(obj); }
    public void Visit(Object2 obj) { Add(obj); }
}

So remember the state-based testing, mocks are useful animals and sometimes they will be the only reasonable way or the easiest and you should use them, but other types of fakes (particularly hand-crafted ones - I think this one is actually called a "spy") can really give a good boost in readability and flexibility with a minimal code effort.

kick it on DotNetKicks.com

Tags: , , , ,

Development | Testing

Reducing Check-In Friction (in Continuous Integration)

by Rasmus Kromann-Larsen January 11, 2009 21:43

In a continuous integration environment, one of main motivations is to avoid big bang integrations, where multiple people and/or multiple teams build their part of the project and it is all fused together before release. The benefits of having an automated continuous build are huge, since problems become visible early. The build server polls the source control server and produces a build and runs the appropriate tests when new code is checked in.

imageTo have an efficient continuous integration environment, changes should be checked in often. Not checking in often results in more merge conflicts, less visibility of current project status) and reduced benefits from source control in general (less check-ins mean less chance of rolling back, bigger risk of loosing code). To encourage developers to check in their code often, it should be a non-event - it should be easy and not dangerous. However, continuous integration is all about visualizing build failure, where you use a simple tray application, lava lamps or whatnot. So checking in something that doesn't work can cause some developer stress, rushing to fix the build.

When the build is broken, other developers can't check in or out (at least they shouldn't - they will either make the problems worse or get broken code), so you want to minimize the time the build is broken. It is useful to have some discipline about the process of checking in, like the "Check In Dance" described Jeremy Miller's post. In general you want to make sure of the following:

  • You have the latest bits from source control.
  • The code can build successfully.
  • Relevant tests pass (unit tests, maybe static analysis tools like FxCop)

Jeremy also describes notifying the team that a change is coming. While this might be a good idea for smaller teams, I find that it would be rather disturbing if you are part of a bigger team (10-20+ developers). In this case I would opt for a more optimistic check-in policy, assuming that check-ins won't happen at the same time or won't clash. It involves slightly more praying and can give some annoying conflicts sometimes though.

Reducing friction on the check-in process is important. To make check-in a non-event, it needs to be very simple for the developer. A good solution for this is often to have a "smoke-screen" to verify the quality of your code before checking in. If you look at the above steps, it is rather simple to collect them in a single build target that could be run easily by the developer. This will also increase developer confidence in not breaking the build, thus eliminating check-in fear and enabling often check-in.

You will want to make sure that the build target can be executed in a few minutes and that tools like FxCop use identical settings on developer machines and build server (personal experience). Another option is to use pre-tested check-ins if you are using TeamCity, but I find that the build target will serve the purpose just as well.

I hope you enjoyed the post. May your future builds be green.

kick it on DotNetKicks.com

Tags: ,

Development

Craftsmanship over Crap

by Rasmus Kromann-Larsen January 06, 2009 19:55

I was catching up on my blog reading (ObjectMentor to be more specific) when I found Uncle Bob's post on quintessence as the fifth element for the Agile Manifesto. His statement of "Craftsmanship over Crap" really rang true with me.

We (software developers) like to compare our work to that of craftsmen and many people are starting to realize that maintainability is one of the cornerstones of writing good software. However, our software still keeps decaying, brilliant designs are mangled with hacks, quick bug fixes or just plain outdated with changing requirements.

However, I challenge you, next time you go to fix that bug or that small change request that almost fits the current design, step back and think first. Spend those extra minutes considering if there is a way you can fix it, maybe even improving the current design. Don't go on an overengineering spree, just keep it simple and elegant. Then, when you're done, sit back and marvel at your work and smile.

We've all made a quick hack, we will all do it again, but I promise you, those few extra minutes spent will be worth it in the long run. Technical debt has high interests. How often have you made a quick change while thinking (or even better made a comment) that you will come back and fix it later. How often have you forgotten?

Actively thinking about this and being proud of the results is really good for my personal productivity. Ideally, if you always leave your code slightly better than you found it, it should resist decay very well. Challenge your team to join you in this quest.

 

- Leave the code slightly better than you found it.

- Write a regression test for a bug, so it's not reintroduced.

- Don't live in a house with broken windows.

- Write tests to gain the confidence to perform refactoring.

- Be a craftsman.

 

And to quote J.P Boodhoo: 'Develop with Passion'.

kick it on DotNetKicks.com

Tags: , , ,

Craftsmanship | Development

Evolution of Solutions - and Perceived Performance

by Rasmus Kromann-Larsen December 20, 2008 00:04

I entered a small competition yesterday and wanted to write a short post describing my progress and findings. The competition was rather simple, write a method that returns an array containing the numbers from 1 to 1000 in a random order.

My first solution was the naive LINQ solution.

public static int[] NaiveLinq(int max)
{
    var random = new Random();
    var query = from number in Enumerable.Range(1, max)
                orderby random.Next()
                select number;

    return query.ToArray();
}

I reckoned that several people would post this solution, so I felt like doing something slightly more fancy - and since parallelization is so hot these days, why not do it with PLINQ instead, so it would actually be faster on multi-core systems. The change is really simple.

public static int[] PLinq(int max)
{
    var random = new Random();
    var query = from number in ParallelEnumerable.Range(1, max)
                orderby random.Next()
                select number;

    return query.ToArray();
}

I benchmarked it (on my quad-core machine), and saw that the PLINQ solution was indeed faster - but only for rather big solutions, the overhead was simply too big in small instances of the problem (array size < 3000). This is not too well for a competition that is supposed to make arrays of size 1000. However, reluctant to ditch my parallel idea, I made a hybrid solution, which would use regular LINQ for small instances and PLINQ for big instances, based on my benchmark:

public static int[] GenerateUsingHybridLinq(int arrayLength)
{
    var random = new Random();

    // Use PLINQ if we're above the "very scientific" limit of 3000.
    if (arrayLength >= 3000)
    {
        return (from i in ParallelEnumerable.Range(1, arrayLength)
                orderby random.Next()
                select i).ToArray();
    }

    return (from i in Enumerable.Range(1, arrayLength)
            orderby random.Next()
            select i).ToArray();
}

I posted this solution to the competition and decided I had spent enough time on it. However, it bugged me because I knew that the Random class in the .NET framework is not guaranteed to be thread-safe. It also bugged be that I was actually solving a shuffling problem by sorting. Shuffling can be solved in O(n) while sorting is O(n*log(n)). In addition I thought that the LINQ solutions were kind of dull.

So I decided to take a stab at solving it without resolving to sorting. Shuffling is a well known problem, so I implemented the Fisher-Yates algorithm, often used for shuffling cards. It is actually a rather elegant algorithm.

public static int[] Generate()
{
    Random random = new Random();
    var numbers = new int[1000];

    // Add sorted numbers to shuffle
    for (var i = 0; i < 1000; i++)
        numbers[i] = i + 1;

    var last = numbers.Length;

    while (last > 1)
    {
        // Select a random entry in the array to swap
        var swap = random.Next(last);

        // Decrease relevant end of array
        last = last - 1;

        // Swap numbers using XOR swap, we don't need no stinkin' temp variables
        if (last != swap)
        {
            numbers[last] ^= numbers[swap];
            numbers[swap] ^= numbers[last];
            numbers[last] ^= numbers[swap];
        }
    }

    return numbers;
}

I adjusted it a bit and finally found a place to make an ode to the awesomeness that is XOR swap, swapping two values without using a temporary variable. Even though the algorithm was faster asymptotically, I was curious how it would venture against the sort-based LINQ solutions performance-wise. Here is the result:

image

Note that both HLINQ and PLINQ use all 4 cores on my quad-core machine. I realize there's an overhead using LINQ, but I'm still impressed how much faster this simple little algorithm is.

I submitted my new solution, with a note to replace my hybrid solution, just before midnight, but unfortunately I think it might have been too late, it didn't seem to make the cut for the competition. At least my hybrid solution made it into the final round - and non-thread-safe Random surely won't be a problem for instances of size 1000 - since it will use regular LINQ for that.

kick it on DotNetKicks.com

Tags: , , ,

Development

A WinDbg Debugging Journey - NHibernate Memory Leak

by Rasmus Kromann-Larsen December 19, 2008 20:29

Disclaimer: This is not a stab at the NHibernate team. They are doing an awesome job, it might as well (and for a long time I thought it was) have been in my own code. In addition - the memory leak is already solved on the NHibernate trunk.

Introduction

A few weeks back, an ASP.NET application (using NHibernate 2.0.0.4000) I am running got under heavier load than usual. I had noted that the memory usage was slightly high earlier, but it had never been a real problem - this is all the server is doing. However, under heavier load, memory pressure started approaching 700-800mb and the dreaded OutOfMemoryException started appearing when doing big chunks of work.

To be honest, I have never done much memory debugging - learning opportunity! If you do a sweep of the web these days on .NET and debugging, you will no doubt find the blog of Tess Ferrandez, who is an ASP.NET Escalation Engineer working at Microsoft. She has even done a lab series aptly named buggy bits that ease you through debugging and identifying various kinds of application problems.

After reading through her articles and watching her TechEd presentation on the subject, I downloaded WinDbg, configured it as Tess had described and started experimenting. This blog post will describe my journey and hopefully help other solve similar problems.

The Puzzle

The first thing I did was to grab a memory with adplus, one of the tools included with WinDbg. From my understanding, it stops the application momentarily and just writes the entire contents of memory to disk. This produced a huge .DMP file - a memory dump. My managed heap was at around 800mb at the time, but the dump file was slightly bigger.

Working with WinDbg is not your standard draggy-droppy windows application, it looks sort of like a console and you type bizarre command and it produces even more bizarre output for you to reason about. I started out using the "!eeheap -gc" command, which produces some basic information about your heaps.

image

As you can see, my heap size was around 814mb. If you dig into the information (not all shown on screenshot), you will find that my garbage collectors generation 2 is much bigger than generation 0 and 1. (You can read more about garbage collection and generational garbage collection here.)

After looking at this, I fired off the "!dumpheap -stat" command to get an overview of the objects in the heap.

image

The output looked like that, the first column denoting the type of object, second is number of objects, third is the shallow size of the objects, that is, not including whatever it references - the fourth is the type name.

Now, the first time I looked at this, I noted the NHibernate objects but focused more on the 385mb of strings - usually, SELECT isn't broken - I was convinced this was a problem in my code. I dug a bit deeper but didn't really find much, partly because WinDbg isn't super easy. This lead me to find some other places in my code that needed StringBuilders, but this proved not to be the root cause (thanks anyway Søren).

Later, when thinking about the problem, it came to me that that maybe 1.9 million NHibernate SqlStrings was a wee bit too many. I decided to did deeper into this - I found the SqlString in my list and found that the type is denoted with 0eafd714. Now, !dumpheap can do more that just give you statistics, it can give you filtered lists using various arguments. I wanted to sample some instances of these SqlStrings to see where they were bounded, so I used the command "!dumpheap -mt 0eafd714". This makes WinDbg give me a list of all the instances of the NHibernate SqlString - this is a very long list.

image

Now, the second column denotes the type, the third is the shallow object size and the first is what we are looking for - the instance address. I picked a few of them at random and used the !gcroot command to show where they were rooted. That is, give me the chain of references that lead to this object. An example is "!gcroot 57c2f130", which produces the following output.

image

You can see the actual instance at the bottom and then follow the chain upwards. It seems this current SqlString is rooted in a QueryPlanCache in the NHibernate SessionFactory.

At this point I actually downloaded the NHibernate source and started looking around. Conceptually, the NHibernate SessionFactory keeps a cache of recent HQL queries, so it doesn't have to rebuild them. According to the source code, it would store the 128 most recently used queries.

Now WinDbg can actually tell you the "deep" size of an object, object size + objects it references. This is done using the !objsize command. Now this literally took several hours of processing, so I don't have a screenshot for the blog post, but executing "!objsize 067016bc" command should give me the memory size of my SessionFactory. According to my log file, it told me:

sizeof(067016bc) =    716798348 (  0x2ab9798c) bytes (NHibernate.Impl.SessionFactoryImpl)

That is one big SessionFactory (~700mb). I dug further down the reference chain to try and figure out what was wrong with the cache. Remember I said that this cache was supposed to hold 128 queries. When I got to the hashtable in the cache and dumped it using the "!do 067030d4" command, it revealed the following:

image

According to this, my cache contains 92000 queries. After digging around in the code, writing a few unit tests and getting some help from the NHibernate user group, I finally found out that it was a bug in the object indexer in the LRUMap, such that it didn't enforce the 128 limit properly.

It was a small innocent bug, but having a 700mb (and growing) hashtable hanging around in your system forever is not really that pleasant. I ended up writing a hack that used reflection to access the field that contained the cache and clearing it periodically. It is already fixed on the NHibernate trunk, but I haven't gotten around to updating yet.

Since implementing my clear hack, I haven't seen memory usages above 50mb.

Conclusion

I've told my small debugging tale of how I got introduced to WinDbg and how it helped me track down a major issue in my application and reduce memory usage from ~800mb to ~50mb. It's a funky tool and can be quite scary at first, but if it helps me remove memory leaks, I am all for it. The second lesson learned is that sometimes - although I still won't look there first - SELECT is broken.

kick it on DotNetKicks.com

Tags: , , ,

Development | NHibernate

Design By Contract Preconditions With Expression Trees

by Rasmus Kromann-Larsen November 06, 2008 23:58

Introduction

Seems like Design by Contract is coming to C# 4.0, replacing the somewhat inadequate Debug.Assert, which is the only thing built into the framework at the moment. However, what are the options for today? In this post, I'll take a look at how to improve current precondition checking techniques using C# 3.0 expression trees.

Design By What?

Design by contract is technique for strengthening the contracts for classes by adding 3 kinds of checks:

  • Preconditions - What the called method expects from the caller. This is usually various forms of checks on method arguments.
  • Postconditions - What the called method guarantees upon returning. Often guarantees about the return value.
  • Invariants - What is guaranteed by the class? That is, the class invariant should be true upon one of the objects methods and again when the method returns.

So why is this even important? One of the merits of Design by Contract are that it can communicate a whole lot about your classes to other people using or reading your code. but they can also be helpful to you, as they allow you to express your intent more clearly and will support the fail-fast principle of defensive programming. The idea here is to produce the error as close to the source as possible.

Lets do a simple example to illustrate why this might be useful, consider the following two classes:

public class Person
{
    public Person(string name)
    {
        Name = name;
    }

    public string Name { get; set; }
}

public class Account
{
    private Person _owner;

    public Account(Person owner)
    {
        _owner = owner;
    }

    public string GetOwnerName()
    {
        return _owner.Name;
    }
}

It seems that the writer of the Account class is implying that an Account object should have an owner - an instance of Person. However, there's nothing to stop a potential client from doing this:

static void Main(string[] args)
 {
     var account = new Account(null);
     Console.WriteLine(account.GetOwnerName());
 }

This fails with a NullReferenceException in line 4 with the following stack trace:

DbCExpressionTrees.exe!DbCExpressionTrees.Account.GetOwnerName() Line 17	C#
DbCExpressionTrees.exe!DbCExpressionTrees.Program.Main(string[] args = {string[0]}) Line 13 + 0xa bytes	C#

Now this example is very contrived, since it's blatantly obvious where the bug is. But still, consider if the call to GetOwnerName had been in a completely different layer of the application, maybe even minutes after the Account object had been created. I'm sure you've had your fun with your debugger tracking down errors like this, if you've done any moderate size programs - I know I have.

What we need is a way for the writer of the Account class to communicate a stronger contract on what he's expecting from his client. In an ideal world, this contract would be enforced at compile-time and not allow the program to compile if contracts were broken. A thing I've always wanted for situations like this is being able to specify arguments as non-nullable in C# - that is, give me an object of this type and NOT null - since this makes sense in a lot of situations. Anyway, the only way to get something resembling this today is using Spec#, but this is a research project and still under development. So we will have to settle for runtime checks for starters.

Returning to the fail-fast principle - why is this useful? Consider the following change to the Account constructor:

public Account(Person owner)
{
    if(owner == null)
        throw new ArgumentNullException("owner");

    _owner = owner;
}

Executing the client code from before, our program will now fail when trying to create the invalid Account object with the following stack trace - and a clearly readable exception message (that owner is not allowed to be null):

DbCExpressionTrees.exe!DbCExpressionTrees.Account.Account(DbCExpressionTrees.Person owner = null) Line 13	C#
DbCExpressionTrees.exe!DbCExpressionTrees.Program.Main(string[] args = {string[0]}) Line 12 + 0x17 bytes	C#

The benefit here is that this stack trace points directly to the first offense against the "contract". Consider the differences in debugging time on the two examples. Examples like this could also be made for postconditions and invariants.

Expression Trees

Okay, I admit it, I've been itching to play around with Expressions since C# 3.0 was released and especially with all the cool usages in ASP.NET MVC. Furthermore I'll often go to great lengths to avoid "magic strings" in favor of something more type-safe (and refactor-friendly). Also, I happened to stumble upon these two posts by The Wandering Glitchand Søren Skovsbøll where they experiment with Design By Contract and C# 3.0.

Realistically, I preconditions are only viable part of Design by Contract to implement in C# at the moment. While you can probably do crazy postcondition and invariant checking by using exotic things like IL injection or interceptors, I really don't think we'll see any really good solutions until the language provides better support. So I decided to see what I could do on preconditions.

Now, Søren and Andrew (the Glitch) used a general Requires method for defining their preconditions. Søren's looks like this:

public static void Require(this T obj,
	Expression<Func<bool>> booleanExpression)
{
   var compiledPredicate = booleanExpression.Compile();
   if (!compiledPredicate())
      throw new ContractViolationException(
         “Violation of precondition: ”
         + booleanExpression.ToNiceString());
}

When using a lambda wrapped in an expression, we don't get the delegate that we're used to, instead we get something that resembles an abstract syntax tree that represents the expression. This is what enables us to pull various information about the expression. As shown above, the expression can then be compiled into the delegate that we're used to and executed.

However, compiling expressions is not the cheapest operation ever - and I personally believe that it can be beneficial to leave your contracts (if they are runtime) in your production code. Since breaking the contracts in production could lead to undefined behavior (or at least unintended), it would be nice to find the offender easily from the log containing the stack trace. So, since we're most likely going to use preconditions many places, it'd be super nice if they were as fast as possible. But preferably still without the strings.

My preference so far has been to do specialized methods for checking various things. If we have very specific methods for checking stuff the occurs often, we can make assumptions (or requirements) about the format of the lambda expressions and cut out the expression compilation. For more exotic things I'll still use the Requires method as shown above.

The example I'll show here is the same I did in my example earlier namely checking method arguments for null. This is arguably the precondition seen most often - however I've also done others checking arguments for empty strings in much the same way.

My ArgumentNotNull method defined on my static Check class looks like this:

public static void ArgumentNotNull<T>(
     Expression<Func<T>> argumentExpression) where T : class
{
    var memberExp = argumentExpression.Body as MemberExpression;

    if (memberExp == null)
        throw new ArgumentException
           ("Invalid Contract: ArgumentExpression "+ 
             "was not a MemberExpression.");

    var constantExpression = memberExp.Expression as ConstantExpression;

    if (constantExpression == null)
        throw new ArgumentException
          ("Invalid Contract: ArgumentExpression didn't "+
           "contain a ConstantExpression.");

    // Argument will be a field on the class.
    var fieldInfo = memberExp.Member as FieldInfo;
    // The contant expression will contain the object we're 
    // calling from.
    var methodOwner = constantExpression.Value;

    // Use the fieldInfo to extract the information directly from the owner
    if (fieldInfo != null && fieldInfo.GetValue(methodOwner) == null)
        throw new ArgumentNullException(memberExp.Member.Name);
}

The use in the Account class will look like this:

public Account(Person owner)
{
    Check.ArgumentNotNull(() => owner);
    _owner = owner;
}

and it will throw an exception that looks exactly like the one in my first example, so all the expression / reflection magic was really just to extract the argument name in a type-safe way. The ArgumentNotNull expects only lambda expressions containing a single argument and can thus make assumptions on the generated expression and pull the field value directly from the correct instance without compiling the expression.

But writing these specialized methods takes longer time and the Requires method can capture infinitely more conditions - so is this really worth it performance-wise? I did a small micro-benchmark - note: I've focused on the scenario where there's no error, since it by far the most common occurrence - we don't really care about performance if we're killing the program with an exception.

static void Main(string[] args)
{
    var timerArgumentNotNull = new Stopwatch();
    var timerRequires = new Stopwatch();
    var obj = new object();

    timerArgumentNotNull.Start();

    for (var i = 0; i < 10000; i++)
        TestMethod(obj);

    timerArgumentNotNull.Stop();

    timerRequires.Start();

    for (var i = 0; i < 10000; i++)
        TestMethodWithRequire(obj);

    timerRequires.Stop();

    Console.WriteLine("Argument not null: {0}", 
		timerArgumentNotNull.ElapsedMilliseconds);
    Console.WriteLine("Requires not null: {0}", 
		timerRequires.ElapsedMilliseconds);

    Console.ReadKey();
}


private static void TestMethod(Object obj)
{
    Check.ArgumentNotNull(() => obj);
}

private static void TestMethodWithRequire(Object obj)
{
    Check.Requires(() => obj != null);
}

And the results were as follows:

image

Now the Requires function could probably be optimized (maybe some expression caching), but the difference is quite remarkable.

Conclusion

In this post I've described some of the benefits with Design By Contract and defensive programming and tried to give some insight into using C# 3.0 Expression Trees for avoiding "magic strings" in our precondition checking.

kick it on DotNetKicks.com

Tags: , , , , ,

Development

Powered by BlogEngine.NET 1.6.1.0
Theme by Mads Kristensen | Modified by Mooglegiant | Adjusted by Rasmus Kromann-Larsen

About Me

I am a danish .NET developer blogging about the technical side of my life, mostly .NET stuff, but also fundamental topics like design patterns, principles and productivity boosters.

In addition, I am a core group member of the two largest .NET user groups in Denmark: CNUG and ANUG.