I spent some time over the past year working with Kevin Solorio, a great friend and colleague. Kevin worked on our engineering team at Within3, and has since moved to a great team at Designing Interactive.

Kevin and I would get together on monday mornings at a local coffee shop and work on some code. Sometimes it was talking through a specific project, other times we picked out parts of our codebase that we wanted to improve. This allowed us to progressively improve our application, as well as develop a shared knowledge of the different moving parts.

There have been many discussions about what it means to make your Rails tests fast. I don't want to re-gurgitate what is being said in the Rails community. I want to highlight an example and the thought process.

What's the real problem?

I don't believe that the problem is Rails.

The problem is in a clear separation of concerns (or, Single Responsibility Principle). The tests just happen to highlight that the code could potentially be doing too much and needs to be extracted. As programmers we need to understand and feel that pain before we can remove it.

At this point, some would jump to the seemingly easy route. They identify Rails as the problem, and so the solution must be to remove Rails from the equation. This can be done several ways:

  1. Excessive mocking and stubbing. Essentially putting different actors in place of the real objects.
  2. Re-opening and re-defining classes or modules.

You might be wondering what is so bad with the above approaches.

By doing either of these, you are simply masking the problem and making the test suite fragile.

  1. The mocking and stubbing can give you a false sense of security. You can very easily write tests that will happily pass, even though the implementation may change in the core code.
  2. Re-opening and re-defining classes creates dependency issues, as well as creating the potential to break later tests that may rely or call on that model.

Yes, a build server will ultimately catch these if you set them up properly. However, it blinds you to problem in the short term, leading you to believe your code is fine when it clearly isn't.

Remember: it's about quality, not quantity. We don't want to sacrifice the quality of our tests for speed.

If you truly feel the pain of the code, you need to look at the code and ask: Does this need Rails?. It's important to not skip this step. This is the key in making your tests faster: detaching them from Rails where it makes sense.

Where's the code?

I will present a trivial example. We have a Rails ActiveRecord Person model. This model has grown over the years and houses many different methods and responsibilities. One of the methods is called get_name_forms and is used as a helper method in other areas of the app to pre-filter searches for content authored by our person. This is useful when passing to internal search engines (Solr, Sphinx, etc), and also when passing to third-party search services.

The code started out in our model like this:

The get_name_forms is one of many methods found in the person model. Now we write a test to verify we are getting back the expected data.

Let's go back to feeling the pain. In order to test this simple method, I have to include my spec_helper, which in turn loads up my slow Rails environment. Writing tests around this now becomes a chore. The feedback loop from the tests may be minutes. This does not make for a happy programmer.

Earlier I mentioned one of the solutions would be to re-open and re-define classes or modules. At this point, you may be tempted to do something like:

Now we have the appearance of speed. We have essentially re-created our Person model declaration. The feedback loop will be shorter, but does it ultimately solve our problem?

This is a code smell. Why are you masking Rails here? Is Rails the problem, or is your code the problem? Let's break down the pieces we actually need: first_name, middle_name, and last_name. We need three things, all of which are primitives. They are Strings. Our expected output is an Array. Now ask yourself, does all of this logic need to live in our model?

The first pass could be something that moves it out of the model and into it's own file.

Now we have a separate object, and we use Dependency Injection to pass it the Person record. Our test is still slow, however, since we are using the Person model. We want to get away from the that dependency.

Let's take a step higher. The goal of this method is to simply return an array of different name forms when provided a first, middle, and last name. It's simple. It has one responsiblity. It shouldn't concern itself with our Person model at all.

What we have done here is moved the problem to another part of the application, we haven't yet solved our initial problem of speeding up our feedback loop while having quality tests we can trust.

As a sidenote, this could be a perfectly viable solution if you were to adhere to an interface. We could just as easily pass an object that implements first_name, middle_name, and last_name.

Now what?

Let's take it down to it's smaller components to achieve our goal. We create a new object that will give us an array of name forms at the end, but instead of using Dependency Injection, we are going to pass it the primitives.

What we have now is an object that only concerns itself with one thing: providing an array of name forms from a combination of first, middle, and last name. There is no Rails. There is no need to load up a Rails environment with spec_helper. It's just plain Ruby. Here's our test:

This test runs much faster than our initial test runs, and we didn't have to mock, stub, or re-open any classes or modules. This object can work inside or outside of Rails, it has no dependencies on Rails itself.

Now our feedback loop is much faster. We can continue to re-factor the object with the confidence that we have the test coverage we need. In order to keep backwards compatability, we could easily include this object in our Rails model and then delegate accordingly (use at your own discretion).

What did we do?

First and foremost: we felt the pain. When you feel that pain, drop down a level and truly figure out what the pain points are. I understand this is a trivial example, but in many cases we can break things down into smaller components that have a single responsibility, and can be built like Legos to achieve what we need.

Second: we created a test suite we can have confidence in. We didn't try and sweep our pain under the rug for the sake of speed. We made our code better for us and for those who have to come after us and work on our code.

Third: We isolated the name formatting to it's own responsiblity. Now developers can work on a specific aspect of the name formatting without having to worry about the rest of the Person model. The feedback cycle is faster.

7 Comments Add your comment

  1. Alexander September 6th, 2012

    Very helpful article. Being thinking about that for a while, but completely removing rails from tests is a huge performance win. Everything went better than expected

  2. Josh Cheek September 9th, 2012

    Good blog, enjoyed and mostly agreed with all of your points.

    I found the explicit use of self to be very distracting.

    The require statement in the last test implies load paths are not being correctly managed.

    For the tests like `it "returns the initial of the first name"`, I prefer to use the word "knows", because `Formatter::PersonNameForm` doesn't return the initial, rather `Formatter::PersonNameForm#first_initial` returns it. By saying the class (implicitly: instances of the class) knows the first initial, you don't incur the confusion when you describe different methods (e.g. the next test says it returns the middle initial -- causing me to ask which is it, does it return the first initial or the middle initial). Try running specs with `--format documentation` to see what you are really saying.

    The tests around the nuances of the initials seem like a lot of work with no real benefit, and which prevent you from refactoring to a different implementation. Especially since these were not part of the initial implementation (in the Person model).

    In the end, how do you deal with changes to interfaces? What happens if your Person class changes its `first_name` to `given_name` or `fname`? How do you test this glue? If you don't test it, then as you said, "The mocking and stubbing can give you a false sense of security. You can very easily write tests that will happily pass, even though the implementation may change in the core code." If you do test it, then you must test it within the full context that you spent this post trying to get away from. The pain around these glue tests leads me to suspect that dynamic typing does not fit well with TDD.

  3. Nate Klaiber September 10th, 2012

    Thanks for the response, Josh.

    The explicit use of self is a habit. Working in legacy codebases, this helps to distinguish local variables inside a method from methods on the object itself. I prefer to use an explicit self.

    The require statement is because I am not requiring the spec_helper.rb. This is the same if you were to develop a gem, where you have to require the file under test in order to test it. This isn't a load path issue - it's a part of not having to load up my Rails environment. I don't have a separate spec_helper, because I am only concerned with testing this one file. The components are then responsible to include other files as necessary.

    I agree with your points on the methods surrounding the initials. I'll be posting an addendum to this with the final code of this in production. I moved all of the *_initial methods into private, and also removed the instance variables. The only thing this object does is returns the name forms array.

    This is a continued part of the refactoring process. I don't want callers to use this class to try and get at the initials or full_name, when the purpose is only to return the name forms of a provided name. Moving them into private will help prevent abuse from other callers in the system.

    If my Person class changes, it doesn't matter. The object takes in primitives (Strings) of first, middle, and last. It's up to the caller to pass these things in. If you want to have this covered, it would be an integration test with the model. That would be your glue. The point is, by having this isolated and unit tested - this object has no knowledge of the outside world. It works as it says it does, and is proven with the unit test.

    If the Person changes first_name to given_name, the responsibility is on the caller and this object is unaffected. If you look earlier, we initially passed in a Person instance (or an interface that responds to those methods). That was removed in order to pass in first, middle, and last. This removed the binding to that interface.

    The benefits are this: We have smaller units of code that perform specific tasks and sit outside of the Rails stack. They are plain Ruby objects. Then they can be used within the models if you so choose (as mentioned in the article). However, with the standard unit tests - this is much faster and the contract is tested and protected. Smaller components that build up to larger components. Each smaller unit is tested, and then you can test the integration of the other components that build up larger objects.

  4. Josh Cheek September 10th, 2012

    "this helps to distinguish local variables inside a method from methods on the object itself"

    That's a good point, I really struggle with local variables.

    ---

    "This is the same if you were to develop a gem, where you have to require the file under test in order to test it."

    You shouldn't need to do this, RSpec will set certain load paths by default (lib and spec, I believe). And you can externally add load paths in rspec (or Ruby) with the `-I` flag. If you commonly have a directory you store these in other than lib, you could probably add `-I my/path` to your ~/.rspec file. Or alternatively, you could probably do something like `export RUBYPATH="my/path:$RUBYPATH` in a .rvmrc file or something.

    ---

    "If my Person class changes, it doesn't matter. The object takes in primitives (Strings) of first, middle, and last. It's up to the caller to pass these things in. If you want to have this covered, it would be an integration test with the model."

    Right, do you cover this? If not, how do you change interfaces without fear of going green while broken? If so, can you show the test(s) you used?

  5. Nate Klaiber September 11th, 2012

    You are correct. RSpec will setup those load paths if I include the spec_helper.rb. I don't have it included here, and I don't have an alternate spec_helper.rb to not load my Rails environment. I am also aware of the command line options, .rspec, and shell variables. I don't use those, because there is no pain felt by including the necessary file in the spec file itself. It's explicit and causes no problems. I also don't always rely on Rails autoloading for all files, and keep explicit requires, as that's part of why the Rails app is slow in the first place . It just loads all of lib - which grew exponentially over the years.

    Yes - I do cover that. For the same reason I don't create unit tests with the guts of ActiveRecord or other Rails objects that are already tested at the core, I don't unit test this method when included in an ActiveRecord model. I will, however, put it in an integration test which will load up dependencies and other libraries.

    This isn't an inherent problem with dynamic languages - the same scenario would exist in a statically typed language. When I am working within the PersonNameForm object, I have the confidence that my object adheres to it's contract and is green and working. At that point, I can't possibly know all callers of the object to load up their tests and then double test it with those models. This is why I keep the unit tests focused and without knowledge of the outside world, and then use the integration tests to let me know if other expectations have changed.

    When this refactoring started I put the unit tests on the Person model itself. Then I extracted the method to another object and replaced it. Then I moved the test from the model to an integration test.

  6. Josh Cheek September 12th, 2012

    You honestly shouldn't need a spec_helper, here is an example https://gist.github.com/3703788 The pain I see is that when you rearrange your files (e.g. grouping specs in folder, or moving a model into lib) you will have to go edit all your files. If you don't want to do that, might as well just require_relative, then at least you don't have to do the File.join(File.dirname, ...)

    -----

    "This isn't an inherent problem with dynamic languages - the same scenario would exist in a statically typed language."

    Well, you said you need a second full stack test to verify all of your wiring. But wiring is guaranteed in static languages*, and you don't run into the possibility of passing tests where you shouldn't have, the compiler is comprehensive** where integration/acceptance tests are not.

    * well, unless you take multiple primitives of the same type and mismatch the order

    ** unless you're doing something like typecasting void pointers in C, but at that point you're basically back to a dynamic language, anyway

  7. Nate Klaiber September 12th, 2012

    I think we will have to agree to disagree with the loading of the file itself. I am aware of my options to autoload or require directories of files. I prefer the explicit require in this legacy application. This preference is due to a handful of other reasons that I will work up in a follow up article.

    Depending on the project, I have at least three stacks to run. I have my unit tests testing the smallest parts and components. I have integration tests that test the component being built up into objects. I have acceptance tests that ensure those objects work in the context they are required. This is my personal test workflow, and I understand that may be different for others.

Comments are closed.