Refactoring in Practice: Speeding up Your Rails Tests
I spent some time over the past year working with Kevin Solorio, a great friend and colleague. Kevin worked on our engineering team at Within3, and has since moved to a great team at Designing Interactive.
Kevin and I would get together on Monday mornings at a local coffee shop and work on some code. Sometimes we talked through a specific project; other times we picked out parts of our codebase that we wanted to improve. This allowed us to progressively improve our application, as well as develop a shared knowledge of the different moving parts.
There have been many discussions about what it means to make your Rails tests fast. I don't want to regurgitate what is already being said in the Rails community. Instead, I want to highlight an example and the thought process behind it.
What's the real problem?
I don't believe that the problem is Rails.
The problem is a lack of clear separation of concerns (or, put another way, a violation of the Single Responsibility Principle). The tests just happen to highlight that the code may be doing too much and needs to be extracted. As programmers we need to understand and feel that pain before we can remove it.
At this point, some would jump to the seemingly easy route. They identify Rails as the problem, and so the solution must be to remove Rails from the equation. This can be done several ways:
- Excessive mocking and stubbing. Essentially putting different actors in place of the real objects.
- Re-opening and re-defining classes or modules.
You might be wondering what is so bad with the above approaches.
By doing either of these, you are simply masking the problem and making the test suite fragile.
- The mocking and stubbing can give you a false sense of security. You can very easily write tests that will happily pass, even though the implementation may change in the core code.
- Re-opening and re-defining classes creates dependency issues, and can break later tests that rely on or call that model.
Yes, a build server will ultimately catch these if you set it up properly. However, it blinds you to the problem in the short term, leading you to believe your code is fine when it clearly isn't.
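To make the first point concrete, here is a minimal sketch of a stub drifting away from the real object. Everything in it is hypothetical: the rename from get_name_forms to name_forms, the StubbedPerson class, and the search_terms method are invented for illustration and are not taken from our codebase.

```ruby
# The real class. Suppose get_name_forms was renamed to name_forms at some point.
class Person
  def name_forms
    ["Grace Hopper"]
  end
end

# A hand-rolled stub that was written against the old method name.
class StubbedPerson
  def get_name_forms
    ["Grace Hopper"]
  end
end

# Hypothetical code under test that still calls the old method.
def search_terms(person)
  person.get_name_forms.map(&:downcase)
end

# The test against the stub keeps passing...
search_terms(StubbedPerson.new) # => ["grace hopper"]

# ...even though the real object no longer responds to that method,
# so the same call would raise NoMethodError in the running app.
Person.new.respond_to?(:get_name_forms) # => false
```

The stubbed test stays green while the application itself is broken, which is the false sense of security described above.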
Remember: it's about quality, not quantity. We don't want to sacrifice the quality of our tests for speed.
If you truly feel the pain of the code, you need to look at the code and ask: does this need Rails? It's important not to skip this step. This is the key to making your tests faster: detaching them from Rails where it makes sense.
Where's the code?
I will present a trivial example. We have a Rails ActiveRecord Person model. This model has grown over the years and houses many different methods and responsibilities. One of the methods is called get_name_forms and is used as a helper method in other areas of the app to pre-filter searches for content authored by our person. This is useful when passing to internal search engines (Solr, Sphinx, etc.), and also when passing to third-party search services.
The code started out in our model like this:
The get_name_forms method is one of many methods found in the Person model. Now we write a test to verify we are getting back the expected data.
Let's go back to feeling the pain. In order to test this simple method, I have to include my spec_helper, which in turn loads up my slow Rails environment. Writing tests around this now becomes a chore. The feedback loop from the tests may be minutes. This does not make for a happy programmer.
Earlier I mentioned one of the solutions would be to re-open and re-define classes or modules. At this point, you may be tempted to do something like:
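Here is a rough sketch of that temptation. It rests on several assumptions: the empty ActiveRecord::Base stand-in, the hand-written attribute accessors, and the particular name forms that get_name_forms builds are all illustrative guesses, not the code from our app. The model body is inlined so the snippet runs on its own; in a real suite it would be loaded from the model file.

```ruby
# Stand-in for Rails: an empty ActiveRecord::Base, defined only so the
# model can be loaded without booting the framework.
module ActiveRecord
  class Base
  end
end

class Person < ActiveRecord::Base
  # Re-defined by hand; in Rails these would come from the database columns.
  attr_accessor :first_name, :middle_name, :last_name

  # Assumed behavior: return the name in a handful of common forms.
  def get_name_forms
    forms = [
      "#{first_name} #{last_name}",
      "#{last_name}, #{first_name}",
      "#{first_name[0]}. #{last_name}"
    ]
    return forms if middle_name.to_s.empty?

    forms + [
      "#{first_name} #{middle_name} #{last_name}",
      "#{first_name} #{middle_name[0]}. #{last_name}"
    ]
  end
end

person = Person.new
person.first_name = "Grace"
person.middle_name = "Brewster"
person.last_name = "Hopper"
person.get_name_forms
# => ["Grace Hopper", "Hopper, Grace", "G. Hopper",
#     "Grace Brewster Hopper", "Grace B. Hopper"]
```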
Now we have the appearance of speed. We have essentially re-created our Person model declaration. The feedback loop will be shorter, but does it ultimately solve our problem?
This is a code smell. Why are you masking Rails here? Is Rails the problem, or is your code the problem? Let's break down the pieces we actually need: first_name, middle_name, and last_name. We need three things, all of which are primitives. They are Strings. Our expected output is an Array. Now ask yourself, does all of this logic need to live in our model?
The first pass could be something that moves it out of the model and into its own file.
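A sketch of what that first pass might look like. The class name PersonNameForms, the to_a method, and the name forms themselves are assumptions made for illustration. The Struct at the bottom only stands in for a Person record so the snippet runs by itself; in the app you would hand it the actual ActiveRecord record.

```ruby
# Hypothetical first pass: the logic lives in its own class (and its own
# file), and the person is injected through the constructor.
class PersonNameForms
  def initialize(person)
    @person = person
  end

  # Returns an array of name forms built from the injected person.
  def to_a
    first  = @person.first_name
    middle = @person.middle_name
    last   = @person.last_name

    forms = ["#{first} #{last}", "#{last}, #{first}", "#{first[0]}. #{last}"]
    return forms if middle.to_s.empty?

    forms + ["#{first} #{middle} #{last}", "#{first} #{middle[0]}. #{last}"]
  end
end

# Stand-in for a Person record, used here only so the example is runnable.
PersonStandIn = Struct.new(:first_name, :middle_name, :last_name)

PersonNameForms.new(PersonStandIn.new("Grace", nil, "Hopper")).to_a
# => ["Grace Hopper", "Hopper, Grace", "G. Hopper"]
```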
Now we have a separate object, and we use Dependency Injection to pass it the Person record. Our test is still slow, however, since we are using the Person model. We want to get away from that dependency.
Let's take a step higher. The goal of this method is to simply return an array of different name forms when provided a first, middle, and last name. It's simple. It has one responsibility. It shouldn't concern itself with our Person model at all.
What we have done here is move the problem to another part of the application. We haven't yet solved our initial problem: speeding up our feedback loop while keeping quality tests we can trust.
As a sidenote, this could be a perfectly viable solution if you were to adhere to an interface. We could just as easily pass an object that implements first_name, middle_name, and last_name.
Let's take it down to its smallest components to achieve our goal. We create a new object that will give us an array of name forms at the end, but instead of injecting a Person, we are going to pass it the primitives.
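A minimal sketch of such an object, under the same assumptions as before: the class name NameForms, the to_a method, and the specific forms it returns are illustrative, not lifted from our codebase.

```ruby
# Plain Ruby object: takes three strings and returns an array of name forms.
# Nothing here knows about Person, ActiveRecord, or Rails.
class NameForms
  def initialize(first_name, middle_name, last_name)
    @first_name  = first_name
    @middle_name = middle_name
    @last_name   = last_name
  end

  def to_a
    forms = [
      "#{@first_name} #{@last_name}",
      "#{@last_name}, #{@first_name}",
      "#{@first_name[0]}. #{@last_name}"
    ]
    # Skip the middle-name variants when no middle name was given.
    return forms if @middle_name.to_s.empty?

    forms + [
      "#{@first_name} #{@middle_name} #{@last_name}",
      "#{@first_name} #{@middle_name[0]}. #{@last_name}"
    ]
  end
end

NameForms.new("Grace", "Brewster", "Hopper").to_a
# => ["Grace Hopper", "Hopper, Grace", "G. Hopper",
#     "Grace Brewster Hopper", "Grace B. Hopper"]
```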
What we have now is an object that only concerns itself with one thing: providing an array of name forms from a combination of first, middle, and last name. There is no Rails. There is no need to load up a Rails environment with spec_helper. It's just plain Ruby. Here's our test:
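What follows is a sketch of that kind of test rather than the original listing. The mention of spec_helper suggests the project used RSpec; Minitest is used here only because it is bundled with Ruby, so the example runs without installing anything. The NameForms class and its name forms are the same assumptions as above, inlined so the file is self-contained. In a real project the class would be pulled in with a require instead.

```ruby
require "minitest/autorun"

# Inlined for the sake of a runnable example; normally this would be
# something like `require_relative "name_forms"`.
class NameForms
  def initialize(first_name, middle_name, last_name)
    @first_name, @middle_name, @last_name = first_name, middle_name, last_name
  end

  def to_a
    forms = ["#{@first_name} #{@last_name}",
             "#{@last_name}, #{@first_name}",
             "#{@first_name[0]}. #{@last_name}"]
    return forms if @middle_name.to_s.empty?

    forms + ["#{@first_name} #{@middle_name} #{@last_name}",
             "#{@first_name} #{@middle_name[0]}. #{@last_name}"]
  end
end

# No spec_helper and no Rails environment: only the object under test.
class NameFormsTest < Minitest::Test
  def test_includes_middle_name_forms_when_a_middle_name_is_given
    forms = NameForms.new("Grace", "Brewster", "Hopper").to_a

    assert_includes forms, "Grace Hopper"
    assert_includes forms, "Grace Brewster Hopper"
    assert_includes forms, "Grace B. Hopper"
  end

  def test_omits_middle_name_forms_when_the_middle_name_is_missing
    forms = NameForms.new("Grace", nil, "Hopper").to_a

    assert_equal ["Grace Hopper", "Hopper, Grace", "G. Hopper"], forms
  end
end
```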
This test runs much faster than our initial test, and we didn't have to mock, stub, or re-open any classes or modules. This object can work inside or outside of Rails because it has no dependencies on Rails itself.
Now our feedback loop is much faster. We can continue to refactor the object with the confidence that we have the test coverage we need. In order to keep backwards compatibility, we could easily include this object in our Rails model and then delegate accordingly (use at your own discretion).
What did we do?
First and foremost: we felt the pain. When you feel that pain, drop down a level and truly figure out what the pain points are. I understand this is a trivial example, but in many cases we can break things down into smaller components that have a single responsibility, and can be built like Legos to achieve what we need.
Second: we created a test suite we can have confidence in. We didn't try to sweep our pain under the rug for the sake of speed. We made our code better for us and for those who have to come after us and work on our code.
Third: we isolated the name formatting into its own responsibility. Now developers can work on a specific aspect of the name formatting without having to worry about the rest of the Person model. The feedback cycle is faster.