Discussion of the Week #1
With Factlink, you can discuss any statement on the web. But the web is big! Therefore we are introducing the Discussion of the Week: a weekly discussion about a current topic.
This week we are discussing exercising:
Join in the discussion by clicking the Factlink! Currently only beta users can participate actively. Sign up for the beta now.
We will be expanding on this concept in the next couple of weeks. If you have ideas for topics to discuss, or how to further improve this concept, don’t hesitate to contact us!
Stubbing the object under test and getting away without it
TL;DR Get better design by mocking and stubbing on the object under test while TDD-ing, as long as you don’t keep the mocks and stubs in your final (regression) test.
Stubbing methods on the object under test is highly frowned upon, but I will show you how to get better design by stubbing methods on the object under test, without delivering an unrefactorable mess.
Lately the Ruby community has focused on narrow unit tests. Spurred on by the fast-tests movement, the focus has shifted from more general integration tests to testing just one object. When writing a unit test it is very common to mock and stub the dependencies of the object. However, most people think stubbing and mocking should stop at the object boundary.
I think you should be free to continue to stub and mock all the way down to the method level, as this gives you more freedom to design top-down (which I prefer).
Test every method, stub every method
It feels really good to test in units that are as small as possible: you can clearly define what you want the unit to do, and then implement it. The question is, how small do you want to make your unit? Most people stop at the object level, because you should be able to freely refactor within your objects. Since stubbing the object under test inhibits refactoring that object, most people are against it.
I disagree: to me the object boundary seems like a rather arbitrary place to stop stubbing. I think the key problem lies in the often overlooked third part of the TDD cycle, Red-Green-Refactor: you don’t stop when your test is green. People often do clean up their code, but overlook their tests. Refactoring your tests is just as important. If you follow this process there is no reason to stop at the object level. Instead we can take the method as our unit.
Therefore I propose the following:
- RED Test every method (in isolation, stubbing every other method)
- GREEN Make code work
- REFACTOR Refactor test until:
- only the public interface is tested
- no more methods on the object under test are stubbed
Imagine a todo app which supports multiple todo lists. We can send the app a todo by email, e.g. “Writing: write blogpost”. This will add the todo ‘write blogpost’ to the ‘Writing’ todo list.
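Roughly, the flow we are aiming for looks like the sketch below. The parsing itself is out of scope for this post, and splitting on ': ' is only a hypothetical stand-in for whatever the real parser does; TodoSaver is the class we will build next.

# Hypothetical: however the email gets parsed, we end up with a list name
# and a todo text, which we hand to the TodoSaver built below.
list_name, todo_text = 'Writing: write blogpost'.split(': ', 2)
# => ["Writing", "write blogpost"]

TodoSaver.new(list_name, todo_text).save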
One convention used here is to only mock the methods we really want to set expectations on, and stub the rest. Because of this we normally only test one thing per test (though not necessarily with one assert/expectation). For further reading on when to use mocks and stubs I recommend Martin Fowler’s article Mocks Aren’t Stubs.
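In the (old) RSpec should syntax used throughout this post, the distinction looks roughly like this. This is a generic illustration with made-up names, not part of the TodoSaver example: a stub merely supplies a canned answer, while a message expectation makes the test fail if the expected call never happens.

# Generic illustration, not part of the TodoSaver example.
describe 'stubs versus mocks' do
  it 'stubs answer, mocks verify' do
    list = mock
    list.stub(:name).and_return('Writing')           # stub: canned answer, call not required
    list.should_receive(:add).with('write blogpost') # mock: fails if add is never called
    list.add('write blogpost')
  end
end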
We skip the parsing of the email, and assume we get the strings list_name and todo_text from our email parser. We start by writing the test for the TodoSaver class, which takes these two arguments and saves the todo using the TodoItem class:
describe TodoSaver do
  describe '#save' do
    it 'creates the todo and adds it to a list' do
      todo_list, todo = mock, mock
      saver = TodoSaver.new('Writing', 'write blogpost')
      saver.stub todo_list: todo_list

      TodoItem.should_receive(:create).with('write blogpost')
        .and_return(todo)
      todo_list.should_receive(:add).with(todo)

      saver.save()
    end
  end

  describe '#todo_list' do
    pending "it returns a todo_list"
  end
end
Note that we expressed the save method under the assumption that we have a todo_list method. To make sure we don’t forget to create this method, we added a pending test.
We can now implement the first version of TodoSaver:
class TodoSaver
  def initialize list_name, todo_text
    @list_name = list_name
    @todo_text = todo_text
  end

  def save
    todo = TodoItem.create(@todo_text)
    todo_list.add todo
  end
end
This code will of course never work: it is missing the todo_list method. But we added a pending test to remind us of that. We will now test our todo_list method, to ensure it behaves like we expect:
describe '#todo_list' do
  it 'retrieves an existing list by its normalized name' do
    todo_list = mock
    saver = TodoSaver.new('WriTing', mock)

    TodoList.stub(:retrieve).with('writing').and_return(todo_list)

    expect(saver.todo_list).to eq todo_list
  end

  pending 'creates a new list if no previous list was found'
end
To implement this, we use a TodoList class in our data layer, which can look up a list by its normalized (lowercased) name. We want this so that a list is still found even when a todo is sent in with different casing for the list name. We can easily fulfill the requirements of the above test:
def todo_list
  TodoList.retrieve(@list_name.downcase)
end
We have a second case however: when the list mentioned isn’t recognized, we want to create a new list. We assume that the user wants to use the casing used in the email for this new list:
it 'creates a new list if no previous list was found' do
  todo_list = mock
  saver = TodoSaver.new('WriTing', mock)

  TodoList.stub(:retrieve).with('writing').and_return(nil)
  TodoList.stub(:create).with('WriTing').and_return(todo_list)

  expect(saver.todo_list).to eq todo_list
end
We can easily implement this alternative using a lazy (short-circuiting) or:
def todo_list
  TodoList.retrieve(@list_name.downcase) || TodoList.create(@list_name)
end
We have now implemented the full TodoSaver class, so this seems like a good time to refactor. I think todo_list should be a private method, but since it is tested directly we cannot make it private just yet. Therefore we now inline the tests written for todo_list into the test for the save method. This means splitting that one test into two tests, covering the different scenarios we tested for the todo_list method:
describe TodoSaver do
  describe '#save' do
    context 'the list to save it to exists' do
      it 'creates the todo and adds it to a list' do
        todo_list, todo = mock, mock
        saver = TodoSaver.new('WriTing', 'write blogpost')

        TodoList.stub(:retrieve).with('writing')
          .and_return(todo_list)
        TodoItem.should_receive(:create).with('write blogpost')
          .and_return(todo)
        todo_list.should_receive(:add).with(todo)

        saver.save()
      end
    end

    context 'the list to save it to does not exist' do
      it 'creates the todo and adds it to a newly created list' do
        todo_list, todo = mock, mock
        saver = TodoSaver.new('WriTing', 'write blogpost')

        TodoList.stub(:retrieve).with('writing').and_return(nil)
        TodoList.stub(:create).with('WriTing')
          .and_return(todo_list)
        TodoItem.should_receive(:create).with('write blogpost')
          .and_return(todo)
        todo_list.should_receive(:add).with(todo)

        saver.save()
      end
    end
  end
end
Because we don’t test todo_list directly anymore, we can now refactor our implementation and make all methods but save private:
class TodoSaver
  def initialize list_name, todo_text
    @list_name = list_name
    @todo_text = todo_text
  end

  def save
    todo_list.add todo
  end

  private

  def todo
    TodoItem.create(@todo_text)
  end

  def todo_list
    existing_list or new_list
  end

  def existing_list
    TodoList.retrieve(@list_name.downcase)
  end

  def new_list
    TodoList.create(@list_name)
  end
end
The result is a fully tested class, tested only through its public methods, without stubbing methods on the object under test. This makes the object easy to refactor.
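For example, because only save is exercised by the tests, we could collapse the private helpers back into a single method and the tests would still pass unchanged. This is purely a hypothetical alternative to illustrate the freedom we gained, not a change we actually made:

# Hypothetical alternative body for TodoSaver#save; same public
# behaviour, so the existing tests keep passing.
def save
  list = TodoList.retrieve(@list_name.downcase) || TodoList.create(@list_name)
  list.add TodoItem.create(@todo_text)
end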
Why do this?
Because we took the intermediate step of stubbing out methods on the object under test we were able to TDD in a top-down fashion. This meant we could postpone design decisions about data retrieval details until we actually cared about them.
I think this is a perfectly valid method of designing classes, and it works better for me because I like to design top-down. I am however really interested in your feedback. Do you ever use this method? Do you keep stubbed out methods on the object under test? Do you have a radically different method which yields the same results?
I have mixed feelings about meetings. They are necessary to get the team on the same page, but they often feel like wasted time. At Factlink we try to keep our meetings short by using agile practices: a short standup, quick planning sessions, and a brief demo plus retrospective. Recently, however, our meetings had been getting longer and longer.
When we reduced our sprint cycle to one week, we had to cut most of our meeting times in half (except the standup). The new time limits were:
- Sprint planning: 1 hour
- Demo: ½ hour
- Retrospective: ½ hour
- Standup: 15 minutes
With the demo we usually hit our target. Since we have one each week, only a limited set of features can be demoed. At the demo there is also more pressure, because we don’t want to bore our guests with technical details. The retrospective often goes a bit over time, but that’s probably because we have beers afterwards, kick back and reflect on the past week.
However, a couple of weeks ago we noticed that both our standup and sprint planning became way too long. We’ve had sprint plannings of over 2 hours, and standups of 30–45 minutes.
We did two things to shorten our meetings. First, we started strictly enforcing time limits for meetings. When a time limit is reached, the meeting ends right then. This may be annoying at first, but it gives us a better feeling for how much time there still is in a meeting, and consequently an idea about what might not need to be said. This way, we develop an ‘intuition’ that helps us assess when to stop a discussion.
Our second measure was the one-legged standup. The idea of a standup is that standing becomes annoying after a while, so meetings stay short. Apparently it didn’t bother us enough (or we had developed some serious standing stamina). So now everyone had to stand on one leg throughout the entire standup. It worked wonders: the first day our standup took less than 10 minutes, while we still discussed every story on the board.
After about a week we stopped with the one-legged standup, as we felt we had gotten used to shorter standups again. It seems to have worked: we haven’t had a standup longer than 15 minutes since then. Our other meetings have become shorter as well. This week’s sprint planning ended after exactly one hour, without any intervention.
YOLO: spend less time deploying, more time for development
As a developer, you want to focus on solving problems and building fancy new features. This is easy at the beginning of a project, but as your product starts to shape up and the user base increases, delivering consistent quality has to become an important aspect of your development process.
Since manual testing and deployment are usually quite labor-intensive and error-prone, we decided early on to automate these processes at Factlink.
We have been using Continuous Integration within our development process for over a year and a half. It allows us to release our code base often while maintaining high quality. Some advantages that we’ve experienced so far:
- improved quality of software
- short(er) release cycles
- quality control and deployment made simple
- easy to automate tasks
We’ve accomplished this by improving the process we use for integrating our code back into the main branch and by automating a lot of tasks like testing and deployment. Jenkins, the Continuous Integration server we use, is a great friend when it comes to automation and it serves as a central hub in our integration, test, and deployment process.
Integrate your code often
Working on large features often causes problems when trying to integrate your code back into the main branch. It can be frustrating and boring to fix merge conflicts. The product owner probably won’t be happy either because it means less time to spend on developing new features. Stuff should just work, right?
Using git-flow to support smoother branching enables developers to work on small feature branches that can be quickly integrated back into the development branch. By keeping a feature branch in sync with the development branch, merging it back into development typically becomes a piece of cake, and merge conflicts arise far less frequently.
We’ve set up Jenkins to automatically check out the latest commits pushed to the development branch and to the feature branches, and then run the complete test suite and additional tests for each of them. The steps that Jenkins runs for each job are defined in our projects’ codebase, allowing developers to easily adjust or add steps.
For each commit to the development branch, Jenkins runs the following tasks:
- install latest dependencies
- codebase security checks
- unit tests
- integration tests
- acceptance tests
- screenshot tests
- deploy to testing server - but only if all previous steps were successful
The order of running these tasks is important: we want to break on errors as soon as possible to avoid waiting time before we can start fixing things. This explains the order of the tests: first the unit tests (fast), then the integration tests (semi-fast), and finally the acceptance and screenshot tests (slow). When one of the tests fails, Jenkins will notify the development team by posting a message to our development Hipchat channel and showing a red screen on our office dashboard.
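To give an idea of what such a job definition can look like, here is a minimal Rakefile sketch. The task name, tool choices, and commands are made up for illustration and are not our actual configuration, but the principle is the same: each step runs in order and, because sh raises on a non-zero exit status, the first failure aborts the build, so the deploy at the end only happens when every previous step has passed.

# Hypothetical CI task; tool choices and paths are illustrative only.
task :ci do
  sh 'bundle install'                      # install latest dependencies
  sh 'bundle exec brakeman -q'             # codebase security checks (example tool)
  sh 'bundle exec rspec spec/unit'         # unit tests (fast)
  sh 'bundle exec rspec spec/integration'  # integration tests (semi-fast)
  sh 'bundle exec cucumber'                # acceptance tests (slow)
  sh 'bundle exec rake screenshots'        # screenshot tests (project-specific task)
  sh 'bundle exec cap testing deploy'      # deploy, only reached if all previous steps passed
end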
Changes to the development branch are automatically deployed to the testing environment, and changes to the master branch are automatically deployed to the staging environment. We don’t deploy to production automatically yet, because we still have a couple of manual checks that need to be done, for instance testing and uploading our Chrome Extension (something that is hard to automate). The manual testing is pretty easy and takes around five minutes to complete. When successful, deploying to production is just a final click of a button.
Using Continuous Integration in your project
When starting a new project, you should really invest some time in setting up a Continuous Integration environment.
Things to organize are:
- a solid test suite
- a DTAP (development, testing, acceptance, production) lane
- a Continuous Integration server
- a branching model for your codebase
- storing your test scripts in version control, not on the CI server
When done right, this approach creates a great foundation for getting to fast development cycles and quality deployments!
Increasing Development Speed by Decreasing Cycle Time
Every team has been there: the team’s velocity is decreasing without a clear reason why. Sprints go unfinished and bugs are creeping in. Some stories are almost done, and with a quick calculation you can usually argue that in hindsight your velocity was… well, less than the previous sprint’s. During the retrospective, initiatives are discussed to improve things, but somehow, during the next iteration, even less is accomplished. Where is this ‘hyperproductivity’ that Jeff Sutherland promised us?
I have seen this happen in several teams and witnessed the frustration it causes. Not only for customers (who don’t get what was promised) or management (which gets frustrated by inefficiency and waste), but also for the development team: team members increasingly feel incapable of doing their jobs the way they feel they should.
So when we saw this coming at Factlink, we decided to try to act before our velocity would plummet. Initially, we used sprints of two weeks each in which we delivered around 13 story points. For a few sprints in a row, we failed to reach our target velocity. A natural response to this would be to increase the length of iterations. The logic behind this being:
- “Waste less time on meetings”
- “Waste less time on deploying the release”
- “Have more time to perform solid testing in order to make sure all stories have been completed correctly”
- “Take on larger stories successfully”
What we actually did might sound counterintuitive: we decreased the length of our sprints. Instead of using sprints of two weeks each, we went for sprints of one week. The idea was that the change could potentially increase the speed at which we learn and adapt. The challenge was to squeeze everything into one week while minimizing overhead at the same time. We decided to cut meeting time in half for each meeting — one hour for our sprint planning and half an hour for both our demo and retrospective. Then we got to work!
Surprisingly, we finished the next sprint with a velocity of 12 points — roughly the same as we used to achieve in two weeks before. Initially, we were a bit unsure whether it was just a lucky shot, but week after week we were able to deliver significantly more than when we had two-week sprints.
We have now reached a velocity of about 18 points per week and climbing. This means we increased our velocity over two weeks from 13 to 36 points!
What had happened? We had some discussions on this subject and came up with several factors that could have impacted our velocity:
- The shorter cycle time forced us to split up stories into smaller ones. This resulted in improved estimates and increased reliability of product delivery.
- It also pushed us to speed up our automated testing and deployment infrastructure. This has reduced both the amount of time it takes to discover bugs and the time spent on deployments.
- The end of the week horizon to deliver the product creates an increased focus on finishing things.
- We got rid of ‘almost done’. We used to almost finish most stories in the first week, after which we only needed to finish the last bits in the second week (remember the Pareto Principle?). Now everyone gets nervous during the Tuesday morning standup when no stories have been completed yet after the first day of work on the new sprint.
We are currently considering what other things we can do to increase our velocity. Should we introduce even shorter sprints? Why not iterate twice a week? Or every day? Twice a day?
An important thing that any Agile team should ask themselves is: What is the optimal length of each sprint for our team?
Please let me know your thoughts and experiences!