Tue 29 Nov 2011
The ATDD Experiment
The job of software testing at Rally is different than any other testing job I’ve had, in a very positive way. As with all software organizations, we want to safely reduce the amount of manual testing and continuously improve our regression test suite so we can shorten the time it takes to qualify release candidates. At Rally, the developers write automated unit, integration, and GUI tests as part of each story so we improve our test coverage with each commit, which will lead to less manual testing, more reliable software, and shorter cycle times.
Testers, by profession, are not a trusting bunch. We want to prove or see proof that whatever is being tested is really working properly. One of the biggest areas many testers have problems with is trusting that the automation is doing the right thing (especially when we didn’t write it).
To help us sleep better at night, we’re trying an experiment to see whether we can improve our automated tests and expand further into the TDD universe. Or, in our case, the ATDD (Acceptance Test Driven Development) universe.
As developers are ready to work on a story, we (the developer(s) and the tester on the team) walk through the implications of the story in terms of the system as a whole and figure out what needs to be tested and the appropriate type of test to use. This gives the developers insight into both the happy and cranky path tests to cover, as well as making sure that the testers are comfortable with the quality of the automated test coverage. For small stories, we add the list of tests into the Acceptance Criteria of the story or bug. For large stories, the list may be in the form of an attached spreadsheet that is updated as the tests are implemented.
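As a rough illustration of the kind of list described above, here is a minimal sketch in Python. It is not how Rally stores acceptance criteria. The names `Kind`, `PlannedTest`, and `needs_manual_attention`, and the sample story tests, are all hypothetical and invented for this example. It only shows how a planned-test list could record the test type and the happy or cranky path, and then tell the tester what still deserves a manual look.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    """Type of test agreed on during the story walkthrough (hypothetical names)."""
    UNIT = "unit"
    INTEGRATION = "integration"
    GUI = "gui"
    MANUAL = "manual"


@dataclass
class PlannedTest:
    description: str
    kind: Kind
    path: str  # "happy" or "cranky"
    implemented: bool = False  # flipped to True once the automated test exists


def needs_manual_attention(tests):
    """Return tests that are manual by design or not yet automated."""
    return [t for t in tests if t.kind is Kind.MANUAL or not t.implemented]


# Made-up example list for a single story.
story_tests = [
    PlannedTest("Save a story with valid fields", Kind.GUI, "happy", implemented=True),
    PlannedTest("Reject a story with an empty name", Kind.UNIT, "cranky", implemented=True),
    PlannedTest("Save survives a dropped connection", Kind.INTEGRATION, "cranky"),
    PlannedTest("Layout looks right on a small screen", Kind.MANUAL, "happy"),
]

remaining = needs_manual_attention(story_tests)
for t in remaining:
    print(f"[{t.path}] {t.kind.value}: {t.description}")
```

A bullet list or spreadsheet carries the same information; the point is only that each planned test has a type, a path, and a done/not-done state that is updated as the tests are implemented.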
The desired result of early and additional planning is that testers will feel the need to run only a subset of the tests manually because the rest have been automated and because we have confidence in the automation and the coverage it provides. In practice, we’re finding areas where our automation isn’t quite doing what it should be doing, so as testers we have to do some risk analysis and figure out where we need to poke, just to make sure that things are behaving correctly. So there’s still some refining to do, but it feels like our approach is sound. As more tester-validated automation is implemented, we feel safer in reducing the amount of manual testing, our regression test suite keeps growing bigger and more reliable, and our cycle times are growing shorter. Seems like progress.

“For small stories, we add the list of tests into the Acceptance Criteria of the story or bug. For large stories, the list may be in the form of an attached spreadsheet that is updated as the tests are implemented.”
Can you elaborate on why you are resorting to attached spreadsheets for the tests? Do you have an outstanding story to support your test needs within Rally?
The ATDD process has been a period of experimentation for the team. We’ve tried adding tests to “acceptance criteria”, to discussion posts, and as spreadsheets attached to the story. All along, we’ve been working on ways to measure its efficacy.
ATDD has been intentionally lightweight to encourage quick and easy adoption.
Our next step, which I think you’re asking about, is to use Rally’s quality control features: test plans, test cases, and test suites. IMO it is very important that this process does not impede the regular work cadence. If we put too much process in the system, it’ll become a morass of paperwork, questionnaires, and checklists. I’ve seen that happen, and people just stop doing it.
On the flip side, if we can quantitatively demonstrate that any process overhead is worthwhile because it enhances quality and decreases the defect count, then it’s an easy sell and we’ll keep doing it.
To extend Steve’s comment just a bit, it makes a HUGE difference that the vast majority of ATDD tests are automated.
Over time, only manual tests really benefit from administrative tracking (test plans composed of cases and suites traceable to individual stories). After all, it’s being manual that makes them vulnerable to being skipped, ignored, misinterpreted, etc. It’s the delayed feedback (inevitable for manual tests) and the resulting obscured cause of failures that necessitate direct linkage to the story.
In contrast, automating tests shrinks the need for tracking to the brief period while they are under development. Listing them together helps us ensure desired coverage. After that, they “just run,” and failures attract attention as needed. A bullet list in the acceptance criteria or an attached spreadsheet meets this need without burdening the process once that value has been realized.