This is the first in a series of posts that will cover some of the challenges we faced in automating functional tests at the browser level and the actions we took to remedy issues with our existing test suite.

A long, long time ago (i.e. October of last year), our main suite of browser-based functional tests was oft-maligned, cursed regularly and doubted as a matter of course. With good reason.

Exceptionally long execution times (~9 hours) prevented us from learning quickly what mistakes we had made in writing the tests. False failures had become a fact of life: we would regularly see tests fail in the build environment that passed locally, causing enormous frustration. We even got to the point of running all failed tests twice on our CI system, just to make sure they were really broken. Our relative lack of experience with the Ruby language was limiting our ability to refactor effectively, or even debug easily.
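To make the "run every failure twice" workaround concrete, here is a minimal Ruby sketch. It is an illustration only, not the actual CI setup: `run_with_rerun` and the test names are hypothetical, and a "test" is assumed to be a callable that returns true on pass and false on fail.

```ruby
# Hypothetical sketch: run each test, re-run any failure once, and classify it.
# A test that passes on the second attempt is reported as flaky, not broken.
def run_with_rerun(tests)
  results = { passed: [], flaky: [], broken: [] }
  tests.each do |name, test|
    if test.call
      results[:passed] << name
    elsif test.call # second attempt, made only after a first failure
      results[:flaky] << name
    else
      results[:broken] << name
    end
  end
  results
end

# Stand-in tests (assumptions for the example, not real browser tests).
attempts = 0
tests = {
  "stable" => -> { true },
  "flaky"  => -> { (attempts += 1) > 1 }, # fails once, then passes
  "broken" => -> { false }
}

results = run_with_rerun(tests)
puts results.inspect
```

The sketch also shows why this is a workaround and not a fix: the flaky test is still flaky, the suite just takes longer to admit it.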

Since it was clear that the tests were no longer delivering the value they once had, efforts to develop a new Java-based framework were underway. This further undermined the existing Ruby framework, as there was a reluctance to maintain what was seen as an obsolete technology.

But we came to realize that we were only digging ourselves deeper by continuing on that path. This brought on a period of departmental soul-searching while a self-selected sub-team discussed the direction we should take with the browser tests and, most importantly at the time, the language in which we’d write them. We couldn’t continue supporting two frameworks, so we needed to get on with choosing one.

There were some strong opinions on the topic of language, on both sides of the debate. Some of us, myself included, preferred to abandon the nascent Java framework in favor of the far more complete Ruby framework, since Ruby is a language that lends itself well to acceptance tests. Others argued that the Java framework had a brighter future: Java was our primary development language, and the framework already in place matched the architectural direction we were taking, unlike the Ruby framework, which covered the ‘legacy’ architecture.

We reached consensus after some to-ing and fro-ing, during which we considered our two options and assessed a few others for good measure. Essentially, it came down to the effort that would be required to convert all the ‘old’ style pages into Java, which would leave us with two frameworks to maintain until all the legacy coverage had been supplanted (and that’s assuming the work to convert existing pages would ever be prioritized to completion). Converting the Java framework into Ruby would be almost trivial in comparison. So we chose to convert the Java framework into Ruby, allowing us to concentrate our efforts on one framework and hopefully start climbing out of the hole we had found ourselves in.

We had recognized that there was a great deal of value in the overall test framework, and we were in a position to leverage it to our advantage now that we had passed through analysis paralysis, chosen a language and formed a temporary team to put things right.

That team, and the department as a whole, have since taken great steps to fix the situation. These range from getting the test suite execution time under twenty minutes (more on this in my next post) to making the suite much less likely to cry wolf by eliminating test ‘flakiness’. Some fixes were simple one-liners that enabled us to remove a vast number of synchronization errors. Others were far more involved and required writing custom Selenium handlers to efficiently parallelize the test suite. We also invested in department-wide Ruby training, giving the wider team a much higher level of confidence in maintaining the tests and allowing for collective ownership of the suite.
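The post does not say what those one-liners were. As one common example of how a synchronization error gets fixed, the sketch below polls for a condition instead of sleeping for a fixed time. `wait_until` is a hypothetical helper written for this illustration, and the counter stands in for a browser check such as "is the element visible yet?".

```ruby
# Hypothetical helper: poll a block until it returns a truthy value,
# raising if the timeout elapses first. Uses a monotonic clock so that
# wall-clock adjustments cannot shorten or lengthen the wait.
def wait_until(timeout: 5, interval: 0.1)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    result = yield
    return result if result
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise "condition not met within #{timeout}s"
    end
    sleep interval
  end
end

# Stand-in for a page that becomes ready on the third check (an assumption
# for the example; a real test would query the browser here).
polls = 0
value = wait_until(timeout: 2, interval: 0.01) { (polls += 1) >= 3 && :ready }
puts "ready after #{polls} polls"
```

With a helper like this, a call site can shrink to a single line such as `wait_until { page_ready? }`, where `page_ready?` is again a hypothetical check. Selenium's Ruby bindings provide a similar facility in `Selenium::WebDriver::Wait`.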

Throughout this process of refactoring, we’ve learned how to write better, smarter and, above all, more robust tests that fail when they find a problem in the application, not because the application highlighted a problem in a test.