Feature toggles have been an obsession of mine lately, ever since a customer pointed out that we talk about them a lot but don’t have any posts about them on our blog.

Feature toggles present several interesting “issues” when it comes to testing. I was going to say “problems” but that’s not entirely accurate. I think that addressing some of these “issues” actually makes the code under test more testable and, as a result, improves its quality. These issues manifest at multiple levels and in several different ways.

Depending on how you implement toggles, unit testing code that relies on feature toggles can be pretty easy. In our infrastructure, the actual toggles are a Java enumeration. We have a component that’s responsible for checking whether a toggle is enabled. As often as possible, we adhere to the idea of Dependency Injection when designing code. By injecting that component everywhere we need to check toggles, we make testing the toggle-dependent code as easy as injecting a fake object (mock, stub, etc.) into the code under test.
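To make that concrete, here is a minimal sketch of the idea in plain Java. It is not our actual code: the names `FeatureToggle`, `ToggleChecker`, `CheckoutService`, and `StubToggleChecker` are all made up for illustration, and I’ve used a hand-rolled stub rather than a mocking library so the example stands on its own.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical toggle names; the only given is that toggles are a Java enumeration.
enum FeatureToggle { NEW_CHECKOUT, BULK_EXPORT }

// Hypothetical interface for the component that checks whether a toggle is enabled.
interface ToggleChecker {
    boolean isEnabled(FeatureToggle toggle);
}

// Code under test: the checker arrives through the constructor,
// so a test can hand in whatever implementation it likes.
class CheckoutService {
    private final ToggleChecker toggles;

    CheckoutService(ToggleChecker toggles) {
        this.toggles = toggles;
    }

    String checkoutPage() {
        return toggles.isEnabled(FeatureToggle.NEW_CHECKOUT)
                ? "new-checkout"
                : "legacy-checkout";
    }
}

// Hand-rolled stub: reports a toggle as enabled only if it is in the given set.
class StubToggleChecker implements ToggleChecker {
    private final Set<FeatureToggle> enabled;

    StubToggleChecker(Set<FeatureToggle> enabled) {
        this.enabled = enabled;
    }

    @Override
    public boolean isEnabled(FeatureToggle toggle) {
        return enabled.contains(toggle);
    }
}

class ToggleUnitTestSketch {
    public static void main(String[] args) {
        CheckoutService toggledOn = new CheckoutService(
                new StubToggleChecker(EnumSet.of(FeatureToggle.NEW_CHECKOUT)));
        CheckoutService toggledOff = new CheckoutService(
                new StubToggleChecker(EnumSet.noneOf(FeatureToggle.class)));

        System.out.println(toggledOn.checkoutPage());  // prints "new-checkout"
        System.out.println(toggledOff.checkoutPage()); // prints "legacy-checkout"
    }
}
```

Both states of the toggle get exercised without touching any real toggle storage, which is the whole point of injecting the checker.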

One issue that can present itself when testing toggles at the unit level is the idea of checking toggles inside of code that is not managed by your dependency injection framework (in our case Spring). For example, if some code in an ORM-managed entity needs to check a toggle, how do you do that? The short answer is that you shouldn’t. We’ve done it and it sucks.

The toggle-dependent logic should only be checked in places where you control the lifecycle of the object under test. Doing otherwise makes the code harder to test cleanly. I try to stick by the idea that if you can’t easily write an automated test for a piece of code, then there’s something wrong with the code itself.

And no, checking the toggle in a static method does not count as writing a clean test. We still do it when it can’t be avoided, but we try to avoid it at all costs.

Higher level tests, such as browser tests or integration tests, present a different set of concerns.

We’ve blogged a lot about our browser test implementation and how we sped it up by running tests in parallel. Our integration tests (tests that go to the database but don’t drive the browser) run in parallel the same way. Our feature toggle implementation started off with the idea of the toggle being on or off globally.

Do you see the next problem yet?

Having tests that rely on the toggle being off running in parallel with tests that rely on the toggle being on does not put your test suite into a happy state. You will end up with data-interaction issues that cause intermittent test failures (flaky tests).

This potential for data interaction caused us to change the way we manage toggles. We now allow toggles to be turned on for individual users, whole subscriptions (a group of users who share the same data), or globally within the app. Because the subscription boundary is a “sacred boundary” within our application as far as data is concerned, our tests that interact with the database most often use a single subscription. The subscription is created as part of the test setup, and by also turning on the toggle for that subscription during the setup, we allow many different tests to potentially exercise the same toggle. Those tests can now be run in parallel with zero chance of data interaction.
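Here is a rough sketch of what scoped toggles can look like. Again, this is an illustration and not our implementation: `ScopedToggleStore`, its method names, and the use of numeric IDs for users and subscriptions are assumptions I’ve made for the example, and a real version would be backed by the database instead of in-memory sets.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical toggle name for illustration.
enum FeatureToggle { NEW_CHECKOUT }

// In-memory stand-in for a toggle store that supports three scopes:
// individual users, whole subscriptions, and global.
class ScopedToggleStore {
    // Concurrent sets so tests running in parallel can share one store safely.
    private final Set<FeatureToggle> global = ConcurrentHashMap.newKeySet();
    private final Set<String> bySubscription = ConcurrentHashMap.newKeySet();
    private final Set<String> byUser = ConcurrentHashMap.newKeySet();

    void enableGlobally(FeatureToggle toggle) {
        global.add(toggle);
    }

    void enableForSubscription(FeatureToggle toggle, long subscriptionId) {
        bySubscription.add(key(toggle, subscriptionId));
    }

    void enableForUser(FeatureToggle toggle, long userId) {
        byUser.add(key(toggle, userId));
    }

    // A toggle is on if any scope that applies to this user has it enabled.
    boolean isEnabled(FeatureToggle toggle, long userId, long subscriptionId) {
        return global.contains(toggle)
                || bySubscription.contains(key(toggle, subscriptionId))
                || byUser.contains(key(toggle, userId));
    }

    private static String key(FeatureToggle toggle, long id) {
        return toggle.name() + ":" + id;
    }
}

class ScopedToggleSketch {
    public static void main(String[] args) {
        ScopedToggleStore store = new ScopedToggleStore();

        // Test A's setup: its own subscription (id 1) with the toggle turned on.
        store.enableForSubscription(FeatureToggle.NEW_CHECKOUT, 1L);

        // Test B's setup: its own subscription (id 2), toggle left off.
        System.out.println(store.isEnabled(FeatureToggle.NEW_CHECKOUT, 10L, 1L)); // prints "true"
        System.out.println(store.isEnabled(FeatureToggle.NEW_CHECKOUT, 20L, 2L)); // prints "false"
    }
}
```

Because each test only ever asks about its own subscription, a test that needs the toggle on and a test that needs it off can share the same store and still never see each other’s state.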

One problem we still need to solve is how to effectively test code that can only be globally toggled. Some code runs early enough during a web request that we don’t yet know who the user is, or even what subscription they belong to. In those cases, we will usually set the toggle globally when the test suite starts. That allows us to test one state of the toggle (whichever state is more important), but it doesn’t allow us to test the other state in a thread-safe way.

We still have some hurdles to overcome with the way we manage testing with toggles, but we’re doing a good job of learning as we go. We’ve come up with effective ways to test toggles in production-like scenarios and eliminate data interaction between tests.

What did I miss? Let me know in the comments.