Currently, our browser tests are written in Ruby and RSpec, using Selenium to interact with our site through the browser. A while back, Selenium 2.0 was released, the primary new feature being integration with the WebDriver API. According to their documentation:

This addresses a number of limitations along with providing an alternative, and simpler, programming interface. The goal is to develop an object-oriented API that provides additional support for a larger number of browsers along with improved support for modern advanced web-app testing problems.

Although we’ve since upgraded the Selenium server, our client-side testing code still uses the Selenium 1.x Client API. While this has been working for us, I wanted to spend my hackathon moving us to the newer WebDriver API to take advantage of its improvements. A great starting point has been Selenium’s migration documentation. I’m currently knee-deep in the migration process, and below are some of the issues I’ve had to solve.


To ease the developer’s burden of migrating their entire codebase to use the new WebDriver API, the Selenium team has included a WebDriverBackedSelenium class which looks like the old Selenium 1.x object and internally adapts those 1.x commands to the 2.x WebDriver class’s commands. This means that you can pass this class to all your tests and all their calls will still work while you migrate your tests one-by-one. While this approach is great and what I want to do, it’s unfortunately only available in their Java client download and we are using the Ruby selenium-webdriver gem, which does not have this handy class.

The approach I’ve taken is to create a Ruby version of this class for the commands we use, except for specific cases where it makes sense to simply migrate the client code on the spot. Some of the translation is very straightforward, like migrating from the old API call of

to the WebDriver API call of

However, some of the work is not as easy, or there is no exact mapping between the 1.x and 2.x APIs. In these cases, it has been helpful to look at how the Java WebDriverBackedSelenium client translates these commands. All of the command mappings can be found in the WebDriverCommandProcessor class.
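As a rough illustration of the adapter idea, here is a minimal sketch rather than our actual class. The name SeleniumAdapter and the choice of commands are assumptions made for the example; the only calls it makes on the driver and its elements are get, find_element, click, and text, which the selenium-webdriver gem provides.

```ruby
# Hypothetical adapter: exposes a few Selenium 1.x-style commands on top of a
# WebDriver-like object. The class name and command selection are illustrative.
class SeleniumAdapter
  def initialize(driver)
    @driver = driver
  end

  # 1.x style: selenium.open(url)
  def open(url)
    @driver.get(url)
  end

  # 1.x style: selenium.click(locator)
  def click(locator)
    find(locator).click
  end

  # 1.x style: selenium.get_text(locator)
  def get_text(locator)
    find(locator).text
  end

  private

  # Locators are assumed to already be WebDriver-style hashes, e.g. {:id => "save"}.
  def find(locator)
    @driver.find_element(locator)
  end
end
```

Tests keep calling the old-style method names on the adapter while the underlying calls go through WebDriver, so they can be migrated one at a time.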

Element locators have changed

In Selenium 1.x, you could locate elements with a few different strategies: by id, name, identifier (id or name), link text, DOM, XPath, and CSS. You indicate which strategy you are using by prefixing your locator string with the strategy name (e.g. “id=element_id”). If you don’t specify the prefix, it guesses which one to use based on a few simple rules.

In the WebDriver API, the strategy and the locator are passed as separate arguments named ‘how’ and ‘what’ (e.g. {:id => “element_id”}). A simple translation layer that accepts both syntaxes has let me migrate the client code one call at a time. One case that has proven difficult is when the prefix is not specified and the string could be either an id or a name (the “identifier” strategy). Unfortunately, we’ve done this with both ids and names, so the only way to know which one is correct is to run the test and see whether it passes.
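A translation layer along these lines might look like the sketch below. The helper name to_webdriver_locator and the constant PREFIX_TO_HOW are made up for illustration. It handles the prefixes that map directly onto selenium-webdriver finder keys, applies the 1.x rule that an unprefixed locator starting with “//” is XPath, and refuses to guess in the ambiguous “identifier” case.

```ruby
# Maps Selenium 1.x locator prefixes to selenium-webdriver finder keys.
# "identifier" and "dom" are left out on purpose: neither has a direct
# WebDriver equivalent.
PREFIX_TO_HOW = {
  "id"    => :id,
  "name"  => :name,
  "link"  => :link_text,
  "xpath" => :xpath,
  "css"   => :css
}.freeze

# Hypothetical helper: accepts either syntax and returns a WebDriver-style hash.
def to_webdriver_locator(locator)
  return locator if locator.is_a?(Hash) # already migrated

  prefix, what = locator.split("=", 2)
  how = PREFIX_TO_HOW[prefix]
  return { how => what } if how && what

  # 1.x default rule: an unprefixed locator starting with "//" is XPath.
  return { :xpath => locator } if locator.start_with?("//")

  # Anything else is "identifier" (id or name) or DOM: flag it for manual review.
  raise ArgumentError, "ambiguous 1.x locator, migrate by hand: #{locator.inspect}"
end
```

Raising on the ambiguous case is a design choice for this sketch: it surfaces every locator that needs a human decision instead of silently picking id or name.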

CSS3 selectors no longer use the Sizzle library

In Selenium 1.x, CSS selectors were handled by the Sizzle library, which implements a superset of the CSS3 spec. The WebDriver API, however, delegates CSS selection to the browser’s native engine. Of course, we’re using some Sizzle features that are not part of the CSS3 spec. The primary offender has been searching for an element that contains a specific piece of text. Here’s an example of such a locator:

"css=div.simple-picker-menu div.simple-picker-menu-item:contains('#{value}')"

There’s not an equivalent CSS3 selector to find this element. The approach I’ve taken is to format the locator with an additional element to specify the text, which looks like this:

{:css => "div.simple-picker-menu div.simple-picker-menu-item", :text => value}

Under the covers, this uses the :css entry to find all matching elements, ignoring the text. It then selects the first of those elements that contains the given text.
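The two-step lookup could be sketched roughly like this. The helper name find_with_text is hypothetical; find_elements and text are the standard selenium-webdriver calls.

```ruby
# Hypothetical helper: finds the first element matching the CSS selector whose
# text contains locator[:text]. Returns nil when nothing matches.
def find_with_text(driver, locator)
  candidates = driver.find_elements(:css, locator[:css])
  candidates.find { |element| element.text.include?(locator[:text]) }
end
```

Note that the text check runs in Ruby, one element at a time, so a selector that matches many elements will cost one round trip to the browser per candidate.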

But what about crazier examples, like this Selenium 1.x locator?

"css=.rally-dropdown-component:contains('Type') input.x4-form-field[value='#{}']"

In this example, the check for text is in the middle of the selector, not at the end. These situations generally have two possible solutions:

  1. Rewrite the locator using XPath instead of CSS, which allows for selecting elements based on their textual contents.
  2. Break the locator in two. The first finds the element (or elements) with the matching text and the second finds the matching downstream element.

These cases are typically harder to migrate because the locator strings are often defined in one place and used in another. The second option in particular is hard to apply without some refactoring.
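For the first option, an XPath version of the locator above might be built like this. This is a sketch under assumptions: the helper names are invented, and the arguments are assumed not to contain single quotes. XPath 1.0 has no class selector, so the class test is spelled out with contains and normalize-space.

```ruby
# XPath 1.0 test for "has this CSS class", since XPath has no class selector.
def has_class(name)
  "contains(concat(' ', normalize-space(@class), ' '), ' #{name} ')"
end

# Hypothetical helper: an XPath counterpart of
#   .rally-dropdown-component:contains('<label>') input.x4-form-field[value='<value>']
# Assumes neither argument contains a single quote.
def dropdown_input_xpath(label, value)
  "//*[#{has_class('rally-dropdown-component')}][contains(., '#{label}')]" \
    "//input[#{has_class('x4-form-field')}][@value='#{value}']"
end
```

The result can then be passed as {:xpath => dropdown_input_xpath("Type", value)}, which keeps the locator a single value that can still be defined in one place and used in another.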

Filling out form fields

In Selenium 1.x, you can put text into an input field like so:

selenium.type(locator, value)

The WebDriver approach is to use a command like this:

element.send_keys(value)

However, there is a slight behavioral difference. The 1.x command will replace the existing text with the new value whereas the WebDriver command will simply have the browser type the given keys, adding to any existing text. In these situations, I’ve needed to call element.clear before element.send_keys.
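A small wrapper keeps the old replace-the-text behavior in one place. The name type_into is hypothetical; find_element, clear, and send_keys are standard selenium-webdriver calls.

```ruby
# Hypothetical helper that mimics the 1.x "type" behavior: clear first, then type.
def type_into(driver, locator, value)
  element = driver.find_element(locator)
  element.clear            # drop any existing text
  element.send_keys(value) # then have the browser type the new value
  element
end
```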

Interacting with elements that are not visible on the page

Related to the previous topic, Selenium 1.x lets you perform actions on elements that are not visible on the page. This means that I could call

selenium.type(locator, value)

which would populate, for instance, a hidden input field. Changing this code to use the new send_keys method won’t work because WebDriver will complain that it can’t interact with elements that are not visible. The solution I’ve taken for hidden input fields is to set them by executing JavaScript to set the element’s value attribute.
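The JavaScript workaround might look roughly like the following. The helper name set_hidden_value is made up; execute_script is the selenium-webdriver call, and elements passed as extra arguments show up in the script as arguments[0], arguments[1], and so on.

```ruby
# Hypothetical helper: sets the value of an input that WebDriver refuses to
# type into because it is not visible.
def set_hidden_value(driver, locator, value)
  element = driver.find_element(locator)
  driver.execute_script("arguments[0].value = arguments[1];", element, value)
end
```

Because this bypasses real keyboard input, no key or change events fire, so it is best kept to fields a user could never type into anyway.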

Some other instances of this issue that I’ve encountered are:

  • Shared behaviors that click on a button that may or may not be visible, where it doesn’t matter if the button is hidden. In 1.x, the click would still happen. I now need conditional logic to click the button only if it is visible.
  • We have some dropdown menus with links that only show up when the user hovers over a button. Our tests were written to simply click on the menu item link, which worked fine with Selenium 1.x. Now, we have to hover over the button first.
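Both cases can be handled with small helpers, sketched here with hypothetical names. displayed?, click, and the driver.action builder with move_to and perform are all part of the selenium-webdriver gem.

```ruby
# Hypothetical helper: clicks only when the element is visible.
# Returns true if the click happened, false otherwise.
def click_if_visible(element)
  return false unless element.displayed?

  element.click
  true
end

# Hypothetical helper: hovers over the button so the menu appears, then
# clicks the menu item.
def hover_and_click(driver, button, menu_item_locator)
  driver.action.move_to(button).perform
  driver.find_element(menu_item_locator).click
end
```

The menu item is looked up only after the hover, since it may not be visible (or even present) until the menu opens.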

Wrapping up

I haven’t finished migrating all the tests yet (we have over 2,500 of them). One of the last big hurdles is migrating the tests that drag and drop elements on the page. Like some of the issues above, migrating to the WebDriver drag and drop API initially seemed trivial, but my first attempts didn’t work properly.

Have you had to migrate from Selenium 1.x to 2.0? Is there any particular aspect that you found challenging?

I also wanted to investigate how we could run our test suite against multiple browsers. My goal is to run all of our tests against all of our supported browsers on every commit (or at least once per day). But how can we do that when the testing framework typically specifies which browser to test against? Have you solved this problem?
