Quick Build


In our quest for ever-faster build times, we noticed that a large portion of our test run time was spent getting the next OID for persisted objects.

After some investigation, we discovered the bottleneck. For every persisted object created in the system, a new connection was obtained from our datasource and a query was issued to retrieve a single OID. I know, I know… not very efficient. We wanted to refactor the code to grab batches of OIDs and cache them in memory.

How do we get groups of OIDs from an Oracle sequence? We had a couple of choices.

Modify the sequence’s INCREMENT BY to something larger than one.
We liked this approach because it allowed us to get a batch of OIDs with one query. But this approach causes problems for external code that inserts data into our database: the external code would also have to know about the change in INCREMENT BY.

Changing all of the external integration code seemed like a big task. We wanted to find an approach that kept the sequence unchanged but still let us get a batch of OIDs with one query.
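
To make that coupling concrete, here is a minimal Java sketch of how a client might carve a block of OIDs out of a single NEXTVAL call when the sequence uses an INCREMENT BY greater than one. The class name OidBlock, and the convention that the value returned by NEXTVAL is the first OID of the block, are assumptions made for illustration and are not taken from our code base.

```java
/** Hypothetical helper: a contiguous block of OIDs reserved by one NEXTVAL call. */
final class OidBlock {
    private final long first; // value returned by NEXTVAL (assumed to be the block's first OID)
    private final int size;   // must match the sequence's INCREMENT BY
    private int used = 0;     // how many OIDs of this block have been handed out

    OidBlock(long nextval, int incrementBy) {
        if (incrementBy < 1) {
            throw new IllegalArgumentException("incrementBy must be >= 1");
        }
        this.first = nextval;
        this.size = incrementBy;
    }

    /** True while the block still has unused OIDs. */
    boolean hasNext() {
        return used < size;
    }

    /** Returns the next unused OID in the block. */
    long next() {
        if (!hasNext()) {
            throw new IllegalStateException("block exhausted; call NEXTVAL again");
        }
        return first + used++;
    }
}
```

Under this convention, every writer has to agree on both the block size and which end of the block NEXTVAL represents, which is the kind of shared knowledge we did not want to push into the external code.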

Query the sequence for a batch of OIDs
How do we do that? We need a query in Oracle that will generate N rows. We tried several queries before we finally settled on one clever query. (Or at least it was clever to us.)

SELECT SEQ.nextval FROM (SELECT LEVEL FROM DUAL CONNECT BY LEVEL <= :1)

The clever part is the CONNECT BY LEVEL. It lets us generate an arbitrary number of rows in the result set, and the sequence does the rest: NEXTVAL is evaluated once per row, so each row carries a fresh OID. We capture the results from the query and hand them out from memory. This query gave us the best of both worlds: a batch of OIDs from one query, without having to change any of our external systems.
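
The capture-and-hand-out step could look roughly like the Java sketch below. It is a sketch under stated assumptions, not our actual implementation: the names OidBatchSource, JdbcOidBatchSource, and OidCache are hypothetical, and the JDBC version uses the standard ? placeholder in place of :1.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.sql.DataSource;

/** Hypothetical abstraction over "give me N fresh OIDs". */
interface OidBatchSource {
    List<Long> fetchBatch(int size);
}

/** Runs the batch query once per call: one connection, one round trip, N OIDs. */
final class JdbcOidBatchSource implements OidBatchSource {
    // Same query as above, with the JDBC-style ? bind placeholder.
    private static final String SQL =
        "SELECT SEQ.nextval FROM (SELECT LEVEL FROM DUAL CONNECT BY LEVEL <= ?)";

    private final DataSource dataSource;

    JdbcOidBatchSource(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    @Override
    public List<Long> fetchBatch(int size) {
        List<Long> oids = new ArrayList<>(size);
        try (Connection conn = dataSource.getConnection();
             PreparedStatement stmt = conn.prepareStatement(SQL)) {
            stmt.setInt(1, size);
            try (ResultSet rs = stmt.executeQuery()) {
                while (rs.next()) {
                    oids.add(rs.getLong(1));
                }
            }
        } catch (SQLException e) {
            throw new IllegalStateException("could not fetch OID batch", e);
        }
        return oids;
    }
}

/** Hands out OIDs from memory and refills from the source only when empty. */
final class OidCache {
    private final OidBatchSource source;
    private final int batchSize;
    private final Deque<Long> available = new ArrayDeque<>();

    OidCache(OidBatchSource source, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.source = source;
        this.batchSize = batchSize;
    }

    /** Returns the next cached OID, fetching a new batch if the cache is empty. */
    synchronized long next() {
        if (available.isEmpty()) {
            List<Long> batch = source.fetchBatch(batchSize);
            if (batch.isEmpty()) {
                throw new IllegalStateException("OID source returned an empty batch");
            }
            available.addAll(batch);
        }
        return available.removeFirst();
    }
}
```

With a batch size of, say, 100, creating 100 persisted objects would cost one query instead of 100. The trade-off in a sketch like this is that any OIDs still cached when the process stops are never used, which leaves gaps in the sequence.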

Where do we go from here?
In a word, GUIDs. This was a necessary performance-related step on the way to implementing GUIDs.

We hear this all the time, “Slow down so you can speed up.” Recently the Engineering team here at Rally decided to take a slightly different approach.

Like most software projects that are several years old, our build system and directory structure were beginning to show their age, and some cracks were starting to appear in the foundation. When we switched from Ant to Maven about three years ago, we thought it would be a great idea to use Maven modules. Modules seemed like a reasonable idea, so we broke our code up into several. One for “server” code, those things that represent our domain model and all of their services. One for web application code, controllers, etc. And lastly, one module that was the guts of our web-app: CSS, JavaScript, JSP, etc. We also created a module for setting up the database and yet another one to configure the application server. Because of Maven’s convention over configuration, getting all of these modules to work together and play nicely required quite a bit of duct tape and baling wire.

After a few years the team started to realize that the way we had broken down the modules didn’t really make sense. Perhaps we would want to break them up by functional area, or in some yet-to-be-determined way. Regardless, we knew it would be more complex than what we had. Our solution was to simplify now so we can make it more complex in the future. We did this in two ways.

First, we collapsed all code into a single Maven module. We are now completely aligned with Maven’s conventions, so the build is no longer a mysterious black box, and all developers can enhance it without worrying about breaking it. Additionally, I always know where code is, and the tests run a little faster without jumping between modules. When we’re ready, we can start pulling things apart again.

Next we decided the application server configuration and deployment should be separate concerns. Our Operations team kicks ass, so why not let them be directly responsible for how to deploy things without having to edit the code base? Ops gets to do what they do best, and our code base loses a few thousand lines of build configuration. Seems like a good thing to all of us.

What did all of this get us? One thing that shouldn’t be underestimated is the increased level of developer happiness that comes with a clean and simple build and directory structure. Our tests run faster and there is much less directory changing.

Anyone else ever done something similar? Let us know.