
Scala has an interesting feature, by-name parameters, that can take a little time to get your head around. Take the following code:

def foo() = {
   println("in foo")
   "FOO"
}

def bar(baz: => String) = {
   println("in baz")
   println(baz)
   baz
}

bar(foo())

What do you think this Scala snippet prints? Coming from Java, I would expect the following:

in foo
in baz
FOO

Much to my surprise, the code actually outputs the following:

in baz
in foo
FOO
in foo

In Scala, when you declare a method parameter like baz: => String, the argument is not evaluated until it is used in the method body. But be careful: it is evaluated again each time the parameter is used within the method body. That is why "in foo" appears twice above, once for println(baz) and once for the final baz.

Where would this behavior be valuable? Anywhere an expensive computation might not be needed in the method body. One common place is a logging method.

def log(message: => String): Unit = {
   if (notLogging) return
   writeToLogFile(message)
}

log("hello there " + calculateExpensiveLogStatement())

In the above log statement, "hello there " + calculateExpensiveLogStatement() will not be evaluated until the message parameter is used in the log method, and not at all when notLogging is true. How does this work? I am not really sure :), but I think of it as Scala generating an anonymous function for each call to log. This allows message to go unevaluated until it is used, just as if we had passed in a function pointer.
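To make the anonymous-function idea concrete, here is a rough Java analogue. Java has no by-name parameters, so a Supplier<String> stands in for message: => String. LazyLogger, its in-memory sink, and the evaluations counter are names made up for this sketch, not part of any real logging API.

```java
import java.util.function.Supplier;

// Hypothetical logger: the message is only built when logging is enabled.
final class LazyLogger {
    private final boolean enabled;
    private final StringBuilder sink = new StringBuilder(); // stands in for the log file

    LazyLogger(boolean enabled) {
        this.enabled = enabled;
    }

    // Supplier<String> plays the role of Scala's `message: => String`.
    void log(Supplier<String> message) {
        if (!enabled) {
            return; // message.get() is never called, so nothing is computed
        }
        sink.append(message.get()).append('\n');
    }

    String written() {
        return sink.toString();
    }
}

final class LazyLogDemo {
    static int evaluations = 0;

    // Stand-in for calculateExpensiveLogStatement() from the post.
    static String calculateExpensiveLogStatement() {
        evaluations++;
        return "expensive";
    }

    public static void main(String[] args) {
        LazyLogger off = new LazyLogger(false);
        off.log(() -> "hello there " + calculateExpensiveLogStatement());
        System.out.println("evaluations while disabled: " + evaluations); // 0

        LazyLogger on = new LazyLogger(true);
        on.log(() -> "hello there " + calculateExpensiveLogStatement());
        System.out.println("evaluations once enabled: " + evaluations); // 1
    }
}
```

The difference from Scala is that the caller has to write the lambda by hand. With a by-name parameter the call site looks like an ordinary argument.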

Thoughts?

Regular expressions are an important tool in any programming language. In Scala, regular expressions are really fun and powerful when combined with pattern matching. Let’s take a look at a possible usage for regular expressions: HTTP basic authentication. In HTTP basic auth, a client is authenticated by sending a header (Authorization) to the server that contains their credentials.

Authorization: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==

There are a couple of good candidates for using regular expressions when parsing a basic auth header.

1. Stripping the "Basic" from the front of the value
2. Splitting the given username and password from the decoded Base64 string

Here is an example of a Scala class using regular expressions to parse a basic auth header.

class BasicAuthenticationParser {
  private val Header = "Basic\\s+(.+)".r
  private val UsernamePassword = "([^:]+):(.+)".r

  def parse(header: String):AuthenticationToken = header match {
    case Header(encoded) => decodeHeader(new String(Base64.decodeBase64(encoded)))
    case _ => UnknownAuthenticationToken
  }

  private def decodeHeader(header: String):AuthenticationToken = header match {
    case UsernamePassword(username, password) => new BasicAuthenticationToken(username, password)
    case _ => UnknownAuthenticationToken
  }
}

As we can see, creating regular expressions is as simple as using the .r method on a String. Let’s take one of the methods in the above class and break it down.

  private val UsernamePassword = "([^:]+):(.+)".r
  private def decodeHeader(header: String):AuthenticationToken = header match {
    case UsernamePassword(username, password) => new BasicAuthenticationToken(username, password)
    case _ => UnknownAuthenticationToken
  }

This method is used to break apart the username:password that comes in the authentication header. UsernamePassword is the regular expression that represents this concept. It contains two match groups: one for the username and one for the password. This is where the fun begins.

header match {
    case UsernamePassword(username, password) => new BasicAuthenticationToken(username, password)
    case _ => UnknownAuthenticationToken
}

This snippet says we are going to match the header parameter against the UsernamePassword regular expression. The whole string has to match, not just part of it. If it matches, the value of the first match group is bound to the variable username and the value of the second match group to the variable password. These variables can then be used in the expression on the right-hand side of the case. If we don’t match, we simply return UnknownAuthenticationToken.

What? It cannot be that easy. That was about 5 or 6 lines shorter than the Java version. Pattern matching like this is made possible by a concept in Scala called extractor objects. I will talk more about pattern matching and extractor objects in a later post.
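For comparison, here is a sketch of what a Java version might look like. It is not the original Java code. It assumes java.util.Base64 in place of the Base64.decodeBase64 call above, and it returns an Optional<String[]> instead of the AuthenticationToken types, which are not shown in this post. BasicAuthJavaSketch is a made-up name.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class BasicAuthJavaSketch {
    private static final Pattern HEADER = Pattern.compile("Basic\\s+(.+)");
    private static final Pattern USERNAME_PASSWORD = Pattern.compile("([^:]+):(.+)");

    // Returns {username, password}, or empty when the header is not valid basic auth.
    static Optional<String[]> parse(String header) {
        Matcher headerMatch = HEADER.matcher(header);
        if (!headerMatch.matches()) {
            return Optional.empty();
        }
        String decoded;
        try {
            byte[] bytes = Base64.getDecoder().decode(headerMatch.group(1));
            decoded = new String(bytes, StandardCharsets.UTF_8);
        } catch (IllegalArgumentException notBase64) {
            // java.util.Base64 rejects malformed input with this exception
            return Optional.empty();
        }
        Matcher credentials = USERNAME_PASSWORD.matcher(decoded);
        if (!credentials.matches()) {
            return Optional.empty();
        }
        return Optional.of(new String[] { credentials.group(1), credentials.group(2) });
    }

    public static void main(String[] args) {
        Optional<String[]> token = parse("Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==");
        token.ifPresent(t -> System.out.println(t[0] + " / " + t[1])); // Aladdin / open sesame
    }
}
```

Every match needs an explicit Matcher, a matches() check, and group(n) calls. The Scala case line does the same three things in one pattern.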

A few of us at Rally have started learning Scala. As you can imagine, when learning a new language there are bumps along the road to enlightenment. My first bump while learning Scala was around how Scala determines return type. Can you spot the error?

object Example {
  def main(args: Array[String]) {
    val (a, b) = bad
  }

  def bad {
    ("foo", 2)
  }
}

This code will not compile. The compiler gives an error like:

error: constructor cannot be instantiated to expected type;
found : (T1, T2)
required: Unit
val (a, b) = bad

Hmm… Oh Sh&@T. I forgot the =. If a method doesn’t explicitly declare a return type, an equals sign must be placed after the parameter list for the Scala compiler to infer the return type from the body. If you leave out the equals sign, the compiler treats the method as returning Unit and discards the value of the body. Below you can see the fixed code.

object Example {
  def main(args: Array[String]) {
    val (a, b) = good
  }

  def good = {
    ("foo", 2)
  }
}

Rob Park and I will be hosting a CodeRetreat at Rally on March 27th.

What is a CodeRetreat (taken from the CodeRetreat Ning site)?

At CodeRetreat, we retreat from the world to advance in our craft. We sharpen our saws, together. We retreat from production and business value to increase our production capacity, our quality, our velocity, our ability to produce business value. We retreat from immersion in deep technology weeds (Oh no! The JBoss ClassLoader is giving me CCE’s!) to advance in our ability to learn and adapt to any technology well. We retreat from our fears, and embrace new practices, patterns, languages.

We retreat from our local ponds and swim in a larger pool. We connect with other passionate coders who we seldom get to code with. We make new connections and learn new lessons.

Code Retreat Format

This will be a fun day for all. Breakfast and Lunch will be provided. If you will be attending, please register at http://coderetreat.ning.com/ or send me an email at aesterline at rallydev dot com.

In our quest for ever-faster build times, we noticed that a large portion of our time running tests was getting the next OID for persisted objects.

After some investigation, we discovered the bottleneck. For every persisted object created in the system, a new connection was obtained from our datasource and a query was issued to retrieve a single OID. I know, I know… not very efficient. We wanted to refactor the code to grab batches of OIDs and cache them in memory.

How do we get groups of OIDs from an Oracle sequence? We had a couple of choices.

Modify the sequence’s INCREMENT BY to something larger than one.
We liked this approach because it allowed us to get a batch of OIDs with one query. However, it causes problems for external code that inserts data into our database. The external code would also have to know about the change in INCREMENT BY.

Changing all of the external integration code seemed like a big task. We wanted to find an approach that allowed us to keep the sequence unchanged, but allowed us to get a batch of OIDs with one query.

Query the sequence for a batch of OIDs
How do we do that? We need a query in Oracle that will generate N rows. We tried several queries before we finally settled on one clever query. (Or at least it was clever to us.)

SELECT SEQ.nextval FROM (SELECT LEVEL FROM DUAL CONNECT BY LEVEL <= :1)

The clever part is the CONNECT BY LEVEL. This allows us to generate an arbitrary number of rows in our result set, and the sequence does the rest by producing one value per row. We capture the results from the query and hand them out from memory. This query gave us the best of both worlds: a batch of OIDs with one query, without having to change any of our external systems.
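The in-memory side of this can be sketched roughly as follows. This is not our production code. BatchedOidAllocator and the fetchBatch function are hypothetical names; in a real system fetchBatch would run the CONNECT BY query above with the batch size as the bind parameter, while the demo below fakes the sequence with a counter.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.IntFunction;

// Hands out OIDs from an in-memory batch and refills when the batch runs dry.
final class BatchedOidAllocator {
    private final int batchSize;
    private final IntFunction<List<Long>> fetchBatch; // hypothetical: would run the batch query
    private final Deque<Long> cached = new ArrayDeque<>();

    BatchedOidAllocator(int batchSize, IntFunction<List<Long>> fetchBatch) {
        this.batchSize = batchSize;
        this.fetchBatch = fetchBatch;
    }

    synchronized long nextOid() {
        if (cached.isEmpty()) {
            cached.addAll(fetchBatch.apply(batchSize)); // one round trip per batch
        }
        return cached.removeFirst();
    }
}

final class BatchedOidDemo {
    public static void main(String[] args) {
        long[] sequence = {0}; // fake sequence standing in for Oracle
        int[] queries = {0};   // counts how often a batch is fetched
        BatchedOidAllocator allocator = new BatchedOidAllocator(3, n -> {
            queries[0]++;
            List<Long> batch = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                batch.add(++sequence[0]);
            }
            return batch;
        });
        for (int i = 0; i < 5; i++) {
            System.out.println(allocator.nextOid()); // 1 through 5
        }
        System.out.println("queries issued: " + queries[0]); // 2
    }
}
```

OIDs still sitting in the cache when the process exits are never used, so this approach assumes gaps in the OID range are acceptable.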

Where do we go from here?
In a word, GUIDs. This was a necessary performance-related step on the way to implementing GUIDs.

It’s Tuesday morning and a bug has just been found in production. We were about to start working on a user story, but we need to switch gears and fix this bug.

Our repository currently looks like the following.

      BRANCH
        |
(A) -- (C)

(B) -- (D)
        |
     *master*

We need a topic branch based on BRANCH so that we can fix this bug.

git checkout -b DE1234 BRANCH

     *DE1234*
        |
      BRANCH
        |
(A) -- (C)

(B) -- (D)
        |
      master

We find and fix the defect. After committing the fix, our repository is in the following state.

      BRANCH  *DE1234*
        |        |
(A) -- (C) ---- (E)

(B) -- (D)
        |
      master

We need to integrate our fix with origin/BRANCH. The workflow for this integration is very similar to the user story workflow.

If you want to read a full explanation, read the first article in this series.

git fetch
git rebase origin/BRANCH

      BRANCH  origin/BRANCH   *DE1234*
        |           |             |
(A) -- (C) ------- (F) --------- (E')

(B) -- (D)
        |
      master

git push origin HEAD:BRANCH

                origin/BRANCH
                      |
      BRANCH       *DE1234*
        |             |
(A) -- (C) -- (F) -- (E')

(B) -- (D)
        |
      master

After the above series of commands, we have integrated and pushed our changes to the origin BRANCH. We now need to merge our changes to the master branch.

We need to ensure that our local BRANCH has all of our changes.

git checkout BRANCH
git pull

                origin/BRANCH
                      |
                   *BRANCH*
                      |
(A) -- (C) -- (F) -- (E')

(B) -- (D)
        |
      master

We now need to pull changes for master.

git checkout master
git pull

                origin/BRANCH
                      |
                    BRANCH
                      |
(A) -- (C) -- (F) -- (E')

(B) -- (D)
        |
     *master*
        |
  origin/master

We are ready to merge.

git merge BRANCH

                origin/BRANCH
                      |
                    BRANCH
                      |
(A) -- (C) -- (F) -- (E') --+
                             \
(B) -- (D) ----------------- (G)
        |                     |
  origin/master            *master*

Our changes have now been merged into our local master. We still need to push the merge to origin’s master.

git push origin master

                origin/BRANCH
                      |
                    BRANCH
                      |
(A) -- (C) -- (F) -- (E') --+
                             \
(B) -- (D) ----------------- (G)
                              |
                           *master*
                              |
                        origin/master

We have successfully merged our changes from the branch and pushed them to the origin. We are now ready to start on our user story.

We recently started using Git as our version control system at Rally. Switching to Git has been a learning experience for many of us. I would like to share our experiences using Git in a team environment.

Working on a user story.

It’s Monday morning of a new iteration, and we need to start the first user story. The first thing we want to do is create a topic branch. In Git, we always want to work on a topic branch. We name our topic branch after our current story.

git checkout -b enhanced_search

After executing the above command, our repository will look like the following.

(A) -- (B) -- (C)
               |
             master
               |
        *enhanced_search*

Note: The current branch is surrounded by asterisks.

During the course of completing the story, we make two commits.

(A) -- (B) -- (C) -- (D) -- (E)
               |             |
             master   *enhanced_search*

We now want to integrate our commits into master. What do we do? Git has two options for integrating our commits: merge or rebase.

Merge

The first thing we need to do is a fetch. A fetch ensures we have a copy of the currently pushed commits.

git fetch

Someone has pushed a commit since we started working on the story.

                     origin/master
                           |
                +---------(F)
               /
(A) -- (B) -- (C) -- (D) -- (E)
               |             |
             master   *enhanced_search*

We now need to go back to master and make sure it is updated with the latest commits.

git checkout master
git pull

                     origin/master
                           |
                        *master*
                           |
                +---------(F)
               /
(A) -- (B) -- (C) -- (D) -- (E)
                             |
                       enhanced_search

git merge enhanced_search

                     origin/master
                           |
                           |   *master*
                           |      |
                +---------(F) -- (G)
               /                /
(A) -- (B) -- (C) -- (D) -- (E)
                             |
                       enhanced_search

Our commits have now been merged into our local master branch. The last thing we need to do is push our commits to the origin.

git push origin master

                            origin/master
                                  |
                               *master*
                                  |
                +---------(F) -- (G)
               /                /
(A) -- (B) -- (C) -- (D) -- (E)
                             |
                       enhanced_search

Rebase

Git provides an alternative to merging called rebase. Rebasing in Git allows you to rewrite your commit history. This is a very powerful and potentially dangerous operation. When using rebase, always remember the golden rule.

Never rebase a branch that has been pushed.

Let’s start back at the beginning. We had made a couple of commits on the enhanced_search branch. While we were working on enhanced_search, someone on the team pushed a new commit to the origin.

                     origin/master
                           |
                +---------(F)
               /
(A) -- (B) -- (C) -- (D) -- (E)
               |             |
             master   *enhanced_search*

We are now ready to rebase our enhanced_search branch with respect to origin/master. Keep in mind, our above picture shows our local repository after performing a fetch. Always perform a fetch before starting any merge or rebase operation.

git rebase origin/master

                origin/master
                      |
(A) -- (B) -- (C) -- (F) -- (D') -- (E')
               |                     |
             master           *enhanced_search*

Our commits from the enhanced_search have been replayed on the current origin/master. We are now ready to push our changes to the origin.

git push origin HEAD:master

                               origin/master
                                     |
(A) -- (B) -- (C) -- (F) -- (D') -- (E')
               |                     |
             master           *enhanced_search*

Should we Merge or Rebase?

I like the rebase approach. Why? I think the commit history is easier to understand. The reader of the history doesn’t know or care that we used a local topic branch. In the end, both approaches allow us to integrate our commits.

Stay tuned for more of our experiences with Git.

The Active Object pattern has been one of my favorite patterns through the years. The Active Object pattern allows method execution to be decoupled from method invocation. In other words, the use of background processing (threading, MOM, job scheduling) is decoupled from the implementation of the algorithm.

public void sendChangeNotification(Change change)
{
     Email email = generateEmail(change);
     mailQueue.queueMessage(email);
}

public void onMessage(EmailMessage message)
{
     emailSender.send(message);
}

Notice that the sendChangeNotification method knows we are using a queue to process the sending of email. What happens if we don’t want to use a queue? What if we want to use a thread? If we change how we process emails, the sendChangeNotification method will have to be modified. This is a violation of the Open/Closed Principle (OCP).

We can use a variation of the Active Object pattern that will allow us to decouple the “when” from the “how”.

public interface EmailSender
{
    void send(Email email);
}

public void sendChangeNotification(Change change)
{
     Email email = generateEmail(change);
     emailSender.send(email);
}

I know this seems like a small change, but it has a big impact. The sendChangeNotification method now uses the emailSender to send the generated email. We don’t know, nor do we care, if the email gets sent now or sometime later. We leave the “when” detail to the implementer.

public class JmsEmailSender implements EmailSender
{
     public void send(Email email)
     {
         emailQueue.queueMessage(email);
     }

     public void onMessage(EmailMessage message)
     {
         emailSender.send(message.getEmail());
     }
}

The JmsEmailSender contains all of the JMS plumbing that is required to place and receive email messages on the queue. JmsEmailSender’s only responsibility is to interact with JMS.

Given this new design, we are now free to change “when” the email gets sent without changing the sendChangeNotification method. This little change can make a big difference in testing and debugging, because a test can plug in a sender that delivers immediately. Our sendChangeNotification method now complies with OCP.
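As an illustration of swapping the “when”, here is a sketch of a thread-based sender. Only the EmailSender interface comes from the code above. ThreadedEmailSender, ChangeNotifier, the one-field Email class, and the use of a String in place of Change are stand-ins invented for this example.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Minimal stand-in for the post's Email type.
final class Email {
    final String body;

    Email(String body) {
        this.body = body;
    }
}

interface EmailSender {
    void send(Email email);
}

// Hands each email to a background thread instead of a JMS queue.
final class ThreadedEmailSender implements EmailSender {
    private final ExecutorService executor;
    private final EmailSender delegate; // does the actual sending

    ThreadedEmailSender(ExecutorService executor, EmailSender delegate) {
        this.executor = executor;
        this.delegate = delegate;
    }

    @Override
    public void send(Email email) {
        executor.execute(() -> delegate.send(email)); // returns without waiting for delivery
    }
}

final class ChangeNotifier {
    private final EmailSender emailSender;

    ChangeNotifier(EmailSender emailSender) {
        this.emailSender = emailSender;
    }

    // Same shape as sendChangeNotification above; it knows nothing about threads or queues.
    void sendChangeNotification(String change) {
        emailSender.send(new Email("Changed: " + change));
    }
}

final class ActiveObjectDemo {
    public static void main(String[] args) throws InterruptedException {
        List<String> delivered = new CopyOnWriteArrayList<>();
        ExecutorService executor = Executors.newSingleThreadExecutor();
        EmailSender sender = new ThreadedEmailSender(executor, email -> delivered.add(email.body));

        new ChangeNotifier(sender).sendChangeNotification("story renamed");

        executor.shutdown();
        executor.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println(delivered); // [Changed: story renamed]
    }
}
```

In a unit test, the same ChangeNotifier can be given a plain lambda as its EmailSender, so the test never has to deal with threads at all.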

Thoughts?

Test Fumes

A test fume is a test manifestation of a code smell. Over the next several blog posts, I would like to explore different test fumes that I have identified and possible fixes for the smells they identify. Without further ado…

Production Code Doppelgänger

Tests are the evil twins of our production code. Evil? Yes, evil. Tests, if listened to correctly, tell all the dirty little secrets that lurk in our production code. What kind of secrets? Code smells. Code smells are essentially design flaws, violations of SOLID, that have started rotting our code. What do code smells have to do with tests? Good question. I have started to notice that tests are a good way to spot flaws in your production code’s design.

Random Parameter

A random parameter is a parameter that seems completely out of place in the current context. I am not good at coming up with definitions, so maybe an example would be better.

void requestArgumentsShouldBeAppliedToUserPreferences()
{
   Workspace workspace = mock(Workspace.class);
   Preferences prefs = mock(Preferences.class);
   Map<String, String> args = map("sort", "name");
   when(workspace.getOID()).thenReturn(123);

   new PreferenceProcessor(workspace, prefs).process(args);

   verify(prefs).setPreferenceValue("workspace.123.sort", "name");
}

The workspace parameter really jumps out at me. What does a PreferenceProcessor need with a workspace? Hmm… It looks like we use the workspace id to namespace user preferences. Why does the PreferenceProcessor need to know about our namespace scheme? This seems like a violation of the Single Responsibility Principle (SRP). If we decide to change how we namespace user preferences, the poor PreferenceProcessor will have to change.

The workspace parameter, which seemed random/out of place, pointed us to a code smell. We fixed it.

void requestArgumentsShouldBeAppliedToUserPreferences()
{
   Preferences prefs = mock(Preferences.class);
   Map<String, String> args = map("sort", "name");

   new PreferenceProcessor(prefs).process(args);

   verify(prefs).setPreferenceValue("sort", "name");
}

That seems better, but where did the namespace logic go? We created a NamespacePreferenceDecorator that was responsible for managing user preference namespaces. Now, if the namespace scheme changes, the PreferenceProcessor won’t care.
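A decorator along those lines might look like the sketch below. This is a guess at the shape, not our actual class. It assumes Preferences is an interface with the setPreferenceValue method used in the test, that the namespace is passed in as a plain string, and that PreferenceProcessor simply copies each argument into the preferences.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Assumed shape of the Preferences collaborator from the test above.
interface Preferences {
    void setPreferenceValue(String key, String value);
}

// Prefixes every key with a namespace, then delegates to the wrapped Preferences.
final class NamespacePreferenceDecorator implements Preferences {
    private final String namespace;
    private final Preferences delegate;

    NamespacePreferenceDecorator(String namespace, Preferences delegate) {
        this.namespace = namespace;
        this.delegate = delegate;
    }

    @Override
    public void setPreferenceValue(String key, String value) {
        delegate.setPreferenceValue(namespace + "." + key, value);
    }
}

// Knows nothing about namespaces; it only applies request arguments.
final class PreferenceProcessor {
    private final Preferences prefs;

    PreferenceProcessor(Preferences prefs) {
        this.prefs = prefs;
    }

    void process(Map<String, String> args) {
        args.forEach(prefs::setPreferenceValue);
    }
}

final class NamespaceDemo {
    public static void main(String[] args) {
        Map<String, String> stored = new HashMap<>(); // stands in for real preference storage
        Preferences namespaced = new NamespacePreferenceDecorator("workspace.123", stored::put);

        new PreferenceProcessor(namespaced).process(Collections.singletonMap("sort", "name"));

        System.out.println(stored); // {workspace.123.sort=name}
    }
}
```

The wiring code decides whether the processor gets a namespaced Preferences or a plain one, which is why the processor test no longer needs a workspace.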

Test Fumes?

OK. OK. How did I come up with that name? I wanted a catchy name to describe what I thought of as a test manifestation of a code smell. I had described this concept to Rod Hilton. We spent several minutes batting around ideas until we settled on test fumes.

Recently, at Rally, we made an effort to speed up our tests. This is an ongoing battle that I suspect all organizations face. We wanted to minimize the amount of data created by our integration tests. To minimize the data, we decided to share expensive data across tests.

Sharing data caused several of our integration tests to fail. Lucky for us, all the problems shared the same symptom: bad assertions. Take the following example.

 createUnScheduledUserStory();
 createScheduledUserStory();
 createUnScheduledUserStory();
 commit();

 assertCriterionReturns(new UnscheduledCriterion(), 2);

Is this assertion good? Does it ask the right question? This assertion failed when we started sharing data, because other tests’ unscheduled stories were now counted too, but I would argue that it was not good in the first place. Do we really want to know that two items were returned from the query? No. We really want to know whether the two we created were returned from the query.

public void unscheduledStoriesShouldBeFound()
{
  UserStory[] unscheduled = {
      createUnScheduledUserStory(),
      createUnScheduledUserStory()
  };
  commit();

  assertCriterionFinds(new UnscheduledCriterion(), unscheduled);
}

public void scheduledStoriesShouldNotBeFound()
{
  UserStory scheduled = createScheduledUserStory();
  commit();

  assertCriterionDoesNotFind(new UnscheduledCriterion(), scheduled);
}

This is how the tests now look. They work when sharing data across tests, and they ask a better question. Now we have no doubt that our criterion returns the expected data.

Lesson learned: create specific assertions.
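The helpers themselves are not shown here, but the idea behind them can be sketched with plain lists. SpecificAssertions, assertFinds, and assertDoesNotFind are hypothetical names, and the list of strings stands in for whatever the criterion query returns.

```java
import java.util.Arrays;
import java.util.List;

final class SpecificAssertions {
    // Passes when every expected item is in the results, whatever else shared data adds.
    static <T> void assertFinds(List<T> results, List<T> expected) {
        if (!results.containsAll(expected)) {
            throw new AssertionError("expected " + expected + " to be found in " + results);
        }
    }

    // Passes when the given item is absent from the results.
    static <T> void assertDoesNotFind(List<T> results, T unexpected) {
        if (results.contains(unexpected)) {
            throw new AssertionError("did not expect " + unexpected + " in " + results);
        }
    }

    public static void main(String[] args) {
        // "shared-1" and "shared-2" belong to other tests that share the same data.
        List<String> unscheduled = Arrays.asList("shared-1", "shared-2", "mine-1", "mine-2");

        assertFinds(unscheduled, Arrays.asList("mine-1", "mine-2")); // passes
        assertDoesNotFind(unscheduled, "scheduled-1");               // passes

        // A count-based assertion expecting 2 would fail here: the query returns 4 rows.
        System.out.println("rows returned: " + unscheduled.size());
    }
}
```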