Thu 20 Oct 2011
Regular Expression Pattern Matching
Regular expressions are an important tool in any programming language. In Scala, regular expressions are really fun and powerful when combined with pattern matching. Let’s take a look at a possible usage for regular expressions: HTTP basic authentication. In HTTP basic auth, a client authenticates by sending the server an Authorization header that contains its Base64-encoded credentials.
Authorization: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
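To make the header concrete, here is a minimal sketch of how such a value is built and decoded. It assumes Java 8's java.util.Base64 rather than the Base64 helper used later in this post, and the names HeaderDemo, encode and decode are made up for illustration.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.util.Base64

// Hypothetical helper, not part of the parser shown in this post.
object HeaderDemo {
  // A basic auth header value is "Basic " followed by Base64("username:password").
  def encode(username: String, password: String): String =
    "Basic " + Base64.getEncoder.encodeToString((username + ":" + password).getBytes(UTF_8))

  // Decodes only the Base64 part of the value (the text after "Basic ").
  def decode(encoded: String): String =
    new String(Base64.getDecoder.decode(encoded), UTF_8)
}
```

Decoding the sample value above with HeaderDemo.decode("QWxhZGRpbjpvcGVuIHNlc2FtZQ==") gives Aladdin:open sesame, which is the username and password joined by a colon.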
There are a couple of good candidates for using regular expressions when parsing a basic auth header.
1. Stripping "Basic" from the front of the value
2. Splitting the given username and password from the decoded Base64 string
Here is an example of a Scala class that uses regular expressions to parse a basic auth header.
import org.apache.commons.codec.binary.Base64 // assumed: decodeBase64 here is Apache Commons Codec's

class BasicAuthenticationParser {
  // Captures everything after the "Basic" scheme name
  private val Header = "Basic\\s+(.+)".r
  // Captures the username (up to the first colon) and the password (the rest)
  private val UsernamePassword = "([^:]+):(.+)".r

  def parse(header: String): AuthenticationToken = header match {
    case Header(encoded) => decodeHeader(new String(Base64.decodeBase64(encoded)))
    case _ => UnknownAuthenticationToken
  }

  private def decodeHeader(header: String): AuthenticationToken = header match {
    case UsernamePassword(username, password) => new BasicAuthenticationToken(username, password)
    case _ => UnknownAuthenticationToken
  }
}
As we can see, creating regular expressions is as simple as using the .r method on a String. Let’s take one of the methods in the above class and break it down.
private val UsernamePassword = "([^:]+):(.+)".r

private def decodeHeader(header: String): AuthenticationToken = header match {
  case UsernamePassword(username, password) => new BasicAuthenticationToken(username, password)
  case _ => UnknownAuthenticationToken
}
This method breaks apart the username:password pair that comes in the authentication header. UsernamePassword is the regular expression that represents this concept. It contains two match groups: one for the username and one for the password. This is where the fun begins.
header match {
  case UsernamePassword(username, password) => new BasicAuthenticationToken(username, password)
  case _ => UnknownAuthenticationToken
}
This snippet says we are going to match the header parameter against the UsernamePassword regular expression. If it matches, the value of the first match group is assigned to the variable username and the value of the second match group to the variable password. These variables can then be used in the expression on the right-hand side of the case. If we don’t match, we simply return UnknownAuthenticationToken.
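The binding behaviour described above can be tried in isolation. The sketch below reuses the same regular expression but returns an Option of a tuple instead of the token classes, since those classes are not shown in this post; MatchDemo and split are hypothetical names.

```scala
// Hypothetical stand-alone demo of the match described above.
object MatchDemo {
  private val UsernamePassword = "([^:]+):(.+)".r

  // A regex pattern must match the whole string; each capture group
  // is bound, in order, to the variables named in the case.
  def split(decoded: String): Option[(String, String)] = decoded match {
    case UsernamePassword(username, password) => Some((username, password))
    case _ => None // no colon, empty username, or empty password
  }
}
```

Because the first group stops at the first colon and the second group takes everything after it, a password may itself contain colons: MatchDemo.split("user:pa:ss") binds username to user and password to pa:ss. A string with no colon falls through to the default case.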
What? It cannot be that easy. That was about 5 or 6 lines shorter than the Java version. This kind of pattern matching is made possible by a Scala concept called extractor objects. I will talk more about pattern matching and extractor objects in a later post.
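As a small preview, here is a rough sketch of a hand-written extractor that does roughly the same split without a regular expression. The names Credentials and ExtractorDemo are invented for this example. Scala's Regex class provides a similar method (unapplySeq), which is what lets a regular expression appear in a case.

```scala
// Hypothetical extractor object: any object with an unapply method
// can be used as a pattern in a case clause.
object Credentials {
  // Called by the compiler for `case Credentials(u, p)`.
  // Returns Some((username, password)) on success, None otherwise.
  def unapply(decoded: String): Option[(String, String)] = {
    val idx = decoded.indexOf(':')
    // Require a non-empty username and a non-empty password.
    if (idx > 0 && idx < decoded.length - 1)
      Some((decoded.substring(0, idx), decoded.substring(idx + 1)))
    else
      None
  }
}

object ExtractorDemo {
  def describe(decoded: String): String = decoded match {
    case Credentials(username, password) => "user=" + username + ", password=" + password
    case _ => "unknown"
  }
}
```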
