Duplication in a codebase is problematic for several reasons. First, if the code really should be the same, it makes it easy to introduce bugs if you need to change it. The more places the same snippet of code exists, the more chances you have to forget to implement changes. Perhaps more importantly, if some parts of the code are intentionally different, it can become hard to tell whether the differences are intentional.

Today, I ran across a code snippet that looked something like this. The comment was there already – as much as I wish otherwise, I didn’t add it.

final Class queryClass = queryType.getTypeClass();
// TODO Yuck, this could use some refactoring
if (queryClass.equals(Project.class)) {
Workspace workspace = RallyRequestContext.getCurrent().getWorkspace();
Collection<Project> accessibleProjs = filter(workspace.getProjects(), accessibleTo(user));
Collection<Long> oids = transform(accessibleProjs, toOids());
} else if (queryClass.equals(Scope.class)) {
List<Long> oids = newArrayList();
Workspace workspace = RallyRequestContext.getCurrent().getWorkspace();
Collection<Project> accessibleProjs = filter(workspace.getProjects(), accessibleTo(user));
for (Project project : accessibleProjs) {
}
} else if(queryClass.equals(AttributeConfiguration.class)) {
List<Long> oids = newArrayList();
Workspace workspace = RallyRequestContext.getCurrent().getWorkspace();
Collection<Project> accessibleProjs = filter(workspace.getProjects(), accessibleTo(user));
for (Project project : accessibleProjs) {
}
}

Clearly, there’s duplication here, both subtle and blatant. There are subtle differences in the three blocks above. Can you tell which are intentional and which are just noise?

First, there’s the alternating usage of Google Guava’s Collections2.transform and for each loops. Let’s at least make those consistent. That gives us this:

final Class queryClass = queryType.getTypeClass();
// TODO Yuck, this could use some refactoring
if (queryClass.equals(Project.class)) {
Workspace workspace = RallyRequestContext.getCurrent().getWorkspace();
Collection<Project> accessibleProjs = filter(workspace.getProjects(), accessibleTo(user));
Collection<Long> oids = transform(accessibleProjs, toOids());
} else if (queryClass.equals(Scope.class)) {
List<Long> oids = newArrayList();
Workspace workspace = RallyRequestContext.getCurrent().getWorkspace();
Collection<Project> accessibleProjs = filter(workspace.getProjects(), accessibleTo(user));
} else if(queryClass.equals(AttributeConfiguration.class)) {
List<Long> oids = newArrayList();
Workspace workspace = RallyRequestContext.getCurrent().getWorkspace();
Collection<Project> accessibleProjs = filter(workspace.getProjects(), accessibleTo(user));
}

Now that we’ve eliminated the subtle duplication, we’re left with blatant duplication. Let’s fix that up some by moving all of the repeated code out of the individual if-blocks:

final Class queryClass = queryType.getTypeClass();

Workspace workspace = RallyRequestContext.getCurrent().getWorkspace();
List<Long> oids = newArrayList();

Collection<Project> accessibleProjs = filter(workspace.getProjects(), accessibleTo(user));

// TODO Yuck, this could use some refactoring
if (queryClass.equals(Project.class)) {
} else if (queryClass.equals(Scope.class)) {
} else if (queryClass.equals(AttributeConfiguration.class)) {
}