This article was originally published on the Encodo Blogs. Browse on over to see more!
A developer on the Microsoft C# compiler team recently made a post asking readers to post their solutions to a programming exercise in Comma Quibbling by Eric Lippert (Fabulous Adventures in Coding). The requirements are as follows:
On top of that, he stipulated “I am particularly interested in solutions which make the semantics of the code very clear to the code maintainer.”
Before doing anything else, let’s nail down the specification above with some tests, using the NUnit testing framework:
[TestFixture]
public class SentenceComposerTests
{
[Test]
public void TestZero()
{
var parts = new string[0];
var result = parts.ConcatenateWithAnd();
Assert.AreEqual("{}", result);
}
[Test]
public void TestOne()
{
var parts = new[] { "one" };
var result = parts.ConcatenateWithAnd();
Assert.AreEqual("{one}", result);
}
[Test]
public void TestTwo()
{
var parts = new[] { "one", "two" };
var result = parts.ConcatenateWithAnd();
Assert.AreEqual("{one and two}", result);
}
[Test]
public void TestThree()
{
var parts = new[] { "one", "two", "three" };
var result = parts.ConcatenateWithAnd();
Assert.AreEqual("{one, two and three}", result);
}
[Test]
public void TestTen()
{
var parts = new[] { "one", "two", "three", "four", "five", "six", "seven", "eight", "nine", "ten" };
var result = parts.ConcatenateWithAnd();
Assert.AreEqual("{one, two, three, four, five, six, seven, eight, nine and ten}", result);
}
}
The tests assume that the method ConcatenateWithAnd()
is declared as an extension method. With the tests written, I figured I’d take a crack at the solution, keeping the last condition foremost in my mind instead of compactness, elegance or cleverness (as often predominate). Instead, I wanted to make the special cases given in the specification as clear as possible in the code. On top of that, I added the following conditions to the implementation:
That said, here’s my version:
public static string ConcatenateWithAnd(this IEnumerable<string> words)
{
using (var enumerator = words.GetEnumerator())
{
if (!enumerator.MoveNext())
{
return "{}";
}
var firstItem = enumerator.Current;
if (!enumerator.MoveNext())
{
return "{" + firstItem + "}";
}
var secondItem = enumerator.Current;
if (!enumerator.MoveNext())
{
return "{" + firstItem + " and " + secondItem + "}";
}
var builder = new StringBuilder("{");
builder.Append(firstItem);
builder.Append(", ");
builder.Append(secondItem);
var item = enumerator.Current;
while (enumerator.MoveNext())
{
builder.Append(", ");
builder.Append(item);
item = enumerator.Current;
}
builder.Append(" and ");
builder.Append(item);
builder.Append("}");
return builder.ToString();
}
}
Looking at this from a maintenance or understanding point-of-view, I have the following notes:
foreach
-statement.StringBuilder.Append()
are intentional. I wanted to avoid having to use escaped {}
in the format string (e.g. String.Format(“{{{0} and {1}}}”, firstItem, secondItem)
is confusing if you’re not aware how curly brackets are escaped in a format string).Other than those things, it seems relatively compact and efficient. With my own version written, I looked through the comments on the post to see if any other interesting solutions were available. I came up with two that caught my eye, one by Jon Skeet and another by Hresto Deshev, who submitted his in F#.
Hristo’s example in F# is as follows:
#light
let format (words:list<string>) =
let rec makeList (words: list<string>) =
match words with
| [] -> ""
| first :: [] -> first
| first :: second :: [] -> first + " and " + second
| first :: second :: rest -> first + ", " + second + ", " + (makeList rest)
"{" + (makeList words) + "}"
That’s so cool: the formulation in F# is almost plain English! That’s pretty damned maintainable, I’d say. I have no way of judging the performance of this just-in-time parsing, but it does make use of recursion: lists with thousands of items will incur thousands of nested calls.
Next up is Jon Skeet’s version in C#:
public static string JonSkeetVersion(this IEnumerable<string> words)
{
var builder = new StringBuilder("{");
string last = null;
string penultimate = null;
foreach (string word in words)
{
// Shuffle existing words down
if (penultimate != null)
{
builder.Append(penultimate);
builder.Append(", ");
}
penultimate = last;
last = word;
}
if (penultimate != null)
{
builder.Append(penultimate);
builder.Append(" and ");
}
if (last != null)
{
builder.Append(last);
}
builder.Append("}");
return builder.ToString();
}
This one is very clever and handles all cases in a single loop rather than addressing special cases outside of a loop (as mine did). Also, all of the formatting elements—the curly brackets and item separators—are mentioned only once, improving maintainability. I immediately liked it better than my own solution from a technical standpoint. While I’m drawn to the cleverness and elegance of the solution, I’m not the target audience. Skeet’s version forces you to reason out the special cases; it’s not immediately obvious how the special cases for zero, one and two elements are handled. Also, while I am tickled pink by the aptness of the variable name penultimate
, I wonder how many non-native English speakers would understand its intent without a visit to an online dictionary. The name secondToLast
would have been a better, though far less sexy, choice.
It’s very easy to underestimate how little people are willing to actually read code that they didn’t write. If the code requires a certain amount of study to understand, then they may just leave it well enough alone and seek the original developer. If, however, it looks quite easy and the special cases are made clear—as in my version—they are far more likely to dig further and work with it. Since the problem is defined as a three special cases and a general case, it is probably best to offer a solution where these cases are immediately obvious to ease maintainability—and as long as you don’t sacrifice performance unnecessarily. Cleverness is wonderful, but you may end up severely limiting the number of people willing—or able—to work on that code.
This one’s from Chris Meadowcroft:
const string EnglishListPrefix = "{";
const string EnglishListSuffix = "}";
const string IntermediateSeparator = ", ";
const string LastSeparator = " and ";
static string BuildStringWithEnglishListSyntax(IEnumerable<string> itemsEnumerable)
{
StringBuilder result = new StringBuilder(EnglishListPrefix);
using(IEnumerator<string> items = itemsEnumerable.GetEnumerator())
{
if(items.MoveNext()) // make sure it's not an empty list
{
bool isFirstString = true;
bool isLastString = false;
while(!isLastString)
{
string current = items.Current;
isLastString = !items.MoveNext();
if(!isFirstString)
{
result.Append(isLastString ? LastSeparator : IntermediateSeparator);
}
else
{
isFirstString = false;
}
result.Append(current);
}
}
}
result.Append(EnglishListSuffix);
return result.ToString();
}