More Roman Numerals and Bash

When in Rome: finishing the Roman numeral converter script.

In my last article, I started digging into a classic computer science puzzle: converting Roman numerals to Arabic numerals. First off, the latter more accurately should be called Hindu-Arabic numerals, and it's worth mentioning that the system, a counting scheme based on the digits 0 through 9, is believed to have been invented somewhere between the first and fourth century.

The script I ended up with last time offered the basics of parsing a specified Roman numeral and converting each character into its decimal equivalent with this simple function:

mapit() {
   case $1 in
     I|i) value=1 ;;
     V|v) value=5 ;;
     X|x) value=10 ;;
     L|l) value=50 ;;
     C|c) value=100 ;;
     D|d) value=500 ;;
     M|m) value=1000 ;;
      * ) echo "Error: Value $1 unknown" >&2 ; exit 2 ;;
   esac
}

Then I demonstrated a slick way to use the underutilized seq command to parse a string character by character, but the sad news is that you won't be able to use it for the final Roman numeral to Arabic numeral converter. Why? Because depending on the situation, the script sometimes will need to jump two ahead, and not just go left to right linearly, one character at a time.
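To see the limitation concretely, here is a minimal sketch (the sample numeral and the visited variable are made up for illustration). It tries to skip ahead inside a seq-driven for loop and records which positions actually get visited:

```shell
#!/usr/bin/env bash
# Illustrative only: "MCM" and "visited" are assumptions, not from the script.
romanvalue="MCM"
length=${#romanvalue}
visited=""

for index in $(seq 1 "$length") ; do
  visited="$visited$index "
  # Attempt to jump an extra position ahead. This has no lasting effect,
  # because seq produced the full list 1..length before the loop began
  # and the for loop overwrites index on every pass.
  index=$(( index + 1 ))
done

echo "positions visited: $visited"
```

Every position still gets visited, so the loop has no way to consume two characters in a single pass.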

Instead, you can build the main loop as a while loop:

while [ $index -lt $length ] ; do

    # our code goes here

    index=$(( $index + 1 ))
done

There are two basic cases to think about in terms of solving this algorithmic puzzle: the subsequent value is greater than the current value, or it isn't—for example, IX versus II. The first is 9 (literally 1 subtracted from 10), and the second is 2. That's no surprise; you'll need to know both the current and next values within the script.
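As a rough sketch of those two cases, consider a numeral that is exactly two characters long. The pair_value function below is a hypothetical helper, not part of the script, and it takes the already-converted decimal values as arguments:

```shell
#!/usr/bin/env bash
# pair_value is a hypothetical helper used only for this illustration.
# $1 is the decimal value of the current character, $2 that of the next one.
pair_value() {
  local current=$1 next=$2
  if [ "$next" -gt "$current" ] ; then
    echo $(( next - current ))    # subtractive pair, such as IX
  else
    echo $(( current + next ))    # additive pair, such as II
  fi
}

echo "IX = $(pair_value 1 10)"
echo "II = $(pair_value 1 1)"
```

This covers only an isolated pair. In a longer numeral, the second character might itself begin a subtractive pair, so simple addition isn't always safe.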

Sharp readers already will recognize that the last character in a sequence is a special case, because there won't be a next value available. I'm going to ignore the special case to start with, and I'll address it later in the code development. Stay tuned, sharpies!

Because a Bash function can hand back only an exit status rather than an arbitrary value, the code to get the current and next values won't be value=mapit(romanchar). Instead, it's a smidge clumsy, relying on the global variable value:

mapit ${romanvalue:index-1:1}

mapit ${romanvalue:index:1}
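Since every call to mapit overwrites value, each result has to be copied out before the next call. Here is one way that might look; the variable names currentval and nextval are assumptions chosen for illustration, as is the sample numeral:

```shell
#!/usr/bin/env bash
# mapit as described in the article: sets the global variable "value".
mapit() {
  case $1 in
    I|i) value=1 ;;
    V|v) value=5 ;;
    X|x) value=10 ;;
    L|l) value=50 ;;
    C|c) value=100 ;;
    D|d) value=500 ;;
    M|m) value=1000 ;;
     * ) echo "Error: Value $1 unknown" >&2 ; exit 2 ;;
  esac
}

romanvalue="IX"   # sample input, assumed for this sketch
index=1           # 1-based position of the current character

mapit "${romanvalue:index-1:1}"   # substring offsets are 0-based
currentval=$value                 # save it before mapit runs again

mapit "${romanvalue:index:1}"
nextval=$value

echo "current=$currentval next=$nextval"
```

With index at 1, the first call looks at offset 0 (the I) and the second at offset 1 (the X).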

It's key to realize that in the situation where the next value isn't greater than the current value (for example, MC), you can't automatically conclude that the next value isn't going to be part of a complex two-value sequence anyway. Like this: MCM. You can't just say M=1000 and C=100, so let's just convert it to 1100 and process the second M when we get to it. MCM=1900, not 2100!
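The arithmetic for MCM can be spelled out with shell arithmetic. This is only a worked example with the values hard-coded, and the variable names naive and correct are made up for illustration:

```shell
#!/usr/bin/env bash
# Worked example only: values are hard-coded rather than looked up.
M=1000
C=100

# Adding the characters left to right, one at a time: M + C + M
naive=$(( M + C + M ))

# Treating the trailing CM as a subtractive pair: M + (M - C)
correct=$(( M + (M - C) ))

echo "naive=$naive correct=$correct"
```

The 200 difference is the C being counted as an addition when it should be a subtraction.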

The basic logic turns out to be pretty straightforward: