Problem 101: Professor Bumblebee
(U.M. ACM Runoffs 2001, problem 1)
Description
On his last expedition to the sunless wastelands of central Georgia,
Professor Bumblebee discovered a tribe of godless naked savages so
primitive that they still used word-perfect on their lap-tops. In
the name of anthropology, the professor killed them all and took
their lap-tops back to his laboratory to study their writings.
Imagine his surprise when he found that they were even more primitive
than he had realised. Their writing seemed to be in perfect English,
but used no punctuation or digits. All their numbers were written
in roman numerals.
Professor Bumblebee's graduate students are all too stupid to understand
roman numerals, so you have to write a program that does the conversion.
Input Format
The input will be plain text consisting only of letters (capital and lower
case), spaces, and newlines. A word is defined to be any sequence of letters
(possibly mixed case) separated from other letters. Some of the words
will represent numbers in roman numerals, and some will not. There is
no limit to the length of the file.
In case you have forgotten, the rules for correct roman numbers
are as follows: If you already know the rules for roman numerals,
you may rest assured that there are no tricks in here, these are the
standard rules as used by real romans in the early imperial period.
Numbers are not case sensitive: capital and lower case mean the same.
The following letters are allowed and have the indicated numeric values;
all other letters are forbidden: I=1, V=5, X=10, L=50, C=100, D=500, M=1000.
The value of a roman number is found by adding up all the values of its
letter components.
In normal circumstances, the letters of a roman number must appear in
descending order of numeric value. The letters I, X, C may appear up to
three times each; the letters V, L, D may appear only once.
The one exception to this rule is that
there are 6 two-letter sequences that are considered to be single
numerals: IV=4, IX=9, XL=40, XC=90, CD=400, CM=900. These two-letter numerals
may appear only once each, and must still appear in their proper
(descending order) position. If IV appears, neither I nor V may appear
anywhere else. If IX appears, I may not appear anywhere else (but up to three X's may
have already appeared earlier). If XL appears, neither X nor L may
appear elsewhere (but this does not preclude a later IX, which is considered
a separate numeral). If XC appears, X may not appear elsewhere. If CD appears,
neither C nor D may appear elsewhere. If CM appears, C may not appear
elsewhere. The two-letter combinations are considered to be single numerals,
so the rule that X may not appear again after XC does not forbid IX from
following XC. There is no restriction on the number of Ms. It may be helpful
to remember that there is only one correct may to write each number in roman
form, so if XIV is valid for 14, nothing else that should mean 14 can be legal.
| Examples: | |
|
| III = 3 | | XIX = 19
|
| IIII is illegal | | XX = 20
|
| IV = 4 | | XL = 40
|
| IVI is illegal | | XC = 90
|
| V = 5 | | XCIX = 99
|
| VI = 6 | | IC is illegal
|
| VIII = 8 | | CMII = 902
|
| VIIII is illegal | | CMXCIX = 999
|
| IX = 9 | | MMMCMXCIX = 3999
|
Output Format
The output should be an EXACT copy of the input, except that any words
that form correct roman numbers should be replaced by the equivalent
number written in normal modern digits. For example, if the word "VI"
appears in the input, it should be replaced by "6", but "VIP" should
be left alone.
Sample Input
hello Di
iiii i agree with you i used to use vi a lot for editing
but in april MCMLXXXVII a smelly VIP
pointed out that pico is xii times easier so I started using that
instead and after CXLIV days my editing accuracy was much improved
Sample Output
hello 501
iiii 1 agree with you 1 used to use 6 a lot for editing
but in april 1987 a smelly VIP
pointed out that pico is 12 times easier so 1 started using that
instead and after 144 days my editing accuracy was much improved
End of Problem