Slach :
I'm trying to remove valid Roman numerals (used as numbering) from a text that contains headlines, paragraphs, etc.
I'm using this regex:
Pattern ROMAN = Pattern.compile("^[(\\[]?x{0,3}(i[xv]|v?i{0,3})[)\\.\\]/]{1,2}", Pattern.CASE_INSENSITIVE);
However, it also matches empty parentheses.
What I want to do is to remove the following:
Input :
iv. foo foo foo.
Output:
foo foo foo.
Input :
v) foo foo foo.
Output:
foo foo foo.
But it should do nothing when they are not used for numbering:
Input :
foo foo foo i) foo v) .
Output:
foo foo foo i) foo v) .
More examples of what the regex should match:
iv)
X)
ix/
V/
x.
IV.
Nikolas :
How about something like the following Regex:
^((?=[mdclxvi])m*(c[md]|d?c{0,3})(x[cl]|l?x{0,3})(i[xv]|v?i{0,3})(?:\)|\.))
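This matches a Roman numeral at the start of the line that is followed by either ) or . and it should be compiled with Pattern.CASE_INSENSITIVE, as in your code. The lookahead (?=[mdclxvi]) requires at least one numeral letter, so empty parentheses no longer match. There is a nice article about matching Roman numerals in Regular Expressions Cookbook by Steven Levithan and Jan Goyvaerts (O'Reilly).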
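To show how this could be wired into your Java code, here is a minimal sketch. The class name RomanNumberingStripper and the method strip are hypothetical names made up for illustration. Two things go beyond the regex above and are my assumptions: / is accepted as a closing delimiter (to cover your ix/ and V/ examples), and any whitespace after the delimiter is removed too, so the output starts directly with the text.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class RomanNumberingStripper {

    // Regex from the answer above, with two assumed additions:
    //  - '/' is accepted as a closing delimiter next to ')' and '.'
    //  - trailing whitespace after the delimiter is consumed (\\s*)
    // The lookahead (?=[mdclxvi]) demands at least one numeral letter,
    // so "()" or a lone ")" is never matched.
    private static final Pattern ROMAN = Pattern.compile(
            "^(?=[mdclxvi])m*(c[md]|d?c{0,3})(x[cl]|l?x{0,3})(i[xv]|v?i{0,3})[)./]\\s*",
            Pattern.CASE_INSENSITIVE);

    // Removes a leading Roman-numeral marker; any other line is returned unchanged.
    static String strip(String line) {
        return ROMAN.matcher(line).replaceFirst("");
    }

    public static void main(String[] args) {
        System.out.println(strip("iv. foo foo foo."));        // prints "foo foo foo."
        System.out.println(strip("v) foo foo foo."));         // prints "foo foo foo."
        System.out.println(strip("foo foo foo i) foo v) .")); // printed unchanged
        System.out.println(strip("() foo"));                  // printed unchanged
    }
}
```

Because the pattern is anchored with ^ and no MULTILINE flag is set, this only looks at the very beginning of the string you pass in, so it would need to be applied line by line (or headline by headline). It also does not handle the optional opening ( or [ from your original pattern.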