你可以使用以下的正则表达式来完成这个任务:
^M{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})$
简单来说,
M{0,4}
指定了千位数,并将其限制在
0
到
4000
之间。这是一个相对简单的规则。
0: <empty> matched by M{0}
1000: M matched by M{1}
2000: MM matched by M{2}
3000: MMM matched by M{3}
4000: MMMM matched by M{4}
当然,如果你想允许更大的数字,你可以使用类似于
M*
的表达式,以允许任意数量(包括零)的千位数。
接下来是
(CM|CD|D?C{0,3})
,稍微复杂一些,这是用于百位数的部分,涵盖了所有可能的情况。
0: <empty> matched by D?C{0} (with D not there)
100: C matched by D?C{1} (with D not there)
200: CC matched by D?C{2} (with D not there)
300: CCC matched by D?C{3} (with D not there)
400: CD matched by CD
500: D matched by D?C{0} (with D there)
600: DC matched by D?C{1} (with D there)
700: DCC matched by D?C{2} (with D there)
800: DCCC matched by D?C{3} (with D there)
900: CM matched by CM
第三,
(XC|XL|L?X{0,3})
遵循与前一部分相同的规则,但适用于十位数:
0: <empty> matched by L?X{0} (with L not there)
10: X matched by L?X{1} (with L not there)
20: XX matched by L?X{2} (with L not there)
30: XXX matched by L?X{3} (with L not there)
40: XL matched by XL
50: L matched by L?X{0} (with L there)
60: LX matched by L?X{1} (with L there)
70: LXX matched by L?X{2} (with L there)
80: LXXX matched by L?X{3} (with L there)
90: XC matched by XC
最后,
(IX|IV|V?I{0,3})
是单位部分,处理
0
到
9
,并且与前两个部分相似(罗马数字,尽管看起来很奇怪,但一旦你弄清楚它们的规则,它们就会遵循一些逻辑规律):
0: <empty> matched by V?I{0} (with V not there)
1: I matched by V?I{1} (with V not there)
2: II matched by V?I{2} (with V not there)
3: III matched by V?I{3} (with V not there)
4: IV matched by IV
5: V matched by V?I{0} (with V there)
6: VI matched by V?I{1} (with V there)
7: VII matched by V?I{2} (with V there)
8: VIII matched by V?I{3} (with V there)
9: IX matched by IX
只需记住,该正则表达式也会匹配空字符串。如果您不希望这样(并且您的正则表达式引擎足够现代化),您可以使用正向先行断言:
^(?=.)M{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})$
这是一个“检查匹配但丢弃”的操作,意味着它向前查找以检查第一个字符(
.
)是否存在于起始标记(
^
)之后,但不吸收该第一个字符。例如,如果字符串是
M
,那么它将匹配
.
,但仍然可用于正则表达式的下一部分
M{0,4}
。然而,空字符串将无法匹配前瞻,因此会失败。
另一种选择是,如果您不仅仅限于使用正则表达式,可以事先检查长度是否为零。
/^M{0,3}(?:C[MD]|D?C{0,3})(?:X[CL]|L?X{0,3})(?:I[XV]|V?I{0,3})$/i
- Crissov