Try the following regular expression:
^(?:[\p{L}\p{Mn}\p{Pd}\'\x{2019}]+\s[\p{L}\p{Mn}\p{Pd}\'\x{2019}]+\s?)+$
In PHP, this translates to:
if (preg_match('~^(?:[\p{L}\p{Mn}\p{Pd}\'\x{2019}]+\s[\p{L}\p{Mn}\p{Pd}\'\x{2019}]+\s?)+$~u', $name) > 0)
{
// valid
}
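It should be read like this:
^ # start of subject
(?: # start of non-capturing group
[ # match a:
\p{L} # Unicode letter, or
\p{Mn} # Unicode accent (nonspacing mark), or
\p{Pd} # Unicode hyphen (dash punctuation), or
\' # single quote, or
\x{2019} # single quote (alternative)
]+ # one or more times
\s # any kind of space
[ # match a:
\p{L} # Unicode letter, or
\p{Mn} # Unicode accent (nonspacing mark), or
\p{Pd} # Unicode hyphen (dash punctuation), or
\' # single quote, or
\x{2019} # single quote (alternative)
]+ # one or more times
\s? # any kind of space (0 or 1 times)
)+ # one or more times
$ # end of subject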
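The breakdown above can also live in the code itself by adding the x (extended) modifier. This is only a minimal sketch: the helper name isValidName is an assumption and not part of the original pattern. Each character class is kept on a single line, because whitespace and # inside a class are still literal in extended mode.

```php
<?php
// Hypothetical helper (the name is an assumption, not from the original answer).
function isValidName(string $name): bool
{
    // x: ignore whitespace and allow "#" comments outside character classes.
    // u: treat pattern and subject as UTF-8.
    $pattern = '~
        ^                                    # start of subject
        (?:
            [\p{L}\p{Mn}\p{Pd}\'\x{2019}]+   # first word
            \s                               # one whitespace character
            [\p{L}\p{Mn}\p{Pd}\'\x{2019}]+   # second word
            \s?                              # optional trailing whitespace
        )+                                   # repeat the pair one or more times
        $                                    # end of subject
    ~ux';

    return preg_match($pattern, $name) === 1;
}

var_dump(isValidName('André Svenson'));  // bool(true)
var_dump(isValidName('Hans'));           // bool(false)
```

Comparing against 1 rather than using > 0 also treats a false return (a pattern or encoding error) as "not valid".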
I really don't know how to port this to JavaScript, and I'm not even sure whether JavaScript supports Unicode properties, but in PHP (PCRE) it seems to work flawlessly on IDEOne.com:
$names = array
(
'Alix',
'André Svenson',
'H4nn3 Andersen',
'Hans',
'John Elkjærd',
'Kristoffer la Cour',
'Marco d\'Almeida',
'Martin Henriksen!',
);
foreach ($names as $name)
{
echo sprintf('%s is %s' . "\n", $name, (preg_match('~^(?:[\p{L}\p{Mn}\p{Pd}\'\x{2019}]+\s[\p{L}\p{Mn}\p{Pd}\'\x{2019}]+\s?)+$~u', $name) > 0) ? 'valid' : 'invalid');
}
Sorry I can't help you with the JavaScript side, but someone here probably can.
Valid:
- John Elkjærd
- André Svenson
- Marco d'Almeida
- Kristoffer la Cour
Invalid:
- Hans
- H4nn3 Andersen
- Martin Henriksen!
To strip the invalid characters (though I'm not sure why you would need to), you only need to change it slightly:
$name = preg_replace('~[^\p{L}\p{Mn}\p{Pd}\'\x{2019}\s]~u', '', $name);
Examples:
- H4nn3 Andersen -> Hnn Andersen
- Martin Henriksen! -> Martin Henriksen
Note that you always need to use the u modifier.
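For reuse, the stripping step could be wrapped in a small function. This is a rough sketch only: the name sanitizeName is hypothetical, and the replacement is an empty string because the pattern contains no capture group.

```php
<?php
// Hypothetical helper (the name is an assumption): removes every character that
// is not a letter, nonspacing mark, dash, apostrophe, or whitespace.
function sanitizeName(string $name): string
{
    // preg_replace() returns null on error (for example, invalid UTF-8 input
    // with the u modifier), so fall back to an empty string in that case.
    return preg_replace('~[^\p{L}\p{Mn}\p{Pd}\'\x{2019}\s]~u', '', $name) ?? '';
}

echo sanitizeName('H4nn3 Andersen'), "\n";    // Hnn Andersen
echo sanitizeName('Martin Henriksen!'), "\n"; // Martin Henriksen
```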