Why can't the regex for Unicode uppercase letters match "Ó"?

Question


It seems that the accented Ó is not recognized as an uppercase letter:

#!/usr/bin/env perl
use strict;
use warnings;
use 5.14.0;
use utf8;
use feature 'unicode_strings';

" SIMÓN " =~ /^\s+(\p{Upper}+)/u;
print "$1\n";

This prints:

SIM

Perl should be able to use the Unicode data, which already marks Ó as uppercase. From Emacs describe-char:

character code properties: customize what to show
  name: LATIN CAPITAL LETTER O WITH ACUTE
  old-name: LATIN CAPITAL LETTER O ACUTE
  general-category: Lu (Letter, Uppercase)
  decomposition: (79 769) ('O' '́')

- user525602

I don't think you can get a smaller test case than the one provided :-) - paxdiablo


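Ah, sorry, @pst. I only looked at the line count and didn't pay attention to the content. You're right, the regex itself could probably be simplified. - paxdiablo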

1 Answer


- ikegami · Accepted Answer

You need use open ':std', ':locale'; to encode your output correctly.
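A minimal sketch of the question's script with that pragma added. It assumes the source file really is saved as UTF-8 and that the terminal's locale uses an encoding that can represent Ó. The helper name leading_upper is hypothetical and only there to make the match easy to inspect.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use 5.14.0;
use utf8;                      # string literals in this file are decoded from UTF-8
use open ':std', ':locale';    # encode STDIN/STDOUT/STDERR according to the locale

# Hypothetical helper: return the leading run of uppercase letters
# after the initial whitespace, or undef if there is none.
sub leading_upper {
    my ($text) = @_;
    return $text =~ /^\s+(\p{Upper}+)/u ? $1 : undef;
}

my $word = leading_upper(" SIMÓN ");
print "$word\n";               # the whole word, encoded for the terminal
print length($word), "\n";     # counts characters, not bytes
```

The regex in the question already matched the whole word; what went wrong was how the captured string was written to the terminal. Printing the length is a quick way to confirm that without relying on the terminal's rendering.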

If that doesn't work, then your file isn't actually encoded in UTF-8, even though you told Perl it is (with use utf8).
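One way to check for that second possibility is to decode the file's raw bytes strictly as UTF-8. This is a sketch and not part of the original answer: the helper name is_valid_utf8 is hypothetical, and it relies on decode and FB_CROAK from the core Encode module.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK);

# Hypothetical helper: return 1 if the byte string is well-formed UTF-8, else 0.
sub is_valid_utf8 {
    my ($bytes) = @_;
    my $copy = $bytes;    # decode() with a CHECK value may modify its argument
    my $ok = eval { decode('UTF-8', $copy, FB_CROAK); 1 };
    return $ok ? 1 : 0;
}

# Usage: perl check_utf8.pl script.pl   (the file name is only an example)
if (@ARGV) {
    my $path = $ARGV[0];
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    my $bytes = do { local $/; <$fh> };    # slurp the file as raw bytes
    close $fh;
    print "$path: ", (is_valid_utf8($bytes) ? "valid UTF-8" : "not valid UTF-8"), "\n";
}
```

A file saved as Latin-1 stores Ó as the single byte 0xD3, which fails this check. A UTF-8 file stores it as the two bytes 0xC3 0x93.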