Python's re.search does not work on a multi-line string

4

I have this file loaded into a string:

// some preceding stuff
static char header_data[] = {
    1,1,1,1,1,1,0,0,0,0,1,1,1,1,1,1,
    1,1,1,0,1,1,0,1,1,0,1,1,0,1,1,1,
    1,1,0,1,0,1,0,1,1,0,1,0,1,0,1,1,
    1,0,1,1,1,0,0,1,1,0,0,1,1,1,0,1,
    0,0,0,1,1,1,1,1,1,1,1,1,1,0,1,1,
    1,0,0,0,1,1,0,1,1,1,1,1,0,1,1,1,
    0,1,0,0,0,1,0,0,1,1,1,1,0,0,0,0,
    0,1,1,0,0,0,0,0,0,1,1,1,1,1,1,0,
    0,1,1,1,0,0,0,0,0,0,1,1,0,1,1,0,
    0,0,0,0,1,0,0,0,1,0,0,1,0,1,0,0,
    1,1,1,0,1,1,0,0,1,1,0,0,0,1,1,1,
    1,1,0,1,1,1,1,1,1,1,1,0,0,0,1,1,
    1,0,1,1,1,0,0,1,1,0,0,0,0,0,1,1,
    1,1,0,1,0,1,0,1,1,1,1,0,0,0,0,1,
    1,1,1,0,1,1,0,1,1,0,1,1,1,1,0,1,
    1,1,1,1,1,1,0,0,0,0,1,1,1,1,1,1
    };

I want to get only the block of ones and zeros, and then process it somehow.

I imported re and tried this:

In [11]: re.search('static char header_data(.*);', src, flags=re.M)

In [12]: re.findall('static char header_data(.*);', src, flags=re.M)
Out[12]: []

Why does it not match anything, and how can I fix it? (This is Python 3.)


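Consider the non-greedy match from my answer. Otherwise you will match more than one brace section. But that depends on what your source file looks like. - wenzul
@wenzul It is actually the last brace in the file, but I'll add it to be safe. - MightyPork
@wenzul, I rejected your edit because it did not improve anything. When I ran into this problem, I googled almost exactly what is in the title now, so this way people can find it easily. - MightyPork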
3 Answers

13
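You need the re.S flag, not re.M.
  • re.M (re.MULTILINE) controls the behavior of ^ and $ (whether they match only at the start/end of the whole string or also at the start/end of each line).
  • re.S (re.DOTALL) controls the behavior of . and is the option you need when you want the dot to match newlines.
See the documentation.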
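
A minimal sketch of the difference between the two flags. The string `src` here is a shortened, hypothetical stand-in for the file contents in the question, not the original data:

```python
import re

# Hypothetical, shortened stand-in for the `src` string from the question.
src = "static char header_data[] = {\n    1,0,\n    0,1\n    };\n"

pattern = r"static char header_data(.*);"

# re.M only changes ^ and $; the dot still stops at every newline,
# and there is no ';' on the first line, so nothing matches.
no_match = re.search(pattern, src, flags=re.M)

# re.S lets the dot match newlines, so the group can span several lines.
match = re.search(pattern, src, flags=re.S)

print(no_match)            # None
print(match is not None)   # True
```

With re.S the captured group still contains the `[] = {` prefix and the closing brace, so a tighter pattern is usually worth the extra effort.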

1
Thanks, works nicely now. The name "MULTILINE" is a bit confusing. - MightyPork

3

"and then process it somehow."

Here is how to get a usable list out of the file:

import re
match = re.search(r"static char header_data\[\] = {(.*?)};", src, re.DOTALL)
if match:
    header_data = "".join(match.group(1).split()).split(',')
    print(header_data)  # print() is a function in Python 3

.*? is a non-greedy match, so you really get only the values between the braces.
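
To illustrate why the non-greedy form matters, here is a small sketch. It assumes a hypothetical source containing a second initializer block named `other_data`, which is not part of the question's file:

```python
import re

# Hypothetical source with two initializer blocks (other_data is made up).
src = (
    "static char header_data[] = {\n    1,0\n    };\n"
    "static char other_data[] = {\n    0,1\n    };\n"
)

# Greedy: .* runs to the LAST "};" in the string.
greedy = re.search(r"static char header_data\[\] = {(.*)};", src, re.DOTALL)
# Non-greedy: .*? stops at the FIRST "};" after the opening brace.
lazy = re.search(r"static char header_data\[\] = {(.*?)};", src, re.DOTALL)

# Strip all whitespace so the captured text is easy to compare.
greedy_values = "".join(greedy.group(1).split())
lazy_values = "".join(lazy.group(1).split())

print(lazy_values)                      # 1,0
print("other_data" in greedy_values)    # True
```

If the array really is the last braced block in the file, both forms capture the same text, but the non-greedy one stays correct if more code is appended later.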

A more explicit way, without DOTALL or MULTILINE, would be:

match = re.search(r"static char header_data\[\] = {([01,\s\r\n]*?)};", src)
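
As a rough sketch of the "process it somehow" step, the captured text can be turned into a list of integers. The sample `src` is a shortened, hypothetical version of the file, and the character class is written as `[01,\s]` because `\s` already covers `\r` and `\n`:

```python
import re

# Hypothetical, shortened version of the file contents from the question.
src = (
    "// some preceding stuff\n"
    "static char header_data[] = {\n"
    "    1,1,0,\n"
    "    0,1,1\n"
    "    };\n"
)

# Only digits 0/1, commas and whitespace may appear inside the braces,
# so no DOTALL flag is needed.
match = re.search(r"static char header_data\[\] = {([01,\s]*?)};", src)

# int() ignores surrounding whitespace; skip empty tokens such as the one
# a trailing comma would produce.
bits = [int(tok) for tok in match.group(1).split(",") if tok.strip()] if match else []

print(bits)  # [1, 1, 0, 0, 1, 1]
```

Because the class only admits the expected characters, the pattern fails to match rather than silently capturing unrelated code if the array ever contains something else.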

0
If the file format never changes, you can skip re and use slicing instead. Something like the following may be useful:
>>> file_in_string
'\n// some preceding stuff\nstatic char header_data[] = {\n    1,1,1,1,1,1,0,0,0
,0,1,1,1,1,1,1,\n    1,1,1,0,1,1,0,1,1,0,1,1,0,1,1,1,\n    1,1,0,1,0,1,0,1,1,0,1
,0,1,0,1,1,\n    1,0,1,1,1,0,0,1,1,0,0,1,1,1,0,1,\n    0,0,0,1,1,1,1,1,1,1,1,1,1
,0,1,1,\n    1,0,0,0,1,1,0,1,1,1,1,1,0,1,1,1,\n    0,1,0,0,0,1,0,0,1,1,1,1,0,0,0
,0,\n    0,1,1,0,0,0,0,0,0,1,1,1,1,1,1,0,\n    0,1,1,1,0,0,0,0,0,0,1,1,0,1,1,0,\
n    0,0,0,0,1,0,0,0,1,0,0,1,0,1,0,0,\n    1,1,1,0,1,1,0,0,1,1,0,0,0,1,1,1,\n
 1,1,0,1,1,1,1,1,1,1,1,0,0,0,1,1,\n    1,0,1,1,1,0,0,1,1,0,0,0,0,0,1,1,\n    1,1
,0,1,0,1,0,1,1,1,1,0,0,0,0,1,\n    1,1,1,0,1,1,0,1,1,0,1,1,1,1,0,1,\n    1,1,1,1
,1,1,0,0,0,0,1,1,1,1,1,1\n    };\n'
>>> lines = file_in_string.split()
>>> lines[9:-1]
['1,1,1,1,1,1,0,0,0,0,1,1,1,1,1,1,', '1,1,1,0,1,1,0,1,1,0,1,1,0,1,1,1,', '1,1,0,
1,0,1,0,1,1,0,1,0,1,0,1,1,', '1,0,1,1,1,0,0,1,1,0,0,1,1,1,0,1,', '0,0,0,1,1,1,1,
1,1,1,1,1,1,0,1,1,', '1,0,0,0,1,1,0,1,1,1,1,1,0,1,1,1,', '0,1,0,0,0,1,0,0,1,1,1,
1,0,0,0,0,', '0,1,1,0,0,0,0,0,0,1,1,1,1,1,1,0,', '0,1,1,1,0,0,0,0,0,0,1,1,0,1,1,
0,', '0,0,0,0,1,0,0,0,1,0,0,1,0,1,0,0,', '1,1,1,0,1,1,0,0,1,1,0,0,0,1,1,1,', '1,
1,0,1,1,1,1,1,1,1,1,0,0,0,1,1,', '1,0,1,1,1,0,0,1,1,0,0,0,0,0,1,1,', '1,1,0,1,0,
1,0,1,1,1,1,0,0,0,0,1,', '1,1,1,0,1,1,0,1,1,0,1,1,1,1,0,1,', '1,1,1,1,1,1,0,0,0,
0,1,1,1,1,1,1']
