How do I prevent Python from escaping special characters when reading regular expressions from a text file?

Question

I am using Python to read a text file containing pre-written regular expressions that are matched against later. The text file is formatted like this:

```
...
--> Task 2
Concatenate and print the strings "Hello, " and "world!" to the screen.
--> Answer
Hello, world!
print(\"Hello,\s\"\s*+\s*\"world!\")
--> Hint 1
You can concatenate two strings with the + operator
...
```

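User input is accepted for each task and executed in a subprocess, and then either the return value is checked or the input is matched against the regular expression. The problem is that Python's file.readline() appears to escape all special characters (i.e. backslashes) in the regex string, which leaves me with something useless.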

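I tried reading the file in as bytes and decoding each line with the 'raw_unicode_escape' codec (described as producing "a string that is suitable as raw Unicode literal in Python source code"), but that did not work: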

file = open(filename, 'rb')
for line in file:
  line = line.decode('raw_unicode_escape')
  ...

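Am I going about this completely the wrong way?

Thank you very much for any help.

P.S. I also found another question about problems reading special characters from a file. However, I still have the same problem when I use open(filename, 'r', encoding='utf-8').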

- Zachary Allaun

1 Answer

- unutbu · Accepted Answer

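Python regular expression patterns are just ordinary strings, so storing them in a file should not be a problem. Perhaps you are seeing escaped characters after file.readline() because you are looking at the repr of the line? The repr doubles each backslash for display only; the string itself still contains single backslashes. So this should not be an issue when you actually use the pattern as a regex: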

import re
filename='/tmp/test.txt'
with open(filename,'w') as f:
    f.write(r'\"Hello,\s\"\s*\+\s*\"world!\"')

with open(filename,'r') as f:
    pat = f.readline()
    print(pat)
    # \"Hello,\s\"\s*\+\s*\"world!\"
    print(repr(pat))
    # '\\"Hello,\\s\\"\\s*\\+\\s*\\"world!\\"'
    assert re.search(pat,'  "Hello, " +   "world!"')  # Shows match was found
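
One practical detail when the patterns sit on separate lines of a file, as in the question: each line returned by iteration or readline() keeps its trailing newline, and that newline becomes part of the pattern unless it is removed. The following is a minimal sketch, assuming one pattern per line; the helper name `load_patterns` and the sample patterns are illustrative and not from the original post.

```python
import os
import re
import tempfile


def load_patterns(path):
    """Read one regex per line and compile it (hypothetical helper)."""
    patterns = []
    with open(path, 'r', encoding='utf-8') as f:
        for line in f:
            # Drop only the line ending; backslashes are left untouched.
            text = line.rstrip('\n')
            if text:
                patterns.append(re.compile(text))
    return patterns


# Write two sample patterns to a temporary file, exactly as they would
# look in a text editor (single backslashes).
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False,
                                 encoding='utf-8') as tmp:
    tmp.write(r'print\(\"Hello,\s\"\s*\+\s*\"world!\"\)' + '\n')
    tmp.write(r'\d+' + '\n')
    tmp_path = tmp.name

patterns = load_patterns(tmp_path)
os.remove(tmp_path)

# The compiled patterns hold the same text that was written to the file.
print(patterns[0].pattern)
print(bool(patterns[0].search('print("Hello, " + "world!")')))
print(bool(patterns[1].fullmatch('12345')))
```

Stripping with rstrip('\n') rather than strip() is a deliberate choice here, so that leading or trailing spaces that are meant to be part of a pattern survive.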