What is the purpose of .* in a Python lookahead regex?

Question

python regex regex-lookarounds

3

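I am learning regular expressions and found an interesting and useful article on password input validation. I have a question about the .* in the following expression: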

"^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)[a-zA-Z\d]{8,}$"

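I know that .* is a wildcard standing for any amount of text (or no text at all), but I am having trouble understanding what it does inside these lookahead expressions. Why is it needed for the lookaheads to work?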

- James S

So that somewhere further on you eventually find an [a-z], and so on. - Willem Van Onsem

@WillemVanOnsem Yes, but .* is greedy. The dot does not match newlines. So could they be looking for a newline followed by any alphabetic character? - Jerinaw

2

@Jerinaw: inside a lookahead, greediness is not a factor, because the lookahead does not capture anything. And normally the dot . does not match a newline. - Willem Van Onsem

1 Answer

- Willem Van Onsem · Accepted Answer

A lookahead looks directly ahead from the current position. So if you write:

(?=a)

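it means the very next character (here, the first one) must be an a. Sometimes, for instance in a password check, you do not want that restriction. You only want to say that there should be an a somewhere in the string. Therefore: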

(?=.*a)

This means the first character can be, say, a b, an 8 or an @, but somewhere further on there must be an a.
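The difference between the two lookaheads can be checked directly. This is a minimal sketch; the names next_is_a and has_a are only illustrative and do not come from the question.

```python
import re

# (?=a) asserts that the very next character is "a".
next_is_a = re.compile(r"(?=a)")
# (?=.*a) lets .* skip ahead, so "a" may appear anywhere on the line.
has_a = re.compile(r"(?=.*a)")

print(bool(next_is_a.match("abc")))  # True
print(bool(next_is_a.match("bca")))  # False
print(bool(has_a.match("bca")))      # True
print(bool(has_a.match("b8@")))      # False

# A lookahead consumes nothing: the match is empty and ends at position 0.
print(has_a.match("bca").span())     # (0, 0)
```

Because the lookahead is zero-width, whatever follows it in the pattern still starts matching at the same position.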

So your regex means:

^               # start a match at the beginning of the string
(?=.*[a-z])     # should contain at least one a-z character
(?=.*[A-Z])     # should contain at least one A-Z character
(?=.*\d)        # should contain at least one digit
[a-zA-Z\d]{8,}  # consists of 8 or more characters, only A-Za-z0-9
$               # end the match at the end of the string
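As a rough illustration of the breakdown above, the pattern can be run against a few sample passwords. The helper is_valid and the sample strings are assumptions made for this sketch, not part of the original article.

```python
import re

PASSWORD_RE = re.compile(r"^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)[a-zA-Z\d]{8,}$")

def is_valid(password: str) -> bool:
    """Return True if the password satisfies every part of the pattern."""
    return PASSWORD_RE.match(password) is not None

samples = {
    "Passw0rd": "all three kinds present, 8 characters",
    "1passworD": "order of the three kinds does not matter",
    "password1": "no uppercase letter",
    "PASSWORD1": "no lowercase letter",
    "Password": "no digit",
    "Pw0rd": "too short",
    "Passw0rd!": "contains a character outside A-Za-z0-9",
}
for text, why in samples.items():
    print(f"{text!r:12} {is_valid(text)!s:5} ({why})")
```

Only the first two samples pass. Each of the others fails exactly one of the lookaheads or the final [a-zA-Z\d]{8,} part.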

Without the .*, there could never be a match, because:

"^(?=[a-z])(?=[A-Z])(?=\d)[a-zA-Z\d]{8,}$"

means:

^               # start a match at the beginning of the string
(?=[a-z])       # first character should be an a-z character
(?=[A-Z])       # first character should be an A-Z character
(?=\d)          # first character should be a digit
[a-zA-Z\d]{8,}  # consists of 8 or more characters, only A-Za-z0-9
$               # end the match at the end of the string

Since no single character is at once a lowercase letter, an uppercase letter and a digit, this can never be satisfied.
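One way to convince yourself is to try every ASCII letter and digit as the first character. This is only a sketch: the name NO_DOT_STAR and the "Passw0rd" tail are made up for the test.

```python
import re
import string

# The same pattern with .* removed from every lookahead.
NO_DOT_STAR = re.compile(r"^(?=[a-z])(?=[A-Z])(?=\d)[a-zA-Z\d]{8,}$")

# Put each possible first character in front of an otherwise valid tail.
candidates = [c + "Passw0rd" for c in string.ascii_letters + string.digits]
matches = [s for s in candidates if NO_DOT_STAR.match(s)]

print(len(candidates))  # 62
print(matches)          # []
```

All three lookaheads inspect the same first character, and each one rejects whatever the other two accept.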

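Side notes:

a lookahead consumes and captures nothing, so whether .* is greedy or lazy does not change the result;
by default the dot . does not match a newline (the re.DOTALL flag changes that);
even if it did, you still have the constraint ^[A-Za-z0-9]{8,}$, which only accepts letters and digits, so input with a newline in the middle is rejected. One caveat: in Python, $ also matches just before a single trailing newline, so use \Z or re.fullmatch if that case must be rejected too.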
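These notes can be checked with a small experiment. It compares a greedy and a lazy variant of the pattern and tries two inputs containing a newline. The names greedy and lazy and the sample strings are assumptions for this sketch. In Python, $ without re.MULTILINE also matches just before a single trailing newline, which is why re.fullmatch is shown next to re.match.

```python
import re

greedy = re.compile(r"^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)[a-zA-Z\d]{8,}$")
lazy = re.compile(r"^(?=.*?[a-z])(?=.*?[A-Z])(?=.*?\d)[a-zA-Z\d]{8,}$")

samples = ["Passw0rd", "1passworD", "password1", "Pass\nw0rd1", "Passw0rd\n"]
for s in samples:
    # Greedy and lazy lookaheads accept and reject exactly the same inputs.
    assert bool(greedy.match(s)) == bool(lazy.match(s))
    print(f"{s!r:14} match={bool(greedy.match(s))!s:5} "
          f"fullmatch={bool(greedy.fullmatch(s))}")
```

The input with a newline in the middle is rejected by both calls. "Passw0rd\n" is accepted by match but rejected by fullmatch, because fullmatch requires the pattern to cover the whole string, trailing newline included.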