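For several different regular expressions, I have found that the optional and conditional parts of a regex behave differently for the first match than for subsequent matches. I am using Python, but I have found that this holds generally.
Here are two similar examples that illustrate the problem:
First example:
Expression: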
(?:\w. )?([^,.]*).*(\d{4}\w?)
Text:
J. wang Wang, X. Liu, and A. A. Chien. Empirical Study of Tolerating \nDenial-of-Service Attacks with a Proxy Network. In Proceedings of the USENIX Security Symposium, 2002.
R. wang Wang, X. Liu, and A. A. Chien. Empirical Study of Tolerating \nDenial-of-Service Attacks with a Proxy Network. In Proceedings of the USENIX Security Symposium, 2002.
Matches:
Match 1
1. wang Wang
2. 2002
Match 2
1. R
2. 2002
Second example:
Expression:
((?:\w\. )?[^,.]*).*(\d{4}\w?)
Text:
J. wang Wang, X. Liu, and A. A. Chien. Empirical Study of Tolerating \nDenial-of-Service Attacks with a Proxy Network. In Proceedings of the USENIX Security Symposium, 2002.
R. wang Wang, X. Liu, and A. A. Chien. Empirical Study of Tolerating \nDenial-of-Service Attacks with a Proxy Network. In Proceedings of the USENIX Security Symposium, 2002.
Matches:
Match 1
1. J. wang Wang
2. 2002
Match 2
1. R
2. 2002
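A minimal sketch for reproducing both examples above in Python. It assumes that each citation sits on one line, that the two citations are separated by a single real newline, and that the `\n` inside each citation is a literal backslash followed by `n`. The variable names (`first`, `second`, `text`, `patterns`, `results`) are only for illustration.

```python
import re

# Assumption: the "\n" inside each citation is a literal backslash + "n",
# so raw strings are used; only the separator between citations is a newline.
first = (r"J. wang Wang, X. Liu, and A. A. Chien. Empirical Study of Tolerating "
         r"\nDenial-of-Service Attacks with a Proxy Network. In Proceedings of "
         r"the USENIX Security Symposium, 2002.")
second = "R" + first[1:]          # same citation with the leading initial changed
text = first + "\n" + second

patterns = {
    "example 1": r"(?:\w. )?([^,.]*).*(\d{4}\w?)",
    "example 2": r"((?:\w\. )?[^,.]*).*(\d{4}\w?)",
}

# findall returns one tuple of captured groups per match.
results = {name: re.findall(p, text) for name, p in patterns.items()}

for name, matches in results.items():
    for i, groups in enumerate(matches, start=1):
        # Printing the tuple uses repr(), so captured whitespace is visible.
        print(name, "match", i, groups)

# Under the assumptions above this prints:
# example 1 match 1 ('wang Wang', '2002')
# example 1 match 2 ('\nR', '2002')
# example 2 match 1 ('J. wang Wang', '2002')
# example 2 match 2 ('\nR', '2002')
```

Printing the group tuples rather than the bare strings shows any whitespace captured in group 1, which a plain listing of the matches hides.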
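What am I missing?
I would expect different behavior: I think the matches should be consistent. This is what I think the result should be (though I do not yet understand why it is not):
Example 1
Match 1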
1. wang Wang
2. 2002
Match 2
1. wang Wang
2. 2002
Example 2
Match 1
1. J. wang Wang
2. 2002
Match 2
1. R. wang Wang
2. 2002