Get the string between two strings

8
<p>I'd like to find the string between the two paragraph tags.</p><br><p>And also this string</p>

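How do I get the string between the first pair of paragraph tags? And then, how do I get the string between the second pair of paragraph tags?
3 Answers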

15

Regular expressions

import re
matches = re.findall(r'<p>.+?</p>',string)
Here is what it looks like when run on your text in the console.
>>> import re
>>> string = """<p>I'd like to find the string between the two paragraph tags.</p><br><p>And also this string</p>"""
>>> re.findall(r'<p>.+?</p>', string)
["<p>I'd like to find the string between the two paragraph tags.</p>", '<p>And also this string</p>']
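As a side note on why the pattern uses the non-greedy `.+?` rather than `.+`, here is a minimal sketch on the same text. The variable names `greedy` and `lazy` are only for illustration.

```python
import re

string = """<p>I'd like to find the string between the two paragraph tags.</p><br><p>And also this string</p>"""

# Greedy: .+ runs to the LAST </p>, so both paragraphs end up in one match.
greedy = re.findall(r'<p>.+</p>', string)

# Non-greedy: .+? stops at the FIRST </p> it can, giving one match per paragraph.
lazy = re.findall(r'<p>.+?</p>', string)

print(len(greedy))  # 1
print(len(lazy))    # 2
```

With the greedy version the single match is the whole input, which is usually not what you want here.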

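I've upvoted you, but please fix the typo in the first snippet. It should be </p> - gonczor
This may sound dumb, but why doesn't this work? split = re.findall('<p>.+?</p>',content)......second = split[1]......It gives me an "index out of range" error. How do I get the second element? - Zorgan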
@Zorgan, did you change the content? It works fine in my console: >>> split = re.findall('<p>(.+?)</p>',string) >>> split[1] 'And also this string' - Isdj
Yes, it was my content, and I've fixed it. Thanks. - Zorgan
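For anyone hitting the same "index out of range" error: `re.findall` returns a list with one entry per match, so indexing `[1]` fails whenever the content has fewer than two matches. A minimal sketch of a guarded lookup follows. The helper name `nth_paragraph` is hypothetical, not something from the answers above.

```python
import re

def nth_paragraph(content, n):
    """Return the text of the n-th <p>...</p> pair (0-based), or None if there is no such match."""
    found = re.findall(r'<p>(.+?)</p>', content)
    # Check the length first instead of letting found[n] raise IndexError.
    return found[n] if 0 <= n < len(found) else None

content = "<p>I'd like to find the string between the two paragraph tags.</p><br><p>And also this string</p>"
print(nth_paragraph(content, 0))
print(nth_paragraph(content, 1))
print(nth_paragraph("no paragraph tags here", 1))  # None
```

Returning None is just one possible design choice. Raising a clearer error would work equally well.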

13
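If you want the strings between the p tags (excluding the p tags themselves), add parentheses around .+? in the findall call.
import re
string = """<p>I'd like to find the string between the two paragraph tags.</p><br><p>And also this string</p>"""
# The parentheses form a capture group, so findall returns only the text inside the tags
subStr = re.findall(r'<p>(.+?)</p>', string)
print(subStr)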

Result

["I'd like to find the string between the two paragraph tags.", 'And also this string']
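One caveat worth knowing, shown as a sketch on made-up input (the `text` value below is an assumption, not the asker's data): by default `.` does not match a newline, so a paragraph that spans several lines is skipped unless you pass `re.DOTALL`.

```python
import re

# Hypothetical input where the first paragraph contains a line break.
text = "<p>first\nparagraph</p><p>second</p>"

# Default behaviour: '.' stops at '\n', so the first paragraph is not matched.
without_flag = re.findall(r'<p>(.+?)</p>', text)

# With re.DOTALL, '.' also matches newlines and both paragraphs are found.
with_flag = re.findall(r'<p>(.+?)</p>', text, flags=re.DOTALL)

print(without_flag)  # ['second']
print(with_flag)     # ['first\nparagraph', 'second']
```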

Great. Exactly what I needed. - JayJay123

0

Between <p></p>

In [7]: content = "<p>I'd like to find the string between the two paragraph tags.</p><br><p>And also this string</p>"

In [8]: re.findall(r'<p>(.+?)</p>', content)
Out[8]: 
["I'd like to find the string between the two paragraph tags.",
 'And also this string']
