Difference between "findAll" and "find_all" in BeautifulSoup

Question

python xml-parsing html-parsing beautifulsoup

44

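I want to parse an HTML file with Python, and the module I am using is BeautifulSoup.

It is said that the find_all function is the same as findAll. I have tried both, but I believe they are different: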

import urllib, urllib2, cookielib
from BeautifulSoup import *
site = "http://share.dmhy.org/topics/list?keyword=TARI+TARI+team_id%3A407"

rqstr = urllib2.Request(site)
rq = urllib2.urlopen(rqstr)
fchData = rq.read()

soup = BeautifulSoup(fchData)

t = soup.findAll('tr')

Can anyone tell me the difference?

- Oberon

2

Which version of BeautifulSoup are you using? If it is supposed to be BS4, the import statement should be from bs4 import BeautifulSoup. See http://www.crummy.com/software/BeautifulSoup/bs4/doc/#porting-code-to-bs4 - marchelbling

1

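What is the difference? I mean, you say you tried both and saw a difference. Could you post some output that shows the different behaviour? Or are you asking why there are two ways to do the same thing? If that is the case, Martijn Pieters is right. - Bakuriu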

find_all: it cannot be found. findAll: it finds several sections of HTML code. - Oberon

2 Answers

11

From the source code of BeautifulSoup:

http://bazaar.launchpad.net/~leonardr/beautifulsoup/bs4/view/head:/bs4/element.py#L1260

def find_all(self, name=None, attrs={}, recursive=True, text=None,
             limit=None, **kwargs):
    # ...
    # ...

findAll = find_all       # BS3
findChildren = find_all  # BS2

- kmonsoor

- Martijn Pieters · Accepted Answer

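In BeautifulSoup version 4, the methods are exactly the same; the mixed-case versions (findAll, findAllNext, nextSibling, etc.) have all been renamed to conform to the Python style guide, but the old names are still available to make porting easier. See Method names for a full list.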

In new code you should use the lowercase versions, so find_all, etc.
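As a quick check of the claim above, here is a minimal sketch assuming BeautifulSoup 4 is installed (pip install beautifulsoup4) and using the standard library's html.parser backend. The HTML snippet is made up for illustration and is not the page from the question. Recent bs4 releases may emit a deprecation warning for the old mixed-case names.

```python
from bs4 import BeautifulSoup

# Made-up sample markup (an assumption, not the page from the question).
html = "<table><tr><td>a</td></tr><tr><td>b</td></tr></table>"
soup = BeautifulSoup(html, "html.parser")

rows_new = soup.find_all("tr")  # preferred lowercase name
rows_old = soup.findAll("tr")   # legacy BS3 name, kept for porting

print(len(rows_new))         # number of <tr> tags found
print(rows_new == rows_old)  # both names return the same matches
```

Both calls should return the same two rows, so the only reason to keep findAll is compatibility with code written for version 3.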

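But in your example you are using BeautifulSoup version 3 (discontinued since March 2012; don't use it if you can help it), where only findAll() is available. Unknown attribute names (such as .find_all, which exists only in BeautifulSoup 4) are treated as a search for a tag with that name. There is no <find_all> tag in your document, so None is returned.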
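The fallback described above can be sketched with a toy class. This is not BeautifulSoup's actual code: ToySoup and its flat list of tag names are hypothetical simplifications, used only to show how an attribute fallback turns an unknown method name into a tag lookup that returns None.

```python
class ToySoup:
    """Toy stand-in for a parsed document; not BeautifulSoup's real code."""

    def __init__(self, tags):
        # tags: tag names in document order (a hypothetical simplification)
        self.tags = tags

    def findAll(self, name):
        # The only search method this toy "version 3" class defines.
        return [t for t in self.tags if t == name]

    def __getattr__(self, name):
        # Python calls this only when normal attribute lookup fails.
        # The unknown name is treated as a tag name: first match or None.
        matches = self.findAll(name)
        return matches[0] if matches else None


soup = ToySoup(["table", "tr", "tr"])
print(soup.findAll("tr"))  # a real method, so it runs normally
print(soup.tr)             # attribute treated as a tag lookup
print(soup.find_all)       # no such tag, so the lookup yields None

try:
    soup.find_all("tr")    # this is None("tr"), which cannot be called
except TypeError as exc:
    error = str(exc)
    print(error)
```

Under this model, soup.find_all evaluates to None rather than raising AttributeError, so the failure only appears when the result is called.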