Highest-voted 'beautifulsoup' questions - page 5

70 votes, 4 answers

Extracting the "src" attribute from an "img" tag using Beautiful Soup

Consider: <div class="someClass"> <a href="href"> <img alt="some" src="some"/> ...

python regex beautifulsoup
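
A minimal sketch of one way to read the attribute, assuming bs4 is installed and the built-in "html.parser" is used. The markup is a hypothetical completion of the truncated excerpt above, and its attribute values are placeholders.

```python
from bs4 import BeautifulSoup

# Hypothetical markup modelled on the excerpt; the real page is truncated.
html = '<div class="someClass"><a href="href"><img alt="some" src="some"/></a></div>'

soup = BeautifulSoup(html, "html.parser")

# src=True keeps only <img> tags that actually carry a src attribute,
# so the subscript access below cannot raise KeyError.
sources = [img["src"] for img in soup.find_all("img", src=True)]
print(sources)
```

No regular expression is needed here; the parser exposes attributes through dictionary-style access on each tag.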

68 votes, 5 answers

Getting the content attribute of a meta tag with BeautifulSoup and Python

I am trying to use Python and Beautiful Soup to extract the content part of the following tags: <meta property="og:title" content="Super Fun Event 1" /> <meta property="og:url" content="http:...

python html web-scraping beautifulsoup
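
A short sketch of how the og:title value could be read, assuming bs4 with the built-in "html.parser". Only the first meta tag from the excerpt is reproduced, because the second one is truncated in the source.

```python
from bs4 import BeautifulSoup

# Only the complete tag from the excerpt is used; the <head> wrapper is an assumption.
html = '<head><meta property="og:title" content="Super Fun Event 1" /></head>'

soup = BeautifulSoup(html, "html.parser")

# Match on the property attribute, then read content if the tag exists.
tag = soup.find("meta", attrs={"property": "og:title"})
title = tag["content"] if tag is not None else None
print(title)
```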

67 votes, 8 answers

BeautifulSoup, html5lib: 'module' object has no attribute '_base'

After updating my packages I got this new error: class TreeBuilderForHtml5lib(html5lib.treebuilders._base.TreeBuilder): AttributeError: 'module' object has no attribute '_base' I tried...

beautifulsoup html5lib

66 votes, 4 answers

Extracting text that is not inside a tag using BeautifulSoup

My web page looks like this: YOB: 1987 RACE: WHI...

python web-scraping beautifulsoup

65 votes, 3 answers

How to write the output to an HTML file with Python BeautifulSoup

I removed some tags with BeautifulSoup and now I want to write the result back to an HTML file. My code: from bs4 import BeautifulSoup from bs4 import Comment soup = BeautifulSoup(open('1.html'),"h...

python html beautifulsoup
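
A hedged sketch of the write-back step, assuming bs4 with the built-in "html.parser". The input markup, the comment-stripping step and the output path are all illustrative assumptions, since the asker's code is truncated; a temporary directory stands in for the real destination.

```python
import os
import tempfile

from bs4 import BeautifulSoup, Comment

# Hypothetical input; the asker's 1.html is not shown in the excerpt.
html = "<html><body><!-- note --><p>Kept</p></body></html>"
soup = BeautifulSoup(html, "html.parser")

# Drop HTML comments as an example of "removing something" before saving.
for comment in soup.find_all(string=lambda s: isinstance(s, Comment)):
    comment.extract()

# Serialise the tree with str() and write it out with an explicit encoding.
out_path = os.path.join(tempfile.mkdtemp(), "output.html")
with open(out_path, "w", encoding="utf-8") as fh:
    fh.write(str(soup))
```

Using str(soup) keeps the markup compact; soup.prettify() is an alternative when indented output is preferred.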

64 votes, 12 answers

Removing a tag with BeautifulSoup but keeping its contents

Currently I have code that looks roughly like this: soup = BeautifulSoup(value) for tag in soup.findAll(True): if tag.name not in VALID_TAGS: tag.extract() soup.renderCont...

python beautifulsoup
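
In the excerpt, tag.extract() removes a tag together with everything inside it. A minimal sketch of one alternative, assuming bs4 (version 4, where unwrap() is available) and the built-in "html.parser": unwrap() replaces a tag with its own children. The contents of VALID_TAGS and the sample markup are hypothetical.

```python
from bs4 import BeautifulSoup

VALID_TAGS = {"p", "strong"}  # hypothetical whitelist; the original list is not shown

# Hypothetical input with two tags that are not on the whitelist.
html = "<p>Hello <span>big <em>wide</em></span> <strong>world</strong></p>"
soup = BeautifulSoup(html, "html.parser")

# find_all(True) returns a list of every tag, so modifying the tree
# while looping over that list is safe.
for tag in soup.find_all(True):
    if tag.name not in VALID_TAGS:
        tag.unwrap()  # remove the tag itself, keep its children in place

cleaned = str(soup)
print(cleaned)
```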

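63 votes, 7 answers

Parsing HTML in Python: lxml or BeautifulSoup? Which is better in which situations?

As far as I understand, the two main HTML parsing libraries in Python are lxml and BeautifulSoup. I chose BeautifulSoup for the project I am working on, but only because I found its syntax easier to learn and understand, not for any particular reason. Even so, I see that many people seem to prefer lxml, and I have heard that lxml is faster. So, compared with the other, one library...

python beautifulsoup html-parsing lxml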

60 votes, 2 answers

Converting rendered HTML to plain text with Python

I am trying to convert a chunk of HTML text with BeautifulSoup. Here is an example: <div> Some text more text even more te...

python beautifulsoup
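
A rough sketch of plain-text extraction, assuming bs4 with the built-in "html.parser". The markup is hypothetical because the asker's example is truncated, and this approach only concatenates text nodes; it does not reproduce how a browser would lay out the rendered page.

```python
from bs4 import BeautifulSoup

# Hypothetical input; the inner tags of the original example are not shown.
html = "<div><p>Some text</p><p>more text</p></div>"
soup = BeautifulSoup(html, "html.parser")

# strip=True trims each text node and skips empty ones;
# separator controls what is placed between the remaining pieces.
text = soup.get_text(separator=" ", strip=True)
print(text)
```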

60 votes, 4 answers

Beautifulsoup - nextSibling

I am trying to get the content "My home address" with the following code, but I get an AttributeError: address = soup.find(text="Address:") print address.nextSibling This is my HTML: <td>Ad...

python beautifulsoup
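
A sketch of one way to reach the neighbouring cell, assuming bs4 (4.4 or later, for the string= argument) with the built-in "html.parser". The table markup is a guess, since the asker's HTML is truncated. The idea is that a matched text node may have no useful sibling of its own, so the search moves forward in document order to the next <td> instead.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the original HTML is cut off after "<td>Ad".
html = "<table><tr><td><b>Address:</b></td><td>My home address</td></tr></table>"
soup = BeautifulSoup(html, "html.parser")

# find(string=...) returns the text node itself, or None when nothing matches.
label = soup.find(string="Address:")

# Guard against None before navigating, then take the next <td> in the document.
value_cell = label.find_next("td") if label is not None else None
address = value_cell.get_text(strip=True) if value_cell is not None else None
print(address)
```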

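60 votes, 2 answers

How do I get the links within a specific class using CSS selectors and BeautifulSoup?

I am new to Python and am learning to use it for scraping. I am using BeautifulSoup to collect links (i.e. the href of 'a' tags). I am trying to collect the links under the "Upcoming Events" tab on http://allevents.in/lahore/. I am using Firebug to inspect the elements and get the CSS path, but this code doesn't...

python css css-selectors beautifulsoup firebug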
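
A minimal sketch of selecting links by CSS selector, assuming bs4 with the built-in "html.parser". The class names and URLs are invented for illustration and are not the real markup of the site mentioned in the question.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; "upcoming-events" and "past-events" are made-up class names.
html = """
<div class="upcoming-events">
  <a href="/event/1">Event one</a>
  <a href="/event/2">Event two</a>
</div>
<div class="past-events"><a href="/event/0">Old event</a></div>
"""
soup = BeautifulSoup(html, "html.parser")

# select() takes a CSS selector: anchors with an href inside the chosen container.
links = [a["href"] for a in soup.select("div.upcoming-events a[href]")]
print(links)
```

A selector copied from a browser inspector is often longer and more brittle than needed; a short selector anchored on one stable class tends to survive small layout changes better.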