Top-voted 'beautifulsoup' questions - page 7

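50 votes, 2 answers

Beautiful Soup findAll doesn't find everything

I'm trying to parse a website and extract some information with the find_all() method, but it doesn't find all of it. Here is the code: #!/usr/bin/python3 from bs4 import BeautifulSoup from urllib.request import urlopen page = urlope...

python html python-3.x beautifulsoup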

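48 votes, 8 answers

How to find a tag containing particular text with Beautiful Soup?

How can I find the text I'm looking for in the following HTML (line breaks are marked with \n)? ... <tr> <td class="pos">\n "Some text:"\n <br>\n <stro...

python html web-scraping beautifulsoup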

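48 votes, 4 answers

Extract the content within a tag with BeautifulSoup

I'd like to extract the content Hello world. Note that the page contains multiple <table> elements and similar <td colspan="2"> cells: <table border="0" cellspacing="2" width="800"> <tr> <td co...

python beautifulsoup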

48 votes, 3 answers

Beautiful Soup: 'ResultSet' object has no attribute 'find_all'?

I'm trying to scrape a simple table with Beautiful Soup. Here is my code: import requests from bs4 import BeautifulSoup url = 'https://gist.githubusercontent.com/anonymous/c8eedd8bf41...

python beautifulsoup
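
A common cause of this error is calling find_all on the list-like ResultSet returned by an earlier find_all, rather than on each Tag inside it. The following is a minimal sketch under that assumption. The two-row table is invented for illustration, since the question's own data sits behind a truncated URL and is not reproduced here.

```python
from bs4 import BeautifulSoup

# Hypothetical table, invented for illustration only.
html = "<table><tr><td>a</td><td>b</td></tr><tr><td>c</td><td>d</td></tr></table>"
soup = BeautifulSoup(html, "html.parser")

rows = soup.find_all("tr")  # a ResultSet: list-like, not a single Tag

# Calling find_all on the ResultSet itself fails with AttributeError.
try:
    rows.find_all("td")
    raised = False
except AttributeError:
    raised = True

# Iterating over the ResultSet and searching each Tag works.
cells = [[td.get_text() for td in row.find_all("td")] for row in rows]
```

If only one element is expected, find() returns a single Tag (or None), which can then be searched directly.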

47 votes, 3 answers

Is there anything similar to InnerText in BeautifulSoup?

With the following code: soup = BeautifulSoup(page.read(), fromEncoding="utf-8") result = soup.find('div', {'class' :'flagPageTitle'}) I get the following HTML: <div id="ctl00_C...

python beautifulsoup

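47 votes, 1 answer

Beautiful Soup: find the children of a particular div

I'm trying to parse this web page with Python and Beautiful Soup: I want to extract the content of the highlighted td div. Currently I can get all the divs with: alltd = soup.findAll('td') for td in alltd: print td But I'm trying to narrow the scope to...

python parsing beautifulsoup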

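47 votes, 3 answers

What is the difference between ".string" and ".text" in BeautifulSoup?

I noticed something odd when working with BeautifulSoup and couldn't find any documentation that explains it, so I'm asking here. Suppose we have parsed tags like the following with BS: <td>Some Table Data</td> <td></td> Extracting the da...

python beautifulsoup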
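
The difference can be seen on the two cells quoted in the excerpt. This is a small sketch assuming bs4 with the built-in "html.parser". The third cell, with nested markup, is added here for illustration and is not part of the question.

```python
from bs4 import BeautifulSoup

# First two cells come from the excerpt; the third is a hypothetical addition.
html = "<td>Some Table Data</td><td></td><td>a<b>b</b></td>"
soup = BeautifulSoup(html, "html.parser")
first, empty, mixed = soup.find_all("td")

# .string yields the single string child, or None when the tag has
# no children or more than one child.
first_string = first.string
empty_string = empty.string
mixed_string = mixed.string

# .text (equivalent to get_text()) always returns a str, concatenating
# every descendant string.
first_text = first.text
empty_text = empty.text
mixed_text = mixed.text
```

Here the empty cell gives None for .string but an empty string for .text, which appears to be the kind of discrepancy the question describes.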

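46 votes, 6 answers

Scrape tables into a dataframe with BeautifulSoup

I'm trying to scrape data from a coin catalog. This is one of the pages that needs to be scraped. I need to get the data into a DataFrame. So far I have the following code: import bs4 as bs import urllib.request import pandas as pd source = urllib.r...

pandas dataframe web-scraping beautifulsoup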

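44 votes, 2 answers

Difference between "findAll" and "find_all" in BeautifulSoup

I'd like to parse an HTML file with Python, and the module I'm using is BeautifulSoup. Some people say the find_all function and the findAll function are the same. I've tried both, but I believe they are different: import urllib, urllib2, cookielib from BeautifulSoup i...

python xml-parsing html-parsing beautifulsoup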
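
In the bs4 package, findAll is kept as a legacy alias of find_all, so the two should return the same results there. The import in the excerpt (from BeautifulSoup ...) suggests the older BeautifulSoup 3 package, which may explain the difference the asker saw; that reading is an assumption. A minimal sketch for bs4, using made-up markup:

```python
import warnings

from bs4 import BeautifulSoup

# Hypothetical markup, invented for illustration only.
soup = BeautifulSoup('<a href="/x">x</a><a href="/y">y</a>', "html.parser")

# Recent bs4 releases may emit a DeprecationWarning for the camelCase name,
# so it is silenced here to keep the comparison quiet.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", DeprecationWarning)
    legacy = [a["href"] for a in soup.findAll("a")]

modern = [a["href"] for a in soup.find_all("a")]
```

For new code written against bs4, find_all is the spelling to prefer.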

43 votes, 3 answers

Beautiful Soup: use class "contains" or a regex?

If my class names vary a lot, for example: listing-col-line-3-11 dpt 41 listing-col-block-1-22 dpt 41 listing-col-line-4-13 CWK 12 Normally I could do: for EachPart in soup.find_all("div", ...

python regex web-scraping beautifulsoup
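
One way to match varying class names like these is to pass a compiled regular expression as the class_ filter, which bs4 tests against each individual class on the element. The sketch below assumes the three class strings from the excerpt sit on div elements; the element contents and the extra "sidebar" div are made up for illustration.

```python
import re

from bs4 import BeautifulSoup

# Class strings are from the excerpt; text content and the last div are hypothetical.
html = (
    '<div class="listing-col-line-3-11 dpt 41">A</div>'
    '<div class="listing-col-block-1-22 dpt 41">B</div>'
    '<div class="listing-col-line-4-13 CWK 12">C</div>'
    '<div class="sidebar">D</div>'
)
soup = BeautifulSoup(html, "html.parser")

# Keep every div that has at least one class starting with "listing-col-".
matches = soup.find_all("div", class_=re.compile(r"^listing-col-"))
texts = [div.get_text() for div in matches]
```

Narrowing the pattern, for example to r"^listing-col-line-", would select only the "line" variants.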