Highest-voted 'beautifulsoup' questions

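1495 votes 34 answers

UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in position 20: ordinal not in range(128)

I'm having problems dealing with Unicode characters in text fetched from different web pages (on different sites). I am using BeautifulSoup. The problem is that the error is not always reproducible; it sometimes works with some pages, and sometimes it fails by throwing a UnicodeEncodeError. I have tried just about everything I can think of, and yet I haven't found anything...

python unicode beautifulsoup python-2.x python-unicode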
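
A minimal sketch of how this kind of error can arise, written for Python 3 and assuming the scraped text contains a non-breaking space (U+00A0). The sample string is hypothetical. Encoding explicitly to UTF-8, rather than relying on an ASCII codec, is one common way to avoid the exception.

```python
# Hypothetical sample: scraped text containing a non-breaking space (U+00A0).
text = "price:\xa0100"

# Encoding to ASCII fails because U+00A0 lies outside the 0-127 range.
try:
    text.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Encoding explicitly to UTF-8 succeeds; U+00A0 becomes the bytes C2 A0.
encoded = text.encode("utf-8")
```

Because the failure depends on whether a given page happens to contain non-ASCII characters, it would appear on some pages and not on others, which matches the intermittent behaviour described.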

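655 votes 19 answers

How to find elements by class

I'm having trouble parsing HTML elements with the "class" attribute using BeautifulSoup. The code looks like this: soup = BeautifulSoup(sdata) mydivs = soup.findAll('div') for div in mydivs: if (div["class"] ...

python html web-scraping beautifulsoup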
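
A hedged sketch of one way to select elements by class with bs4, assuming the `html.parser` backend. The markup and the class name `item` are hypothetical, since the original snippet is truncated. Indexing `div["class"]` raises `KeyError` on a div that has no class attribute, whereas filtering with `class_` skips such elements.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: one div has no class attribute at all.
html = (
    '<div class="item">a</div>'
    '<div>b</div>'
    '<div class="other item">c</div>'
)
soup = BeautifulSoup(html, "html.parser")

# class_ matches any div whose class list contains "item".
matches = soup.find_all("div", class_="item")
texts = [div.get_text() for div in matches]

# .get() returns None instead of raising when the attribute is absent.
classes_of_second = soup.find_all("div")[1].get("class")
```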

524 votes 12 answers

[Question title missing from source]

I'm trying to scrape a website, but it gives me an error. I'm using the following code: import urllib.request from bs4 import BeautifulSoup get = urllib.request.urlopen("https://www.website.com/") ht...

python beautifulsoup file-io urllib

428 votes 22 answers

bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: lxml. Do you need to install a parser library?

... soup = BeautifulSoup(html, "lxml") File "/Library/Python/2.7/site-packages/bs4/__init__.py", line 152, in __init__ % ",".join(features)) bs4.Fe...

python python-2.7 beautifulsoup lxml

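360 votes 16 answers

How to remove \xa0 from a string in Python?

I am currently using BeautifulSoup to parse an HTML file and calling get_text(), but it seems I'm left with a lot of \xa0 Unicode characters representing spaces. Is there an efficient way to remove all of them in Python 2.7 and change them into spaces? I guess the more general question is whether there is a way to remove Unicode formatting. I...

python python-2.7 unicode beautifulsoup utf-8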
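
Two possible approaches, sketched for Python 3 with a hypothetical sample string. A plain `str.replace` targets only U+00A0, while `unicodedata.normalize("NFKC", ...)` applies compatibility normalization, which also rewrites other compatibility characters and so may change more than intended.

```python
import unicodedata

# Hypothetical sample resembling get_text() output with non-breaking spaces.
raw = "Hello\xa0world\xa0!"

# Option 1: replace only the non-breaking space with a regular space.
replaced = raw.replace("\xa0", " ")

# Option 2: NFKC normalization maps U+00A0 to U+0020, among other changes.
normalized = unicodedata.normalize("NFKC", raw)
```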

340 votes 1 answer

BeautifulSoup getting href

I have the following soup: <a href="some_url">next</a> <span class="class">...</span> I want to extract the href attribute, whose value is "some_url"...

python tags beautifulsoup
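
A short sketch of reading the attribute with bs4, assuming the `html.parser` backend and using the markup from the excerpt as given. Tag attributes can be read with dictionary-style indexing, or with `.get()` when the attribute may be absent.

```python
from bs4 import BeautifulSoup

html = '<a href="some_url">next</a> <span class="class">...</span>'
soup = BeautifulSoup(html, "html.parser")

# Find the first <a> tag and read its href attribute.
link = soup.find("a")
href = link["href"]

# .get() returns None rather than raising if the attribute is missing.
missing = link.get("title")
```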

284 votes 27 answers

Scraping: SSL certificate verification failed error for http://en.wikipedia.org

I'm practicing the code from 'Web Scraping with Python', and I keep having this certificate problem: from ur...

python web-scraping beautifulsoup scrapy ssl-certificate

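227 votes 5 answers

TypeError: a bytes-like object is required, not 'str' in Python and CSV

I get the above error when executing the following Python code to save HTML table data to a CSV file. How do I get rid of this error? "TypeError: a bytes-like object is required, not 'str'" import csv import requests from bs4 import BeautifulSoup url='...

python beautifulsoup html-table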
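
A hedged sketch of one common cause, assuming the output file was opened in binary mode (`"wb"`), which the truncated excerpt does not confirm. In Python 3 the `csv` module writes `str`, so the file should be opened in text mode with `newline=""`. The rows and the file name below are hypothetical.

```python
import csv
import os
import tempfile

# Hypothetical table data standing in for the scraped HTML table.
rows = [["name", "value"], ["alpha", "1"]]

path = os.path.join(tempfile.mkdtemp(), "table.csv")

# Text mode with newline="" is what csv expects in Python 3;
# opening the file with "wb" would raise the TypeError above.
with open(path, "w", newline="", encoding="utf-8") as f:
    csv.writer(f).writerows(rows)

# Read the file back to confirm the round trip.
with open(path, newline="", encoding="utf-8") as f:
    read_back = list(csv.reader(f))
```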

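213 votes 13 answers

Beautiful Soup and extracting a div and its contents by ID

soup.find("tagName", { "id" : "articlebody" }) Why does this not return the <div id="articlebody"> ... </div> tags and the content in between? It returns nothing. I know it exists because I'm looking right at it. soup....

python beautifulsoup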
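
A minimal sketch of looking up a div by ID with bs4, assuming the `html.parser` backend and hypothetical markup. It also shows that passing a tag name that matches no element, such as the literal string "tagName", yields `None`. This is only one possible explanation; the excerpt is too truncated to tell what actually went wrong in the asker's case.

```python
from bs4 import BeautifulSoup

# Hypothetical markup containing the div in question.
html = '<div id="articlebody"><p>Body text</p></div>'
soup = BeautifulSoup(html, "html.parser")

# Two equivalent ways to match on the id attribute.
by_kwarg = soup.find("div", id="articlebody")
by_attrs = soup.find("div", {"id": "articlebody"})

# A tag name that does not occur in the document returns None.
by_wrong_name = soup.find("tagName", {"id": "articlebody"})

body_text = by_kwarg.get_text()
```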

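213 votes 10 answers

Extracting an attribute value with BeautifulSoup

I am trying to extract the content of a single "value" attribute in a specific "input" tag on a webpage. I use the following code: import urllib f = urllib.urlopen("http://58.68.130.147") s = f.rea...

python parsing attributes beautifulsoup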