How to fix the "Unicode strings with encoding declaration are not supported" error

Question

9

ValueError: Unicode strings with encoding declaration are not supported. 
Please use bytes input or XML fragments without declaration.

When I try to parse this website, it does not work.

When I try to serialize the page text, I get the error TypeError: Type 'str' cannot be serialized.

import requests
from lxml import html

source = 'http://games.chruker.dk/eve_online/item.php?type_id=814'
path = '//*[@id="top"]/table[1]/tbody/tr[1]/td[3]/table'

page = requests.get(source)
pagetext = page.text

parser = html.fromstring(pagetext)

result = parser.xpath(path)
print(result)

I want to extract the requirements table from this page: http://games.chruker.dk/eve_online/item.php?type_id=814

- Cre3d

1

Why not simply use element = html.parse(source)? The html.parse method accepts a URL like the one in your source variable as input. - Martin Honnen

2 Answers

2

The parse function provided by the API lets you pass in a URL directly, like the one you have in your source variable:

from lxml import html

source = 'http://games.chruker.dk/eve_online/item.php?type_id=814'
path = '//*[@id="top"]/table[1]/tbody/tr[1]/td[3]/table'

tree = html.parse(source)

result = tree.xpath(path)

print(result)
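
For trying this out without network access, html.parse also accepts a file-like object. The sketch below is only an illustration: page_bytes is a made-up document standing in for the real page, and the XPath is shortened to match it.

```python
import io
from lxml import html

# Hypothetical page bytes standing in for the document at `source`.
page_bytes = (
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    b'<html><head><title>Item</title></head>'
    b'<body><div id="top"><table><tr><td>Requirements</td></tr></table></div>'
    b'</body></html>'
)

# html.parse reads from a source (filename, URL, or file-like object)
# and returns an ElementTree rather than a single element.
tree = html.parse(io.BytesIO(page_bytes))

tables = tree.xpath('//*[@id="top"]/table[1]')
first_cell = tables[0].xpath('.//td/text()')
print(first_cell)
```

Because the parser receives bytes here, it can apply the declared encoding itself, so the str-related ValueError does not come into play.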

- Martin Honnen

- Sagar Gupta · Accepted Answer

Try this:

parser = html.fromstring(bytes(pagetext, encoding='utf8'))
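
A minimal sketch of why this helps, assuming the page text starts with an XML declaration that names an encoding. The pagetext value below is a made-up stand-in for page.text, not the real page.

```python
from lxml import html

# Hypothetical stand-in for page.text: a decoded str that begins with an
# XML declaration naming an encoding.
pagetext = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<html><head><title>Item</title></head>'
    '<body><table id="req"><tr><td>Caldari Frigate</td></tr></table></body></html>'
)

try:
    html.fromstring(pagetext)      # str input carrying an encoding declaration
    str_error = None
except ValueError as exc:
    str_error = str(exc)           # the message quoted in the question, if raised

# Passing bytes lets the parser apply the declared encoding itself.
root = html.fromstring(pagetext.encode('utf-8'))
cells = root.xpath('//table[@id="req"]//td/text()')

print(str_error)
print(cells)
```

With requests, page.content already holds the undecoded response bytes, so html.fromstring(page.content) is an alternative to re-encoding page.text.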