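I am learning to scrape data using Nathan Yau's book Visualize This. I am trying to scrape Wunderground data for 2009, but I get the error below. It says something is out of range, but I don't know why.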
line 43, in <module>
f.write(timestamp + ',' + dayTemp + '\n')
UnicodeEncodeError: 'ascii' codec can't encode character u'\xb0' in position 11: ordinal not in range(128)
Here is my code:
import sys
import urllib2
from bs4 import BeautifulSoup as BS
# Create/open a file called wunder.txt (which will be a comma-delimited file)
f = open('wunder-data.txt', 'w')
# Iterate through months and day
for m in range(1, 13):
    for d in range(1, 32):
        # Check if already gone through month
        if (m == 2 and d > 28):
            break
        elif (m in [4, 6, 9, 11] and d > 30):
            break
        # Open wunderground.com url
        url = "http://www.wunderground.com/history/airport/KBUF/2009/" + str(m) + "/" + str(d) + "/DailyHistory.html"
        page = urllib2.urlopen(url)
        # Get temperature from page
        soup = BS(page,"html.parser")
        # dayTemp = soup.body.nobr.b.string
        dayTemp = soup.find("span", text="Mean Temperature").parent.find_next_sibling("td").get_text(strip=True)
        # Format month for timestamp
        if len(str(m)) < 2:
            mStamp = '0' + str(m)
        else:
            mStamp = str(m)
        # Format day for timestamp
        if len(str(d)) < 2:
            dStamp = '0' + str(d)
        else:
            dStamp = str(d)
        # Build timestamp
        timestamp = '2009' + mStamp + dStamp
        # Write timestamp and temperature to file
        f.write(timestamp + ',' + dayTemp + '\n')
# Done getting data! Close file.
f.close()
Open the file with open('wunder-data.txt', 'w', encoding='utf8'). - juanpa.arrivillaga
import io, then use io.open('wunder-data.txt', 'w', encoding='utf8') instead of the built-in open. If you are a beginner, you should consider using Python 3 rather than Python 2. - juanpa.arrivillaga
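The suggestion in the comments can be sketched roughly as follows. This is a minimal illustration, not the asker's full script: it skips the scraping step and uses made-up sample rows, a hypothetical helper named write_rows, and a demo file in the temporary directory. It first reproduces the failure (the degree sign u'\xb0' cannot be encoded as ASCII) and then writes the same text through io.open with an explicit encoding, which should behave the same way on Python 2 and Python 3.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

# The degree sign (U+00B0) has no ASCII representation, so an ASCII
# encode fails with the same exception type as in the traceback.
try:
    u'26\xb0F'.encode('ascii')
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True


def write_rows(path, rows):
    """Write (timestamp, temperature) pairs as UTF-8 encoded CSV lines.

    Hypothetical helper for illustration; not part of the original script.
    """
    # io.open accepts an encoding argument on both Python 2 and Python 3.
    with io.open(path, 'w', encoding='utf8') as f:
        for timestamp, day_temp in rows:
            f.write(u'%s,%s\n' % (timestamp, day_temp))


# Made-up sample values; the real ones would come from the scraped page.
rows = [(u'20090101', u'26\xb0F'), (u'20090102', u'21\xb0F')]
demo_path = os.path.join(tempfile.gettempdir(), 'wunder-data-demo.txt')
write_rows(demo_path, rows)

# Read the file back with the same encoding to check the round trip.
with io.open(demo_path, encoding='utf8') as f:
    content = f.read()
```

In the original script, the equivalent change would be to replace the open call near the top with io.open(..., encoding='utf8') and make sure the string passed to f.write is unicode, since io.open in text mode rejects byte strings on Python 2.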