TextBlob raises a missing corpus error when used with Django

Question

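I am using Python 2.7, Django 1.8, and an Apache server on Ubuntu Linux. I have a JSON file containing 23,000 tweets, and I want to classify the tweets into predefined categories. But when I run the code, it throws MissingCorpusError at / and suggests:

To download the necessary data, simply run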

python -m textblob.download_corpora

I already have the latest TextBlob corpora, but I still get the error.

My views.py is as follows:

import json
import re

from django.http import HttpResponse
from textblob import TextBlob

# Tweet (the model), cl (the trained classifier) and STATIC_PATH
# are defined elsewhere in the project.

def get_tweets(request):
    tweets_data_path = STATIC_PATH + '/stream.json'
    tweets_data = []
    # Parse one JSON document per line, skipping lines that fail to parse.
    with open(tweets_data_path, "r") as tweets_file:
        for line in tweets_file:
            try:
                tweets_data.append(json.loads(line))
            except ValueError:
                continue
    # Strip t.co short links.
    subs = []
    for l in tweets_data:
        s = re.sub(r"http[\w+]{0,4}://t.co/[\w]+", "", l)
        subs.append(s)
    for t in subs:
        i = 0
        while i < len(t):
            text = t[i]['tweet_text']
            # TextBlob needs the NLTK corpora here; this is where the error surfaces.
            senti = TextBlob(text)
            category = cl.classify(text)
            polarity = senti.sentiment.polarity
            if polarity > 0:
                sentiment = 'positive'
            elif polarity < 0:
                sentiment = 'negative'
            else:
                sentiment = 'neutral'
            # Flag retweets by their 'RT' prefix.
            retweet = 1 if text.startswith('RT') else 0
            twe = Tweet(text=text, category=category,
                        sentiment=sentiment, retweet=retweet)
            twe.save()
            i = i + 1
    return HttpResponse("done")

- user5315166

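Please post the structure of the JSON. And rewrite the while loop as for ti in t. How many tweets are in each subset? - Pynchia

There are 23689 tweets in total. Should I post the structure of the JSON file or a specific tweet? - user5315166

2 Answers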

I ran into the same problem. I use Anaconda, and the following worked for me. These may help:

http://www.nltk.org/data.html

https://anaconda.org/anaconda/nltk

$ pip3 install -U textblob

$ python3 -m textblob.download_corpora
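
If the download succeeds but the error persists, it may help to check which home directory the process running Django actually sees. A minimal sketch, assuming the corpora were downloaded into the per-user nltk_data folder; user_corpus_dir is a hypothetical helper name, not part of TextBlob or NLTK:

```python
import os


def user_corpus_dir(home=None):
    """Return <home>/nltk_data, the per-user folder assumed to hold the corpora.

    When home is None, the current process's home directory is used.
    """
    if home is None:
        home = os.path.expanduser("~")
    return os.path.join(home, "nltk_data")


if __name__ == "__main__":
    corpus_dir = user_corpus_dir()
    # Report whether the folder exists for the user running this process.
    print("corpus dir: %s (exists: %s)" % (corpus_dir, os.path.isdir(corpus_dir)))
```

Calling user_corpus_dir() from inside a Django view, rather than from a shell, shows the directory as the web server's user sees it, which may differ from the account that ran the download.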

- rzskhr

$ pip install -U textblob, then $ python -m textblob.download_corpora - rzskhr

- Gašper Gračner · Accepted Answer

I had the same problem. When I downloaded nltk_data, it was placed in /root/nltk_data/. Once I copied that nltk_data folder to /var/www/, it worked.

$ sudo cp -avr nltk_data/ /var/www/
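
The copy presumably works because Apache runs the site as a different user than the one that downloaded the data, so the per-user lookup lands in a different home directory. As an alternative to copying, the data directory can be registered explicitly before TextBlob is used. A rough sketch, assuming NLTK is installed and that /var/www/nltk_data is where the data lives; add_data_dir is a hypothetical helper, while nltk.data.path is NLTK's list of search directories:

```python
import os


def add_data_dir(search_path, data_dir):
    """Append data_dir to search_path if it is an existing directory
    that is not already listed. Returns True when it was added."""
    data_dir = os.path.abspath(data_dir)
    if os.path.isdir(data_dir) and data_dir not in search_path:
        search_path.append(data_dir)
        return True
    return False


if __name__ == "__main__":
    try:
        import nltk
    except ImportError:
        nltk = None
    if nltk is not None:
        # Assumed location; adjust to wherever nltk_data actually lives.
        add_data_dir(nltk.data.path, "/var/www/nltk_data")
        print(nltk.data.path)
```

The guarded part would typically run once at startup, for example from the project's wsgi.py. Setting the NLTK_DATA environment variable for the server process is another option.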