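I am adding the text contained in the second column of a number of csv files to a list, so that I can later run sentiment analysis on each item in the list. My code currently processes large csv files in full, but the sentiment analysis on the list items takes too long, so I want to read only the first 200 rows of each csv file. The code looks like this: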
import csv
import glob
import math
import string
from collections import defaultdict

import nltk
import numpy
from nltk.corpus import stopwords

import sentiment_mod as s

columns = defaultdict(list)
lijst = glob.glob('21cf/*.csv')
tweets1 = []
# Build the stopword set once rather than once per file
stopwords_set = set(stopwords.words("english"))
for item in lijst:
    with open(item, encoding='latin-1') as d:
        reader1 = csv.reader(d)
        next(reader1)  # skip the header row
        for row in reader1:
            tweets1.append(row[2])  # text column, zero-based index 2
# Drop URLs and @mentions from every tweet
words_cleaned = [" ".join([words for words in sentence.split() if 'http' not in words and not words.startswith('@')]) for sentence in tweets1]
words_filtered = [e.lower() for e in words_cleaned]
# Compares whole tweets against the stopword set, so only tweets that are a single stopword are dropped
words_without_stopwords = [word for word in words_filtered if word not in stopwords_set]
tweets1 = words_without_stopwords
tweets1 = list(filter(None, tweets1))  # remove empty strings
How can I use the csv reader to read only the first 200 rows of each csv file?
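One possible approach, offered as a sketch rather than a definitive answer, is to wrap the reader in `itertools.islice`, which stops iterating after a fixed number of rows. The helper name `read_first_rows` and its parameters `column_index` and `limit` are hypothetical and not part of the code above; the in-memory sample data exists only to make the example runnable.

```python
import csv
import io
from itertools import islice


def read_first_rows(file_obj, column_index, limit=200):
    """Return one column's values from at most `limit` data rows.

    `read_first_rows`, `column_index`, and `limit` are hypothetical names
    used for illustration only.
    """
    reader = csv.reader(file_obj)
    # Skip the header row; the default avoids StopIteration on an empty file
    next(reader, None)
    # islice stops after `limit` rows, so the rest of the file is never parsed
    return [
        row[column_index]
        for row in islice(reader, limit)
        if len(row) > column_index  # ignore rows that are too short
    ]


# Small in-memory demonstration: 5 data rows, limit of 3
sample = io.StringIO(
    "id,user,text\n" + "".join(f"{i},u{i},tweet {i}\n" for i in range(5))
)
first_three = read_first_rows(sample, column_index=2, limit=3)
print(first_three)  # ['tweet 0', 'tweet 1', 'tweet 2']
```

In the loop from the question, the same idea would presumably amount to replacing `for row in reader1:` with `for row in islice(reader1, 200):` after adding `from itertools import islice`. Files with fewer than 200 rows are simply read to the end.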