Pandas - populate a list of strings with values from a DataFrame

3

I am reading CSV files from a folder and filtering them into a pandas DataFrame, like so:

results=[]
for filename in glob.glob(os.path.join('/path/*.csv')):
  with open(filename) as p:
    df = pd.read_csv(p)

    filtered = df[(df['duration'] > low1) & (df['duration'] < high1)]

    artist = filtered['artist'].values
    print artist
    track = filtered['track'].values
    print track

where low1 = 0 and high1 = 0.5.

artist and track print hundreds of filtered items as regular strings just fine, but if I try to append them to results inside the loop with:

artist = filtered['artist'].values
track = filtered['track'].values
results.append([track,artist]) 

I see that I am appending objects and types, and results ends up containing only a fraction of the filtered items. I don't know what is going on.

How can I populate results with all the items as regular strings, in the following form?

[['artist1', 'track1'], ['artist1', 'track2'], ...]

artist = filtered['artist'].values.tolist() - Rakesh
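
To illustrate the comment above, here is a minimal sketch of what the loop in the question appends. The filtered frame is a hypothetical stand-in for one file's result, not data from the question. Each call to append adds a single entry holding two array objects, so results grows by one per file rather than one per row, and .tolist() is what turns an array into plain Python strings.

```python
import pandas as pd

# Hypothetical stand-in for the filtered rows of one CSV file.
filtered = pd.DataFrame({"artist": ["a1", "a2"], "track": ["t1", "t2"]})

results = []
artist = filtered["artist"].values  # an array object, not a Python list
track = filtered["track"].values
results.append([track, artist])     # one entry that holds two arrays

print(len(results))                 # one entry per file, not per row
print(type(results[0][0]))          # the array type, not str

# Converting first yields a plain list of Python strings.
artist_list = filtered["artist"].values.tolist()
print(artist_list)
```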
1 Answer

1
Create a list of DataFrames, join them with concat, and finally convert the result to a nested list:
import glob
import os

import pandas as pd

results = []
for filename in glob.glob(os.path.join('/path', '*.csv')):
    df = pd.read_csv(filename)
    # filter rows by condition and select columns by name with .loc
    filtered = df.loc[(df['duration'] > low1) & (df['duration'] < high1), ['artist', 'track']]
    # alternative: between() with both bounds excluded
    # (inclusive='neither' needs pandas >= 1.3; older versions used inclusive=False)
    # filtered = df.loc[df['duration'].between(low1, high1, inclusive='neither'), ['artist', 'track']]
    results.append(filtered)

out = pd.concat(results).values.tolist()
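
For anyone who wants to try the concat approach without CSV files on disk, here is a small sketch. The two in-memory frames and their values are made up for illustration, and inclusive="neither" assumes pandas 1.3 or newer.

```python
import pandas as pd

# Hypothetical stand-ins for the frames that pd.read_csv would return.
frames = [
    pd.DataFrame({"artist": ["a1", "a2"], "track": ["t1", "t2"], "duration": [0.2, 0.9]}),
    pd.DataFrame({"artist": ["a3", "a4"], "track": ["t3", "t4"], "duration": [0.4, 0.1]}),
]
low1, high1 = 0, 0.5

results = []
for df in frames:
    mask = (df["duration"] > low1) & (df["duration"] < high1)
    # The between() form should select exactly the same rows.
    same = df["duration"].between(low1, high1, inclusive="neither")
    assert mask.equals(same)
    results.append(df.loc[mask, ["artist", "track"]])

# ignore_index=True is optional; it only renumbers the combined rows.
out = pd.concat(results, ignore_index=True).values.tolist()
print(out)  # [['a1', 't1'], ['a3', 't3'], ['a4', 't4']]
```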

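Another solution is to append nested lists and flatten them at the end with a list comprehension:
results = []
for filename in glob.glob(os.path.join('/path', '*.csv')):
    df = pd.read_csv(filename)
    # boolean mask with both bounds excluded
    # (inclusive='neither' needs pandas >= 1.3; older versions used inclusive=False)
    mask = df['duration'].between(low1, high1, inclusive='neither')
    # select columns by name with .loc and convert to a nested list
    filtered = df.loc[mask, ['artist', 'track']].values.tolist()
    results.append(filtered)

# flatten the per-file lists into one list of [artist, track] pairs
out = [y for x in results for y in x]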
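
The flattening step can be checked on its own. In this sketch the nested lists are invented sample data, and itertools.chain.from_iterable is shown only as an equivalent alternative to the list comprehension, not as part of the answer above.

```python
from itertools import chain

# Hypothetical per-file results, each a nested list of [artist, track] pairs.
results = [
    [["a1", "t1"]],
    [["a3", "t3"], ["a4", "t4"]],
    [],  # a file in which no row passed the filter
]

# The list comprehension from the answer, with longer variable names.
flat_comprehension = [pair for per_file in results for pair in per_file]

# The same one-level flattening using the standard library.
flat_chain = list(chain.from_iterable(results))

print(flat_comprehension)  # [['a1', 't1'], ['a3', 't3'], ['a4', 't4']]
```

Both forms remove exactly one level of nesting, so the inner [artist, track] pairs stay intact.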

1
results = zip(filtered['artist'],filtered['track']) - BoyInDaBox89
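
A note on the zip suggestion, as a sketch with made-up data: in Python 3, zip returns a lazy iterator of tuples, so it has to be materialised, and each tuple has to be converted if lists are wanted as in the question.

```python
import pandas as pd

# Hypothetical filtered frame standing in for one file's result.
filtered = pd.DataFrame({"artist": ["a1", "a2"], "track": ["t1", "t2"]})

# zip pairs the two columns row by row; list() materialises the iterator.
pairs = list(zip(filtered["artist"], filtered["track"]))
print(pairs)  # [('a1', 't1'), ('a2', 't2')]

# Convert each tuple to a list to match the nested-list form in the question.
rows = [list(p) for p in pairs]
print(rows)   # [['a1', 't1'], ['a2', 't2']]
```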
