How can I create a Pandas DataFrame from two DataFrames, using one as the rows and the other as the columns?

3

I have two DataFrames as follows:

df1 = pd.DataFrame({'Barcode':[1,2,3,4],'Store':['s1','s2','s3','s4']})    

df2 = pd.DataFrame({'Date':['2020-10-10','2020-10-09','2020-10-08','2020-10-07','2020-10-06']})

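How can I create a DataFrame that has df1 as its rows and the dates in df2 as its columns, so that the cells start out empty? Something like this: [image] The final step is to fill the cells by joining with another table (df4):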
df4 = pd.DataFrame({'Barcode':[1,2,3,4],'Store':['s1','s2','s3','s4'],'2020-10-10':[1,2,5,np.nan],'2020-10-09':[np.nan,2,3,0],'2020-10-08':[0,0,2,3],'2020-10-07':[np.nan,1,np.nan,2]})

The final DataFrame should look like this:

[image]

Thank you very much for your help.

3 Answers

3

I hope I have understood your question. You have three DataFrames:

import numpy as np
import pandas as pd

df1 = pd.DataFrame({'Barcode':[1,2,3,4],'Store':['s1','s2','s3','s4']})
df2 = pd.DataFrame({'Date':['2020-10-10','2020-10-09','2020-10-08','2020-10-07','2020-10-06']})
df4 = pd.DataFrame({'Barcode':[1,2,3,4],'Store':['s1','s2','s3','s4'],'2020-10-10':[1,2,5,np.nan],'2020-10-09':[np.nan,2,3,0],'2020-10-08':[0,0,2,3],'2020-10-07':[np.nan,1,np.nan,2]})

Then:

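# Extend df1 with one empty (NaN) column per date in df2.
df1 = pd.DataFrame(df1, columns=df1.columns.tolist() + df2['Date'].tolist())
# Index both frames by Barcode so that fillna aligns the rows by barcode.
df1 = df1.set_index('Barcode')
df4 = df4.set_index('Barcode')

# Fill the empty cells with the matching values from df4.
print(df1.fillna(df4))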

Output:

        Store  2020-10-10  2020-10-09  2020-10-08  2020-10-07  2020-10-06
Barcode                                                                  
1          s1         1.0         NaN         0.0         NaN         NaN
2          s2         2.0         2.0         0.0         1.0         NaN
3          s3         5.0         3.0         2.0         NaN         NaN
4          s4         NaN         0.0         3.0         2.0         NaN
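
The same idea can also be written with `reindex`, which adds the missing date columns as NaN in a single call. This is only a minimal sketch of a variant, not the code from the answer above; the names `empty` and `filled` are made up for illustration.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'Barcode': [1, 2, 3, 4], 'Store': ['s1', 's2', 's3', 's4']})
df2 = pd.DataFrame({'Date': ['2020-10-10', '2020-10-09', '2020-10-08',
                             '2020-10-07', '2020-10-06']})
df4 = pd.DataFrame({'Barcode': [1, 2, 3, 4], 'Store': ['s1', 's2', 's3', 's4'],
                    '2020-10-10': [1, 2, 5, np.nan],
                    '2020-10-09': [np.nan, 2, 3, 0],
                    '2020-10-08': [0, 0, 2, 3],
                    '2020-10-07': [np.nan, 1, np.nan, 2]})

# reindex keeps the existing 'Store' column and creates every date column
# that does not exist yet, filled with NaN ('empty' is a hypothetical name).
empty = df1.set_index('Barcode').reindex(columns=['Store'] + df2['Date'].tolist())

# fillna with a DataFrame aligns on index (Barcode) and column labels, so
# only cells that have a counterpart in df4 are filled.
filled = empty.fillna(df4.set_index('Barcode'))
print(filled)
```

Dates that are missing from df4, such as 2020-10-06 here, simply stay NaN.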

2

Start by creating a temporary DataFrame:

wrk = pd.DataFrame('', index=pd.MultiIndex.from_frame(df1),
    columns=df2.Date.rename(None))
wrk

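This table is filled with empty strings and takes its column names from df2. For now, Barcode and Store are index columns. This arrangement is needed in the next step, where the update is aligned on that index.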

Then update it in place with the data from df4:

wrk.update(df4.set_index(['Barcode', 'Store']))

The last step is:
result = wrk.reset_index()

The result is:
   Barcode Store 2020-10-10 2020-10-09 2020-10-08 2020-10-07 2020-10-06
0        1    s1          1                     0                      
1        2    s2          2          2          0          1           
2        3    s3          5          3          2                      
3        4    s4                     0          3          2           
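
If the date columns should stay numeric, a variant of this approach can use NaN instead of empty strings as the placeholder. The following is a rough sketch under that assumption rather than the answer's own code; it relies on `DataFrame.update` modifying the frame in place and skipping NaN values in the source.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'Barcode': [1, 2, 3, 4], 'Store': ['s1', 's2', 's3', 's4']})
df2 = pd.DataFrame({'Date': ['2020-10-10', '2020-10-09', '2020-10-08',
                             '2020-10-07', '2020-10-06']})
df4 = pd.DataFrame({'Barcode': [1, 2, 3, 4], 'Store': ['s1', 's2', 's3', 's4'],
                    '2020-10-10': [1, 2, 5, np.nan],
                    '2020-10-09': [np.nan, 2, 3, 0],
                    '2020-10-08': [0, 0, 2, 3],
                    '2020-10-07': [np.nan, 1, np.nan, 2]})

# All-NaN working frame: rows are the (Barcode, Store) pairs from df1,
# columns are the dates from df2.
wrk = pd.DataFrame(np.nan,
                   index=pd.MultiIndex.from_frame(df1),
                   columns=df2['Date'].tolist())

# update() works in place and returns None; it aligns on the
# (Barcode, Store) index and only copies non-NaN values from df4.
wrk.update(df4.set_index(['Barcode', 'Store']))

# Turn Barcode and Store back into ordinary columns.
result = wrk.reset_index()
print(result)
```

With this variant the unfilled cells show up as NaN rather than as blank strings.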

2
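
# Add one empty (NaN) column per date in df2.
for item in df2.Date.tolist():
    df1[item] = np.nan

# fillna aligns on the row index and the column labels, so df1 and df4
# must list their rows in the same order.
dfinal = df1.fillna(df4)
dfinal = dfinal.set_index('Barcode')
display(dfinal)  # display() exists in Jupyter/IPython; use print(dfinal) elsewhere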
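
Since the question mentions joining with df4, here is a hedged sketch of a merge-based alternative. It matches rows on the Barcode and Store keys instead of relying on both frames sharing the same row order; the names `value_cols`, `merged` and `result` are illustrative only.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'Barcode': [1, 2, 3, 4], 'Store': ['s1', 's2', 's3', 's4']})
df2 = pd.DataFrame({'Date': ['2020-10-10', '2020-10-09', '2020-10-08',
                             '2020-10-07', '2020-10-06']})
df4 = pd.DataFrame({'Barcode': [1, 2, 3, 4], 'Store': ['s1', 's2', 's3', 's4'],
                    '2020-10-10': [1, 2, 5, np.nan],
                    '2020-10-09': [np.nan, 2, 3, 0],
                    '2020-10-08': [0, 0, 2, 3],
                    '2020-10-07': [np.nan, 1, np.nan, 2]})

value_cols = df2['Date'].tolist()

# Left join keeps every row of df1 and pulls in the date columns of df4
# for the matching (Barcode, Store) pair.
merged = df1.merge(df4, on=['Barcode', 'Store'], how='left')

# reindex puts the columns in the order given by df2 and adds any date
# that df4 does not contain (here 2020-10-06) as an all-NaN column.
result = merged.reindex(columns=['Barcode', 'Store'] + value_cols)
print(result)
```

A left join also keeps rows of df1 that have no counterpart in df4; their date cells are left as NaN.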
