Pandas: InvalidIndexError: Reindexing only valid with uniquely valued Index objects

Question


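I have two dataframes that store data on products purchased by stores. df1 stores the store name, product ID, product name, and the date the product was received. df2 stores the product ID, product name, and product type. I am trying to update df2 with the date-received values from df1, but only for products of Type "P".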

Below is a view of the dataframes and what I have tried.

df1:

StoreName,ProdId,ProdName,DateReceived
Store A,P1,Prod1,2018-05-01
Store A,P2,Prod2,2018-05-02
Store B,P1,Prod1,2018-05-04

df2:

DateRecived,ProdId,ProdName,Type
,P1,Prod1,P
,P2,Prod2,P
,P3,Prod3,S

Script:

df2['DateRecived'] = df2['ProdId'].map(df1.set_index('ProdId')['StoreName']).df2['Type'] == 'P'

Running this code raises the following error:

InvalidIndexError: Reindexing only valid with uniquely valued Index objects

Could you help me modify the script so that I can filter by Store Name and Product Name and populate df2 with the DateReceived values? Thanks.

- dark horse

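@jezrael, I am trying to filter the DataFrame by Type = P, which is why I tried appending that to the end of the condition. Basically, the output I am looking for should be filtered by Store Name, Product Name, and Type = P... - dark horse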

Yes, I just realized that :) - jezrael

1 Answer


- jezrael · Accepted Answer

The problem is duplicates: product P1 appears twice in the index, so map cannot look up a single value for it:

s = df1.set_index('ProdId')['StoreName']
print (s)

ProdId
P1    Store A
P2    Store A
P1    Store B
Name: StoreName, dtype: object
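
Before mapping, it can help to confirm that the lookup Series really has a non-unique index. The following is a minimal sketch that rebuilds df1 from the question; the variable names index_is_unique and duplicated_ids are illustrative and not part of the original answer.

```python
import pandas as pd

# Rebuild df1 from the question.
df1 = pd.DataFrame({
    "StoreName": ["Store A", "Store A", "Store B"],
    "ProdId": ["P1", "P2", "P1"],
    "ProdName": ["Prod1", "Prod2", "Prod1"],
    "DateReceived": ["2018-05-01", "2018-05-02", "2018-05-04"],
})

s = df1.set_index("ProdId")["StoreName"]

# A non-unique index means Series.map cannot resolve each key to one value.
index_is_unique = s.index.is_unique

# Index labels that occur more than once.
duplicated_ids = s.index[s.index.duplicated()].unique().tolist()

print(index_is_unique)
print(duplicated_ids)
```

If the uniqueness check fails, the lookup Series has to be reduced to one row per ProdId before it is passed to map.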

So the index values must be unique. drop_duplicates keeps the first row for each ProdId:

s = df1.drop_duplicates('ProdId').set_index('ProdId')['StoreName']
print (s)
ProdId
P1    Store A
P2    Store A
Name: StoreName, dtype: object
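
Which duplicate survives is a choice, not a given. As a small sketch (assuming the same df1 as in the question), the keep parameter of drop_duplicates selects the first or the last row per ProdId; the names first and last are illustrative.

```python
import pandas as pd

# Rebuild df1 from the question.
df1 = pd.DataFrame({
    "StoreName": ["Store A", "Store A", "Store B"],
    "ProdId": ["P1", "P2", "P1"],
    "ProdName": ["Prod1", "Prod2", "Prod1"],
    "DateReceived": ["2018-05-01", "2018-05-02", "2018-05-04"],
})

# keep="first" (the default) retains the first row seen for each ProdId.
first = df1.drop_duplicates("ProdId", keep="first").set_index("ProdId")["StoreName"]

# keep="last" retains the last row seen for each ProdId instead.
last = df1.drop_duplicates("ProdId", keep="last").set_index("ProdId")["StoreName"]

print(first["P1"])
print(last["P1"])
```

Either result has a unique index and can be passed to map; which one is appropriate depends on which store's row should win for a repeated product.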

Then the values can be replaced using a boolean mask:

mask = df2['Type'] == 'P'
df2['DateRecived'] = df2['DateRecived'].mask(mask, df2['ProdId'].map(s))
print (df2)
  DateRecived ProdId ProdName Type
0     Store A     P1    Prod1    P
1     Store A     P2    Prod2    P
2         NaN     P3    Prod3    S

Alternatively, assign only to the masked rows with loc:

df2.loc[mask, 'DateRecived'] = df2.loc[mask, 'ProdId'].map(s)
print (df2)
  DateRecived ProdId ProdName Type
0     Store A     P1    Prod1    P
1     Store A     P2    Prod2    P
2         NaN     P3    Prod3    S
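
The answer follows the question's script and maps StoreName. If the goal is to fill the column with dates, as the question text describes, the same pattern can be applied to DateReceived. The sketch below assumes that the earliest date per product is the one wanted, which the question does not specify; the name dates is illustrative.

```python
import pandas as pd

# Rebuild both frames from the question.
df1 = pd.DataFrame({
    "StoreName": ["Store A", "Store A", "Store B"],
    "ProdId": ["P1", "P2", "P1"],
    "ProdName": ["Prod1", "Prod2", "Prod1"],
    "DateReceived": ["2018-05-01", "2018-05-02", "2018-05-04"],
})
df2 = pd.DataFrame({
    # Empty column, kept as object dtype so it can hold date strings.
    "DateRecived": pd.Series([None, None, None], dtype="object"),
    "ProdId": ["P1", "P2", "P3"],
    "ProdName": ["Prod1", "Prod2", "Prod3"],
    "Type": ["P", "P", "S"],
})

# Assumption: use the earliest date per product. ISO-formatted date strings
# sort chronologically, so min() picks the earliest one.
dates = df1.groupby("ProdId")["DateReceived"].min()

# Fill only the rows whose Type is "P".
mask = df2["Type"] == "P"
df2.loc[mask, "DateRecived"] = df2.loc[mask, "ProdId"].map(dates)

print(df2)
```

Grouping by ProdId always yields a unique index, so map no longer raises. If the lookup should come from one particular store, filtering df1 on StoreName before building the lookup is another option, provided each store lists a product only once.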