pandas: assign column values based on another column in the DataFrame

Question

3

I have the following df:

id    a_id    b_id
1     25      50
1     25      50
2     26      51
2     26      51
3     25      52
3     28      52
3     28      52
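
To reproduce the example, the frame above can be built as follows. This is a minimal sketch: the integer dtypes are an assumption, since the post only shows the printed values.

```python
import pandas as pd

# Example data copied from the table in the question.
df = pd.DataFrame({
    "id":   [1, 1, 2, 2, 3, 3, 3],
    "a_id": [25, 25, 26, 26, 25, 28, 28],
    "b_id": [50, 50, 51, 51, 52, 52, 52],
})
print(df)
```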

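I have the following code, which sets a_id and b_id to -1 depending on which rows each id value covers in df: if the rows holding a given a_id (or b_id) value are exactly the same rows (the same sub-DataFrame) as those of one particular id value, those rows of a_id (or b_id) get -1.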

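# Distinct id values greater than -1
cluster_ids = df.loc[df['id'] > -1, 'id'].unique()

columns = ['a_id', 'b_id']

for cluster_id in cluster_ids:
    rows = df.loc[df['id'] == cluster_id]

    for col in columns:  # `col` avoids shadowing the built-in `type`
        ids = rows[col].values

        # All rows of df that share this group's first value in `col`
        match_rows = df.loc[df[col] == ids[0]]

        # Reset the column only when those rows are exactly this id's rows
        if match_rows.equals(rows):
            df.loc[match_rows.index, col] = -1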

So the resulting DataFrame looks like this:

id    a_id    b_id
1     25      -1
1     25      -1
2     -1      -1
2     -1      -1
3     25      -1
3     28      -1
3     28      -1

I am wondering whether there is a more efficient way to do this.

- daiyue

5

Your explanation is not clear enough. Could you clarify what you are trying to do? - cs95

@coldspeed I have edited the OP. - daiyue

2

Is there any case where it cannot be -1? We need a different example to make this clear. - Bharath M Shetty

2

Sorry, but I still don't understand. Everything here is -1, so I can't visualize any failing case. - cs95

@coldspeed, Dark: I have modified the example in the OP; sorry for not being clear earlier. - daiyue

The code you provided actually produces a different output from the one you posted. - user59271

1 Answer

- phi · Accepted Answer

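import pandas as pd

# True where every row of an id group holds a single distinct value in that column
one_value_for_each_id = df.groupby('id').transform(lambda x: len(set(x)) == 1)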

    a_id  b_id
0   True  True
1   True  True
2   True  True
3   True  True
4  False  True
5  False  True
6  False  True

# True where all rows sharing a column value belong to a single id
one_id_for_each_value = pd.DataFrame({
    col: df.groupby(col).id.transform(lambda x: len(set(x)) == 1)
    for col in ['a_id', 'b_id']
})

    a_id  b_id
0  False  True
1  False  True
2   True  True
3   True  True
4  False  True
5   True  True
6   True  True

one_to_one_relationship = one_id_for_each_value & one_value_for_each_id

# Set all values that satisfy the one-to-one relationship to `-1`
df.loc[one_to_one_relationship.a_id, 'a_id'] = -1
df.loc[one_to_one_relationship.b_id, 'b_id'] = -1

   a_id  b_id
0    25    -1
1    25    -1
2    -1    -1
3    -1    -1
4    25    -1
5    28    -1
6    28    -1
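
The same two checks can also be expressed with the built-in "nunique" aggregation instead of a Python lambda, which may run faster on large frames. The following is a minimal sketch under stated assumptions: the function name `mask_one_to_one` and its parameters are hypothetical (they are not part of pandas or of the answer above), and the data is the example from the question.

```python
import pandas as pd


def mask_one_to_one(df, key="id", cols=("a_id", "b_id"), fill=-1):
    """Return a copy of df in which values of `cols` that are in a
    one-to-one relationship with `key` are replaced by `fill`.

    Hypothetical helper written for illustration only.
    """
    out = df.copy()
    for col in cols:
        # Every `key` group holds exactly one distinct value of `col`.
        one_value_per_key = df.groupby(key)[col].transform("nunique") == 1
        # Every value of `col` occurs under exactly one `key`.
        one_key_per_value = df.groupby(col)[key].transform("nunique") == 1
        # Both masks come from the unmodified input, so earlier
        # assignments to `out` cannot affect later columns.
        out.loc[one_value_per_key & one_key_per_value, col] = fill
    return out


# Example data copied from the question.
df = pd.DataFrame({
    "id":   [1, 1, 2, 2, 3, 3, 3],
    "a_id": [25, 25, 26, 26, 25, 28, 28],
    "b_id": [50, 50, 51, 51, 52, 52, 52],
})

result = mask_one_to_one(df)
print(result)
```

Unlike the in-place assignments above, this sketch returns a modified copy and leaves the input frame untouched. Note that "nunique" skips missing values by default, so frames containing NaN may behave differently from the `len(set(x))` version.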