Get (index, column) pairs for True elements of a boolean DataFrame in Pandas

Question

19

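Suppose I have a Pandas DataFrame and I want a list of tuples of the form [(index1, column1), (index2, column2), ...] describing the locations of all elements of the DataFrame that satisfy some condition. For example: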

import numpy as np
import pandas as pd

x = pd.DataFrame(np.random.normal(0, 1, (4, 4)), index=['a', 'b', 'c', 'd'],
                 columns=['e', 'f', 'g', 'h'])
x


          e         f         g         h
a -1.342571 -0.274879 -0.903354 -1.458702
b -1.521502 -1.135800 -1.147913  1.829485
c -1.199857  0.458135 -1.993701 -0.878301
d  0.485599  0.286608 -0.436289 -0.390755

y = x > 0

Is there some way to obtain something like:

x.loc[y]

that returns:

[(b, h), (c,f), (d, e), (d,f)]

or something equivalent? Obviously, I could do:

postup = []
for i in x.index:
    for j in x.columns:
        if x.loc[i, j] > 0:
            postup.append((i, j))

But I imagine there is a better way, or one that is already implemented. In Matlab, the find function combined with sub2ind would do the job.

- dylkot

3 Answers

3

My approach uses a MultiIndex:

#make it a multi-indexed Series
stacked = y.stack()

#restrict to where it's True
true_stacked = stacked[stacked]

#get index as a list of tuples
result = true_stacked.index.tolist()
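
For reference, here is a self-contained sketch of the same three steps. It assumes a fixed frame whose values are copied from the question's printout (the question itself uses random data), so the result is reproducible; the name pairs is only illustrative.

```python
import pandas as pd

# Assumption: fixed values taken from the question's printed example,
# used here instead of np.random.normal so the result is reproducible.
x = pd.DataFrame(
    [[-1.342571, -0.274879, -0.903354, -1.458702],
     [-1.521502, -1.135800, -1.147913, 1.829485],
     [-1.199857, 0.458135, -1.993701, -0.878301],
     [0.485599, 0.286608, -0.436289, -0.390755]],
    index=["a", "b", "c", "d"],
    columns=["e", "f", "g", "h"],
)
y = x > 0

# Stack the boolean frame into a Series indexed by (row label, column label).
stacked = y.stack()

# Boolean-index the Series by itself to keep only the True entries,
# then read the surviving MultiIndex out as a list of tuples.
pairs = stacked[stacked].index.tolist()
print(pairs)  # [('b', 'h'), ('c', 'f'), ('d', 'e'), ('d', 'f')]
```

Because y holds only booleans and no missing values, this does not depend on whether stack() drops NaN.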

- exp1orer

2

If you want a single tuple per row index:

import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.normal(0, 1, (4,4)), index=['a', 'b', 'c', 'd'], columns=['e', 'f', 'g', 'h'])

# build column replacement
column_dict = {}
for col in [{col: {True: col}} for col in df.columns]:
    column_dict.update(col)

# replace where > 0
df = (df>0).replace(to_replace=column_dict)

# convert to tuples and drop 'False' values
[tuple(y for y in x if y != False) for x in df.to_records()]
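
As a possible alternative, the same per-row grouping can be sketched without replace, by using each row of the boolean mask to select column labels. The fixed data and the name per_row are assumptions made for illustration; they are not part of the answer above.

```python
import pandas as pd

# Assumption: fixed values copied from the question's printed example.
x = pd.DataFrame(
    [[-1.342571, -0.274879, -0.903354, -1.458702],
     [-1.521502, -1.135800, -1.147913, 1.829485],
     [-1.199857, 0.458135, -1.993701, -0.878301],
     [0.485599, 0.286608, -0.436289, -0.390755]],
    index=["a", "b", "c", "d"],
    columns=["e", "f", "g", "h"],
)
y = x > 0

# For each row, y.columns[row_mask] keeps the labels of the True columns;
# the row label is placed first, followed by those column labels.
per_row = [
    (idx, *y.columns[row_mask])
    for idx, row_mask in zip(y.index, y.to_numpy())
]
print(per_row)  # [('a',), ('b', 'h'), ('c', 'f'), ('d', 'e', 'f')]
```

Rows with no True entry still produce a one-element tuple holding just the row label.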

- allen-smithee

- A. Coady · Accepted Answer

x[x > 0].stack().index.tolist()
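
The one-liner relies on stack() dropping the NaN values that masking introduces. As far as I know that default is version-dependent in recent pandas releases, so it is worth checking against your installed version. A sketch that avoids the question entirely, and is the closest analogue of Matlab's find, takes positions from NumPy and maps them back to labels. The fixed data and the name pairs_np are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Assumption: fixed values copied from the question's printed example.
x = pd.DataFrame(
    [[-1.342571, -0.274879, -0.903354, -1.458702],
     [-1.521502, -1.135800, -1.147913, 1.829485],
     [-1.199857, 0.458135, -1.993701, -0.878301],
     [0.485599, 0.286608, -0.436289, -0.390755]],
    index=["a", "b", "c", "d"],
    columns=["e", "f", "g", "h"],
)

# Boolean mask as a plain NumPy array.
mask = (x > 0).to_numpy()

# np.nonzero returns the row and column positions of the True cells,
# in row-major order.
rows, cols = np.nonzero(mask)

# Translate integer positions back into index and column labels.
pairs_np = list(zip(x.index[rows], x.columns[cols]))
print(pairs_np)  # [('b', 'h'), ('c', 'f'), ('d', 'e'), ('d', 'f')]
```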