@DSM's solution is great, but it only works when your values are 1 or 0. If you need to compare against other values, you can try the following:
[df.columns[df.ix[i,:]==1].tolist() for i in range(len(df.index))]
In [156]: [df.columns[df.ix[i,:]==1].tolist() for i in range(len(df.index))]
Out[156]:
[['apple', 'banana', 'carrot'],
['banana'],
['apple'],
['apple', 'carrot', 'dietcoke'],
['banana', 'carrot'],
['banana', 'carrot']]
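For readers on current pandas, here is a minimal self-contained sketch of the same idea. The frame `df` is an assumption, reconstructed from the output above because the original data is not shown here, and `target` is a hypothetical name for the value being compared. It uses `.iloc` because `.ix` was removed in pandas 1.0.

```python
import pandas as pd

# Hypothetical data, reconstructed from the output shown above.
df = pd.DataFrame(
    [[1, 1, 1, 0],
     [0, 1, 0, 0],
     [1, 0, 0, 0],
     [1, 0, 1, 1],
     [0, 1, 1, 0],
     [0, 1, 1, 0]],
    columns=["apple", "banana", "carrot", "dietcoke"],
)

target = 1  # value to compare against; any other value works the same way

# .iloc[i] selects row i by position (the role df.ix[i, :] played above).
# .to_numpy() turns the boolean Series into a plain boolean mask.
matches = [
    df.columns[(df.iloc[i] == target).to_numpy()].tolist()
    for i in range(len(df.index))
]
print(matches)
```

Changing `target` is all it takes to look for a different value, which is the case the 0/1-only approach does not cover.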
Edit
Alternatively, you can simply adapt @DSM's solution:
In [177]: [df.columns[row == 1].tolist() for row in df.values]
Out[177]:
[['apple', 'banana', 'carrot'],
['banana'],
['apple'],
['apple', 'carrot', 'dietcoke'],
['banana', 'carrot'],
['banana', 'carrot']]
Some performance tests:
In [179]: %timeit [df.columns[row == 1].tolist() for row in df.values]
The slowest run took 4.03 times longer than the fastest. This could mean that an intermediate result is being cached
1000 loops, best of 3: 212 us per loop
In [180]: %timeit [df.columns[row.astype(bool)].tolist() for row in df.values]
10000 loops, best of 3: 186 us per loop
In [181]: %timeit [df.columns[df.ix[i,:]==1].tolist() for i in range(len(df.index))]
100 loops, best of 3: 2.4 ms per loop
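Note that the two fastest variants above only agree when the data is strictly 0/1. The sketch below illustrates the difference; `df2` is a made-up frame containing a 2 and is not from the original question.

```python
import pandas as pd

# Hypothetical frame with a value other than 0 or 1.
df2 = pd.DataFrame([[1, 2, 0], [0, 1, 1]], columns=["apple", "banana", "carrot"])

# Exact comparison: only cells equal to 1 are selected.
by_equality = [df2.columns[row == 1].tolist() for row in df2.values]

# Truthiness: every nonzero cell is selected, including the 2.
by_truthiness = [df2.columns[row.astype(bool)].tolist() for row in df2.values]

print(by_equality)    # [['apple'], ['banana', 'carrot']]
print(by_truthiness)  # [['apple', 'banana'], ['banana', 'carrot']]
```

So `row.astype(bool)` is a reasonable shortcut for 0/1 data, while `row == value` is the safer choice when other values can occur.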
row is a numpy array. When you do row.astype(bool), you get something like array([False, True, True, False], dtype=bool). This boolean array can then be used to selectively index df.columns, which is a pd.Index object: df.columns == Index(['apple', 'banana', 'carrot', 'dietcoke'], dtype='object'). - Pedro M Duarte
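The boolean-mask indexing described in that comment can be tried in isolation. This is a small sketch; the `row` values are a hypothetical single row chosen to match the mask quoted above.

```python
import numpy as np
import pandas as pd

columns = pd.Index(["apple", "banana", "carrot", "dietcoke"])
row = np.array([0, 1, 1, 0])  # one hypothetical row of 0/1 values

mask = row.astype(bool)       # boolean array: False, True, True, False
selected = columns[mask]      # still a pd.Index, keeping only the True positions

print(selected.tolist())      # ['banana', 'carrot']
```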