Rank a subset of rows in pandas based on a condition column

Question


pandas dataframe conditional-statements rank

3

I want to rank the rows of the DataFrame below by score, but only the rows where condition is False. The remaining rows should get a rank of NaN.

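import numpy as np
import pandas as pd

# NumPy casts the booleans to integers here, so condition holds 0/1
df = pd.DataFrame(np.array([[34, 65, 12, 98, 5], [False, False, True, False, False]]).T, index=['A', 'B', 'C', 'D', 'E'], columns=['score', 'condition'])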

The expected output is (conditional rank, with scores ranked in descending order):

   score  condition  cond_rank
A     34          0          3
B     65          0          2
C     12          1        NaN
D     98          0          1
E      5          0          4

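I know pd.DataFrame.rank() can handle NaN in the values being ranked, but when the filter condition lives in another column/Series, what is the most efficient way to do this?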

- Zhubarb

2 Answers

1

Use where followed by rank. Make sure to pass ascending=False; otherwise the ranks come out in ascending order, which is not the requested output.

df['score'].where(df['condition'].eq(0)).rank(ascending=False)

A    3.0
B    2.0
C    NaN
D    1.0
E    4.0
Name: score, dtype: float64
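
To make the two steps explicit, here is a minimal, self-contained sketch. It assumes the condition column already holds 0/1 as in the question; the intermediate name masked and the column name cond_rank are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"score": [34, 65, 12, 98, 5], "condition": [0, 0, 1, 0, 0]},
    index=list("ABCDE"),
)

# Step 1: keep the score where condition == 0; every other row becomes NaN.
masked = df["score"].where(df["condition"].eq(0))

# Step 2: rank() leaves NaN entries unranked (na_option="keep" is the default),
# so only the rows that passed the condition receive a rank.
df["cond_rank"] = masked.rank(ascending=False)
print(df)
```

Because where keeps the original index, the result can be assigned straight back as a new column without any realignment.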

- user3483203


- jezrael · Accepted Answer

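You can filter on the condition column first and then rank the remaining scores. Note that rank() is ascending by default, so ascending=False is needed to match the descending order requested in the question:

df['new'] = df.loc[~df['condition'].astype(bool), 'score'].rank(ascending=False)
print(df)
   score  condition  new
A     34          0  3.0
B     65          0  2.0
C     12          1  NaN
D     98          0  1.0
E      5          0  4.0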
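
The filtered rank works because pandas aligns on the index when assigning a Series to a column. The sketch below is one way to see this, using ascending=False to match the descending rank requested in the question. The names keep, subset_rank, and via_where are illustrative, and the final check simply compares the result against the where-based approach from the other answer.

```python
import pandas as pd

df = pd.DataFrame(
    {"score": [34, 65, 12, 98, 5], "condition": [0, 0, 1, 0, 0]},
    index=list("ABCDE"),
)

# Boolean mask of the rows to rank (condition is 0/False).
keep = ~df["condition"].astype(bool)

# The ranked subset contains only A, B, D and E.
subset_rank = df.loc[keep, "score"].rank(ascending=False)

# Assignment aligns on the index, so the missing label C is filled with NaN.
df["new"] = subset_rank

# The where-based approach should give the same values.
via_where = df["score"].where(keep).rank(ascending=False)
pd.testing.assert_series_equal(df["new"], via_where, check_names=False)
```

Both variants do a single pass over the column; which one reads better is largely a matter of taste.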