How to find the most frequent value in each row with pandas?

Question

How can I use pandas to find the value that occurs most often in each row?

import pandas as pd

df = pd.DataFrame([[1,2,3,1,5],
                   [2,3,4,4,6],
                   [7,2,2,2,5],
                   [21,3,4,3,6]], index=[1,2,3,4], columns=list('ABCDE'))

The resulting df:

    A   B   C   D   E
1   1   2   3   1   5
2   2   3   4   4   6
3   7   2   2   2   5
4   21  3   4   3   6

How can I find the number that appears most often in each row?

For example:

row 1: 1    appears two times
row 2: 4    appears two times
row 3: 2    appears three times

- liu gang

Formatted the DataFrame declaration so the rows are easier to read. Fixed the language. - Matthew

2 Answers

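for idx, row in df.iterrows():
    values = list(row)
    # How many times each element occurs within its own row
    counts = [values.count(x) for x in values]
    top = max(counts)
    # index() returns the first position that has the highest count
    print(f"row {idx}: {values[counts.index(top)]} appears {top} times")

Output:

row 1: 1 appears 2 times
row 2: 4 appears 2 times
row 3: 2 appears 3 times
row 4: 3 appears 2 times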
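The loop above calls list.count once per element, so each row is scanned repeatedly. As a rough sketch of the same idea, and not part of the original answer, collections.Counter from the standard library can tally each row in one pass. The results dictionary is a hypothetical name introduced only for illustration.

```python
from collections import Counter

import pandas as pd

df = pd.DataFrame([[1, 2, 3, 1, 5],
                   [2, 3, 4, 4, 6],
                   [7, 2, 2, 2, 5],
                   [21, 3, 4, 3, 6]], index=[1, 2, 3, 4], columns=list('ABCDE'))

# Hypothetical container: maps each row label to (most frequent value, count)
results = {}
for idx, row in df.iterrows():
    # most_common(1) returns a one-element list holding (value, count)
    value, count = Counter(row.tolist()).most_common(1)[0]
    results[idx] = (value, count)
    print(f"row {idx}: {value} appears {count} times")
```

When several values tie for the highest count, most_common keeps the one encountered first in the row, which matches how the list-based loop picks its value.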

- seanmus

- Anton Protopopov · Accepted Answer

You can use apply with value_counts to get the most frequent value in each row and its count, then join the two results with concat:

value = df.apply(lambda x: x.value_counts().index[0], axis=1)
count = df.apply(lambda x: x.value_counts().iloc[0], axis=1)

out = pd.concat([value, count], axis=1).reset_index()

out.columns = ['row_num', 'val', 'appearing']
out['row_num'] = 'row ' + out['row_num'].astype(str) + ':'
out['appearing'] = 'appears ' + out['appearing'].astype(str) + ' times'

In [64]: out
Out[64]:
  row_num  val        appearing
0  row 1:    1  appears 2 times
1  row 2:    4  appears 2 times
2  row 3:    2  appears 3 times
3  row 4:    3  appears 2 times
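The answer above runs value_counts twice per row, once for the value and once for the count. A possible variation, sketched here under the assumption that a numeric result table is acceptable, computes value_counts once per row and returns both pieces together. The helper name top_value and the column names val and count are illustrative and do not come from the original answer.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3, 1, 5],
                   [2, 3, 4, 4, 6],
                   [7, 2, 2, 2, 5],
                   [21, 3, 4, 3, 6]], index=[1, 2, 3, 4], columns=list('ABCDE'))


def top_value(row):
    """Return the most frequent value in a row together with its count."""
    # value_counts sorts by frequency in descending order, so position 0 is the top entry
    counts = row.value_counts()
    return pd.Series({'val': counts.index[0], 'count': counts.iloc[0]})


# Returning a Series from the applied function yields one column per key
summary = df.apply(top_value, axis=1)
print(summary)
```

Keeping val and count as numeric columns leaves the string formatting for a final presentation step, so the result can still be filtered or sorted by count.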