Pandas - Finding the index of an arbitrary value in a DataFrame

Question

python pandas

14

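I'm new to Python and Pandas.

I want to find the index of a specific value (say, security_id) in my pandas DataFrame, because that is where the columns start. (There is an unknown number of rows of irrelevant data above the columns, and a number of empty "columns" to the left.)

As far as I can see, the isin method only returns a boolean indicating whether the value exists, not its index.

So how can I find the index of this value?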

- Kemeia

2

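Welcome to StackOverflow. Please take some time to read this post on how to provide a great pandas example (https://dev59.com/O2Ij5IYBdhLWcg3wk182), as well as how to provide a minimal, complete, and verifiable example, and revise your question accordingly. These tips on how to ask a good question (http://stackoverflow.com/help/how-to-ask) may also be useful. - jezrael

6 Answers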

3

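A one-line solution that avoids an explicit loop (it assumes numpy has been imported as np).

Return the entire row:

df.iloc[np.flatnonzero((df=='security_id').values)//df.shape[1],:]

Return both the row and the column data: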

df.iloc[ np.flatnonzero((df=='security_id').values)//df.shape[1], np.unique(np.flatnonzero((df=='security_id').values)%df.shape[1]) ]

- Peterd
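A possible variation on the one-liner above, as a minimal sketch: np.where on the boolean mask returns the row and column positions directly, and these can then be mapped to index and column labels. The sample frame and the variable names are assumptions made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample frame; only the 'security_id' cell matters here.
df = pd.DataFrame(
    [["a", "er", "tfr"],
     ["dsf", "security_id", "name"],
     ["dfs", 121, "jason"]]
)

# Boolean mask of matching cells, then the positions of its True entries.
mask = (df == "security_id").to_numpy()
rows, cols = np.where(mask)

# Integer positions, and the same cells expressed as (index label, column label).
positions = list(zip(rows.tolist(), cols.tolist()))
labels = [(df.index[r], df.columns[c]) for r, c in positions]
print(positions)
print(labels)
```

With a default integer index and default column names, the positions and the labels coincide; they differ as soon as the frame has custom labels.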

3

If the value you are looking for is not duplicated:

poz=matrix[matrix==minv].dropna(axis=1,how='all').dropna(how='all')
value=poz.iloc[0,0]
index=poz.index.item()
column=poz.columns.item()

This gives you its index and its column.

If it is duplicated:

matrix=pd.DataFrame([[1,1],[1,np.nan]],index=['q','g'],columns=['f','h'])
matrix
Out[83]: 
   f    h
q  1  1.0
g  1  NaN
poz=matrix[matrix==minv].dropna(axis=1,how='all').dropna(how='all')
index=poz.stack().index.tolist()
index
Out[87]: [('q', 'f'), ('q', 'h'), ('g', 'f')]

You will get a list.

- Jay
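The duplicated case above relies on stack() dropping the NaN cells, and that default has changed across pandas versions. A hedged alternative sketch is to stack the boolean mask itself and keep only its True entries. The value minv = 1 is an assumption inferred from the output shown in the answer, which does not say how minv was set.

```python
import numpy as np
import pandas as pd

matrix = pd.DataFrame([[1, 1], [1, np.nan]], index=["q", "g"], columns=["f", "h"])
minv = 1  # assumed search value; not defined in the original answer

# Stack the boolean mask into a Series keyed by (row label, column label),
# then keep only the entries that are True.
stacked = (matrix == minv).stack()
index = stacked[stacked].index.tolist()
print(index)
```

Because the mask contains no missing values, the result does not depend on how stack() treats NaN.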

2

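I think this question may have been asked before (link). The accepted answer there is very thorough and should help you find the index of a value within a column.

Edit: if you don't know which column the value is in, you can use:

for col in df.columns:
    # print each column name with the row labels where the value occurs
    print(col, df[df[col] == 'security_id'].index.tolist())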

- Adam Slack

1

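In the linked question the column is known. In my case I don't know which column the value appears in. But I agree that it points me toward an answer to my question. - Kemeia

Ah, sorry! You can loop over the columns of the DataFrame and apply the answer linked above: for col in df.columns: df[df[col] == 'security_id'].index.tolist(). This will also give you every occurrence of the value you are looking for. - Adam Slack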
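The column loop suggested above computes one list per column. A small sketch that collects those lists into a dictionary keyed by column name might look like this; the sample frame and the name matches are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical sample frame with the value in two different columns.
df = pd.DataFrame(
    {"a": ["x", "security_id"], "b": ["security_id", "y"], "c": [1, 2]}
)

matches = {}
for col in df.columns:
    # Row labels where this column equals the search value.
    hits = df[df[col] == "security_id"].index.tolist()
    if hits:
        matches[col] = hits
print(matches)
```

Columns without a match are left out, so an empty dictionary means the value does not occur anywhere in the frame.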

2

Suppose your DataFrame looks like this:

      0       1            2      3    4
0     a      er          tfr    sdf   34
1    rt     tyh          fgd    thy  rer
2     1       2            3      4    5
3     6       7            8      9   10
4   dsf     wew  security_id   name  age
5   dfs    bgbf          121  jason   34
6  dddp    gpot         5754   mike   37
7  fpoo  werwrw          342   jack   31

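Do the following:

found = False
for row in range(df.shape[0]):  # df is the DataFrame
    for col in range(df.shape[1]):
        # df.iat is positional scalar access (DataFrame.get_value was removed in pandas 1.0)
        if df.iat[row, col] == 'security_id':
            print(row, col)
            found = True
            break
    if found:  # also leave the outer loop once the value is found
        break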

- Ujjwal

1

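Thanks, this looks like a solution :) But is iterating over the rows and columns the only way to find the value? Is there a more efficient way? - Kemeia

Whatever you do, iteration is unavoidable. Either you do it yourself or Pandas does it for you; internally there is always iteration. Also, the iteration stops as soon as the ID is found. The worst case is when security_id is the bottom-right element of your DataFrame (O(mn)). If security_id is in the upper-left part of the DataFrame, the cost is almost zero. - Ujjwal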

1

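Besides, what you are asking for is data cleaning, so this is a cheap preprocessing step. Don't try to super-optimize everything. Premature optimization is the root of all evil. Keep that in mind. - Ujjwal

Yes, that makes sense; I thought that might be the case (iteration). Thanks for the explanation. - Kemeia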
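To illustrate the point that pandas can do the iteration internally, here is a rough sketch that locates the first matching cell with a vectorized mask and then uses it as the header row, which is the goal stated in the question. The frame raw is a hypothetical cut-down version of the sample data above, and the slicing convention (header cell is the top-left corner of the real table) is an assumption.

```python
import numpy as np
import pandas as pd

# Hypothetical cut-down version of the sample data in the answer above.
raw = pd.DataFrame(
    [["a", "er", "tfr", "sdf"],
     ["dsf", "wew", "security_id", "name"],
     ["dfs", "bgbf", 121, "jason"],
     ["dddp", "gpot", 5754, "mike"]]
)

mask = (raw == "security_id").to_numpy()

# argmax on the flattened mask gives the first True in row-major order;
# this is only meaningful when at least one cell matches.
if mask.any():
    row, col = np.unravel_index(mask.argmax(), mask.shape)
    # Keep everything below and to the right of the header cell.
    table = raw.iloc[row + 1:, col:].copy()
    table.columns = raw.iloc[row, col:].tolist()
    table = table.reset_index(drop=True)
    print(table)
```

Unlike the explicit loop, this builds the full mask before looking for the match, so it does not stop early; for a one-off cleaning step that cost is usually small.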

0

This function finds the positions of a value in a DataFrame:

import pandas as pd
import numpy as np

def pandasFindPositionsInDataframe(dfIn,findme):
    positions = []
    irow =0
    while ( irow < len(dfIn.index)):
        list_colPositions=dfIn.columns[dfIn.iloc[irow,:]==findme].tolist()   
        if list_colPositions != []:
            colu_iloc = dfIn.columns.get_loc(list_colPositions[0])
            positions.append([irow, colu_iloc])
        irow +=1

    return positions

- tasos

How can this be made case-insensitive? - Utsav Talwar
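One possible way to approach the case-insensitive question, as a sketch only: convert every cell to a lowercase string before comparing. The helper name find_positions_case_insensitive and the sample frame are hypothetical and not part of pandas or of the answer above.

```python
import pandas as pd

def find_positions_case_insensitive(df_in, find_me):
    """Hypothetical helper: [row, column] integer positions of every cell
    whose string form equals find_me, ignoring case."""
    # Convert every cell to a string and lowercase it, column by column.
    lowered = df_in.astype(str).apply(lambda s: s.str.lower())
    mask = (lowered == str(find_me).lower()).to_numpy()
    # nonzero() returns the row positions and column positions of the True cells.
    rows, cols = mask.nonzero()
    return [[int(r), int(c)] for r, c in zip(rows, cols)]

# Made-up sample data with two differently cased occurrences.
sample = pd.DataFrame([["x", "Security_ID"], ["SECURITY_id", 5]])
print(find_positions_case_insensitive(sample, "security_id"))
```

Note that this sketch reports every matching cell, whereas the function above records only the first matching column in each row. Because cells are compared as strings, a numeric 5 would also match the search term "5".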


- Ravishankar Sivasubramaniam · Accepted Answer

Get the index of the rows in which any column matches the search term:

search = 'security_id' 
df.loc[df.isin([search]).any(axis=1)].index.tolist()

Filter the rows in which any column matches the search term:

search = 'search term' 
df.loc[df.isin([search]).any(axis=1)]
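The isin approach above returns the matching row labels. As a small sketch under assumed sample data, the same mask can also report which columns contain the value by reducing along the other axis; the frame and the names row_labels and col_labels are illustrative only.

```python
import pandas as pd

# Hypothetical sample frame for illustration.
df = pd.DataFrame({"a": ["x", "dsf", "dfs"], "b": ["y", "security_id", 121]})

search = "security_id"
mask = df.isin([search])

# any(axis=1) flags rows containing the value; any(axis=0) flags columns.
row_labels = df.index[mask.any(axis=1).to_numpy()].tolist()
col_labels = df.columns[mask.any(axis=0).to_numpy()].tolist()
print(row_labels, col_labels)
```

When the value occurs more than once, the two lists describe rows and columns independently, so they do not identify individual cells on their own.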