Filtering a Pandas DataFrame using a list

3

I am trying to filter a DataFrame using a list of user IDs and a mask. Here is the input, which contains two user IDs:

import numpy as np
import pandas as pd

data = np.array([['user_id','comment','label'],
            [100,'First comment',0],
            [101,'Buy viagra',1],
            [100,'Buy viagra two',1],
            [101,'Third comment',0],
            [100,'Third comment two',0],
            [101,'Buy drugs',1],
            [100,'Buy drugs two',1],
            [101,'Buy icecream',1],
            [100,'Buy icecream two',1],
            [101,'Buy something',1],
            [100,'Buy something two',1]])

The desired output is:
0      100      First comment     0
1      101         Buy viagra     1
2      100     Buy viagra two     1
3      101      Third comment     0
4      100  Third comment two     0
5      101          Buy drugs     1
6      100      Buy drugs two     1
7      101       Buy icecream     1
8      100   Buy icecream two     1

When I pass a list of user IDs, I get the wrong output.

m = df.user_id.isin([100,101]) & df.label.eq('1')
i = df[m].head(3)
j = df[~m]
df = pd.concat([i, j]).sort_index()
print (df)

However, if I pass only a single user_id as shown below, I get the correct output. Can you tell me what is going wrong? Thanks.

m = df.user_id.eq('101') & df.label.eq('1')
1 Answer

4

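Your problem is that the values in the user_id column are strings, so you need to use ['100','101'] instead of [100,101]. A NumPy array holds a single dtype, so np.array converts the integers to strings when they are mixed with text: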

df = pd.DataFrame(data[1:], columns=data[0])

m = df.user_id.isin(['100','101']) & df.label.eq('1')
i = df[m]
print (i)
   user_id            comment label
1      101         Buy viagra     1
2      100     Buy viagra two     1
5      101          Buy drugs     1
6      100      Buy drugs two     1
7      101       Buy icecream     1
8      100   Buy icecream two     1
9      101      Buy something     1
10     100  Buy something two     1
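
If you would rather keep the integer list from the question, another option is to convert the columns to integers first. This is only a minimal sketch, assuming every value in user_id and label parses as an integer; the shortened sample data and the name `spam` are made up for illustration:

```python
import numpy as np
import pandas as pd

# Shortened version of the question's data (np.array turns everything into strings)
data = np.array([['user_id', 'comment', 'label'],
                 [100, 'First comment', 0],
                 [101, 'Buy viagra', 1],
                 [100, 'Buy viagra two', 1],
                 [101, 'Third comment', 0]])

df = pd.DataFrame(data[1:], columns=data[0])

# Assumption: both columns contain only integer-like strings
df = df.astype({'user_id': int, 'label': int})

# Integer comparisons now match, as the question originally intended
m = df.user_id.isin([100, 101]) & df.label.eq(1)
spam = df[m]
print(spam)
```

With the columns converted, both `isin([100, 101])` and `eq(1)` compare integers with integers, so no quoting is needed.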

You can check the type of the values in a single column with:
print (df.user_id.apply(type))

0     <class 'str'>
1     <class 'str'>
2     <class 'str'>
3     <class 'str'>
4     <class 'str'>
5     <class 'str'>
6     <class 'str'>
7     <class 'str'>
8     <class 'str'>
9     <class 'str'>
10    <class 'str'>
Name: user_id, dtype: object
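
For a long column, printing one type per row gets unwieldy. A possible shortcut is to collect the distinct types into a set. This is a sketch on a two-row sample, and the variable name `types` is just illustrative:

```python
import numpy as np
import pandas as pd

data = np.array([['user_id', 'comment', 'label'],
                 [100, 'First comment', 0],
                 [101, 'Buy viagra', 1]])
df = pd.DataFrame(data[1:], columns=data[0])

# Collect the distinct Python types found in the column
types = set(df['user_id'].map(type))
print(types)
```

A set containing only `str` confirms that every value in the column is a string.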

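If you need to check all columns (note that in pandas 2.1 and later, applymap is deprecated in favor of DataFrame.map):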

print (df.applymap(type))

          user_id        comment          label
0   <class 'str'>  <class 'str'>  <class 'str'>
1   <class 'str'>  <class 'str'>  <class 'str'>
2   <class 'str'>  <class 'str'>  <class 'str'>
3   <class 'str'>  <class 'str'>  <class 'str'>
4   <class 'str'>  <class 'str'>  <class 'str'>
5   <class 'str'>  <class 'str'>  <class 'str'>
6   <class 'str'>  <class 'str'>  <class 'str'>
7   <class 'str'>  <class 'str'>  <class 'str'>
8   <class 'str'>  <class 'str'>  <class 'str'>
9   <class 'str'>  <class 'str'>  <class 'str'>
10  <class 'str'>  <class 'str'>  <class 'str'>
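
The strings come from np.array, which stores all elements with a single dtype. The sketch below shows that effect, and shows that building the frame from plain Python rows keeps the integers. The names `mixed`, `rows` and `df2` are made up for this example:

```python
import numpy as np
import pandas as pd

# Mixing integers and text in np.array yields a Unicode string array
mixed = np.array([[100, 'First comment', 0],
                  [101, 'Buy viagra', 1]])
print(mixed.dtype.kind)  # 'U' stands for a Unicode string dtype

# Passing plain Python lists lets pandas infer a dtype per column
rows = [[100, 'First comment', 0],
        [101, 'Buy viagra', 1]]
df2 = pd.DataFrame(rows, columns=['user_id', 'comment', 'label'])
print(df2.dtypes)

# The integer list from the question now works unchanged
m = df2.user_id.isin([100, 101]) & df2.label.eq(1)
print(df2[m])
```

If the data does not have to pass through NumPy, constructing the DataFrame this way avoids the string conversion altogether.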
