In pandas, how do I get value_counts() for a Series that contains lists?

3

I have a pandas Series, df.files, that looks like this:

In [79]: df.files
Out[79]:
0        [{'url': 'http://www.apkmirror.com/wp-content/...
1        [{'url': 'http://www.apkmirror.com/wp-content/...
2        [{'url': 'http://www.apkmirror.com/wp-content/...
3        [{'url': 'http://www.apkmirror.com/wp-content/...
4        [{'url': 'http://www.apkmirror.com/wp-content/...
5        [{'url': 'http://www.apkmirror.com/wp-content/...
6        [{'url': 'http://www.apkmirror.com/wp-content/...
7        [{'url': 'http://www.apkmirror.com/wp-content/...
8        [{'url': 'http://www.apkmirror.com/wp-content/...
9        [{'url': 'http://www.apkmirror.com/wp-content/...
10       [{'url': 'http://www.apkmirror.com/wp-content/...
11       [{'url': 'http://www.apkmirror.com/wp-content/...
12       [{'url': 'http://www.apkmirror.com/wp-content/...
13       [{'url': 'http://www.apkmirror.com/wp-content/...
14       [{'url': 'http://www.apkmirror.com/wp-content/...
15       [{'url': 'http://www.apkmirror.com/wp-content/...
16       [{'url': 'http://www.apkmirror.com/wp-content/...
17       [{'url': 'http://www.apkmirror.com/wp-content/...
18       [{'url': 'http://www.apkmirror.com/wp-content/...
19       [{'url': 'http://www.apkmirror.com/wp-content/...
20       [{'url': 'http://www.apkmirror.com/wp-content/...
21       [{'url': 'http://www.apkmirror.com/wp-content/...
22       [{'url': 'http://www.apkmirror.com/wp-content/...
23       [{'url': 'http://www.apkmirror.com/wp-content/...
24       [{'url': 'http://www.apkmirror.com/wp-content/...
25       [{'url': 'http://www.apkmirror.com/wp-content/...
26       [{'url': 'http://www.apkmirror.com/wp-content/...
27       [{'url': 'http://www.apkmirror.com/wp-content/...
28       [{'url': 'http://www.apkmirror.com/wp-content/...
29       [{'url': 'http://www.apkmirror.com/wp-content/...
                               ...                        
16487    [{'url': 'http://www.apkmirror.com/wp-content/...
16488                                                   []
16489    [{'url': 'http://www.apkmirror.com/wp-content/...
16490    [{'url': 'http://www.apkmirror.com/wp-content/...
16491                                                   []
16492    [{'url': 'http://www.apkmirror.com/wp-content/...
16493    [{'url': 'http://www.apkmirror.com/wp-content/...
16494    [{'url': 'http://www.apkmirror.com/wp-content/...
16495                                                   []
16496                                                   []
16497                                                   []
16498    [{'url': 'http://www.apkmirror.com/wp-content/...
16499    [{'url': 'http://www.apkmirror.com/wp-content/...
16500    [{'url': 'http://www.apkmirror.com/wp-content/...
16501    [{'url': 'http://www.apkmirror.com/wp-content/...
16502    [{'url': 'http://www.apkmirror.com/wp-content/...
16503                                                   []
16504                                                   []
16505                                                   []
16506                                                   []
16507                                                   []
16508                                                   []
16509                                                   []
16510                                                   []
16511                                                   []
16512                                                   []
16513                                                   []
16514                                                   []
16515                                                   []
16516                                                   []

Some of the values are empty lists, while the others are lists containing a single dict, which looks like this:
In [80]: df.files.loc[0]
Out[80]: 
[{'checksum': '9f6075f4c561792e48354277b46a6810',
  'path': 'full/80832b9fca82ce0f58f4d23c511e5a1d657c40e8.php?id=2968',
  'url': 'http://www.apkmirror.com/wp-content/themes/APKMirror/download.php?id=2968'}]

I want to find out how many empty lists there are in df.files. However, if I try df.files.value_counts(), I get TypeError: unhashable type: 'list'. How can I get around this?

3 Answers

13
If you want to use value_counts, you can convert the lists to tuples first:
vc = df.files.apply(tuple).value_counts()
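
One caveat on the tuple conversion: a tuple is hashable only if its elements are, and here each non-empty list holds a dict, so a plain apply(tuple) may still run into unhashable values. Below is a minimal sketch that also converts each dict, assuming the dicts are flat and their values are hashable. The sample Series `files` and the helper `to_hashable` are hypothetical stand-ins, not part of the original post.

```python
import pandas as pd

# Hypothetical stand-in for df.files: lists holding one flat dict, or empty lists.
files = pd.Series([
    [{'url': 'http://example.com/a', 'checksum': 'abc'}],
    [],
    [{'url': 'http://example.com/a', 'checksum': 'abc'}],
    [],
    [],
])

def to_hashable(items):
    """Convert a list of flat dicts into a tuple of sorted (key, value) tuples."""
    return tuple(tuple(sorted(d.items())) for d in items)

# Every element is now a (possibly empty) nested tuple, so it can be hashed.
vc = files.apply(to_hashable).value_counts()

# The empty tuple stands for the empty lists; here it is the most frequent value.
most_common_count = int(vc.iloc[0])
total = int(vc.sum())
```

Sorting the items makes two dicts with the same content map to the same key regardless of insertion order.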

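But if you only need the number of empty lists, use str.len to get the length of each list, compare it with 0, and sum the True values of the resulting boolean mask: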

l = (df['files'].str.len() == 0).sum()

If NaN values cannot occur, you can use IanS's solution instead:

l = (df['files'].apply(len) == 0).sum()
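
The difference between the two variants shows up when a value is missing. The following is a small sketch on a made-up Series `s` (an assumption for illustration, not the asker's data) that contains one NaN:

```python
import numpy as np
import pandas as pd

# Hypothetical sample: one non-empty list, two empty lists, one missing value.
s = pd.Series([[{'url': 'a'}], [], np.nan, []])

# .str.len() yields NaN for the missing entry; NaN == 0 is False,
# so the missing value is simply not counted as an empty list.
empty_via_str = int((s.str.len() == 0).sum())

# len() is not defined for the float NaN, so apply(len) raises TypeError.
try:
    s.apply(len)
    apply_failed = False
except TypeError:
    apply_failed = True
```

With this sample, `empty_via_str` is 2 and `apply_failed` is True, which is why the apply(len) form is only safe when the column has no missing values.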

2
If you are only looking for empty lists, why use value_counts at all?
len([i for i in df.files if len(i) == 0])

1
Or (df['files'].apply(len) == 0).sum() (probably faster) - IanS
@IanS Yes, you're right. I had never used that before. I definitely will from now on. - A.Kot
Hmm, df['files'].str.len() might be faster still (see the other answer). - IanS

0
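You can write a for loop that iterates over the lists. Note that the counter has to be initialized before the loop, otherwise it is reset on every iteration:
count = 0
for i in df.files:
    # count only the empty lists
    if len(i) == 0:
        count = count + 1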
