I have a column with more than 800 rows of data that looks like this:
0 ['Overgrow', 'Chlorophyll']
1 ['Overgrow', 'Chlorophyll']
2 ['Overgrow', 'Chlorophyll']
3 ['Blaze', 'Solar Power']
4 ['Blaze', 'Solar Power']
5 ['Blaze', 'Solar Power']
6 ['Torrent', 'Rain Dish']
7 ['Torrent', 'Rain Dish']
8 ['Torrent', 'Rain Dish']
9 ['Shield Dust', 'Run Away']
10 ['Shed Skin']
11 ['Compoundeyes', 'Tinted Lens']
12 ['Shield Dust', 'Run Away']
13 ['Shed Skin']
14 ['Swarm', 'Sniper']
15 ['Keen Eye', 'Tangled Feet', 'Big Pecks']
16 ['Keen Eye', 'Tangled Feet', 'Big Pecks']
17 ['Keen Eye', 'Tangled Feet', 'Big Pecks']
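What do I want?
- I want to count how many times each string value occurs.
- I also want to collect the unique string values into a list.
Here is what I did to get the second part: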
import re

list_ability = df_pokemon['abilities'].tolist()
new_list = []
for i in range(0, len(list_ability)):
    # pull every single-quoted name out of the cell's string form
    m = re.findall(r"'(.*?)'", list_ability[i], re.DOTALL)
    for j in range(0, len(m)):
        new_list.append(m[j])
# keep only the distinct names
list1 = set(new_list)
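I was able to get the unique string values into a list, but is there a better way?
Example:
'Overgrow' - 3
'Chlorophyll' - 3
'Blaze' - 3
'Shield Dust' - 2 ... and so on
(By the way, the column is named 'abilities' and comes from the DataFrame df_pokemon.)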
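One possible way to get both results in a single pass is sketched below. It assumes, as the regex in the question suggests, that each cell holds the string form of a list rather than a real list. The `rows` list is a hypothetical stand-in for `df_pokemon['abilities'].tolist()`, and the variable names are illustrative only.

```python
import ast
from collections import Counter
from itertools import chain

# Hypothetical stand-in for df_pokemon['abilities'].tolist();
# each cell is assumed to be the *string* form of a list.
rows = [
    "['Overgrow', 'Chlorophyll']",
    "['Overgrow', 'Chlorophyll']",
    "['Blaze', 'Solar Power']",
    "['Shed Skin']",
]

# Turn each string into a real list, then flatten into one sequence.
parsed = [ast.literal_eval(cell) for cell in rows]
flat = list(chain.from_iterable(parsed))

# Goal 1: number of occurrences of each ability.
counts = Counter(flat)

# Goal 2: unique abilities as a list, in first-seen order.
unique_abilities = list(counts)

for ability, n in counts.items():
    print(f"{ability!r} - {n}")
```

If the cells are already real lists, the `ast.literal_eval` step can be dropped and the rows flattened directly.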
from collections import Counter; counts = df_pokemon.abilities.map(Counter).sum()
? - Jon Clements
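The idea behind that comment, one Counter per row added together, can be illustrated in plain Python. This is only a sketch with hypothetical sample rows; it does not check how the pandas expression itself behaves. It also shows why the cells would need to be real lists first: a Counter built from a string counts characters.

```python
from collections import Counter

# Hypothetical sample rows, already parsed into real lists.
rows = [
    ['Keen Eye', 'Tangled Feet', 'Big Pecks'],
    ['Keen Eye', 'Tangled Feet', 'Big Pecks'],
    ['Swarm', 'Sniper'],
]

# One Counter per row, added together; Counter() is the start value
# so that sum() does not begin from the integer 0.
total = sum(map(Counter, rows), Counter())

# Pitfall: a Counter built from a string counts characters, not abilities.
per_char = Counter("['Swarm', 'Sniper']")

print(total)
print(per_char.most_common(3))
```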