How do I count the duplicate elements in a list in Python?

6

I have this list:

['Boston Americans', 'New York Giants', 'Chicago White Sox', 'Chicago Cubs', 'Chicago Cubs', 'Pittsburgh Pirates', 'Philadelphia Athletics', 'Philadelphia Athletics', 'Boston Red Sox', 'Philadelphia Athletics', 'Boston Braves', 'Boston Red Sox', 'Boston Red Sox', 'Chicago White Sox', 'Boston Red Sox', 'Cincinnati Reds', 'Cleveland Indians', 'New York Giants', 'New York Giants', 'New York Yankees', 'Washington Senators', 'Pittsburgh Pirates', 'St. Louis Cardinals', 'New York Yankees', 'New York Yankees', 'Philadelphia Athletics', 'Philadelphia Athletics', 'St. Louis Cardinals', 'New York Yankees', 'New York Giants', 'St. Louis Cardinals', 'Detroit Tigers', 'New York Yankees', 'New York Yankees', 'New York Yankees', 'New York Yankees', 'Cincinnati Reds', 'New York Yankees', 'St. Louis Cardinals', 'New York Yankees', 'St. Louis Cardinals', 'Detroit Tigers', 'St. Louis Cardinals', 'New York Yankees', 'Cleveland Indians', 'New York Yankees', 'New York Yankees']

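How can I remove the duplicates from this list without using the count, append, or set methods, and without importing other modules?

Or, what I really want: how can I turn this list into the following output: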

Boston Americans 5
New York Giants 2
team_name  number_of_duplicates
team_name  number_of_duplicates
team_name  number_of_duplicates

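Do you want to remove the duplicates, or just count how many times each one occurs? - Padraic Cunningham
I want it to output each name along with the number of times that name appears in the list, like the example I gave. Just without using the count, append, or set methods. - adrianhmartinez
2
If the reason for avoiding specific functions is an assignment, you can most likely infer from the lectures what you are expected to use. Typically you would sort first, then walk from the first element to the last; whenever the current element differs from the previous one, you open a new "group" and print the count of the preceding "group". - eckes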
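
The sort-then-scan idea from the comment above can be sketched roughly as follows. This is only one possible reading of that comment, not code from the thread: the function name count_sorted_runs and the short sample list are made up for illustration, and it relies on the built-in sorted() function, which is an assumption about what the assignment permits.

```python
def count_sorted_runs(items):
    """Return [(value, count), ...] by sorting and scanning runs of equal values."""
    ordered = sorted(items)          # built-in function, not a list method
    result = []
    if not ordered:
        return result
    current = ordered[0]
    run_length = 1
    for value in ordered[1:]:
        if value == current:
            run_length += 1          # still inside the same group
        else:
            # the group ended: record it (list concatenation instead of append)
            result = result + [(current, run_length)]
            current = value
            run_length = 1
    result = result + [(current, run_length)]   # record the final group
    return result


# Hypothetical short list, used only to keep the example small.
sample = ['Chicago Cubs', 'Boston Braves', 'Chicago Cubs']
for team, count in count_sorted_runs(sample):
    print("{} {}".format(team, count))
```

Because the input is sorted first, the teams come out in alphabetical order rather than in the order they appear in the original list.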
7 Answers

19

To count the occurrences of each entry in a list, you can use the Counter class from the collections module:

l =['Boston Americans', 'New York Giants', 'Chicago White Sox', 'Chicago Cubs', 'Chicago Cubs', 'Pittsburgh Pirates', 'Philadelphia Athletics', 'Philadelphia Athletics', 'Boston Red Sox', 'Philadelphia Athletics', 'Boston Braves', 'Boston Red Sox', 'Boston Red Sox', 'Chicago White Sox', 'Boston Red Sox', 'Cincinnati Reds', 'Cleveland Indians', 'New York Giants', 'New York Giants', 'New York Yankees', 'Washington Senators', 'Pittsburgh Pirates', 'St. Louis Cardinals', 'New York Yankees', 'New York Yankees', 'Philadelphia Athletics', 'Philadelphia Athletics', 'St. Louis Cardinals', 'New York Yankees', 'New York Giants', 'St. Louis Cardinals', 'Detroit Tigers', 'New York Yankees', 'New York Yankees', 'New York Yankees', 'New York Yankees', 'Cincinnati Reds', 'New York Yankees', 'St. Louis Cardinals', 'New York Yankees', 'St. Louis Cardinals', 'Detroit Tigers', 'St. Louis Cardinals', 'New York Yankees', 'Cleveland Indians', 'New York Yankees', 'New York Yankees']

from collections import Counter
c = Counter(l) 
print(c)

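c is a Counter object that holds the number of occurrences of each distinct entry/key in the list. Because Counter is a subclass of dict, you can access it like any other dictionary.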

Counter({'New York Yankees': 13, 'St. Louis Cardinals': 6, 'Philadelphia Athletics': 5, 'New York Giants': 4, 'Boston Red Sox': 4, 'Chicago White Sox': 2, 'Pittsburgh Pirates': 2, 'Detroit Tigers': 2, 'Cincinnati Reds': 2, 'Cleveland Indians': 2, 'Chicago Cubs': 2, 'Boston Americans': 1, 'Boston Braves': 1, 'Washington Senators': 1})
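
To get the "name count" lines the question asks for, a Counter can be indexed and iterated like a dictionary. The sketch below uses a made-up short list rather than the full one; most_common() is a real Counter method that yields (element, count) pairs from the most frequent to the least frequent.

```python
from collections import Counter

# Hypothetical short list, used only to keep the example small.
teams = ['New York Yankees', 'Chicago Cubs', 'New York Yankees']
counts = Counter(teams)

# Dictionary-style lookup; a missing key yields 0 instead of raising KeyError.
print(counts['New York Yankees'])
print(counts['Detroit Tigers'])

# One "team count" line per distinct team, most frequent first.
for team, count in counts.most_common():
    print("{} {}".format(team, count))
```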

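Why the downvote here? This is exactly what the OP asked for: "Or, what I really want: how can I print this list like this." Well, at least it was before he added the no-imports part... - PeterE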

6
l =['Boston Americans', 'New York Giants', 'Chicago White Sox', 'Chicago Cubs', 'Chicago Cubs', 'Pittsburgh Pirates', 'Philadelphia Athletics', 'Philadelphia Athletics', 'Boston Red Sox', 'Philadelphia Athletics', 'Boston Braves', 'Boston Red Sox', 'Boston Red Sox', 'Chicago White Sox', 'Boston Red Sox', 'Cincinnati Reds', 'Cleveland Indians', 'New York Giants', 'New York Giants', 'New York Yankees', 'Washington Senators', 'Pittsburgh Pirates', 'St. Louis Cardinals', 'New York Yankees', 'New York Yankees', 'Philadelphia Athletics', 'Philadelphia Athletics', 'St. Louis Cardinals', 'New York Yankees', 'New York Giants', 'St. Louis Cardinals', 'Detroit Tigers', 'New York Yankees', 'New York Yankees', 'New York Yankees', 'New York Yankees', 'Cincinnati Reds', 'New York Yankees', 'St. Louis Cardinals', 'New York Yankees', 'St. Louis Cardinals', 'Detroit Tigers', 'St. Louis Cardinals', 'New York Yankees', 'Cleveland Indians', 'New York Yankees', 'New York Yankees']

for team in [ele for ind, ele in enumerate(l,1) if ele not in l[ind:]]:
    print("{} {}".format(team,l.count(team)))

Boston Americans 1
Chicago Cubs 2
Boston Braves 1
Chicago White Sox 2
Boston Red Sox 4
Washington Senators 1
Pittsburgh Pirates 2
Philadelphia Athletics 5
New York Giants 4
Cincinnati Reds 2
Detroit Tigers 2
St. Louis Cardinals 6
Cleveland Indians 2
New York Yankees 13

Without using list.count at all:

for team in [ele for ind, ele in enumerate(l,1) if ele not in l[ind:]]:
    count = 0
    for ele in l:
        if team == ele:
            count += 1
    print("{} {}".format(team,count))

Boston Americans 1
Chicago Cubs 2
Boston Braves 1
Chicago White Sox 2
Boston Red Sox 4
Washington Senators 1
Pittsburgh Pirates 2
Philadelphia Athletics 5
New York Giants 4
Cincinnati Reds 2
Detroit Tigers 2
St. Louis Cardinals 6
Cleveland Indians 2
New York Yankees 13

You did not say whether you may use a dictionary, so:

d = {}

for team in l:
    # if we have not seen team before, create k/v pairing
    # setting value to 0, if team already in dict this does nothing
    d.setdefault(team,0)
    # increase the count for the team
    d[team] += 1
for team, count in d.items():
    print("{} {}".format(team,count))

Chicago White Sox 2
New York Giants 4
Cincinnati Reds 2
Boston Red Sox 4
New York Yankees 13
Philadelphia Athletics 5
Pittsburgh Pirates 2
St. Louis Cardinals 6
Washington Senators 1
Boston Braves 1
Boston Americans 1
Cleveland Indians 2
Detroit Tigers 2
Chicago Cubs 2
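
As a small variation on the loop above, dict.get with a default of 0 folds the setdefault call and the increment into a single line. This is only a sketch, and the short teams list is made up for illustration.

```python
# Hypothetical short list, used only to keep the example small.
teams = ['Chicago Cubs', 'Boston Braves', 'Chicago Cubs']

counts = {}
for team in teams:
    # get() returns the current count, or 0 if the team has not been seen yet
    counts[team] = counts.get(team, 0) + 1

for team, count in counts.items():
    print("{} {}".format(team, count))
```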

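Could you explain the following part of your code? [ele for ind, ele in enumerate(l,1) if ele not in l[ind:]] - Erik Åsland
Actually, would you mind explaining the dictionary example you gave? I'd like to understand it in more depth. - Erik Åsland
@ea87, the dictionary simply uses each team name as a key, and sets the value to 0 when we encounter a new team. Each += 1 just increments that team's count in the dictionary, so in the end we have the frequency/count for each team. If we were not restricted from importing modules, I would use a collections.Counter dict to simplify this. - Padraic Cunningham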
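
The list comprehension asked about in the first comment was not explained in the thread. One way to read it: enumerate(l, 1) pairs each element with the index of the position just after it, so l[ind:] is everything to the right of that element, and the element is kept only if it does not occur again later. In other words, it keeps the last occurrence of each value. A small sketch with a made-up list:

```python
# Hypothetical short list, used only to keep the example small.
sample = ['a', 'b', 'a', 'c', 'b']

# Keep an element only when it does not appear anywhere after its own position.
last_occurrences = [ele for ind, ele in enumerate(sample, 1)
                    if ele not in sample[ind:]]

print(last_occurrences)  # ['a', 'c', 'b']
```

This is why the output of that answer lists the teams in the order of their last appearance rather than their first.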

2
players = ['Boston Americans', 'New York Giants', 'Chicago White Sox', 'Chicago Cubs', 'Chicago Cubs', 'Pittsburgh Pirates', 'Philadelphia Athletics', 'Philadelphia Athletics', 'Boston Red Sox', 'Philadelphia Athletics', 'Boston Braves', 'Boston Red Sox', 'Boston Red Sox', 'Chicago White Sox', 'Boston Red Sox', 'Cincinnati Reds', 'Cleveland Indians', 'New York Giants', 'New York Giants', 'New York Yankees', 'Washington Senators', 'Pittsburgh Pirates', 'St. Louis Cardinals', 'New York Yankees', 'New York Yankees', 'Philadelphia Athletics', 'Philadelphia Athletics', 'St. Louis Cardinals', 'New York Yankees', 'New York Giants', 'St. Louis Cardinals', 'Detroit Tigers', 'New York Yankees', 'New York Yankees', 'New York Yankees', 'New York Yankees', 'Cincinnati Reds', 'New York Yankees', 'St. Louis Cardinals', 'New York Yankees', 'St. Louis Cardinals', 'Detroit Tigers', 'St. Louis Cardinals', 'New York Yankees', 'Cleveland Indians', 'New York Yankees', 'New York Yankees']

players_details, players_name = [], []
for each_player in players:
    if not(each_player in players_name):
        players_name = players_name + [each_player]
        players_details = players_details + [[each_player, 1]]
    else:
        for index in range(len(players_details)):
            if players_details[index][0] == each_player:
                players_details[index][1] = players_details[index][1]+1

for each in players_details:
    print('{} : {}'.format(*each))

Result:

Boston Americans : 1
New York Giants : 4
Chicago White Sox : 2
Chicago Cubs : 2
Pittsburgh Pirates : 2
Philadelphia Athletics : 5
Boston Red Sox : 4
Boston Braves : 1
Cincinnati Reds : 2
Cleveland Indians : 2
New York Yankees : 13
Washington Senators : 1
St. Louis Cardinals : 6
Detroit Tigers : 2

0

I used this code:

from collections import Counter
a=input().split()
print(a)
c=Counter(a) 
for i in c:
    print(str(i),"appears", c[i],"times")

It produced this result: [image: output of the code]

Hope this helps.


0

Try this:

new_list =  ['a', 'b', 'a']
new_dict = {}
for i in new_list:
    new_dict[i] = new_list.count(i)
print(new_dict)

Result - {'a': 2, 'b': 1}

-1

You can build a new list, for example:

l = ['Boston Americans', 'New York Giants', 'Chicago White Sox', 'Chicago Cubs', 'Chicago Cubs', 'Pittsburgh Pirates', 'Philadelphia Athletics', 'Philadelphia Athletics', 'Boston Red Sox', 'Philadelphia Athletics', 'Boston Braves', 'Boston Red Sox', 'Boston Red Sox', 'Chicago White Sox', 'Boston Red Sox', 'Cincinnati Reds', 'Cleveland Indians', 'New York Giants', 'New York Giants', 'New York Yankees', 'Washington Senators', 'Pittsburgh Pirates', 'St. Louis Cardinals', 'New York Yankees', 'New York Yankees', 'Philadelphia Athletics', 'Philadelphia Athletics', 'St. Louis Cardinals', 'New York Yankees', 'New York Giants', 'St. Louis Cardinals', 'Detroit Tigers', 'New York Yankees', 'New York Yankees', 'New York Yankees', 'New York Yankees', 'Cincinnati Reds', 'New York Yankees', 'St. Louis Cardinals', 'New York Yankees', 'St. Louis Cardinals', 'Detroit Tigers', 'St. Louis Cardinals', 'New York Yankees', 'Cleveland Indians', 'New York Yankees', 'New York Yankees']
l2 = []
for v in l:
    if v not in l2:
        l2 = l2 + [v]

print(l2)

This gives:

['Boston Americans', 'New York Giants', 'Chicago White Sox', 'Chicago Cubs', 'Pittsburgh Pirates', 'Philadelphia Athletics', 'Boston Red Sox', 'Boston Braves', 'Cincinnati Reds', 'Cleveland Indians', 'New York Yankees', 'Washington Senators', 'St. Louis Cardinals', 'Detroit Tigers']
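
If dictionaries are allowed (the question does not say, so this is an assumption), dict.fromkeys offers another way to drop duplicates without count, append, set, or imports. Dictionaries keep insertion order from Python 3.7 on, so the first occurrence of each team stays in place. The short list below is made up for illustration.

```python
# Hypothetical short list, used only to keep the example small.
teams = ['Chicago Cubs', 'Boston Braves', 'Chicago Cubs', 'Detroit Tigers']

# Each team becomes a dictionary key; repeated keys collapse into one.
unique_teams = list(dict.fromkeys(teams))

print(unique_teams)  # ['Chicago Cubs', 'Boston Braves', 'Detroit Tigers']
```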

Isn't the question "How can I remove duplicates from a list without using the count, append, or set methods, or imports?" - helloV
@helloV It no longer uses append. Thanks, I missed that. - Marcin

-3
list=['Boston Americans', 'New York Giants', 'Chicago White Sox', 'Chicago Cubs', 'Chicago Cubs', 'Pittsburgh Pirates', 'Philadelphia Athletics', 'Philadelphia Athletics', 'Boston Red Sox', 'Philadelphia Athletics', 'Boston Braves', 'Boston Red Sox', 'Boston Red Sox', 'Chicago White Sox', 'Boston Red Sox', 'Cincinnati Reds', 'Cleveland Indians', 'New York Giants', 'New York Giants', 'New York Yankees', 'Washington Senators', 'Pittsburgh Pirates', 'St. Louis Cardinals', 'New York Yankees', 'New York Yankees', 'Philadelphia Athletics', 'Philadelphia Athletics', 'St. Louis Cardinals', 'New York Yankees', 'New York Giants', 'St. Louis Cardinals', 'Detroit Tigers', 'New York Yankees', 'New York Yankees', 'New York Yankees', 'New York Yankees', 'Cincinnati Reds', 'New York Yankees', 'St. Louis Cardinals', 'New York Yankees', 'St. Louis Cardinals', 'Detroit Tigers', 'St. Louis Cardinals', 'New York Yankees', 'Cleveland Indians', 'New York Yankees', 'New York Yankees']
list1=[]
list2=[]
for x in list:
    if x not in list1:
        list1.append(x)
    if x in list1:
        list2.append(x)
list2.sort()
for num, og in enumerate(list2, 1):
    print(num, og)

Content provided by Stack Overflow.