I'm having trouble filtering the results of a groupby in pandas. What I want to do is the equivalent of
select email, count(1) as cnt
from customers
group by email
having count(email) > 1
order by cnt desc
I tried
customers.groupby('Email')['CustomerID'].size()
which correctly gives me the list of emails and their respective counts, but I can't work out how to do the having count(email) > 1 part.
email_cnt[email_cnt.size > 1]
returns 1, and
email_cnt = customers.groupby('Email')
email_dup = email_cnt.filter(lambda x:len(x) > 2)
gives me the full customer records for emails with count > 1, but I need the aggregated table instead.
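
For reference, here is a minimal sketch of what the aggregated result might look like. It assumes the counts are kept as a Series and filtered with a boolean mask on the values (email_cnt > 1) rather than on the .size attribute, which is a single number for the whole Series. The sample data, the cnt column name and the result variable are made up for illustration; only the Email and CustomerID column names come from the code above.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
customers = pd.DataFrame({
    "CustomerID": [1, 2, 3, 4, 5, 6],
    "Email": ["a@x.com", "b@x.com", "a@x.com", "c@x.com", "b@x.com", "a@x.com"],
})

# GROUP BY Email with COUNT(1) AS cnt: one row count per email.
email_cnt = customers.groupby("Email").size().rename("cnt")

# HAVING cnt > 1: compare the values element-wise to build a boolean mask.
email_dup = email_cnt[email_cnt > 1]

# ORDER BY cnt DESC, then turn the index back into a regular column.
result = email_dup.sort_values(ascending=False).reset_index()

print(result)
```

With this sample data, c@x.com appears only once and is dropped, so only the duplicated emails remain in the aggregated table.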