Find rows with duplicate values in a column

8

I have a table named author_data:

 author_id | author_name
 ----------+----------------
 9         | ernest jordan
 14        | k moribe
 15        | ernest jordan
 25        | william h nailon 
 79        | howard jason
 36        | k moribe

The result I need is:

 author_id | author_name
 ----------+----------------
 9         | ernest jordan
 15        | ernest jordan
 14        | k moribe
 36        | k moribe

I need to find the author_id of every name that occurs more than once. I have tried this statement:

select author_id,count(author_name)
from author_data
group by author_name
having count(author_name)>1

But it does not work. How can I get this result?

4 Answers

14

I suggest a window function in a subquery:

SELECT author_id, author_name  -- omit the name here if you just need ids
FROM (
   SELECT author_id, author_name
        , count(*) OVER (PARTITION BY author_name) AS ct
   FROM   author_data
   ) sub
WHERE  ct > 1;

You will recognize the basic aggregate function count(). Adding an OVER clause turns it into a window function, just like any other aggregate function.

That way it counts the rows per partition. Voilà.
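
To make the per-partition count concrete, here is a minimal sketch that runs only the inner query on the sample data from the question. It uses Python's built-in sqlite3 module as a stand-in for the real database (an assumption; window functions need SQLite 3.25 or later), and the helper name rows_with_counts is made up for illustration.

```python
import sqlite3

# Sample rows copied from the question.
SAMPLE = [
    (9, "ernest jordan"), (14, "k moribe"), (15, "ernest jordan"),
    (25, "william h nailon"), (79, "howard jason"), (36, "k moribe"),
]

def rows_with_counts():
    """Return (author_id, author_name, ct) for every row of the sample table."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE author_data (author_id INTEGER PRIMARY KEY, author_name TEXT)"
    )
    con.executemany("INSERT INTO author_data VALUES (?, ?)", SAMPLE)
    rows = con.execute(
        """
        SELECT author_id, author_name,
               count(*) OVER (PARTITION BY author_name) AS ct
        FROM   author_data
        ORDER  BY author_name, author_id
        """
    ).fetchall()
    con.close()
    return rows

for row in rows_with_counts():
    print(row)
```

Every row keeps its own author_id and additionally carries the number of rows that share its name, so the outer query only has to keep the rows where ct > 1.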

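This has to be done in a subquery, because the result of a window function cannot be referenced in the WHERE clause of the same SELECT (window functions are evaluated after WHERE).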

Without window functions (PostgreSQL 8.3 or older), or in general, this alternative performs pretty fast:

SELECT author_id, author_name  -- omit name, if you just need ids
FROM   author_data a
WHERE  EXISTS (
   SELECT 1 FROM author_data a2
   WHERE  a2.author_name = a.author_name
   AND    a2.author_id <> a.author_id
   );

If performance matters, add an index on author_name.
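
As a rough sketch of that advice, the snippet below creates such an index and then runs the EXISTS query against the sample data. It again assumes Python's built-in sqlite3 module as a stand-in database; the index name author_data_author_name_idx and the function name duplicate_rows_with_index are hypothetical.

```python
import sqlite3

# Sample rows copied from the question.
SAMPLE = [
    (9, "ernest jordan"), (14, "k moribe"), (15, "ernest jordan"),
    (25, "william h nailon"), (79, "howard jason"), (36, "k moribe"),
]

def duplicate_rows_with_index():
    """Index author_name, then return the rows whose name occurs more than once."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE author_data (author_id INTEGER PRIMARY KEY, author_name TEXT)"
    )
    con.executemany("INSERT INTO author_data VALUES (?, ?)", SAMPLE)
    # Hypothetical index name; the index supports the lookup by name in EXISTS.
    con.execute(
        "CREATE INDEX author_data_author_name_idx ON author_data (author_name)"
    )
    rows = con.execute(
        """
        SELECT author_id, author_name
        FROM   author_data a
        WHERE  EXISTS (
           SELECT 1 FROM author_data a2
           WHERE  a2.author_name = a.author_name
           AND    a2.author_id <> a.author_id
           )
        ORDER  BY author_name, author_id
        """
    ).fetchall()
    con.close()
    return rows

print(duplicate_rows_with_index())
```

On six rows the index makes no measurable difference; it only starts to pay off once the table is large enough that scanning it for every outer row becomes expensive.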


3
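You are already halfway there. You just need to take the duplicated names you identified and fetch the rest of the data for them.
Try this:
SELECT author_id, author_name
FROM author_data
WHERE author_name IN (SELECT author_name
        FROM author_data
        GROUP BY author_name
        HAVING count(author_name) > 1);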

2

You can join the table with itself; either of the following queries works:

SELECT a1.author_id, a1.author_name
FROM author_data a1
INNER JOIN author_data a2
  ON a1.author_id <> a2.author_id
  AND a1.author_name = a2.author_name;

-- 9 |ernest jordan
-- 15|ernest jordan
-- 14|k moribe
-- 36|k moribe

--OR

SELECT a1.author_id, a1.author_name
FROM author_data a1
CROSS JOIN author_data a2
  WHERE a1.author_id <> a2.author_id
  AND a1.author_name = a2.author_name;

-- 9 |ernest jordan
-- 15|ernest jordan
-- 14|k moribe
-- 36|k moribe
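
One thing worth checking with a self-join: a row is returned once per matching partner, so a name that occurs three or more times produces repeated rows unless DISTINCT is added. The sketch below illustrates this with Python's built-in sqlite3 module as a stand-in database (an assumption) and one extra row, (40, 'k moribe'), that is hypothetical and not part of the question's data.

```python
import sqlite3

ROWS = [
    (9, "ernest jordan"), (14, "k moribe"), (15, "ernest jordan"),
    (25, "william h nailon"), (79, "howard jason"), (36, "k moribe"),
    (40, "k moribe"),  # hypothetical third occurrence, not in the question
]

JOIN_SQL = """
    SELECT {select} a1.author_id, a1.author_name
    FROM   author_data a1
    JOIN   author_data a2
      ON   a1.author_id <> a2.author_id
     AND   a1.author_name = a2.author_name
    ORDER  BY a1.author_name, a1.author_id
"""

def self_join(distinct):
    """Run the self-join, optionally with DISTINCT, and return all result rows."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE author_data (author_id INTEGER PRIMARY KEY, author_name TEXT)"
    )
    con.executemany("INSERT INTO author_data VALUES (?, ?)", ROWS)
    sql = JOIN_SQL.format(select="DISTINCT" if distinct else "")
    rows = con.execute(sql).fetchall()
    con.close()
    return rows

print(len(self_join(distinct=False)))  # each 'k moribe' row appears twice
print(self_join(distinct=True))        # one row per author_id
```

With the data exactly as given in the question, every duplicated name occurs only twice, so both variants return the same four rows.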

0

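The whole query below returns the result asked for in the question. If you only want the duplicated names, the inner query on its own is enough. You could also use window functions such as row_number() or dense_rank() to get your answer.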

SELECT a.author_id, a.author_name
FROM   author_data a
JOIN  (
    SELECT author_name
    FROM   author_data
    GROUP  BY author_name
    HAVING count(author_name) > 1
   ) AS temp
ON a.author_name = temp.author_name;
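
For comparison, this is roughly what the inner query returns when run by itself: just the duplicated names, without ids. The sketch assumes Python's built-in sqlite3 module as a stand-in database, and the function name duplicated_names is made up for illustration.

```python
import sqlite3

# Sample rows copied from the question.
SAMPLE = [
    (9, "ernest jordan"), (14, "k moribe"), (15, "ernest jordan"),
    (25, "william h nailon"), (79, "howard jason"), (36, "k moribe"),
]

def duplicated_names():
    """Return the names that occur more than once, in alphabetical order."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE author_data (author_id INTEGER PRIMARY KEY, author_name TEXT)"
    )
    con.executemany("INSERT INTO author_data VALUES (?, ?)", SAMPLE)
    rows = con.execute(
        """
        SELECT author_name
        FROM   author_data
        GROUP  BY author_name
        HAVING count(author_name) > 1
        ORDER  BY author_name
        """
    ).fetchall()
    con.close()
    # Each row is a 1-tuple; unpack it to a plain list of names.
    return [name for (name,) in rows]

print(duplicated_names())
```

Joining these names back to the table, as the full query does, is what recovers the individual author_id values.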
