Finding duplicate records in MySQL

740

I want to pull out duplicate records from a MySQL database. This can be done with:

SELECT address, count(id) as cnt FROM list
GROUP BY address HAVING cnt > 1

Which results in:

100 MAIN ST    2

I would like to pull it so that it shows each row that is a duplicate. Something like:

JIM    JONES    100 MAIN ST
JOHN   SMITH    100 MAIN ST
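
Any thoughts on how this can be done? I'm trying to avoid running the first query and then looking up the duplicates with a second query in the code.

28 Answers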

12

Probably not very efficient, but it should work:

SELECT *
FROM list AS o
WHERE (SELECT COUNT(*)
        FROM list AS i
        WHERE i.address = o.address) > 1;

This query works better than the others, thanks. - Subramanya Rao
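
A possible variation on the correlated subquery above: testing for the existence of another row with the same address, rather than counting all of them, may let the server stop at the first match. This is only a sketch; it assumes `id` is unique per row in `list`, and the aliases `o` and `i` are illustrative.

```sql
-- Return every row that shares its address with at least one other row.
-- Assumes list.id is unique (for example, a primary key).
SELECT o.*
FROM list AS o
WHERE EXISTS (
    SELECT 1
    FROM list AS i
    WHERE i.address = o.address
      AND i.id <> o.id      -- a different row with the same address
)
ORDER BY o.address, o.id;
```

An index on `address` would likely help both forms of the query.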

11

This will select duplicates in one table pass, no subqueries.

SELECT  *
FROM    (
        SELECT  ao.*, (@r := @r + 1) AS rn
        FROM    (
                SELECT  @_address := 'N'
                ) vars,
                (
                SELECT  *
                FROM
                        list a
                ORDER BY
                        address, id
                ) ao
        WHERE   CASE WHEN @_address <> address THEN @r := 0 ELSE 0 END IS NOT NULL
                AND (@_address := address ) IS NOT NULL
        ) aoo
WHERE   rn > 1

This query actually emulates the ROW_NUMBER() function present in Oracle and SQL Server.

See the article in my blog for details:


22
Not to nitpick, but FROM (SELECT ...) aoo is a subquery :-P - gen_Eric
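
On MySQL 8.0 and later, window functions such as ROW_NUMBER() are built in, so the user-variable emulation should not be needed there. A minimal sketch, assuming the `list` table from the question with a unique `id` column; the aliases `numbered`, `counted`, `rn` and `dup_count` are illustrative names:

```sql
-- Rows 2..n of each address group. Like the emulation's "rn > 1" filter,
-- this leaves out the first row of every group.
SELECT *
FROM (
    SELECT l.*,
           ROW_NUMBER() OVER (PARTITION BY address ORDER BY id) AS rn
    FROM list AS l
) AS numbered
WHERE rn > 1;

-- Every row that belongs to a duplicated address, first row included.
SELECT *
FROM (
    SELECT l.*,
           COUNT(*) OVER (PARTITION BY address) AS dup_count
    FROM list AS l
) AS counted
WHERE dup_count > 1
ORDER BY address, id;
```

The second form is probably closer to what the question asks for, since it shows both JIM JONES and JOHN SMITH rather than only the later of the two.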

8
This will also show how many duplicates you have, and will order the results without joins.
SELECT  `Language` , id, COUNT( id ) AS how_many
FROM  `languages` 
GROUP BY  `Language` 
HAVING how_many >=2
ORDER BY how_many DESC

Perfect, because it still shows how many entries are duplicated. - denis
GROUP BY lists only ONE of each duplicate. Suppose there are three? Or fifty? - ToolmakerSteve

5
SELECT id, count(*) as c
 FROM `list`
GROUP BY id HAVING c > 1

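This will return each duplicated id along with the number of times it occurs, or nothing at all if there are no duplicate ids.

Changing the id in the GROUP BY (e.g. to address) will return the number of times an address has been duplicated, identified by the first id found with that address.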

SELECT id, count(*) as c
 FROM `list`
GROUP BY address HAVING c > 1

Hope this helps. Enjoy ;)
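
One caveat on the query grouped by address: selecting a bare `id` while grouping by `address` is rejected when the ONLY_FULL_GROUP_BY SQL mode is enabled (the default since MySQL 5.7.5), and with that mode off the server is free to return any `id` from the group, not necessarily the first. A sketch that makes the choice explicit, assuming the lowest `id` is the one wanted; `first_id` is an illustrative alias:

```sql
-- One row per duplicated address, with the lowest id in the group
-- and the number of rows sharing that address.
SELECT MIN(id) AS first_id,
       address,
       COUNT(*) AS c
FROM `list`
GROUP BY address
HAVING c > 1;
```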


4
select * from table_name t1 inner join (select distinct <attribute list> from table_name as temp)t2 where t1.attribute_name = t2.attribute_name

For your table it would be something like:
select * from list l1 inner join (select distinct address from list as list2)l2 where l1.address=l2.address

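This query will give you all the distinct address entries in your list table... I am not sure how it will behave if you have primary key values on name, etc.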

4
 SELECT firstname, lastname, address FROM list
 WHERE 
 Address in 
 (SELECT address FROM list
 GROUP BY address
 HAVING count(*) > 1)

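Tried this one too, but it seems to just hang. I believe the result of the inner query doesn't satisfy the IN parameter format. - doublejosh
1
What do you mean it doesn't satisfy the IN parameter format? All IN needs is for your subquery to return a single column. It's really pretty simple. More likely your subquery runs against a column that isn't indexed, so it takes an inordinate amount of time. If it takes a long time, I'd suggest breaking it into two queries: run the subquery first into a temporary table, create an index on it, then run the full query with the subquery against the temporary table for the duplicated field. - Ryan Roper
I was worried that IN required a comma-separated list rather than a column, which was just wrong. Here's the query that worked for me: SELECT users.name, users.uid, users.mail, from_unixtime(created) FROM users INNER JOIN ( SELECT mail FROM users GROUP BY mail HAVING count(mail) > 1 ) dup ON users.mail = dup.mail ORDER BY users.mail, users.created; - doublejosh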
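
The two-step approach suggested in the comment above (materialize the duplicated values, index them, then look rows up against them) could be sketched as follows. The temporary table name `dup_addresses` and the index name `idx_address` are made up for illustration, and `address` is assumed to be a VARCHAR column; a TEXT column would need a prefix length on the index.

```sql
-- Step 1: store the addresses that occur more than once.
CREATE TEMPORARY TABLE dup_addresses AS
SELECT address
FROM list
GROUP BY address
HAVING COUNT(*) > 1;

-- Step 2: index the temporary table so the lookup is cheap.
ALTER TABLE dup_addresses ADD INDEX idx_address (address);

-- Step 3: fetch the full rows for those addresses.
SELECT l.firstname, l.lastname, l.address
FROM list AS l
INNER JOIN dup_addresses AS d
        ON d.address = l.address
ORDER BY l.address;

-- Clean up (the table would also disappear when the session ends).
DROP TEMPORARY TABLE dup_addresses;
```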

4

Personally, this query solved my problem:

SELECT `SUB_ID`, COUNT(SRV_KW_ID) as subscriptions FROM `SUB_SUBSCR` group by SUB_ID, SRV_KW_ID HAVING subscriptions > 1;

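What this script does is show all the subscriber IDs that exist more than once in the table, together with the number of duplicates found.

These are the table columns: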

| SUB_SUBSCR_ID | int(11)     | NO   | PRI | NULL    | auto_increment |
| MSI_ALIAS     | varchar(64) | YES  | UNI | NULL    |                |
| SUB_ID        | int(11)     | NO   | MUL | NULL    |                |    
| SRV_KW_ID     | int(11)     | NO   | MUL | NULL    |                |

Hope this helps you too!

4

Fastest duplicate-removal query procedure:

/* create temp table with one primary column id (INT is an assumption here) */
CREATE TEMPORARY TABLE temp (id INT PRIMARY KEY);
INSERT INTO temp(id) SELECT MIN(id) FROM list GROUP BY (isbn) HAVING COUNT(*)>1;
DELETE FROM list WHERE id IN (SELECT id FROM temp);
DELETE FROM temp;

3
This obviously deletes only the first record from each group of duplicates. - Palec
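
If the goal is to keep exactly one row per `isbn` and remove all the others in a single pass, a self-join DELETE is one option. This is a sketch under the assumption that `id` is unique and that the row with the lowest `id` in each group is the one to keep; it is worth running the preview SELECT first, or working inside a transaction or on a backup.

```sql
-- Preview: every row that has a sibling with the same isbn and a lower id.
SELECT DISTINCT l1.*
FROM list AS l1
INNER JOIN list AS l2
        ON l2.isbn = l1.isbn
       AND l2.id < l1.id;

-- Delete those rows, leaving only the lowest id per isbn.
DELETE l1
FROM list AS l1
INNER JOIN list AS l2
        ON l2.isbn = l1.isbn
       AND l2.id < l1.id;
```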

3
SELECT t.*,(select count(*) from city as tt where tt.name=t.name) as count FROM `city` as t where (select count(*) from city as tt where tt.name=t.name) > 1 order by count desc

Replace city with your table. Replace name with your field name.


2

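Most of the answers here don't handle the case where you have MORE THAN ONE duplicate result and/or MORE THAN ONE column to check for duplicates. When you are in that situation, you can use this query to get all the duplicate ids: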

SELECT address, email, COUNT(*) AS QUANTITY_DUPLICATES, GROUP_CONCAT(id) AS ID_DUPLICATES
    FROM list
    GROUP BY address, email
    HAVING COUNT(*)>1;


If you want to list every result as a separate row, you need a more complex query. This is the one I found to work:

CREATE TEMPORARY TABLE IF NOT EXISTS temptable AS (    
    SELECT GROUP_CONCAT(id) AS ID_DUPLICATES
    FROM list
    GROUP BY address, email
    HAVING COUNT(*)>1
); 
SELECT d.* 
    FROM list AS d, temptable AS t 
    WHERE FIND_IN_SET(d.id, t.ID_DUPLICATES) 
    ORDER BY d.id;
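
One thing to watch with the GROUP_CONCAT approach: its output is truncated at `group_concat_max_len` (1024 bytes by default), so very large duplicate groups could lose ids. An alternative sketch that avoids both the temporary table and the string matching is to join `list` against the grouped duplicate keys directly; the alias `dup` is illustrative.

```sql
-- Every row whose (address, email) pair occurs more than once.
SELECT d.*
FROM list AS d
INNER JOIN (
    SELECT address, email
    FROM list
    GROUP BY address, email
    HAVING COUNT(*) > 1
) AS dup
        ON dup.address = d.address
       AND dup.email = d.email
ORDER BY d.address, d.email, d.id;
```

GROUP BY treats NULLs as equal while `=` does not, so if `address` or `email` can be NULL, the NULL-safe operator `<=>` in the join condition would be needed to match those groups.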


