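I need to delete only the duplicate rows from a table. For example, if my table contains 3 identical rows, my query should delete 2 of them and keep one.
How can I do this? Please help.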
Try the query below; it should do what you need.
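SET ROWCOUNT 1  -- limit each DELETE to a single row
DELETE a
FROM test a
WHERE (SELECT COUNT(*) FROM test b WHERE b.name = a.name) > 1
-- keep deleting one row at a time until no name occurs more than once
WHILE @@ROWCOUNT > 0
    DELETE a
    FROM test a
    WHERE (SELECT COUNT(*) FROM test b WHERE b.name = a.name) > 1
SET ROWCOUNT 0  -- restore the default (no row limit)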
Here, test is your table name.
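Microsoft documents SET ROWCOUNT as deprecated for DELETE statements, so a set-based alternative may be worth considering. This is a different technique from the loop above: a minimal sketch using ROW_NUMBER() in a common table expression, assuming SQL Server 2005 or later and the same test table with a name column. The CTE name numbered and the column alias rn are made up for illustration.

```sql
-- Number the rows within each group of equal names; which copy gets rn = 1 is arbitrary.
WITH numbered AS (
    SELECT ROW_NUMBER() OVER (PARTITION BY name ORDER BY (SELECT NULL)) AS rn
    FROM test
)
-- Deleting through the CTE removes the underlying rows of test.
DELETE FROM numbered
WHERE rn > 1;
```

If rows count as duplicates only when several columns match, list all of those columns in PARTITION BY.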
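Although this is not a single statement, it works in SQL Server:
Declare @cnt int;
Select @cnt=COUNT(*) From DupTable Where (Col1=1); -- Assumes you are trying to delete the duplicates where some condition (e.g. Col1=1) is true.
If @cnt > 1 Delete Top (@cnt-1) From DupTable Where (Col1=1); -- Delete only among the rows that were counted, leaving one of them.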
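Before deleting, it may help to see which values are actually duplicated and how many surplus copies each one has. A minimal sketch, assuming the same DupTable and that Col1 alone defines a duplicate (an assumption; add more columns to the GROUP BY list if needed). The alias ExtraRows is made up for illustration.

```sql
-- List each duplicated Col1 value with the number of rows that would have to go.
SELECT Col1, COUNT(*) - 1 AS ExtraRows
FROM DupTable
GROUP BY Col1
HAVING COUNT(*) > 1;
```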
ALTER TABLE dbo.DupTable ADD
IDCol int NOT NULL IDENTITY (1, 1)
GO
DELETE FROM DupTable WHERE IDCol NOT IN
(SELECT MAX(IDCol) FROM DupTable GROUP BY Col1, Col2, Col3)
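DELETE t1 FROM Table t1 JOIN Table t2 ON t1.colDup = t2.colDup AND t1.date > t2.date
This removes from Table every duplicate row (on column colDup) except the oldest one (i.e., the one with the lowest date). Rows that share the same lowest date are all kept.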
DELETE FROM `mytbl`
INNER JOIN (
SELECT 1 FROM `mytbl`
GROUP BY `duplicated_column` HAVING COUNT(*)=2
) USING(`id`)
Edit:
Sorry, the query above does not work.
Assuming the table structure:
id int, auto-increment
num int # <-- this is the column containing the duplicate values
The following query works in MySQL (verified):
DELETE `mytbl` FROM `mytbl`
INNER JOIN
(
SELECT `num` FROM `mytbl`
GROUP BY `num` HAVING COUNT(*)=2
) AS `tmp` USING (`num`)
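This query deletes the rows whose "num" value occurs exactly 2 times (no more, no less). Note that it deletes both copies, not just the extra one.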
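If the goal is instead to keep one row per value, a self-join on the auto-increment id is one possible approach. This is a sketch, not the answer's own method, assuming the mytbl structure described above; the aliases t1 and t2 are made up for illustration.

```sql
-- Delete every row for which another row has the same num and a smaller id,
-- so only the row with the lowest id survives for each num.
DELETE t1
FROM `mytbl` AS t1
INNER JOIN `mytbl` AS t2
    ON t1.`num` = t2.`num`
   AND t1.`id` > t2.`id`;
```

Unlike the COUNT(*)=2 filter, this handles any number of copies per value.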
Edit (again):
I recommend adding a key on the "num" column.
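Adding the key could look like the sketch below, assuming the mytbl table described above. The index names idx_num and uq_num are hypothetical.

```sql
-- A plain index speeds up the GROUP BY and the join on num.
ALTER TABLE `mytbl` ADD INDEX `idx_num` (`num`);

-- Optional, and only once the duplicates are gone: a unique key keeps new ones out.
-- ALTER TABLE `mytbl` ADD UNIQUE KEY `uq_num` (`num`);
```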
Edit (#3):
If the asker wants to delete duplicated rows, the code below works in MySQL (I have tried it):
DELETE `delete_duplicated_rows` FROM `delete_duplicated_rows`
NATURAL JOIN (
SELECT *
FROM `delete_duplicated_rows`
GROUP BY `num1` HAVING COUNT(*)=2
) AS `der`
Assuming the following table structure:
CREATE TABLE `delete_duplicated_rows` (
`num1` tinyint(4) NOT NULL,
`num2` tinyint(4) NOT NULL
) ENGINE=MyISAM;
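Because this table has no id column to tell the copies apart, one way to keep a single copy of each row is to rebuild the table from its distinct rows. This is a different technique from the join above: a rough sketch for MySQL, where the names delete_duplicated_rows_dedup and delete_duplicated_rows_old are made up for illustration.

```sql
-- Create an empty table with the same structure.
CREATE TABLE `delete_duplicated_rows_dedup` LIKE `delete_duplicated_rows`;

-- Copy one instance of each (num1, num2) combination.
INSERT INTO `delete_duplicated_rows_dedup` (`num1`, `num2`)
SELECT DISTINCT `num1`, `num2`
FROM `delete_duplicated_rows`;

-- Swap the tables; the old one is kept until the result has been checked.
RENAME TABLE `delete_duplicated_rows` TO `delete_duplicated_rows_old`,
             `delete_duplicated_rows_dedup` TO `delete_duplicated_rows`;
```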
If you have the IDs of the rows you want to delete, then...
DELETE FROM table WHERE id IN (1, 4, 7, [id numbers to delete...])
-- Just to demonstrate Mark's example
.
-- START === 1.0.dbo..DuplicatesTable.TableCreate.sql
/****** Object: Table [dbo].[DuplicatesTable]
Script Date: 03/29/2010 21:24:02 ******/
IF EXISTS (SELECT * FROM sys.objects
WHERE
object_id = OBJECT_ID(N'[dbo].[DuplicatesTable]')
AND type in (N'U'))
DROP TABLE [dbo].[DuplicatesTable]
GO
/****** Object: Table [dbo].[DuplicatesTable]
Script Date: 03/29/2010 21:24:02 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[DuplicatesTable](
[ColA] [varchar](10) NOT NULL, -- the name of the DuplicatesTable
[ColB] [varchar](10) NULL, -- the description of the DuplicatesTable
)
/*
<doc>
Models a DuplicatesTable for
</doc>
*/
GO
--============================================================ DuplicatesTable START
declare @ScriptFileName varchar(2000)
SELECT @ScriptFileName = '$(ScriptFileName)'
SELECT @ScriptFileName + ' --- DuplicatesTable START ========================================='
declare @TableName varchar(200)
select @TableName = 'DuplicatesTable'
SELECT 'SELECT name from sys.tables where name =''' + @TableName + ''''
SELECT name from sys.tables
where name = @TableName
DECLARE @TableCount INT
SELECT @TableCount = COUNT(name ) from sys.tables
where name =@TableName
if @TableCount=1
SELECT ' DuplicatesTable PASSED. The Table ' + @TableName + ' EXISTS '
ELSE
SELECT ' DuplicatesTable FAILED. The Table ' + @TableName + ' DOES NOT EXIST '
SELECT @ScriptFileName + ' --- DuplicatesTable END ========================================='
--============================================================ DuplicatesTable END
GO
-- END === 1.0.dbo..DuplicatesTable.TableCreate.sql
.
-- START === 1.1..dbo..DuplicatesTable.TableInsert.sql
BEGIN TRANSACTION;
INSERT INTO [dbo].[DuplicatesTable]([ColA], [ColB])
SELECT N'ColA', N'ColB' UNION ALL
SELECT N'ColA', N'ColB' UNION ALL
SELECT N'ColA', N'ColB' UNION ALL
SELECT N'ColA', N'ColB' UNION ALL
SELECT N'ColA', N'ColB' UNION ALL
SELECT N'ColA', N'ColB' UNION ALL
SELECT N'ColA', N'ColB' UNION ALL
SELECT N'ColA1', N'ColB1' UNION ALL
SELECT N'ColA1', N'ColB1' UNION ALL
SELECT N'ColA1', N'ColB1' UNION ALL
SELECT N'ColA1', N'ColB1' UNION ALL
SELECT N'ColA1', N'ColB1' UNION ALL
SELECT N'ColA1', N'ColB1' UNION ALL
SELECT N'ColA1', N'ColB1'
COMMIT;
RAISERROR (N'[dbo].[DuplicatesTable]: Insert Batch: 1.....Done!', 10, 1) WITH NOWAIT;
GO
-- END === 1.1..dbo..DuplicatesTable.TableInsert.sql
.
-- START === 2.0.RemoveDuplicates.Script.sql
ALTER TABLE dbo.DuplicatesTable ADD
DuplicatesTableId int NOT NULL IDENTITY (1, 1)
GO
-- Then the delete is trivial:
DELETE FROM dbo.DuplicatesTable WHERE DuplicatesTableId NOT IN
(SELECT MAX(DuplicatesTableId) FROM dbo.DuplicatesTable GROUP BY ColA , ColB)
Select * from DuplicatesTable ;
-- END === 2.0.RemoveDuplicates.Script.sql
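To confirm that the cleanup worked, a check along these lines could be run afterwards. It is a sketch that assumes the script above has been executed as written; the alias Copies is made up, and dropping the helper column is optional.

```sql
-- Should return no rows if every (ColA, ColB) pair is now unique.
SELECT ColA, ColB, COUNT(*) AS Copies
FROM dbo.DuplicatesTable
GROUP BY ColA, ColB
HAVING COUNT(*) > 1;
GO
-- Optionally remove the helper identity column once it is no longer needed.
ALTER TABLE dbo.DuplicatesTable DROP COLUMN DuplicatesTableId;
GO
```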
I assume every table has a unique identifier. If it does, you can write the following query:
DELETE Table1 FROM Table1 t1 WHERE 2 >= (SELECT COUNT(id) FROM Table1 WHERE dupColumn = t1.dupColumn) AND t1.id NOT IN (SELECT MAX(id) FROM Table1 WHERE dupColumn = t1.dupColumn)
Oops. It seems only the second filter is needed:
DELETE Table1 FROM Table1 t1 WHERE t1.id NOT IN (SELECT MAX(id) FROM Table1 WHERE dupColumn = t1.dupColumn)
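Before running a delete like this, the same filter can be tried as a SELECT to preview the rows it would remove. A minimal sketch, assuming the Table1, id, and dupColumn names used above:

```sql
-- Rows listed here are the ones the DELETE would remove:
-- every row that does not hold the highest id for its dupColumn value.
SELECT t1.*
FROM Table1 t1
WHERE t1.id NOT IN (SELECT MAX(id) FROM Table1 WHERE dupColumn = t1.dupColumn);
```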