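I am testing something in Oracle and populated a table with some sample data, but in the process I accidentally loaded duplicate records, so now I can't create a primary key using some of the columns.
How can I delete all duplicate rows and leave only one of them?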
1. Solution
delete from emp
where rowid not in
(select max(rowid) from emp group by empno);
2. Solution
delete from emp where rowid in
(
select rid from
(
select rowid rid,
row_number() over(partition by empno order by empno) rn
from emp
)
where rn > 1
);
3. Solution
delete from emp e1
where rowid not in
(select max(rowid) from emp e2
where e1.empno = e2.empno );
4. Solution
delete from emp where rowid in
(
select rid from
(
select rowid rid,
dense_rank() over(partition by empno order by rowid
) rn
from emp
)
where rn > 1
);
5. Solution
delete from emp where rowid in
(
select rid from
(
select rowid rid,
rank() over(partition by empno order by rowid) rn
from emp
)
where rn > 1
);
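Whichever variant is used, a quick check afterwards confirms that the key column is unique again, after which the primary key from the question can be created. This is only a sketch: it assumes empno is the intended key column, and the constraint name emp_pk is made up for illustration.

```sql
-- Should return no rows once the duplicates are gone.
select empno, count(*)
from emp
group by empno
having count(*) > 1;

-- emp_pk is a hypothetical constraint name; a primary key also requires empno to be non-null.
alter table emp add constraint emp_pk primary key (empno);
```

If the first query still returns rows, the delete did not cover every duplicated key and the constraint will fail again.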
The fastest way for really big tables
Create an exceptions table, here called exceptions_table, with the structure below:
ROW_ID ROWID
OWNER VARCHAR2(30)
TABLE_NAME VARCHAR2(30)
CONSTRAINT VARCHAR2(30)
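The column list above corresponds to a table definition along the following lines. This is a sketch: the name exceptions_table comes from this answer, and Oracle also ships a script, utlexcpt.sql, that creates an equivalent table named EXCEPTIONS.

```sql
-- Receives the rowid of every row that violates the constraint being created.
create table exceptions_table (
  row_id     rowid,
  owner      varchar2(30),
  table_name varchar2(30),
  constraint varchar2(30)
);
```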
Try to create a unique constraint or primary key that the duplicates will violate. You will get an error message because of the duplicates, and the exceptions table will then contain the rowids of the rows that violate the constraint.
alter table original_dups add constraint original_dups_uk -- hypothetical constraint name
unique --or primary key
(dupfield1,dupfield2) exceptions into exceptions_table;
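Join your table with exceptions_table by rowid and delete the duplicates. The exceptions table lists every row of each duplicate group, not just the extra copies, so exclude one row per key from the delete:
delete from original_dups where rowid in (select ROW_ID from exceptions_table)
and rowid not in (select min(rowid) from original_dups group by dupfield1, dupfield2);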
If the number of rows to delete is large, create a new table instead (with all grants and indexes) by anti-joining with exceptions_table on rowid, rename the original table to original_dups, and rename new_table_with_no_dups to the original table name. As above, one row from each duplicate group has to be kept:
create table new_table_with_no_dups AS (
select field1, field2 ........
from original_dups t1
where not exists ( select null from exceptions_table T2 where t1.rowid = t2.row_id )
-- keep one row from each duplicate group
or t1.rowid in ( select min(rowid) from original_dups group by dupfield1, dupfield2 )
);
DELETE from table_name where rowid not in (select min(rowid) FROM table_name group by column_name);
You can also delete duplicate records another way:
DELETE from table_name a where rowid > (select min(rowid) FROM table_name b where a.column_name=b.column_name);
For best performance, here is what I wrote
(check the execution plan):
DELETE FROM your_table
WHERE rowid IN
(select t1.rowid from your_table t1
LEFT OUTER JOIN (
-- ROWID is a reserved word in Oracle, so the aggregate needs a different alias
SELECT MIN(rowid) as min_rid, column1,column2, column3
FROM your_table
GROUP BY column1, column2, column3
) co1 ON (t1.rowid = co1.min_rid)
WHERE co1.min_rid IS NULL
);
I didn't see any answers that use common table expressions and window functions. This is what I find easiest to work with.
DELETE FROM
YourTable
WHERE
ROWID IN
(WITH Duplicates
AS (SELECT
ROWID RID,
ROW_NUMBER()
OVER(
PARTITION BY First_Name, Last_Name, Birth_Date
ORDER BY ROWID)
AS RN,
SUM(1)
OVER(
PARTITION BY First_Name, Last_Name, Birth_Date
ORDER BY ROWID ROWS BETWEEN UNBOUNDED PRECEDING
AND UNBOUNDED FOLLOWING)
AS CNT
FROM
YourTable
WHERE
Load_Date IS NULL)
SELECT
RID
FROM
Duplicates
WHERE
RN > 1);
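A few things to note:
1) We only check the fields in the partition clause for duplication.
2) If you have a reason to pick one duplicate over the others, you can use the order by clause so that this row gets row_number() = 1.
3) You can change the number of duplicates kept by changing the final where clause to "Where RN > N" with N >= 1 (I was thinking N = 0 would delete all rows that have duplicates, but it would in fact delete all rows).
4) The CTE query includes a Sum partition field (CNT) that tags each row with the number of rows in its group. So to select rows that have duplicates, including the first item, use "WHERE cnt > 1".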
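Before running the delete, it can help to preview which rows belong to a duplicate group. The query below is a sketch under the same assumptions as this answer (a table YourTable with First_Name, Last_Name and Birth_Date); it uses COUNT(*) OVER in place of SUM(1), which yields the same group size, and it leaves out the Load_Date filter, which should be added back if the delete uses one.

```sql
-- Lists every row of each duplicate group, including the row that would be kept (rn = 1).
SELECT rid, first_name, last_name, birth_date, rn, cnt
FROM (SELECT ROWID AS rid,
             first_name, last_name, birth_date,
             ROW_NUMBER() OVER (PARTITION BY first_name, last_name, birth_date
                                ORDER BY ROWID) AS rn,
             COUNT(*) OVER (PARTITION BY first_name, last_name, birth_date) AS cnt
      FROM YourTable)
WHERE cnt > 1
ORDER BY first_name, last_name, birth_date, rn;
```

Rows shown with rn > 1 are the ones a "WHERE RN > 1" delete would remove.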
1. Create and populate a test table:
Create table test(id int,sal int);
insert into test values(1,100);
insert into test values(1,100);
insert into test values(2,200);
insert into test values(2,200);
insert into test values(3,300);
insert into test values(3,300);
commit;
2. Check the data, delete the duplicates, and check again:
select * from test;
delete from
test
where rowid in
(select rid from
(select
rowid rid,
row_number()
over
(partition by id order by sal) dup
from test)
where dup > 1);
select * from test;
You will see that the duplicate records have been deleted.
Hope this solves your problem.
Thanks :)