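I've seen plenty of similar questions, but none of them is quite what I'm after.
I have a table named TableA that holds users' answers to configurable questionnaires. The columns are member_id, quiz_num, question_num, answer_num.
Some members' answers were submitted twice, so I need to delete the duplicate records while making sure one row is left behind.
Since there is no primary column, there may be two or three rows with exactly the same data.
Is there a query that removes all the duplicates?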
Add a unique index to your table:
ALTER IGNORE TABLE `TableA`
ADD UNIQUE INDEX (`member_id`, `quiz_num`, `question_num`, `answer_num`);
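Another way:
Add a primary key to your table (the query assumes it is called id); then you can remove the duplicates with the following query, which keeps the lowest id in each group:
-- Keep the row with the smallest id in each duplicate group and delete the rest.
-- The extra derived table (AS A) lets MySQL read from the table it is deleting from.
DELETE FROM TableA
WHERE id NOT IN (SELECT *
FROM (SELECT MIN(id) FROM TableA
GROUP BY member_id, quiz_num, question_num, answer_num
) AS A
);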
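The "keep one id per group" idea can be tried out without a MySQL server. The following is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for MySQL and a made-up set of demo rows; the function name delete_duplicates_keep_lowest_id is hypothetical. SQLite accepts the subquery directly, whereas MySQL needs it wrapped in a derived table.

```python
import sqlite3

COLUMNS = "member_id, quiz_num, question_num, answer_num"


def delete_duplicates_keep_lowest_id(conn):
    """Delete every row except the one with the smallest id in each duplicate group."""
    conn.execute(
        f"""
        DELETE FROM TableA
        WHERE id NOT IN (
            SELECT MIN(id) FROM TableA GROUP BY {COLUMNS}
        )
        """
    )
    conn.commit()


# Demo data (assumed): one answer stored three times, one twice, one once.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TableA ("
    "id INTEGER PRIMARY KEY, member_id INT, quiz_num INT, question_num INT, answer_num INT)"
)
conn.executemany(
    f"INSERT INTO TableA ({COLUMNS}) VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 2), (1, 1, 1, 2), (1, 1, 1, 2), (1, 1, 2, 4), (2, 1, 1, 3), (2, 1, 1, 3)],
)

delete_duplicates_keep_lowest_id(conn)
remaining = conn.execute(f"SELECT id, {COLUMNS} FROM TableA ORDER BY id").fetchall()
print(remaining)
```

Grouping on all four columns means a group with three copies is reduced to one row in a single pass.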
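alter ignore table is going away soon: http://dev.mysql.com/worklog/task/?id=7395 - juacala
Instead of using drop table TableA, you can delete all the records (delete from TableA;) and then repopulate the original table with the records from TableA_Verify (insert into TableA select * from TableA_Verify). That way you keep everything attached to the original table (indexes and so on).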
CREATE TABLE TableA_Verify AS SELECT DISTINCT * FROM TableA;
DELETE FROM TableA;
INSERT INTO TableA SELECT * FROM TableA_Verify;
DROP TABLE TableA_Verify;
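The copy, empty, refill sequence above can be checked end to end with a small script. This is only a sketch, assuming Python's built-in sqlite3 module in place of MySQL; the index idx_member and the demo rows are made up for illustration, and dedupe_via_copy is a hypothetical helper name.

```python
import sqlite3


def dedupe_via_copy(conn):
    """Copy the distinct rows aside, empty TableA, refill it, then drop the copy."""
    conn.executescript(
        """
        CREATE TABLE TableA_Verify AS SELECT DISTINCT * FROM TableA;
        DELETE FROM TableA;
        INSERT INTO TableA SELECT * FROM TableA_Verify;
        DROP TABLE TableA_Verify;
        """
    )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TableA (member_id INT, quiz_num INT, question_num INT, answer_num INT)"
)
# idx_member is a hypothetical index, created only to show that it survives.
conn.execute("CREATE INDEX idx_member ON TableA (member_id)")
conn.executemany(
    "INSERT INTO TableA VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 2), (1, 1, 1, 2), (1, 1, 1, 2), (1, 1, 2, 4), (2, 1, 1, 3), (2, 1, 1, 3)],
)

dedupe_via_copy(conn)
rows = conn.execute(
    "SELECT * FROM TableA ORDER BY member_id, quiz_num, question_num"
).fetchall()
indexes = [
    r[0]
    for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' AND tbl_name = 'TableA'"
    )
]
print(rows, indexes)
```

Because TableA itself is never dropped, the index is still attached to it afterwards, which is the point of refilling the table instead of replacing it.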
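This doesn't use temporary tables; it uses real tables instead. If your concern is only about temporary tables, and not about creating or dropping tables, this will work:
-- MySQL has no SELECT ... INTO new_table, so CREATE TABLE ... AS SELECT makes the copy.
CREATE TABLE TableA_Verify AS SELECT DISTINCT * FROM TableA;
DROP TABLE TableA;
RENAME TABLE TableA_Verify TO TableA;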
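Thanks to jveirasv for the answer above.
If you need to remove duplicates based on specific columns only (for example, when the rows also carry a timestamp that differs), you can use this:
-- Selecting * with GROUP BY only runs when the ONLY_FULL_GROUP_BY SQL mode is off,
-- and which row survives in each group is not defined.
CREATE TABLE TableA_Verify AS SELECT * FROM TableA WHERE 1 GROUP BY [COLUMN TO remove duplicates BY];
DELETE FROM TableA;
INSERT INTO TableA SELECT * FROM TableA_Verify;
DROP TABLE TableA_Verify;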
Add a unique index to your table:
ALTER IGNORE TABLE TableA
ADD UNIQUE INDEX (member_id, quiz_num, question_num, answer_num);
It works great.
If you aren't using a primary key, you can run the following queries in one go, replacing the values:
# table_name - Your Table Name
# column_name_of_duplicates - Name of column where duplicate entries are found
create table table_name_temp like table_name;
insert into table_name_temp select distinct(column_name_of_duplicates),value,type from table_name group by column_name_of_duplicates;
delete from table_name;
insert into table_name select * from table_name_temp;
drop table table_name_temp;
Back up your database before you start working on it.
CREATE TABLE temp_table AS SELECT * FROM original_table LIMIT 0;
ALTER TABLE temp_table ADD PRIMARY KEY (primary-key-field);
-- INSERT IGNORE skips rows that would violate the new primary key.
INSERT IGNORE INTO temp_table SELECT * FROM original_table;
DROP TABLE original_table;
RENAME TABLE temp_table TO original_table;
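The keyed-copy approach above can be sketched in a runnable form. This assumes Python's built-in sqlite3 module rather than MySQL, so INSERT OR IGNORE stands in for INSERT IGNORE and ALTER TABLE ... RENAME TO for RENAME TABLE. The answer leaves the key column unspecified; using all four columns as the primary key is an assumption made here to match the question, and dedupe_via_keyed_copy is a hypothetical name.

```python
import sqlite3


def dedupe_via_keyed_copy(conn):
    """Build a keyed copy, let the key reject repeated rows, then swap the tables."""
    conn.executescript(
        """
        -- Assumption: the primary key spans all four columns of the question's table.
        CREATE TABLE temp_table (
            member_id INT, quiz_num INT, question_num INT, answer_num INT,
            PRIMARY KEY (member_id, quiz_num, question_num, answer_num)
        );
        -- Rows that collide with an already inserted key are skipped silently.
        INSERT OR IGNORE INTO temp_table SELECT * FROM original_table;
        DROP TABLE original_table;
        ALTER TABLE temp_table RENAME TO original_table;
        """
    )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE original_table "
    "(member_id INT, quiz_num INT, question_num INT, answer_num INT)"
)
conn.executemany(
    "INSERT INTO original_table VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 2), (1, 1, 1, 2), (1, 1, 1, 2), (1, 1, 2, 4), (2, 1, 1, 3), (2, 1, 1, 3)],
)

dedupe_via_keyed_copy(conn)
deduped = conn.execute(
    "SELECT * FROM original_table ORDER BY member_id, quiz_num, question_num"
).fetchall()
print(deduped)
```

A side effect of this route is that the new table keeps the key, so later duplicate inserts are rejected as well.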
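Here is a solution that never empties or drops the table and keeps the data in the original table the whole time, so the duplicates can be deleted while the table stays "live":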
alter table tableA add column duplicate tinyint(1) not null default '0';
update tableA set
duplicate=if(@member_id=member_id
and @quiz_num=quiz_num
and @question_num=question_num
and @answer_num=answer_num,1,0),
member_id=(@member_id:=member_id),
quiz_num=(@quiz_num:=quiz_num),
question_num=(@question_num:=question_num),
answer_num=(@answer_num:=answer_num)
order by member_id, quiz_num, question_num, answer_num;
delete from tableA where duplicate=1;
alter table tableA drop column duplicate;
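The ORDER BY places identical rows next to each other, so each row that matches the one before it is flagged and then deleted. The last statement drops the duplicate column, returning the table to its original state. alter table ignore may also go away soon: http://dev.mysql.com/worklog/task/?id=7395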
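Tested on MySQL 5; not sure about other versions.
If you want to keep the row with the lowest id value:
DELETE n1 FROM `yourTableName` n1, `yourTableName` n2 WHERE n1.id > n2.id AND n1.member_id = n2.member_id AND n1.quiz_num = n2.quiz_num AND n1.question_num = n2.question_num AND n1.answer_num = n2.answer_num;
If you want to keep the row with the highest id value:
DELETE n1 FROM `yourTableName` n1, `yourTableName` n2 WHERE n1.id < n2.id AND n1.member_id = n2.member_id AND n1.quiz_num = n2.quiz_num AND n1.question_num = n2.question_num AND n1.answer_num = n2.answer_num;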
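The lowest-id and highest-id variants differ only in the direction of the id comparison, which a short script makes easy to see. This is a sketch under assumptions: it uses Python's built-in sqlite3 module, which has no multi-table DELETE, so the self-join is rewritten as a correlated EXISTS with the same meaning. The helper names and demo rows are hypothetical.

```python
import sqlite3


def delete_duplicates(conn, keep="lowest"):
    """Delete each row that has an identical twin with a lower (or higher) id."""
    if keep not in ("lowest", "highest"):
        raise ValueError("keep must be 'lowest' or 'highest'")
    # To keep the lowest id, remove rows for which a twin with a smaller id exists.
    op = "<" if keep == "lowest" else ">"
    conn.execute(
        f"""
        DELETE FROM TableA
        WHERE EXISTS (
            SELECT 1 FROM TableA AS n2
            WHERE n2.id {op} TableA.id
              AND n2.member_id = TableA.member_id
              AND n2.quiz_num = TableA.quiz_num
              AND n2.question_num = TableA.question_num
              AND n2.answer_num = TableA.answer_num
        )
        """
    )
    conn.commit()


def make_demo_table():
    """Create an in-memory TableA with ids 1..6, three of which are redundant."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE TableA ("
        "id INTEGER PRIMARY KEY, member_id INT, quiz_num INT, question_num INT, answer_num INT)"
    )
    conn.executemany(
        "INSERT INTO TableA (member_id, quiz_num, question_num, answer_num) VALUES (?, ?, ?, ?)",
        [(1, 1, 1, 2), (1, 1, 1, 2), (1, 1, 1, 2), (1, 1, 2, 4), (2, 1, 1, 3), (2, 1, 1, 3)],
    )
    return conn


def surviving_ids(conn):
    return [row[0] for row in conn.execute("SELECT id FROM TableA ORDER BY id")]


low = make_demo_table()
delete_duplicates(low, keep="lowest")
print(surviving_ids(low))

high = make_demo_table()
delete_duplicates(high, keep="highest")
print(surviving_ids(high))
```

Comparing all four columns matters here: matching on member_id and answer_num alone would also delete answers to different questions that happen to share an answer number.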