MySQL并发插入在(显式)事务之外导致死锁

12

我试图调试以下场景:有两个并发进程,向具有唯一约束条件的表中插入完全相同的行。这是在显式事务之外完成的(尽管我假设InnoDB会将其作为单个语句内部处理,带有自动提交?)

模式如下:

CREATE TABLE locks (
    id bigint(20) UNSIGNED NOT NULL AUTO_INCREMENT,
    lock_uid varchar(255) NOT NULL,
    count smallint(6) NOT NULL,
    processor_id varchar(255) DEFAULT NULL,
    created_at timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
    updated_at timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
    PRIMARY KEY (id),
    UNIQUE INDEX locks_lock_uid_unique (lock_uid)
)

如您所见,lock_uid 上有唯一索引,以防止表中存在具有相同值的多行。

正在运行的命令(为了上下文,这些命令来自于完整的通用查询日志,除了排序命令外,两个线程都没有其他语句):

线程1:

insert into `locks` (`lock_uid`, `count`, `processor_id`, `created_at`, `updated_at`) 
values ('11161567', '0', NULL, '2017-11-07 10:46:36', '2017-11-07 10:46:36')

线程2:

insert into `locks` (`lock_uid`, `count`, `processor_id`, `created_at`, `updated_at`) 
values ('11161567', '0', NULL, '2017-11-07 10:46:36', '2017-11-07 10:46:36')

这导致了以下死锁情况:
LATEST DETECTED DEADLOCK
------------------------
2017-11-07 10:46:36 0x2ac88f791700
*** (1) TRANSACTION:
TRANSACTION 6089510736, ACTIVE 0 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 4 lock struct(s), heap size 1136, 4 row lock(s), undo log entries 1
MySQL thread id 177584962, OS thread handle 47059008030464, query id 13109086103 ec2-34-232-58-13.compute-1.amazonaws.com 34.232.58.13 appserver update
insert into `locks` (`lock_uid`, `count`, `processor_id`, `created_at`, `updated_at`) values ('11161567', '0', NULL, '2017-11-07 10:46:36', '2017-11-07 10:46:36')
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 6403 page no 4 n bits 176 index locks_lock_uid_unique of table `core`.`locks` trx id 6089510736 lock_mode X locks gap before rec insert intention waiting
Record lock, heap no 107 PHYSICAL RECORD: n_fields 2; compact format; info bits 32
0: len 30; hex 636865636b6f75745265636f7665727950726f636573735f313131363137; asc 111617; (total 32 bytes);
1: len 8; hex 0000000003266637; asc      &f7;;

*** (2) TRANSACTION:
TRANSACTION 6089510734, ACTIVE 0 sec inserting
mysql tables in use 1, locked 1
4 lock struct(s), heap size 1136, 5 row lock(s), undo log entries 1
MySQL thread id 177584971, OS thread handle 47040888903424, query id 13109086092 ec2-34-237-3-244.compute-1.amazonaws.com 34.237.3.244 appserver update
insert into `locks` (`lock_uid`, `count`, `processor_id`, `created_at`, `updated_at`) values ('11161567', '0', NULL, '2017-11-07 10:46:36', '2017-11-07 10:46:36')
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 6403 page no 4 n bits 176 index locks_lock_uid_unique of table `core`.`locks` trx id 6089510734 lock mode S
Record lock, heap no 104 PHYSICAL RECORD: n_fields 2; compact format; info bits 32
0: len 30; hex 636865636b6f75745265636f7665727950726f636573735f313131363135; asc 111615; (total 32 bytes);
1: len 8; hex 0000000003266632; asc      &f2;;

Record lock, heap no 105 PHYSICAL RECORD: n_fields 2; compact format; info bits 32
0: len 30; hex 636865636b6f75745265636f7665727950726f636573735f313131363135; asc 111615; (total 32 bytes);
1: len 8; hex 0000000003266634; asc      &f4;;

Record lock, heap no 107 PHYSICAL RECORD: n_fields 2; compact format; info bits 32
0: len 30; hex 636865636b6f75745265636f7665727950726f636573735f313131363137; asc 111617; (total 32 bytes);
1: len 8; hex 0000000003266637; asc      &f7;;

*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 6403 page no 4 n bits 176 index locks_lock_uid_unique of table `core`.`locks` trx id 6089510734 lock_mode X locks gap before rec insert intention waiting
Record lock, heap no 107 PHYSICAL RECORD: n_fields 2; compact format; info bits 32
0: len 30; hex 636865636b6f75745265636f7665727950726f636573735f313131363137; asc 111617; (total 32 bytes);
1: len 8; hex 0000000003266637; asc      &f7;;

*** WE ROLL BACK TRANSACTION (2)

我已经阅读了类似的答案(例如MySQL锁定中的重复键错误),但我无法完全理解此情况发生了什么。解释与死锁输出不匹配。

  • 为什么这个死锁会在事务外的2个插入操作中发生?
  • 为什么T2在请求X之前已经持有S锁,而完全相同的T1没有持有S锁?

请仔细检查“autocommit”的设置。是否需要“id”? - Rick James
潜在地,如果我们的ORM使用代理键工作,则我们面临一个折衷的问题,即它相对容易按我们的意愿进行调整。 - AndySavage
我担心有两个UNIQUE键(其中一个是PRIMARY)是问题的一部分。但我无法证明。你每秒执行数百个INSERTs吗? - Rick James
在我们看到死锁的时候,通常会很高峰,但我不确定我们是否达到了数百个。我们内部进行了类似的讨论,也觉得拥有PK是额外的工作,但仍然无法解释足够的“为什么”来证明修改生产的必要性。 - AndySavage
看看在 INSERT IGNORE ... 后面加上一个测试(对 rows_affected 进行测试?)是否可以避免死锁,并告诉你当前的情况。 - Rick James
显示剩余4条评论
2个回答

9

为什么在事务之外插入两个元素会导致死锁?

它仍然有一个事务 https://dev.mysql.com/doc/refman/5.7/en/innodb-autocommit-commit-rollback.html

在InnoDB中,所有用户活动都发生在事务内。如果启用了自动提交模式,则每个SQL语句本身就形成一个单独的事务。

因此,您的单个查询可以被视为短暂的事务

为什么T2在请求X之前已经持有S锁,而相同的T1没有?

它已经持有。这只是打印最新死锁信息的函数的工作方式 :-) 它不会打印第一个事务持有的锁。您可以通过从2个并行mysql会话模拟简单的死锁来自行检查。 下面是代码:

https://github.com/mysql/mysql-server/blob/5.7/storage/innobase/lock/lock0lock.cc#L7236

DeadlockChecker::notify(const lock_t* lock) const
{
    ut_ad(lock_mutex_own());

    start_print();

    print("\n*** (1) TRANSACTION:\n");

    print(m_wait_lock->trx, 3000);

    print("*** (1) WAITING FOR THIS LOCK TO BE GRANTED:\n");

    print(m_wait_lock);

    print("*** (2) TRANSACTION:\n");

    print(lock->trx, 3000);

    print("*** (2) HOLDS THE LOCK(S):\n");

    print(lock);

    /* It is possible that the joining transaction was granted its
    lock when we rolled back some other waiting transaction. */

    if (m_start->lock.wait_lock != 0) {
        print("*** (2) WAITING FOR THIS LOCK TO BE GRANTED:\n");

        print(m_start->lock.wait_lock);
    }

    DBUG_PRINT("ib_lock", ("deadlock detected"));
}

解释与死锁输出不符

这与您的情况非常接近。以下是一个模拟:

准备工作:

create table test (id int primary key, val int not null unique) engine=innodb;
insert into test values (1, 1), (2, 2), (3, 3);

现在让我们打开3个终端并运行以下命令:
1> begin;
2> begin;
3> begin;

然后

1>insert into test values (1, 1);
2>insert into test values (1, 1); (will hang)
3>insert into test values (1, 1); (will hang)

现在

1>rollback
2>would produce: ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
3>would produce: Query OK, 1 row affected

1>show engine innodb status;
...
------------------------
LATEST DETECTED DEADLOCK
------------------------
2017-11-15 23:21:47 0x700000d95000
*** (1) TRANSACTION:
TRANSACTION 2336, ACTIVE 8 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 4 lock struct(s), heap size 1136, 2 row lock(s)
MySQL thread id 8, OS thread handle 123145316831232, query id 58 localhost 127.0.0.1 root update
insert into test values (1, 1)
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 26 page no 3 n bits 72 index PRIMARY of table `test`.`test` trx id 2336 lock_mode X insert intention waiting
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
 0: len 8; hex 73757072656d756d; asc supremum;;

*** (2) TRANSACTION:
TRANSACTION 2335, ACTIVE 13 sec inserting
mysql tables in use 1, locked 1
4 lock struct(s), heap size 1136, 2 row lock(s)
MySQL thread id 5, OS thread handle 123145316552704, query id 57 localhost root update
insert into test values (1, 1)
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 26 page no 3 n bits 72 index PRIMARY of table `test`.`test` trx id 2335 lock mode S
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
 0: len 8; hex 73757072656d756d; asc supremum;;

*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 26 page no 3 n bits 72 index PRIMARY of table `test`.`test` trx id 2335 lock_mode X insert intention waiting
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
 0: len 8; hex 73757072656d756d; asc supremum;;

*** WE ROLL BACK TRANSACTION (2)

正如您所看到的,它与您的结果非常接近


0

插入查询在分配共享或独占锁之前使用插入意图锁。

例如: 模式:

 CREATE TABLE `temp_table` (
    `id` INT NOT NULL AUTO_INCREMENT,
    `u_id` INT NULL DEFAULT NULL,
    PRIMARY KEY (`id`),
    INDEX `u_id` (`u_id`)
 )
 COLLATE='latin1_swedish_ci'
 ENGINE=InnoDB;

假设我在表中有2、4、7个u_id。现在我正在从不同的会话进行两个插入操作。

以下是可能的情况。

情况1:

T1:insert into temp_table (u_id) values ('5');

T2:insert into temp_table (u_id) values ('6');

此事务首先应用间隙插入意向锁,并检查应分配哪种类型的锁(S或X)。由于5和6适合于间隙(4-7),因此将为T1和T2分配独占锁。

情况2:

T1:insert into temp_table (u_id) values ('5');

T2:insert into temp_table (u_id) values ('5');

在这里,当插入意向锁定时,T1将适合间隙(4-7),并且它将被分配独占锁,但是T2不适合间隙,因为间隙已更新为(5-7),因为独占锁已经分配给T1,所以T2被分配共享锁并等待T1完成。

如果T1成功完成,则T2将被视为重复键错误,但如果由于任何原因T1失败,则T2将被独占锁定并尝试插入记录。

有关更多信息,请参考Mysql指南: https://dev.mysql.com/doc/refman/5.7/en/innodb-locking.html


它没有解释在两个事务的情况下死锁可能发生的方式(不像AndySavage提供了三个事务的情况)。 - Oleg Kuralenko
请访问以下链接了解有关死锁的更多信息。https://www.percona.com/blog/2014/10/28/how-to-deal-with-mysql-deadlocks/ - Bhushan

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接