如何在CircleCI上运行jest测试时避免Postgres死锁?

10
当我在CircleCI上运行测试时,它会多次记录以下消息,并最终导致测试失败,因为由于死锁,数据库方法无法检索数据:
{
  "message": "Error running raw sql query in pool.",
  "stack": "error: deadlock detected\n    at Connection.Object.<anonymous>.Connection.parseE (/home/circleci/backend/node_modules/pg/lib/connection.js:567:11)\n    at Connection.Object.<anonymous>.Connection.parseMessage (/home/circleci/-backend/node_modules/pg/lib/connection.js:391:17)\n    at Socket.<anonymous> (/home/circleci/backend/node_modules/pg/lib/connection.js:129:22)\n    at emitOne (events.js:116:13)\n    at Socket.emit (events.js:211:7)\n    at addChunk (_stream_readable.js:263:12)\n    at readableAddChunk (_stream_readable.js:250:11)\n    at Socket.Readable.push (_stream_readable.js:208:10)\n    at TCP.onread (net.js:597:20)",
  "name": "error",
  "length": 316,
  "severity": "ERROR",
  "code": "40P01",
  "detail": "Process 1000 waits for AccessExclusiveLock on relation 17925 of database 16384; blocked by process 986.\nProcess 986 waits for RowShareLock on relation 17870 of database 16384; blocked by process 1000.",
  "hint": "See server log for query details.",
  "file": "deadlock.c",
  "line": "1140",
  "routine": "DeadLockReport",
  "level": "error",
  "timestamp": "2018-10-15T20:54:29.221Z"
}

这是我运行的测试命令:jest --logHeapUsage --forceExit --runInBand

  • 我还尝试了这个:jest --logHeapUsage --forceExit --maxWorkers=2

几乎所有的测试都会运行某种数据库函数。当我们添加更多测试时,出现了这个问题。有其他人遇到过相同的问题吗?


我在进行基于GraphQL快照的测试时遇到了同样的问题,需要发送/期望响应的大量查询/变异,然后PostgreSQL开始死锁,你已经找到解决方法了吗? - SkyzohKey
1个回答

1

根据我们得到的错误信息,由于行共享锁而导致死锁;

这意味着两个事务(我们称其为transactionOne和transactionTwo)已经锁定了另一个事务需要的资源。

例子:

transactionOne locks record in UserTable with userId = 1
transactionTwo locks record in UserTable with userId = 2

transactionOne attempts to update in UserTable for userId = 2, but since it is locked by another transaction - it waits for the lock to be released

transactionTwo attempts to update in UserTable for userId = 1, but since it is locked by another transaction - it waits for the lock to be released

Now the SQL engine detects that there is a deadlock and randomly picks one of the transactions and terminates it.

Lets say the SQL engine picks transactionOne and terminates it. This will result in the exception that is posted in the question.

transactionTwo is now allowed to perform an update in UserTable for user with userId = 1. 
transactionTwo completes with success

SQL引擎在检测死锁方面非常快,异常将会立即出现。

这就是死锁的原因。死锁可能有不同的根本原因。

我看到您使用了pg插件。请确保正确使用事务:pg node-postgres transactions

我怀疑有几个不同的根本原因及其解决方案:

原因1:多个测试运行在同一个数据库实例上

可能是不同的CI流水线针对同一个Postgres实例执行相同的测试

解决方案:

这是最不可能的情况,但是CI流水线应该在每次运行时为其自己提供单独的Postgres实例。

原因2:事务没有适当处理catch("ROLLBACK")

这意味着一些事务可能会保持活动状态并阻止其他事务。

解决方案:所有事务都应该有适当的错误处理。


  const client = await pool.connect()
  try {
    await client.query('BEGIN')

    //do what you have to do

    await client.query('COMMIT')
  } catch (e) {
    await client.query('ROLLBACK')
    throw e
  } finally {
    client.release()
  }

原因3:并发。例如:测试在并行运行时,会导致死锁。

我们正在编写可扩展的应用程序。这意味着死锁是不可避免的。我们必须为此做好准备并适当处理。

解决方案:使用“让我们再试一次”的策略。当我们在代码中检测到死锁异常时,只需重试有限次数即可。这种方法已经在我所有的生产应用程序中得到了验证,已经有十多年的历史了。

带有帮助函数的解决方案:

//Sample deadlock wrapper
const handleDeadLocks = async (action, currentAttepmt = 1 , maxAttepmts = 3) {
  try {
    return await action();
  } catch (e) {
    //detect it is a deadlock. Not 100% sure whether this is deterministic enough
    const isDeadlock = e.stack?.includes("deadlock detected");
    const nextAttempt = currentAttepmt + 1;
    if (isDeadlock && nextAttempt <= maxAttepmts) {
      //try again
      return await handleDeadLocks(action, nextAttempt, maxAttepmts); 
    } else {
      throw e;
    }

  }
}

//our db access functions
const updateUserProfile = async (input) => {
  return handleDeadLocks(async () => {
    //do our db calls
  });
};

如果代码变得过于复杂/嵌套,我们可以尝试使用高阶函数来实现另一种解决方案。

const handleDeadLocksHOF = (funcRef, maxAttepmts = 3) {
  return async (...args) {
    const currentAttepmt = 1;
    while (currentAttepmt <= maxAttepmts) {
      try {
        await funcRef(...args);
      } catch (e) {
        const isDeadlock = e.stack?.includes("deadlock detected");
        if (isDeadlock && currentAttepmt + 1 < maxAttepmts) {
          //try again
          currentAttepmt += 1;
        } else {
           throw e;
        }
      }
    }
  }
}



// instead of exporting the updateUserProfile we should export the decorated func, we can control how many retries we want or keep the default

// old code:

export const updateUserProfile = (input) => {
  //out legacy already implemented data access code
}

// new code
const updateUserProfileLegacy = (input) => {
  //out legacy already implemented data access code
}

export const updateUserProfile = handleDeadLocksHOF(updateUserProfile)



网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接