如何避免在更新大量记录时，UPDATE语句锁定整个表格

Question

如何避免在更新大量记录时，UPDATE语句锁定整个表格

sql-servert-sql

14

我对锁和提示不太熟悉。

我的表进行非常频繁的SELECT和INSERT操作，该表有1100万条记录。

我已经向其中添加了一个新列，并且需要将同一表中现有列中的数据复制到新列中。

我计划使用ROWLOCK提示来避免升级锁定到表级别锁定并阻止表上的所有其他操作。例如：

UPDATE 
    SomeTable WITH (ROWLOCK)
SET
    NewColumn = OldColumn

问题:

使用 NOLOCK 是否比使用 ROWLOCK 更好？注意，一旦记录插入表中，OldColumn 的值不会改变，因此 NOLOCK 不会导致脏读。
在这种情况下，使用 NOLOCK 是否有意义，因为SQL Server无论如何都必须获取 UPDATE 的更新锁。
是否有更好的方法来实现这个目的？

我知道应该避免使用提示，并且SQL Server通常会做出更明智的选择，但我不想在此更新期间将表锁定。

- HappyTown

您当前的语句将在单个事务中更新表中的所有记录，因此无论如何，您都将锁定事务中的所有记录。 - MikeS

我曾经看到 Kendra Little 在更新操作中使用了一个巧妙的技巧。虽然更新操作本身不支持并行处理，但是为了加快速度，她使用了 CTE 中的 SELECT 语句来实现并行读取。 - pacreely

4个回答

7

我采用了 @pacreely 的方法（请参见他在此问题上的答案），批量更新并创建了一个 update...top 变体。我添加了 (rowlock) 提示，告诉 SQL Server 保持行级别的锁定。

有关详细信息，请参见 update...top。还要注意，在使用 update、insert、merge、delete 语句时，无法使用 order by 来使用 top，因此所引用的行不会以任何顺序排列。

declare @BatchSize  int = 1000
declare @RowCount   int = @BatchSize

while @RowCount > 0
begin
    update top (@BatchSize) SomeTable with (rowlock)
    set NewColumn = OldColumn
    where 
        NewColumn <> OldColumn      or
        (
            NewColumn is null       and
            OldColumn is not null
        )
    select @RowCount = @@rowcount
end

- HappyTown

2

这个问题在StackExchange的Database Administrators网站上有答案，链接在这里：https://dba.stackexchange.com/questions/127158/how-to-get-sql-insert-and-or-update-to-not-lock-entire-table-on-ms-sql-server

- STLDev

0

我们最近遇到了一个类似的情况，但需要在几天内慢慢更新（每次运行仅更新特定数量的记录，只在某些时间段内进行）。最近的数据没问题，但数以百万计的旧数据需要被更新。我们的数据表看起来像这样：

Create Table FileContent
(
FileContent varchar(max),
File_PK bigint,
NewFileContent varchar(max)
)

我们只需要更新某些行，但是有数百万行。我们创建了一个表来存储我们的进度，以便我们可以使用定期作业来迭代更新主表，然后将此表填充为需要更新的主表记录的主键:

Create Table FilesToUpdate
(
File_PK bigint,
IsUpdated bit NOT NULL DEFAULT 0
)

然后我们安排了以下脚本来进行更新（为了您自己的使用，请根据您的系统调整批处理大小和计划安排）。

/***  
Script to update and fix records.
***/
DECLARE @Rowcount INT = 1 -- 
    ,   @BatchSize INT = 100 -- how many rows will be updated on each iteration of the loop 
    ,   @BatchesToRun INT = 25 -- the max number of times the loop will iterate
    ,   @StartingRecord BIGINT = 1;

-- Get the highest File_PK not already fixed as a starting point.
Select @StartingRecord = MAX(File_PK) From FilesToUpdate where IsUpdated = 0

-- While there are still rows to update and we haven't hit our limit on iterations... 
WHILE (@Rowcount > 0 and @BatchesToRun > 0)   
BEGIN
    print Concat('StartingRecord (Start of Loop): ', @StartingRecord)
    UPDATE FileContent SET  NewFileContent = 'New value here'
    WHERE File_PK BETWEEN (@StartingRecord - @BatchSize + 1) AND @StartingRecord;

    -- @@Rowcount is the number of records affected by the last statement.  If this returns 0, the loop will stop because we've run out of things to update.
    SET @Rowcount = @@ROWCOUNT;
    print Concat('RowCount: ', @Rowcount)

    -- Record which PKs were updated so we know where to start next time around.
    UPDATE FilesToUpdate Set IsUpdated = 1 where File_PK BETWEEN (@StartingRecord - @BatchSize + 1) AND @StartingRecord;

    -- The loop will stop after @BatchSize*@BatchesToRun records are updated. 
    -- If there aren't that many records left to update, the @Rowcount checks will stop it. 
    SELECT @BatchesToRun = @BatchesToRun - 1
    print Concat('Batches Remaining: ',@BatchesToRun)

    -- Set the starting record for the next time through the loop.
    SELECT @StartingRecord -= @BatchSize
    print Concat('StartingRecord (End of Loop): ', @StartingRecord)
END

- Jen R

网页内容由stack overflow 提供, 点击上面的

可以查看英文原文，
原文链接

- pacreely · Accepted Answer

尝试分批更新。

DECLARE @Batch INT = 1000
DECLARE @Rowcount INT = @Batch


WHILE @Rowcount > 0
    BEGIN
        ;WITH CTE AS 
        (
            SELECT TOP (@Batch) NewColumn,OldColumn 
            FROM SomeTable 
            WHERE NewColumn <> OldColumn
                  OR (NewColumn IS NULL AND OldColumn IS NOT NULL)
        )
        UPDATE cte
            SET NewColumn = OldColumn;
        SET @Rowcount = @@ROWCOUNT
    END