如何清理SSISDB?

30

我设置时忽略了数据保留期,现在我的数据库变得非常庞大,因此想要减小它的大小。如果我简单地更改保留期(原为365天),那么会导致SSIS运行我的程序包出现问题。我甚至试着逐步更改保留期,但删除语句会创建锁,阻止新作业的运行。

有什么解决方法吗?我考虑过只是创建一个新的SSISDB。

4个回答

45

Phil Brammer在他的文章Catalog Indexing Recommendations中涵盖了与SSIS目录的维护相关的问题,包括以下内容。

根本问题

根本问题是MS试图以RI为前提设计SSIS,但他们太懒了,允许级联删除发生而不是明确处理它们。

开箱即用,新的SSIS 2012目录数据库(SSISDB)应用了一些基本索引,并将引用完整性设置为在大多数表之间执行级联删除。

输入SQL Agent作业“SSIS Server Maintenance Job”。默认情况下,此作业设置为每天午夜运行,并使用两个目录参数来运行:“定期清理日志”和“保留期(天)”。当这些设置时,维护作业会清除任何超出指定保留期的数据。

此维护作业循环中每次从internal.operations中删除10条记录,然后级联到许多下游表。在我们的情况下,我们每天需要删除大约3000个操作记录(每次10个!),这相当于从internal.operation_messages中删除160万行。那只是一个下游表!整个过程完全,彻底地锁定了SSISDB数据库,无法进行任何SELECT/INSERT数据操作。

解决方案

在MS改变工作方式之前,支持的选项是

将维护作业计划移动到适合您环境的更合适的时间

我知道在我的当前客户端,我们只在凌晨加载数据,因此在营业时间内SSISDB很安静。

如果在较安静的时期运行维护作业不是一个选项,那么您需要编写自己的删除语句,以尝试使级联删除“更少吸气”。

在我的当前客户端,我们已经每晚运行大约200个包裹10个月,并且也有365天的历史记录。我们最大的表,按数量级排序为。

Schema    Table                   RowCount
internal  event_message_context   1,869,028
internal  operation_messages      1,500,811
internal  event_messages          1,500,803

所有数据的驱动程序internal.operations仅有3300行,这与Phil的评论相符,即此类数据的增长呈指数级。

因此,请确定要清除的operation_id,并从叶子表开始逐步删除,直到核心的internal.operations表。

USE SSISDB;
SET NOCOUNT ON;
IF object_id('tempdb..#DELETE_CANDIDATES') IS NOT NULL
BEGIN
    DROP TABLE #DELETE_CANDIDATES;
END;

CREATE TABLE #DELETE_CANDIDATES
(
    operation_id bigint NOT NULL PRIMARY KEY
);

DECLARE @DaysRetention int = 100;
INSERT INTO
    #DELETE_CANDIDATES
(
    operation_id
)
SELECT
    IO.operation_id
FROM
    internal.operations AS IO
WHERE
    IO.start_time < DATEADD(day, -@DaysRetention, CURRENT_TIMESTAMP);

DELETE T
FROM
    internal.event_message_context AS T
    INNER JOIN
        #DELETE_CANDIDATES AS DC
        ON DC.operation_id = T.operation_id;

DELETE T
FROM
    internal.event_messages AS T
    INNER JOIN
        #DELETE_CANDIDATES AS DC
        ON DC.operation_id = T.operation_id;

DELETE T
FROM
    internal.operation_messages AS T
    INNER JOIN
        #DELETE_CANDIDATES AS DC
        ON DC.operation_id = T.operation_id;

-- etc
-- Finally, remove the entry from operations

DELETE T
FROM
    internal.operations AS T
    INNER JOIN
        #DELETE_CANDIDATES AS DC
        ON DC.operation_id = T.operation_id;

常见的警告适用

  • 不要相信来自互联网上的陌生人的代码
  • 使用ssistalk和/或系统表中的图表来识别所有依赖关系
  • 您可能需要将删除操作仅分割为较小的操作
  • 您可能会因为操作而受益,但请务必重新启用带有检查选项的RI,以便它们得到信任。
  • 如果操作持续时间超过4小时,请咨询您的dba

2020年7月编辑

Tim Mitchell在SSIS目录自动清理更好的清理SSIS目录数据库上有一组很好的文章,以及他的新书SSIS目录:安装、管理、安全和监视您的企业ETL基础架构

@Yong Jun Kim在评论中指出

如果您正在使用Azure Data Factory中的SSIS IR,则可能会出现SSIS DB具有不同的表名,后面带有scaleout。它可以是internal.event_message_context_scaleout而不是internal.event_message_context。它可以是internal.operations_messages_scaleout而不是internal.operations_messages。只需相应地修改代码中的表名,它就可以正常运行

如果您正在Azure Data Factory中使用SSIS IR,这当然是正确的。您会发现“正常”的表仍然存在但为空,而*_scaleout版本包含所有数据。

参考文献


这非常有帮助。我可以清空数据库,从头开始吗?虽然会失去历史记录,但还有其他什么不利因素吗? - Paul Riker
只要你手头有 .ispac 文件和环境设置文件,或者不介意重新创建,"drop and recreate" 就能起作用。 - billinkc
2
应该更新,以包括https://support.microsoft.com/en-us/kb/2829948。 - Martin Smith
2
@MartinSmith 已完成,谢谢。顺便说一句,时间非常完美,因为我必须清理一个2012 SSISDB,它出了问题。 - billinkc
1
现在 SSIS DB 可能会在表名后面加上 scaleout,例如 internal.event_message_context 可能变成 internal.event_message_context_scaleout,internal.operations_messages 可能变成 internal.operations_messages_scaleout。只需相应地修改代码中的表名,就可以正常运行了。 - Yong Jun Kim
太好了,@YongJunKim 抓住了重点。我还没有涉及扩展方面的内容,所以这非常有用。 - billinkc

1
我创建了一个类似的存储过程来进行归档,如下所示。如果有任何错误,请告诉我。我不提供保证,但对我来说它很好用。这段代码并不完美,但我想分享一下(例如使用临时表意味着只能同时运行一次,也许会更好地使用会话作用域表)。
我曾经遇到一个问题,即 MS 自己的清理过程会使 LDF 文件爆炸,并锁定表很长时间,有一次导致服务器空间耗尽。我决定先删除较大的表,然后再删除操作表来解决这个问题。下面的这个过程在日志空间中从不使用超过 1GB,并且不会锁定表太长时间,这是 SSIS 作业必须全天运行的一个进退两难的情况。
首先,它将日志记录到一个表中。
CREATE TABLE [dbo].[ETL_SSIS_Operations_Archived](
    [id] [int] IDENTITY(1,1) NOT NULL,
    [operation_id_str] [varchar](900) NOT NULL,
    [event_messages_context] [int] NULL,
    [event_messages] [int] NULL,
    [operation_messages] [int] NULL,
    [num_operators] [int] NULL,
    [chunksize] [int] NULL,
    [DateStarted] [datetime] NOT NULL,
    [DateFinished] [datetime] NULL,
    [executionSecs] [int] NULL,
    [DelOperationsDateStarted] [datetime] NULL,
    [DelOperationsDateFinished] [datetime] NULL,
    [DelOperationsExecutionSecs] [int] NULL
) ON [PRIMARY]
GO

并使用临时表

CREATE TABLE [dbo].[tmp_etl_operations_id](
    [operation_id] [int] NULL,
    [dateCreated] [datetime] NULL default getdate()
) ON [PRIMARY]
GO

ALTER TABLE [dbo].[tmp_etl_operations_id] ADD  DEFAULT (getdate()) FOR [dateCreated]
GO

这是程序的步骤。
    CREATE PROCEDURE [dbo].[sp_Archive_SSIDB_Catalogue]
AS
BEGIN

    DECLARE @MyCursor as CURSOR;
    DECLARE @l_operation_id int;
    declare @l_rows_del int = 1

    declare @l_operation_id_str varchar(8000) = ''
    declare @l_id int

    declare @l_event_message_context int = 0
    declare @l_event_messages        int = 0
    declare @l_operation_messages    int = 0

    declare @l_loop_num int = 1
    declare @C_BULK_NUM int = 100
    declare @C_CHUNK_SIZE int = 100000

    declare @l_last_rec char(1)

    SET @MyCursor = CURSOR FOR
       with params as
       (
           -- i round up the midnight that day, just so i know once it is done for the day it is done
           -- and if the real maintenance job was to run after this (just for the sake of it to double ensure nothing has been missed), but not actually need to do
           -- anything as its already done in here, no new operations would have snuck in due to the sliding system time
           SELECT cast(dateadd(day,1,GETDATE() - CONVERT(int,property_value)) as date)  ArchiveDate 
           FROM  ssisdb.[catalog].[catalog_properties]
           WHERE property_name = 'RETENTION_WINDOW'
       )
       select operation_id,iif(r=c,'Y','N') lastrec
       from
       (
           select operation_id,row_number() over (partition by null order by operation_id) r,count(*) over (partition by null) c
           FROM ssisdb.[internal].[operations] 
           WHERE ( [end_time] <= (select ArchiveDate from params)
           -- A special case when END_TIME is null, we will delete the records based on the created time 
           OR ([end_time] IS NULL AND [status] = 1 AND [created_time] <= (select ArchiveDate from params) ))
       ) x
       order by operation_id


    OPEN @MyCursor;
    FETCH NEXT FROM @MyCursor INTO @l_operation_id,@l_last_rec

    WHILE @@FETCH_STATUS = 0
    BEGIN
        set @l_operation_id_str = @l_operation_id_str+','+cast(@l_operation_id as varchar(100))

        if @l_loop_num = 1
        begin
           delete from tmp_etl_operations_id
           set @l_operation_id_str = cast(@l_operation_id as varchar(100))
        end

        insert into tmp_etl_operations_id (operation_id)  select @l_operation_id

        if @l_loop_num = @C_BULK_NUM or @l_last_rec='Y'
        begin
            set @l_loop_num = 1

            set @l_event_message_context = 0
            set @l_event_messages        = 0
            set @l_operation_messages    = 0

            insert into ETL_SSIS_Operations_Archived ([operation_id_str], num_operators,chunksize, event_messages_context, event_messages, operation_messages, datestarted)
            select @l_operation_id_str, @C_BULK_NUM,@C_CHUNK_SIZE,@l_event_message_context,@l_event_messages,@l_operation_messages,getdate()
            --where 0 = (select count(*) from ETL_SSIS_Operations_Archived where operation_id=@l_operation_id_str)

            set @l_id = Scope_Identity() 

            set @l_rows_del = @C_CHUNK_SIZE
            while (@l_rows_del >= @C_CHUNK_SIZE)
            begin
             delete top (@C_CHUNK_SIZE)
             from   ssisdb.internal.event_message_context
             where  operation_id in (select operation_id from etl..tmp_etl_operations_id)

             set @l_rows_del = @@ROWCOUNT
             set @l_event_message_context = @l_event_message_context+@l_rows_del

             update ETL_SSIS_Operations_Archived 
             set    event_messages_context = event_messages_context+@l_rows_del
             where  id = @l_id--operation_id = @l_operation_id_str

            end

            set @l_rows_del = @C_CHUNK_SIZE

            while (@l_rows_del >= @C_CHUNK_SIZE)
            begin
             delete top (@C_CHUNK_SIZE)
             from   ssisdb.internal.event_messages
             where  operation_id in (select operation_id from tmp_etl_operations_id)

             set @l_rows_del = @@ROWCOUNT
             set @l_event_messages = @l_event_messages+@l_rows_del

             update ETL_SSIS_Operations_Archived 
             set    event_messages = event_messages+@l_rows_del
             where  id = @l_id--operation_id = @l_operation_id_strwhere  operation_id = @l_operation_id_str 

            end

            set @l_rows_del = @C_CHUNK_SIZE
            while (@l_rows_del >= @C_CHUNK_SIZE)
            begin
             delete top (@C_CHUNK_SIZE)
             from   ssisdb.internal.operation_messages
             where  operation_id in (select operation_id from tmp_etl_operations_id)

             set @l_rows_del = @@ROWCOUNT
             set @l_operation_messages = @l_operation_messages+@l_rows_del

             update ETL_SSIS_Operations_Archived 
             set    operation_messages = operation_messages+@l_rows_del
             where  id = @l_id--operation_id = @l_operation_id_strwhere  operation_id = @l_operation_id_str -- 

            end

             update ETL_SSIS_Operations_Archived 
             set    DateFinished = getdate()
                   ,executionSecs =  Datediff(s, DateStarted, getdate())
                   ,DelOperationsDateStarted = getdate()
             where  id = @l_id--operation_id = @l_operation_id_strwhere  operation_id = @l_operation_id_str -- 


             -- lets delete the operations now
             delete --top (@C_CHUNK_SIZE)
             from   ssisdb.internal.operations
             where  operation_id in (select operation_id from tmp_etl_operations_id)

             update ETL_SSIS_Operations_Archived 
             set    DelOperationsDateFinished = getdate()
                   ,DelOperationsExecutionSecs =  Datediff(s, DelOperationsDateStarted, getdate())
             where  id = @l_id--operation_id = @l_operation_id_strwhere  operation_id = @l_operation_id_str -- 

        end
        else
        begin
            set @l_loop_num = @l_loop_num+1
        end

        FETCH NEXT FROM @MyCursor INTO @l_operation_id,@l_last_rec


    END

    CLOSE @MyCursor;
    DEALLOCATE @MyCursor;

END

1

从这篇文章中找到了一个有用的清理脚本:SQL SERVER – Huge Size of SSISDB – Catalog Database SSISDB Cleanup Script

    USE SSISDB
    GO
    DELETE FROM [internal].[executions] 
    GO
    DELETE FROM [internal].[executable_statistics]
    GO
    DELETE FROM [internal].[execution_component_phases] 
    GO
    DELETE FROM [internal].[execution_data_statistics] 
    GO
    DELETE FROM [internal].[execution_data_taps] 
    GO
    DELETE FROM [internal].[execution_parameter_values]
    GO
    DELETE FROM [internal].[execution_property_override_values]
    GO
    DELETE FROM [internal].[extended_operation_info]
    GO
    DELETE FROM [internal].[operation_messages]
    GO
    DELETE FROM [internal].[event_messages]
    GO
    DELETE FROM [internal].[event_message_context]
    GO
    DELETE FROM [internal].[operation_os_sys_info]
    GO
    DELETE FROM [internal].[operation_permissions]
    GO
    DELETE FROM [internal].[validations]
    GO

还要考虑将数据库切换到简单恢复模式,但要为日志文件留足够的空间。

-1

运行此存储过程:

EXEC [internal].[cleanup_server_retention_window]

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接