从多个重叠时间间隔中查找经过的时间的SQL

10

不使用 MSSQL、DB2 或 Oracle。 没有 CTE。 没有 OVERLAP 谓词。 没有 INTERVAL 数据类型。 情况:在需要修理的车辆上,工作不能开始,直到为该工作订购的所有零件都已到货。 在开始维修之前,可能会多次订购零件。 我们需要提取车辆处于“零件保留状态”下的时间。

因此,对于标识为 id = 1 的车辆, 零件在 4 个不同的时间(d1)被订购并接收(d2)。

    ID     d1     d2
     1     8/1    8/8
     1     8/2    8/6
     1     8/12   8/14
     1     8/3    8/10

 8/1                             8/8
  d1                              d2   
  |-------------------------------|  
         8/2             8/6                    8/12      8/14                  
         d1               d2                     d1        d2     
          |---------------|                      |----------|    
                   8/3                 8/10
                   d1                    d2
                   |---------------------|   
 8/1                                                       8/14
  |---------------------------------------------------------|  = 13 days
                                        8/10    8/12
  |--------------------------------------|    +  |----------|  = parts hold  = 11 days

从上面可以看出,开始工作的等待时间(假设8/1为该车辆可用于工作的日期)为13天。 等待零件的实际时间为11天,这是我们需要从数据中推导出来的数字。 实际的日期时间数据将是我们提取小时数的时间戳,我们在此示例数据中使用日期仅为简单演示。 我们正在努力生成一个基于集合的解决方案(不是psm、udf或游标)。祝好。


我认为在辅助日历表上进行左连接可能会有所帮助。 - Martin Smith
可能是如何在一组日期范围中找到间隙的好方法?的重复问题。 - Brian Roach
@Brian,这个问题有很大的不同。楼主,你能否添加一个视图来辅助查询? - Derek Kromm
如果您查看您的“绘图”示例,可以尝试一种解决方案,该方案将计算每个日期未交付订单的数量。8/1-8/2 = 2,8/2-8/3 = 3,8/3-8/6 = 4等等。当您到达8/10-8/12时,您将没有未完成的交付。计算具有非零值的日期,并给出等待天数。 - Frode N. Rosand
3个回答

8

我无法让 @Alex W 的查询正常工作。它不是标准的 SQL 语句,因此需要大量修改以兼容 SQL Server(我可以测试)。但它确实启发了我,我进行了扩展。


查找每个连续等待期的起始点:

SELECT DISTINCT
    t1.ID,
    t1.d1 AS date,
    -DATEDIFF(DAY, (SELECT MIN(d1) FROM Orders), t1.d1) AS n
FROM Orders t1
LEFT JOIN Orders t2                   -- Join for any events occurring while this
    ON t2.ID = t1.ID                  -- is starting. If this is a start point,
    AND t2.d1 <> t1.d1                -- it won't match anything, which is what
    AND t1.d1 BETWEEN t2.d1 AND t2.d2 -- we want.
GROUP BY t1.ID, t1.d1, t1.d2
HAVING COUNT(t2.ID) = 0

以及端点的对应项:

SELECT DISTINCT
    t1.ID,
    t1.d2 AS date,
    DATEDIFF(DAY, (SELECT MIN(d1) FROM Orders), t1.d2) AS n
FROM Orders t1
LEFT JOIN Orders t2
    ON t2.ID = t1.ID
    AND t2.d2 <> t1.d2
    AND t1.d2 BETWEEN t2.d1 AND t2.d2
GROUP BY t1.ID, t1.d1, t1.d2
HAVING COUNT(t2.ID) = 0

n 是距离某个共同时间点的天数。起始点具有负值,而结束点具有正值。这是为了方便我们将它们相加以得到它们之间的天数。

span = end - start
span = end + (-start)
span1 + span2 = end1 + (-start1) + end2 + (-start2)

最后,我们只需要把所有的东西加起来:
SELECT ID, SUM(n) AS hold_days
FROM (
   SELECT DISTINCT
       t1.id,
       t1.d1 AS date,
       -DATEDIFF(DAY, (SELECT MIN(d1) FROM Orders), t1.d1)  AS n
   FROM Orders t1
   LEFT JOIN Orders t2
      ON t2.ID = t1.ID
      AND t2.d1 <> t1.d1
      AND t1.d1 BETWEEN t2.d1 AND t2.d2
   GROUP BY t1.ID, t1.d1, t1.d2
   HAVING COUNT(t2.ID) = 0
   UNION ALL
   SELECT DISTINCT
       t1.id,
       t1.d2 AS date,
       DATEDIFF(DAY, (SELECT MIN(d1) FROM Orders), t1.d2) AS n
   FROM Orders t1
   LEFT JOIN Orders t2
      ON t2.ID = t1.ID
      AND t2.d2 <> t1.d2
      AND t1.d2 BETWEEN t2.d1 AND t2.d2
   GROUP BY t1.ID, t1.d1, t1.d2
   HAVING COUNT(t2.ID) = 0
   ORDER BY ID, date
) s
GROUP BY ID;

输入表(订单):

ID   d1           d2
 1   2011-08-01   2011-08-08
 1   2011-08-02   2011-08-06
 1   2011-08-03   2011-08-10
 1   2011-08-12   2011-08-14
 2   2011-08-01   2011-08-03
 2   2011-08-02   2011-08-06
 2   2011-08-05   2011-08-09

输出:

ID   hold_days
 1          11
 2           8

或者,您可以使用存储过程完成此操作。

CREATE PROCEDURE CalculateHoldTimes
    @ID int = 0
AS
BEGIN
    DECLARE Events CURSOR FOR
    SELECT *
    FROM (
        SELECT d1 AS date, 1 AS diff
        FROM Orders
        WHERE ID = @ID
        UNION ALL
        SELECT d2 AS date, -1 AS diff
        FROM Orders
        WHERE ID = @ID
    ) s
    ORDER BY date;

    DECLARE @Events_date date,
            @Events_diff int,
            @Period_start date,
            @Period_accum int,
            @Total_start date,
            @Total_count int;

    OPEN Events;

    FETCH NEXT FROM Events
    INTO @Events_date, @Events_diff;

    SET @Period_start = @Events_date;
    SET @Period_accum = 0;
    SET @Total_start = @Events_date;
    SET @Total_count = 0;

    WHILE @@FETCH_STATUS = 0
    BEGIN
        SET @Period_accum = @Period_accum + @Events_diff;

        IF @Period_accum = 1 AND @Events_diff = 1
            -- Start of period
            SET @Period_start = @Events_date;
        ELSE IF @Period_accum = 0 AND @Events_diff = -1
            -- End of period
            SET @Total_count = @Total_count +
                DATEDIFF(day, @Period_start, @Events_date);

        FETCH NEXT FROM Events
        INTO @Events_date, @Events_diff;
    END;

    SELECT
        @Total_start AS d1,
        @Events_date AS d2,
        @Total_count AS hold_time;
END;

使用以下命令调用:

EXEC CalculateHoldTimes 1;

谢谢MizardX,这正是我们正在寻找的。 - jon
现在你可以了。 :) 无论如何,如果你认为它回答了你的问题,你仍然可以接受一个答案。点击投票箭头下面的勾号。 - Markus Jarderot
MizardX:Alex回复中提到的SQL方言是Sybase Advantage使用的方言,而我们使用的就是这个产品,因此他的回复让我感到非常高兴。 - jon
@MarkusJarderot,如果您有时间,我有几个问题:https://stackoverflow.com/questions/48531322/sql-counting-days-with-gap-overlapping?noredirect=1#comment84058272_48531322 - George

5

这个SQL语句似乎可以得到你想要的结果(t是样本表的表名):

SELECT
   d.id, 
   d.duration, 
   d.duration - 
   IFNULL(
      ( SELECT Sum( timestampdiff( SQL_TSI_DAY, 
                                   no_hold.d2, 
                                   ( SELECT min(d1) FROM t t4 
                                     WHERE t4.id = no_hold.id and t4.d1 > no_hold.d2 )))
        FROM ( SELECT DISTINCT id, d2 FROM t t1 
               WHERE ( SELECT sum( IIF( t1.d2 between t2.d1 and t2.d2, 1, 0 ) ) 
                       FROM t t2 WHERE t2.id = t1.id and t2.d2 <> t1.d2 ) = 0 
             And d2 <> ( select max( d2 ) from t t3 where t3.id = t1.id )) no_hold
        WHERE no_hold.id = d.id ),
      0 ) "parts hold"
FROM 
   ( SELECT id, timestampdiff( SQL_TSI_DAY, min( d1 ), max( d2 ) ) duration
     FROM t GROUP BY id ) d

外部查询获取维修工作的持续时间。复杂的子查询计算出未等待零件的总天数。这是通过定位车辆不等待零件的开始日期来完成的,然后计算到它再次开始等待零件的天数:

// 1) The query for finding the starting dates when the vehicle is not waiting for parts, 
// i.e. finding all d2 that is not within any date range where the vehicle is waiting for part.
// The DISTINCT is needed to removed duplicate starting "no hold" period.

SELECT DISTINCT id, d2 
FROM t t1
WHERE ( SELECT sum( IIF( t1.d2 between t2.d1 and t2.d2, 1, 0 ) ) from t t2 
        WHERE t2.id = t1.id and t2.d2 <> t1.d2 ) = 0 AND 
      d2 <> ( SELECT max( d2 ) FROM t t3 WHERE t3.id = t1.id ) )

// 2) 车辆不等待零件的日期是从上述查询日期到车辆再次等待零件的日期

timestampdiff( SQL_TSI_DAY, no_hold.d2, ( SELECT min(d1) FROM t t4 WHERE t4.id = no_hold.id and t4.d1 > no_hold.d2 ) )

将上述两个部分结合起来,并汇总所有这样的时间段,就可以得到该车辆不等待零件的天数。最终查询添加了一个额外条件,计算外部查询中每个id的结果。

在具有许多id的非常大的表上,这可能不是非常高效的。如果id被限制为一个或只有几个,那么它应该很好。


编辑以解决“无持有”期间重复开始日期的问题。 - Alex W
Alex:在上面的主代码块中,缺少一个右括号。 - jon
并且 d2 <> (select max(d2) from t t3 where t3.id = t1.id) - jon

0
USE [DnnMasterShoraSystem]
GO
/****** Object:  StoredProcedure [dbo].[CalculateHoldTimes]    Script Date: 12/8/2014 1:36:12 PM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO

ALTER PROCEDURE [dbo].[CalculateHoldTimes]
    @PID int    
AS
BEGIN      
CREATE TABLE #tblTemp(
    [ID] [int] NOT NULL,
    [PID] [int] NOT NULL,
    [BID] [int] NOT NULL,
    [Active] [bit] NULL,
    [WorkStartDate] [nvarchar](10) NULL,
    [WorkEndDate] [nvarchar](10) NULL,
    [jobStateID] [int] NULL,
    [RegisterType] [int] NULL,
    [RegisterState] [int] NULL,
    [En_time] [datetime] NULL,
    [Fa_time] [nvarchar](40) NULL,
    [Status] [nvarchar](100) NULL,
    [PortalId] [int] NULL,
    [ModuleId] [int] NULL,
    [UserId] [int] NULL,
    [BrName] [nvarchar](150) NULL,
    [BrCode] [nvarchar](20) NULL,
    [WorkEndDate_New] [nvarchar](10) NULL
) ON [PRIMARY]

insert into #tblTemp
select * from [dbo].[Shora.Personel_Branch_Copy] 
        where WorkStartDate is not null 
        --and [dbo].[ShamsiToMiladi](WorkStartDate) <GETDATE() 
        --and [dbo].[ShamsiToMiladi](WorkEndDate) <GETDATE() 
        and PID=@PID
        --and [dbo].[ShamsiToMiladi](WorkEndDate)<[dbo].[ShamsiToMiladi](@NewDate)
        order by WorkStartDate

DECLARE Events CURSOR FOR
    SELECT [dbo].[ShamsiToMiladi](WorkStartDate) AS StartDate,[dbo].[ShamsiToMiladi](WorkEndDate) AS EndDate
        FROM #tblTemp        
    ORDER BY StartDate;

--drop table #tblTemp

    DECLARE @SDate date,
            @EDate date,
            @Period_Start date,
            @Period_End date,
            @Total int,
            @OldSDate date,
            @OldEDate date


    OPEN Events;

    FETCH NEXT FROM Events
    INTO @SDate, @EDate;

    set @Total=0
    SET @Period_Start =@SDate
    set @Period_End=@EDate

    WHILE @@FETCH_STATUS = 0
    BEGIN      
    if @OldSDate>@Period_End
        begin
            set @Period_Start=@SDate            

            if @Period_End>=@Period_Start
            set @Total+=DATEDIFF(DAY,@Period_Start,@Period_End)
        end
    else if @SDate<@Period_End
        begin       
        set @Period_Start=@Period_Start     
            set @Total=DATEDIFF(DAY,@Period_Start,@Period_End)
        end

        set @OldSDate=@SDate 
        set @OldEDate=@EDate

        FETCH NEXT FROM Events
        INTO @SDate, @EDate;

        if  @Period_End<@EDate
        set @Period_End=@EDate

    END;

INSERT INTO [dbo].[PersonelDays]
           (PID
           ,[Total_Start]
           ,[Total_End]
           ,[Total_count])
     VALUES
           (@PID,           
            @Period_Start,
            @Period_End,
            @Total
           )

drop table #tblTemp
CLOSE Events
DEALLOCATE Events
END;

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接