Find overlapping (date/time) rows within one table

14

I have a table in which each row stores a meeting with a start date/time and an end date/time.

meetingID int
meetingStart datetime
meetingEnd datetime

Desired output:
For each pair of overlapping rows, I want to output
meetingID,meetingStart,meetingID,meetingEnd

What is the most efficient way to run such a query in MySQL?


Comparing date ranges - viral
4 Answers

25
SELECT  m1.meetingID, m1.meetingStart, m1.meetingEnd, m2.meetingID
FROM    t_meeting m1, t_meeting m2
WHERE   (m2.meetingStart BETWEEN m1.meetingStart AND m1.meetingEnd
        OR m2.meetingEnd BETWEEN m1.meetingStart AND m1.meetingEnd)
        AND m1.meetingID <> m2.meetingID

This will select each pair twice.

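If you want to select each pair only once, use the query below. Because each pair is now examined in one order only, it also has to test whether m2 completely contains m1:

SELECT  m1.meetingID, m1.meetingStart, m1.meetingEnd, m2.meetingID
FROM    t_meeting m1, t_meeting m2
WHERE   (m2.meetingStart BETWEEN m1.meetingStart AND m1.meetingEnd
        OR m2.meetingEnd BETWEEN m1.meetingStart AND m1.meetingEnd
        OR m1.meetingStart BETWEEN m2.meetingStart AND m2.meetingEnd)
        AND m2.meetingID > m1.meetingID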

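Make sure you have indexes on meetingStart and meetingEnd so that the query runs efficiently.

MySQL, however, will probably use INDEX MERGE to run this query, which is not very efficient in the current implementation.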
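
The overlap condition can also be written as two plain comparisons, m1.meetingStart <= m2.meetingEnd AND m2.meetingStart <= m1.meetingEnd, which covers partial overlap and full containment in one test. Below is a minimal sketch of that variant. It runs against an in-memory SQLite database through Python's sqlite3 module rather than MySQL, only so that it can be executed without a server, and the sample rows are made up for illustration.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the question.
rows = [
    (1, "2024-01-01 09:00:00", "2024-01-01 10:00:00"),
    (2, "2024-01-01 09:30:00", "2024-01-01 09:45:00"),  # inside meeting 1
    (3, "2024-01-01 08:00:00", "2024-01-01 12:00:00"),  # contains meetings 1 and 2
    (4, "2024-01-01 13:00:00", "2024-01-01 14:00:00"),  # overlaps nothing
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t_meeting ("
    "meetingID INTEGER PRIMARY KEY, meetingStart TEXT, meetingEnd TEXT)"
)
conn.executemany("INSERT INTO t_meeting VALUES (?, ?, ?)", rows)

# Each pair is reported once (m2.meetingID > m1.meetingID).
# ISO-formatted timestamps compare correctly as strings in SQLite.
pairs = conn.execute(
    """
    SELECT m1.meetingID, m2.meetingID
    FROM   t_meeting m1
    JOIN   t_meeting m2 ON m2.meetingID > m1.meetingID
    WHERE  m1.meetingStart <= m2.meetingEnd
      AND  m2.meetingStart <= m1.meetingEnd
    ORDER  BY m1.meetingID, m2.meetingID
    """
).fetchall()

print(pairs)
```

Whether this form is faster than the BETWEEN version in MySQL depends on the optimizer and the indexes, so it is worth checking with EXPLAIN on real data.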

You can also try:

SELECT  m1.*, m2.*
FROM    (
        SELECT  m1.meetingID AS mid1, m2.meetingID AS mid2
        FROM    t_meeting m1, t_meeting m2
        WHERE   m2.meetingStart BETWEEN m1.meetingStart AND m1.meetingEnd
                AND m2.meetingID <> m1.meetingID
        UNION
        SELECT  m1.meetingID, m2.meetingID
        FROM    t_meeting m1, t_meeting m2
        WHERE   m2.meetingEnd BETWEEN m1.meetingStart AND m1.meetingEnd
                AND m2.meetingID <> m1.meetingID
        ) mo, t_meeting m1, t_meeting m2
WHERE   m1.meetingID = mid1
        AND m2.meetingID = mid2

This query is more complex, but it will most probably run faster.
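
The idea behind the UNION form appears to be that each branch contains a single range condition, so each can be served by one index instead of an index merge. The sketch below does not measure speed; it only checks, on made-up rows, that the UNION of the two branches returns the same ID pairs as the single query with OR. It uses SQLite through Python's sqlite3 module as a stand-in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t_meeting ("
    "meetingID INTEGER PRIMARY KEY, meetingStart TEXT, meetingEnd TEXT)"
)
# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO t_meeting VALUES (?, ?, ?)",
    [
        (1, "2024-01-01 09:00", "2024-01-01 10:00"),
        (2, "2024-01-01 09:30", "2024-01-01 11:00"),
        (3, "2024-01-01 10:30", "2024-01-01 12:00"),
        (4, "2024-01-01 15:00", "2024-01-01 16:00"),
    ],
)

# Single query with OR, as in the first answer.
or_pairs = set(conn.execute(
    """
    SELECT m1.meetingID, m2.meetingID
    FROM   t_meeting m1, t_meeting m2
    WHERE  (m2.meetingStart BETWEEN m1.meetingStart AND m1.meetingEnd
            OR m2.meetingEnd BETWEEN m1.meetingStart AND m1.meetingEnd)
      AND  m1.meetingID <> m2.meetingID
    """
))

# The same condition split into two branches combined with UNION.
union_pairs = set(conn.execute(
    """
    SELECT m1.meetingID, m2.meetingID
    FROM   t_meeting m1, t_meeting m2
    WHERE  m2.meetingStart BETWEEN m1.meetingStart AND m1.meetingEnd
      AND  m2.meetingID <> m1.meetingID
    UNION
    SELECT m1.meetingID, m2.meetingID
    FROM   t_meeting m1, t_meeting m2
    WHERE  m2.meetingEnd BETWEEN m1.meetingStart AND m1.meetingEnd
      AND  m2.meetingID <> m1.meetingID
    """
))

print(or_pairs == union_pairs)
```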


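For the last query, adding "AND m2.meetingID > m1.meetingID" might avoid the duplicates. - miccet
This works, but in the context of meetings it may give unwanted results. For example, if you have one meeting from 5 pm to 6 pm and another from 6 pm to 7 pm, they will match, because both include 6 pm. - Señor Reginold Francis
@Señor Reginold Francis, how can they be made not to match at 6 pm? Do we need to add +-1 minute? - Oleg Abrazhaev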

2

Try this query. It is Quassnoi's solution, modified to ignore the case where one booking ends at the same moment another one starts.

SELECT  m1.meetingID_id, m1.meetingStart , m1.meetingEnd, m2.meetingID_id
FROM    bookings m1, bookings m2
WHERE   (m2.meetingStart BETWEEN m1.meetingStart AND DATE_SUB(m1.meetingEnd, INTERVAL 1 second)
        OR DATE_SUB(m2.meetingEnd, INTERVAL 1 second) BETWEEN m1.meetingStart AND m1.meetingEnd)
        AND m1.meetingID_id > m2.meetingID_id
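
An alternative to subtracting one second is to treat each meeting as a half-open interval, start included and end excluded, and to compare with strict inequalities. In SQL that would read m1.meetingStart < m2.meetingEnd AND m2.meetingStart < m1.meetingEnd. The Python sketch below illustrates the rule with the 5 pm to 6 pm and 6 pm to 7 pm example from the comments; the function name and the dates are hypothetical.

```python
from datetime import datetime


def overlaps(start_a, end_a, start_b, end_b):
    """Half-open test: two [start, end) intervals overlap only if each one
    starts strictly before the other one ends."""
    return start_a < end_b and start_b < end_a


five = datetime(2024, 1, 1, 17, 0)
half_past_five = datetime(2024, 1, 1, 17, 30)
six = datetime(2024, 1, 1, 18, 0)
seven = datetime(2024, 1, 1, 19, 0)

# Back-to-back meetings share only the boundary instant, so they do not overlap.
back_to_back = overlaps(five, six, six, seven)

# A meeting that starts at 5:30 pm does overlap the 5 pm to 6 pm meeting.
real_overlap = overlaps(five, six, half_past_five, seven)

print(back_to_back, real_overlap)
```

Unlike the one-second adjustment, the strict comparison does not depend on the resolution of the stored timestamps.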

0

Maybe something like this:

SELECT m1.meetingID, m2.meetingID
FROM meeting AS m1, meeting AS m2
WHERE m1.meetingID < m2.meetingID
    AND (m1.meetingStart BETWEEN m2.meetingStart AND m2.meetingEnd
        OR m1.meetingEnd BETWEEN m2.meetingStart AND m2.meetingEnd)

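By selecting only rows where m1.meetingID < m2.meetingID, you do not compare a row with itself, and you do not get duplicates, since each pair would otherwise be joined twice, as (m1, m2) and as (m2, m1).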
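
One caveat for conditions of this shape: SQL evaluates AND before OR, so an ID filter that is meant to apply to both range tests has to be combined with a parenthesized OR group. The sketch below shows the difference with constant expressions, where the first 0 stands for a failed ID filter and the final 1 for a range test that succeeds. It uses SQLite through Python's sqlite3 module; MySQL applies the same precedence between AND and OR.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Parsed as (0 AND 0) OR 1: the row passes although the ID filter failed.
unparenthesized = conn.execute("SELECT 0 AND 0 OR 1").fetchone()[0]

# Parsed as 0 AND (0 OR 1): the ID filter applies to both range tests.
parenthesized = conn.execute("SELECT 0 AND (0 OR 1)").fetchone()[0]

print(unparenthesized, parenthesized)
```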

0
To add the start and end times of both meetings to the result row:
SELECT m1.meetingID AS firstID, m1.meetingStart AS firstStart, 
m1.meetingEnd AS firstEnd, m2.meetingID AS secondID, 
m2.meetingStart AS secondStart, m2.meetingEnd AS secondEnd 
FROM meeting AS m1, meeting AS m2 
WHERE (m2.meetingStart BETWEEN m1.meetingStart AND m1.meetingEnd) 
AND (m1.meetingID != m2.meetingID)

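This way, m2 is always the meeting that starts at the same time as or after m1, and m1.meetingID != m2.meetingID ensures that a meeting is not matched with itself.

You do not need to check the meeting end times, because an overlap can be detected reliably by comparing the start times alone: of two overlapping meetings, the one that starts later must start while the other is still running.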
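
The claim that start times are enough can be checked by brute force. The sketch below uses small integers as stand-in timestamps (an assumption made purely for illustration) and compares the start-only test, applied in both orders as the self-join does, against the full closed-interval overlap test.

```python
from itertools import product


def closed_overlap(s1, e1, s2, e2):
    """Reference test: closed intervals [s1, e1] and [s2, e2] share a point."""
    return s1 <= e2 and s2 <= e1


def start_inside(s1, e1, s2, e2):
    """The BETWEEN test from the query: the second interval starts inside the first."""
    return s1 <= s2 <= e1


# All intervals with integer endpoints 0..5 and start <= end.
intervals = [(s, e) for s, e in product(range(6), repeat=2) if s <= e]

# The join sees every pair in both orders, so test both orders here too.
mismatches = [
    (a, b)
    for a, b in product(intervals, repeat=2)
    if closed_overlap(*a, *b) != (start_inside(*a, *b) or start_inside(*b, *a))
]

print(mismatches)
```

Note that two meetings with identical start times satisfy the test in both orders, so such a pair is returned twice by the query.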

