PostgreSQL范围递归SQL查询以查找可用性

4

我按照这篇博客文章的步骤进行了操作:https://info.crunchydata.com/blog/range-types-recursion-how-to-search-availability-with-postgresql

CREATE TABLE travels (
    id serial PRIMARY KEY,
    travel_dates daterange NOT NULL,
    EXCLUDE USING spgist (travel_dates WITH &&)
);

我发现当我连续插入了持续时间行时,该函数存在错误

CREATE OR REPLACE FUNCTION travels_get_available_dates(daterange)
RETURNS TABLE(available_dates daterange)
AS $$
    WITH RECURSIVE calendar AS (
        SELECT
            $1 AS left,
             $1 AS center,
             $1 AS right
        UNION
        SELECT
            CASE travels.travel_dates && calendar.left
                WHEN TRUE THEN daterange(lower(calendar.left), lower(travels.travel_dates * calendar.left))
                ELSE daterange(lower(calendar.right), lower(travels.travel_dates * calendar.right))
            END AS left,
            CASE travels.travel_dates && calendar.left
                WHEN TRUE THEN travels.travel_dates * calendar.left
                ELSE travels.travel_dates * calendar.right
            END AS center,
            CASE travels.travel_dates && calendar.right
                WHEN TRUE THEN daterange(upper(travels.travel_dates * calendar.right), upper(calendar.right))
                ELSE daterange(upper(travels.travel_dates * calendar.left), upper(calendar.left))
            END AS right
        FROM calendar
        JOIN travels ON
            travels.travel_dates && $1 AND
            travels.travel_dates <> calendar.center AND (
                travels.travel_dates && calendar.left OR
                travels.travel_dates && calendar.right
            )
)
SELECT *
FROM (
    SELECT
        a.left AS available_dates
    FROM calendar a
    LEFT OUTER JOIN calendar b ON
        a.left <> b.left AND
        a.left @> b.left
    GROUP BY a.left
    HAVING NOT bool_or(COALESCE(a.left @> b.left, FALSE))
    UNION
    SELECT
        a.right AS available_dates
    FROM calendar a
    LEFT OUTER JOIN calendar b ON
        a.right <> b.right AND
        a.right @> b.right
    GROUP BY a.right
    HAVING NOT bool_or(COALESCE(a.right @> b.right, FALSE))
) a
$$ LANGUAGE SQL STABLE;


INSERT INTO travels (travel_dates)
VALUES
    (daterange('2018-03-02', '2018-03-02', '[]')),
    (daterange('2018-03-06', '2018-03-09', '[]')),
    (daterange('2018-03-11', '2018-03-12', '[]')),
    (daterange('2018-03-16', '2018-03-17', '[]')),
    (daterange('2018-03-25', '2018-03-27', '[]'));

目前这个方案按预期工作。

SELECT *
FROM travels_get_available_dates(daterange('2018-03-01', '2018-04-01'))
ORDER BY available_dates;

available_dates
-------------------------
[2018-03-01,2018-03-02)
[2018-03-03,2018-03-06)
[2018-03-10,2018-03-11)
[2018-03-13,2018-03-16)
[2018-03-18,2018-03-25)
[2018-03-28,2018-04-01)

但是当添加这一行时:

INSERT INTO travels (travel_dates)
VALUES
(daterange('2018-03-03', '2018-03-05', '[]'));

然后重新运行

SELECT *
FROM travels_get_available_dates(daterange('2018-03-01', '2018-04-01'))
ORDER BY available_dates;

我理解

available_dates
-------------------------
empty
4个回答

3

我在原始博客文章中添加了一条评论,说明了我认为错误出现的位置,即在处理空范围的方式上。

当日期范围是连续的,或者说相邻的时候,在左侧和/或右侧的"空"范围中会产生"空"范围。现在,在递归CTE完成后(假设空范围位于左侧列中),在"LEFT OUTER JOIN ... ON ..."子句中,一个自由且有效的travel_date与B.left范围中的一个"空"范围匹配,因为A.left <> 'empty' && A.left @> 'empty',而所有范围都显然包含空范围。理想情况下,它应该与NULL配对,因为这是左外连接,以便将其包含在最终结果集中,但是"empty"有点妨碍了这个过程。"empty"再次在"GROUP BY ... HAVING ..."子句中出现,在这里a.left @> 'empty'评估为真并被否定,因此所有有效的旅行日期都被丢弃,导致一个空表。我的解决方案如下:将"empty"作为空值,并丢弃任何位于"center"的日期范围:

CREATE OR REPLACE FUNCTION travels_get_available_dates(daterange)
RETURNS TABLE(available_dates daterange)
AS $$
    WITH RECURSIVE calendar AS (
        SELECT
            $1 AS left,
             $1 AS center,
             $1 AS right
        UNION
        SELECT
            CASE travels.travel_dates && calendar.left
                WHEN TRUE THEN daterange(lower(calendar.left), lower(travels.travel_dates * calendar.left))
                ELSE daterange(lower(calendar.right), lower(travels.travel_dates * calendar.right))
            END AS left,
            CASE travels.travel_dates && calendar.left
                WHEN TRUE THEN travels.travel_dates * calendar.left
                ELSE travels.travel_dates * calendar.right
            END AS center,
            CASE travels.travel_dates && calendar.right
                WHEN TRUE THEN daterange(upper(travels.travel_dates * calendar.right), upper(calendar.right))
                ELSE daterange(upper(travels.travel_dates * calendar.left), upper(calendar.left))
            END AS right
        FROM calendar
        JOIN travels ON
            travels.travel_dates && $1 AND
            travels.travel_dates <> calendar.center AND (
                travels.travel_dates && calendar.left OR
                travels.travel_dates && calendar.right
            )
)
SELECT *
FROM (
    SELECT
        a.left AS available_dates
    FROM calendar a
    LEFT OUTER JOIN calendar b ON
        a.left <> b.left AND
        a.left @> b.left
    GROUP BY a.left
    HAVING NOT bool_or(coalesce(a.left @> case when isempty(b.left) then null else b.left end, FALSE))

    UNION

    SELECT
        a.right AS available_dates
    FROM calendar a
    LEFT OUTER JOIN calendar b ON
        a.right <> b.right AND
        a.right @> b.right
    GROUP BY a.right
    HAVING NOT bool_or(coalesce(a.right @> case when isempty(b.right) then null else b.right end, false))

    EXCEPT

    SELECT a.center AS available_dates
    FROM calendar a
    LEFT OUTER JOIN calendar b ON
        a.center <> b.center AND
        a.center @> b.center
    GROUP BY a.center
    HAVING NOT bool_or(COALESCE(a.center @> b.center, FALSE))
) a
WHERE NOT isempty(a.available_dates)
$$ LANGUAGE SQL STABLE;

0

我认为你应该采取另一种方法:

CREATE OR REPLACE FUNCTION travels_get_available_dates(daterange)
RETURNS TABLE(
  available_dates daterange
)
AS $$
  WITH RECURSIVE calendar(available_dates) AS
  (
    SELECT 
      CASE 
        WHEN $1 @> travel_dates THEN unnest(array[
          daterange(lower($1),lower(travel_dates)),
          daterange(upper(travel_dates),upper($1)) 
        ])
        WHEN lower($1) < lower(travel_dates) THEN daterange(lower($1),lower(travel_dates)) 
        WHEN upper($1) > upper(travel_dates) THEN daterange(upper(travel_dates),upper($1)) 
      END
    FROM travels 
      WHERE $1 && travel_dates AND NOT travel_dates @> $1
    UNION
    SELECT 
      CASE 
        WHEN available_dates @> travel_dates THEN unnest(array[
          daterange(lower(available_dates),lower(travel_dates)), 
          daterange(upper(travel_dates),upper(available_dates)) 
        ])
        WHEN lower(available_dates) < lower(travel_dates) THEN daterange(lower(available_dates),lower(travel_dates)) 
        WHEN upper(available_dates) > upper(travel_dates) THEN daterange(upper(travel_dates),upper(available_dates)) 
      END
    FROM travels 
      JOIN calendar ON available_dates && travel_dates AND NOT travel_dates @> available_dates
  )

  SELECT $1 AS available_dates 
    WHERE NOT EXISTS(SELECT 1 FROM travels WHERE travel_dates <@ $1)    
  UNION
  SELECT * FROM calendar
    WHERE $1 <> available_dates AND 'empty' <> available_dates
      AND NOT EXISTS(SELECT 1 FROM travels WHERE available_dates && travel_dates)
$$ LANGUAGE SQL STABLE;

我们必须递归地将给定的范围分成左右段,然后仅获取那些未被占用的部分。


这个函数抛出一个错误:"ERROR: 不允许在CASE语句中使用返回集合的函数 LINE 10: WHEN $1 @> travel_dates THEN unnest(array[ ^ - jkatz05
我在SQLfiddle上使用PostgreSQL 9.6进行了测试 - http://sqlfiddle.com/#!17/0ee63/1 - IVO GELOV
谢谢回答。我可以在sqlfiddle上看到它运行,但是在我的本地Postgres实例上无法运行。与@jkatz05相同的错误 -DETAIL:级联删除到表旅行 级联删除到函数travels_get_available_dates(daterange) ERROR:CASE中不允许返回集合函数 LINE 31:当$1 @> travel_dates THEN unnest(array [ ^ HINT:您可能能够将返回集合函数移动到LATERAL FROM项中。 SQL状态:0A000 字符:876``` - mike james
我们需要分割后的左右两部分。也许您可以修改查询以使用2个CTE - 一个用于左侧部分,另一个用于右侧部分,然后进行联合 - 但我认为这不会产生正确的输出。 - IVO GELOV

0

我最初忘记了“中心”区域的条款。以下是它:

CREATE OR REPLACE FUNCTION travels_get_available_dates(daterange)
RETURNS TABLE(available_dates daterange)
AS $$
    WITH RECURSIVE calendar AS (
        SELECT
            $1 AS left,
             $1 AS center,
             $1 AS right
        UNION
        SELECT
            CASE travels.travel_dates && calendar.left
                WHEN TRUE THEN daterange(lower(calendar.left), lower(travels.travel_dates * calendar.left))
                ELSE daterange(lower(calendar.right), lower(travels.travel_dates * calendar.right))
            END AS left,
            CASE travels.travel_dates && calendar.left
                WHEN TRUE THEN travels.travel_dates * calendar.left
                ELSE travels.travel_dates * calendar.right
            END AS center,
            CASE travels.travel_dates && calendar.right
                WHEN TRUE THEN daterange(upper(travels.travel_dates * calendar.right), upper(calendar.right))
                ELSE daterange(upper(travels.travel_dates * calendar.left), upper(calendar.left))
            END AS right
        FROM calendar
        JOIN travels ON
            travels.travel_dates && $1 AND
            travels.travel_dates <> calendar.center AND (
                travels.travel_dates && calendar.left OR
                travels.travel_dates && calendar.right
            )
)
SELECT *
FROM (
    SELECT
        a.left AS available_dates
    FROM calendar a
    LEFT OUTER JOIN calendar b ON
        a.left <> b.left AND
        a.left @> b.left
    GROUP BY a.left
    HAVING NOT bool_or(COALESCE(a.left @> b.left, FALSE))
    UNION
    SELECT a.center AS available_dates
    FROM calendar a
    LEFT OUTER JOIN calendar b ON
        a.center <> b.center AND
        a.center @> b.center
    GROUP BY a.center
    HAVING NOT bool_or(COALESCE(a.center @> b.center, FALSE))
    UNION
    SELECT
        a.right AS available_dates
    FROM calendar a
    LEFT OUTER JOIN calendar b ON
        a.right <> b.right AND
        a.right @> b.right
    GROUP BY a.right
    HAVING NOT bool_or(COALESCE(a.right @> b.right, FALSE))
) a
WHERE NOT isempty(a.available_dates)
$$ LANGUAGE SQL STABLE;

谢谢你的回答,但我认为这种方法行不通。它似乎没有考虑到3月3日至6日之间的可用性,而只是插入了其他日期范围。 VALUES (daterange('2018-03-02', '2018-03-02', '[]')), (daterange('2018-03-06', '2018-03-09', '[]')), (daterange('2018-03-11', '2018-03-12', '[]')), (daterange('2018-03-16', '2018-03-17', '[]')), (daterange('2018-03-18', '2018-03-19', '[]')), (daterange('2018-03-25', '2018-03-27', '[]'));``` - mike james

0

我无法让递归函数正常工作——我只会得到一个无限循环。然而,你不需要使用递归来解决这个问题!你可以使用 PostgreSQL 窗口函数。

https://www.postgresql.org/docs/current/tutorial-window.html

给定原始代码:

CREATE TABLE travels (
    id serial PRIMARY KEY,
    travel_dates daterange NOT NULL,
    EXCLUDE USING spgist (travel_dates WITH &&)
);

并插入以下值:

INSERT INTO travels (travel_dates)
VALUES
    (daterange('2018-03-02', '2018-03-02', '[]')),
    (daterange('2018-03-06', '2018-03-09', '[]')),
    (daterange('2018-03-11', '2018-03-12', '[]')),
    (daterange('2018-03-16', '2018-03-17', '[]')),
    (daterange('2018-03-25', '2018-03-27', '[]'));

以下 SQL 将查找所有可用日期(我添加了连续可用日期的数量,因为这是我需要的内容):
SELECT LOWER(lead(travel_dates) OVER w) - UPPER(travel_dates) as available_count,
       UPPER(travel_dates) AS available_start
  FROM travels
WINDOW w AS (ORDER BY travel_dates ASC);

请注意,available_count为空表示在旅行表中的最后日期之后无限可用。如果需要,您可以以不同的方式处理空值;此外,如果您想将其限制为在两个给定日期之间检查可用性,则可以这样添加WHERE子句(例如限制在2018-03-01和2018-03-15之间):

SELECT LOWER(lead(travel_dates) OVER w) - UPPER(travel_dates) as available_count,
       UPPER(travel_dates) AS available_start
  FROM travels
 WHERE '[2018-03-01,2018-03-15]'::daterange @> travel_dates
WINDOW w AS (ORDER BY travel_dates ASC);

在这种情况下,您需要忽略空值;我还没有找到最简洁的方法来做到这一点,但您可以将其作为子查询...

SELECT sq.available_count, sq.available_start
  FROM (
    SELECT LOWER(lead(travel_dates) OVER w) - UPPER(travel_dates) as available_count,
           UPPER(travel_dates) AS available_start
      FROM travels
     WHERE '[2018-03-01,2018-03-15]'::daterange @> travel_dates
    WINDOW w AS (ORDER BY travel_dates ASC)
  ) AS sq
 WHERE sq.available_count is not null;

我相信有更好的方法来完成这个任务...但我不知道是什么 :) 请注意,如果您想在范围内包括当天并获取“旅行”表中第一天到现在的可用性,则可以在WHERE子句中使用“today”。

希望这能帮助其他人; 我花了大约两天时间才解决这个问题!


网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接