Time windows in PostgreSQL

Question

I'm new to PostgreSQL (specifically, I'm using a Timescale database), and I have a question about time windows.

Data:

date      |customerid|names   
2014-01-01|1         |Andrew 
2014-01-02|2         |Pete   
2014-01-03|2         |Andrew 
2014-01-04|2         |Steve  
2014-01-05|2         |Stef   
2014-01-06|3         |Stef  
2014-01-07|1         |Jason 
2014-01-08|1         |Jason

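The question is: looking back x days (from the perspective of each row), how many distinct names share the same ID?

For x = 2 days, the result should look like this: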

date      |customerid|names  |count 
2014-01-01|1         |Andrew |1 
2014-01-02|2         |Pete   |1 
2014-01-03|2         |Andrew |2 
2014-01-04|2         |Steve  |3 
2014-01-05|2         |Stef   |3 
2014-01-06|3         |Stef   |1
2014-01-07|1         |Jason  |1
2014-01-08|1         |Jason  |1

Is this possible in PostgreSQL without looping over every row?

Additional info: the timestamps in the real data are not equally spaced.

Thanks a lot!

- Dominik

1 Answer

- Gordon Linoff · Accepted Answer

It would be nice if you could do this with a window function:

select t.*,
       count(distinct name) over (partition by id
                                  order by date
                                  range between interval 'x day' preceding and current row
                                 ) as cnt_x
from t;

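Unfortunately, that is not possible, because PostgreSQL does not support count(distinct) as a window function. You can use a lateral join instead: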

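-- x is a placeholder for the look-back in days (for example, interval '2 day');
-- id and name correspond to customerid and names in the question's data.
select t.*, tt.cnt_x
from t left join lateral
     (select count(distinct t2.name) as cnt_x   -- distinct names inside the window
      from t t2
      where t2.id = t.id and
            t2.date >= t.date - interval 'x day' and t2.date <= t.date
     ) tt
     on true;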
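
To make this concrete, here is a minimal sketch of the lateral join run against the sample data from the question with x = 2. The table name visits is hypothetical (the question does not name its table), and the column names are taken from the question's data rather than from the query above.

```sql
-- Hypothetical table mirroring the sample data in the question.
CREATE TEMP TABLE visits (
    date       date,
    customerid int,
    names      text
);

INSERT INTO visits (date, customerid, names) VALUES
    ('2014-01-01', 1, 'Andrew'),
    ('2014-01-02', 2, 'Pete'),
    ('2014-01-03', 2, 'Andrew'),
    ('2014-01-04', 2, 'Steve'),
    ('2014-01-05', 2, 'Stef'),
    ('2014-01-06', 3, 'Stef'),
    ('2014-01-07', 1, 'Jason'),
    ('2014-01-08', 1, 'Jason');

-- For each row, count the distinct names seen for the same customerid
-- between (date - 2 days) and date, both ends inclusive.
SELECT v.*, w.cnt_x
FROM visits v
LEFT JOIN LATERAL (
    SELECT count(DISTINCT v2.names) AS cnt_x
    FROM visits v2
    WHERE v2.customerid = v.customerid
      AND v2.date >= v.date - interval '2 days'
      AND v2.date <= v.date
) w ON true
ORDER BY v.date;
```

Because the window is defined by a date range rather than a row count, this should behave the same way when the timestamps are not equally spaced.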

For performance, you want an index on (id, date, name).
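
As a sketch of what that index could look like, assume a hypothetical table visits that uses the question's column names (customerid, date, names); both the table name and the index name below are illustrative only.

```sql
-- Hypothetical table with the question's columns.
CREATE TABLE IF NOT EXISTS visits (
    date       date,
    customerid int,
    names      text
);

-- Equality column first (customerid), then the range column (date),
-- then the counted column (names) so the index covers the subquery.
CREATE INDEX IF NOT EXISTS visits_customerid_date_names_idx
    ON visits (customerid, date, names);
```

With this column order, the planner may be able to satisfy each per-row subquery from the index alone; running EXPLAIN on the real data is the way to check whether it actually does.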