Grouping on a column with repeating values

4

I'm having trouble putting this problem into words, which is probably why I can't find an example. Here is what I'm trying to do.

I have a table like the following:

    | counter|      timestamp      |
    |   1    | 2018-01-01T11:11:01 |
    |   1    | 2018-01-01T11:11:02 |
    |   1    | 2018-01-01T11:11:03 |
    |   2    | 2018-01-01T11:11:04 |
    |   2    | 2018-01-01T11:11:05 |
    |   3    | 2018-01-01T11:11:06 |
    |   3    | 2018-01-01T11:11:07 |
    |   1    | 2018-01-01T11:11:08 |
    |   1    | 2018-01-01T11:11:09 |
    |   1    | 2018-01-01T11:11:10 |

What I want is to group by each consecutive run of counter values, so that if I run a query like:

    SELECT counter, MIN(timestamp) AS st, MAX(timestamp) AS et
    FROM my_table
    GROUP BY counter;

the result would be:

    | counter |          st         |         et          |
    |   1     | 2018-01-01T11:11:01 | 2018-01-01T11:11:03 |
    |   2     | 2018-01-01T11:11:04 | 2018-01-01T11:11:05 |
    |   3     | 2018-01-01T11:11:06 | 2018-01-01T11:11:07 |
    |   1     | 2018-01-01T11:11:08 | 2018-01-01T11:11:10 |

rather than what actually happens, which is:

    | counter |          st         |         et          |
    |   1     | 2018-01-01T11:11:01 | 2018-01-01T11:11:10 |
    |   2     | 2018-01-01T11:11:04 | 2018-01-01T11:11:05 |
    |   3     | 2018-01-01T11:11:06 | 2018-01-01T11:11:07 |

So I'd like to combine grouping with partitioning, preferably without nested queries.

I don't see a question here. - STLDev
Updated to clarify, @STLDeveloper. - Hunter Jackson
3 Answers

4

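You need to assign a group number to each run of consecutive rows that share the same counter value. This can be done with two window functions, lag() and a cumulative sum():

    select counter, min(timestamp) as st, max(timestamp) as et
    from (
        -- the running total of the flags numbers each run of equal counters
        select counter, timestamp, sum(grp) over w as grp
        from (
            -- 1 when counter differs from the previous row, otherwise 0
            select *, (lag(counter, 1, 0) over w <> counter)::int as grp
            from my_table
            window w as (order by timestamp)
            ) s
        window w as (order by timestamp)
        ) s
    group by counter, grp
    order by st;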

DbFiddle.
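
To see why this works, it can help to look at the two intermediate columns. The following is only a sketch and not part of the answer: it assumes Python's built-in sqlite3 module with an SQLite build that supports window functions (3.25 or later) rather than PostgreSQL, so the `::int` cast is dropped because SQLite comparisons already yield 0 or 1. The column name `flag` is introduced here purely for illustration.

```python
import sqlite3

# In-memory copy of the table from the question (SQLite stands in for PostgreSQL).
conn = sqlite3.connect(":memory:")
conn.execute("create table my_table (counter integer, timestamp text)")
counters = [1, 1, 1, 2, 2, 3, 3, 1, 1, 1]
conn.executemany(
    "insert into my_table values (?, ?)",
    [(c, f"2018-01-01T11:11:{i:02d}") for i, c in enumerate(counters, start=1)],
)

# flag is 1 on the first row of each run; the running sum of flag numbers the runs.
steps = conn.execute("""
    select counter, timestamp, flag,
           sum(flag) over (order by timestamp) as grp
    from (
        select counter, timestamp,
               (lag(counter, 1, 0) over (order by timestamp) <> counter) as flag
        from my_table
    ) s
    order by timestamp
""").fetchall()

for row in steps:
    print(row)
# flag column: 1, 0, 0, 1, 0, 1, 0, 1, 0, 0
# grp column:  1, 1, 1, 2, 2, 3, 3, 4, 4, 4
```

The second run of counter 1 receives grp 4 instead of 1, which is what lets `group by counter, grp` keep it separate from the first run.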


1

You need to compute a new grouping column:

    create table tbl(counter int, ts timestamp);
    insert into tbl values
        (1, '2018-01-01T11:11:01'),
        (1, '2018-01-01T11:11:02'),
        (1, '2018-01-01T11:11:03'),
        (2, '2018-01-01T11:11:04'),
        (2, '2018-01-01T11:11:05'),
        (3, '2018-01-01T11:11:06'),
        (3, '2018-01-01T11:11:07'),
        (1, '2018-01-01T11:11:08'),
        (1, '2018-01-01T11:11:09'),
        (1, '2018-01-01T11:11:10');

    10 rows affected

    select min(counter) as counter, min(ts) as st, max(ts) as et
    from
    (
        select counter, ts, sum(rst) over (order by ts) as grp
        from
             (
             select counter, ts,
                    case when coalesce(lag(counter) over (order by ts), -1) <> counter then 1 end rst
             from   tbl
             ) t1
    ) t2
    group by grp;

    counter | st                  | et
    ------: | :------------------ | :------------------
          3 | 2018-01-01 11:11:06 | 2018-01-01 11:11:07
          1 | 2018-01-01 11:11:08 | 2018-01-01 11:11:10
          2 | 2018-01-01 11:11:04 | 2018-01-01 11:11:05
          1 | 2018-01-01 11:11:01 | 2018-01-01 11:11:03

db<>fiddle here



1
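You can use ranking functions.

    select counter, min(timestamp) st, max(timestamp) et
    from (select *,
                   -- position of the row in the whole table
                   row_number() over (order by timestamp) Seq1,
                   -- position of the row among rows with the same counter
                   row_number() over (partition by counter order by timestamp) Seq2
          from my_table
         ) t
    group by counter, (Seq1-Seq2);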

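This uses the difference between the two row numbers (Seq1-Seq2) as part of the GROUP BY key. Within a run of consecutive rows with the same counter, both row numbers increase by one per row, so their difference stays constant; once the run is interrupted by another counter value, Seq1 moves ahead of Seq2 and the difference changes.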
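
The effect of the difference is easier to see with the numbers written out. This is a sketch only, again assuming Python's built-in sqlite3 module with an SQLite build that supports window functions (3.25 or later) in place of PostgreSQL; the variable names `numbered` and `keys` are illustrative and do not come from the answer.

```python
import sqlite3

# In-memory copy of the table from the question (SQLite stands in for PostgreSQL).
conn = sqlite3.connect(":memory:")
conn.execute("create table my_table (counter integer, timestamp text)")
counters = [1, 1, 1, 2, 2, 3, 3, 1, 1, 1]
conn.executemany(
    "insert into my_table values (?, ?)",
    [(c, f"2018-01-01T11:11:{i:02d}") for i, c in enumerate(counters, start=1)],
)

# Seq1 numbers all rows; Seq2 restarts for each distinct counter value.
numbered = conn.execute("""
    select counter,
           row_number() over (order by timestamp) as seq1,
           row_number() over (partition by counter order by timestamp) as seq2
    from my_table
    order by timestamp
""").fetchall()

# (counter, Seq1 - Seq2) is the key that the GROUP BY clause uses.
keys = [(counter, seq1 - seq2) for counter, seq1, seq2 in numbered]
for key in keys:
    print(key)
# (1, 0) three times, (2, 3) twice, (3, 5) twice, (1, 4) three times
```

The difference is only guaranteed to separate runs of the same counter; two different counters can in general share a difference, which is why counter stays in the GROUP BY clause alongside it.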
