按连续行的会话数对SQL行进行排序

Question

按连续行的会话数对SQL行进行排序

3

我有一张表格：

 id |   emp_id   |    telecom_id    |
----+------------+------------------+
  1 | 1          | 1                |
  2 | 1          | 1                |
  3 | 1          | 1                |
  4 | 1          | 2                |
  5 | 1          | 3                |
  6 | 1          | 3                |
  7 | 1          | 1                |
  8 | 2          | 5                |
  9 | 2          | 1                |
 10 | 1          | 1                |
 11 | 2          | 1                |
 12 | 2          | 1                |

以下是创建和填充表格的命令，以方便操作：

CREATE TABLE table1 (
        id int NOT NULL,
        emp_id varchar(255),
        telecom_id varchar(255)
    );

    insert into table1 (id, emp_id, telecom_id) values(1, '1', '1');
    insert into table1 (id, emp_id, telecom_id) values(2, '1', '1');
    insert into table1 (id, emp_id, telecom_id) values(3, '1', '1');
    insert into table1 (id, emp_id, telecom_id) values(4, '1', '2');
    insert into table1 (id, emp_id, telecom_id) values(5, '1', '3');
    insert into table1 (id, emp_id, telecom_id) values(6, '1', '3');
    insert into table1 (id, emp_id, telecom_id) values(7, '1', '1');
    insert into table1 (id, emp_id, telecom_id) values(8, '2', '5');
    insert into table1 (id, emp_id, telecom_id) values(9, '2', '1');
    insert into table1 (id, emp_id, telecom_id) values(10, '1', '1');
    insert into table1 (id, emp_id, telecom_id) values(11, '2', '1');
    insert into table1 (id, emp_id, telecom_id) values(12, '2', '1');

我需要对这个表中的行进行排名，使得每个会话的行都有相同的排名。会话是具有相等emp_id和telecom_id的连续行系列。

例如，行1-3形成一个会话，因为所有3行的emp_id = 1和telecom_id = 1。第4行形成另一个会话。行5-6形成第三个会话等等。

在排名中使用存储在表中的数据顺序非常重要。

期望的输出：

 id |   emp_id   |    telecom_id    | rnk
----+------------+------------------+------
  1 | 1          | 1                | 1
  2 | 1          | 1                | 1
  3 | 1          | 1                | 1
  4 | 1          | 2                | 2
  5 | 1          | 3                | 3
  6 | 1          | 3                | 3
  7 | 1          | 1                | 4
  8 | 2          | 5                | 5
  9 | 2          | 1                | 6
 10 | 1          | 1                | 7
 11 | 2          | 1                | 8
 12 | 2          | 1                | 8

我尝试了多种窗口函数选项，但没有一种能按预期工作。这是我尝试产生最接近期望结果的方法：

select emp_id, telecom_id, rank() 
over(partition by emp_id, telecom_id order by id) as rnk
from table1;

我正在使用PostgreSQL。

- Dmitriy Fialkovskiy

使用dense_rank代替rank。 - Ajan Balakumaran

@AjanBalakumaran，无论是RANK()还是DENSE_RANK()都产生了与我想要实现的结果不同的结果。 - Dmitriy Fialkovskiy

你使用哪种数据库管理系统？ - jarlh

你能提供一个Fiddle或者已声明的表格来展示你的数据集吗？这样其他人就可以更容易地为你工作了。 - Ajan Balakumaran

@AjanBalakumaran，我更新了我的问题，并添加了创建表格的命令。 - Dmitriy Fialkovskiy

@jarlh，PostgreSQL - Dmitriy Fialkovskiy

1个回答

网页内容由stack overflow 提供, 点击上面的

可以查看英文原文，
原文链接

- D-Shih · Accepted Answer

您可以尝试使用lag窗口函数获取先前值，并使用带有窗口函数的条件聚合函数SUM来制定您的逻辑。

CREATE TABLE table1 (
        id int NOT NULL,
        emp_id varchar(255),
        telecom_id varchar(255)
    );

    insert into table1 (id, emp_id, telecom_id) values(1, '1', '1');
    insert into table1 (id, emp_id, telecom_id) values(2, '1', '1');
    insert into table1 (id, emp_id, telecom_id) values(3, '1', '1');
    insert into table1 (id, emp_id, telecom_id) values(4, '1', '2');
    insert into table1 (id, emp_id, telecom_id) values(5, '1', '3');
    insert into table1 (id, emp_id, telecom_id) values(6, '1', '3');
    insert into table1 (id, emp_id, telecom_id) values(7, '1', '1');
    insert into table1 (id, emp_id, telecom_id) values(8, '2', '5');
    insert into table1 (id, emp_id, telecom_id) values(9, '2', '1');
    insert into table1 (id, emp_id, telecom_id) values(10, '1', '1');
    insert into table1 (id, emp_id, telecom_id) values(11, '2', '1');
    insert into table1 (id, emp_id, telecom_id) values(12, '2', '1');

查询 1:

SELECT id,emp_id,telecom_id,
       SUM(CASE WHEN 
            pretelecomVal = telecom_id 
            and pre_emp_idVal = emp_id 
           then 0 else 1 end) over(order by id) rnk
FROM (
  select *,
         lag(telecom_id) over(partition by emp_id order by id) pretelecomVal,
         lag(emp_id) over(order by id) pre_emp_idVal
  from table1
) t1

Results:

| id | emp_id | telecom_id | rnk |
|----|--------|------------|-----|
|  1 |      1 |          1 |   1 |
|  2 |      1 |          1 |   1 |
|  3 |      1 |          1 |   1 |
|  4 |      1 |          2 |   2 |
|  5 |      1 |          3 |   3 |
|  6 |      1 |          3 |   3 |
|  7 |      1 |          1 |   4 |
|  8 |      2 |          5 |   5 |
|  9 |      2 |          1 |   6 |
| 10 |      1 |          1 |   7 |
| 11 |      2 |          1 |   8 |
| 12 |      2 |          1 |   8 |