使用特定逻辑的Oracle row_number/rank

3

我需要选择某种机制来对使用row_number或rank函数排列的行进行排序。我尝试了使用RNK1和RNK2列的情况,但是我并不确定是否可能实现。请查看实际和期望的结果。

with tmp as (
    select 'username1' as username, to_date('2019-04-01 00:00','YYYY-MM-DD HH24:MI') as startdate, 1 as flag from dual union all
    select 'username1' as username, to_date('2019-04-01 01:00','YYYY-MM-DD HH24:MI') as startdate, 0 as flag  from dual union all
    select 'username1' as username, to_date('2019-04-01 02:00','YYYY-MM-DD HH24:MI') as startdate, 1 as flag  from dual union all
    select 'username1' as username, to_date('2019-04-01 03:00','YYYY-MM-DD HH24:MI') as startdate, 1 as flag  from dual union all
    select 'username1' as username, to_date('2019-04-01 04:00','YYYY-MM-DD HH24:MI') as startdate, 0 as flag  from dual union all
    select 'username1' as username, to_date('2019-04-02 05:00','YYYY-MM-DD HH24:MI') as startdate, 0 as flag  from dual union all
    select 'username1' as username, to_date('2019-04-02 01:00','YYYY-MM-DD HH24:MI') as startdate, 1 as flag  from dual union all
    select 'username1' as username, to_date('2019-04-02 02:00','YYYY-MM-DD HH24:MI') as startdate, 1 as flag  from dual union all
    select 'username1' as username, to_date('2019-04-02 03:00','YYYY-MM-DD HH24:MI') as startdate, 1 as flag  from dual union all
    select 'username1' as username, to_date('2019-04-02 04:00','YYYY-MM-DD HH24:MI') as startdate, 0 as flag  from dual
) 
select 
    tmp.*,
    case when flag = 1 then row_number() over(partition by flag order by flag) else null end as rnk1,
    case when flag = 1 then rank() over(partition by flag order by flag) else null end as rnk2
from tmp
order by startdate, username

实际情况:

+-------------+--------------------+--------+--------+--------+
| "USERNAME"  | "STARTDATE"        | "FLAG" | "RNK1" | "RNK2" |
+-------------+--------------------+--------+--------+--------+
| "username1" | 01-APR-19 00:00:00 | 1      | 6      | 1      |
| "username1" | 01-APR-19 01:00:00 | 0      |        |        |
| "username1" | 01-APR-19 02:00:00 | 1      | 4      | 1      |
| "username1" | 01-APR-19 03:00:00 | 1      | 3      | 1      |
| "username1" | 01-APR-19 04:00:00 | 0      |        |        |
| "username1" | 02-APR-19 01:00:00 | 1      | 5      | 1      |
| "username1" | 02-APR-19 02:00:00 | 1      | 1      | 1      |
| "username1" | 02-APR-19 03:00:00 | 1      | 2      | 1      |
| "username1" | 02-APR-19 04:00:00 | 0      |        |        |
| "username1" | 02-APR-19 05:00:00 | 0      |        |        |
+-------------+--------------------+--------+--------+--------+

预期结果:

+-------------+--------------------+--------+--------+--------+
| "USERNAME"  | "STARTDATE"        | "FLAG" | "RNK1" | "RNK2" |
+-------------+--------------------+--------+--------+--------+
| "username1" | 01-APR-19 00:00:00 | 1      | 1      | 1      |
| "username1" | 01-APR-19 01:00:00 | 0      |        |        |
| "username1" | 01-APR-19 02:00:00 | 1      | 2      | 2      |
| "username1" | 01-APR-19 03:00:00 | 1      | 2      | 2      |
| "username1" | 01-APR-19 04:00:00 | 0      |        |        |
| "username1" | 02-APR-19 01:00:00 | 1      | 3      | 3      |
| "username1" | 02-APR-19 02:00:00 | 1      | 3      | 3      |
| "username1" | 02-APR-19 03:00:00 | 1      | 3      | 3      |
| "username1" | 02-APR-19 04:00:00 | 0      |        |        |
| "username1" | 02-APR-19 05:00:00 | 0      |        |        |
+-------------+--------------------+--------+--------+--------+

感谢大家快速回复。我开始尝试您的建议,但又遇到了问题。

with tmp as (
    select 'username1' as username, to_date('2019-04-01 00:00','YYYY-MM-DD HH24:MI') as startdate, 1 as flag, 1 as threshold from dual union all
    select 'username1' as username, to_date('2019-04-01 01:00','YYYY-MM-DD HH24:MI') as startdate, 0 as flag, null as threshold from dual union all
    select 'username1' as username, to_date('2019-04-01 02:00','YYYY-MM-DD HH24:MI') as startdate, 1 as flag, 1 as threshold from dual union all
    select 'username1' as username, to_date('2019-04-01 03:00','YYYY-MM-DD HH24:MI') as startdate, 1 as flag, null as threshold from dual union all
    select 'username1' as username, to_date('2019-04-01 04:00','YYYY-MM-DD HH24:MI') as startdate, 0 as flag, null as threshold from dual union all
    select 'username1' as username, to_date('2019-04-01 05:00','YYYY-MM-DD HH24:MI') as startdate, 0 as flag, null as threshold from dual union all
    select 'username1' as username, to_date('2019-04-02 01:00','YYYY-MM-DD HH24:MI') as startdate, 1 as flag, 1 as threshold from dual union all
    select 'username1' as username, to_date('2019-04-02 02:00','YYYY-MM-DD HH24:MI') as startdate, 1 as flag, null as threshold from dual union all
    select 'username1' as username, to_date('2019-04-02 03:00','YYYY-MM-DD HH24:MI') as startdate, 1 as flag, 1 as threshold from dual union all
    select 'username1' as username, to_date('2019-04-02 04:00','YYYY-MM-DD HH24:MI') as startdate, 1 as flag, null as threshold from dual union all
    select 'username1' as username, to_date('2019-04-02 05:00','YYYY-MM-DD HH24:MI') as startdate, 0 as flag, null as threshold from dual
) 
select 
    tmp.*,
    dense_rank() over( order by startdate, username, threshold)-
     (case when flag=1 then
     row_number()over(partition by flag, username order by startdate, username) - flag
      else null end) as grp
from tmp
order by 
startdate, username

实际:

+-------------+--------------------+--------+-------------+-------+
| "USERNAME"  | "STARTDATE"        | "FLAG" | "THRESHOLD" | "GRP" |
+-------------+--------------------+--------+-------------+-------+
| "username1" | 01-APR-19 00:00:00 | 1      | 1           | 1     |
| "username1" | 01-APR-19 01:00:00 | 0      |             |       |
| "username1" | 01-APR-19 02:00:00 | 1      | 1           | 2     |
| "username1" | 01-APR-19 03:00:00 | 1      |             | 2     |
| "username1" | 01-APR-19 04:00:00 | 0      |             |       |
| "username1" | 01-APR-19 05:00:00 | 0      |             |       |
| "username1" | 02-APR-19 01:00:00 | 1      | 1           | 4     |
| "username1" | 02-APR-19 02:00:00 | 1      |             | 4     |
| "username1" | 02-APR-19 03:00:00 | 1      | 1           | 4     |
| "username1" | 02-APR-19 04:00:00 | 1      |             | 4     |
| "username1" | 02-APR-19 05:00:00 | 0      |             |       |
+-------------+--------------------+--------+-------------+-------+

期望结果:

+-------------+--------------------+--------+-------------+-------+
| "USERNAME"  | "STARTDATE"        | "FLAG" | "THRESHOLD" | "GRP" |
+-------------+--------------------+--------+-------------+-------+
| "username1" | 01-APR-19 00:00:00 | 1      | 1           | 1     |
| "username1" | 01-APR-19 01:00:00 | 0      |             |       |
| "username1" | 01-APR-19 02:00:00 | 1      | 1           | 2     |
| "username1" | 01-APR-19 03:00:00 | 1      |             | 2     |
| "username1" | 01-APR-19 04:00:00 | 0      |             |       |
| "username1" | 01-APR-19 05:00:00 | 0      |             |       |
| "username1" | 02-APR-19 01:00:00 | 1      | 1           | 4     |
| "username1" | 02-APR-19 02:00:00 | 1      |             | 4     |
| "username1" | 02-APR-19 03:00:00 | 1      | 1           | 5     |
| "username1" | 02-APR-19 04:00:00 | 1      |             | 5     |
| "username1" | 02-APR-19 05:00:00 | 0      |             |       |
+-------------+--------------------+--------+-------------+-------+

1
请编辑您的问题以说明您期望的结果 - 您还没有说出您的逻辑是什么。 - Alex Poole
1
kovitals,为什么预期结果中没有 grp = 3? - Amir Kadyrov
4个回答

2
假设您正在将连续的flag = 1行分组,您可以使用Tabibitosan技术和dense_rank的组合来实现,如下所示: 最初的回答
WITH      tmp AS (select 'username1' as username, to_date('2019-04-01 00:00','YYYY-MM-DD HH24:MI') as startdate, 1 as flag from dual union all
                  select 'username1' as username, to_date('2019-04-01 01:00','YYYY-MM-DD HH24:MI') as startdate, 0 as flag  from dual union all
                  select 'username1' as username, to_date('2019-04-01 02:00','YYYY-MM-DD HH24:MI') as startdate, 1 as flag  from dual union all
                  select 'username1' as username, to_date('2019-04-01 03:00','YYYY-MM-DD HH24:MI') as startdate, 1 as flag  from dual union all
                  select 'username1' as username, to_date('2019-04-01 04:00','YYYY-MM-DD HH24:MI') as startdate, 0 as flag  from dual union all
                  select 'username1' as username, to_date('2019-04-02 05:00','YYYY-MM-DD HH24:MI') as startdate, 0 as flag  from dual union all
                  select 'username1' as username, to_date('2019-04-02 01:00','YYYY-MM-DD HH24:MI') as startdate, 1 as flag  from dual union all
                  select 'username1' as username, to_date('2019-04-02 02:00','YYYY-MM-DD HH24:MI') as startdate, 1 as flag  from dual union all
                  select 'username1' as username, to_date('2019-04-02 03:00','YYYY-MM-DD HH24:MI') as startdate, 1 as flag  from dual union all
                  select 'username1' as username, to_date('2019-04-02 04:00','YYYY-MM-DD HH24:MI') as startdate, 0 as flag  from dual),
  tabibitosan AS (SELECT tmp.*,
                         CASE
                           WHEN flag = 1 THEN
                            row_number() over(ORDER BY startdate) - row_number() over(PARTITION BY flag ORDER BY startdate)
                         END grp
                  FROM   tmp)
SELECT username,
       startdate,
       flag,
       CASE
         WHEN flag = 1 THEN
          dense_rank() over(PARTITION BY flag ORDER BY grp)
       END rnk
FROM   tabibitosan
ORDER  BY startdate,
          username;

USERNAME  STARTDATE                 FLAG        RNK
--------- ------------------- ---------- ----------
username1 01/04/2019 00:00:00          1          1
username1 01/04/2019 01:00:00          0 
username1 01/04/2019 02:00:00          1          2
username1 01/04/2019 03:00:00          1          2
username1 01/04/2019 04:00:00          0 
username1 02/04/2019 01:00:00          1          3
username1 02/04/2019 02:00:00          1          3
username1 02/04/2019 03:00:00          1          3
username1 02/04/2019 04:00:00          0 
username1 02/04/2019 05:00:00          0 

我已更新查询以考虑额外的阈值列:

我已更新查询,以考虑额外的阈值列:

WITH      tmp AS (select 'username1' as username, to_date('2019-04-01 00:00','YYYY-MM-DD HH24:MI') as startdate, 1 as flag, 1 as threshold from dual union all
                  select 'username1' as username, to_date('2019-04-01 01:00','YYYY-MM-DD HH24:MI') as startdate, 0 as flag, null as threshold from dual union all
                  select 'username1' as username, to_date('2019-04-01 02:00','YYYY-MM-DD HH24:MI') as startdate, 1 as flag, 1 as threshold from dual union all
                  select 'username1' as username, to_date('2019-04-01 03:00','YYYY-MM-DD HH24:MI') as startdate, 1 as flag, null as threshold from dual union all
                  select 'username1' as username, to_date('2019-04-01 04:00','YYYY-MM-DD HH24:MI') as startdate, 0 as flag, null as threshold from dual union all
                  select 'username1' as username, to_date('2019-04-01 05:00','YYYY-MM-DD HH24:MI') as startdate, 0 as flag, null as threshold from dual union all
                  select 'username1' as username, to_date('2019-04-02 01:00','YYYY-MM-DD HH24:MI') as startdate, 1 as flag, null as threshold from dual union all
                  select 'username1' as username, to_date('2019-04-02 02:00','YYYY-MM-DD HH24:MI') as startdate, 1 as flag, 1 as threshold from dual union all
                  select 'username1' as username, to_date('2019-04-02 03:00','YYYY-MM-DD HH24:MI') as startdate, 1 as flag, 1 as threshold from dual union all
                  select 'username1' as username, to_date('2019-04-02 04:00','YYYY-MM-DD HH24:MI') as startdate, 1 as flag, null as threshold from dual union all
                  select 'username1' as username, to_date('2019-04-02 05:00','YYYY-MM-DD HH24:MI') as startdate, 0 as flag, null as threshold from dual),
  tabibitosan AS (SELECT tmp.*,
                         CASE
                           WHEN flag = 1 THEN
                            row_number() over(PARTITION BY username ORDER BY startdate) - row_number() over(PARTITION BY username, flag ORDER BY startdate)
                         END grp,
                         SUM(CASE WHEN flag = 1 THEN threshold END) OVER (PARTITION BY username, flag ORDER BY startdate) threshold_sum -- assumes threshold is 1 or null; change the case statement inside the sum if this isn't the case
                  FROM   tmp)
SELECT username,
       startdate,
       flag,
       threshold,
       CASE
         WHEN flag = 1 THEN
          dense_rank() over(PARTITION BY flag ORDER BY grp, threshold_sum)
       END rnk
FROM   tabibitosan
ORDER  BY startdate,
          username;

USERNAME  STARTDATE         FLAG  THRESHOLD        RNK
--------- ----------- ---------- ---------- ----------
username1 01/04/2019           1          1          1
username1 01/04/2019           0            
username1 01/04/2019           1          1          2
username1 01/04/2019           1                     2
username1 01/04/2019           0            
username1 01/04/2019           0            
username1 02/04/2019           1                     3
username1 02/04/2019           1          1          4
username1 02/04/2019           1          1          5
username1 02/04/2019           1                     5
username1 02/04/2019           0            

注意:我假定阈值列只能为1或null;如果不是这种情况,你将不得不根据你的数据更新条件求和。

我还更新了分析函数的分区,包括用户名列,因为我假设这是数据的主键。

最后,请注意我稍微更改了你的示例数据,以显示如果第一个flag = 1的组中的第一行的阈值为null,它将成为自己的组,如果下一行设置了阈值。如果这不是你想要的行为,你需要更新你的问题并说明你想要的逻辑。

注:N.B.是拉丁文Nota Bene的缩写,意思是“请注意”。


如果第一个标志为0怎么办? - Amir Kadyrov
如果第一行的标志(flag)不等于1,它将完全按照相同的方式工作。我想知道你为什么认为它会有任何不同的表现? - Boneist
请查看我在该主题中的最后一篇帖子。 - kovitals
@kovitals 我已经更新了我的答案,包括一个额外的阈值列的查询。在未来提问时,请确保您包含您真正需要的情况,并请记住包含所需结果背后的逻辑。感谢您包含样本数据的方式;这使得帮助变得非常容易。 - Boneist

1
尝试像下面这样:
with tmp as (
    select 'username1' as username, to_date('2019-04-01 00:00','YYYY-MM-DD HH24:MI') as startdate, 1 as flag from dual union all
    select 'username1' as username, to_date('2019-04-01 01:00','YYYY-MM-DD HH24:MI') as startdate, 0 as flag  from dual union all
    select 'username1' as username, to_date('2019-04-01 02:00','YYYY-MM-DD HH24:MI') as startdate, 1 as flag  from dual union all
    select 'username1' as username, to_date('2019-04-01 03:00','YYYY-MM-DD HH24:MI') as startdate, 1 as flag  from dual union all
    select 'username1' as username, to_date('2019-04-01 04:00','YYYY-MM-DD HH24:MI') as startdate, 0 as flag  from dual union all
    select 'username1' as username, to_date('2019-04-02 05:00','YYYY-MM-DD HH24:MI') as startdate, 0 as flag  from dual union all
    select 'username1' as username, to_date('2019-04-02 01:00','YYYY-MM-DD HH24:MI') as startdate, 1 as flag  from dual union all
    select 'username1' as username, to_date('2019-04-02 02:00','YYYY-MM-DD HH24:MI') as startdate, 1 as flag  from dual union all
    select 'username1' as username, to_date('2019-04-02 03:00','YYYY-MM-DD HH24:MI') as startdate, 1 as flag  from dual union all
    select 'username1' as username, to_date('2019-04-02 04:00','YYYY-MM-DD HH24:MI') as startdate, 0 as flag  from dual
) 
select 
    tmp.*,
    dense_rank() over( order by startdate, username)-
     (case when flag=1 then
     row_number()over(partition by flag, username order by startdate, username) - flag
      else null end) as grp


from tmp
order by 
startdate, username


USERNAME    STARTDATE   FLAG    GRP
username1   01-APR-19   1       1
username1   01-APR-19   0   
username1   01-APR-19   1       2
username1   01-APR-19   1       2
username1   01-APR-19   0   
username1   02-APR-19   1       3
username1   02-APR-19   1       3
username1   02-APR-19   1       3
username1   02-APR-19   0   
username1   02-APR-19   0   

在线演示


如果第一个标志为0怎么办? - Amir Kadyrov
@akk0rd87,你有同样的问题吗? - Zaynul Abadin Tuhin
如果第一个标志为0,则您的组将从3开始。 - Amir Kadyrov
请查看我在主题中的最后一篇帖子。 - kovitals

1
with tmp as (
select 'username1' as username, to_date('2019-04-01 00:00','YYYY-MM-DD HH24:MI') as startdate, 1 as flag, 1 as threshold from dual union all
select 'username1' as username, to_date('2019-04-01 01:00','YYYY-MM-DD HH24:MI') as startdate, 0 as flag, null as threshold from dual union all
select 'username1' as username, to_date('2019-04-01 02:00','YYYY-MM-DD HH24:MI') as startdate, 1 as flag, 1 as threshold from dual union all
select 'username1' as username, to_date('2019-04-01 03:00','YYYY-MM-DD HH24:MI') as startdate, 1 as flag, null as threshold from dual union all
select 'username1' as username, to_date('2019-04-01 04:00','YYYY-MM-DD HH24:MI') as startdate, 0 as flag, null as threshold from dual union all
select 'username1' as username, to_date('2019-04-01 05:00','YYYY-MM-DD HH24:MI') as startdate, 0 as flag, null as threshold from dual union all
select 'username1' as username, to_date('2019-04-02 01:00','YYYY-MM-DD HH24:MI') as startdate, 1 as flag, 1 as threshold from dual union all
select 'username1' as username, to_date('2019-04-02 02:00','YYYY-MM-DD HH24:MI') as startdate, 1 as flag, null as threshold from dual union all
select 'username1' as username, to_date('2019-04-02 03:00','YYYY-MM-DD HH24:MI') as startdate, 1 as flag, 1 as threshold from dual union all
select 'username1' as username, to_date('2019-04-02 04:00','YYYY-MM-DD HH24:MI') as startdate, 1 as flag, null as threshold from dual union all
select 'username1' as username, to_date('2019-04-02 05:00','YYYY-MM-DD HH24:MI') as startdate, 0 as flag, null as threshold from dual
) 
select tmp.*, decode(flag, 1, count(threshold) over (partition by username order by startdate)) rn
from tmp;

USERNAME  STARTDATE                 FLAG  THRESHOLD         RN
--------- ------------------- ---------- ---------- ----------
username1 2019-04-01 00:00:00          1          1          1
username1 2019-04-01 01:00:00          0                      
username1 2019-04-01 02:00:00          1          1          2
username1 2019-04-01 03:00:00          1                     2
username1 2019-04-01 04:00:00          0                      
username1 2019-04-01 05:00:00          0                      
username1 2019-04-02 01:00:00          1          1          3
username1 2019-04-02 02:00:00          1                     3
username1 2019-04-02 03:00:00          1          1          4
username1 2019-04-02 04:00:00          1                     4
username1 2019-04-02 05:00:00          0                      

11 rows selected.

0
with tmp as (
select 'username1' as username, to_date('2019-04-01 00:00','YYYY-MM-DD HH24:MI') as startdate, 1 as flag from dual union all
select 'username1' as username, to_date('2019-04-01 01:00','YYYY-MM-DD HH24:MI') as startdate, 0 as flag  from dual union all
select 'username1' as username, to_date('2019-04-01 02:00','YYYY-MM-DD HH24:MI') as startdate, 1 as flag  from dual union all
select 'username1' as username, to_date('2019-04-01 03:00','YYYY-MM-DD HH24:MI') as startdate, 1 as flag  from dual union all
select 'username1' as username, to_date('2019-04-01 04:00','YYYY-MM-DD HH24:MI') as startdate, 0 as flag  from dual union all
select 'username1' as username, to_date('2019-04-02 05:00','YYYY-MM-DD HH24:MI') as startdate, 0 as flag  from dual union all
select 'username1' as username, to_date('2019-04-02 01:00','YYYY-MM-DD HH24:MI') as startdate, 1 as flag  from dual union all
select 'username1' as username, to_date('2019-04-02 02:00','YYYY-MM-DD HH24:MI') as startdate, 1 as flag  from dual union all
select 'username1' as username, to_date('2019-04-02 03:00','YYYY-MM-DD HH24:MI') as startdate, 1 as flag  from dual union all
select 'username1' as username, to_date('2019-04-02 04:00','YYYY-MM-DD HH24:MI') as startdate, 0 as flag  from dual
) 
select username, startdate, flag, decode(flag, 1, match_num) as rn
from tmp
match_recognize(
partition by username
order by startdate
measures match_number() AS match_num
all rows per match
pattern (s* f*)
define f as f.flag = 1, s as s.flag = 0
);

USERNAME  STARTDATE                 FLAG         RN
--------- ------------------- ---------- ----------
username1 2019-04-01 00:00:00          1          1
username1 2019-04-01 01:00:00          0           
username1 2019-04-01 02:00:00          1          2
username1 2019-04-01 03:00:00          1          2
username1 2019-04-01 04:00:00          0           
username1 2019-04-02 01:00:00          1          3
username1 2019-04-02 02:00:00          1          3
username1 2019-04-02 03:00:00          1          3
username1 2019-04-02 04:00:00          0           
username1 2019-04-02 05:00:00          0           

10 rows selected. 

只有当标志值仅有两个(这可能是OP所拥有的),它才能正常工作,但如果标志值可能为空,则无法正常工作(https://dbfiddle.uk/?rdbms=oracle_18&fiddle=c2fb05f5697e2c6e38cd0357e3025a95); s 的模式定义需要更新以使其适用于所有情况。此外,它在11g中无法正常工作,这是OP所标记的。 - Boneist
如果flag可能为空,只需使用nvl(s.flag, 0) = 0。是的,模式匹配对于Oracle 11g没有用处,仅适用于12+。 - Amir Kadyrov

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接