使用最后一个非空值更新排序行

5
考虑一个类似以下数据的表格:
column_a (boolean) | column_order (integer)
TRUE               |     1
NULL               |     2
NULL               |     3
TRUE               |     4
NULL               |     5
FALSE              |     6
NULL               |     7

我想编写一个查询,将column_a中的每个NULL值替换为该列按照column_order指定的顺序前面的值中最后一个非NULL值。结果应如下所示:

column_a (boolean) | column_order (integer)
TRUE               |     1
TRUE               |     2
TRUE               |     3
TRUE               |     4
TRUE               |     5
FALSE              |     6
FALSE              |     7

为简单起见,我们可以假设第一个值永远不为空。如果没有超过一个连续的NULL值,则以下方法可行:

SELECT
  COALESCE(column_a, lag(column_a) OVER (ORDER BY column_order))
FROM test_table
ORDER BY column_order;

然而,以上方法无法处理任意数量连续的NULL值。有没有一种适用于Postgres的查询语句能够实现以上结果?是否有一种高效的查询语句能够很好地扩展到大量行数据?
5个回答

3

您可以使用一个方便的技巧,对case进行求和操作,根据空值和非空序列之间的区分来创建分区,然后使用first_value将它们向前移动。

例如:

select
  *,
  sum(case when column_a is not null then 1 else 0 end)
    OVER (order by column_order) as partition
from table1;

 column_a | column_order | partition 
----------+--------------+-----------
 t        |            1 |         1
          |            2 |         1
          |            3 |         1
 t        |            4 |         2
          |            5 |         2
 f        |            6 |         3
          |            7 |         3
(7 rows)

那么

select
  first_value(column_a)
    OVER (PARTITION BY partition ORDER BY column_order),
  column_order
from (
    select
      *,
      sum(case when column_a is not null then 1 else 0 end)
        OVER (order by column_order) as partition
    from table1
) partitioned;

给你:

 first_value | column_order 
-------------+--------------
 t           |            1
 t           |            2
 t           |            3
 t           |            4
 t           |            5
 f           |            6
 f           |            7
(7 rows)

1

不确定Postgresql是否支持此功能,但可以尝试一下:

SELECT
  COALESCE(column_a, (select t2.column_a from test_table t2
                      where t2.column_order < t1.column_order
                        and t2.column_a is not null
                      order by t2.column_order desc
                      fetch first 1 row only))
FROM test_table t1
ORDER BY column_order;

它适用于PostgreSQL 9.2。 - elysch

1

我更熟悉 SqlServer,但这应该可以满足你的需求。

update  tableA as a2
set column_a = b2.column_a
from (
  select a.column_order, max(b.column_order) from tableA as a
  inner join tableA as b on a.column_order > b.column_order and b.column_a is not null
  where a.column_a is null
  group by a.column_order
) as junx 
inner join tableA as b2 on junx.max =b2.column_order
where a2.column_order = junx.column_order

SQL Fiddle


0

使用这个:

select 
   case when column_a is null then 
      (select top 1 column_a from myTable where column_order < mt.column_order
         AND column_a is not null 
       order by column_order desc)
   else column_a
   end,
   column_order
   from myTable mt

这里有一个fiddle


0

以下查询适用于Postgresql。否则,执行可能会非常缓慢,您应该考虑在column_order列上添加索引。

编辑:在一个有10万条记录的表上,它花费了3661毫秒,而“select * from test_table”花费了2511毫秒。当您在column_a上放置索引时,性能非常好。(统计数据在客户端进行)

select 
    case
        when a.column_a is null then b.column_a
        else a.column_a
    end,
    a.column_order
from test_table a
left join test_table b on b.column_order = (
        select max(column_order) 
        from test_table c 
        where c.column_order < a.column_order and c.column_a is not null)
order by column_order

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接