PostgreSQL view to aggregate data


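I have a problem where the result needs to be "good looking", efficient, and readable. Unfortunately, what I have is neither efficient nor readable.

Given a table with date, transaction_type, username and credits, I want to generate a view that summarizes it into the following fields: date_from, date_next, username, credits_total, credits_left

Explanation:

  • date_from and date_next are the first day of the month and the first day of the following month (e.g. 2022-06-01 and 2022-07-01)
  • username is grouped, so each date_from/date_next pair contains a given username only once
  • credits_total is the sum of credits_change where transaction_type = 'charge'
  • credits_left is credits_total - sum(credits_change where transaction_type = 'usage')

I have identified several problems and already solved some of them:

  • date_from/date_next are easy to get with date_trunc('month', date) and date_trunc('month', date) + interval '1 month'
  • Grouping by username and dates can be done with GROUP BY
  • Producing credits_total without duplication is hard. Is a subquery the only solution?
  • credits_left is almost the same, but it involves the other transaction_type and is subtracted from credits_total. How can I reuse credits_total?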
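
As a quick sanity check of the month bounds described above, here is a minimal sketch on a single made-up timestamp (the value and the alias ts are illustrative assumptions, not taken from the real table):

```sql
-- One hypothetical timestamp, truncated to its month and the next month.
SELECT
    date_trunc('month', ts)                      AS date_from,
    date_trunc('month', ts) + interval '1 month' AS date_next
FROM (VALUES (timestamp '2022-06-17 14:30:00')) AS t(ts);
-- date_from: 2022-06-01 00:00:00, date_next: 2022-07-01 00:00:00
```

Note that date_next is best treated as an exclusive upper bound. BETWEEN includes both ends, so a range filter would more safely be written as datetime >= date_from AND datetime < date_next.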

I came up with the following (and am very unsatisfied with it):

Original table:

create table usage  -- simplified
(
    datetime timestamp default now() not null,
    transaction_type varchar(16) not null,
    "user" varchar(128) not null,  -- quoted: user is a reserved word in PostgreSQL
    credits_change int not null
);

My view code:

CREATE MATERIALIZED VIEW token_usage
AS
SELECT
       -- trivial:
       "user",  -- quoted: user is a reserved word in PostgreSQL
       date_trunc('month', datetime) as date_from,
       date_trunc('month', datetime) + interval '1 month' as date_next,

       -- sum of credits_change with requirement + duplication
       (    -- see here. first time
            -- caveat: datetime below resolves to the subquery's own row, so the
            -- date condition is always true and the sum is not per user/month
            SELECT sum(credits_change)
            FROM usage
            WHERE transaction_type = 'charge'
            AND datetime BETWEEN date_trunc('month', datetime) AND date_trunc('month', datetime) + interval '1 month'
       ) as credits_total,

       -- sum of credits change minus other sum and more duplication
       (   -- see here. using the same again
           SELECT sum(credits_change)
           FROM usage
           WHERE transaction_type = 'charge'
           AND datetime BETWEEN date_trunc('month', datetime) AND date_trunc('month', datetime) + interval '1 month'
       ) - (  -- see here. using the same again, but with different transaction_type
           SELECT sum(credits_change)
           FROM usage
           WHERE transaction_type = 'usage'
           AND datetime BETWEEN date_trunc('month', datetime) AND date_trunc('month', datetime) + interval '1 month'
       ) as credits_left
    FROM usage
    GROUP BY "user", date_from, date_next  -- group by the columns actually selected above
WITH DATA;

To be honest, it looks like I am just missing some PostgreSQL tooling that would make this nicer.

Thanks for your help :)


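"Very unsatisfied with it" - are the unsatisfying parts mainly the ones described in the code comments? It might help to expand on those points directly in the question text. - Stefan Wuebbe
1 Answer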


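Without knowing the sample data and the expected output of the query, the following can only serve as an outline of a complete solution. I guess the key point here is knowing the FILTER clause of aggregate functions (*):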

CREATE MATERIALIZED VIEW token_usage AS

SELECT
    "user",  -- quoted: unquoted, user is a reserved word that returns the current role
    date_trunc('month', datetime) as date_from,
    date_trunc('month', datetime) + interval '1 month' as date_next,
    
    SUM(credits_change) FILTER (WHERE transaction_type = 'charge') as credits_total,
    SUM(credits_change) FILTER (WHERE transaction_type = 'charge')
        - SUM(credits_change) FILTER (WHERE transaction_type = 'usage') as credits_left
FROM usage
GROUP BY 1, 2, 3;
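
Since no sample data was given, here is a minimal sketch of the FILTER approach on made-up rows. The temporary table usage_demo and all of its values are assumptions for illustration, not data from the question. The sketch also wraps each sum in COALESCE, because SUM ... FILTER returns NULL when no row matches, which would otherwise make credits_left NULL for a month without any usage rows.

```sql
-- Hypothetical demo table mirroring the question's columns.
CREATE TEMP TABLE usage_demo (
    datetime         timestamp    NOT NULL,
    transaction_type varchar(16)  NOT NULL,
    "user"           varchar(128) NOT NULL,  -- quoted: user is a reserved word
    credits_change   int          NOT NULL
);

-- Made-up rows: usage amounts are stored as positive numbers.
INSERT INTO usage_demo (datetime, transaction_type, "user", credits_change) VALUES
    ('2022-06-01 09:00', 'charge', 'alice', 100),
    ('2022-06-10 12:00', 'usage',  'alice',  30),
    ('2022-06-20 18:30', 'usage',  'alice',  20),
    ('2022-07-02 08:15', 'charge', 'alice',  50),
    ('2022-06-05 10:00', 'charge', 'bob',    40);

SELECT
    "user",
    date_trunc('month', datetime) AS date_from,
    date_trunc('month', datetime) + interval '1 month' AS date_next,
    COALESCE(SUM(credits_change) FILTER (WHERE transaction_type = 'charge'), 0) AS credits_total,
    COALESCE(SUM(credits_change) FILTER (WHERE transaction_type = 'charge'), 0)
        - COALESCE(SUM(credits_change) FILTER (WHERE transaction_type = 'usage'), 0) AS credits_left
FROM usage_demo
GROUP BY 1, 2, 3
ORDER BY 1, 2;

-- user  | date_from           | date_next           | credits_total | credits_left
-- alice | 2022-06-01 00:00:00 | 2022-07-01 00:00:00 |           100 |           50
-- alice | 2022-07-01 00:00:00 | 2022-08-01 00:00:00 |            50 |           50
-- bob   | 2022-06-01 00:00:00 | 2022-07-01 00:00:00 |            40 |           40
```

Without the COALESCE calls, the second and third rows would show NULL for credits_left, since neither has a 'usage' row in that month. Whether 0 or NULL is the right answer there depends on how the view is consumed.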

An alternative using a subquery, which reduces the redundancy but may be less readable:

CREATE MATERIALIZED VIEW token_usage AS

SELECT 
    "user",
    date_from,
    date_from + interval '1 month' as date_next,
    credits_total,
    credits_total - credits_usage as credits_left
FROM (
    SELECT
        "user",  -- quoted: unquoted, user is a reserved word that returns the current role
        date_trunc('month', datetime) as date_from,
    
        SUM(credits_change) FILTER (WHERE transaction_type = 'charge') as credits_total,
        SUM(credits_change) FILTER (WHERE transaction_type = 'usage') as credits_usage
    FROM usage
    GROUP BY 1, 2 
) s;

*) You can also use a CASE expression instead of FILTER:

SUM(abc) FILTER (WHERE condition)

-- generally the same as

SUM(
    CASE WHEN condition THEN 
        abc 
    END
)
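
To see that the two forms agree, here is a small sketch on inline made-up values (the column names kind and x are assumptions for illustration only):

```sql
-- Both expressions sum only the rows where kind = 'charge'.
SELECT
    SUM(x) FILTER (WHERE kind = 'charge')     AS with_filter,
    SUM(CASE WHEN kind = 'charge' THEN x END) AS with_case
FROM (VALUES ('charge', 100), ('usage', 30), ('charge', 50)) AS t(kind, x);
-- with_filter: 150, with_case: 150
```

The CASE form works because rows that fail the condition yield NULL, and SUM ignores NULL inputs. The CASE has to sit inside the aggregate for this to hold.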

Awesome, thanks! FILTER is exactly what I needed, and it even helped with the parts I left out of the example. <3! - loki.dev
The last part is wrong. SUM(abc) FILTER (WHERE condition) translates to SUM(CASE WHEN condition THEN abc END), not to CASE WHEN condition THEN SUM(abc) END - Thorsten Kettner
@ThorstenKettner Of course, you are right! - S-Man
