How to pivot on multiple aggregations in Snowflake

3
I have the following table structure:
product_id  Period  Sales  Profit
x1          L13     $100   $10
x1          L26     $200   $20
x1          L52     $300   $30
x2          L13     $500   $110
x2          L26     $600   $120
x2          L52     $700   $130
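I want to pivot the Period column so that the sales and profit values appear under one column per period. I need a table like the one below.
product_id  Sales_L13  Sales_L26  Sales_L52  Profit_L13  Profit_L26  Profit_L52
x1          $100       $200       $300       $10         $20         $30
x2          $500       $600       $700       $110        $120        $130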

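I am writing the query in Snowflake. I tried Snowflake's PIVOT function, but it only lets me specify one aggregate function.

Can anyone help me solve this?

Any help is appreciated.

Thanks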


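Have you tried anything? - RiggsFolly
@Vipendra, please try the solution below. - trillion
Hi RiggsFolly, I tried using a single PIVOT function with multiple aggregations, but it raised an error. - Vipendra Singh
@trillion, Phil Coulson's answer works well. Thanks for your help. - Vipendra Singh
5 Answers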

6

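How about we stack sales and profit on top of each other before we pivot? I'll leave it to you to fix the column names, which I keep mixing up.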

with cte (product_id, period, amount) as
  
(select product_id, period||'_profit', profit from t
 union all
 select product_id, period||'_sales', sales from t)
   
select * 
from cte
     pivot(max(amount) for period in ('L13_sales','L26_sales','L52_sales','L13_profit','L26_profit','L52_profit'))
     as p (product_id,L13_sales,L26_sales,L52_sales,L13_profit,L26_profit,L52_profit);

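If you want to pivot twice, once for sales and once for profit, you need to duplicate the period column so that each pivot has its own copy. This produces nulls, because the duplicated column is still present after the first pivot and splits each product into several rows. To collapse them, we can use max in the final select. Here is what that looks like: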

select product_id, 
       max(L13_sales) as L13_sales, 
       max(L26_sales) as L26_sales, 
       max(L52_sales) as L52_sales, 
       max(L13_profit) as L13_profit, 
       max(L26_profit) as L26_profit, 
       max(L52_profit) as L52_profit
from (select *, period as period2 from t) t
      pivot(max(sales) for period in ('L13','L26','L52'))
      pivot(max(profit) for period2 in ('L13','L26','L52'))  
      as p (product_id, L13_sales,L26_sales,L52_sales,L13_profit,L26_profit,L52_profit)
group by product_id;

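At this point, it's an eyesore. You might as well use conditional aggregation or, better yet, handle the pivot in the reporting application. A more compact alternative to conditional aggregation uses decode: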

select product_id,
       max(decode(period,'L13',sales)) as L13_sales,
       max(decode(period,'L26',sales)) as L26_sales,
       max(decode(period,'L52',sales)) as L52_sales,
       max(decode(period,'L13',profit)) as L13_profit,
       max(decode(period,'L26',profit)) as L26_profit,
       max(decode(period,'L52',profit)) as L52_profit
from t
group by product_id;
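
To try the decode query without creating a table first, a minimal sketch can inline the question's sample rows as a CTE named t. The CTE and its numeric amounts (dollar signs dropped) are assumptions for illustration; the real table may store these values differently.

```sql
-- Sample rows from the question, inlined as a CTE.
-- Assumption: sales and profit are numeric, not '$'-prefixed strings.
with t (product_id, period, sales, profit) as (
    select * from values
        ('x1', 'L13', 100, 10),
        ('x1', 'L26', 200, 20),
        ('x1', 'L52', 300, 30),
        ('x2', 'L13', 500, 110),
        ('x2', 'L26', 600, 120),
        ('x2', 'L52', 700, 130)
)
-- decode returns NULL when the period does not match,
-- and max ignores NULLs, so each column keeps one period's value.
select product_id,
       max(decode(period, 'L13', sales))  as L13_sales,
       max(decode(period, 'L26', sales))  as L26_sales,
       max(decode(period, 'L52', sales))  as L52_sales,
       max(decode(period, 'L13', profit)) as L13_profit,
       max(decode(period, 'L26', profit)) as L26_profit,
       max(decode(period, 'L52', profit)) as L52_profit
from t
group by product_id
order by product_id;
```

If a product can have several rows for the same period, max keeps only the largest one; sum would be the choice when the rows should be added together.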

This might work; I'll try it and let you know. Thanks, Phil. - Vipendra Singh
@VipendraSingh I forgot to include product_id in as p(....), so I updated the answer. I also added the double-pivot approach. - Radagast

3
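Here is an alternative form using OBJECT_AGG with LATERAL FLATTEN, which avoids the potential supportability issue of the ARRAY_AGG with PIVOT approach suggested by Adrian White. It should work for any aggregate functions over multiple input columns, as long as they are included in the initial ARRAY_CONSTRUCT in the OBJ_TALL CTE. I expect the conditional aggregation option using CASE statements to be faster, but that needs testing at realistic scale to confirm.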
-- OBJECT FORM USING LATERAL FLATTEN 
WITH CTE AS(
                   SELECT 'X1' PRODUCT_ID,'L13' PERIOD,100  SALES,10   PROFIT
             UNION SELECT 'X1' PRODUCT_ID,'L26' PERIOD,200  SALES,20   PROFIT
             UNION SELECT 'X1' PRODUCT_ID,'L52' PERIOD,300  SALES,30   PROFIT
             UNION SELECT 'X2' PRODUCT_ID,'L13' PERIOD,500  SALES,110  PROFIT
             UNION SELECT 'X2' PRODUCT_ID,'L26' PERIOD,600  SALES,120  PROFIT
             UNION SELECT 'X2' PRODUCT_ID,'L52' PERIOD,700  SALES,130  PROFIT
             UNION SELECT 'X2' PRODUCT_ID,'L52' PERIOD,1700 SALES,1130 PROFIT)
,OBJ_TALL AS (  SELECT PRODUCT_ID, 
                OBJECT_CONSTRUCT(PERIOD,
                                 ARRAY_CONSTRUCT(  SUM(SALES)
                                                  ,SUM(PROFIT)
                                                )
                                 ) S 
                  FROM CTE 
              GROUP BY PRODUCT_ID, PERIOD)
-- SELECT * FROM OBJ_TALL;
,OBJ_WIDE AS (  SELECT PRODUCT_ID, OBJECT_AGG(KEY,VALUE) OA 
                  FROM OBJ_TALL, LATERAL FLATTEN(INPUT => S) 
              GROUP BY PRODUCT_ID)
-- SELECT * FROM OBJ_WIDE;
SELECT 
    PRODUCT_ID
    ,OA:L13[0] SALES_L13 
    ,OA:L13[1] PROFIT_L13 
    ,OA:L26[0] SALES_L26 
    ,OA:L26[1] PROFIT_L26 
    ,OA:L52[0] SALES_L52 
    ,OA:L52[1] PROFIT_L52 
FROM OBJ_WIDE
ORDER BY 1;
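
The path expressions in the final SELECT above return VARIANT values. If numeric columns are needed downstream, a cast can be added to each one. The following is a minimal sketch using a hypothetical one-row stand-in for OBJ_WIDE rather than the full query:

```sql
-- Hypothetical one-row stand-in for OBJ_WIDE, built only for illustration.
WITH OBJ_WIDE AS (
    SELECT 'X1' AS PRODUCT_ID,
           OBJECT_CONSTRUCT('L13', ARRAY_CONSTRUCT(100, 10)) AS OA
)
SELECT
    PRODUCT_ID
    ,OA:L13[0]::NUMBER AS SALES_L13   -- VARIANT element cast to NUMBER
    ,OA:L13[1]::NUMBER AS PROFIT_L13
FROM OBJ_WIDE;
```

The same `::NUMBER` suffix can be appended to each of the six path expressions in the full query.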

For ease of comparison with the above, here is Adrian's ARRAY_AGG with PIVOT version, reformatted using CTEs.

-- ARRAY FORM - RE-WRITTEN WITH CTES FOR CLARITY AND COMPARISON TO OBJECT FORM
WITH CTE AS(
                   SELECT 'X1' PRODUCT_ID,'L13' PERIOD,100  SALES,10   PROFIT
             UNION SELECT 'X1' PRODUCT_ID,'L26' PERIOD,200  SALES,20   PROFIT
             UNION SELECT 'X1' PRODUCT_ID,'L52' PERIOD,300  SALES,30   PROFIT
             UNION SELECT 'X2' PRODUCT_ID,'L13' PERIOD,500  SALES,110  PROFIT
             UNION SELECT 'X2' PRODUCT_ID,'L26' PERIOD,600  SALES,120  PROFIT
             UNION SELECT 'X2' PRODUCT_ID,'L52' PERIOD,700  SALES,130  PROFIT
             UNION SELECT 'X2' PRODUCT_ID,'L52' PERIOD,1700 SALES,1130 PROFIT)
,ARR_TALL AS (SELECT PRODUCT_ID, 
                     PERIOD,
                     ARRAY_CONSTRUCT( SUM(SALES)
                                     ,SUM(PROFIT)
                                    ) S 
                FROM CTE GROUP BY 1,2)
,ARR_WIDE AS (SELECT * 
                FROM ARR_TALL PIVOT (ARRAY_AGG(S) FOR PERIOD IN ('L13','L26','L52')  )  )
SELECT 
    PRODUCT_ID
    ,"'L13'"[0][0] SALES_L13 
    ,"'L13'"[0][1] PROFIT_L13 
    ,"'L26'"[0][0] SALES_L26 
    ,"'L26'"[0][1] PROFIT_L26 
    ,"'L52'"[0][0] SALES_L52 
    ,"'L52'"[0][1] PROFIT_L52 
FROM ARR_WIDE
ORDER BY 1;

2
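Awesome! That's an impressive first post... fantastic... this is exactly the kind of thing I think Stack Overflow should encourage :-) - Adrian White
@Fieldy, great post and code, really awesome. Very cool solution. I wish I could upvote twice ;) - Roberto Hernandez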

2

Using conditional aggregation:

SELECT product_id
   ,SUM(CASE WHEN Period = 'L13' THEN Sales END) AS SALES_L13
   ,SUM(CASE WHEN Period = 'L26' THEN Sales END) AS SALES_L26
   ,SUM(CASE WHEN Period = 'L52' THEN Sales END) AS SALES_L52
   ,SUM(CASE WHEN Period = 'L13' THEN Profit END) AS PROFIT_L13
   ,SUM(CASE WHEN Period = 'L26' THEN Profit END) AS PROFIT_L26
   ,SUM(CASE WHEN Period = 'L52' THEN Profit END) AS PROFIT_L52
FROM tab
GROUP BY product_id

2
Note the aliases on the last three CASE expressions: they should be PROFIT_L13, PROFIT_L26, and PROFIT_L52, not PROFIT_L52 three times. - trillion

2

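I believe you can only have one pivot at a time, but you can check by running the first query below. Then you can run it with just one pivot to see whether that works. Unfortunately, if multiple pivots (the first query) are not allowed, you can either use the third query, the CASE WHEN approach, or combine the two measures with a UNION first (Phil Coulson's approach above).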

 select * 
      from [table name]
        pivot(sum(sales) for PERIOD in ('L13', 'L26', 'L52')),
        pivot(sum(profit) for PERIOD in ('L13', 'L26', 'L52'))
      order by product_id;

If the above doesn't work, try a single pivot. For an example, see: https://count.co/sql-resources/snowflake/pivot-tables

  select * 
      from [table name]
        pivot(sum(sales) for PERIOD in ('L13', 'L26', 'L52'))
      order by product_id;

Otherwise, you will need to apply manual CASE WHEN logic:
select 
product_id,
sum(case when Period = 'L13' then Sales end)  as sales_l13,
sum(case when Period = 'L26' then Sales end)  as  sales_l26,
sum(case when Period = 'L52' then Sales end)  as  sales_l52,
sum(case when Period = 'L13' then Profit end) as  profit_l13,
sum(case when Period = 'L26' then Profit end) as  profit_l26,
sum(case when Period = 'L52' then Profit end) as  profit_l52
from [table name]
group by 1 

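Hi trillion, two pivots don't work. With a single pivot we can certainly aggregate the data, but I need two aggregations. Your second query would produce many duplicates. - Vipendra Singh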

2
I'm not entirely happy with this answer... I'm sure someone can improve on this approach.


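Basically, this pivots on an array... The list of aggregate functions available for arrays is short... there is only ARRAY_AGG. And PIVOT only supports AVG, COUNT, MAX, MIN, and SUM. So this shouldn't work... but I think PIVOT just needs some form of aggregation, so it does work.
I'd suggest aggregating your metrics before building the array... but it does let you pivot multiple metrics at once, which, from reading Stack Overflow, seemed to be impossible!
Copy | Paste | Run | ... and please improve :-)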
WITH CTE AS( SELECT 'X1' PRODUCT_ID,'L13' PERIOD,100 SALES,10 PROFIT
UNION SELECT 'X1' PRODUCT_ID,'L26' PERIOD,200 SALES,20 PROFIT
UNION SELECT 'X1' PRODUCT_ID,'L52' PERIOD,300 SALES,30 PROFIT
UNION SELECT 'X2' PRODUCT_ID,'L13' PERIOD,500 SALES,110 PROFIT
UNION SELECT 'X2' PRODUCT_ID,'L26' PERIOD,600 SALES,120 PROFIT
UNION SELECT 'X2' PRODUCT_ID,'L52' PERIOD,700 SALES,130 PROFIT)


SELECT 
PRODUCT_ID
,"'L13'"[0][0] SALES_L13 
,"'L13'"[0][1] PROFIT_L13 
,"'L26'"[0][0] SALES_L26 
,"'L26'"[0][1] PROFIT_L26 
,"'L52'"[0][0] SALES_L52 
,"'L52'"[0][1] PROFIT_L52 
FROM 
(SELECT * FROM 
   (
   SELECT PRODUCT_ID, PERIOD,ARRAY_CONSTRUCT(SALES,PROFIT) S FROM CTE)
   PIVOT (ARRAY_AGG(S) FOR PERIOD IN ('L13','L26','L52')
   ) 
 )  


Example with aggregation (adding a row of 1700, 1130 to L52 for X2):
WITH CTE AS(
  SELECT 'X1' PRODUCT_ID,'L13' PERIOD,100  SALES,10   PROFIT
UNION SELECT 'X1' PRODUCT_ID,'L26' PERIOD,200  SALES,20   PROFIT
UNION SELECT 'X1' PRODUCT_ID,'L52' PERIOD,300  SALES,30   PROFIT
UNION SELECT 'X2' PRODUCT_ID,'L13' PERIOD,500  SALES,110  PROFIT
UNION SELECT 'X2' PRODUCT_ID,'L26' PERIOD,600  SALES,120  PROFIT
UNION SELECT 'X2' PRODUCT_ID,'L52' PERIOD,700  SALES,130  PROFIT
UNION SELECT 'X2' PRODUCT_ID,'L52' PERIOD,1700 SALES,1130 PROFIT)

SELECT 
    PRODUCT_ID
    ,"'L13'"[0][0] SALES_L13 
    ,"'L13'"[0][1] PROFIT_L13 
    ,"'L26'"[0][0] SALES_L26 
    ,"'L26'"[0][1] PROFIT_L26 
    ,"'L52'"[0][0] SALES_L52 
    ,"'L52'"[0][1] PROFIT_L52 
FROM 
   (SELECT * FROM 
   (
   SELECT PRODUCT_ID, PERIOD,ARRAY_CONSTRUCT(SUM(SALES),SUM(PROFIT)) S FROM CTE GROUP BY 1,2)
   PIVOT (ARRAY_AGG(S) FOR PERIOD IN ('L13','L26','L52')
   ) 
)  
