Redshift does not support the rollup() and grouping() functions

Question

3

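I am trying to convert Teradata BTEQ SQL scripts to Redshift SQL. My cluster reports PostgreSQL 8.0.2 and Redshift version 1.0.1499. This version of Redshift does not support the rollup() and grouping() functions. How can I work around this? What are the equivalent Redshift functions? Could someone explain how to do it with some examples? Sample Teradata SQL: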

select 
PRODUCT_ID,CUST_ID, 
GROUPING (PRODUCT_ID), 
GROUPING (CUST_ID), 
row_number() over (order by PRODUCT_ID,CUST_ID) AS "ROW_OUTPUT_NUM"
from products 
group by rollup(PRODUCT_ID,CUST_ID);

I need to convert the SQL query above to Redshift.

- pavan kumar

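Please provide some examples of your Teradata SQL and what you have tried on Redshift. - Jon Scott

@JonScott I tried to convert simple rollup() and grouping() calls to Redshift, without success. - pavan kumar

To do this in Redshift, or in any database that does not support ROLLUP, you need to run each grouping level separately and then combine the results. This can be done in a single SELECT statement using CTEs. - Jon Scott

Could you give me an example for the SQL query above? - pavan kumar

I used the link below to write the ROLLUP and GROUPING queries in Redshift. It worked. - pavan kumar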
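
For readers looking for the example requested in the comments, the single-statement CTE approach Jon Scott describes might look roughly like the sketch below when applied to the question's query. It assumes PRODUCT_ID and CUST_ID are integers. The grp_product and grp_cust columns are hand-written stand-ins for GROUPING() (0 = the column is part of the group, 1 = the column was rolled up); they are illustrative names, not a built-in Redshift feature.

```sql
WITH detail AS (
    -- Finest level: one row per (product_id, cust_id)
    SELECT product_id, cust_id
    FROM products
    GROUP BY product_id, cust_id
),
rolled AS (
    -- Level 1: the plain GROUP BY rows, nothing rolled up
    SELECT product_id, cust_id, 0 AS grp_product, 0 AS grp_cust
    FROM detail

    UNION ALL

    -- Level 2: cust_id rolled up, one row per product
    SELECT product_id, CAST(NULL AS INT), 0, 1
    FROM detail
    GROUP BY product_id

    UNION ALL

    -- Level 3: grand total, both columns rolled up
    SELECT CAST(NULL AS INT), CAST(NULL AS INT), 1, 1
)
SELECT product_id,
       cust_id,
       grp_product,   -- stand-in for GROUPING(PRODUCT_ID)
       grp_cust,      -- stand-in for GROUPING(CUST_ID)
       ROW_NUMBER() OVER (ORDER BY product_id, cust_id) AS row_output_num
FROM rolled
ORDER BY row_output_num;
```

Redshift sorts NULLs last in ascending order by default, so the rolled-up rows receive the highest row numbers within each product. If the original report expects them first, adding NULLS FIRST to the window's ORDER BY columns is one option.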

2 Answers

1

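If you use the UNION technique that others have pointed to, you will scan the underlying table multiple times.

If the fine-grained grouping actually reduces the data size significantly, a better solution may be: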

create temp table summ1 
as
select PRODUCT_ID,CUST_ID, ...
from products 
group by PRODUCT_ID,CUST_ID;

create temp table summ2
as
select PRODUCT_ID,cast(NULL as INT) AS CUST_ID, ...
from products 
group by PRODUCT_ID;

select * from summ1
union all
select * from summ2
union all
select cast(NULL as INT) AS PRODUCT_ID, cast(NULL as INT) AS CUST_ID, ...
from summ2
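
As a concrete sketch of the same idea, the version below fills in the elided aggregate with a hypothetical numeric column sales_amount (the original does not say which columns are aggregated). It also goes one step further than the snippet above: the per-product subtotals are built from summ1 rather than from products, so the base table is scanned only once. This relies on SUM being re-aggregable from partial sums.

```sql
-- Finest level: the only statement that scans products.
-- sales_amount is a hypothetical column used for illustration.
CREATE TEMP TABLE summ1 AS
SELECT product_id, cust_id, SUM(sales_amount) AS total_sales
FROM products
GROUP BY product_id, cust_id;

-- Per-product subtotals, computed from the (smaller) summ1 table
CREATE TEMP TABLE summ2 AS
SELECT product_id, CAST(NULL AS INT) AS cust_id, SUM(total_sales) AS total_sales
FROM summ1
GROUP BY product_id;

SELECT product_id, cust_id, total_sales FROM summ1
UNION ALL
SELECT product_id, cust_id, total_sales FROM summ2
UNION ALL
-- Grand total: a single row aggregated from the subtotals
SELECT CAST(NULL AS INT), CAST(NULL AS INT), SUM(total_sales) FROM summ2;
```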

- dsz

- ePi272314 · Accepted Answer

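Implementing ROLLUP by hand

Because Redshift does not currently support the ROLLUP clause, you have to implement this kind of grouping the hard way.

ROLLUP with one argument

With ROLLUP, e.g. in PostgreSQL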

SELECT column1, aggregate_function(*)
FROM some_table
GROUP BY ROLLUP(column1)

Equivalent implementation

-- First, the same GROUP BY without the ROLLUP
-- For efficiency, we will reuse this table
DROP TABLE IF EXISTS tmp_totals;
CREATE TEMP TABLE tmp_totals AS
  SELECT column1, aggregate_function(*) AS total1
  FROM some_table
  GROUP BY column1;

-- Show the table 'tmp_totals'
SELECT * FROM tmp_totals

UNION ALL

-- The aggregation of 'tmp_totals'
SELECT null, aggregate_function(total1) FROM tmp_totals

ORDER BY 1

Example output

Country  | Sales
-------- | -----
Poland   | 2
Portugal | 4
Ukraine  | 3
null     | 9
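
One caveat about the total row: applying the same aggregate_function to the pre-aggregated column only gives the right answer for aggregates such as SUM, MIN, or MAX. A COUNT has to be rolled up with SUM, and an AVG has to be rebuilt from a sum and a count. A minimal sketch, assuming a hypothetical numeric column amount in some_table:

```sql
DROP TABLE IF EXISTS tmp_totals;
CREATE TEMP TABLE tmp_totals AS
  SELECT column1,
         COUNT(*)      AS row_count,     -- rows per group
         SUM(amount)   AS amount_sum,    -- amount is a hypothetical column
         COUNT(amount) AS amount_count   -- non-NULL amounts, as AVG would count them
  FROM some_table
  GROUP BY column1;

-- Per-group rows
SELECT column1,
       row_count,
       CAST(amount_sum AS FLOAT) / NULLIF(amount_count, 0) AS amount_avg
FROM tmp_totals

UNION ALL

-- Total row: counts are summed, and the average is rebuilt from sum / count
SELECT NULL,
       CAST(SUM(row_count) AS BIGINT),
       CAST(SUM(amount_sum) AS FLOAT) / NULLIF(SUM(amount_count), 0)
FROM tmp_totals

ORDER BY 1;
```

Averaging the per-group averages instead would weight every group equally, which generally differs from the overall average.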

ROLLUP with two arguments

With ROLLUP, e.g. in PostgreSQL

SELECT column1, column2, aggregate_function(*)
FROM some_table
GROUP BY ROLLUP(column1, column2);

Equivalent implementation

-- First, the same GROUP BY without the ROLLUP
-- For efficiency, we will reuse this table
DROP TABLE IF EXISTS tmp_totals;
CREATE TEMP TABLE tmp_totals AS
  SELECT column1, column2, aggregate_function(*) AS total1
  FROM some_table
  GROUP BY column1, column2;

-- Show the table 'tmp_totals'
SELECT * FROM tmp_totals

UNION ALL

-- The sub-totals of the first category
SELECT column1, null, sum(total1) FROM tmp_totals GROUP BY column1

UNION ALL

-- The full aggregation of 'tmp_totals'
SELECT null, null, sum(total1) FROM tmp_totals

ORDER BY 1, 2;

Example output

Country  | Segment  | Sales
-------- | -------- | -----
Poland   | Base     | 2
Poland   | Premium  | 0
Poland   | null     | 2     <- sub total
Portugal | Base     | 3
Portugal | Premium  | 1
Portugal | null     | 4     <- sub total
Ukraine  | Base     | 2
Ukraine  | Premium  | 1
Ukraine  | null     | 3     <- sub total
null     | null     | 9     <- grand total