Extracting values from a JSON array in Presto

6

I have a column that contains a JSON array, like this:

{data=[{"name":"col1","min":0,"max":32,"avg":29},
{"name":"col2","min":1,"max":35,"avg":21},
{"name":"col3","min":4,"max":56,"avg":34}]}

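I am trying to parse the array and extract specific values based on a condition. For example:

When "name" = "col1", the value of "min" is 0.

When "name" = "col3", the value of "avg" is 34.

Does anyone have a solution?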

3 Answers

5

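Your JSON is malformed. It should start with {"data":[ rather than {data=[

Once the JSON is valid (you can easily fix it in a subquery), extract the data, cast it to an array of rows, unnest it, and use CASE expressions to pick out the values. I added the max() aggregate here to drop the NULL records and return all the required values in a single row. Alternatively, you can use a filter (for example, where x.name = 'col1') to suit your needs: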

with mydata as (
select '{"data":[{"name":"col1","min":0,"max":32,"avg":29},
{"name":"col2","min":1,"max":35,"avg":21},
{"name":"col3","min":4,"max":56,"avg":34}]}' json
)

select max(case when x.name = 'col1' then x.min end) min_col1,
       max(case when x.name = 'col3' then x.avg end) avg_col3
from mydata
CROSS JOIN
    UNNEST(
            CAST(
                JSON_EXTRACT(json,'$.data')
                    as ARRAY(ROW(name VARCHAR, min INTEGER, max INTEGER, avg INTEGER))
                 )
          ) as x(name, min, max, avg) --column aliases

Result:

min_col1    avg_col3
0           34
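
The subquery fix mentioned in this answer could look roughly like the sketch below. It assumes the only defect is the leading {data= prefix, as in the sample from the question, and that the malformed text lives in a column called raw (a hypothetical name). It uses replace() to repair the prefix, and a WHERE filter instead of the max(case ...) aggregation. Like the query above, it assumes UNNEST expands the array of rows into separate columns.

```sql
with rawdata as (
-- raw is a hypothetical column holding the malformed text from the question
select '{data=[{"name":"col1","min":0,"max":32,"avg":29},
{"name":"col2","min":1,"max":35,"avg":21},
{"name":"col3","min":4,"max":56,"avg":34}]}' raw
),
mydata as (
-- repair the prefix so the text parses as JSON
select replace(raw, '{data=', '{"data":') json
from rawdata
)
select x.min min_col1
from mydata
CROSS JOIN
    UNNEST(
            CAST(
                JSON_EXTRACT(json,'$.data')
                    as ARRAY(ROW(name VARCHAR, min INTEGER, max INTEGER, avg INTEGER))
                 )
          ) as x(name, min, max, avg)
where x.name = 'col1' -- keep only the col1 element
```

If the real data is malformed in other ways (unquoted keys, for instance), a plain replace() will not be enough and the repair step needs to be adapted.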

1
How would the query change if the array is not stored under any key, but directly as an array? - Siraj Alam

1

Regarding the question raised by @siraj-alam, and building on @leftjoin's answer: if the data is a bare JSON array, i.e.

with mydata as (
select '[{"name":"col1","min":0,"max":32,"avg":29},
{"name":"col2","min":1,"max":35,"avg":21},
{"name":"col3","min":4,"max":56,"avg":34}]' json
)

then this query gives the same result:

select max(case when x.name = 'col1' then x.min end) min_col1,
       max(case when x.name = 'col3' then x.avg end) avg_col3
from mydata
CROSS JOIN
    UNNEST(
            CAST(
                JSON_EXTRACT(json,'$')
                    as ARRAY(ROW(name VARCHAR, min INTEGER, max INTEGER, avg INTEGER))
                 )
          ) as x(name, min, max, avg) --column aliases

min_col1    avg_col3
0           34

1
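I tried the solution from @mingdinghan, since it is exactly what I was looking for, but I ran into an error.
I changed the code above and got it working. In case anyone else has trouble with it, here is the version that worked for me: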
with mydata as (
select '[{"name":"col1","min":0,"max":32,"avg":29},
{"name":"col2","min":1,"max":35,"avg":21},
{"name":"col3","min":4,"max":56,"avg":34}]' jsonobj
)
select 
    max(case when a.name = 'col1' then a.min end) min_col1,
    max(case when a.name = 'col3' then a.avg end) avg_col3
from mydata
CROSS JOIN
    UNNEST(
            CAST(json_parse(jsonobj) as array(ROW(name VARCHAR, min INTEGER, max INTEGER, avg INTEGER))
                 )
          ) as x(a)

The result is the same as above.
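
Since the answers in this thread differ only in how the UNNEST columns are aliased, one way to sidestep that difference is to skip UNNEST altogether. The following is only a sketch: it assumes the engine supports lambda expressions with filter() and element_at() on arrays, and the CTE name parsed and column name items are made up for illustration.

```sql
with mydata as (
select '[{"name":"col1","min":0,"max":32,"avg":29},
{"name":"col2","min":1,"max":35,"avg":21},
{"name":"col3","min":4,"max":56,"avg":34}]' jsonobj
),
parsed as (
-- items is a hypothetical name for the typed array of rows
select CAST(json_parse(jsonobj) as array(ROW(name VARCHAR, min INTEGER, max INTEGER, avg INTEGER))) items
from mydata
)
select
    -- keep the elements whose name matches, take the first one, read one field
    element_at(filter(items, r -> r.name = 'col1'), 1).min min_col1,
    element_at(filter(items, r -> r.name = 'col3'), 1).avg avg_col3
from parsed
```

This keeps one output row per input row without an aggregate, and a name that does not occur in the array should yield NULL rather than an error.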
