I have a table with data at hourly granularity. I want to get the count of hours and the values of all hours collected into an array. Input table:
+------+------+------+
| hour | col1 | col2 |
+------+------+------+
| 00   | 0.0  | a    |
| 04   | 0.1  | b    |
| 08   | 0.2  | c    |
| 12   | 0.0  | d    |
+------+------+------+
Following a suggested solution, I am using the functions below to collect the column values into an array:
select count(hr),
       map_values(str_to_map(concat_ws(
         ',',
         collect_set(
           concat_ws(':', reflect('java.util.UUID','randomUUID'), cast(col1 as string))
         )
       ))) as col1_arr,
       map_values(str_to_map(concat_ws(
         ',',
         collect_set(
           concat_ws(':', reflect('java.util.UUID','randomUUID'), cast(col2 as string))
         )
       ))) as col2_arr
from table;
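In the output I get, the values in col2_arr are not in the same order as the values in col1_arr. Please suggest how to get the values of the different columns into arrays/lists in the same order.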
+-----------+-----------------+----------+
| count(hr) | col1_arr        | col2_arr |
+-----------+-----------------+----------+
| 4         | 0.0,0.1,0.2,0.0 | b,a,c,d  |
+-----------+-----------------+----------+
Required output:
+-----------+-----------------+----------+
| count(hr) | col1_arr        | col2_arr |
+-----------+-----------------+----------+
| 4         | 0.0,0.1,0.2,0.0 | a,b,c,d  |
+-----------+-----------------+----------+
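
One direction that might produce the required output, sketched here under assumptions rather than as a confirmed fix: each column above is keyed by its own random UUIDs, so the two maps have no shared ordering. If the Hive version has collect_list (it keeps duplicates, so the UUID trick is not needed), both arrays can be built in one pass over the same rows. The sketch assumes the column is named hr as in the query above and keeps `table` as a placeholder name. Hive does not formally guarantee element order for collect_list, so the ORDER BY in the subquery is a best-effort hint.

```sql
-- Sketch only: assumes collect_list is available (Hive 0.13 or later)
-- and that `table` / hr stand in for the real table and hour column.
select count(hr)          as cnt,
       collect_list(col1) as col1_arr,  -- duplicates such as 0.0 are kept
       collect_list(col2) as col2_arr   -- built from the same rows as col1_arr
from (
  select hr, col1, col2
  from `table`
  order by hr                           -- best-effort ordering by hour
) t;
```

If comma-separated strings are wanted rather than arrays, col2_arr can be wrapped as concat_ws(',', collect_list(col2)); col1 would first need cast(col1 as string) inside collect_list.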