假设我有以下表定义:
CREATE TABLE x (i serial primary key, value integer not null);
我想计算value
的中位数(而不是平均值)。中位数是将集合分成包含相同数量元素的两个子集的值。如果元素数为偶数,则中位数是最低段中最大值和最高段中最小值的平均值。(有关更多细节,请参见维基百科。)
以下是我计算中位数的方法,但我认为肯定有更好的方法:
SELECT AVG(values_around_median) AS median
FROM (
SELECT
DISTINCT(CASE WHEN FIRST_VALUE(above) OVER w2 THEN MIN(value) OVER w3 ELSE MAX(value) OVER w2 END)
AS values_around_median
FROM (
SELECT LAST_VALUE(value) OVER w AS value,
SUM(COUNT(*)) OVER w > (SELECT count(*)/2 FROM x) AS above
FROM x
GROUP BY value
WINDOW w AS (ORDER BY value)
ORDER BY value
) AS find_if_values_are_above_or_below_median
WINDOW w2 AS (PARTITION BY above ORDER BY value DESC),
w3 AS (PARTITION BY above ORDER BY value ASC)
) AS find_values_around_median
有什么想法吗?