How do I find the average of a subset of a table in PostgreSQL?

4

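Please bear with me: I am new to PostgreSQL, but I have been assigned to update fields in some tables. One particular field is the average decision time shown below: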

CASE WHEN COUNT(tdrm.dbid) > 0
THEN TO_CHAR((AVG(tdrm.total_processing_time) || ' millisecond')::interval, 'MI:SS.MS')
ELSE '00:00.000'
END AS average_decision_time

In this logic, COUNT(tdrm.dbid) is items_seen. The problem is that we want to exclude the total processing time of items whose abort flag is "AF_ABORT" when computing the average.

What I tried was:

CASE WHEN COUNT(tdrm.dbid) > 0
THEN TO_CHAR((AVG(COUNT(CASE WHEN tdrm.tdr_abort_flag!=AF_ABORT THEN tdrm.total_processing_time END)) || ' millisecond')::interval, 'MI:SS.MS')
ELSE '00:00.000'
END AS average_decision_time

But I get the following error:

ERROR: aggregate function calls cannot be nested LINE 64: THEN TO_CHAR((AVG(COUNT(CASE WHEN tdrm.tdr_abort_flag!=A...

Am I on the right track? Or is there a simpler way?

Here is the complete SQL:

SELECT s.*,
CASE WHEN agent_event.event_code = 'data_download' THEN 'DL'
WHEN agent_event.event_code = 'mode' THEN 'Mode'
ELSE agent_event.event_code
END AS userAction
FROM
(
WITH report_constants AS (
-- Decisions from DetectionReport.h
SELECT
0::int as AD_UNKNOWN,
1::int as AD_ALARM,
2::int as AD_CLEAR,
-- Flags from DetectionReport.h
0::int as AF_UNKNOWN,
1::int as AF_ABORT,
2::int as AF_SUCCESS,
-- UI values for Decisions are DIFFERENT
0::int as UI_AD_ALL,
1::int as UI_AD_CLEAR,
2::int as UI_AD_ALARM,
3::int as UI_AD_UNKNOWN,
--
0::int as AGENT_TYPE_SCANNER,
1::int as AGENT_TYPE_OSR,
2::int as AGENT_TYPE_DIVERTER,
3::int as AGENT_TYPE_TIP,
4::int as AGENT_TYPE_SEARCH,
-- Operation Mode from Module.h
0::int as OPERATION_MODE_UNKNOWN,
1::int as OPERATION_MODE_SCAN,
2::int as OPERATION_MODE_OTHER
)
SELECT
nss_user.username AS user_name,
reg_login.action_time AS login_action_time,
reg_logout.action_time AS logout_action_time,
to_char(reg_login.action_time, 'MM-DD-YYYY') AS login_date,
to_char(reg_login.action_time, 'HH24:MI:SS') AS login_time,
CASE WHEN reg_logout.action_time IS NULL THEN '' ELSE 
to_char(reg_logout.action_time, 'MM-DD-YYYY') END AS logout_date,
CASE WHEN reg_logout.action_time IS NULL THEN '' ELSE 
to_char(reg_logout.action_time, 'HH24:MI:SS') END AS logout_time,
CASE WHEN user_level.name LIKE 'Level %' THEN SUBSTRING(user_level.name from 7) ELSE user_level.name END AS userAccess,
COUNT(tdrm.dbid) AS items_seen,
CASE WHEN COUNT(tdrm.dbid) > 0
THEN ROUND(100.0 * COUNT(CASE WHEN tdrm.tdr_abort_flag=AF_SUCCESS
  AND tdrm.tdr_alarm_decision=AD_CLEAR THEN 1 END) / COUNT(tdrm.dbid), 2)
ELSE 0.00
END AS clear_rate,
COUNT(CASE WHEN (tdrm.tdr_abort_flag=AF_SUCCESS 
AND tdrm.tdr_alarm_decision=AD_UNKNOWN) 
  OR tdrm.tdr_abort_flag=AF_ABORT THEN 1 END) AS operator_timeout,
CASE WHEN COUNT(tdrm.dbid) > 0
THEN ROUND(100.0 * COUNT(CASE WHEN tdrm.tdr_abort_flag=AF_SUCCESS
  AND tdrm.tdr_alarm_decision=AD_ALARM THEN 1 END) / COUNT(tdrm.dbid), 2)
ELSE 0.00
END AS suspect_rate,
CASE WHEN COUNT(tdrm.dbid) > 0
THEN ROUND(100.0 * COUNT(CASE WHEN 
  (tdrm.tdr_abort_flag=AF_SUCCESS AND tdrm.tdr_alarm_decision=AD_UNKNOWN) 
    OR tdrm.tdr_abort_flag=AF_ABORT THEN 1 END) / COUNT(tdrm.dbid), 2)
ELSE 0.00
END AS operatorNoDecisionRate,
CASE WHEN COUNT(tdrm.dbid) > 0
THEN TO_CHAR((AVG(CASE WHEN tdrm.tdr_abort_flag!=AF_ABORT THEN (tdrm.total_processing_time) END) || ' millisecond')::interval, 'MI:SS.MS')
ELSE '00:00.000'
END AS average_decision_time,
v2_module.dbid AS v2_gem_dbid
FROM report_constants CROSS JOIN auth_event
INNER JOIN registration_event AS reg_login
ON reg_login.credential_id=auth_event.credential_id
AND reg_login.event_type=3
LEFT OUTER JOIN registration_event AS reg_logout
ON reg_logout.credential_id=auth_event.credential_id
AND reg_logout.event_type=4
INNER JOIN nss_user ON nss_user.dbid=auth_event.nss_user_dbid
INNER JOIN user_level ON user_level.dbid=nss_user.user_level_dbid
LEFT OUTER JOIN bag_tdr ON nss_user.dbid=bag_tdr.author_user_dbid
AND (item_tdr.agent_type=AGENT_TYPE_OSR OR 
item_tdr.agent_type=AGENT_TYPE_SEARCH)
AND item_tdr.author_credential_id=auth_event.credential_id
LEFT OUTER JOIN v2_module AS tdrm ON 
item_tdr.v2_module_dbid=tdrm.dbid 
LEFT OUTER JOIN v2_general_equipment_module
ON v2_general_equipment_module.dbid=reg_login.v2_gem_dbid
WHERE auth_event.credential_id IS NOT NULL
AND auth_event.auth_event_type=1
AND ($P{userid} = 'ALL' OR $P{userid} = nss_user.username)
AND item_tdr.created_date >= $P{fromdate}
AND item_tdr.created_date <= $P{todate}
AND v2_module.operation_mode != OPERATION_MODE_OTHER
GROUP BY nss_user.username, user_level.name, reg_login.agent_type, 
reg_login.action_time, reg_logout.action_time, 
v2_module.dbid
) s
LEFT OUTER JOIN agent_event
ON s.v2_dbid=agent_event.v2_dbid
AND agent_event.event_timestamp >= s.login_action_time
AND (s.logout_action_time IS NULL OR agent_event.event_timestamp <= s.logout_action_time)
ORDER BY s.login_action_time
3 Answers

5
We want to exclude the total processing time of items whose abort flag equals "AF_ABORT" from the average processing time.
CASE WHEN count(tdrm.dbid) > 0
   THEN to_char(avg(tdrm.total_processing_time)
                   FILTER (WHERE tdrm.tdr_abort_flag IS DISTINCT FROM 'AF_ABORT') -- ①, ②
              * interval '1 millisecond'  -- ③
              , 'MI:SS.MS')
   ELSE '00:00.000'
END AS average_decision_time

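① The key element for implementing the filter is the aggregate FILTER clause, which restricts the rows that are fed to the aggregate function.

② If tdrm.tdr_abort_flag can be NULL (missing information), we need tdrm.tdr_abort_flag IS DISTINCT FROM 'AF_ABORT'. With <>, a NULL flag makes the comparison evaluate to NULL, so such rows would be left out of the average. If the column cannot be NULL, we can simplify to tdrm.tdr_abort_flag <> 'AF_ABORT'.

③ Multiplying by an interval is faster than concatenating a string and casting it.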
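
The three notes can be checked on a throwaway data set. The following is only a sketch: the inline table t and its rows are made up for illustration, and the flag is modelled as an integer with 1 standing for AF_ABORT, as in the question's report_constants CTE (the snippet above compares against a string literal instead). The aggregate FILTER clause needs PostgreSQL 9.4 or later.

```sql
-- Hypothetical sample rows: flag 1 = AF_ABORT, 2 = AF_SUCCESS, NULL = unknown.
WITH t(dbid, tdr_abort_flag, total_processing_time) AS (
   VALUES (1, 2,    1000)
        , (2, 1,    9000)   -- aborted, must not count
        , (3, NULL, 2000)   -- flag missing
)
SELECT -- ② NULL-safe comparison: averages rows 1 and 3
       avg(total_processing_time)
          FILTER (WHERE tdr_abort_flag IS DISTINCT FROM 1) AS avg_null_kept
       -- ② plain <>: the NULL row is dropped too, only row 1 remains
     , avg(total_processing_time)
          FILTER (WHERE tdr_abort_flag <> 1)               AS avg_null_dropped
       -- ① the CASE form from the question filters the same rows as <>
     , avg(CASE WHEN tdr_abort_flag <> 1
                THEN total_processing_time END)            AS avg_case_form
       -- ③ multiply by an interval instead of concatenating and casting
     , to_char(avg(total_processing_time)
                  FILTER (WHERE tdr_abort_flag <> 1)
               * interval '1 millisecond', 'MI:SS.MS')     AS formatted
FROM   t;
```

The second and third columns should agree, which is why the CASE expression in the question's full query and the FILTER clause are interchangeable here; FILTER is simply the more direct way to say it.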

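However, once a FILTER like this is added, the expression can end up producing NULL. Your requirements are a bit vague. What you may really want is:

The average of total_processing_time where tdr_abort_flag <> 'AF_ABORT'. If the result is NULL for any reason, default to 0.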

COALESCE(to_char(avg(tdrm.total_processing_time) FILTER (WHERE tdrm.tdr_abort_flag <> 'AF_ABORT')
               * interval '1 millisecond'
               , 'MI:SS.MS')
       , '00:00.000') AS average_decision_time

Or:

The average of total_processing_time where tdr_abort_flag <> 'AF_ABORT', but only if count(tdrm.dbid) > 0. If the result is NULL for any reason, default to 0.

CASE WHEN count(tdrm.dbid) > 0
   THEN COALESCE(to_char(avg(tdrm.total_processing_time) FILTER (WHERE tdrm.tdr_abort_flag <> 'AF_ABORT')
                       * interval '1 millisecond'
                       , 'MI:SS.MS')
               , '00:00.000')
   ELSE '00:00.000'
END AS average_decision_time

Or:

The average of total_processing_time where tdr_abort_flag <> 'AF_ABORT', but only if the count of tdrm.dbid where tdr_abort_flag <> 'AF_ABORT' is greater than 0. Otherwise, default to 0.

CASE WHEN count(tdrm.dbid) FILTER (WHERE tdrm.tdr_abort_flag <> 'AF_ABORT') > 0
   THEN to_char(avg(tdrm.total_processing_time) FILTER (WHERE tdrm.tdr_abort_flag <> 'AF_ABORT')
              * interval '1 millisecond'
              , 'MI:SS.MS')
   ELSE '00:00.000'
END AS average_decision_time
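
The difference between these variants only shows up for a group in which every row is aborted. The sketch below uses made-up rows and, as an assumption, an integer flag where 1 means AF_ABORT. For user 'b' the filtered average is NULL, and a guard on the unfiltered count would not catch that, because the aborted row still counts.

```sql
-- Hypothetical data: user 'a' has one regular and one aborted item,
-- user 'b' has only an aborted item.
WITH t(username, tdr_abort_flag, total_processing_time) AS (
   VALUES ('a', 2, 1500)
        , ('a', 1, 9000)
        , ('b', 1, 4000)
)
SELECT username
       -- no fallback: NULL for 'b', since no row passes the filter
     , to_char(avg(total_processing_time) FILTER (WHERE tdr_abort_flag <> 1)
               * interval '1 millisecond', 'MI:SS.MS')   AS no_fallback
       -- COALESCE turns that NULL into the default string
     , COALESCE(to_char(avg(total_processing_time) FILTER (WHERE tdr_abort_flag <> 1)
                        * interval '1 millisecond', 'MI:SS.MS')
              , '00:00.000')                             AS with_coalesce
       -- guarding on the filtered count gives the same default
     , CASE WHEN count(*) FILTER (WHERE tdr_abort_flag <> 1) > 0
          THEN to_char(avg(total_processing_time) FILTER (WHERE tdr_abort_flag <> 1)
                       * interval '1 millisecond', 'MI:SS.MS')
          ELSE '00:00.000'
       END                                               AS with_filtered_count
FROM   t
GROUP  BY username
ORDER  BY username;
```

The last two columns should match for every group; which one to pick is mostly a matter of which reads more clearly next to the other rate columns in the report.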

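You mention that you are "new to SQL". Let me add: a clear definition of the problem is more than 50% of the solution. That is true in many fields, but especially so in SQL.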


3

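You should be able to achieve this with a simple subquery that selects from the table again and uses a WHERE clause to filter out the aborted records when computing the average. For example: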

    CASE WHEN COUNT(tdrm.dbid) > 0
        THEN TO_CHAR((SELECT (AVG(CASE WHEN tdrm_subq.tdr_abort_flag != AF_ABORT
                                       THEN (tdrm_subq.total_processing_time)
                                  END) || ' millisecond')
                      FROM v2_module tdrm_subq, report_constants
                      WHERE tdr_abort_flag != AF_ABORT)::interval,
                     'MI:SS.MS')
        ELSE '00:00.000'
    END AS average_decision_time

Demo here: https://rextester.com/NSQN12214
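
One thing worth checking when adapting this to a grouped report: a scalar subquery that does not refer to the outer row is evaluated over the whole table, not per group. The sketch below illustrates this with made-up rows and, as an assumption, an integer flag where 1 means AF_ABORT; the per-group column uses an aggregate FILTER clause for comparison.

```sql
-- Hypothetical data: two users, one aborted item for user 'a'.
WITH t(username, tdr_abort_flag, total_processing_time) AS (
   VALUES ('a', 2, 1000)
        , ('a', 1, 9000)
        , ('b', 2, 3000)
)
SELECT username
     , count(*) AS items_seen                      -- still counts the aborted row
       -- uncorrelated subquery: one average over all non-aborted rows,
       -- repeated unchanged for every group
     , (SELECT avg(sub.total_processing_time)
        FROM   t sub
        WHERE  sub.tdr_abort_flag <> 1)            AS avg_whole_table
       -- per-group average over non-aborted rows only
     , avg(total_processing_time)
          FILTER (WHERE tdr_abort_flag <> 1)       AS avg_per_group
FROM   t
GROUP  BY username
ORDER  BY username;
```

If a per-user average is what the report needs, the subquery would have to be correlated to the outer group, or replaced by a filtered aggregate as in the last column.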

1
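OK, I have added the complete SQL statement. I don't think this will work, because we have other fields that need to include the items that do not have the AF_ABORT flag set. - user2990406
Thanks for providing this - it is much clearer now. I have amended the answer accordingly... - Steve Chambers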

1
In addition to subqueries, it should be mentioned that this is exactly what window functions are for.
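
A minimal sketch of the window-function route, under the same assumptions as the rest of the thread's toy data (made-up rows, integer flag with 1 standing for AF_ABORT): an aggregate used as a window function also accepts a FILTER clause, so every row keeps its own columns while carrying the average of the non-aborted items of its user.

```sql
-- Hypothetical data: user 'a' has one aborted item among three.
WITH t(username, dbid, tdr_abort_flag, total_processing_time) AS (
   VALUES ('a', 1, 2, 1000)
        , ('a', 2, 1, 9000)
        , ('a', 3, 2, 2000)
        , ('b', 4, 2, 3000)
)
SELECT username, dbid, tdr_abort_flag, total_processing_time
       -- average of non-aborted items per user, attached to every row
       -- (including the aborted one) without collapsing the rows
     , avg(total_processing_time)
          FILTER (WHERE tdr_abort_flag <> 1)
          OVER (PARTITION BY username) AS avg_non_aborted
FROM   t
ORDER  BY dbid;
```

Unlike GROUP BY, this keeps one output row per item, so it fits best when the report needs row-level detail next to the per-user average.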
