How to optimize a JOIN in PostgreSQL


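I have four tables to pull information from:

user: first_name
mongouser: email, card_status
transactions: transaction_type, balance, posted_at, is_atm, is_purchase
user_login: user_id, login_date, login_id...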

Everything ran efficiently until I added the fourth table, user_login. The fourth JOIN makes the whole query slow. This is the query I wrote:

SELECT * FROM 
(SELECT
ssluserid,
first_name,
m.email,
zipcode,
date_part('year',age(birthday)) AS birthday,
(current_date - DATE(created_date)) AS duration,
CASE WHEN card_status = 'ACTIVE' THEN 1 ELSE 0 END AS IS_ACTIVE,
SUM(CASE WHEN transaction_type = 'Credit' AND balance > 1.00 THEN balance END) AS LOAD_AMT,
SUM(CASE WHEN transaction_type = 'Debit' AND balance > 1.00 THEN balance END) AS SPEND_AMT,
COUNT(CASE WHEN transaction_type = 'Credit' AND balance > 1.00 THEN balance END) AS LOAD_CT,
COUNT(CASE WHEN transaction_type = 'Debit' AND balance > 1.00 THEN balance END) AS SPEND_CT,
MIN(CASE WHEN transaction_type = 'Credit' AND balance > 1.00 THEN DATE(posted_at) END) AS FIRST_LOAD,
MAX(CASE WHEN transaction_type = 'Credit' AND balance > 1.00 THEN DATE(posted_at) END) AS LAST_LOAD,
MIN(CASE WHEN transaction_type = 'Debit' AND balance > 1.00 THEN DATE(posted_at) END) AS FIRST_SPEND,
MAX(CASE WHEN transaction_type = 'Debit' AND balance > 1.00 THEN DATE(posted_at) END) AS LAST_SPEND,
  SUM(CASE WHEN transaction_type = 'Debit' AND is_atm = 't' AND DATE(posted_at) >= CURRENT_DATE - INTERVAL '90 days'
                                    THEN balance END) AS ATM_AMT,
  SUM(CASE WHEN transaction_type = 'Debit' AND is_purchase = 't' AND DATE(posted_at) >= CURRENT_DATE - INTERVAL '90 days'
                                    THEN balance END) AS POS_AMT,
  SUM(CASE WHEN transaction_type = 'Credit' AND balance > 1.00 AND DATE(posted_at) >= CURRENT_DATE - INTERVAL '90 days' 
                                    THEN balance END) AS LOAD_VOL,
  COUNT(CASE WHEN DATE(login_date) >= CURRENT_DATE - INTERVAL '90 days' THEN 
login_id END) AS CT_LOGIN
FROM
mongouser m
LEFT OUTER JOIN
user u
ON m.userid = u.id
LEFT OUTER JOIN transactions t
ON u.id = t.user_id
LEFT OUTER JOIN user_login l
ON m.userid = l.user_id
GROUP BY 1,2,3,4,5,6,7) t
WHERE LAST_LOAD >= CURRENT_DATE - INTERVAL '90 days'
ORDER BY 9 DESC;

This query has been running for almost 40 minutes... Is there any way to optimize it?


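Yes, there are lots of ways to optimize it. You can use EXPLAIN to get a report on which parts of the query are expensive and to see whether anything would benefit from better index usage; you can change the indexes on the tables, limit the number of columns or rows you pull, and so on. Or you could drop the debit/credit flag and store debits as negative amounts, which would let you eliminate all the CASE expressions. Have you tried looking for any optimizations yourself, or at least investigated where the query is getting held up? - GordonM
@GordonM Thanks for the suggestions! - YOBOX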
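
For reference, a minimal sketch of the EXPLAIN suggestion, applied only to the join that was added last. The table and column names come from the query in the question; the cut-down SELECT is just an illustration, not the full report.

```sql
-- Plan only: prints the estimated plan without running the query.
EXPLAIN
SELECT m.userid, COUNT(l.login_id) AS ct_login
FROM mongouser m
LEFT OUTER JOIN user_login l
    ON m.userid = l.user_id
GROUP BY m.userid;

-- Plan plus actual row counts, timings and buffer usage.
-- Note that ANALYZE really executes the query, so try it on a small case first.
EXPLAIN (ANALYZE, BUFFERS)
SELECT m.userid, COUNT(l.login_id) AS ct_login
FROM mongouser m
LEFT OUTER JOIN user_login l
    ON m.userid = l.user_id
GROUP BY m.userid;
```

In the output, the join node type (Nested Loop, Hash Join, Merge Join) and any Seq Scan on a large table are usually the first things worth looking at.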
1 Answer


Going by your own statement, you already know where the problem is. You had this:

LEFT OUTER JOIN user u
ON m.userid = u.id

And you said things were "not slow". Then you added:
LEFT OUTER JOIN user_login l
ON m.userid = l.user_id

And you said things slowed down. Chances are you have an index on m.userid. Do you have an index on l.user_id?

CREATE INDEX foo ON user_login ( user_id );
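
Before creating anything, it may be worth checking which indexes already exist. The sketch below reads the built-in pg_indexes view, which normally needs no special privileges; the table names are the ones from the question.

```sql
-- List the existing indexes on the four tables involved in the query.
SELECT schemaname, tablename, indexname, indexdef
FROM pg_indexes
WHERE tablename IN ('mongouser', 'user', 'transactions', 'user_login')
ORDER BY tablename, indexname;
```

If user_login.user_id or mongouser.userid does not appear in any indexdef, the join on that column has no index to use.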

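You're right. userid in the mongouser table has no index, while user_id in the user_login table does. But could you explain why adding the user_login table slows everything down? Before adding this table, the query took 22 seconds. - YOBOX
22 seconds is already slow. I don't know about user, but if you don't have an index, you have to either run through each table and join them with a nested loop (if it's ordered), or join them using a bitmap join. Both of these are slow. - Evan Carroll
Apart from creating indexes, is there any way to optimize this query? Unfortunately, as an analyst, I don't have permission to create indexes on the tables... - YOBOX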
No, you need to get the permission first. - Evan Carroll
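
One rewrite that needs no extra privileges, offered as a sketch rather than a tested fix: transactions and user_login are both one-to-many against the same user, so joining both before the GROUP BY yields (transactions × logins) rows per user. That multiplies the work and also inflates every SUM and COUNT. Aggregating each table in its own subquery first keeps one row per user on each side of the join. The sketch shows only a few of the original columns and assumes that joining transactions directly on m.userid is equivalent to going through the user table.

```sql
-- Aggregate each one-to-many table separately, then join the
-- one-row-per-user results back to mongouser.
SELECT
    m.userid,
    m.email,
    t.load_amt,
    t.last_load,
    COALESCE(l.ct_login, 0) AS ct_login
FROM mongouser m
LEFT OUTER JOIN (
    -- One row per user with the transaction aggregates.
    SELECT
        user_id,
        SUM(CASE WHEN transaction_type = 'Credit' AND balance > 1.00
                 THEN balance END) AS load_amt,
        MAX(CASE WHEN transaction_type = 'Credit' AND balance > 1.00
                 THEN DATE(posted_at) END) AS last_load
    FROM transactions
    GROUP BY user_id
) t ON m.userid = t.user_id
LEFT OUTER JOIN (
    -- One row per user with the login count for the last 90 days.
    SELECT user_id, COUNT(login_id) AS ct_login
    FROM user_login
    WHERE DATE(login_date) >= CURRENT_DATE - INTERVAL '90 days'
    GROUP BY user_id
) l ON m.userid = l.user_id
WHERE t.last_load >= CURRENT_DATE - INTERVAL '90 days';
```

The remaining aggregates from the original query would go into the first subquery in the same way. Whether this is actually faster on the real data is something only EXPLAIN (ANALYZE) can confirm.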
