PostgreSQL首次加载时查询运行非常缓慢

4
我们在Amazon EC2上使用PostgreSQL版本9.4数据库。我们所有的查询在第一次尝试时运行非常缓慢,直到它被缓存后它们才相当快,但这并不是一种安慰,因为它会减缓页面加载速度。
这里有一个我们使用的查询示例:
SELECT HE.fs_perm_sec_id,
   HE.TICKER_EXCHANGE,
   HE.proper_name,
   OP.shares_outstanding,

(SELECT factset_industry_desc
 FROM factset_industry_map AS fim
 WHERE fim.factset_industry_code = HES.industry_code) AS industry,

(SELECT SUM(POSITION) AS ST_HOLDINGS
 FROM OWN_STAKES_HOLDINGS S
  WHERE S.POSITION > 0
   AND S.fs_perm_sec_id = HE.fs_perm_sec_id
 GROUP BY FS_PERM_SEC_ID) AS stake_holdings,

(SELECT SUM(CURRENT_HOLDINGS)
 FROM
   (SELECT CURRENT_HOLDINGS
    FROM OWN_INST_HOLDINGS IHT
    WHERE FS_PERM_SEC_ID=HE.FS_PERM_SEC_ID
    ORDER BY CURRENT_HOLDINGS DESC LIMIT 10)A) AS top_10_inst_hodings,

 (SELECT SUM(OIH.current_holdings)
  FROM own_inst_holdings OIH
  WHERE OIH.fs_perm_sec_id = HE.fs_perm_sec_id) AS inst_holdings

FROM own_prices OP
JOIN h_security_ticker_exchange HE ON OP.fs_perm_sec_id = HE.fs_perm_sec_id
JOIN h_entity_sector HES ON HES.factset_entity_id = HE.factset_entity_id
WHERE HE.ticker_exchange = 'PG-NYS'
ORDER BY OP.price_date DESC LIMIT 1

运行了一个EXPLAIN ANALYSE,并得到了以下结果:

  QUERY PLAN
  Limit  (cost=223.39..223.39 rows=1 width=100) (actual time=2420.644..2420.645 rows=1 loops=1)
  ->  Sort  (cost=223.39..223.39 rows=1 width=100) (actual time=2420.643..2420.643 rows=1 loops=1)
    Sort Key: op.price_date
    Sort Method: top-N heapsort  Memory: 25kB
    ->  Nested Loop  (cost=0.26..223.39 rows=1 width=100) (actual time=2316.169..2420.566 rows=36 loops=1)
          ->  Nested Loop  (cost=0.17..8.87 rows=1 width=104) (actual time=3.958..5.084 rows=36 loops=1)
                ->  Index Scan using h_sec_exch_factset_entity_id_idx on h_security_ticker_exchange he  (cost=0.09..4.09 rows=1 width=92) (actual time=1.452..1.454 rows=1 loops=1)
                      Index Cond: ((ticker_exchange)::text = 'PG-NYS'::text)
                ->  Index Scan using alex_prices on own_prices op  (cost=0.09..4.68 rows=33 width=23) (actual time=2.496..3.592 rows=36 loops=1)
                      Index Cond: ((fs_perm_sec_id)::text = (he.fs_perm_sec_id)::text)
          ->  Index Scan using alex_factset_entity_idx on h_entity_sector hes  (cost=0.09..4.09 rows=1 width=14) (actual time=0.076..0.077 rows=1 loops=36)
                Index Cond: (factset_entity_id = he.factset_entity_id)
          SubPlan 1
            ->  Index Only Scan using alex_factset_industry_code_idx on factset_industry_map fim  (cost=0.03..2.03 rows=1 width=20) (actual time=0.006..0.007 rows=1 loops=36)
                  Index Cond: (factset_industry_code = hes.industry_code)
                  Heap Fetches: 0
          SubPlan 2
            ->  GroupAggregate  (cost=0.08..2.18 rows=2 width=17) (actual time=0.735..0.735 rows=1 loops=36)
                  Group Key: s.fs_perm_sec_id
                  ->  Index Only Scan using own_stakes_holdings_perm_position_idx on own_stakes_holdings s  (cost=0.08..2.15 rows=14 width=17) (actual time=0.080..0.713 rows=39 loops=36)
                        Index Cond: ((fs_perm_sec_id = (he.fs_perm_sec_id)::text) AND (\position\ > 0::numeric))
                        Heap Fetches: 1155
          SubPlan 3
            ->  Aggregate  (cost=11.25..11.26 rows=1 width=6) (actual time=0.166..0.166 rows=1 loops=36)
                  ->  Limit  (cost=0.09..11.22 rows=10 width=6) (actual time=0.081..0.150 rows=10 loops=36)
                        ->  Index Only Scan Backward using alex_current_holdings_idx on own_inst_holdings iht  (cost=0.09..194.87 rows=175 width=6) (actual time=0.080..0.147 rows=10 loops=36)
                              Index Cond: (fs_perm_sec_id = (he.fs_perm_sec_id)::text)
                              Heap Fetches: 288
          SubPlan 4
            ->  Aggregate  (cost=194.96..194.96 rows=1 width=6) (actual time=66.102..66.102 rows=1 loops=36)
                  ->  Index Only Scan using alex_current_holdings_idx on own_inst_holdings oih  (cost=0.09..194.87 rows=175 width=6) (actual time=0.060..65.209 rows=2505 loops=36)
                        Index Cond: (fs_perm_sec_id = (he.fs_perm_sec_id)::text)
                 Heap Fetches: 33453
 Planning time: 1.581 ms
 Execution time: 2420.830 ms

一旦我们停用三个聚合的SELECT SUM(),查询速度就会显著提高,但这违背了关系数据库的初衷。

我们正在使用PG插件(https://www.npmjs.com/package/pg)在NodeJS上运行查询来连接和查询数据库。

我们应该如何加快查询速度?我们可以采取哪些额外措施?我们已经对数据库进行了索引,所有字段似乎都已正确索引,但仍然不够快。

欢迎任何帮助、评论和/或建议。


1
你正在使用EC2而不是RDS?你使用的实例大小是多少?你是否在使用EBS?你使用的EBS卷类型是什么?EBS卷的大小是多少?你的服务器监控中似乎存在瓶颈,是磁盘IO、CPU、网络传输哪个方面? - Mark B
我正在使用Heroku插件,它似乎将数据库托管在AWS上。 - Alex_B
2
如果您甚至不知道它是否托管在EC2实例上,也许您应该使用“Heroku”标记问题,而不是任何AWS标记。 Heroku的PostgreSQL提供是针对Heroku特定的,虽然它可能在幕后使用AWS资源,但像amazon-ec2这样的标记与问题无关。 - Mark B
谢谢您的反馈,但那并没有回答我的问题。 - Alex_B
1
我在你的问题下发表了一条评论,而不是回答。我的评论旨在帮助你更新问题,提供必要的细节,以便有人能够提供答案。 - Mark B
显示剩余2条评论
1个回答

0
嵌套聚合循环通常是不好的。下面的代码应该避免了这种情况。(未经测试;如果有SQLFiddle会更有帮助。)试一试并让我知道结果。我很想知道窗口函数过滤器的效果如何。
WITH    security
AS  (
    SELECT  HE.fs_perm_sec_id
,       HE.TICKER_EXCHANGE
,       HE.proper_name
,       OP.shares_outstanding
,       OP.price_date
FROM    own_prices                  AS OP
 JOIN   h_security_ticker_exchange  AS HE
  ON    OP.fs_perm_sec_id       = HE.fs_perm_sec_id
 JOIN   h_entity_sector             AS HES
  ON    HES.factset_entity_id   = HE.factset_entity_id
WHERE HE.ticker_exchange = 'PG-NYS'
)
SELECT  SE.fs_perm_sec_id
,       SE.TICKER_EXCHANGE
,       SE.proper_name
,       SE.shares_outstanding
,       S.stake_holdings
,       IHT.top_10_inst_holdings
,       OIH.inst_holdings
FROM    security    SE
 JOIN   (
        SELECT  S.fs_perm_sec_id
        ,       SUM(S.POSITION)     AS stake_holdings
        FROM    OWN_STAKES_HOLDINGS AS S
        WHERE   S.fs_perm_sec_id    IN (
                    SELECT  fs_perm_sec_id
                    FROM    security
                )
         AND    S.POSITION          > 0
        GROUP BY    S.fs_perm_sec_id
        )   AS S
  ON    SE.fs_perm_sec_id   = S.fs_perm_sec_id
 JOIN   (
        SELECT  IHT.FS_PERM_SEC_ID
        ,       SUM(IHT.CURRENT_HOLDINGS)   AS top_10_inst_holdings
        FROM    OWN_INST_HOLDINGS   AS IHT
        WHERE   IHT.FS_PERM_SEC_ID  IN (
                    SELECT  fs_perm_sec_id
                    FROM    security
                )
         AND    ROW_NUMBER() OVER (
                    PARTITION BY IHT.FS_PERM_SEC_ID
                    ORDER BY IHT.CURRENT_HOLDINGS DESC
                )                   <= 10
        GROUP BY    IHT.FS_PERM_SEC_ID
        )   AS IHT
  ON    SE.fs_perm_sec_id   = IHT.fs_perm_sec_id
 JOIN   (
        SELECT  S.fs_perm_sec_id
        ,       SUM(OIH.current_holdings)   AS inst_holdings
        FROM    own_inst_holdings   AS OIH
        WHERE   OIH.fs_perm_sec_id  IN (
                    SELECT  fs_perm_sec_id
                    FROM    security
                )
        GROUP BY    OIH.fs_perm_sec_id
        )   AS OIH
  ON    SE.fs_perm_sec_id   = OIH.fs_perm_sec_id
ORDER BY    SE.price_date
LIMIT   1

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接