PostgreSQL: exists vs left join

Question

performance postgresql explain

I have heard that Postgres handles exists queries faster than left join.

http://archives.postgresql.org/pgsql-performance/2002-12/msg00185.php

That is definitely true for aggregation over a single table.

But in our case, several identical queries built with exists make Postgres hang forever:

explain 
SELECT count(DISTINCT "groups".id) AS count_all 
FROM "groups"
WHERE (exists(
    select * from products p where groups.id = p.group_id AND exists(
        select * from products_categories pc where p.id = pc.product_id AND pc.category_id in (2,3))) AND groups.id != 3)

Result:

 Aggregate  (cost=26413436.66..26413436.67 rows=1 width=4)
   ->  Seq Scan on groups  (cost=0.00..26413403.84 rows=13126 width=4)
         Filter: ((id <> 3) AND (subplan))
         SubPlan
           ->  Index Scan using index_products_on_group_id on products p  (cost=0.00..1006.13 rows=1 width=1483)
                 Index Cond: ($1 = group_id)
                 Filter: (subplan)
                 SubPlan
                   ->  Seq Scan on products_categories pc  (cost=0.00..498.49 rows=1 width=8)
                         Filter: ((category_id = ANY ('{2,3}'::integer[])) AND ($0 = product_id))

Is this the root cause of the abnormally long execution time? Is it some kind of configuration issue?

Thanks, Bogdan.

- Bogdan Gusiev

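Is there an index on the id of groups? It looks to me like there isn't. Also, can you tell us what you are trying to achieve? Maybe we can help you optimize the query. - aardbol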

1 Answer

- araqnid · Accepted Answer

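Well, for each row in "groups", postgresql is doing a full scan of "products_categories", which isn't good. It's not necessarily a configuration issue, but maybe the query could be stated without nesting the subqueries like that?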

SELECT count(DISTINCT "groups".id) AS count_all 
FROM "groups"
WHERE exists(
    select 1 from products p
             join products_categories pc on pc.product_id = p.id
    where groups.id = p.group_id
      and pc.category_id in (2,3)
    ) and groups.id <> 3

Also, does the products_categories table have an index on product_id?
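
If that index turns out to be missing, a minimal sketch of adding it and re-checking the plan might look like the following. The index name is made up for illustration, and whether the planner actually picks the index depends on your data, statistics and PostgreSQL version, so treat this as something to try rather than a guaranteed fix.

```sql
-- Hypothetical index name. The column is taken from the plan in the question,
-- where products_categories is filtered on product_id with a Seq Scan.
CREATE INDEX index_products_categories_on_product_id
    ON products_categories (product_id);

-- Refresh planner statistics for the table.
ANALYZE products_categories;

-- Re-check the plan for the join-based form of the query.
EXPLAIN
SELECT count(DISTINCT groups.id) AS count_all
FROM groups
WHERE EXISTS (
    SELECT 1
    FROM products p
    JOIN products_categories pc ON pc.product_id = p.id
    WHERE p.group_id = groups.id
      AND pc.category_id IN (2, 3)
) AND groups.id <> 3;
```

If the Seq Scan on products_categories no longer appears inside the subplan, the per-row cost should drop considerably. If it is still there, comparing the output of EXPLAIN ANALYZE before and after would be the next thing to look at.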