How can I optimize a query that joins the same table multiple times to improve performance?

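The query below is used to apply multiple filter values when searching the product database of an e-commerce site.

The MySQL server goes down when more than 7 filters are used.

  • Is there any chance to join positions_categories_links only once?
  • Should I consider changing the database structure?

Every product has one eshop_cats category (many2one) and up to 20 categories linked through multiple relations (many2many).

The eshop_pos table has 40,000 records and contains the products.

The eshop_cats table has 340 records and contains the main categories.

The categories table has 6,000 records and contains the duplex categories.

The positions_categories_links table has 360,000 records and contains the keys between products and categories.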


This is my query:
SELECT COUNT(DISTINCT eshop_pos.id)
FROM eshop_pos
INNER JOIN eshop_cats t1 ON eshop_pos.eshopcatid = t1.id
    AND t1.active = 1
INNER JOIN positions_categories_links t2 ON t2.pos_id = eshop_pos.id
INNER JOIN categories t3 ON t3.id = t2.cat_id
    AND t3.active = 1
    AND t3.section_id = 62021
INNER JOIN positions_categories_links t4 ON t4.pos_id = eshop_pos.id
INNER JOIN categories t5 ON t5.id = t4.cat_id
    AND t5.active = 1
    AND t5.section_id = 62023
INNER JOIN positions_categories_links AS duplex_links_51 ON duplex_links_51.pos_id = eshop_pos.id
    AND duplex_links_51.cat_id = 51
    AND duplex_links_51.value IN (2984)
INNER JOIN positions_categories_links AS duplex_links_52 ON duplex_links_52.pos_id = eshop_pos.id
    AND duplex_links_52.cat_id = 52
    AND duplex_links_52.value IN (3003)
INNER JOIN positions_categories_links AS duplex_links_3904 ON duplex_links_3904.pos_id = eshop_pos.id
    AND duplex_links_3904.cat_id = 3904
    AND duplex_links_3904.value IN (3941)
INNER JOIN positions_categories_links AS duplex_links_4462 ON duplex_links_4462.pos_id = eshop_pos.id
    AND duplex_links_4462.cat_id = 4462
    AND duplex_links_4462.value IN (4465)
INNER JOIN positions_categories_links AS duplex_links_4466 ON duplex_links_4466.pos_id = eshop_pos.id
    AND duplex_links_4466.cat_id = 4466
    AND duplex_links_4466.value IN (4468)
INNER JOIN positions_categories_links AS duplex_links_4472 ON duplex_links_4472.pos_id = eshop_pos.id
    AND duplex_links_4472.cat_id = 4472
    AND duplex_links_4472.value IN (4473)
INNER JOIN positions_categories_links AS duplex_links_4974 ON duplex_links_4974.pos_id = eshop_pos.id
    AND duplex_links_4974.cat_id = 4974
    AND duplex_links_4974.value IN (4978)
INNER JOIN positions_categories_links AS duplex_links_4979 ON duplex_links_4979.pos_id = eshop_pos.id
    AND duplex_links_4979.cat_id = 4979
    AND duplex_links_4979.value IN (4982)
INNER JOIN positions_categories_links AS duplex_links_4984 ON duplex_links_4984.pos_id = eshop_pos.id
    AND duplex_links_4984.cat_id = 4984
    AND duplex_links_4984.value IN (4986)

I am unable to run EXPLAIN for the query on the server, but it works fine on my local laptop:

EXPLAIN result of the query


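Please take some time to review your question and fix the formatting issues. - Brien Foss
Thanks! I have just finished editing. - Andrey Lartsev
Since this is your first day on this site, I will help you. Give me a few minutes, then accept the edits I make to your post. Take a moment to look at the differences, and try to ask your next question more carefully. - Brien Foss
Thanks! Looks good. - Andrey Lartsev
What does "goes down" mean? Did the server crash? Did it time out? Or did the query just never return? Can you run an EXPLAIN of the query? - Doon
The whole site returns a gateway timeout until you kill the query. I will post the EXPLAIN later today. - Andrey Lartsev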
1 Answer


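I slightly reformatted your query. I also changed your "duplex_links" references to the abbreviated alias "PCL" (from your positions_categories_links table). Shorter, relevant aliases make the query easier to read (at least for me, and probably for others).

As for your tables/indexes, if they do not already exist, I would suggest the following. Here I have offered covering indexes for your query: because every column used to satisfy the join criteria is part of an index, the SQL engine does not have to go to the underlying data pages to confirm other details, which helps performance.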

Table                       Index
eshop_pos                   (id, eshopcatid)
eshop_cats                  (id, active)
positions_categories_links  (pos_id, cat_id, value)
categories                  (id, active, section_id)
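
If any of these indexes are missing, they could be created roughly as follows. This is a minimal sketch in MySQL syntax: the index names (idx_...) are made up for illustration, and only the table names and column lists come from the table above. Where `id` is already the primary key, check the existing indexes first, since the extra index may add little.

```sql
-- List the indexes that already exist before adding new ones.
SHOW INDEX FROM positions_categories_links;

-- Index names below are hypothetical; the column lists follow the table above.
CREATE INDEX idx_pos_id_cat ON eshop_pos (id, eshopcatid);
CREATE INDEX idx_cats_id_active ON eshop_cats (id, active);
CREATE INDEX idx_pcl_pos_cat_value ON positions_categories_links (pos_id, cat_id, value);
CREATE INDEX idx_categories_id_active_section ON categories (id, active, section_id);
```

Running EXPLAIN on the query before and after adding the indexes shows whether the optimizer actually picks them up.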

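I also like to indent the joins to show how they relate, so you can see how you get from one table/alias to the next level. The hierarchy and the source of each join are visible at a glance.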
SELECT 
      COUNT(DISTINCT eshop_pos.id) 
   FROM 
      eshop_pos 
         inner join eshop_cats t1 
            on eshop_pos.eshopcatid = t1.id 
           AND t1.active = 1 

         inner join positions_categories_links t2 
            on eshop_pos.id = t2.pos_id

            inner join categories t3 
               on t2.cat_id = t3.id
              and t3.active = 1 
              and t3.section_id = 62021 

         inner join positions_categories_links t4 
            on eshop_pos.id = t4.pos_id

            inner join categories t5 
               on t4.cat_id = t5.id
              and t5.active = 1 
              and t5.section_id = 62023 

         INNER JOIN positions_categories_links AS PCL51 
            ON eshop_pos.id = PCL51.pos_id
           AND PCL51.cat_id = 51 
           and PCL51.value in (2984) 

         INNER JOIN positions_categories_links AS PCL52 
            ON eshop_pos.id = PCL52.pos_id
           AND PCL52.cat_id = 52 
           and PCL52.value in (3003) 

         INNER JOIN positions_categories_links AS PCL3904 
            ON eshop_pos.id = PCL3904.pos_id
           AND PCL3904.cat_id = 3904 
           and PCL3904.value in (3941) 

         INNER JOIN positions_categories_links AS PCL4462
            ON eshop_pos.id = PCL4462.pos_id
           AND PCL4462.cat_id = 4462 
           and PCL4462.value in (4465) 

         INNER JOIN positions_categories_links AS PCL4466 
            ON eshop_pos.id = PCL4466.pos_id
           AND PCL4466.cat_id = 4466 
           and PCL4466.value in (4468) 

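Now, having proper indexes to help optimize the query is one thing, but repeatedly running a count against this many criteria may be overkill. If you already know a specific level of detail to narrow on, such as a specific category here, that would definitely help.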
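
On the original question of joining positions_categories_links only once: one possible approach for the value filters is to put all (cat_id, value) pairs in a single WHERE clause, group by product, and keep only the products that matched every filter. The following is a sketch under stated assumptions, not a tested drop-in replacement. It covers only the five PCL filters shown above, keeps the two section_id conditions as EXISTS checks, and introduces the aliases pcl, matched, l and c purely for illustration.

```sql
SELECT COUNT(DISTINCT eshop_pos.id)
FROM (
    -- One pass over the link table: keep rows that match any of the filters...
    SELECT pcl.pos_id
    FROM positions_categories_links AS pcl
    WHERE (pcl.cat_id = 51   AND pcl.value IN (2984))
       OR (pcl.cat_id = 52   AND pcl.value IN (3003))
       OR (pcl.cat_id = 3904 AND pcl.value IN (3941))
       OR (pcl.cat_id = 4462 AND pcl.value IN (4465))
       OR (pcl.cat_id = 4466 AND pcl.value IN (4468))
    GROUP BY pcl.pos_id
    -- ...then require that all five distinct filter categories were hit.
    HAVING COUNT(DISTINCT pcl.cat_id) = 5
) AS matched
INNER JOIN eshop_pos
    ON eshop_pos.id = matched.pos_id
INNER JOIN eshop_cats t1
    ON eshop_pos.eshopcatid = t1.id
   AND t1.active = 1
-- The product must have at least one active category in each section.
WHERE EXISTS (
          SELECT 1
          FROM positions_categories_links AS l
          INNER JOIN categories AS c ON c.id = l.cat_id
          WHERE l.pos_id = eshop_pos.id
            AND c.active = 1
            AND c.section_id = 62021
      )
  AND EXISTS (
          SELECT 1
          FROM positions_categories_links AS l
          INNER JOIN categories AS c ON c.id = l.cat_id
          WHERE l.pos_id = eshop_pos.id
            AND c.active = 1
            AND c.section_id = 62023
      );
```

The number in the HAVING clause has to equal the number of filter groups in the WHERE clause, so the application would need to generate both together. Because the filters are OR'd and then counted, a product still qualifies only when every filter matches, which keeps the "all filters must apply" behaviour of the inner joins.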

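Thank you for the solution. After a quick review, I suspect it will output different results: the query above uses inner joins, whereas you suggest using OR. - Andrey Lartsev
@AndreyLartsev, ah... right... ignore the OR'd version, I will remove it, since you want all of the criteria to qualify... - DRapp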
