需要帮助提高MYSQL子查询性能

6
我正在学习MYSQL,我有一个像这样的MySql子查询:
EXPLAIN EXTENDED SELECT brand_name, stars, hh_stock, hh_stock_value, sales_monthly_1, sales_monthly_2, sales_monthly_3, sold_monthly_1, sold_monthly_2,
  sold_monthly_3, price_uvp, price_ecp, price_default, price_margin AS margin, vc_percent as vc, cogs, products_length, products_id, material_expenses,
  MAX(price) AS products_price, SUM(total_sales) AS total_sales,
  IFNULL(MAX(active_age), DATEDIFF(NOW(), products_date_added)) AS products_age, DATEDIFF(NOW(), products_date_added) AS jng_products_age,
  AVG(sales_weekly) AS sales_weekly, AVG(sales_monthly) AS sales_monthly, SUM(total_sold) AS total_sold, SUM(total_returned) AS total_returned,
  ((SUM(total_returned)/SUM(total_sold)) * 100) AS returned_rate
FROM
  (
    SELECT p.products_id, jc.price, jc.price_end_customer AS price_ecp, jc.total_sales, jc.active_age, jc.sales_weekly,
      jc.sales_monthly, jc.total_sold, jc.total_returned, jc.price_uvp, p.price_margin, p.vc_percent, p.material_expenses,
      p.products_date_added, p.stars , pb.brand_name, p.family_id, p.products_price_default AS price_default, pl.sales_monthly_1,
      pl.sales_monthly_2, pl.sales_monthly_3, pl.sold_monthly_1, pl.sold_monthly_2, pl.sold_monthly_3, pst.stock AS hh_stock,
      (pst.stock * p.average_stock_value) AS hh_stock_value, pnc.products_length,
      IF(ploc.cogs IS NULL OR ploc.cogs=0,
         (CASE p.complexity
          WHEN 'F' THEN ROUND(5*(p.material_expenses+(7.5/100*p.material_expenses)+1.7+0.25+2.2)/100+(p.material_expenses+(7.5/100*p.material_expenses)+1.7+0.25+2.2),2)
          WHEN 'E' THEN ROUND(5*(p.material_expenses+(7.5/100*p.material_expenses)+1.7+0.25+2.2)/100+(p.material_expenses+(7.5/100*p.material_expenses)+1.7+0.25+2.2),2)
          WHEN 'N' THEN ROUND(5*(p.material_expenses+(7.5/100*p.material_expenses)+2.4+0.25+2.2)/100+(p.material_expenses+(7.5/100*p.material_expenses)+2.4+0.25+2.2),2)
          WHEN 'M' THEN ROUND(5*(p.material_expenses+(7.5/100*p.material_expenses)+2.4+0.25+2.2)/100+(p.material_expenses+(7.5/100*p.material_expenses)+2.4+0.25+2.2),2)
          WHEN 'I' THEN ROUND(5*(p.material_expenses+(7.5/100*p.material_expenses)+3.5+0.25+2.2)/100+(p.material_expenses+(7.5/100*p.material_expenses)+3.5+0.25+2.2),2)
          WHEN 'H' THEN ROUND(5*(p.material_expenses+(7.5/100*p.material_expenses)+3.5+0.25+2.2)/100+(p.material_expenses+(7.5/100*p.material_expenses)+3.5+0.25+2.2),2)
          ELSE ROUND(5*(p.material_expenses+(7.5/100*p.material_expenses)+5+0.25+2.2)/100+(p.material_expenses+(7.5/100*p.material_expenses)+5+0.25+2.2),2) END), ploc.cogs) AS cogs
    FROM products p
      LEFT  JOIN jng_sp_catalog jc ON  jc.products_id=p.products_id
      LEFT JOIN products_description pd ON pd.products_id = p.products_id AND pd.language_id = 2
      LEFT JOIN products_description2 pd2 ON pd2.products_id = p.products_id
      LEFT JOIN products_brand pb ON pb.products_brand_id = p.products_brand_id
      LEFT JOIN products_log pl ON pl.products_id = p.products_id
      LEFT JOIN products_log_static pls ON pls.products_id=p.products_id
      LEFT JOIN products_local ploc ON ploc.products_id = p.products_id
      LEFT JOIN products_non_configurator pnc ON pnc.products_id = p.products_id
      INNER JOIN
      (
        SELECT shp.products_id, CONCAT(',', GROUP_CONCAT(shp.styles_id), ',') AS styles_id
        FROM styles_has_products shp GROUP BY shp.products_id HAVING styles_id NOT LIKE '%,1967,%') subquery_styles ON subquery_styles.products_id = p.products_id
      LEFT JOIN products_stock_temp pst ON pst.products_id=p.products_id WHERE p.active_status='1'  AND p.categories_top_id  =  '1') dt  GROUP BY products_id ORDER BY  products_id;

解释的结果是这样的:
+----+-------------+------------+------------+--------+---------------------+-------------+---------+------------------------------------+--------+----------+----------------------------------------------+
| id | select_type | table      | partitions | type   | possible_keys       | key         | key_len | ref                                | rows   | filtered | Extra                                        |
+----+-------------+------------+------------+--------+---------------------+-------------+---------+------------------------------------+--------+----------+----------------------------------------------+
|  1 | PRIMARY     | p          | NULL       | ALL    | PRIMARY             | NULL        | NULL    | NULL                               |  40458 |     1.00 | Using where; Using temporary; Using filesort |
|  1 | PRIMARY     | pb         | NULL       | eq_ref | PRIMARY             | PRIMARY     | 4       | manobo_central.p.products_brand_id |      1 |   100.00 | NULL                                         |
|  1 | PRIMARY     | ploc       | NULL       | eq_ref | PRIMARY             | PRIMARY     | 4       | manobo_central.p.products_id       |      1 |   100.00 | NULL                                         |
|  1 | PRIMARY     | pl         | NULL       | eq_ref | PRIMARY             | PRIMARY     | 4       | manobo_central.p.products_id       |      1 |   100.00 | Using where                                  |
|  1 | PRIMARY     | pls        | NULL       | eq_ref | PRIMARY             | PRIMARY     | 4       | manobo_central.p.products_id       |      1 |   100.00 | Using index                                  |
|  1 | PRIMARY     | pst        | NULL       | eq_ref | PRIMARY             | PRIMARY     | 4       | manobo_central.p.products_id       |      1 |   100.00 | NULL                                         |
|  1 | PRIMARY     | pd2        | NULL       | eq_ref | PRIMARY             | PRIMARY     | 4       | manobo_central.p.products_id       |      1 |   100.00 | Using index                                  |
|  1 | PRIMARY     | pnc        | NULL       | eq_ref | PRIMARY             | PRIMARY     | 4       | manobo_central.p.products_id       |      1 |   100.00 | Using where                                  |
|  1 | PRIMARY     | pd         | NULL       | eq_ref | PRIMARY             | PRIMARY     | 8       | manobo_central.p.products_id,const |      1 |   100.00 | Using index                                  |
|  1 | PRIMARY     | jc         | NULL       | ref    | products_id         | products_id | 4       | manobo_central.p.products_id       |      4 |   100.00 | Using where                                  |
|  1 | PRIMARY     | <derived3> | NULL       | ref    | <auto_key0>         | <auto_key0> | 4       | manobo_central.p.products_id       |     10 |   100.00 | Using where                                  |
|  3 | DERIVED     | shp        | NULL       | index  | PRIMARY,products_id | PRIMARY     | 8       | NULL                               | 208226 |   100.00 | Using index; Using filesort                  |
+----+-------------+------------+------------+--------+---------------------+-------------+---------+------------------------------------+--------+----------+----------------------------------------------+

我已经考虑了几种选择:
  1. 我将放弃子查询,使用VIEWS来输出数据,就像使用查询一样。因为我在FROM子句中使用了子查询,所以我将从VIEWS中使用VIEWS。但有人说这会影响性能。你们怎么看?
  2. 我仍然会使用子查询,但会尝试搜索如何优化查询。对于这个问题,我想问一下,EXPLAIN TABLE中的第一行结果显示表production p,类型为“all”,如何避免“all”?我已经设法将其他表的类型设置为“eq_ref”,但仍然不知道为什么产品表是“all”?

再次询问您的意见, 您认为我需要切换到VIEW吗?还是再试着优化子查询。

非常感谢!

编辑:products索引表

create index family_id on products (family_id);
create index idx_products_date_added on products (products_date_added);
create index material_expenses on products (material_expenses);
create index products_brand_id on products (products_brand_id);
create index products_ean on products (products_ean);
create index products_status on products (products_status);
create index tb_status on products (tb_status);

编辑:表格样式_has_产品
CREATE TABLE `styles_has_products` (
  `styles_id` int(10) unsigned NOT NULL DEFAULT '0',
  `products_id` int(10) unsigned NOT NULL DEFAULT '0',
  `date_added` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  PRIMARY KEY (`styles_id`,`products_id`),
  KEY `products_id` (`products_id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1

你在产品表上有哪些索引? - Shadow
@Shadow 我已经添加了产品索引。请查看主帖。谢谢! - Erick
请问您能否发布表的唯一键?我是想了解,是否可以有多个product_id对应多个products_description2,多个product_id对应多个products_local等等。我猜测products_description的唯一键是products_id + language,这样我们就可以通过查询得到一个产品的确切描述,是吗?但其他表呢? - Thorsten Kettner
顺便说一下,视图不会影响性能。它们只是为了方便,这样您就不必一遍又一遍地输入相同的内容。(好吧,这并不完全正确;视图可以影响性能,但只会以负面方式。它们永远不会更快,但有时会更慢,例如当它们嵌套或聚合时,因此优化器在应用标准时会出现问题。) - Thorsten Kettner
一些格式化会很好。 - Strawberry
3个回答

1

首先,永远不要为实时使用编写如此复杂的查询。我建议进行批处理并维护数据仓库,并在数据仓库上使用实时查询。

仍然有许多事情您不应该在实时使用SQL查询中做以获得更好的性能。例如,永远不要使用更多的联接操作,永远不要放置更多的if else条件,尤其是如果表很大,永远不要应用group by,寻找适当的索引和分区结构。


我不同意。我不认为这个查询复杂。而且,为什么不使用SQL按组聚合数据呢?这是SQL的一个优势。通常一个查询涉及到很多表。再说一遍:为什么不连接这些表呢?关系型数据库就是关于这个的,相关的表通过连接来选择它们的数据。 - Thorsten Kettner

0
我注意到的第一件事是你的subquery_styles。你除了用它来过滤之外,没有使用它的结果。在我看来,标准应该放在WHERE子句中。由于你似乎想排除存在style_id 1967的产品,我建议使用NOT EXISTSNOT IN
WHERE p.active_status = 1
AND p.categories_top_id = 1
AND p.products_id NOT IN
(
  SELECT products_id
  FROM styles_has_products
  WHERE styles_id = 1967
)

第二件事是您的查询没有适当的索引。 您正在选择具有active_status 1和categories_top_id 1的产品,但这些列上没有索引。 随着第三个条件的product_id不匹配style_id 1967,我建议使用以下其中一种索引:
create index idx1 on products (active_status, categories_top_id, products_id);
create index idx2 on products (categories_top_id, active_status, products_id);

创建两个,看哪一个正在使用,然后删除另一个。

最后一个可能需要优化/更改的点是聚合。但是为了在这里提供帮助,我必须知道表的唯一键。一旦您发布它们,我将扩展此答案:-)


0

借鉴Thorsten的建议,不要使用NOT IN ( SELECT )的方法,而是使用以下方式:

NOT EXISTS( SELECT * FROM styles_has_products
       WHERE products_id = p.products_id
         AND styles_id = 1967 )

styles_has_products需要以任意顺序包含INDEX(products_id,styles_id)

请向我们展示SHOW CREATE TABLE styles_has_products。如果它是一个多对多的映射表,请参见这里的提示here

索引需要在您要进入的表上而不是从中返回。因此,products的索引列表可能不会被使用。这个复合索引可能有用:

INDEX(categories_top_id, active_status) -- in either order

VIEWs只是语法糖,它们本身并不提供任何性能优势。在某些情况下,它们会影响性能。

pdpd2pls和其他一些内容都没有被使用,可以删除它们的JOINs

SUMsAVGs可能会出现错误。这是因为JOIN+GROUP BY导致了"爆炸-合并"的问题。先清理一些其他的东西,然后我们可以讨论如何重新安排事物,使得SUMsAVGs只针对每个product_id进行一次计算。


嗨Rick,感谢你的帮助和建议。 我已经在上面的描述中更新了表格style_has_products。 我会先尝试理解你的帖子,并跟进你的答案。 再次感谢。 - Erick
你应该从MyISAM转移到InnoDB。 - Rick James

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接