我有一个表格,其中包含列:
CREATE TABLE aggregates (
a VARHCAR,
b VARCHAR,
c VARCHAR,
metric INT
KEY test (a, b, c, metric)
);
如果我执行这样的查询:
SELECT b, c, SUM(metric) metric
FROM aggregates
WHERE a IN ('a', 'couple', 'of', 'values')
GROUP BY b, c
ORDER BY b, c
这个查询需要10秒钟,解释如下:
+----+-------------+------------+-------+---------------+------+---------+------+--------+-----------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+-------+---------------+------+---------+------+--------+-----------------------------------------------------------+
| 1 | SIMPLE | aggregates | range | test | test | 767 | NULL | 582383 | Using where; Using index; Using temporary; Using filesort |
+----+-------------+------------+-------+---------------+------+---------+------+--------+-----------------------------------------------------------+
如果我还按列a分组/排序,这样就不需要临时/文件排序,但是然后在另一个查询中我自己做同样的事情:
SELECT b, c, SUM(metric) metric
FROM (
SELECT a, b, c, SUM(metric) metric
FROM aggregates
WHERE a IN ('a', 'couple', 'of', 'values')
GROUP BY a, b, c
ORDER BY a, b, c
) t
GROUP BY b, c
ORDER BY b, c
查询耗时1秒,执行计划如下:
+----+-------------+------------+-------+---------------+------+---------+------+--------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+-------+---------------+------+---------+------+--------+---------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 252 | Using temporary; Using filesort |
| 2 | DERIVED | aggregates | range | test | test | 767 | NULL | 582383 | Using where; Using index |
+----+-------------+------------+-------+---------------+------+---------+------+--------+---------------------------------+
为什么会这样呢?如果我在一个查询中进行分组,而不是在外部查询中进行分组,为什么后者更快呢?
metric
吗? - ypercubeᵀᴹGROUP BY b,c
时,不需要声明ORDER BY b,c
。 - ypercubeᵀᴹ