我有一个包含每天温度的表格(巨大的表格),以及一个包含周期开始和结束日期的表格(小表格)。现在我想知道每个周期的平均温度,但是查询时间很长。是否可以改进?
注意:升级到版本5.6.19-1~exp1ubuntu2后,响应时间变短,并且可能是由于MySQL 5.6.8之前的版本存在错误(请参见Quassnoi的评论)。
重新构建包含随机数据的日和周期表:
现在,使用GROUP BY 进行查询速度非常缓慢(约为1750毫秒)。
原来,单个时间段的结果可以立即返回(1毫秒):
查询为什么如此缓慢,甚至比获取单个时间段结果要慢(多达10倍),尽管“periods”表只有10行?是否有任何方法可以优化此查询?
编辑:更多信息:
注意:升级到版本5.6.19-1~exp1ubuntu2后,响应时间变短,并且可能是由于MySQL 5.6.8之前的版本存在错误(请参见Quassnoi的评论)。
重新构建包含随机数据的日和周期表:
create table days (
day int not null auto_increment primary key,
temperature float not null);
insert into days values(null,rand()),(null,rand()),
(null,rand()),(null,rand()),(null,rand()),(null,rand()),
(null,rand()),(null,rand()); # 8 rows
insert into days select null, d1.temperature
from days d1, days d2, days d3, days d4,
days d5, days d6, days d7; # 2M rows
create table periods(id int not null auto_increment primary key,
first int not null,
last int not null,
index(first) using btree,
index(last) using btree,
index(first,last) using btree);
# add 10 periods of 1-11 days each
insert into periods(first,last)
select floor(rand(day)*2000000), floor(rand(day)*2000000 + rand()*10)
from days limit 10;
列出每个时间段的每天气温是没有问题的(返回结果在1毫秒内):
select id, temperature
from periods join days on day >= first and day <= last;
现在,使用GROUP BY 进行查询速度非常缓慢(约为1750毫秒)。
# ALT1
select id, avg(temperature)
from periods join days on day >= first and day <= last group by id;
将 <= 和 >= 替换为 BETWEEN 可以略微加快速度(约1600毫秒):
# ALT2
select id, avg(temperature)
from periods join days on day between first and last group by id;
原来,单个时间段的结果可以立即返回(1毫秒):
select id, (select avg(temperature)
from days where day >= first and day <= last) from periods
where id=1;
然而,如果没有WHERE条件,查询需要长达4200毫秒,平均每个周期需要420毫秒!
# ALT3
select id,
(select avg(temperature) from days where day >= first and day <= last)
from periods;
查询为什么如此缓慢,甚至比获取单个时间段结果要慢(多达10倍),尽管“periods”表只有10行?是否有任何方法可以优化此查询?
编辑:更多信息:
mysql> select @@version;
+-------------------------+
| @@version |
+-------------------------+
| 5.5.41-0ubuntu0.14.04.1 |
+-------------------------+
# ALT1
mysql> explain select id, avg(temperature) from periods join days on day >= first and day <= last group by id;
+----+-------------+---------+-------+--------------------+---------+---------+------+---------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+-------+--------------------+---------+---------+------+---------+----------------------------------------------+
| 1 | SIMPLE | periods | index | first,last,first_2 | first_2 | 8 | NULL | 10 | Using index; Using temporary; Using filesort |
| 1 | SIMPLE | days | ALL | PRIMARY | NULL | NULL | NULL | 2097596 | Using where; Using join buffer |
+----+-------------+---------+-------+--------------------+---------+---------+------+---------+----------------------------------------------+
# ALT1 without GROUP BY
mysql> explain select id, temperature from periods join days on day >= first and day <= last;
+----+-------------+---------+-------+--------------------+---------+---------+------+---------+------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+-------+--------------------+---------+---------+------+---------+------------------------------------------------+
| 1 | SIMPLE | periods | index | first,last,first_2 | first_2 | 8 | NULL | 10 | Using index |
| 1 | SIMPLE | days | ALL | PRIMARY | NULL | NULL | NULL | 2097596 | Range checked for each record (index map: 0x1) |
+----+-------------+---------+-------+--------------------+---------+---------+------+---------+------------------------------------------------+
# ALT2
mysql> explain select id, avg(temperature) from periods join days on day between first and last group by id;
+----+-------------+---------+-------+--------------------+---------+---------+------+---------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+-------+--------------------+---------+---------+------+---------+----------------------------------------------+
| 1 | SIMPLE | periods | index | first,last,first_2 | first_2 | 8 | NULL | 10 | Using index; Using temporary; Using filesort |
| 1 | SIMPLE | days | ALL | PRIMARY | NULL | NULL | NULL | 2097596 | Using where; Using join buffer |
+----+-------------+---------+-------+--------------------+---------+---------+------+---------+----------------------------------------------+
# ALT3
mysql> explain select id, (select avg(temperature) from days where day >= first and day <= last) from periods;
+----+--------------------+---------+-------+---------------+---------+---------+------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+---------+-------+---------------+---------+---------+------+---------+-------------+
| 1 | PRIMARY | periods | index | NULL | first_2 | 8 | NULL | 10 | Using index |
| 2 | DEPENDENT SUBQUERY | days | ALL | PRIMARY | NULL | NULL | NULL | 2097596 | Using where |
+----+--------------------+---------+-------+---------------+---------+---------+------+---------+-------------+
# ALT3 with where
mysql> explain select id, (select avg(temperature) from days where day >= first and day <= last) from periods where id = 1;
+----+--------------------+---------+-------+---------------+---------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+---------+-------+---------------+---------+---------+-------+------+-------------+
| 1 | PRIMARY | periods | const | PRIMARY | PRIMARY | 4 | const | 1 | |
| 2 | DEPENDENT SUBQUERY | days | range | PRIMARY | PRIMARY | 4 | NULL | 10 | Using where |
+----+--------------------+---------+-------+---------------+---------+---------+-------+------+-------------+
编辑2:根据Lennart的建议,嵌套查询FROM的执行计划(查询执行时间为3毫秒)
mysql> explain select id,avg(temperature) from (select id,temperature from periods join days on day between first and last) as t group by id;
+----+-------------+------------+-------+--------------------+---------+---------+------+----------+------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+-------+--------------------+---------+---------+------+----------+------------------------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 50 | Using temporary; Using filesort |
| 2 | DERIVED | periods | index | first,last,first_2 | first_2 | 8 | NULL | 10 | Using index |
| 2 | DERIVED | days | range | PRIMARY,day | PRIMARY | 4 | NULL | 5 | Range checked for each record (index map: 0x3) |
+----+-------------+------------+-------+--------------------+---------+---------+------+----------+------------------------------------------------+
id = 1
比其他 id 更快。 - Gordon Linoff