我有关于MySQL查询性能的问题。
表(InnoDB):
+--------------------+---------------------+------+-----+-------------------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------------------+---------------------+------+-----+-------------------+-------+
| st_resource_id | varchar(32) | NO | MUL | NULL | |
| st_sub_resource_id | varchar(32) | YES | | NULL | |
| st_title | varchar(500) | YES | | NULL | |
| st_resource_type | varchar(100) | NO | MUL | NULL | |
| st_site_id | tinyint(4) | NO | MUL | NULL | |
| st_time | timestamp | NO | MUL | CURRENT_TIMESTAMP | |
| st_user_id | int(10) unsigned | YES | | NULL | |
| st_full_access | tinyint(1) unsigned | YES | | NULL | |
+--------------------+---------------------+------+-----+-------------------+-------+
索引:
+---------------+------------+------------------+--------------+--------------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+---------------+------------+------------------+--------------+--------------------+-----------+-------------+----------+--------+------+------------+---------+
| nr_statistics | 1 | resource_id | 1 | st_resource_id | A | 1546165 | NULL | NULL | | BTREE | |
| nr_statistics | 1 | resource_id | 2 | st_sub_resource_id | A | 1546165 | NULL | NULL | YES | BTREE | |
| nr_statistics | 1 | st_time | 1 | st_time | A | 1546165 | NULL | NULL | | BTREE | |
| nr_statistics | 1 | st_site_id | 1 | st_site_id | A | 16 | NULL | NULL | | BTREE | |
| nr_statistics | 1 | st_resource_type | 1 | st_resource_type | A | 16 | 10 | NULL | | BTREE | |
+---------------+------------+------------------+--------------+--------------------+-----------+-------------+----------+--------+------+------------+---------+
查询:
SELECT st_resource_id AS docId, count(*) AS cnt
FROM nr_statistics
WHERE
st_resource_type = 'document'
AND st_sub_resource_id = 'text'
AND st_time > DATE_SUB(NOW(), INTERVAL 7 DAY)
AND st_site_id = 1
GROUP BY st_resource_id
ORDER BY cnt DESC
LIMIT 0, 5;
查询计划:
+----+-------------+---------------+-------+-------------------------------------+-------------+---------+------+---------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------------+-------+-------------------------------------+-------------+---------+------+---------+----------------------------------------------+
| 1 | SIMPLE | nr_statistics | index | st_time,st_site_id,st_resource_type | resource_id | 197 | NULL | 1581044 | Using where; Using temporary; Using filesort |
+----+-------------+---------------+-------+-------------------------------------+-------------+---------+------+---------+----------------------------------------------+
表格有大约1,666,383行。查询运行非常缓慢。在MySQL进程列表中,我长时间看到这个查询处于“复制到临时表阶段”(> 1分钟)。查询生成大量的I/O负载。我不明白该怎么做来解决问题并加快查询执行速度。
如果问题是错误索引导致的,那么正确的索引是什么?
更新. 我创建了一个新的组合索引:
| nr_statistics | 1 | st_site_id_2 | 1 | st_site_id | A | 16 | NULL | NULL | | BTREE | |
| nr_statistics | 1 | st_site_id_2 | 2 | st_resource_type | A | 16 | NULL | NULL | | BTREE | |
| nr_statistics | 1 | st_site_id_2 | 3 | st_sub_resource_id | A | 752018 | NULL | NULL | YES | BTREE | |
| nr_statistics | 1 | st_site_id_2 | 4 | st_time | A | 1504037 | NULL | NULL | | BTREE | |
| nr_statistics | 1 | st_site_id_2 | 5 | st_resource_id | A | 1504037 | NULL | NULL | | BTREE | |
现在的查询计划是:
+----+-------------+---------------+-------+---------------+--------------+---------+------+-------+-----------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------------+-------+---------------+--------------+---------+------+-------+-----------------------------------------------------------+
| 1 | SIMPLE | nr_statistics | range | st_site_id_2 | st_site_id_2 | 406 | NULL | 21168 | Using where; Using index; Using temporary; Using filesort |
+----+-------------+---------------+-------+---------------+--------------+---------+------+-------+-----------------------------------------------------------+
现在查询速度非常快(0.0x秒),但是我必须强制使用新索引:
SELECT st_resource_id as docId, count( * ) AS Cnt
FROM nr_statistics
USE INDEX (st_site_id_2)
WHERE st_resource_type = 'document'
AND st_sub_resource_id = 'text'
AND st_time > DATE_SUB( NOW( ) , INTERVAL 7 DAY )
AND st_site_id = 1
GROUP BY st_resource_id
ORDER BY cnt DESC
LIMIT 0 , 5;
虽然问题已解决(虽不太美观但有效),但我仍有一些未解决的问题(请参见评论)。