我有一个表格,长这样:
CREATE TABLE `metric` (
`metricid` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`host` varchar(50) NOT NULL,
`userid` int(10) unsigned DEFAULT NULL,
`lastmetricvalue` double DEFAULT NULL,
`receivedat` int(10) unsigned DEFAULT NULL,
`name` varchar(255) NOT NULL,
`sampleid` tinyint(3) unsigned NOT NULL,
`type` tinyint(3) unsigned NOT NULL DEFAULT '0',
`lastrawvalue` double NOT NULL,
`priority` tinyint(3) unsigned NOT NULL DEFAULT '0',
PRIMARY KEY (`metricid`),
UNIQUE KEY `unique-metric` (`userid`,`host`,`name`,`sampleid`)
) ENGINE=InnoDB AUTO_INCREMENT=1000000221496 DEFAULT CHARSET=utf8
目前它有177,892行,当我运行以下查询:
select metricid, lastrawvalue, receivedat, name, sampleid
FROM metric m
WHERE m.userid = 8
AND (host, name, sampleid) IN (('localhost','0.4350799184758216cpu-3/cpu-nice',0),
('localhost','0.4350799184758216cpu-3/cpu-system',0),
('localhost','0.4350799184758216cpu-3/cpu-idle',0),
('localhost','0.4350799184758216cpu-3/cpu-wait',0),
('localhost','0.4350799184758216cpu-3/cpu-interrupt',0),
('localhost','0.4350799184758216cpu-3/cpu-softirq',0),
('localhost','0.4350799184758216cpu-3/cpu-steal',0),
('localhost','0.4350799184758216cpu-4/cpu-user',0),
('localhost','0.4350799184758216cpu-4/cpu-nice',0),
('localhost','0.4350799184758216cpu-4/cpu-system',0),
('localhost','0.4350799184758216cpu-4/cpu-idle',0),
('localhost','0.4350799184758216cpu-4/cpu-wait',0),
('localhost','0.4350799184758216cpu-4/cpu-interrupt',0),
('localhost','0.4350799184758216cpu-4/cpu-softirq',0),
('localhost','0.4350799184758216cpu-4/cpu-steal',0),
('localhost','_util/billing-bytes',0),('localhost','_util/billing-metrics',0));
返回结果需要0.87秒的时间,解释如下:
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: m
type: ref
possible_keys: unique-metric
key: unique-metric
key_len: 5
ref: const
rows: 85560
Extra: Using where
1 row in set (0.00 sec)
个人资料看起来像这样:
+--------------------------------+----------+
| Status | Duration |
+--------------------------------+----------+
| starting | 0.000160 |
| checking permissions | 0.000010 |
| Opening tables | 0.000021 |
| exit open_tables() | 0.000008 |
| System lock | 0.000008 |
| mysql_lock_tables(): unlocking | 0.000005 |
| exit mysqld_lock_tables() | 0.000007 |
| init | 0.000068 |
| optimizing | 0.000018 |
| statistics | 0.000091 |
| preparing | 0.000042 |
| executing | 0.000005 |
| Sending data | 0.870180 |
| innobase_commit_low():trx_comm | 0.000012 |
| Sending data | 0.000111 |
| end | 0.000009 |
| query end | 0.000009 |
| ha_commit_one_phase(-1) | 0.000015 |
| innobase_commit_low():trx_comm | 0.000004 |
| ha_commit_one_phase(-1) | 0.000005 |
| query end | 0.000005 |
| closing tables | 0.000012 |
| freeing items | 0.000562 |
| logging slow query | 0.000005 |
| cleaning up | 0.000005 |
| sleeping | 0.000006 |
+--------------------------------+----------+
这对我来说似乎太高了。我尝试将第一个查询中的 userid = 8 and (host, name, sampleid) IN
部分替换为 (userid, host, name, sampleid) IN
,这个查询大约运行了0.5s - 快了近2倍,参考以下查询:
select metricid, lastrawvalue, receivedat, name, sampleid
FROM metric m
WHERE (userid, host, name, sampleid) IN ((8,'localhost','0.4350799184758216cpu-3/cpu-nice',0),
(8,'localhost','0.4350799184758216cpu-3/cpu-system',0),
(8,'localhost','0.4350799184758216cpu-3/cpu-idle',0),
(8,'localhost','0.4350799184758216cpu-3/cpu-wait',0),
(8,'localhost','0.4350799184758216cpu-3/cpu-interrupt',0),
(8,'localhost','0.4350799184758216cpu-3/cpu-softirq',0),
(8,'localhost','0.4350799184758216cpu-3/cpu-steal',0),
(8,'localhost','0.4350799184758216cpu-4/cpu-user',0),
(8,'localhost','0.4350799184758216cpu-4/cpu-nice',0),
(8,'localhost','0.4350799184758216cpu-4/cpu-system',0),
(8,'localhost','0.4350799184758216cpu-4/cpu-idle',0),
(8,'localhost','0.4350799184758216cpu-4/cpu-wait',0),
(8,'localhost','0.4350799184758216cpu-4/cpu-interrupt',0),
(8,'localhost','0.4350799184758216cpu-4/cpu-softirq',0),
(8,'localhost','0.4350799184758216cpu-4/cpu-steal',0),
(8,'localhost','_util/billing-bytes',0),
(8,'localhost','_util/billing-metrics',0));
它的解释看起来像这样:
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: m
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 171121
Extra: Using where
1 row in set (0.00 sec)
接下来我已经更新了表格,将其包含为单个连接列:
alter table `metric` add `forindex` varchar(120) not null default '';
update metric set forindex = concat(userid,`host`,`name`,sampleid);
alter table metric add index `forindex` (`forindex`);
更新查询,只搜索一个字符串:
select metricid, lastrawvalue, receivedat, name, sampleid
FROM metric m
WHERE (forindex) IN (('8localhost0.4350799184758216cpu-3/cpu-nice0'),
('8localhost0.4350799184758216cpu-3/cpu-system0'),
('8localhost0.4350799184758216cpu-3/cpu-idle0'),
('8localhost0.4350799184758216cpu-3/cpu-wait0'),
('8localhost0.4350799184758216cpu-3/cpu-interrupt0'),
('8localhost0.4350799184758216cpu-3/cpu-softirq0'),
('8localhost0.4350799184758216cpu-3/cpu-steal0'),
('8localhost0.4350799184758216cpu-4/cpu-user0'),
('8localhost0.4350799184758216cpu-4/cpu-nice0'),
('8localhost0.4350799184758216cpu-4/cpu-system0'),
('8localhost0.4350799184758216cpu-4/cpu-idle0'),
('8localhost0.4350799184758216cpu-4/cpu-wait0'),
('8localhost0.4350799184758216cpu-4/cpu-interrupt0'),
('8localhost0.4350799184758216cpu-4/cpu-softirq0'),
('8localhost0.4350799184758216cpu-4/cpu-steal0'),
('8localhost_util/billing-bytes0'),
('8localhost_util/billing-metrics0'));
现在我可以在0.00秒内得到相同的结果!解释如下:
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: m
type: range
possible_keys: forindex
key: forindex
key_len: 362
ref: NULL
rows: 17
Extra: Using where
1 row in set (0.00 sec)
总结一下,以下是结果:
m.userid = X AND (host, name, sampleid) IN
——使用了索引,扫描了85560行,运行时间为0.9秒。(userid, host, name, sampleid) IN
——未使用索引,扫描了171121行,运行时间为0.5秒。- 用一个连接的实用程序列取代了复合索引的附加列——使用了索引,扫描了17行,运行时间为0秒。
为什么第二个查询比第一个查询运行得更快?为什么第三个查询比其他查询快那么多?我应该保留这样一个列仅仅是为了更快的搜索吗?
MySQL版本为:
mysqld Ver 5.5.34-55 for Linux on x86_64 (Percona XtraDB Cluster (GPL), wsrep_25.9.r3928)
(x and x) or (x and x)
语句,但问题是,为什么where (a,b) in ((x1,x2),(x3,x4))
不起作用呢? - FluffyANALYZE TABLE
以获取所引用表的信息。 - Bill Karwin