我有以下查询:
SELECT
analytics.source AS referrer,
COUNT(analytics.id) AS frequency,
SUM(IF(transactions.status = 'COMPLETED', 1, 0)) AS sales
FROM analytics
LEFT JOIN transactions ON analytics.id = transactions.analytics
WHERE analytics.user_id = 52094
GROUP BY analytics.source
ORDER BY frequency DESC
LIMIT 10
分析表有 60M 行,交易表有 3M 行。
当我在此查询上运行 EXPLAIN
命令时,得到如下结果:
+------+--------------+-----------------+--------+---------------------+-------------------+----------------------+---------------------------+----------+-----------+-------------------------------------------------+
| # id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | |
+------+--------------+-----------------+--------+---------------------+-------------------+----------------------+---------------------------+----------+-----------+-------------------------------------------------+
| '1' | 'SIMPLE' | 'analytics' | 'ref' | 'analytics_user_id | analytics_source' | 'analytics_user_id' | '5' | 'const' | '337662' | 'Using where; Using temporary; Using filesort' |
| '1' | 'SIMPLE' | 'transactions' | 'ref' | 'tran_analytics' | 'tran_analytics' | '5' | 'dijishop2.analytics.id' | '1' | NULL | |
+------+--------------+-----------------+--------+---------------------+-------------------+----------------------+---------------------------+----------+-----------+-------------------------------------------------+
我不知道如何优化这个查询,因为它已经非常基本了。运行这个查询大约需要70秒。
以下是现有的索引:
+-------------+-------------+----------------------------+---------------+------------------+------------+--------------+-----------+---------+--------+-------------+----------+----------------+
| # Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------------+-------------+----------------------------+---------------+------------------+------------+--------------+-----------+---------+--------+-------------+----------+----------------+
| 'analytics' | '0' | 'PRIMARY' | '1' | 'id' | 'A' | '56934235' | NULL | NULL | '' | 'BTREE' | '' | '' |
| 'analytics' | '1' | 'analytics_user_id' | '1' | 'user_id' | 'A' | '130583' | NULL | NULL | 'YES' | 'BTREE' | '' | '' |
| 'analytics' | '1' | 'analytics_product_id' | '1' | 'product_id' | 'A' | '490812' | NULL | NULL | 'YES' | 'BTREE' | '' | '' |
| 'analytics' | '1' | 'analytics_affil_user_id' | '1' | 'affil_user_id' | 'A' | '55222' | NULL | NULL | 'YES' | 'BTREE' | '' | '' |
| 'analytics' | '1' | 'analytics_source' | '1' | 'source' | 'A' | '24604' | NULL | NULL | 'YES' | 'BTREE' | '' | '' |
| 'analytics' | '1' | 'analytics_country_name' | '1' | 'country_name' | 'A' | '39510' | NULL | NULL | 'YES' | 'BTREE' | '' | '' |
| 'analytics' | '1' | 'analytics_gordon' | '1' | 'id' | 'A' | '56934235' | NULL | NULL | '' | 'BTREE' | '' | '' |
| 'analytics' | '1' | 'analytics_gordon' | '2' | 'user_id' | 'A' | '56934235' | NULL | NULL | 'YES' | 'BTREE' | '' | '' |
| 'analytics' | '1' | 'analytics_gordon' | '3' | 'source' | 'A' | '56934235' | NULL | NULL | 'YES' | 'BTREE' | '' | '' |
+-------------+-------------+----------------------------+---------------+------------------+------------+--------------+-----------+---------+--------+-------------+----------+----------------+
+----------------+-------------+-------------------+---------------+-------------------+------------+--------------+-----------+---------+--------+-------------+----------+----------------+
| # Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+----------------+-------------+-------------------+---------------+-------------------+------------+--------------+-----------+---------+--------+-------------+----------+----------------+
| 'transactions' | '0' | 'PRIMARY' | '1' | 'id' | 'A' | '2436151' | NULL | NULL | '' | 'BTREE' | '' | '' |
| 'transactions' | '1' | 'tran_user_id' | '1' | 'user_id' | 'A' | '56654' | NULL | NULL | '' | 'BTREE' | '' | '' |
| 'transactions' | '1' | 'transaction_id' | '1' | 'transaction_id' | 'A' | '2436151' | '191' | NULL | 'YES' | 'BTREE' | '' | '' |
| 'transactions' | '1' | 'tran_analytics' | '1' | 'analytics' | 'A' | '2436151' | NULL | NULL | 'YES' | 'BTREE' | '' | '' |
| 'transactions' | '1' | 'tran_status' | '1' | 'status' | 'A' | '22' | NULL | NULL | 'YES' | 'BTREE' | '' | '' |
| 'transactions' | '1' | 'gordon_trans' | '1' | 'status' | 'A' | '22' | NULL | NULL | 'YES' | 'BTREE' | '' | '' |
| 'transactions' | '1' | 'gordon_trans' | '2' | 'analytics' | 'A' | '2436151' | NULL | NULL | 'YES' | 'BTREE' | '' | '' |
+----------------+-------------+-------------------+---------------+-------------------+------------+--------------+-----------+---------+--------+-------------+----------+----------------+
在没有添加任何额外索引的情况下,两个表的简化架构如下,因为建议添加索引并没有改善情况。
CREATE TABLE `analytics` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(11) DEFAULT NULL,
`affil_user_id` int(11) DEFAULT NULL,
`product_id` int(11) DEFAULT NULL,
`medium` varchar(45) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`source` varchar(45) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`terms` varchar(1024) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`is_browser` tinyint(1) DEFAULT NULL,
`is_mobile` tinyint(1) DEFAULT NULL,
`is_robot` tinyint(1) DEFAULT NULL,
`browser` varchar(45) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`mobile` varchar(45) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`robot` varchar(45) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`platform` varchar(45) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`referrer` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`domain` varchar(45) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`ip` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`continent_code` varchar(10) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`country_name` varchar(100) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`city` varchar(100) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
KEY `analytics_user_id` (`user_id`),
KEY `analytics_product_id` (`product_id`),
KEY `analytics_affil_user_id` (`affil_user_id`)
) ENGINE=InnoDB AUTO_INCREMENT=64821325 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
CREATE TABLE `transactions` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`transaction_id` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`user_id` int(11) NOT NULL,
`pay_key` varchar(50) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`sender_email` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`amount` decimal(10,2) DEFAULT NULL,
`currency` varchar(10) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`status` varchar(50) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`analytics` int(11) DEFAULT NULL,
`ip_address` varchar(46) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`session_id` varchar(60) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`eu_vat_applied` int(1) DEFAULT '0',
PRIMARY KEY (`id`),
KEY `tran_user_id` (`user_id`),
KEY `transaction_id` (`transaction_id`(191)),
KEY `tran_analytics` (`analytics`),
KEY `tran_status` (`status`)
) ENGINE=InnoDB AUTO_INCREMENT=10019356 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
如果以上内容已经无法进一步优化,那么任何关于摘要表的实现建议都将是很好的。我们在AWS上使用LAMP堆栈。上述查询正在RDS(m1.large)上运行。
SHOW CREATE TABLE
的原因是,如果有人想在沙盒实例上尝试您的表格,他们必须费力地猜测您的列和索引来重新创建表格。从您的SHOW INDEXES
中可以拼凑出类似于您的表格的东西,但这需要太多的工作,而且我不能确定它是否正确。我不会花时间去做那些事情。祝你好运! - Bill KarwinGROUP BY
子句,查询性能会发生什么变化?(我知道这样做不会得到想要的结果;关键是要弄清楚GROUP BY ... LIMIT...
是否占用了大量时间。) - O. Jonesa LEFT JOIN b
查询时,COUNT(a.id)
有点奇怪。它计算来自b
的匹配行数,并对没有在b
中找到匹配行的每一行a
计数为1
。这是你想要的吗?对我而言,这听起来像是难以向用户解释清楚的事情。在那个COUNT
操作中完美无缺非常重要,因为之后你会用它进行GROUP BY ... LIMIT ...
操作。 - O. Jones