I have a MySQL table with 100k rows of server log records, defined as follows:
CREATE TABLE `logs` (
  `id` INT NOT NULL AUTO_INCREMENT,
  `ip` VARCHAR(16) NULL,
  `date` DATETIME NULL,
  `session_time` SMALLINT UNSIGNED NULL,
  PRIMARY KEY (`id`));
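I am trying to compute the session time, i.e. the time difference between consecutive rows with the same IP address. The following SELECT query does this and runs in under a second: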
SELECT * FROM logs AS a
LEFT JOIN (
    SELECT id,
           from_unixtime(@diff) AS starttime,
           date AS endtime,
           IF(@diff = 0, 0, (unix_timestamp(date) - @diff)/60) AS session_time1,
           @diff := unix_timestamp(date)
    FROM logs,
         (SELECT @diff := 0) AS x
    ORDER BY ip, logs.date
) AS b ON
    a.id = b.id
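A side note on the query above: @diff is never reset when ip changes, so the first row of each IP is measured against the last row of the previous IP, and MySQL does not guarantee the order in which user-variable assignments in a select list are evaluated. A minimal sketch of the same per-IP calculation with window functions, assuming MySQL 8.0 or later (the window name w is introduced here for illustration only):

```sql
-- Sketch only: assumes MySQL 8.0+ (window functions are not available in 5.x).
-- Column aliases mirror the original query; `w` is a hypothetical window name.
SELECT id,
       LAG(`date`) OVER w AS starttime,   -- previous row of the same IP, NULL for the first one
       `date`             AS endtime,
       -- minutes since the previous row of the same IP, 0 for the first row of each IP
       COALESCE(TIMESTAMPDIFF(SECOND, LAG(`date`) OVER w, `date`) / 60, 0) AS session_time1
FROM logs
WINDOW w AS (PARTITION BY ip ORDER BY `date`, id);
```

Because the window is partitioned by ip, the difference can never cross from one IP to the next, which also keeps the result non-negative for the SMALLINT UNSIGNED target column.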
However, when I use the same subquery in an UPDATE join to store the session times, the following UPDATE query takes more than 600 seconds:
UPDATE logs AS a
LEFT JOIN (
    SELECT id,
           from_unixtime(@diff) AS starttime,
           date AS endtime,
           IF(@diff = 0, 0, (unix_timestamp(date) - @diff)/60) AS session_time1,
           @diff := unix_timestamp(date)
    FROM logs,
         (SELECT @diff := 0) AS x
    ORDER BY ip, logs.date
) AS b ON
    a.id = b.id
SET session_time = session_time1;
What am I missing?
Thanks!
Update: here is the EXPLAIN of the SELECT:
+----+-------------+------------+--------+---------------+------+--------+
| id | select_type | table | type | possible_keys | key | rows |
+----+-------------+------------+--------+---------------+------+--------+
| 1 | PRIMARY | a | ALL | NULL | NULL | 109029 |
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | 108680 |
| 2 | DERIVED | <derived3> | system | NULL | NULL | 1 |
| 2 | DERIVED | logs | ALL | NULL | NULL | 109029 |
| 3 | DERIVED | NULL | NULL | NULL | NULL | NULL |
+----+-------------+------------+--------+---------------+------+--------+
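One plausible reading of this plan is that the derived table is joined back to logs without any usable index (key is NULL on both sides). A hedged sketch of a common workaround is to materialize the subquery into a temporary table, index it on id, and update from that. The names tmp_sessions and last_ts are hypothetical, and the sketch keeps the original user-variable logic unchanged, including its caveats:

```sql
-- Sketch only: tmp_sessions and last_ts are illustrative names, not from the question.
-- Step 1: materialize the per-row session times once.
CREATE TEMPORARY TABLE tmp_sessions AS
SELECT id,
       IF(@diff = 0, 0, (UNIX_TIMESTAMP(`date`) - @diff) / 60) AS session_time1,
       @diff := UNIX_TIMESTAMP(`date`) AS last_ts
FROM logs,
     (SELECT @diff := 0) AS x
ORDER BY ip, logs.`date`;

-- Step 2: give the join an index to use.
ALTER TABLE tmp_sessions ADD PRIMARY KEY (id);

-- Step 3: update through the indexed join.
UPDATE logs AS a
JOIN tmp_sessions AS b ON a.id = b.id
SET a.session_time = b.session_time1;

DROP TEMPORARY TABLE tmp_sessions;
```

Whether this removes the 600-second runtime depends on the MySQL version and configuration, so it is worth comparing EXPLAIN for the UPDATE before and after.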
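WHERE ip = '...'? It looks like you are updating all 100k entries, not only the ones you selected (LEFT JOIN = elements that do not meet the first set's selection rules). Try using an INNER JOIN? - Alejandro Iván

INNER JOIN, I was just wondering whether it would do better than LEFT JOIN. When you have two intersecting sets (say A, B, and A&B as their intersection), an INNER JOIN gives you the elements of A&B (everything that matches your ON condition and belongs to both sets). A LEFT JOIN also gives you the elements of A that do not match your ON condition, and a RIGHT JOIN does the same for B. So if your SQL statement only needs records that match in both sets, INNER JOIN should be better (assuming I understood your question). - Alejandro Iván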
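The join semantics described in the comment can be checked on a toy example. This is a minimal sketch in which set_a and set_b are hypothetical tables standing in for the sets A and B:

```sql
-- Hypothetical toy tables: A = {1, 2, 3}, B = {2, 3, 4}, so A&B = {2, 3}.
CREATE TEMPORARY TABLE set_a (id INT PRIMARY KEY);
CREATE TEMPORARY TABLE set_b (id INT PRIMARY KEY);
INSERT INTO set_a VALUES (1), (2), (3);
INSERT INTO set_b VALUES (2), (3), (4);

-- INNER JOIN: only the matching pairs (2, 2) and (3, 3).
SELECT a.id AS a_id, b.id AS b_id
FROM set_a AS a INNER JOIN set_b AS b ON a.id = b.id
ORDER BY a.id;

-- LEFT JOIN: every row of A; the unmatched row comes back as (1, NULL).
SELECT a.id AS a_id, b.id AS b_id
FROM set_a AS a LEFT JOIN set_b AS b ON a.id = b.id
ORDER BY a.id;

-- RIGHT JOIN: every row of B; the unmatched row comes back as (NULL, 4).
SELECT a.id AS a_id, b.id AS b_id
FROM set_a AS a RIGHT JOIN set_b AS b ON a.id = b.id
ORDER BY b.id;
```

In the question's query the derived table is built from every row of logs with no WHERE clause, so each a.id should find a match and the two join types would return the same rows; any speed difference would then come from the execution plan rather than from the result set.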