MySQL inner join on the most recent date (today, yesterday, or earlier)

Question

4

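I'm trying to pull the latest pricing data from an inner-joined table. Prices are updated throughout the day, but not necessarily at midnight.

The following query works well once the day's prices have been loaded. But if today's data is missing, how do I get yesterday's data instead?

I index on a column with this format: date_itemnumber => 2015-05-22_12341234.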

SELECT h.*, collection.*, history.price
FROM collection
INNER JOIN h ON collection.itemid=h.id
INNER JOIN history ON collection.itemid=history.itemid 
AND concat('2015-05-23_',collection.itemid)=history.date_itemid
WHERE h.description LIKE '%Awesome%'

Production query time: 0.046 seconds

To be clear, I want it to pick up the most recent record for each item, whether that is from today, yesterday, or earlier.

SQLFiddle1

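The following query gives me the result I want, but on my production data set it takes more than 3 minutes to return. As my data set keeps growing, this approach will stop being workable.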

SELECT h.*, collection.*, history.price
FROM collection
INNER JOIN h ON collection.itemid=h.id
INNER JOIN history ON collection.itemid=history.itemid 
AND (select history.date_itemid from history WHERE itemid=collection.itemid GROUP BY date_itemid DESC LIMIT 1)=history.date_itemid 
WHERE h.description LIKE '%Awesome%'

Production query time: 181.140 seconds

SQLFiddle2

- Andrew

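Can you provide some table entries? Or even a sqlFiddle? - Alex Tartan

Use DATE_SUB(CURDATE(), INTERVAL 1 DAY) to get yesterday's date, then use LIKE with % if your date_itemnumber is a string (which it seems to be). But as Alex asked, we need more information about the tables. - Falt4rm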
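A minimal sketch of Falt4rm's suggestion, assuming `date_itemid` is a string of the form `YYYY-MM-DD_itemid` as described in the question. It builds yesterday's key with `DATE_SUB` instead of hard-coding the date. It still targets one fixed day, so an item whose latest price is older than yesterday would not be returned.

```sql
-- Sketch only: builds yesterday's key, e.g. '2015-05-22_12341234' when run on 2015-05-23.
SELECT h.*, collection.*, history.price
FROM collection
INNER JOIN h ON collection.itemid = h.id
INNER JOIN history
        ON history.itemid = collection.itemid
       AND history.date_itemid = CONCAT(
             DATE_FORMAT(DATE_SUB(CURDATE(), INTERVAL 1 DAY), '%Y-%m-%d'),
             '_',
             collection.itemid)
WHERE h.description LIKE '%Awesome%';
```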

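ORDER BY date_itemnumber DESC LIMIT 1 will fetch the last record. - Zast

I've simplified the data a lot, but I do need data from collection, h, and history. The query above works perfectly when I have data for the current date. - Andrew

I've updated my SQL Fiddle. There was a mistake before, but now you can see the result I want. However, if you run the query tomorrow, before tomorrow's pricing data has been populated, you will get 0 results. - Andrew

The following gets me the result I need, but on my production data set the query is far too slow (181 seconds). http://sqlfiddle.com/#!9/d2f62/3 - Andrew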

6 Answers

1

This should work:

SELECT h.*, collection.*, history.price
FROM collection
INNER JOIN h ON collection.itemid=h.id
INNER JOIN(
SELECT a.*
  FROM history a
  INNER JOIN 
     ( SELECT itemid,MAX(date_itemid) max_date_itemid
         FROM history 
        GROUP BY itemid
     ) b ON b.itemid = a.itemid AND b.max_date_itemid = a.date_itemid
  ) AS history ON history.itemid = collection.itemid
WHERE h.description LIKE '%Awesome%'

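I don't know whether this will take a lot of execution time. Please try it; since your tables probably hold more data, it will be a good test of the query's execution time.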

- Hotdin Gurning

Thanks! Your query is faster than the others. Could you explain your logic a bit? - Andrew
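The logic can be seen by running the two steps separately. The sketch below uses a hypothetical three-row `history` table (the column types and sample values are assumptions, not taken from the thread). Because `date_itemid` starts with an ISO date, the largest string per `itemid` is also the latest one.

```sql
-- Hypothetical minimal table; the real schema is not shown in the thread.
CREATE TABLE history (
  id          INT AUTO_INCREMENT PRIMARY KEY,
  itemid      INT NOT NULL,
  date_itemid VARCHAR(32) NOT NULL,
  price       DECIMAL(10,2) NOT NULL
);

INSERT INTO history (itemid, date_itemid, price) VALUES
  (1, '2015-05-21_1', 10.00),
  (1, '2015-05-22_1', 11.00),
  (2, '2015-05-20_2',  5.00);

-- Step 1: one row per item, holding that item's latest key.
SELECT itemid, MAX(date_itemid) AS max_date_itemid
FROM history
GROUP BY itemid;

-- Step 2: join back to history to fetch the full row (including price)
-- that carries that latest key.
SELECT a.itemid, a.date_itemid, a.price
FROM history a
INNER JOIN (
    SELECT itemid, MAX(date_itemid) AS max_date_itemid
    FROM history
    GROUP BY itemid
) b ON b.itemid = a.itemid
   AND b.max_date_itemid = a.date_itemid;
```

With the sample rows, step 2 keeps the 2015-05-22 row for item 1 and the 2015-05-20 row for item 2, so an item with no price today still gets its most recent earlier price.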

0

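This is actually a fairly common problem in SQL; at least I feel like I run into it often. What you want to do is join a one-to-many table, but only to the most recent or the oldest record in that table.

The trick is to LEFT JOIN the table that has many records to itself, matching on the foreign key and also requiring that the id (or date, or whatever you are using) be greater than or less than the other record's. Then, in the WHERE clause, add a condition that the left-joined table has a NULL id: the row could not be joined to a more recent record because it is the most recent one.

In your case the SQL should look something like this: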

SELECT h.*, collection.*, history.price
FROM collection
INNER JOIN h ON collection.itemid=h.id
INNER JOIN history ON collection.itemid=history.itemid 
-- left join history table again
LEFT JOIN history AS history2 ON history.itemid = history2.itemid AND history2.id > history.id
-- filter left join results to the most recent record
WHERE history2.id IS NULL
AND h.description LIKE '%Awesome%'

- Scott Jungwirth

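What happens if there is no history yet for that itemid? I'll try adapting this logic to see whether I can get the right result. - Andrew

If an itemid has no history, the first inner join to history finds no match and you won't get any results for that itemid. - Scott Jungwirth

The real indexed column for my history id is actually a UUID v4. The approach above doesn't work for me :( - Andrew

If you have a timestamp column, or an integer column that records the creation time, you can use that in the ON condition instead. For example: LEFT JOIN history AS history2 ON history.itemid = history2.itemid AND history2.created > history.created - Scott Jungwirth

I'm trying it with the date in the ON condition, but it has already taken 225.625 seconds and is still fetching. - Andrew

I assume that column is indexed? If your column is a char or varchar type it may not work well; it needs to be an int or timestamp. Sorry, my solution requires an additional column for fast greater-than/less-than comparisons. - Scott Jungwirth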
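A sketch of what Scott describes, assuming `history` has a `created` DATETIME column (the column is named in his comment but is not part of the schema shown in the thread) and using a hypothetical index name. A composite index on `(itemid, created)` may help MySQL evaluate the self-join comparison without scanning every row per item, though whether it helps on the production data would need to be checked with EXPLAIN.

```sql
-- Assumption: history has a `created` DATETIME column (not shown in the thread).
-- The index name is hypothetical.
CREATE INDEX idx_history_itemid_created ON history (itemid, created);

SELECT h.*, collection.*, history.price
FROM collection
INNER JOIN h ON collection.itemid = h.id
INNER JOIN history ON collection.itemid = history.itemid
-- Look for a newer row of the same item.
LEFT JOIN history AS history2
       ON history2.itemid = history.itemid
      AND history2.created > history.created
-- Keep only rows for which no newer row exists.
WHERE history2.itemid IS NULL
  AND h.description LIKE '%Awesome%';
```

If two rows of the same item share the same `created` value, both are returned, so a unique tiebreaker may be needed.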

0

Here is another approach that saves one inner join:

SELECT h.*, his.date_itemid, his.price
FROM history his
INNER JOIN h ON his.itemid = h.id
WHERE his.itemid IN (SELECT itemid FROM collection)
AND h.description LIKE '%Awesome%'
AND his.id IN (SELECT MAX(id) FROM history GROUP BY history.itemid)

You can try it here: http://sqlfiddle.com/#!9/837a8/1

- user2560539

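I'll try this concept tonight and let you know the timing. I feel like I'm pulling more from collection than I need, but I can't cut it down. - Andrew

@AndrewWilson Please try it; since your tables probably hold more data, it will be a good test of the query's execution time. - user2560539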

0

I'm not sure whether this is what you want, but I'll give it a try. Edit: revised.

CREATE VIEW LatestDatesforIds
AS
SELECT
  MAX(`history`.`date_itemid`) AS `lastPriceDate`,
  MAX(`history`.`id`) AS `matchingId`
FROM `history`
GROUP BY `history`.`itemid`;  


CREATE VIEW MatchDatesToPrices
AS
SELECT
  `ldi`.`lastPriceDate` AS `lastPriceDate`,
  `ldi`.`matchingId` AS `matchingId`,
  `h`.`id` AS `id`,
  `h`.`itemid` AS `itemid`,
  `h`.`price` AS `price`,
  `h`.`date_itemid` AS `date_itemid`
FROM (`LatestDatesforIds` `ldi`
  JOIN `history` `h`
    ON ((`ldi`.`matchingId` = `h`.`id`)));

SELECT c.itemid,price,lastpriceDate,description
FROM collection c
INNER JOIN MatchDatesToPrices mp
     ON c.itemid = mp.itemid
INNER JOIN h ON c.itemid = h.id

- John

Here is the result of your query, which is the wrong data set. http://sqlfiddle.com/#!9/4377a/1 - Andrew

@AndrewWilson I think you want the last price for each item... so have a look at this http://sqlfiddle.com/#!9/4377a/5/0 - John

0

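It is hard to test for speed on such a small data set, but avoiding GROUP BY may speed things up. You could try conditionally joining the history table to itself instead of grouping?

For example: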

SELECT h.*, c.*, h1.price
FROM h
INNER JOIN history h1 ON h1.itemid = h.id
LEFT OUTER JOIN history h2 ON h2.itemid = h.id
    AND h1.date_itemid < h2.date_itemid
INNER JOIN collection c ON c.itemid = h.id
WHERE h2.id IS NULL
AND h.description LIKE '%Awesome%'

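Changing this line

AND h1.date_itemid < h2.date_itemid

to compare a sequential, indexed field (preferably a unique one) would also speed things up, for example an id that increases in insertion order.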

- Moss Palmer
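A sketch of the change suggested above, assuming `history.id` is an auto-increment integer so that a larger id always means a newer row. That assumption does not hold for the asker's production table, where the id is a UUID v4, so there another ordered column would have to take its place.

```sql
-- Assumption: history.id is an auto-increment integer (larger id = newer row).
SELECT h.*, c.*, h1.price
FROM h
INNER JOIN history h1 ON h1.itemid = h.id
-- Look for a row of the same item with a larger id.
LEFT OUTER JOIN history h2
       ON h2.itemid = h.id
      AND h1.id < h2.id
INNER JOIN collection c ON c.itemid = h.id
-- No larger id found, so h1 is the newest row for this item.
WHERE h2.id IS NULL
  AND h.description LIKE '%Awesome%';
```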


- Strawberry · Accepted Answer

SELECT x.*
  FROM history x
  JOIN 
     ( SELECT itemid
            , MAX(date_itemid) max_date_itemid 
         FROM history 
  -- optional JOINS and WHERE here --
        GROUP
           BY itemid
     ) y
    ON y.itemid = x.itemid
   AND y.max_date_itemid = x.date_itemid;

http://sqlfiddle.com/#!9/975f5/13
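One possible way to fill in the "optional JOINS and WHERE" placeholder for the query in the question is sketched below. The aliases `hi`, `ci`, `hd`, and `c` are introduced here for illustration. Filtering inside the derived table means the MAX is computed only for items that are in `collection` and match the description, which may reduce the work on a large `history` table.

```sql
SELECT h.*, c.*, x.price
  FROM history x
  JOIN
     ( SELECT hi.itemid
            , MAX(hi.date_itemid) AS max_date_itemid
         FROM history hi
         -- restrict the grouping to the items the outer query needs
         JOIN collection ci ON ci.itemid = hi.itemid
         JOIN h hd ON hd.id = hi.itemid
        WHERE hd.description LIKE '%Awesome%'
        GROUP BY hi.itemid
     ) y
    ON y.itemid = x.itemid
   AND y.max_date_itemid = x.date_itemid
  -- bring back the collection and h columns from the original query
  JOIN collection c ON c.itemid = x.itemid
  JOIN h ON h.id = x.itemid;
```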