MySQL query: return rows when a condition is met

3
Suppose I have a table named wx_data containing the following data (the temperatures here do not accurately correspond to the dates and cities):
city,        wx_date,    avg_temp
Kansas City, 2012-11-01, 28 
Kansas City, 2012-11-02, 42
Kansas City, 2012-11-03, 86
Kansas City, 2012-11-04, 39
Kansas City, 2012-11-05, 27
Kansas City, 2012-11-06, 65
Kansas City, 2012-11-07, 62
Kansas City, 2012-11-08, 55
Kansas City, 2012-11-09, 95
Kansas City, 2012-11-10, 90
Kansas City, 2012-11-11, 29
Saint Louis, 2012-11-01, 88 
Saint Louis, 2012-11-02, 42
Saint Louis, 2012-11-03, 30
Saint Louis, 2012-11-04, 60
Saint Louis, 2012-11-05, 85
Saint Louis, 2012-11-06, 65
Saint Louis, 2012-11-07, 62
Saint Louis, 2012-11-08, 32
Saint Louis, 2012-11-09, 80
Saint Louis, 2012-11-10, 80
Saint Louis, 2012-11-11, 33

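I have a query that determines the maximum and minimum for each city and returns rows flagged "Low" or "High", depending on whether the temperature on a given date is within 15% of the highest or lowest temperature observed in the data set.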

select 
      temp.city, 
      wx_date, 
      avg_tmp, 
      if(avg_tmp >=.85*temp.High, "High", "Low") 
   from 
      wx_data 
         inner join (select city, 
                            Min(avg_tmp) as Low, 
                            Max(avg_tmp) as High 
                        from 
                           wx_data
                        where
                           wx_date between '2012-11-02' and '2013-12-01'
                        group by city) as temp 
           on wx_data.city=temp.city 
   where 
        avg_tmp >= .85 * temp.High 
     or avg_tmp <= 1.15 * temp.Low
   order by 
      city, 
      wx_date;

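I would like to modify the query so that it returns the results above, but only the rows where the status changes from Low to High or vice versa. In other words, I do not want to see rows for consecutive days that meet the same Low or High condition.
How would you do this?
To clarify, my query currently returns the following data: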
 Kansas City    November, 01 2012   28  Low
 Kansas City    November, 03 2012   86  High
 Kansas City    November, 04 2012   29  Low
 Kansas City    November, 05 2012   27  Low
 Kansas City    November, 09 2012   95  High
 Kansas City    November, 10 2012   90  High
 Kansas City    November, 11 2012   29  Low
 Saint Louis    November, 01 2012   33  Low
 Saint Louis    November, 02 2012   88  High 
 Saint Louis    November, 03 2012   87  High
 Saint Louis    November, 05 2012   85  High
 Saint Louis    November, 08 2012   32  Low
 Saint Louis    November, 09 2012   80  High
 Saint Louis    November, 10 2012   80  High
 Saint Louis    November, 11 2012   33  Low

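I want to return only the rows where a city's status changes from "High" to "Low" or from "Low" to "High". A correct query should return the following 9 rows.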
 Kansas City    November, 01 2012   28  Low
 Kansas City    November, 03 2012   86  High
 Kansas City    November, 04 2012   29  Low
 Kansas City    November, 09 2012   95  High
 Kansas City    November, 11 2012   29  Low
 Saint Louis    November, 02 2012   88  High 
 Saint Louis    November, 08 2012   32  Low
 Saint Louis    November, 09 2012   80  High
 Saint Louis    November, 11 2012   33  Low

See http://www.sqlfiddle.com/#!2/384fb/1 for an example of the data and the query results.

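Could you clarify your question further? What specific result do you want the query to return? - Benvorth
My arithmetic may be off (it happens), but it looks to me as though your query only tests for closeness to 'low'?!?! - Strawberry
I added the sqlfiddle link so you can see that the query works. Note that the data in the sqlfiddle differs slightly from what I posted here; I think one or two rows are different. - tman
On November 4, the average temperature for kc is 39 (and sqlfiddle is misbehaving right now). - Strawberry
Also, I think you should amend the sample data set and the corresponding result set to show what happens when there are several consecutive highs (or lows), not just 2. - Strawberry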
2 Answers

2
Going by your definition, shouldn't the core query look more like this?
SELECT city
     , MAX(avg_temp) max_temp
     , 0.85*MAX(avg_temp) max_threshold
     , MIN(avg_temp) min_temp
     , 1.15 * MIN(avg_temp) min_threshold
  FROM wx_data 
 GROUP 
    BY city ;
+-------------+----------+---------------+----------+---------------+
| city        | max_temp | max_threshold | min_temp | min_threshold |
+-------------+----------+---------------+----------+---------------+
| Kansas City |       95 |         80.75 |       27 |         31.05 |
| Saint Louis |       88 |         74.80 |       30 |         34.50 |
+-------------+----------+---------------+----------+---------------+
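
The threshold figures in this table can be checked by hand. The following is a minimal Python sketch of the same arithmetic, using the temperatures posted in the question; the names `TEMPS` and `thresholds` are hypothetical and not part of the original answer.

```python
# Temperatures as posted in the question, in date order (2012-11-01 .. 2012-11-11).
TEMPS = {
    "Kansas City": [28, 42, 86, 39, 27, 65, 62, 55, 95, 90, 29],
    "Saint Louis": [88, 42, 30, 60, 85, 65, 62, 32, 80, 80, 33],
}


def thresholds(temps_by_city):
    """Return {city: (max_temp, max_threshold, min_temp, min_threshold)}."""
    result = {}
    for city, temps in temps_by_city.items():
        hi, lo = max(temps), min(temps)
        # "High" means within 15% of the maximum, "Low" within 15% of the minimum.
        result[city] = (hi, round(0.85 * hi, 2), lo, round(1.15 * lo, 2))
    return result


for city, values in thresholds(TEMPS).items():
    print(city, values)
```

A row then counts as High when avg_temp is at or above max_threshold and as Low when it is at or below min_threshold.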

...and...

SELECT x.*
     , CASE WHEN x.avg_temp BETWEEN y.min_temp AND y.min_threshold THEN 'Low'
            WHEN x.avg_temp BETWEEN y.max_threshold AND y.max_temp THEN 'High'
            ELSE ''
            END status
  FROM wx_data x 
  JOIN 
     ( SELECT city
            , MAX(avg_temp) max_temp
            , 0.85*MAX(avg_temp) max_threshold
            , MIN(avg_temp) min_temp
            , 1.15 * MIN(avg_temp) min_threshold
         FROM wx_data GROUP BY city 
      ) y
    ON y.city = x.city;

+-------------+------------+----------+--------+
| city        | wx_date    | avg_temp | status |
+-------------+------------+----------+--------+
| Kansas City | 2012-11-01 |       28 | Low    |
| Kansas City | 2012-11-02 |       42 |        |
| Kansas City | 2012-11-03 |       86 | High   |
| Kansas City | 2012-11-04 |       39 |        |
| Kansas City | 2012-11-05 |       27 | Low    |
| Kansas City | 2012-11-06 |       65 |        |
| Kansas City | 2012-11-07 |       62 |        |
| Kansas City | 2012-11-08 |       55 |        |
| Kansas City | 2012-11-09 |       95 | High   |
| Kansas City | 2012-11-10 |       90 | High   |
| Kansas City | 2012-11-11 |       29 | Low    |
| Saint Louis | 2012-11-01 |       88 | High   |
| Saint Louis | 2012-11-02 |       42 |        |
| Saint Louis | 2012-11-03 |       30 | Low    |
| Saint Louis | 2012-11-04 |       60 |        |
| Saint Louis | 2012-11-05 |       85 | High   |
| Saint Louis | 2012-11-06 |       65 |        |
| Saint Louis | 2012-11-07 |       62 |        |
| Saint Louis | 2012-11-08 |       32 | Low    |
| Saint Louis | 2012-11-09 |       80 | High   |
| Saint Louis | 2012-11-10 |       80 | High   |
| Saint Louis | 2012-11-11 |       33 | Low    |
+-------------+------------+----------+--------+

EDIT: ...and extending the idea a little further (with the sqlfiddle data set)...

 SELECT a.city,a.wx_date,a.avg_tmp FROM
 (
 SELECT x.*
      , IF(@prev = CASE WHEN x.avg_tmp BETWEEN y.min_tmp AND y.min_threshold THEN 'Low'
             WHEN x.avg_tmp BETWEEN y.max_threshold AND y.max_tmp THEN 'High'
             ELSE ''
             END, @i := 0, @i:=1) flag
      , @prev := CASE WHEN x.avg_tmp BETWEEN y.min_tmp AND y.min_threshold THEN 'Low'
             WHEN x.avg_tmp BETWEEN y.max_threshold AND y.max_tmp THEN 'High'
             ELSE ''
             END status
   FROM wx_data x 
   JOIN 
      ( SELECT city
             , MAX(avg_tmp) max_tmp
             , 0.85*MAX(avg_tmp) max_threshold
             , MIN(avg_tmp) min_tmp
             , 1.15 * MIN(avg_tmp) min_threshold
          FROM wx_data GROUP BY city 
       ) y
     ON y.city = x.city
   JOIN (SELECT @i:=NULL,@prev:=NULL) vars
  ORDER 
     BY city,wx_date
     ) a
  WHERE flag = 1 AND status <> '';

http://www.sqlfiddle.com/#!2/384fb/12


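That's fine, but the query I provided works. If yours is better, I'm all for it. My core question is still the one I asked, though. - tman
I see what you did. The question now is how to adapt it so that it checks the status of the previous days and returns whether it has changed from the day before. By the way, my current query returns only rows flagged Low or High; an in-between value such as 55 or 60 is not returned. - tman
OK, thanks. One question: I looked at the SQL Fiddle data set and don't understand why the November 5 record for Saint Louis is returned. There was no transition to a Low event between the 2nd and the 5th. - tman
I think it may be because a day in between posted a result outside the 15% criterion, which reset @prev. If that's not what you want, then I think you have to put "WHERE status <> ''" after "FROM wx_data x". The catch (I think) is that MySQL doesn't really know what "status" is at that point, so you'd have to write it out longhand, i.e. "WHERE CASE WHEN x.avg_tmp etc.". - Strawberry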
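
The behaviour discussed in these comments can be illustrated outside MySQL. The following is a minimal Python sketch, not the answer's SQL: it labels each row Low, High, or neutral, skips neutral days without resetting the remembered status, and keeps a row only when its status differs from the previous Low/High row for the same city. It uses the data as posted in the question (not the sqlfiddle data), computes thresholds over all rows per city, and treats the first Low/High row of each city as a change; these are assumptions, and the names `status` and `transitions` are hypothetical.

```python
from itertools import groupby

# Temperatures as posted in the question, in date order.
KC = [28, 42, 86, 39, 27, 65, 62, 55, 95, 90, 29]
SL = [88, 42, 30, 60, 85, 65, 62, 32, 80, 80, 33]
ROWS = [("Kansas City", f"2012-11-{d:02d}", t) for d, t in enumerate(KC, 1)]
ROWS += [("Saint Louis", f"2012-11-{d:02d}", t) for d, t in enumerate(SL, 1)]


def status(temp, lo, hi):
    """Label a temperature relative to the city's minimum and maximum."""
    if temp <= 1.15 * lo:
        return "Low"
    if temp >= 0.85 * hi:
        return "High"
    return ""  # neutral: outside both 15% bands


def transitions(rows):
    """Return the rows whose Low/High status differs from the previous one."""
    out = []
    # ISO dates sort correctly as strings, so sorting orders by city, then date.
    for city, group in groupby(sorted(rows), key=lambda r: r[0]):
        group = list(group)
        temps = [t for _, _, t in group]
        lo, hi = min(temps), max(temps)
        prev = None  # no previous status at the start of each city
        for _, date, temp in group:
            s = status(temp, lo, hi)
            if s == "":
                continue  # skip neutral days and leave prev untouched
            if s != prev:
                out.append((city, date, temp, s))
            prev = s
    return out


for row in transitions(ROWS):
    print(*row)
```

Because neutral days are skipped before the comparison, two High days separated only by neutral days yield a single row.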

1
Assuming your query is correct and you only want to show the rows where the High/Low value changes:
select city, wx_date, avg_tmp, hi_lo
from (
    select temp.city, wx_date, avg_tmp, 
    if(avg_tmp >=.85*temp.High,"High","Low") hi_lo,
    @prevHiLo = (avg_tmp >=.85*temp.High and @prevCity = temp.city) same_as_prev,
    @prevHiLo := (avg_tmp >=.85*temp.High),
    @prevCity := temp.city
    from wx_data 
    inner join 
    (select city, Min(avg_tmp) as Low, Max(avg_tmp) as High from wx_data
    where (wx_date between '2012-11-02' and '2013-12-01') group by city) 
    as temp on wx_data.city=temp.city 
    where (avg_tmp >= .85*temp.High or avg_tmp <= 1.15*temp.Low)  
    order by city, wx_date
) t1 
where same_as_prev = 0
order by city, wx_date

http://www.sqlfiddle.com/#!2/b6f42/2

CITY    WX_DATE AVG_TMP HI_LO
Kansas City November, 03 2012 00:00:00+0000 86  High
Kansas City November, 04 2012 00:00:00+0000 29  Low
Kansas City November, 09 2012 00:00:00+0000 95  High
Kansas City November, 11 2012 00:00:00+0000 29  Low
Saint Louis November, 02 2012 00:00:00+0000 88  High
Saint Louis November, 08 2012 00:00:00+0000 32  Low
Saint Louis November, 09 2012 00:00:00+0000 80  High
Saint Louis November, 11 2012 00:00:00+0000 33  Low
