在表A中查找最大值的嵌套查询，以查找表B中的详细信息

Question

在表A中查找最大值的嵌套查询，以查找表B中的详细信息

sqlmysqlnestedsubquerymax

3

我有一大堆关于机场之间的航班信息，涉及到的内容和IT技术有关。

每个机场都有一个ID和(x,y)坐标。

对于属于某个用户的给定航班列表，我想找出访问过的最北端（y值最高）的机场。

这是我目前正在使用的查询：

SELECT name,iata,icao,apid,x,y 
  FROM airports 
 WHERE y=(SELECT MAX(y) 
            FROM airports AS a
               , flights AS f 
           WHERE (f.src_apid=a.apid OR f.dst_apid=a.apid) AND f.uid=[user_id]
         )

这个方法非常有效，速度也很快，但只有在y唯一的情况下才有效（即该纬度只有一个机场），否则会失败。不幸的是，这种情况经常发生，例如，军用和民用机场虽然占据相同的坐标，但仍有不同的条目。

我真正想做的是在子查询中找到具有MAX(y)的机场，并返回实际匹配的机场(a.apid)，而不是返回y的值，然后再进行匹配。你有什么建议吗？

假设用户只有这一次航班，从apid '3728' 开始：

mysql> select * from flights where uid=35 and src_apid=3728 limit 1;
+------+----------+----------+----------+----------+------+------+-----------+-------+--------+------+------+------+--------+----------+--------------+--------------+---------------------+------+------------+------+
| uid  | src_apid | src_time | dst_apid | distance | code | seat | seat_type | class | reason | plid | alid | trid | fid    | duration | registration | note         | upd_time            | opp  | src_date   | mode |
+------+----------+----------+----------+----------+------+------+-----------+-------+--------+------+------+------+--------+----------+--------------+--------------+---------------------+------+------------+------+
|   35 |     3728 | NULL     |     3992 |     4116 | NW16 | 23C  | A         | Y     | L      |  167 | 3731 | NULL | 107493 | 08:00:00 |              | del. typhoon | 2008-10-04 10:40:58 | Y    | 2001-08-22 | F    | 
+------+----------+----------+----------+----------+------+------+-----------+-------+--------+------+------+------+--------+----------+--------------+--------------+---------------------+------+------------+------+

这里存在两个坐标相同的机场：

mysql> select * from airports where y=21.318681;
+-----------------------+----------+---------------+------+------+-------------+-----------+-----------+------+------+----------+------+
| name                  | city     | country       | iata | icao | x           | y         | elevation | apid | uid  | timezone | dst  |
+-----------------------+----------+---------------+------+------+-------------+-----------+-----------+------+------+----------+------+
| Honolulu Intl         | Honolulu | United States | HNL  | PHNL | -157.922428 | 21.318681 |        13 | 3728 | NULL |      -10 | N    | 
| Hickam Air Force Base | Honolulu | United States |      | PHIK | -157.922428 | 21.318681 |        13 | 7055 |    3 |      -10 | N    | 
+-----------------------+----------+---------------+------+------+-------------+-----------+-----------+------+------+----------+------+

如果运行原始查询，子查询将返回y=21.318681，这将匹配apid 3728（正确）或apid 7055（错误）。

- lambshaanxy

你能给出一些机场和航班的示例行，展示问题以及期望的结果集吗？我很乐意帮忙，但我还不够清楚地理解这个问题。 - Ronnis

查询应该被修改为仅返回乘客已经访问过的机场，即在示例数据中，乘客访问了3728和3992，因此只应返回其中一个机场，从而排除Hickam（apid = 7055）。 - Bob Jarvis - Слава Україні

4个回答

1

第三次尝试，使用假设用户（用户ID，姓名）表。

select u.name, ap.name
     , ap.iata
     , ap.icao
     , ap.apid
     , ap.x
     , max(ap.y)  
  from users u
     , airports ap
     , flights f
 where u.userid=f.userid
   and (   f.src_apid=ap.apid 
        OR f.dst_apid=ap.apid
       )
group by u.name, ap.name,ap.iata,ap.icao,ap.apid,ap.x

现在您可以将查询限制为您感兴趣的一个用户..

关于GROUP BY的评论：

严格来说，MySQL允许我将group by写成'group by u.name, ap.name'。
其他SQL方言不允许这样做，它们要求所有未聚合的选择字段都在GROUP BY语句中。
因此，在选择我的GROUP BY字段时，我倾向于“保险”。

- lexu

好的，那将会有一点困难……你想看两个机场中的哪一个？有需要应用排序标准吗？或者 PHIK 和 PHNL 的 apid（=AirPortID）是相同的吗？ - lexu

apid是机场的唯一键，因此PHIK和PHNL有不同的键，我需要查看最初匹配的那个--与f.src_apid或f.dst_apid相等的机场的apid。更具体的例子：用户X从檀香山-PHNL到悉尼有一个航班（从flights中选择其中一个，其中f.uid=X），我想知道用户X最北端的机场。答案应该是檀香山-PHNL，而不是Hickam AFB（PHIK）。 - lambshaanxy

@jpatokal：没有地理空间数据，当前查询无法返回最北的机场。没有支持需求的数据，您将无法可靠地获得所需的价值。 - OMG Ponies

这到底是什么意思？你是在告诉我，我应该故意破坏机场数据，并使y的所有值错开，只为了解决SQL中的限制问题吗？再重复一遍，确定用户访问了哪个机场（apid）是很简单的，问题是如何从查询中获取apid。 - lambshaanxy

@OMG Ponies：我仍然完全不知道你在说什么，但幸运的是这并不重要，因为我刚刚把赏金授予了一个快速准确解决问题的答案。 - lambshaanxy

显示剩余4条评论

1

以下查询的执行方式是什么？它首先找到已访问机场中最北端的Y坐标。然后执行一个相同的查询，该查询由上一个查询中的Y坐标进行过滤。最后一步是找到机场。

drop table airports;
drop table flights;

create table airports(
   apid    int         not null
  ,apname  varchar(50) not null
  ,x       int         not null
  ,y       int         not null
  ,primary key(apid)
  ,unique(apname)
);

create table flights(
   flight_id int         not null auto_increment
  ,src_apid  int         not null
  ,dst_apid  int         not null
  ,user_id   varchar(20) not null
  ,foreign key(src_apid) references airports(apid)
  ,foreign key(dst_apid) references airports(apid)
  ,primary key(flight_id)
  ,index(user_id)
);

insert into airports(apid, apname, x, y) values(1, 'Northpole Civilian',     50, 100);
insert into airports(apid, apname, x, y) values(2, 'Northpole Military',     50, 100);
insert into airports(apid, apname, x, y) values(3, 'Transit point',          50, 50);
insert into airports(apid, apname, x, y) values(4, 'Southpole Civilian',     50, 0);
insert into airports(apid, apname, x, y) values(5, 'Southpole Military',     50, 0);

insert into flights(src_apid, dst_apid, user_id) values(4, 3, 'Family guy');
insert into flights(src_apid, dst_apid, user_id) values(3, 1, 'Family guy');

insert into flights(src_apid, dst_apid, user_id) values(5, 3, 'Mr Bazooka');
insert into flights(src_apid, dst_apid, user_id) values(3, 2, 'Mr Bazooka');

select airports.apid
      ,airports.apname
      ,airports.x
      ,airports.y
  from (select max(a.y) as y
          from flights  f
          join airports a on (a.apid = f.src_apid or a.apid = f.dst_apid)
         where f.user_id = 'Family guy'
       ) as northmost 
  join (select a.apid
              ,a.y
          from flights  f
          join airports a on (a.apid = f.src_apid or a.apid = f.dst_apid)
         where f.user_id = 'Family guy'
       ) as userflights on(northmost.y = userflights.y)   
  join airports on(userflights.apid = airports.apid);

编辑。可能对优化器更少造成困惑的替代查询

select airports.*
  from (select case when s.y > d.y then s.apid else d.apid end as apid
              ,case when s.y > d.y then s.y    else d.y    end as northmost
          from flights  f
          join airports s on(f.src_apid = s.apid)
          join airports d on(f.dst_apid = d.apid)
         where f.user_id = 'Family guy'
         order by northmost desc
         limit 1
       ) as user_flights
  join airports on(airports.apid = user_flights.apid);

- Ronnis

哦，我没看到你在我写这个的时候添加了示例数据。你想让我使用你的示例吗？ - Ronnis

叮叮叮，我想我们有一个赢家！它可以得到正确的结果，并且速度非常快。唯一有点可怕的是临时表和DESCRIBE说需要进行文件排序，但我会尝试一些病态案例并报告回来。 - lambshaanxy

根据调查结果显示：针对超过4000个航班，响应时间为0.23秒，非常出色；针对40000个航班，响应时间为3.15秒，相当可接受。任务奖励归您所有！ - lambshaanxy

太棒了！你最终使用了哪个查询？ - Ronnis

（顺便说一下，由于某种原因，需要单独点击答案旁边的赏金奖励图标才能发放赏金） - Ronnis

第二个版本已经发布，并且赏金已经发放。 - lambshaanxy

1

好的，也许可以这样：

SELECT name, iata, icao, apid, x, y
  FROM airports
  WHERE y = (SELECT MAX(A.y)
               FROM airports AS a
             INNER JOIN flights AS f
               ON (F.SRC_APID = A.APID OR
                   F.DST_APID = A.APID)
               WHERE f.uid = [user_id]) AND
        apid IN (SELECT SRC_APID AS APID
                   FROM FLIGHTS
                   WHERE UID = [user_id]
                 UNION ALL
                 SELECT DEST_APID AS APID
                   FROM FLIGHTS
                   WHERE UID = [user_id])

不能保证这个程序的表现如何，但或许这是朝着正确方向迈出的一步。

分享并享受。

- Bob Jarvis - Слава Україні

很遗憾，这只有Ronnis的答案速度的一半快，因为您实际上获取了用户的航班两次。 - lambshaanxy

网页内容由stack overflow 提供, 点击上面的

可以查看英文原文，
原文链接

- Veynom · Accepted Answer

2

这个怎么样：

SELECT name,iata,icao,apid,x,y 
FROM airports AS a, flights AS f 
WHERE f.src_apid=a.apid OR f.dst_apid=a.apid
ORDER BY y DESC LIMIT 1

您需要获取所有相关用户的航班信息，按照从北到南的顺序排序，并从列表中选择第一个。

- Veynom

不是一个坏主意，但是DESCRIBE说查询首先对所有机场进行排序，然后再进行过滤，因此相当慢：即使使用了y、src_apid、dst_apid的索引，对于典型用户来说需要2.7秒。 - lambshaanxy

“apid” 的选择将是任意的 - 您不能保证每次获取相同的值。实际上，基于lexu回答中的评论，由于系统中没有数据，无法支持选择最北端机场的需求。 - OMG Ponies

哦？上面的查询创建了用户机场列表（在示例数据中为3728和3992），并按y对这些匹配机场进行排序，因此我不明白它如何会返回错误的7055。 - lambshaanxy