MySQL子查询排名 - 每组Top N结果

3

我需要获取每个国家前两个姓名的列表(从账户和国家表中)。我已经搜索了很多并找到了一些有效的答案,但无法获得正确的结果。

请在此处查看我的SQL Fiddle:

http://sqlfiddle.com/#!9/cd1296/5

CREATE TABLE IF NOT EXISTS `country` (
  `id` int(6) unsigned NOT NULL,
  `iso` varchar(3) NOT NULL,
  `country_name` varchar(24) NOT NULL,
  PRIMARY KEY (id)
) DEFAULT CHARSET=utf8;

INSERT INTO `country` (`id`, `iso`,`country_name`) VALUES
  ('1', 'DEU','Germany'),
  ('2', 'USA','United States'),
  ('3', 'CAN','Canada'),
  ('4', 'JPN','Japan');

CREATE TABLE IF NOT EXISTS `accounts` (
  id int(6) unsigned NOT NULL,
  name varchar(50) NOT NULL,
  iso3 varchar(3) NOT NULL,
  PRIMARY KEY (`id`)
) DEFAULT CHARSET=utf8;

INSERT INTO `accounts` (`id`,`name`, `iso3`) VALUES
  ('1', 'Hans', 'DEU'),
  ('2', 'Willi', 'DEU'),
  ('3', 'Peter', 'DEU'),
  ('4', 'Susanne', 'DEU'),
  ('5', 'John', 'USA'),
  ('6', 'Jane', 'USA'),
  ('7', 'Peter', 'USA'),
  ('8', 'Paul', 'USA'),
  ('9', 'Mary', 'USA'),
  ('10', 'Gerard', 'CAN'),
  ('11', 'Mirelle', 'CAN'),
  ('12', 'Hiko', 'JPN'),
  ('13', 'Miko', 'JPN'),
  ('14', 'Susanne', 'DEU'),
  ('15', 'Peter', 'DEU'),
  ('16', 'John', 'USA'),
  ('17', 'Paul', 'USA'),
  ('18', 'Susanne', 'DEU'),
  ('19', 'Bob', 'DEU'),
  ('20', 'John', 'USA'),
  ('21', 'Paul', 'USA'),
  ('33', 'Gerard', 'CAN'),
  ('22', 'Maribelle', 'CAN'),  
  ('23', 'Gerd', 'CAN'),
  ('24', 'Mira', 'CAN'),
  ('25', 'Huko', 'JPN'),
  ('26', 'Hako', 'JPN'),
  ('27', 'Hiko', 'JPN'),
('28', 'Jon', 'USA'),
('29', 'Jim', 'USA'),
('30', 'John', 'USA'),
('31', 'JJ', 'USA'),
('32', 'Bob', 'USA'),
('34', 'Bob', 'USA'),
('35', 'Miko', 'JPN'),
('36', 'Miko', 'JPN');

使用这个语句可以使列表按正确顺序排列,但不会在第二个结果后停止:
SELECT country_name, iso, name, COUNT(name) AS name_count
 FROM accounts
 JOIN country ON country.iso = accounts.iso3
 GROUP BY country.iso,  name
 ORDER BY country.iso ASC, name_count DESC;

如其他问题/答案所建议的那样,解决方案可能是使用“MySQL会话变量”(基于https://www.databasejournal.com/features/mysql/selecting-the-top-n-results-by-group-in-mysql.html)。

我的问题: country_rank没有正确生成,因此无法得到正确的结果。我做错了什么?

SET @current_country = ""; 
SET @country_rank = 0; 

 SELECT country_name, name, name_count, rank
 FROM
 (
    SELECT country_name, iso, name, COUNT(name) AS name_count,
    @country_rank := IF( @current_country = iso, 
                         @country_rank + 1, 
                         1 
                       ) AS rank, 
    @current_country := iso 
    FROM accounts
    JOIN country ON country.iso = accounts.iso3
    GROUP BY country.iso,  name
    ORDER BY country.iso ASC, name_count DESC
) AS ranked
WHERE rank<=2;

你使用的 MySQL 版本是哪个? - Radim Bača
@RadimBača 5.7.24 - Christian K.
1
你应该升级到v8并使用窗口函数来完成此操作。 - Radim Bača
2个回答

1

你需要在子查询中进行分组,这样排名就可以在分组结果上完成。

SELECT country_name, name, name_count, rank
FROM (
    SELECT country_name, iso, name, name_count,
        @country_rank := IF( @current_country = iso, 
                             @country_rank + 1, 
                             1 
                           ) AS rank, 
        @current_country := iso 
    FROM (
        SELECT country_name, iso, name, COUNT(name) AS name_count
        FROM accounts
        JOIN country ON country.iso = accounts.iso3
        GROUP BY country.iso,  name
        ORDER BY country.iso ASC, name_count DESC
    ) AS ordered
) AS ranked
CROSS JOIN (SELECT @country_rank = 0, @current_country = '') AS vars
WHERE rank<=2;

Fiddle

的中文翻译为:

小提琴


1

MySQL不能保证在SELECT中表达式的求值顺序。因此,在一个表达式中定义变量,然后在另一个表达式中使用它是很危险的。也就是说,赋值和使用变量应该在同一个表达式中。

问题可能是间歇性的,因此代码看起来在一个上下文中有效,但在另一个上下文中无效。因此,我建议将其编写为:

SELECT country_name, name, name_count, rank
FROM (SELECT country_name, iso, name, name_count,
             (@rn := IF(@c = iso, @rn + 1,
                        IF(@c := iso, 1, 1)
                       )
             ) as rank
      FROM (SELECT c.country_name, c.iso, a.name, COUNT(*) AS name_count
            FROM accounts a JOIN
                 country c
                 ON c.iso = a.iso3
            GROUP BY country.iso,  name
            ORDER BY c.country_name, c.iso ASC, name_count DESC
           ) c CROSS JOIN
           (SELECT @c := '', @rn := 0) params
      ) c
WHERE rank <= 2;

比较两个答案后,我能够重现@gordon提到的奇怪行为。到目前为止,这种方法在我测试的所有情况下都能按预期工作! - Christian K.

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接