按两列分组选择最大值的行

5
我看到了很多关于这种问题的解决方案(尤其是这个:SQL选择只有最大值的行),但似乎都不太合适:
我有以下表格布局,即附件的版本控制,这些附件绑定在实体上:
TABLE attachments
+------+--------------+----------+----------------+---------------+
| id   | entitiy_id   | group_id | version_number | filename      |
+------+--------------+----------+----------------+---------------+
| 1    | 1            | 1        | 1              | file1-1.pdf   |
| 2    | 1            | 1        | 2              | file1-2.pdf   |
| 3    | 1            | 2        | 1              | file2-1.pdf   |
| 4    | 2            | 1        | 1              | file1-1.pdf   |
| 5    | 2            | 1        | 2              | file1-2.pdf   |
| 6    | 2            | 3        | 1              | file3-1.pdf   |
+------+--------------+----------+----------------+---------------+

输出应该是按group_id和entity_id分组的最大版本号,如果只需要单个entity_id的列表,则只需要一个列表。
+------+--------------+----------+----------------+---------------+
| id   | entitiy_id   | group_id | version_number | filename      |
+------+--------------+----------+----------------+---------------+
| 2    | 1            | 1        | 2              | file1-2.pdf   |
| 3    | 1            | 2        | 1              | file2-1.pdf   |
| 5    | 2            | 1        | 2              | file1-2.pdf   |
| 6    | 2            | 3        | 1              | file3-1.pdf   |
+------+--------------+----------+----------------+---------------+

What I have come up with is this self join one:

SELECT *
FROM   `attachments` `attachments`
       LEFT OUTER JOIN attachments t2
         ON ( attachments.group_id = t2.group_id
              AND attachments.version_number < t2.version_number )
WHERE  ( t2.group_id IS NULL )
   AND ( `t2`.`id` = 1 )
GROUP  BY t2.group_id

但是这个方案只适用于不同实体不共享相同组号的情况。不幸的是,这是必要的。

在创建视图时,我遇到了一种可行的解决方案,但在我的当前设置中不支持此解决方案。

非常感谢任何想法。谢谢!

4个回答

3

试试这个:

select t1.* from attachments t1
left join attachments t2
on t1.entity_id = t2.entity_id and t1.group_id = t2.group_id and
   t1.version_number < t2.version_number
where t2.version_number is null

谢谢!这是一个非常简单明了的解决方案。 - Simon Brückner
我至少会建议您添加我提出的键来维护表格性能,因此包括entity_id、group_id和version_number。在您的问题中,您指定它是针对单个entity id的,而由于未使用entity_id来限制连接,这种选项在更高的表格容量下将比其他建议执行得更慢。 - Simon at My School Portal

2

这适用于选择所有内容

SELECT attachments.*
FROM (
    SELECT entitiy_id, group_id, MAX(version_number) AS max_version
    FROM attachments
    GROUP BY entitiy_id, group_id
) AS maxVersions
INNER JOIN attachments
ON attachments.entitiy_id = maxVersions.entitiy_id
AND attachments.group_id = maxVersions.group_id
AND attachments.version_number = maxVersions.max_version

将这个扩展为只查找单个entity_id只需在子查询中添加一个WHERE,因此可以得到以下结果:

SELECT attachments.*
FROM (
    SELECT entitiy_id, group_id, MAX(version_number) AS max_version
    FROM attachments
    WHERE entitity_id = [[YOUR ENTITIY ID HERE]]
    GROUP BY entitiy_id, group_id
) AS maxVersions
INNER JOIN attachments
ON attachments.entitiy_id = maxVersions.entitiy_id
AND attachments.group_id = maxVersions.group_id
AND attachments.version_number = maxVersions.max_version

如果您希望在行数增加时仍能保持快速运行,我建议您确保在附件上添加一个键,其中包括行(entitiy_id, group_id, max_version),这样子查询就可以依赖于它,从而确保不会锁定表格。

2
这个可以解决问题:

select a1.* from attachments a1
inner join ( select entitiy_id, group_id, max(version_number) as version_number
             from attachments
             group by entitiy_id, group_id) a2 on a1.entitiy_id = a2.entitiy_id and
                                                  a1.group_id = a2.group_id and
                                                  a1.version_number = a2.version_number

0

你也可以使用高性能的公共表达式(CTE)来解决这个问题。

WITH CTE AS
(
SELECT entitiy_id, group_id, version_number, filename,       
ROW_NUMBER() OVER (PARTITION BY entitiy_id, group_id ORDER BY version_number DESC) as RowNum
FROM attachments
)
SELECT entitiy_id, group_id, version_number, filename
FROM CTE 
WHERE RowNum = 1

或者

SELECT T.entitiy_id, T.group_id, T.version_number, T.filename
FROM (SELECT entitiy_id, group_id, version_number, filename,       
     ROW_NUMBER() OVER (PARTITION BY entitiy_id, group_id ORDER BY version_number DESC) as RowNum
     FROM attachments
     ) as T 
WHERE RowNum = 1

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接