How does sorting by UUID work in PostgreSQL?

15

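When using the uuid_generate_v1() function in PostgreSQL, are there any ordering guarantees?

If so, do those guarantees hold per machine, or regardless of which machine the UUID was generated on? Since V1 UUIDs are generated from a timestamp plus a MAC address, does Postgres internally sort by the timestamp portion and then by the MAC?

Can I "order by" a UUID-typed column and expect it to always work (it seems to work)?

I want to generate UUIDs on multiple machines (using PostgreSQL's uuid_generate_v1()), copy them into a single Postgres instance, and then order by the UUID column. The order only needs to be guaranteed per machine, not across the UUIDs of all machines.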


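The timestamp can be extracted from a (v1) UUID. Have you seen this? https://dev59.com/21oU5IYBdhLWcg3wM04H - Andy Carlson
I saw it, but sorting seems to work even without the function suggested in that answer (using a plain ORDER BY on the column). I'm not sure whether that is machine-independent. - Thiago Sayão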
1
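There is no universal ordering for UUIDs (or for their byte layout). However, PostgreSQL has an implementation-specific UUID ordering, as does SQL Server (the two are not guaranteed to match; .NET has an ordering that differs from SQL Server's, even though both are Microsoft creations..). So the PostgreSQL ordering will be consistent. Whether uuid_generate_v1 is index-friendly in PgSQL and/or sorts well by "time" in PgSQL is a separate, specific detail. - user2864740
In any case, I couldn't find any technical documentation, but I would "assume" the order is "guaranteed" in PostgreSQL. Changing the order would be a "major breaking change", and the order probably depends on the binary storage encoding of the UUID. If there is any reason this data might be sorted outside of PgSQL, then don't rely on it D: - user2864740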
2 Answers

3

I adapted the query from the other answer (written for SQL Server) for PostgreSQL:

With UIDs As (--                     0 1 2 3  4 5  6 7  8 9  A B C D E F
            Select 'F' as id, cast('00000000-0000-0000-0000-000000000011' as uuid) as uid
    Union   Select 'E' as id, cast('00000000-0000-0000-0000-000000001100' as uuid) as uid
    Union   Select 'D' as id, cast('00000000-0000-0000-0000-000000110000' as uuid) as uid
    Union   Select 'C' as id, cast('00000000-0000-0000-0000-000011000000' as uuid) as uid
    Union   Select 'B' as id, cast('00000000-0000-0000-0000-001100000000' as uuid) as uid
    Union   Select 'A' as id, cast('00000000-0000-0000-0000-110000000000' as uuid) as uid
    Union   Select '9' as id, cast('00000000-0000-0000-0011-000000000000' as uuid) as uid
    Union   Select '8' as id, cast('00000000-0000-0000-1100-000000000000' as uuid) as uid
    Union   Select '7' as id, cast('00000000-0000-0011-0000-000000000000' as uuid) as uid
    Union   Select '6' as id, cast('00000000-0000-1100-0000-000000000000' as uuid) as uid
    Union   Select '5' as id, cast('00000000-0011-0000-0000-000000000000' as uuid) as uid
    Union   Select '4' as id, cast('00000000-1100-0000-0000-000000000000' as uuid) as uid
    Union   Select '3' as id, cast('00000011-0000-0000-0000-000000000000' as uuid) as uid
    Union   Select '2' as id, cast('00001100-0000-0000-0000-000000000000' as uuid) as uid
    Union   Select '1' as id, cast('00110000-0000-0000-0000-000000000000' as uuid) as uid
    Union   Select '0' as id, cast('11000000-0000-0000-0000-000000000000' as uuid) as uid
)
Select * From UIDs Order By uid desc

In PostgreSQL the sort order is different:

Position by highest-to-lowest value

 id  uid
 0  11000000-0000-0000-0000-000000000000
 1  00110000-0000-0000-0000-000000000000
 2  00001100-0000-0000-0000-000000000000
 3  00000011-0000-0000-0000-000000000000
 4  00000000-1100-0000-0000-000000000000
 5  00000000-0011-0000-0000-000000000000
 6  00000000-0000-1100-0000-000000000000
 7  00000000-0000-0011-0000-000000000000
 8  00000000-0000-0000-1100-000000000000
 9  00000000-0000-0000-0011-000000000000
 A  00000000-0000-0000-0000-110000000000
 B  00000000-0000-0000-0000-001100000000
 C  00000000-0000-0000-0000-000011000000
 D  00000000-0000-0000-0000-000000110000
 E  00000000-0000-0000-0000-000000001100
 F  00000000-0000-0000-0000-000000000011

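This means that in PostgreSQL, UUIDs are not sorted by their time component. Internally, PostgreSQL uses memcmp to compare UUIDs according to their memory layout, so byte 0 is the most significant and byte F the least. A v1 UUID stores the low 32 bits of its timestamp in the first four bytes, so a byte-wise comparison looks at the fastest-changing part of the timestamp first.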
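To make the point above concrete, here is a minimal Python sketch that models the memcmp comparison as a comparison of the 16 raw bytes. It is an illustration under assumptions, not PostgreSQL's code: `make_v1`, `bytewise_order`, `per_machine_time_order`, and the node values are hypothetical names and data invented for this example, and the UUIDs are hand-built from the RFC 4122 v1 field layout rather than produced by uuid_generate_v1().

```python
import uuid


def make_v1(timestamp: int, node: int, clock_seq: int = 0) -> uuid.UUID:
    """Build a version-1 UUID from a 60-bit timestamp and a 48-bit node id.

    Hypothetical helper for this example; it follows the RFC 4122 field layout.
    """
    time_low = timestamp & 0xFFFFFFFF                         # bytes 0-3
    time_mid = (timestamp >> 32) & 0xFFFF                     # bytes 4-5
    time_hi_version = ((timestamp >> 48) & 0x0FFF) | 0x1000   # bytes 6-7, version 1
    clock_seq_hi_variant = ((clock_seq >> 8) & 0x3F) | 0x80   # byte 8, RFC 4122 variant
    clock_seq_low = clock_seq & 0xFF                          # byte 9
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version,
                             clock_seq_hi_variant, clock_seq_low, node))


def bytewise_order(uuids):
    """Sort the way a memcmp over the 16 bytes would (byte 0 most significant)."""
    return sorted(uuids, key=lambda u: u.bytes)


def per_machine_time_order(uuids):
    """Group by node (MAC), then order by the embedded v1 timestamp."""
    return sorted(uuids, key=lambda u: (u.node, u.time))


# Made-up node ids standing in for two machines.
NODE_A = 0x0000000000AA
NODE_B = 0x0000000000BB

# 'later' has the larger timestamp, but its low 32 bits (stored first) are smaller.
earlier = make_v1(0x1_FFFF_FFFF, NODE_A)
later = make_v1(0x2_0000_0000, NODE_A)
other = make_v1(0x1_8000_0000, NODE_B)

if __name__ == "__main__":
    print("byte-wise:  ", [str(u) for u in bytewise_order([earlier, later])])
    print("by time:    ", [str(u) for u in sorted([earlier, later], key=lambda u: u.time)])
    print("per machine:", [str(u) for u in per_machine_time_order([later, other, earlier])])
```

In this sketch the byte-wise order puts `later` before `earlier`, because the low timestamp bits wrapped around. If per-machine time order is what matters, sorting on the extracted node and timestamp, as `per_machine_time_order` does, is one way to avoid depending on the byte layout.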


0

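While this is not a definitive answer (i.e. to "is the behavior defined across all PostgreSQL installations?"), this SQL (written for SQL Server) checks how each byte of a GUID contributes to the ordering. It may need some tweaks to work in PostgreSQL.

Generating a map like this should let one see how a particular UUID structure (whether one of the well-defined types or something else) "sorts in a particular way" in PostgreSQL.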

With UIDs As (--                           0 1 2 3  4 5  6 7  8 9  A B C D E F
            Select ID = 'F', UID = cast ('00000000-0000-0000-0000-000000000011' as uniqueidentifier)
    Union   Select ID = 'E', UID = cast ('00000000-0000-0000-0000-000000001100' as uniqueidentifier)
    Union   Select ID = 'D', UID = cast ('00000000-0000-0000-0000-000000110000' as uniqueidentifier)
    Union   Select ID = 'C', UID = cast ('00000000-0000-0000-0000-000011000000' as uniqueidentifier)
    Union   Select ID = 'B', UID = cast ('00000000-0000-0000-0000-001100000000' as uniqueidentifier)
    Union   Select ID = 'A', UID = cast ('00000000-0000-0000-0000-110000000000' as uniqueidentifier)
    Union   Select ID = '9', UID = cast ('00000000-0000-0000-0011-000000000000' as uniqueidentifier)
    Union   Select ID = '8', UID = cast ('00000000-0000-0000-1100-000000000000' as uniqueidentifier)
    Union   Select ID = '7', UID = cast ('00000000-0000-0011-0000-000000000000' as uniqueidentifier)
    Union   Select ID = '6', UID = cast ('00000000-0000-1100-0000-000000000000' as uniqueidentifier)
    Union   Select ID = '5', UID = cast ('00000000-0011-0000-0000-000000000000' as uniqueidentifier)
    Union   Select ID = '4', UID = cast ('00000000-1100-0000-0000-000000000000' as uniqueidentifier)
    Union   Select ID = '3', UID = cast ('00000011-0000-0000-0000-000000000000' as uniqueidentifier)
    Union   Select ID = '2', UID = cast ('00001100-0000-0000-0000-000000000000' as uniqueidentifier)
    Union   Select ID = '1', UID = cast ('00110000-0000-0000-0000-000000000000' as uniqueidentifier)
    Union   Select ID = '0', UID = cast ('11000000-0000-0000-0000-000000000000' as uniqueidentifier)
)
Select * From UIDs Order By UID desc

In SQL Server (2014, and matching SQL Server 2005), the descending order is:
Position by highest-to-lowest value:

A B C D E F | 8 9 | 7 6 | 5 4 | 3 2 1 0
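The byte-significance map above can be replayed outside the database. The following Python sketch rebuilds the 16 probe values and sorts them with two keys: a plain byte-wise key, and a key that reorders the bytes to match the order observed above. It is only a model of the reported result, not SQL Server's implementation; `PROBES`, `OBSERVED_SQL_SERVER_SIGNIFICANCE`, and the helper functions are hypothetical names introduced for this example.

```python
import uuid

HEX_LABELS = "0123456789ABCDEF"


def probe(position: int) -> uuid.UUID:
    """Return a UUID whose only non-zero byte (0x11) sits at the given position."""
    raw = bytearray(16)
    raw[position] = 0x11
    return uuid.UUID(bytes=bytes(raw))


# Same 16 values as in the query: label '0' marks byte 0, ..., label 'F' marks byte 15.
PROBES = {HEX_LABELS[i]: probe(i) for i in range(16)}

# Byte positions from most to least significant, as read off the observed result
# "A B C D E F | 8 9 | 7 6 | 5 4 | 3 2 1 0".
OBSERVED_SQL_SERVER_SIGNIFICANCE = [10, 11, 12, 13, 14, 15, 8, 9, 7, 6, 5, 4, 3, 2, 1, 0]


def bytewise_key(u: uuid.UUID) -> bytes:
    """Plain byte-by-byte comparison in text order (byte 0 most significant)."""
    return u.bytes


def observed_sql_server_key(u: uuid.UUID) -> bytes:
    """Reorder the bytes so that a plain comparison reproduces the observed order."""
    return bytes(u.bytes[i] for i in OBSERVED_SQL_SERVER_SIGNIFICANCE)


def descending_labels(key) -> str:
    """Sort the probes from highest to lowest under `key` and return their labels."""
    ordered = sorted(PROBES, key=lambda label: key(PROBES[label]), reverse=True)
    return "".join(ordered)


if __name__ == "__main__":
    print("byte-wise:          ", descending_labels(bytewise_key))
    print("observed SQL Server:", descending_labels(observed_sql_server_key))
```

A key function like `observed_sql_server_key` is one way to reproduce a database-specific order in application code, but it is only as reliable as the observation it was derived from.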

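Because SQL Server's newsequentialid relies on this ordering to generate index-friendly GUIDs, this behavior will probably never change. SQL Server must also maintain it across all systems in order to support replication. So if the question were about SQL Server, I would say with certainty that "GUIDs have a consistent ordering in SQL Server", one that can be fully relied upon within SQL Server.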

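However, this ordering differs from .NET's GUID ordering, and I would not be surprised if the ordering in PostgreSQL were different as well. The "flipped" groups in SQL Server arise because it follows the "Variant 2" (a.k.a. little-endian) ordering of COM GUIDs, even for "Variant 1" UUIDs. (It still seems somewhat arbitrary, though: why are the groups themselves ordered right to left? More Microsoft history?)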

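The interesting question remains: in PostgreSQL, how and where is this ordering specified? If it is not well specified, can the implementation still be treated as a behavioral axiom?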

See also further details on SQL Server UUIDs, including the juicy details on "why" these differences exist.

2
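For anyone arriving here wondering what this script outputs in PostgreSQL: after minor modifications, the output is sorted neatly as 0 1 2 3 ... F. (PostgreSQL 9.5.14 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.10), 64-bit) - Codism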

Content provided by Stack Overflow.