UUID format: 8-4-4-4-12 - Why is it designed this way?

108
Why are UUIDs written in the "8-4-4-4-12" format (digit counts)? I searched, but could not find the reason for it.
Example of a UUID in hex string format: 58D5E212-165B-4CA0-909B-C86B9CEE0111

17
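Actually, that example hex string is technically incorrect. The UUID spec requires that a hex string representing a UUID value use lowercase letters. The spec also requires that implementations be able to parse uppercase or mixed-case strings, but generate only lowercase strings. Unfortunately, common implementations, including those from Apple and Microsoft, violate this rule. - Basil Bourque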
1
Interesting, Basil, thanks. - Fidel
3 Answers

86

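As the RFC excerpt below shows, the five groups correspond to the fields time_low, time_mid, time_hi_and_version, the clock sequence (high and low octets together), and node.

Quoted from IETF RFC 4122: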

4.1.2.  Layout and Byte Order

   To minimize confusion about bit assignments within octets, the UUID
   record definition is defined only in terms of fields that are
   integral numbers of octets.  The fields are presented with the most
   significant one first.

   Field                  Data Type     Octet  Note
                                        #

   time_low               unsigned 32   0-3    The low field of the
                          bit integer          timestamp

   time_mid               unsigned 16   4-5    The middle field of the
                          bit integer          timestamp

   time_hi_and_version    unsigned 16   6-7    The high field of the
                          bit integer          timestamp multiplexed
                                               with the version number  

   clock_seq_hi_and_rese  unsigned 8    8      The high field of the
   rved                   bit integer          clock sequence
                                               multiplexed with the
                                               variant

   clock_seq_low          unsigned 8    9      The low field of the
                          bit integer          clock sequence

   node                   unsigned 48   10-15  The spatially unique
                          bit integer          node identifier

   In the absence of explicit application or presentation protocol
   specification to the contrary, a UUID is encoded as a 128-bit object,
   as follows:

   The fields are encoded as 16 octets, with the sizes and order of the
   fields defined above, and with each field encoded with the Most
   Significant Byte first (known as network byte order).  Note that the
   field names, particularly for multiplexed fields, follow historical
   practice.

   0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          time_low                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       time_mid                |         time_hi_and_version   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |clk_seq_hi_res |  clk_seq_low  |         node (0-1)            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         node (2-5)                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
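
To see how the six fields in this layout collapse into the five printed groups, here is a minimal sketch using the `uuid` module from Python's standard library. The helper name `groups` is hypothetical, chosen only for illustration, and the sample value is the one from the question.

```python
import uuid

def groups(u):
    """Rebuild the five hyphen-separated groups from the RFC 4122 fields."""
    # uuid.UUID.fields is a six-tuple in the same order as the RFC table.
    time_low, time_mid, time_hi_version, clock_seq_hi, clock_seq_low, node = u.fields
    return [
        f"{time_low:08x}",                          # 4 octets -> 8 hex digits
        f"{time_mid:04x}",                          # 2 octets -> 4 hex digits
        f"{time_hi_version:04x}",                   # 2 octets -> 4 hex digits
        f"{clock_seq_hi:02x}{clock_seq_low:02x}",   # 1 + 1 octets -> 4 hex digits
        f"{node:012x}",                             # 6 octets -> 12 hex digits
    ]

u = uuid.UUID("58D5E212-165B-4CA0-909B-C86B9CEE0111")
print("-".join(groups(u)))          # same text as str(u)
print([len(g) for g in groups(u)])  # the 8-4-4-4-12 pattern
```

The two one-octet clock sequence fields are printed back to back without a hyphen, which is why six fields yield only five groups.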

16
What is the reason the timestamp is split into three parts? - user253751
5
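How the fields are generated depends on the UUID version. The preferred method does not use a timestamp, because that would reveal when the ID was generated (a potential security issue). See http://zh.wikipedia.org/wiki/通用唯一识别码#变体及版本 for details. - pmont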
1
@pmont "Preferred"? - Basil Bourque
2
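@brocoli I have to disagree. V4 depends on a cryptographically strong random number generator, which is much harder to get right than simply grabbing the MAC address, the current time, and an incrementing arbitrary number (as in a V1 UUID). Moreover, V1 implementations are often open source, were built many years ago, have been widely used across the industry, and are by now very mature. Claiming that V1 is "prone to partial failure" is simply absurd. V1 UUIDs are the last part of your system you need to worry about failing. - Basil Bourque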
2
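@BasilBourque, with the spread of containers and container networking, one issue to be aware of is MAC address collisions. Containers and VMs usually draw their MAC addresses from a limited range. If I remember correctly, Hyper-V by default draws from a pool of only 256 possible MAC addresses. - Nathan Clayton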

14
The format is defined in Section 3 of IETF RFC 4122. The output format is defined where it says "UUID = ...".

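3. Namespace Registration Template

  Namespace ID:  UUID
  Registration Information:
     Registration date: 2003-10-01

  Declared registrant of the namespace:
     JTC 1/SC6 (ASN.1 Rapporteur Group)

  Declaration of syntactic structure:
     A UUID is an identifier that is unique across both space and time,
     with respect to the space of all UUIDs.  Since a UUID is a fixed
     size and contains a time field, it is possible for values to
     rollover (around A.D. 3400, depending on the specific algorithm
     used).  A UUID can be used for multiple purposes, from tagging
     objects with an extremely short lifetime, to reliably identifying
     very persistent objects across a network.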

  The internal representation of a UUID is a specific sequence of
  bits in memory, as described in Section 4.  To accurately
  represent a UUID as a URN, it is necessary to convert the bit
  sequence to a string representation.

  Each field is treated as an integer and has its value printed as a
  zero-filled hexadecimal digit string with the most significant
  digit first.  The hexadecimal values "a" through "f" are output as
  lower case characters and are case insensitive on input.

  The formal definition of the UUID string representation is
  provided by the following ABNF [7]:

  UUID                   = time-low "-" time-mid "-"
                           time-high-and-version "-"
                           clock-seq-and-reserved
                           clock-seq-low "-" node
  time-low               = 4hexOctet
  time-mid               = 2hexOctet
  time-high-and-version  = 2hexOctet
  clock-seq-and-reserved = hexOctet
  clock-seq-low          = hexOctet
  node                   = 6hexOctet
  hexOctet               = hexDigit hexDigit
  hexDigit =
        "0" / "1" / "2" / "3" / "4" / "5" / "6" / "7" / "8" / "9" /
        "a" / "b" / "c" / "d" / "e" / "f" /
        "A" / "B" / "C" / "D" / "E" / "F"

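The ABNF above translates fairly directly into a regular expression. The following is a rough sketch, not a reference implementation: `UUID_RE` and `normalize` are hypothetical names, and the function only checks the textual shape (it does not inspect version or variant bits). It accepts any letter case on input and returns the lowercase form the RFC asks generators to output.

```python
import re

# One pattern fragment per ABNF rule; each hexOctet is two hex digits.
UUID_RE = re.compile(
    r"[0-9a-fA-F]{8}-"   # time-low: 4hexOctet
    r"[0-9a-fA-F]{4}-"   # time-mid: 2hexOctet
    r"[0-9a-fA-F]{4}-"   # time-high-and-version: 2hexOctet
    r"[0-9a-fA-F]{4}-"   # clock-seq-and-reserved + clock-seq-low, no hyphen between
    r"[0-9a-fA-F]{12}"   # node: 6hexOctet
)

def normalize(text):
    """Validate the 8-4-4-4-12 shape and return the lowercase form."""
    if UUID_RE.fullmatch(text) is None:
        raise ValueError(f"not a UUID string: {text!r}")
    return text.lower()

print(normalize("58D5E212-165B-4CA0-909B-C86B9CEE0111"))
```

Python's own `uuid.UUID(...)` constructor is more lenient than this pattern (for example, it also accepts strings without hyphens), so a strict check like this is only useful when the exact textual form matters.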
7

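128 bits

The "8-4-4-4-12" format is just for human readability. A UUID is really a 128-bit number.

Since the string format takes 36 characters (32 hex digits plus 4 hyphens) to store or hold in memory, more than twice the 16 bytes of the 128-bit number, I suggest using the number internally and converting to the string format only when it needs to be displayed in a UI or exported to a file.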
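
As a small illustration of the size difference, here is a sketch using Python's standard `uuid` module; the variable names `raw` and `text` are just for this example.

```python
import uuid

u = uuid.UUID("58d5e212-165b-4ca0-909b-c86b9cee0111")

raw = u.bytes                   # the 128-bit value as 16 bytes
text = str(u).encode("ascii")   # the 8-4-4-4-12 string as ASCII bytes

print(len(raw), len(text))      # 16 bytes versus 36 bytes

# Both forms round-trip to the same UUID value.
assert uuid.UUID(bytes=raw) == u
assert uuid.UUID(text.decode("ascii")) == u
```

Storing `raw` (or the integer `u.int`) and formatting with `str(u)` only at the display or export boundary follows the suggestion above.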

