How do I convert a string without hyphens to a java.util.UUID?
"5231b533ba17478798a3f2df37de2aD7" => #uuid "5231b533-ba17-4787-98a3-f2df37de2aD7"
java.util.UUID.fromString(
"5231b533ba17478798a3f2df37de2aD7"
.replaceFirst(
"(\\p{XDigit}{8})(\\p{XDigit}{4})(\\p{XDigit}{4})(\\p{XDigit}{4})(\\p{XDigit}+)", "$1-$2-$3-$4-$5"
)
).toString()
5231b533-ba17-4787-98a3-f2df37de2ad7
Or parse each half of the hex string as a long integer and pass both values to the UUID constructor.
UUID uuid = new UUID ( long1 , long2 ) ;
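To make the two-longs idea concrete, here is a minimal sketch. The class and method names (HexHalvesToUuid, fromHex32) are made up for illustration, and it assumes Java 8 or later for Long.parseUnsignedLong.

```java
import java.util.UUID;

class HexHalvesToUuid {
    // Hypothetical helper: builds a UUID from 32 hex digits (no hyphens).
    static UUID fromHex32(String hex) {
        if (hex == null || hex.length() != 32) {
            throw new IllegalArgumentException("expected 32 hex digits");
        }
        // parseUnsignedLong also accepts halves whose top bit is set,
        // which would overflow a signed parse.
        long mostSigBits = Long.parseUnsignedLong(hex.substring(0, 16), 16);
        long leastSigBits = Long.parseUnsignedLong(hex.substring(16), 16);
        return new UUID(mostSigBits, leastSigBits);
    }

    public static void main(String[] args) {
        // prints 5231b533-ba17-4787-98a3-f2df37de2ad7
        System.out.println(fromHex32("5231b533ba17478798a3f2df37de2aD7"));
    }
}
```

Note that this sketch only checks the length; any non-hex character surfaces as a NumberFormatException from the parse call.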
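A UUID is a 128-bit value. A UUID is not actually made up of letters and digits; it is made up of bits. You can think of it as describing a very, very large number.
We could display those bits as one hundred twenty-eight 0 and 1 characters.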
0111 0100 1101 0010 0101 0001 0101 0110 0110 0000 1110 0110 0100 0100 0100 1100 1010 0001 0111 0111 1010 1001 0110 1110 0110 0111 1110 1100 1111 1100 0101 1111
Humans do not easily read bits, so for convenience we usually represent the 128-bit value as a hexadecimal string made up of letters and digits.
74d25156-60e6-444c-a177-a96e67ecfc5f
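Such a hex string is not the UUID itself, only a human-friendly representation. The hyphens are added per the UUID spec as the canonical formatting, but they are optional.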
74d2515660e6444ca177a96e67ecfc5f
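The three renderings above are all views of the same 128-bit value. A small sketch of that, where UuidRenderings and its helper methods are illustrative names rather than anything from the JDK:

```java
import java.util.UUID;

class UuidRenderings {
    // Renders one 64-bit half as 64 '0'/'1' characters, keeping leading zeros.
    static String bits64(long value) {
        StringBuilder sb = new StringBuilder(64);
        for (int i = 63; i >= 0; i--) {
            sb.append(((value >>> i) & 1L) == 1L ? '1' : '0');
        }
        return sb.toString();
    }

    // All 128 bits: most significant half first.
    static String bits128(UUID uuid) {
        return bits64(uuid.getMostSignificantBits())
             + bits64(uuid.getLeastSignificantBits());
    }

    // The canonical hex text with the four hyphens stripped.
    static String hexWithoutHyphens(UUID uuid) {
        return uuid.toString().replace("-", "");
    }

    public static void main(String[] args) {
        UUID uuid = UUID.fromString("74d25156-60e6-444c-a177-a96e67ecfc5f");
        System.out.println(bits128(uuid));           // 128 characters of 0 and 1
        System.out.println(uuid);                    // hex with hyphens
        System.out.println(hexWithoutHyphens(uuid)); // hex without hyphens
    }
}
```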
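By the way, the UUID spec clearly requires lowercase letters when generating the hex string, while uppercase should be tolerated as input. Unfortunately, many implementations violate that lowercase-generation rule, including those from Apple, Microsoft, and others. See my blog post.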
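For what it is worth, java.util.UUID behaves the way the spec asks: it tolerates uppercase input and generates lowercase output. A quick sketch, where UuidCase and canonical are hypothetical names for illustration:

```java
import java.util.UUID;

class UuidCase {
    // Hypothetical helper: parse input in any letter case, emit the canonical text.
    static String canonical(String text) {
        return UUID.fromString(text).toString();
    }

    public static void main(String[] args) {
        // prints 74d25156-60e6-444c-a177-a96e67ecfc5f
        System.out.println(canonical("74D25156-60E6-444C-A177-A96E67ECFC5F"));
    }
}
```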
java.util.UUID uuidFromHyphens = java.util.UUID.fromString("6f34f25e-0b0d-4426-8ece-a8b3f27f4b63");
System.out.println( "UUID from string with hyphens: " + uuidFromHyphens );
java.util.UUID uuidFromNoHyphens = java.util.UUID.fromString("6f34f25e0b0d44268ecea8b3f27f4b63"); // throws IllegalArgumentException: fromString expects the hyphens
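One workaround is to format the hex string by inserting hyphens to reach the canonical form. Here is my attempt at using a regex to format the hex string. Beware: this code works, but I am no regex expert. You should check that the string is 32 characters long before formatting and 36 after.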
// -----| With Hyphens |----------------------
java.util.UUID uuidFromHyphens = java.util.UUID.fromString( "6f34f25e-0b0d-4426-8ece-a8b3f27f4b63" );
System.out.println( "UUID from string with hyphens: " + uuidFromHyphens );
System.out.println();
// -----| Without Hyphens |----------------------
String hexStringWithoutHyphens = "6f34f25e0b0d44268ecea8b3f27f4b63";
// Use regex to format the hex string by inserting hyphens in the canonical format: 8-4-4-4-12
String hexStringWithInsertedHyphens = hexStringWithoutHyphens.replaceFirst( "([0-9a-fA-F]{8})([0-9a-fA-F]{4})([0-9a-fA-F]{4})([0-9a-fA-F]{4})([0-9a-fA-F]+)", "$1-$2-$3-$4-$5" );
System.out.println( "hexStringWithInsertedHyphens: " + hexStringWithInsertedHyphens );
java.util.UUID myUuid = java.util.UUID.fromString( hexStringWithInsertedHyphens );
System.out.println( "myUuid: " + myUuid );
You might find this alternative syntax more readable. It uses POSIX notation within the regex, where \\p{XDigit} takes the place of [0-9a-fA-F] (see the Pattern documentation):
String hexStringWithInsertedHyphens = hexStringWithoutHyphens.replaceFirst( "(\\p{XDigit}{8})(\\p{XDigit}{4})(\\p{XDigit}{4})(\\p{XDigit}{4})(\\p{XDigit}+)", "$1-$2-$3-$4-$5" );
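The length checks mentioned earlier could be folded into a small helper along these lines. This is only a sketch: HyphenInserter and parse are illustrative names, and the last group is tightened from + to {12} so that the pattern must match the whole 32-character input.

```java
import java.util.UUID;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HyphenInserter {
    // Exactly 32 hex digits, captured in the canonical 8-4-4-4-12 grouping.
    private static final Pattern HEX32 = Pattern.compile(
        "(\\p{XDigit}{8})(\\p{XDigit}{4})(\\p{XDigit}{4})(\\p{XDigit}{4})(\\p{XDigit}{12})");

    static UUID parse(String hex) {
        if (hex == null || hex.length() != 32) {
            throw new IllegalArgumentException("expected 32 characters");
        }
        Matcher matcher = HEX32.matcher(hex);
        if (!matcher.matches()) {
            throw new IllegalArgumentException("input contains a non-hex character");
        }
        // replaceFirst resets the matcher and rewrites the single full match.
        String withHyphens = matcher.replaceFirst("$1-$2-$3-$4-$5");
        if (withHyphens.length() != 36) {
            throw new IllegalStateException("formatting did not yield 36 characters");
        }
        return UUID.fromString(withHyphens);
    }

    public static void main(String[] args) {
        // prints 6f34f25e-0b0d-4426-8ece-a8b3f27f4b63
        System.out.println(parse("6f34f25e0b0d44268ecea8b3f27f4b63"));
    }
}
```

Compiling the pattern once also avoids recompiling it on every call, which String.replaceFirst would do.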
A complete example:
java.util.UUID uuid =
java.util.UUID.fromString (
"5231b533ba17478798a3f2df37de2aD7"
.replaceFirst (
"(\\p{XDigit}{8})(\\p{XDigit}{4})(\\p{XDigit}{4})(\\p{XDigit}{4})(\\p{XDigit}+)",
"$1-$2-$3-$4-$5"
)
);
System.out.println ( "uuid.toString(): " + uuid );
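uuid.toString(): 5231b533-ba17-4787-98a3-f2df37de2ad7
Clojure's #uuid tagged literal is a pass-through to java.util.UUID/fromString. And fromString splits the string on the "-" and converts it into two Long values. (The format for a UUID is standardized as 8-4-4-4-12 hex digits, but the "-" is really only there for validation and visual identification.)
The straightforward solution is to reinsert the "-" and use java.util.UUID/fromString.
(defn uuid-from-string [data]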
(java.util.UUID/fromString
(clojure.string/replace data
#"(\w{8})(\w{4})(\w{4})(\w{4})(\w{12})"
"$1-$2-$3-$4-$5")))
If you would rather not use a regular expression, you can use a ByteBuffer and DatatypeConverter.
(defn uuid-from-string [data]
(let [buffer (java.nio.ByteBuffer/wrap
(javax.xml.bind.DatatypeConverter/parseHexBinary data))]
(java.util.UUID. (.getLong buffer) (.getLong buffer))))
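The same ByteBuffer idea can be sketched in Java. One caveat: the javax.xml.bind package was removed from the JDK in Java 11, so this version decodes the hex by hand instead of calling DatatypeConverter. HexBytesToUuid and its methods are illustrative names, not an existing API.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

class HexBytesToUuid {
    // Decodes 32 hex digits into 16 bytes without javax.xml.bind.
    static byte[] hexToBytes(String hex) {
        if (hex == null || hex.length() != 32) {
            throw new IllegalArgumentException("expected 32 hex digits");
        }
        byte[] bytes = new byte[16];
        for (int i = 0; i < 16; i++) {
            // Character.digit returns -1 when the char is not a digit in radix 16.
            int high = Character.digit(hex.charAt(2 * i), 16);
            int low = Character.digit(hex.charAt(2 * i + 1), 16);
            if (high < 0 || low < 0) {
                throw new IllegalArgumentException("non-hex character near index " + (2 * i));
            }
            bytes[i] = (byte) ((high << 4) | low);
        }
        return bytes;
    }

    static UUID fromHex32(String hex) {
        // ByteBuffer is big-endian by default: first 8 bytes are the high half.
        ByteBuffer buffer = ByteBuffer.wrap(hexToBytes(hex));
        return new UUID(buffer.getLong(), buffer.getLong());
    }

    public static void main(String[] args) {
        // prints 5231b533-ba17-4787-98a3-f2df37de2ad7
        System.out.println(fromHex32("5231b533ba17478798a3f2df37de2aD7"));
    }
}
```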
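You can use the regex (.{8})(.{4})(.{4})(.{4})(.{12}) instead, where the dot . means "any character". That way the regex engine does not need to check whether each character belongs to the "word" character class. - Ruslan Stelmachenko
The regex solution is probably faster, but you can also look at this :)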
String withoutDashes = "44e128a5-ac7a-4c9a-be4c-224b6bf81b20".replaceAll("-", "");
BigInteger bi1 = new BigInteger(withoutDashes.substring(0, 16), 16);
BigInteger bi2 = new BigInteger(withoutDashes.substring(16, 32), 16);
UUID uuid = new UUID(bi1.longValue(), bi2.longValue());
String withDashes = uuid.toString();
By the way, converting 16 binary bytes to a UUID:
InputStream is = ..binary input..;
byte[] bytes = IOUtils.toByteArray(is);
ByteBuffer bb = ByteBuffer.wrap(bytes);
UUID uuidWithDashesObj = new UUID(bb.getLong(), bb.getLong());
String uuidWithDashes = uuidWithDashesObj.toString();
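Since the snippet above depends on an input stream and IOUtils from outside the JDK, here is a self-contained sketch of the same byte conversion in both directions. UuidBytes, toBytes and fromBytes are names invented for this example.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

class UuidBytes {
    // UUID -> 16 bytes, most significant half first (big-endian).
    static byte[] toBytes(UUID uuid) {
        return ByteBuffer.allocate(16)
            .putLong(uuid.getMostSignificantBits())
            .putLong(uuid.getLeastSignificantBits())
            .array();
    }

    // 16 bytes -> UUID, reading the two halves back in the same order.
    static UUID fromBytes(byte[] bytes) {
        if (bytes == null || bytes.length != 16) {
            throw new IllegalArgumentException("expected exactly 16 bytes");
        }
        ByteBuffer bb = ByteBuffer.wrap(bytes);
        return new UUID(bb.getLong(), bb.getLong());
    }

    public static void main(String[] args) {
        UUID original = UUID.fromString("44e128a5-ac7a-4c9a-be4c-224b6bf81b20");
        UUID roundTripped = fromBytes(toBytes(original));
        System.out.println(original.equals(roundTripped)); // prints true
    }
}
```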
You could use a goofy regular expression replacement:
String digits = "5231b533ba17478798a3f2df37de2aD7";
String uuid = digits.replaceAll(
"(\\w{8})(\\w{4})(\\w{4})(\\w{4})(\\w{12})",
"$1-$2-$3-$4-$5");
System.out.println(uuid); // => 5231b533-ba17-4787-98a3-f2df37de2aD7
public static String addUUIDDashes(String idNoDashes) {
StringBuffer idBuff = new StringBuffer(idNoDashes);
idBuff.insert(20, '-');
idBuff.insert(16, '-');
idBuff.insert(12, '-');
idBuff.insert(8, '-');
return idBuff.toString();
}
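A much faster (~900%) solution compared to using regular expressions and string manipulation is to parse the hex string into 2 longs and create the UUID instance from those: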
(defn uuid-from-string
"Converts a 32digit hex string into java.util.UUID"
[hex]
(java.util.UUID.
(Long/parseUnsignedLong (subs hex 0 16) 16)
(Long/parseUnsignedLong (subs hex 16) 16)))
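The choice of parseUnsignedLong over parseLong matters here: any 16-digit half that starts with 8 through f has its top bit set and does not fit in a signed long. A small Java sketch of the difference, with SignedParsePitfall as a made-up class name:

```java
class SignedParsePitfall {
    // True if the 16-digit hex half fits in a signed long.
    static boolean parsesAsSignedLong(String hex16) {
        try {
            Long.parseLong(hex16, 16);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String half = "daa70a7ffa904841"; // top bit set
        System.out.println(parsesAsSignedLong(half)); // prints false
        // The unsigned parse keeps all 64 bits; the long simply reads as negative.
        long bits = Long.parseUnsignedLong(half, 16);
        System.out.println(Long.toHexString(bits)); // prints daa70a7ffa904841
    }
}
```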
An optimized version of @maerics's answer:
String[] digitsList= {
"daa70a7ffa904841bf9a81a67bdfdb45",
"529737c950e6428f80c0bac104668b54",
"5673c26e2e8f4c129906c74ec634b807",
"dd5a5ee3a3c44e4fb53d2e947eceeda5",
"faacc25d264d4e9498ade7a994dc612e",
"9a1d322dc70349c996dc1d5b76b44a0a",
"5fcfa683af5148a99c1bd900f57ea69c",
"fd9eae8272394dfd8fd42d2bc2933579",
"4b14d571dd4a4c9690796da318fc0c3a",
"d0c88286f24147f4a5d38e6198ee2d18"
};
//Use compiled pattern to improve performance of bulk operations
Pattern pattern = Pattern.compile("(\\w{8})(\\w{4})(\\w{4})(\\w{4})(\\w{12})");
for (int i = 0; i < digitsList.length; i++)
{
String uuid = pattern.matcher(digitsList[i]).replaceAll("$1-$2-$3-$4-$5");
System.out.println(uuid);
}
import java.util.UUID;
public class Example1 {
/**
* Get a UUID from a 32 char hexadecimal.
*
* @param string a hexadecimal string
* @return a UUID
*/
public static UUID toUuid(String string) {
if (string == null || string.length() != 32) {
throw new IllegalArgumentException("invalid input string!");
}
char[] input = string.toCharArray();
char[] output = new char[36];
System.arraycopy(input, 0, output, 0, 8);
System.arraycopy(input, 8, output, 9, 4);
System.arraycopy(input, 12, output, 14, 4);
System.arraycopy(input, 16, output, 19, 4);
System.arraycopy(input, 20, output, 24, 12);
output[8] = '-';
output[13] = '-';
output[18] = '-';
output[23] = '-';
return UUID.fromString(new String(output)); // fromString takes a String, not a char[]
}
public static void main(String[] args) {
UUID uuid = toUuid("daa70a7ffa904841bf9a81a67bdfdb45");
}
}
The uuid-creator library has a codec that can do this more efficiently: Base16Codec. For example:
// Parses base16 strings with 32 chars (case insensitive)
UuidCodec<String> codec = new Base16Codec();
UUID uuid = codec.decode("0123456789AB4DEFA123456789ABCDEF");
String hyphenlessUuid = in.nextString();
BigInteger bigInteger = new BigInteger(hyphenlessUuid, 16);
new UUID(bigInteger.shiftRight(64).longValue(), bigInteger.longValue());
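The snippet above reads its input from a surrounding reader (in.nextString()), so here is a standalone sketch of the same BigInteger approach. It relies on BigInteger.longValue() returning only the low-order 64 bits. The names BigIntegerToUuid and fromHex32 are hypothetical, and the length check is an addition that the original snippet does not have.

```java
import java.math.BigInteger;
import java.util.UUID;

class BigIntegerToUuid {
    static UUID fromHex32(String hex) {
        if (hex == null || hex.length() != 32) {
            throw new IllegalArgumentException("expected 32 hex digits");
        }
        BigInteger value = new BigInteger(hex, 16);
        // shiftRight(64) drops the low half; longValue() keeps the low 64 bits.
        long mostSigBits = value.shiftRight(64).longValue();
        long leastSigBits = value.longValue();
        return new UUID(mostSigBits, leastSigBits);
    }

    public static void main(String[] args) {
        // prints daa70a7f-fa90-4841-bf9a-81a67bdfdb45
        System.out.println(fromHex32("daa70a7ffa904841bf9a81a67bdfdb45"));
    }
}
```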
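I believe the following is the fastest in terms of performance. It is even slightly faster than the Long.parseUnsignedLong version. It is slightly altered code from java-uuid-generator.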
public static UUID from32(
String id) {
if (id == null) {
throw new NullPointerException();
}
if (id.length() != 32) {
throw new NumberFormatException("UUID has to be 32 char with no hyphens");
}
long lo, hi;
lo = hi = 0;
for (int i = 0, j = 0; i < 32; ++j) {
int curr;
char c = id.charAt(i);
if (c >= '0' && c <= '9') {
curr = (c - '0');
}
else if (c >= 'a' && c <= 'f') {
curr = (c - 'a' + 10);
}
else if (c >= 'A' && c <= 'F') {
curr = (c - 'A' + 10);
}
else {
throw new NumberFormatException(
"Non-hex character at #" + i + ": '" + c + "' (value 0x" + Integer.toHexString(c) + ")");
}
curr = (curr << 4);
c = id.charAt(++i);
if (c >= '0' && c <= '9') {
curr |= (c - '0');
}
else if (c >= 'a' && c <= 'f') {
curr |= (c - 'a' + 10);
}
else if (c >= 'A' && c <= 'F') {
curr |= (c - 'A' + 10);
}
else {
throw new NumberFormatException(
"Non-hex character at #" + i + ": '" + c + "' (value 0x" + Integer.toHexString(c) + ")");
}
if (j < 8) {
hi = (hi << 8) | curr;
}
else {
lo = (lo << 8) | curr;
}
++i;
}
return new UUID(hi, lo);
}