How can I easily compress and decompress Strings to/from byte arrays?

Question

12

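I have some strings of roughly 10K characters each, with a lot of repetition in them. They are serialized JSON objects. I'd like to easily compress them into a byte array, and decompress them from a byte array.

What is the easiest way to do this? I'm looking for methods that let me do the following: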

String original = "....long string here with 10K characters...";
byte[] compressed = StringCompressor.compress(original);
String decompressed = StringCompressor.decompress(compressed);
assert(original.equals(decompressed));

- Steve McLeod

1

I would use InflaterInputStream/DeflaterOutputStream together with ByteArrayInputStream/ByteArrayOutputStream. - Peter Lawrey

2

There is an easy-to-use "zip" package... Edit: it is here http://docs.oracle.com/javase/6/docs/api/java/util/zip/package-summary.html and it appears to use the classes @peter mentioned. - Tony Ennis

2

How about this? https://dev59.com/zHA65IYBdhLWcg3wvxeE - Mikita Belahlazau

Using only String and byte[], this should take no more than 10-15 lines, assuming the JSON is all ASCII. If you have to handle UTF-8, add another 10 lines. - CodeClown42
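
One way to read the comment above (an assumption, since the commenter posted no code) is to call `java.util.zip.Deflater` and `Inflater` directly on byte arrays instead of wrapping streams. The following is a minimal sketch; the class name `RawDeflate` is hypothetical. Because `String.getBytes(StandardCharsets.UTF_8)` already handles non-ASCII text, this version needs no separate UTF-8 path.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper: compresses with Deflater/Inflater directly, no streams.
final class RawDeflate {
    static byte[] compress(String text) {
        byte[] input = text.getBytes(StandardCharsets.UTF_8);
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(input);
            deflater.finish(); // no more input will follow
            byte[] out = new byte[64];
            int size = 0;
            while (!deflater.finished()) {
                // Grow the output buffer whenever it is full.
                if (size == out.length) out = Arrays.copyOf(out, out.length * 2);
                size += deflater.deflate(out, size, out.length - size);
            }
            return Arrays.copyOf(out, size);
        } finally {
            deflater.end(); // release native resources
        }
    }

    static String decompress(byte[] compressed) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            byte[] out = new byte[Math.max(64, compressed.length * 4)];
            int size = 0;
            while (!inflater.finished()) {
                if (size == out.length) out = Arrays.copyOf(out, out.length * 2);
                int n = inflater.inflate(out, size, out.length - size);
                // No progress and not finished: the input is cut short or needs a dictionary.
                if (n == 0 && !inflater.finished()
                        && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or unsupported input");
                }
                size += n;
            }
            return new String(out, 0, size, StandardCharsets.UTF_8);
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("not valid deflate data", e);
        } finally {
            inflater.end();
        }
    }
}
```

The output uses the same zlib format that `DeflaterOutputStream` produces by default, so the two approaches should be interchangeable on either side.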

3 Answers

3

Peter Lawrey's answer can be improved by using this less complex code for the decompress function.

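    // Drop-in replacement for StringCompressor.decompress;
    // needs java.util.zip.InflaterOutputStream.
    public static String decompress(byte[] bytes) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try {
            // Bytes written to an InflaterOutputStream arrive decompressed in baos.
            OutputStream out = new InflaterOutputStream(baos);
            out.write(bytes);
            out.close();
            return new String(baos.toByteArray(), "UTF-8");
        } catch (IOException e) {
            throw new AssertionError(e);
        }
    }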

- Ray Hulha

1

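I made a library to solve the problem of compressing generic strings (short ones in particular). It tries several algorithms (plain UTF-8, 5-bit encoding for Latin letters, Huffman encoding, gzip for long strings) and picks the one with the shortest result (in the worst case it chooses UTF-8 encoding, so you never risk losing space).

I hope it may be useful; here is the link: https://github.com/lithedream/lithestring Edit: I realized that your strings are always "long", and for those sizes my library defaults to gzip, so I'm afraid I can't do any better for you.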
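
The "try several encodings and keep the shortest" idea can be sketched with the standard library alone. This is not lithestring's API: the class `ShortestEncoding`, its one-byte tag, and the choice of only two candidates (plain UTF-8 and gzip) are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical class; illustrates the selection idea only.
final class ShortestEncoding {
    private static final byte RAW = 0;   // payload is plain UTF-8
    private static final byte GZIP = 1;  // payload is gzip-compressed UTF-8

    static byte[] encode(String text) {
        byte[] raw = text.getBytes(StandardCharsets.UTF_8);
        byte[] zipped = gzip(raw);
        // Keep whichever candidate is shorter and record the choice in a tag byte.
        boolean useGzip = zipped.length < raw.length;
        byte[] payload = useGzip ? zipped : raw;
        byte[] out = new byte[payload.length + 1];
        out[0] = useGzip ? GZIP : RAW;
        System.arraycopy(payload, 0, out, 1, payload.length);
        return out;
    }

    // Assumes data was produced by encode(), so it holds at least the tag byte.
    static String decode(byte[] data) {
        byte[] payload = Arrays.copyOfRange(data, 1, data.length);
        byte[] raw = data[0] == GZIP ? gunzip(payload) : payload;
        return new String(raw, StandardCharsets.UTF_8);
    }

    private static byte[] gzip(byte[] input) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(baos)) {
            out.write(input);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The gzip trailer is written on close, so read the bytes only now.
        return baos.toByteArray();
    }

    private static byte[] gunzip(byte[] input) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(input))) {
            byte[] buffer = new byte[8192];
            int len;
            while ((len = in.read(buffer)) != -1) {
                baos.write(buffer, 0, len);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return baos.toByteArray();
    }
}
```

In this sketch the worst case costs one tag byte more than plain UTF-8, which is slightly weaker than the "never lose space" guarantee described above.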

- lithedream

The standard API already solves the problem, so why would you need a library? - Lluis Martinez

- Peter Lawrey · Accepted Answer

You can try:

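import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// An enum with no constants: a utility class that cannot be instantiated.
enum StringCompressor {
    ;
    public static byte[] compress(String text) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try {
            OutputStream out = new DeflaterOutputStream(baos);
            out.write(text.getBytes("UTF-8"));
            out.close(); // close() finishes the deflate stream and flushes it into baos
        } catch (IOException e) {
            // Not expected: the stream is in memory and UTF-8 is always supported.
            throw new AssertionError(e);
        }
        return baos.toByteArray();
    }

    public static String decompress(byte[] bytes) {
        InputStream in = new InflaterInputStream(new ByteArrayInputStream(bytes));
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try {
            byte[] buffer = new byte[8192];
            int len;
            // read() returns -1 at end of stream, which ends the loop.
            while ((len = in.read(buffer)) > 0)
                baos.write(buffer, 0, len);
            return new String(baos.toByteArray(), "UTF-8");
        } catch (IOException e) {
            // Thrown when bytes is not valid deflate data.
            throw new AssertionError(e);
        }
    }
}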
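
Since the strings in the question are highly repetitive, the compression level may be worth tuning. The sketch below is a hypothetical variant of `compress` (the class name `LevelCompressor` and the `level` parameter are additions, not part of the answer) that passes an explicit `Deflater` to `DeflaterOutputStream`. The output is still in the default zlib format, so `decompress` above should read it unchanged.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;

// Hypothetical variant of compress(); the level parameter is added for illustration.
final class LevelCompressor {
    // level: Deflater.BEST_SPEED (1) .. Deflater.BEST_COMPRESSION (9),
    // or Deflater.DEFAULT_COMPRESSION (-1).
    static byte[] compress(String text, int level) {
        Deflater deflater = new Deflater(level);
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (OutputStream out = new DeflaterOutputStream(baos, deflater)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new AssertionError(e);
        } finally {
            // close() does not release a caller-supplied Deflater, so end it here.
            deflater.end();
        }
        return baos.toByteArray();
    }
}
```

A call such as `LevelCompressor.compress(original, Deflater.BEST_COMPRESSION)` trades some CPU time for potentially smaller output; whether that helps on 10K JSON strings is something to measure rather than assume.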