最快的高斯模糊实现

Question

最快的高斯模糊实现

javaimage-processingfilteringgaussian

44

如何实现最快的高斯模糊算法？

我将在Java中实现它，因此排除了GPU解决方案。我的应用程序planetGenesis是跨平台的，所以我不想使用JNI。

- Sid Datta

3

值得一看（或直接使用！）JH Labs 的高斯滤波器（GaussianFilter）- http://www.jhlabs.com/ip/filters/index.html - 我用过它，速度相当快。 - mikera

请看这里：https://github.com/RoyiAvital/FastGuassianBlur - Royi

15个回答

30

数学精英可能已经知道这个，但对于其他人来说..

由于高斯函数的一个美好数学特性，您可以通过先对图像的每行运行1D高斯模糊，然后对每列运行1D模糊，快速地使2D图像变得模糊。

- DarenW

17

我发现了Quasimondo : 孵化器 : 处理 : 快速高斯模糊。该方法包含许多近似值，例如使用整数和查找表而不是浮点数和浮点数除法。我不知道在现代Java代码中这会有多大的加速。
快速矩形阴影使用B样条进行近似算法。
C#中的快速高斯模糊算法声称具有一些很酷的优化。
此外，David Everly的快速高斯模糊（PDF）具有快速的高斯模糊处理方法。

我将尝试各种方法，对它们进行基准测试，并在此处发布结果。

为了我的目的，我从互联网上复制并实现了基本的（独立处理X-Y轴）方法和David Everly的快速高斯模糊方法。它们的参数不同，因此我无法直接比较它们。但是后者在大模糊半径下经过的迭代次数要少得多。此外，后者是一种近似算法。

- Sid Datta

8

您可能需要盒状模糊效果，这种方法速度更快。请参考此链接获取一份优秀教程以及可复制的 C 代码。

- Steve Hanov

你如何将高斯核的标准差与盒状模糊的长度联系起来？ - Royi

7

如果要使用更大的模糊半径，可以尝试应用三次盒状模糊。这将很好地近似高斯模糊，并且比真正的高斯模糊快得多。

- Tom Sirgedas

2

基本上，如果您想要一个“模糊直径”为20，应用直径为7、7、6的盒子模糊。这将产生类似于直径为20的单个盒子模糊的模糊效果，但外观更加美观。 - Tom Sirgedas

1

据我所知，Photoshop使用的是伪高斯模糊而非真正的高斯模糊。 - Camilo Martin

@Jaan 我也不知道，我是通过尝试得出来的，它有效！它一定类似于在二维高斯函数上积分（体积），同时保持相同的期望值，然后将其“转换”为盒子形状。请查看我的下面的答案！ - Ivan Kuckir

当近似给定的高斯分布时，一个合理的想法是匹配其标准差。然后正确的公式是BoxBlurRadius = sqrt(0.25+3σ^2/IterationCount)-0.5，其中盒子宽度为2BoxBlurRadius+1，σ是1D高斯模糊的标准差。对于较大的σ值，这简化为BoxBlurRadius = sqrt(3/IterationCount)*σ。（我使用了这些事实：卷积是可交换的；两个密度的卷积是独立随机变量和的密度；这种和的方差是方差之和；(-r)^2 + ... + r^2 = r(r+1)(2r+1)/3。） - Jaan

@Drazick 请查看我的答案 http://stackoverflow.com/a/18375487/978280，我的实现实际上比Adobe Photoshop的更精确。 - Ivan Kuckir

显示剩余5条评论

4

我已经将Ivan Kuckir实现的快速高斯模糊，使用三次线性方框模糊转换成了Java代码。结果的时间复杂度为O(n)，正如他在自己的博客中所述。如果你想了解为什么三次方框模糊可以近似于高斯模糊（3％），你可以查看方框模糊和高斯模糊。这是Java代码实现。

@Override
public BufferedImage ProcessImage(BufferedImage image) {
    int width = image.getWidth();
    int height = image.getHeight();

    int[] pixels = image.getRGB(0, 0, width, height, null, 0, width);
    int[] changedPixels = new int[pixels.length];

    FastGaussianBlur(pixels, changedPixels, width, height, 12);

    BufferedImage newImage = new BufferedImage(width, height, image.getType());
    newImage.setRGB(0, 0, width, height, changedPixels, 0, width);

    return newImage;
}

private void FastGaussianBlur(int[] source, int[] output, int width, int height, int radius) {
    ArrayList<Integer> gaussianBoxes = CreateGausianBoxes(radius, 3);
    BoxBlur(source, output, width, height, (gaussianBoxes.get(0) - 1) / 2);
    BoxBlur(output, source, width, height, (gaussianBoxes.get(1) - 1) / 2);
    BoxBlur(source, output, width, height, (gaussianBoxes.get(2) - 1) / 2);
}

private ArrayList<Integer> CreateGausianBoxes(double sigma, int n) {
    double idealFilterWidth = Math.sqrt((12 * sigma * sigma / n) + 1);

    int filterWidth = (int) Math.floor(idealFilterWidth);

    if (filterWidth % 2 == 0) {
        filterWidth--;
    }

    int filterWidthU = filterWidth + 2;

    double mIdeal = (12 * sigma * sigma - n * filterWidth * filterWidth - 4 * n * filterWidth - 3 * n) / (-4 * filterWidth - 4);
    double m = Math.round(mIdeal);

    ArrayList<Integer> result = new ArrayList<>();

    for (int i = 0; i < n; i++) {
        result.add(i < m ? filterWidth : filterWidthU);
    }

    return result;
}

private void BoxBlur(int[] source, int[] output, int width, int height, int radius) {
    System.arraycopy(source, 0, output, 0, source.length);
    BoxBlurHorizantal(output, source, width, height, radius);
    BoxBlurVertical(source, output, width, height, radius);
}

private void BoxBlurHorizontal(int[] sourcePixels, int[] outputPixels, int width, int height, int radius) {
    int resultingColorPixel;
    float iarr = 1f / (radius + radius);
    for (int i = 0; i < height; i++) {
        int outputIndex = i * width;
        int li = outputIndex;
        int sourceIndex = outputIndex + radius;

        int fv = Byte.toUnsignedInt((byte) sourcePixels[outputIndex]);
        int lv = Byte.toUnsignedInt((byte) sourcePixels[outputIndex + width - 1]);
        float val = (radius) * fv;

        for (int j = 0; j < radius; j++) {
            val += Byte.toUnsignedInt((byte) (sourcePixels[outputIndex + j]));
        }

        for (int j = 0; j < radius; j++) {
            val += Byte.toUnsignedInt((byte) sourcePixels[sourceIndex++]) - fv;
            resultingColorPixel = Byte.toUnsignedInt(((Integer) Math.round(val * iarr)).byteValue());
            outputPixels[outputIndex++] = (0xFF << 24) | (resultingColorPixel << 16) | (resultingColorPixel << 8) | (resultingColorPixel);
        }

        for (int j = (radius + 1); j < (width - radius); j++) {
            val += Byte.toUnsignedInt((byte) sourcePixels[sourceIndex++]) - Byte.toUnsignedInt((byte) sourcePixels[li++]);
            resultingColorPixel = Byte.toUnsignedInt(((Integer) Math.round(val * iarr)).byteValue());
            outputPixels[outputIndex++] = (0xFF << 24) | (resultingColorPixel << 16) | (resultingColorPixel << 8) | (resultingColorPixel);
        }

        for (int j = (width - radius); j < width; j++) {
            val += lv - Byte.toUnsignedInt((byte) sourcePixels[li++]);
            resultingColorPixel = Byte.toUnsignedInt(((Integer) Math.round(val * iarr)).byteValue());
            outputPixels[outputIndex++] = (0xFF << 24) | (resultingColorPixel << 16) | (resultingColorPixel << 8) | (resultingColorPixel);
        }
    }
}

private void BoxBlurVertical(int[] sourcePixels, int[] outputPixels, int width, int height, int radius) {
    int resultingColorPixel;
    float iarr = 1f / (radius + radius + 1);
    for (int i = 0; i < width; i++) {
        int outputIndex = i;
        int li = outputIndex;
        int sourceIndex = outputIndex + radius * width;

        int fv = Byte.toUnsignedInt((byte) sourcePixels[outputIndex]);
        int lv = Byte.toUnsignedInt((byte) sourcePixels[outputIndex + width * (height - 1)]);
        float val = (radius + 1) * fv;

        for (int j = 0; j < radius; j++) {
            val += Byte.toUnsignedInt((byte) sourcePixels[outputIndex + j * width]);
        }
        for (int j = 0; j <= radius; j++) {
            val += Byte.toUnsignedInt((byte) sourcePixels[sourceIndex]) - fv;
            resultingColorPixel = Byte.toUnsignedInt(((Integer) Math.round(val * iarr)).byteValue());
            outputPixels[outputIndex] = (0xFF << 24) | (resultingColorPixel << 16) | (resultingColorPixel << 8) | (resultingColorPixel);
            sourceIndex += width;
            outputIndex += width;
        }
        for (int j = radius + 1; j < (height - radius); j++) {
            val += Byte.toUnsignedInt((byte) sourcePixels[sourceIndex]) - Byte.toUnsignedInt((byte) sourcePixels[li]);
            resultingColorPixel = Byte.toUnsignedInt(((Integer) Math.round(val * iarr)).byteValue());
            outputPixels[outputIndex] = (0xFF << 24) | (resultingColorPixel << 16) | (resultingColorPixel << 8) | (resultingColorPixel);
            li += width;
            sourceIndex += width;
            outputIndex += width;
        }
        for (int j = (height - radius); j < height; j++) {
            val += lv - Byte.toUnsignedInt((byte) sourcePixels[li]);
            resultingColorPixel = Byte.toUnsignedInt(((Integer) Math.round(val * iarr)).byteValue());
            outputPixels[outputIndex] = (0xFF << 24) | (resultingColorPixel << 16) | (resultingColorPixel << 8) | (resultingColorPixel);
            li += width;
            outputIndex += width;
        }
    }
}

- Ali Akdurak

1

它的功能很好，但结果图像是黑白的，我该如何使它变成彩色？ - Tamer Saleh

2

在一维情况下：

使用几乎任何核进行重复模糊处理最终都会趋向于高斯核。这就是高斯分布的魅力所在，也是统计学家喜欢它的原因。因此，选择一个易于模糊处理的东西，并多次应用它。

例如，可以使用盒形核进行模糊处理。首先计算累积和：

y(i) = y(i-1) + x(i)

那么：

blurred(i) = y(i+radius) - y(i-radius)

多次重复。

或者您可以使用各种形式的IIR滤波器来前后反复，这些滤波器同样快速。

在二维或更高维度中：

依次在每个维度中进行模糊处理，如DarenW所说。

- Paul Harrison

2

我在研究中遇到了这个问题，尝试了一种快速高斯模糊的有趣方法。首先，如上所述，最好将模糊分为两个1D模糊，但根据您的硬件实际计算像素值，您实际上可以预计算所有可能的值并将它们存储在查找表中。

换句话说，预先计算每个高斯系数*输入像素值的组合。当然，您需要离散化系数，但我只是想添加这个解决方案。如果您有IEEE订阅，您可以在Fast image blurring using Lookup Table for real time feature extraction中了解更多信息。

最终，我最终使用了CUDA :)

- Ben McIntosh

2

如果你想使用更大的内核，我建议你考虑使用CUDA或其他GPU编程工具包。如果不行，你可以手动调整循环的汇编代码。

- Dana The Sane

2

步骤 1：SIMD 一维高斯模糊步骤 2：转置步骤 3：重复步骤 1

最好在小块上执行此操作，因为全图转置很慢，而小块转置可以使用PUNPCKs（PUNPCKHBW、PUNPCKHDQ、PUNPCKHWD、PUNPCKLBW、PUNPCKLDQ、PUNPCKLWD）链非常快速地完成。

- Dark Shikari

网页内容由stack overflow 提供, 点击上面的

可以查看英文原文，
原文链接

- Dima · Accepted Answer

你应该利用高斯核是可分离的事实，也就是说你可以将二维卷积表示为两个一维卷积的组合。

如果滤波器很大，使用空间域中的卷积等效于频率（傅里叶）域中的乘法可能更有意义。这意味着您可以对图像和滤波器进行傅里叶变换，相乘得到（复杂的）结果，然后进行逆傅里叶变换。 FFT（快速傅里叶变换）的复杂度为O(n log n)，而卷积的复杂度为O(n^2)。此外，如果您需要使用相同的滤波器模糊多个图像，则只需对滤波器进行一次FFT即可。

如果您决定使用FFT，则FFTW库是一个不错的选择。FFTW library