实现对数Gabor滤波器组

9

我正在阅读这篇论文"自反2D Log-Gabor小波",它将2D log gabor滤波器定义为:

enter image description here enter image description here

该论文还指出,该滤波器仅覆盖频率空间的一侧,并在此图像中显示。

enter image description here

在我尝试实现筛选器时,我得到的结果与论文中所述的不符。让我先介绍我的实现,然后再说明问题。
实现:
  1. I created a 2d array that contains the filter and transformed each index so that the origin of the frequency domain is at the center of the array with positive x-axis going right and positive y-axis going up.

    number_scales = 5         # scale resolution
    number_orientations = 9   # orientation resolution
    N = constantDim           # image dimensions
    
    def getLogGaborKernal(scale, angle, logfun=math.log2, norm = True):
        # setup up filter configuration
        center_scale = logfun(N) - scale          
        center_angle = ((np.pi/number_orientations) * angle) if (scale % 2) \
                    else ((np.pi/number_orientations) * (angle+0.5))
        scale_bandwidth =  0.996 * math.sqrt(2/3)
        angle_bandwidth =  0.996 * (1/math.sqrt(2)) * (np.pi/number_orientations)
    
        # 2d array that will hold the filter
        kernel = np.zeros((N, N))
        # get the center of the 2d array so we can shift origin
        middle = math.ceil((N/2)+0.1)-1
    
        # calculate the filter
        for x in range(0,constantDim):
            for y in range(0,constantDim):
                # get the transformed x and y where origin is at center
                # and positive x-axis goes right while positive y-axis goes up
                x_t, y_t = (x-middle),-(y-middle)
                # calculate the filter value at given index
                kernel[y,x] = logGaborValue(x_t,y_t,center_scale,center_angle,
            scale_bandwidth, angle_bandwidth,logfun)
    
        # normalize the filter energy
        if norm:
            Kernel = kernel / np.sum(kernel**2)
        return kernel
    
  2. To calculate the filter value at each index another transform is made where we go to the log-polar space

    def logGaborValue(x,y,center_scale,center_angle,scale_bandwidth,
                  angle_bandwidth, logfun):
        # transform to polar coordinates
        raw, theta = getPolar(x,y)
        # if we are at the center, return 0 as in the log space
        # zero is not defined
        if raw == 0:
            return 0
    
        # go to log polar coordinates
        raw = logfun(raw)
    
        # calculate (theta-center_theta), we calculate cos(theta-center_theta) 
        # and sin(theta-center_theta) then use atan to get the required value,
        # this way we can eliminate the angular distance wrap around problem
        costheta, sintheta = math.cos(theta), math.sin(theta)
        ds = sintheta * math.cos(center_angle) - costheta * math.sin(center_angle)    
        dc = costheta * math.cos(center_angle) + sintheta * math.sin(center_angle)  
        dtheta = math.atan2(ds,dc)
    
        # final value, multiply the radial component by the angular one
        return math.exp(-0.5 * ((raw-center_scale) / scale_bandwidth)**2) * \
                math.exp(-0.5 * (dtheta/angle_bandwidth)**2)
    

Problems:

  1. 角度:论文指出,将角度从1->8进行索引会产生良好的方向覆盖,但是在我的实现中,从1->n的角度并没有覆盖全部方向,甚至垂直方向也没有被正确覆盖。可以通过以下图示来展示:该图包含了尺度为3且方向从1->8的一组滤波器。

    enter image description here

  2. 覆盖率:通过上面的滤波器可知滤波器覆盖了空间的两侧,这与论文所说的不同。这可以通过使用从 -4 -> 4 的9个方向来更明确地表示。下面的图像包含了所有滤波器的一个图像,以展示它如何涵盖频谱的两侧(此图像是通过从所有滤波器的各个位置取最大值来创建的):

    enter image description here

  3. 中间列(方向 $\pi / 2$):在第一幅图中,从3->8的方向可以看到滤波器在方向 $\pi / 2$处消失。这正常吗?当我将所有滤波器(所有5个尺度和9个方向)合并到一个图像中时,也可以看到这一点:

    enter image description here

更新: 在空间域中添加了滤波器的脉冲响应,如您所见,在-4和4个方向上存在明显的失真:

enter image description here

2个回答

6

经过大量代码分析,我发现我的实现是正确的,但是getPolar函数出了问题,所以上面的代码应该可以正常工作。如果有人需要,这是一个没有getPolar函数的新代码:

number_scales = 5          # scale resolution
number_orientations = 8    # orientation resolution
N = 128                    # image dimensions
def getFilter(f_0, theta_0):
    # filter configuration
    scale_bandwidth =  0.996 * math.sqrt(2/3)
    angle_bandwidth =  0.996 * (1/math.sqrt(2)) * (np.pi/number_orientations)

    # x,y grid
    extent = np.arange(-N/2, N/2 + N%2)
    x, y = np.meshgrid(extent,extent)

    mid = int(N/2)
    ## orientation component ##
    theta = np.arctan2(y,x)
    center_angle = ((np.pi/number_orientations) * theta_0) if (f_0 % 2) \
                else ((np.pi/number_orientations) * (theta_0+0.5))

    # calculate (theta-center_theta), we calculate cos(theta-center_theta) 
    # and sin(theta-center_theta) then use atan to get the required value,
    # this way we can eliminate the angular distance wrap around problem
    costheta = np.cos(theta)
    sintheta = np.sin(theta)
    ds = sintheta * math.cos(center_angle) - costheta * math.sin(center_angle)    
    dc = costheta * math.cos(center_angle) + sintheta * math.sin(center_angle)  
    dtheta = np.arctan2(ds,dc)

    orientation_component =  np.exp(-0.5 * (dtheta/angle_bandwidth)**2)

    ## frequency componenet ##
    # go to polar space
    raw = np.sqrt(x**2+y**2)
    # set origin to 1 as in the log space zero is not defined
    raw[mid,mid] = 1
    # go to log space
    raw = np.log2(raw)

    center_scale = math.log2(N) - f_0
    draw = raw-center_scale
    frequency_component = np.exp(-0.5 * (draw/ scale_bandwidth)**2)

    # reset origin to zero (not needed as it is already 0?)
    frequency_component[mid,mid] = 0

    return frequency_component * orientation_component

感谢您回复自己的问题。我也在寻找有关虹膜识别中的对数 Gabor 滤波器的信息,并发现您的帖子非常有用。 - pem

1
是的 - 这对我也非常有帮助。因为我刚刚开始学习log-gabor滤波器,所以花了一些时间才意识到一些可能很基础的东西,但我想在这里提一下给其他人。Log-gabor是一种频率滤波器。所以,与其创建一个小的核并在图像上进行卷积,不如在频率空间中直接进行乘法运算。
这意味着N必须与图像大小匹配,并且您必须首先使用fft进行转换,然后只需乘以滤波器,最后再使用逆fft进行转换。这段代码给了我所需的完整的端到端测试,可以对图像进行滤波。然后,我可以渲染结果图像并验证我得到了我所期望的结果。作为一个测试,可以通过合并成一个单一的RGB图像来轻松可视化3个方向。
def fft(im):
    return np.fft.fftshift(np.fft.fft2(im))

def ifft(f):
    return np.real(np.fft.ifft2(np.fft.ifftshift(f)))

N = src.shape[0]
f_0 = 4
number_orientations = 4
lg = [getLogGaborFilter(N,f_0,x,number_orientations) for x in range(0,number_orientations)]
f = [ifft(fft(src) * lg[x]) for x in range(0,number_orientations)]

我稍微改变了函数签名,以便接受图像大小和方向数量。而且,我花了一点时间才意识到f_0的确切含义。从math.log2(N) - f_0来看,这相当于将N除以2的f_0次方,选择更大尺度的特征。我的理解是,这意味着您可能只想使用小的正整数。

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接