Robust algorithm to detect uneven illumination in images [detection only needed]

Question

python algorithm opencv image-processing computer-vision

21

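In Tesseract OCR text recognition, one of the biggest challenges is uneven illumination of the image. I need an algorithm that can decide whether an image contains uneven illumination.

Test images

I have attached three images: one with no illumination problem, one with glare (a white spot), and one containing a shadow. Given an image, the algorithm should assign it to one of two classes:

No uneven illumination - our image with no illumination problem falls into this category.
Uneven illumination - our glare image (white-spot image) and our shadow image fall into this category.

Image with no illumination problem - Category A

Uneven illumination image (glare image (white-spot image)) - Category B

Uneven illumination image (image containing a shadow) - Category B

Initial approach

Convert the color space to HSV.
Analyze the histogram of the HSV value channel to identify uneven illumination.

Instead of the first two steps, we can use the perceived brightness channel in place of the HSV value channel.

Set a low threshold and count the pixels below it.
Set a high threshold and count the pixels above it.
Use the proportions of low-value and high-value pixels to detect uneven lighting conditions (a cutoff for these percentages has to be set as well).

However, I could not find much similarity among the unevenly illuminated images. From the histogram analysis I only found that some pixels have low values and some have high values.

Histogram analysis

Basically, my idea is to pick a low threshold and count how many pixels fall below it, and to pick a high threshold and count how many pixels lie above it. From these pixel counts we can draw a conclusion about uneven lighting in the image. To do so, we need to fix the two threshold values and the percentage of pixels that triggers the decision.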

import cv2
import numpy as np
from matplotlib import pyplot as plt

def show_hist_v(img_path):
    img = cv2.imread(img_path)
    hsv_img = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    h, s, v = cv2.split(hsv_img)
    # calcHist expects a list of images; 256 bins cover the values 0..255
    histr = cv2.calcHist([v], [0], None, [256], [0, 256])
    plt.plot(histr)
    plt.show()
    low_count = np.count_nonzero(v < 50)    # pixels darker than the low threshold
    high_count = np.count_nonzero(v > 200)  # pixels brighter than the high threshold
    total_pixels = img.shape[0] * img.shape[1]
    percent_low = low_count / total_pixels * 100
    percent_high = high_count / total_pixels * 100
    # the percentages are printed in the same order as their labels (high first, then low)
    print("Total Pixels - {}\n Pixels More than 200 - {} \n Pixels Less than 50 - {} \n Pixels percentage more than 200 - {} \n Pixels percentage less than 50 - {} \n".format(total_pixels, high_count, low_count, percent_high, percent_low))

    return total_pixels, high_count, low_count, percent_low, percent_high
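
The function above only reports the two percentages; the decision step described in the initial approach is left open. A minimal sketch of one possible reading is shown here: flag the image when either tail of the brightness distribution is too heavy. The function names and the 10% cutoffs are placeholders chosen for illustration, not values from the question, and they would need tuning on real images.

```python
import numpy as np

def tail_percentages(channel, low=50, high=200):
    """Percentage of pixels below `low` and above `high` in a brightness channel."""
    total = channel.size
    percent_low = np.count_nonzero(channel < low) / total * 100
    percent_high = np.count_nonzero(channel > high) / total * 100
    return percent_low, percent_high

def classify_by_tail_percentages(channel, max_percent_low=10.0, max_percent_high=10.0):
    """Return 'B' (uneven illumination) if either tail exceeds its cutoff, else 'A'.

    The cutoffs are hypothetical defaults, not values taken from the question.
    """
    percent_low, percent_high = tail_percentages(channel)
    if percent_low > max_percent_low or percent_high > max_percent_high:
        return "B"
    return "A"

# Synthetic check: a flat mid-gray image versus one with a large bright patch.
flat = np.full((100, 100), 128, dtype=np.uint8)
glare = flat.copy()
glare[:30, :] = 250  # 30% of the pixels are very bright
print(classify_by_tail_percentages(flat), classify_by_tail_percentages(glare))
```

A document with dark text on a bright background naturally has many high-valued pixels, so a rule like this is only a starting point.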

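So can anyone improve my initial approach, or suggest a better approach for detecting uneven illumination in images in general?

I also tried perceived brightness instead of the value channel. The value channel takes the maximum of the (b, g, r) values, so I think perceived brightness is the better choice.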
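
To see how far the two channels can differ, here is a small worked example on three synthetic pixels. It uses the same weights as the perceived-brightness code in this question; the helper names are made up for the illustration.

```python
import numpy as np

def hsv_value(bgr):
    """HSV value channel: the maximum of the three color components."""
    return bgr.max(axis=-1).astype(np.float64)

def perceived_brightness(bgr):
    """Weighted root-mean-square of the channels (pixels are in B, G, R order)."""
    b = bgr[..., 0].astype(np.float64)
    g = bgr[..., 1].astype(np.float64)
    r = bgr[..., 2].astype(np.float64)
    return np.sqrt(0.241 * r ** 2 + 0.691 * g ** 2 + 0.068 * b ** 2)

# pure blue, pure green, white (B, G, R order)
pixels = np.array([[255, 0, 0], [0, 255, 0], [255, 255, 255]], dtype=np.uint8)
print(hsv_value(pixels))             # all three pixels have value 255
print(perceived_brightness(pixels))  # roughly 66.5, 212.0 and 255.0
```

A saturated blue pixel therefore counts as fully bright in the value channel but as fairly dark in perceived brightness, which matters when counting pixels above a high threshold.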

# assumes cv2, numpy (as np) and matplotlib.pyplot (as plt) are already imported
def get_perceive_brightness(float_img):
    float_img = np.float64(float_img)  # uint8 would overflow when squared
    b, g, r = cv2.split(float_img)
    float_brightness = np.sqrt(
        (0.241 * (r ** 2)) + (0.691 * (g ** 2)) + (0.068 * (b ** 2)))
    brightness_channel = np.uint8(np.absolute(float_brightness))
    return brightness_channel

def show_hist_v(img_path):
    img = cv2.imread(img_path)
    v = get_perceive_brightness(img)
    # calcHist expects a list of images; 256 bins cover the values 0..255
    histr = cv2.calcHist([v], [0], None, [256], [0, 256])
    plt.plot(histr)
    plt.show()
    low_count = np.count_nonzero(v < 50)    # pixels darker than the low threshold
    high_count = np.count_nonzero(v > 200)  # pixels brighter than the high threshold
    total_pixels = img.shape[0] * img.shape[1]
    percent_low = low_count / total_pixels * 100
    percent_high = high_count / total_pixels * 100
    # the percentages are printed in the same order as their labels (high first, then low)
    print("Total Pixels - {}\n Pixels More than 200 - {} \n Pixels Less than 50 - {} \n Pixels percentage more than 200 - {} \n Pixels percentage less than 50 - {} \n".format(total_pixels, high_count, low_count, percent_high, percent_low))

    return total_pixels, high_count, low_count, percent_low, percent_high

Histogram analysis of the perceived brightness channel

As suggested by Ahmet:

def get_percentage_of_binary_pixels(img=None, img_path=None):
  if img is None:
    if img_path is not None:
      gray_img = cv2.imread(img_path, 0)  # flag 0 loads the image as grayscale
    else:
      return "No img or img_path"
  else:
    print(img.shape)
    if len(img.shape) > 2:
      gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    else:
      gray_img = img
  h, w = gray_img.shape
  gaussian_blur = cv2.GaussianBlur(gray_img, (5, 5), 0)
  thresh_value, otsu_img = cv2.threshold(gaussian_blur, 0, 255,
                                         cv2.THRESH_BINARY + cv2.THRESH_OTSU)
  # save the binarized image only when a source path is available to name it
  if img_path is not None:
    cv2.imwrite("binary/{}".format(img_path.split('/')[-1]), otsu_img)
  black_pixels = np.count_nonzero(otsu_img == 0)
  # white_pixels = np.count_nonzero(otsu_img == 255)

  black_pixels_percentage = black_pixels / (h * w) * 100
  # white_pixels_percentage = white_pixels / (h * w) * 100

  return black_pixels_percentage

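If we flag an image when Otsu binarization yields more than 35% black pixels, we detect roughly 80% of the unevenly illuminated images. Detection fails when the illumination problem affects only a small region of the image.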
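
Since a global statistic misses small affected regions, one option is to look at local statistics instead, in the spirit of the grid-based processing mentioned in the comments on this question. This is a rough sketch under assumptions: the 4x4 grid and the function names are made up for illustration, and on real documents the tile means also vary with text density, so any cutoff on the spread would have to be tuned.

```python
import numpy as np

def tile_means(gray, rows=4, cols=4):
    """Mean intensity of each cell of a rows x cols grid laid over a 2-D image."""
    h, w = gray.shape
    row_edges = np.linspace(0, h, rows + 1, dtype=int)
    col_edges = np.linspace(0, w, cols + 1, dtype=int)
    means = np.empty((rows, cols), dtype=np.float64)
    for r in range(rows):
        for c in range(cols):
            cell = gray[row_edges[r]:row_edges[r + 1],
                        col_edges[c]:col_edges[c + 1]]
            means[r, c] = cell.mean()
    return means

def local_brightness_spread(gray, rows=4, cols=4):
    """Difference between the brightest and the darkest tile mean."""
    means = tile_means(gray, rows, cols)
    return float(means.max() - means.min())

# Synthetic check: an evenly lit page versus the same page with a small bright patch.
even = np.full((200, 300), 180, dtype=np.uint8)
spot = even.copy()
spot[:50, :75] = 255  # the patch covers exactly one tile of the 4x4 grid
print(local_brightness_spread(even), local_brightness_spread(spot))
```

A large spread between tiles hints at a local highlight or shadow even when the global percentage stays under the cutoff.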

Thanks in advance.

- Sivaram Rasathurai

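Thank you @Ziri, I will try it. - Sivaram Rasathurai

1

See "Enhancing dynamic range and normalizing illumination" for some related ideas. - Spektre

1

@rcvaram That is just the basics... I evolved that algorithm into a grid-based interpolation: the image is divided into a uniform grid, each cell is computed like that, plus some interpolation to handle glitches (it also handles flicker)... I think I posted it too, but finding it will take some time, because I have too many answers and the SO search engine is not good. - Spektre

1

@rcvaram LOL, found it quicker than usual (by searching for the function header in the source code) :) See "OpenCV for OCR: How to compute thresholding levels for gray image OCR"; it is the function normalize. - Spektre

Hi, what about dark images? For example, an image that is entirely dark, with no spots, and uniform, i.e. without shadows. It is simply not illuminated. Is that a fourth class or not? - Andrea Mannari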

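4 Answers

5

I suggest using the division trick to separate the text from the background, and then computing statistics on the background only. After setting some reasonable thresholds, it is easy to build a classifier for the illumination.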

import cv2
import numpy as np
from matplotlib import pyplot as plt

def get_image_stats(img_path, lbl):
    img = cv2.imread(img_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    blurred = cv2.GaussianBlur(gray, (25, 25), 0)
    no_text = gray * ((gray/blurred)>0.99)                     # select background only
    no_text[no_text<10] = no_text[no_text>20].mean()           # convert black pixels to mean value
    no_bright = no_text.copy()
    no_bright[no_bright>220] = no_bright[no_bright<220].mean() # disregard bright pixels

    print(lbl)
    std = no_bright.std()                                      # spread of the background brightness
    print('STD:', std)
    bright = (no_text>220).sum()                               # number of very bright background pixels
    print('Bright pixels:', bright)
    plt.figure()
    plt.hist(no_text.reshape(-1,1), 25)
    plt.title(lbl)

    if std>25:
        print("!!! Detected uneven illumination")
    if no_text.mean()<200 and bright>8000:
        print("!!! Detected glare")

This results in:

good_img
STD: 11.264569863071165
Bright pixels: 58

glare_img
STD: 15.00149131296984
Bright pixels: 15122
!!! Detected glare

uneven_img
STD: 57.99510339944441
Bright pixels: 688
!!! Detected uneven illumination

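Now let us analyze the histograms and apply some common sense. We expect the background to be even, with low variance, as in "good_img". If the variance is high, the standard deviation is high as well, and we have a case of uneven brightness. In the image below you can see 3 (smaller) peaks that correspond to the 3 differently illuminated areas. The largest peak in the middle is the result of setting all black pixels to the mean value. I believe it is safe to call images with an STD above 25 "uneven illumination" cases.

When there is glare, it is easy to spot the large number of bright pixels (see the image on the right). Apart from the hot spot, the glare image looks like the good image. Setting the bright-pixel threshold to about 8000 (1.5% of the total image size) should be enough to detect such images. It is also possible that the background is very bright everywhere; if the mean of the no_text pixels is above 200, that is the case, and there is no need to detect hot spots.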
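
The decision rule described above can be sketched as a small function that works on an already extracted background array. This is an illustration under assumptions: the function name and keyword arguments are made up here, and the bright-pixel limit is expressed as the 1.5% fraction of the image size mentioned above rather than as a fixed count of 8000.

```python
import numpy as np

def classify_background(no_text, std_limit=25.0, bright_level=220,
                        bright_fraction=0.015, mean_limit=200.0):
    """Return a list of detected problems for a background-only gray image."""
    no_text = np.asarray(no_text, dtype=np.float64)
    bright_mask = no_text > bright_level
    dim_mask = no_text < bright_level

    # Disregard bright pixels when measuring the spread of the background.
    no_bright = no_text.copy()
    if bright_mask.any() and dim_mask.any():
        no_bright[bright_mask] = no_text[dim_mask].mean()

    problems = []
    if no_bright.std() > std_limit:
        problems.append("uneven illumination")
    # A uniformly very bright background (mean above mean_limit) is not glare.
    if no_text.mean() < mean_limit and bright_mask.sum() > bright_fraction * no_text.size:
        problems.append("glare")
    return problems

flat = np.full((100, 100), 180.0)
glare = flat.copy()
glare[:20, :20] = 250.0   # 4% of the pixels form a hot spot
shadow = flat.copy()
shadow[:, :50] = 80.0     # the left half lies in shadow
print(classify_background(flat), classify_background(glare), classify_background(shadow))
```

Scaling the bright-pixel limit with the image size keeps the rule usable when the input resolution changes.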

- igrinis

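Thanks, igrinis. It works well in most cases. However, we need to supply a perfectly cropped image: when the crop includes a slight background variation (for example part of the table), the standard deviation is high and the image is flagged as having an illumination problem. - Sivaram Rasathurai

1

Try comparing the results for the original image and for its vignetted version (set 10-15% of the edges to black). If the vignetted version passes the test, you can work around this issue. You can also use other statistical measures, such as kurtosis, and combine the proposed solution with other methods (hierarchical classifiers, morphological operations, skew detection, etc.). In real-life problems there is rarely a perfect solution, only a good-enough one. - igrinis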
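
A minimal sketch of the vignetting step suggested in the comment above, assuming it simply means zeroing a border of the given relative width. The function name and the 10% default are illustrative choices.

```python
import numpy as np

def black_out_border(gray, fraction=0.10):
    """Return a copy of `gray` with a border of `fraction` of each side set to black."""
    if not 0.0 <= fraction < 0.5:
        raise ValueError("fraction must be in [0, 0.5)")
    h, w = gray.shape[:2]
    dy = int(round(h * fraction))
    dx = int(round(w * fraction))
    out = np.zeros_like(gray)
    out[dy:h - dy, dx:w - dx] = gray[dy:h - dy, dx:w - dx]
    return out

page = np.full((100, 200), 200, dtype=np.uint8)
vignetted = black_out_border(page, 0.10)
print(vignetted[0, 0], vignetted[50, 100])
```

In the statistics code of this answer, near-black pixels are replaced by the background mean before the standard deviation is computed, so a blacked-out border is effectively excluded from the test.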

4

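Here is a quick solution using ImageMagick, but it can just as easily be implemented in Python/OpenCV, as shown further down.

Use division normalization.

Read the input image
Optionally convert it to grayscale
Copy the image and blur the copy
Divide the original image by the blurred copy
Save the result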
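
To illustrate why dividing by a blurred copy evens out the lighting, here is a one-dimensional toy model. It assumes that the observed intensity is the product of a slowly varying illumination and a paper/ink reflectance, and it uses a box blur as a simple stand-in for the Gaussian blur; all names and numbers are made up for the illustration.

```python
import numpy as np

def box_blur_1d(signal, radius):
    """Moving average with edge replication (stand-in for a Gaussian blur)."""
    padded = np.pad(signal, radius, mode="edge")
    kernel = np.ones(2 * radius + 1) / (2 * radius + 1)
    return np.convolve(padded, kernel, mode="valid")

def coefficient_of_variation(values):
    """Standard deviation relative to the mean."""
    return values.std() / values.mean()

n = 400
illumination = np.linspace(60.0, 220.0, n)  # lighting ramps from dark to bright
reflectance = np.ones(n)                    # paper
reflectance[20::40] = 0.2                   # sparse dark "ink" pixels
observed = illumination * reflectance

smooth = box_blur_1d(observed, radius=30)   # rough estimate of the illumination
normalized = observed / smooth              # division normalization

paper = reflectance == 1.0
print(coefficient_of_variation(observed[paper]))    # large: the paper brightness follows the ramp
print(coefficient_of_variation(normalized[paper]))  # small: the paper is nearly flat after division
```

The blurred copy tracks the illumination, so the quotient is close to the reflectance alone.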

convert 8W0bp.jpg \( +clone -blur 0x13 \) +swap -compose divide -composite x1.png

convert ob87W.jpg \( +clone -blur 0x13 \) +swap -compose divide -composite x2.png

convert HLJuA.jpg \( +clone -blur 0x13 \) +swap -compose divide -composite x3.png

Using Python/OpenCV:

import cv2
import numpy as np
import skimage.filters as filters

# read the image
img = cv2.imread('8W0bp.jpg')
#img = cv2.imread('ob87W.jpg')
#img = cv2.imread('HLJuA.jpg')

# convert to gray
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)

# blur
smooth = cv2.GaussianBlur(gray, (33,33), 0)

# divide gray by the blurred image
division = cv2.divide(gray, smooth, scale=255)

# sharpen using unsharp masking
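# the multichannel argument is omitted: a 2-D gray image is single-channel by default, and recent scikit-image releases no longer accept it
sharp = filters.unsharp_mask(division, radius=1.5, amount=2.5, preserve_range=False)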
sharp = (255*sharp).clip(0,255).astype(np.uint8)

# save results
cv2.imwrite('8W0bp_division.jpg',division)
cv2.imwrite('8W0bp_division_sharp.jpg',sharp)
#cv2.imwrite('ob87W_division.jpg',division)
#cv2.imwrite('ob87W_division_sharp.jpg',sharp)
#cv2.imwrite('HLJuA_division.jpg',division)
#cv2.imwrite('HLJuA_division_sharp.jpg',sharp)

# show results
cv2.imshow('smooth', smooth)  
cv2.imshow('division', division)  
cv2.imshow('sharp', sharp)  
cv2.waitKey(0)
cv2.destroyAllWindows()

Results:

- fmw42

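Thanks for the quick reply, fmw42. I need to detect the illumination problem; for now I do not need to correct it. - Sivaram Rasathurai

How do you define illumination? - fmw42

Sorry if I have not understood, @frmw42. Which parameters do we need to consider to define illumination? - Sivaram Rasathurai

1

That is what I am asking you. How do you define illumination? It has several meanings. It could be overall brightness. Please search Google and find the meaning you want. For example, see http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.172.2839&rep=rep1&type=pdf - fmw42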

3

Here is my pipeline:

%matplotlib inline
import numpy as np
import cv2
from matplotlib import pyplot as plt
from scipy.signal import find_peaks

I use the following functions:

def get_perceived_brightness( float_img):
    float_img = np.float64(float_img)  # uint8 would overflow when squared
    b, g, r = cv2.split(float_img)
    float_brightness = np.sqrt((0.241 * (r ** 2)) + (0.691 * (g ** 2)) + (0.068 * (b ** 2)))
    brightness_channel = np.uint8(np.absolute(float_brightness))
    return brightness_channel
    
# from: https://stackoverflow.com/questions/46300577/find-locale-minimum-in-histogram-1d-array-python
def smooth(x,window_len=11,window='hanning'):
    if x.ndim != 1:
        raise ValueError("smooth only accepts 1 dimension arrays.")

    if x.size < window_len:
        raise ValueError("Input vector needs to be bigger than window size.")

    if window_len<3:
        return x

    if window not in ['flat', 'hanning', 'hamming', 'bartlett', 'blackman']:
        raise ValueError("Window is one of 'flat', 'hanning', 'hamming', 'bartlett', 'blackman'")

    s=np.r_[x[window_len-1:0:-1],x,x[-2:-window_len-1:-1]]

    if window == 'flat': #moving average
        w=np.ones(window_len,'d')
    else:
        w=getattr(np, window)(window_len)  # look up np.hanning, np.hamming, ... without eval

    y=np.convolve(w/w.sum(),s,mode='valid')
    return y

I load the image:

image_file_name = 'im3.jpg'
image = cv2.imread(image_file_name)

# image category
category = 0

# gray conversion
image_gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

height = image.shape[0]
width = image.shape[1]

First test. Does the image have any big white spots?

# First test. Does the image have any big white spots?
saturation_thresh = 250
raw_saturation_region = cv2.threshold(image_gray, saturation_thresh, 255,  cv2.THRESH_BINARY)[1]
num_raw_saturation_regions, raw_saturation_regions,stats, _ = cv2.connectedComponentsWithStats(raw_saturation_region)

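# index 0 is the background -> to remove
area_raw_saturation_regions = stats[1:, cv2.CC_STAT_AREA]

min_area_bad_spot = 1000 # this can be calculated as percentage of the image area
# np.max raises on an empty array, so first check that at least one region was found
if area_raw_saturation_regions.size > 0 and np.max(area_raw_saturation_regions) > min_area_bad_spot:
    category = 2 # there is at least one spot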

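Result for the normal image:

Result for the image with a spot:

Result for the image with shadows:

If the image passes the first test, I run the second test. Is the image dark?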

# Second test. Is the image dark?   
min_mean_intensity = 60

if category == 0 :    
    mean_intensity = np.mean(image_gray)

    if (mean_intensity < min_mean_intensity):
        category = 3 # dark image

If the image also passes the second test, I run the third test. Is the image evenly illuminated?

window_len = 15 # odd number
delay = (window_len - 1) // 2  # delay is the shift introduced by the smoothing: half of window_len

# for example, if window_len is 15, the delay is 7
# in fact hist.shape = 256 and smoothed_hist.shape = 270 (= 256 + 2*delay)

if category == 0 :  
    perceived_brightness = get_perceived_brightness(image)
    hist,bins = np.histogram(perceived_brightness.ravel(),256,[0,256])

    # smoothed_hist is shifted relative to the original one
    smoothed_hist = smooth(hist,window_len)
    
    # smoothed histogram synchronized with the original histogram
    sync_smoothed_hist = smoothed_hist[delay:-delay]    
    
    # if there is more than one peak with:
    #    20 < bin < 250
    #    prominence >= mean histogram value
    # the image could have shadows (but it could also have a colored background)
    mean_hist = int(height*width / 256)

    peaks, _ = find_peaks(sync_smoothed_hist, prominence=mean_hist)
    
    selected_peaks = peaks[(peaks > 20) & (peaks < 250)]
    
    if (selected_peaks.size>1) :
        category = 4 # there are shadows
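
The length bookkeeping in the comments above (256 bins in, 270 out, 256 after trimming) can be checked on a synthetic histogram with a single peak. This sketch repeats the padding and convolution steps of the smooth function inline with a Hanning window, so it runs on its own; the histogram itself is made up.

```python
import numpy as np

window_len = 15
delay = (window_len - 1) // 2

hist = np.zeros(256)
hist[100] = 1000.0  # one synthetic peak at bin 100

# mirror-pad both ends, as the smooth function does
padded = np.r_[hist[window_len - 1:0:-1], hist, hist[-2:-window_len - 1:-1]]
window = np.hanning(window_len)
smoothed = np.convolve(window / window.sum(), padded, mode="valid")
synced = smoothed[delay:-delay]

print(padded.size, smoothed.size, synced.size)  # 284 270 256
print(int(np.argmax(synced)))                   # 100
```

After trimming delay samples from each end, the smoothed peak sits at the same bin as in the original histogram, which is what makes the 20 < bin < 250 filter on the peak positions meaningful.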

Histogram of the normal image:

Histogram of the image with a spot:

Histogram of the image with shadows:

If the image passes all the tests, it is normal.

# all tests are passed. The image is ok
if (category == 0) :
    category=1 # the image is ok
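
The category codes are spread over the snippets above, so a small lookup can make the final result easier to read and relate it to the two classes from the question. The mapping to classes A and B is an assumption made here: the question does not say where a uniformly dark image belongs, so it is reported separately.

```python
# Category codes as assigned by the tests in this answer.
CATEGORY_NAMES = {
    1: "ok",
    2: "bright spot",
    3: "dark",
    4: "shadows",
}

def to_question_class(category):
    """Map a category code to class 'A' or 'B' from the question.

    Assumption: dark images (3) are not covered by the question's two
    classes and are returned as 'unspecified'.
    """
    if category not in CATEGORY_NAMES:
        raise ValueError("unknown category: {}".format(category))
    if category == 1:
        return "A"
    if category in (2, 4):
        return "B"
    return "unspecified"

for code in sorted(CATEGORY_NAMES):
    print(code, CATEGORY_NAMES[code], to_question_class(code))
```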

- Andrea Mannari

Thanks, Andrea. Your answer gave me the main idea for solving the problem. - Sivaram Rasathurai

- Ahmet · Accepted Answer

Why not remove the lighting effect from the images?

For example:

If we try to read this image with pytesseract, the output is ' \n\f'. But if we remove the lighting effect:

import cv2
import pytesseract

img = cv2.imread('img2.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
smooth = cv2.GaussianBlur(gray, (95, 95), 0)
division = cv2.divide(gray, smooth, scale=192)

and then read it with pytesseract, part of the output is:

.
.
.
Dosage & use
See package insert for compicic
information,

Instruction:
Keep all medicines out of the re.
Read the instructions carefully

Storage:
Store at temperature below 30°C.
Protect from Heat, light & moisture. BATCH NO. : 014C003
MFG. DATE - 03-2019

—— EXP. DATE : 03-2021

GENIX Distributed
AS Exclusi i :
genx PHARMA PRIVATE LIMITED Cevoka Pv 2 A ‘<
» 45-B, Kore ci
Karachi-75190, | Pakisier al Pei yaa fans
www.genixpharma.com

For the last image, repeat the same steps:

and then read it with pytesseract. Part of the output is:

.
.
.
Dosage & use
See package insert for complete prescribing
information. Rx Only

Instruction:
Keep all medicines out of the reach of children.
Read the instructions carefully before using.

Storage:

Store at temperature below 30°C. 5

Protect from Neat, light & moisture. BATCH NO, : 0140003
MFG. DATE : 03-2019
EXP. DATE : 03-2021

Manufactured by:

GENI N Exclusively Distributed by:
GENIX PHARMA PRIVATE LIMITED Ceyoka (Pvt) Ltd.

44, 45-B, Korangi Creek Road, 55, Negombe Road,
Karachi-75190, Pakistan. Peliyagoda, Snianka,

www. genixpharma.com

Update

You can find the highlighted (glare) regions using the erode and dilate methods.

Result:

Code:

import cv2
import imutils
import numpy as np
from skimage import measure
from imutils import contours

img = cv2.imread('img2.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (95, 95), 0)
thresh = cv2.threshold(blurred, 200, 255, cv2.THRESH_BINARY)[1]
# erosion removes small bright specks, dilation restores the remaining blobs
thresh = cv2.erode(thresh, None, iterations=2)
thresh = cv2.dilate(thresh, None, iterations=4)
# connectivity=2 means 8-connectivity in 2-D (older scikit-image releases spelled this neighbors=8)
labels = measure.label(thresh, connectivity=2, background=0)
mask = np.zeros(thresh.shape, dtype="uint8")
for label in np.unique(labels):
    if label == 0:
        continue
    labelMask = np.zeros(thresh.shape, dtype="uint8")
    labelMask[labels == label] = 255
    numPixels = cv2.countNonZero(labelMask)
    if numPixels > 300:  # keep only sufficiently large bright regions
        mask = cv2.add(mask, labelMask)

# find and draw the contours once, after the mask is complete
cnts = cv2.findContours(mask.copy(), cv2.RETR_EXTERNAL,
                        cv2.CHAIN_APPROX_SIMPLE)
cnts = imutils.grab_contours(cnts)
if len(cnts) > 0:  # sort_contours fails on an empty list
    cnts = contours.sort_contours(cnts)[0]
for (i, c) in enumerate(cnts):
    (x, y, w, h) = cv2.boundingRect(c)
    ((cX, cY), radius) = cv2.minEnclosingCircle(c)
    cv2.circle(img, (int(cX), int(cY)), int(radius),
               (0, 0, 255), 3)
    cv2.putText(img, "#{}".format(i + 1), (x, y - 15),
                cv2.FONT_HERSHEY_SIMPLEX, 0.45, (0, 0, 255), 2)
cv2.imshow("Image", img)
cv2.waitKey(0)

Note that I have only tested this on the second image. You may need to change the parameters for the other images.