How to adjust the text size of cv2.putText according to the image size in OpenCV and Python?

Question

18

fontScale = 1
fontThickness = 1

# make sure font thickness is an integer, if not, the OpenCV functions that use this may crash
fontThickness = int(fontThickness)

upperLeftTextOriginX = int(imageWidth * 0.05)
upperLeftTextOriginY = int(imageHeight * 0.05)

textSize, baseline = cv2.getTextSize(resultText, fontFace, fontScale, fontThickness)
textSizeWidth, textSizeHeight = textSize

# calculate the lower left origin of the text from the upper left origin and the text height
lowerLeftTextOriginX = upperLeftTextOriginX
lowerLeftTextOriginY = upperLeftTextOriginY + textSizeHeight

# write the text on the image
cv2.putText(openCVImage, resultText, (lowerLeftTextOriginX, lowerLeftTextOriginY), fontFace, fontScale, Color,
            fontThickness)

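It seems that fontScale does not scale the text according to the image width and height, because the text is almost the same size for images of different sizes. So how can I resize the text according to the image size so that all of the text fits within the image?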

- XINDI LI

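Where are you updating fontScale? Please explain what the other variables are. - parthagar

Hi, my problem is that I don't know how to update fontScale according to the image size. Thanks. - XINDI LI

For what image size do you want fontScale=1? Please specify the image dimensions (height x width) and I will share the code as soon as possible. - parthagar

Actually, I'm confused about the value of fontScale. But from my experiments, I can set fontScale to 1 for images of 1000*1000 pixels or larger. - XINDI LI

Did you solve this in the end? Please share with the community, or say something. Thanks. I have the same problem. - Lewis

@Lewis I used parthagar's idea, but it does not work perfectly. - XINDI LI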

9 Answers

2

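Approach

One approach is to scale the font size proportionally to the image size. In my experience, scaling not only fontScale but also thickness gives more natural results. For example: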

import math

import cv2

FONT_SCALE = 2e-3  # Adjust for larger font size in all images
THICKNESS_SCALE = 1e-3  # Adjust for larger thickness in all images

img = cv2.imread("...")
height, width, _ = img.shape

font_scale = min(width, height) * FONT_SCALE
thickness = math.ceil(min(width, height) * THICKNESS_SCALE)

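Example

Let's take this free-to-use stock photo as an example. We create two versions of the base image by rescaling it to widths of 2000px and 600px (keeping the aspect ratio constant). With the approach above, the text is sized appropriately for the image in both cases (in this example we annotate bounding boxes):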

2000px

600px

Full code to reproduce (but note: the input images have to be preprocessed):

import math

import cv2

FONT_SCALE = 2e-3  # Adjust for larger font size in all images
THICKNESS_SCALE = 1e-3  # Adjust for larger thickness in all images
TEXT_Y_OFFSET_SCALE = 1e-2  # Adjust for larger Y-offset of text and bounding box

img_width_to_bboxes = {
    2000: [
        {"xywh": [120, 400, 1200, 510], "label": "car"},
        {"xywh": [1080, 420, 790, 340], "label": "car"},
    ],
    600: [
        {"xywh": [35, 120, 360, 155], "label": "car"},
        {"xywh": [325, 130, 235, 95], "label": "car"},
    ],
}


def add_bbox_and_text() -> None:
    for img_width, bboxes in img_width_to_bboxes.items():
        # Base image from https://www.pexels.com/photo/black-suv-beside-grey-auv-crossing-the-pedestrian-line-during-daytime-125514/
        # Two rescaled versions of the base image created with width of 600px and 2000px
        img = cv2.imread(f"pexels-kaique-rocha-125514_{img_width}.jpg")
        height, width, _ = img.shape
        for bbox in bboxes:
            x, y, w, h = bbox["xywh"]
            cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
            cv2.putText(
                img,
                bbox["label"],
                (x, y - int(height * TEXT_Y_OFFSET_SCALE)),
                fontFace=cv2.FONT_HERSHEY_TRIPLEX,
                fontScale=min(width, height) * FONT_SCALE,
                thickness=math.ceil(min(width, height) * THICKNESS_SCALE),
                color=(0, 255, 0),
            )
        cv2.imwrite(f"pexels-kaique-rocha-125514_{img_width}_with_text.jpg", img)


if __name__ == "__main__":
    add_bbox_and_text()

- swimmer

2

This worked!

scale = 1  # any value in (0, 1]; changes the size of the text relative to the image
fontScale = min(imageWidth,imageHeight)/(25/scale)

Keep in mind that the font type affects the constant 25.

- Ger hashim

0

If you use fontScale = 1 for images of roughly 1000 x 1000, then this code should scale the font correctly.

fontScale = (imageWidth * imageHeight) / (1000 * 1000) # Would work best for almost square images

If you still have any problems, please comment.

- parthagar

1

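How does this solve the problem? Suppose I have a 4K image. With your suggestion, fontScale = (3840*2160)/(1000*1000) = 8.29. 8.29 is a very large scale. - theateist

@theateist Mate, it really means fontscale = (child_width * child_height) / (parent_width * parent_height). So if you have a 4K parent image, the formula becomes fontscale = (child_width*child_height)/(3840*2160). If the image has the same height and width as the parent image, fontscale comes out as 1. - rish_hyun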
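
To make the arithmetic in these comments concrete, here is a small sketch. It assumes the 1000 x 1000 reference size mentioned in the question's comments; the function names and the square-root variant are illustrative additions, not something proposed in the thread. Since fontScale multiplies the font's base size, text width and height each grow roughly linearly with it, so an area ratio tends to overshoot on large images. Taking the square root of that ratio is one possible way to keep the growth linear.

```python
import math

# Assumed reference size at which fontScale = 1 looked right
# (taken from the question's comments, not a fixed OpenCV value).
REF_WIDTH = 1000
REF_HEIGHT = 1000


def area_ratio_scale(width: int, height: int) -> float:
    """fontScale as the ratio of image area to reference area (the formula in the answer)."""
    return (width * height) / (REF_WIDTH * REF_HEIGHT)


def linear_ratio_scale(width: int, height: int) -> float:
    """Hypothetical alternative: square root of the area ratio, so the scale grows linearly."""
    return math.sqrt(area_ratio_scale(width, height))


if __name__ == "__main__":
    for w, h in [(1000, 1000), (3840, 2160)]:
        print(w, h, round(area_ratio_scale(w, h), 2), round(linear_ratio_scale(w, h), 2))
```

For the 4K case from the comment, the area ratio gives about 8.29, while the square-root variant gives 2.88.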

0

You can use the get_optimal_font_scale function in the code below to adjust the font size to the image size:

def get_optimal_font_scale(text, width):

    for scale in reversed(range(0, 60, 1)):
        textSize = cv2.getTextSize(text, fontFace=cv2.FONT_HERSHEY_DUPLEX, fontScale=scale/10, thickness=1)
        new_width = textSize[0][0]
        if (new_width <= width):
            return scale/10
    return 1

# Despite its name, fontScale here is the target text width in pixels (about half the image width)
fontScale = 3*(img.shape[1]//6)
font_size = get_optimal_font_scale(text, fontScale)
cv2.putText(img, text, org, font, font_size, color, thickness, cv2.LINE_AA)

You can change fontScale to suit your image.

- NahidEbrahimian

1

b_w is fontScale. - NahidEbrahimian

0

This worked for me.

double calc_scale_rectbox(const char *txt, int box_width, int box_height, 
                          cv::Size &textSize, int &baseline)

{
       if (!txt) return 1.0;
       double scale = 2.0;
       double w_aprx = 0;
       double h_aprx = 0;
       do
       {
           textSize = cv::getTextSize(txt, FONT_HERSHEY_DUPLEX, scale, 2, 
                                      &baseline);
           w_aprx = textSize.width * 100 / box_width;
           h_aprx = textSize.height * 100 / box_height;
           scale -= 0.1;
        } while (w_aprx > 50 || h_aprx > 50);
        return scale;
 }

......

cv::Size textSize;

int baseline = 0;

double scale = calc_scale_rectbox(win_caption.c_str(), width, 
                                 height, textSize, baseline);

cv::putText(img, win_caption, Point(width / 2 - textSize.width / 2, 
           (height + textSize.height - baseline + 2) / 2), 
            FONT_HERSHEY_DUPLEX, scale, CV_RGB(255, 255, 255), 2);

- Alexey Kargojarvinen

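Your answer could be improved with additional supporting information. Please edit to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the help center. - Community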

0

I implemented a function that finds the best-fitting, centered position for the text.

Take a look and see whether this code helps you.

import cv2
import numpy as np


def findFontLocate(s_txt, font_face, font_thick, cv_bgd):
    best_scale = 1.0
    bgd_w = cv_bgd.shape[1]
    bgd_h = cv_bgd.shape[0]
    txt_rect_w = 0
    txt_rect_h = 0
    baseline = 0
    for scale in np.arange(1.0, 6.0, 0.2):
        (ret_w, ret_h), tmp_bsl = cv2.getTextSize(
            s_txt, font_face, scale, font_thick)
        tmp_w = ret_w + 2 * font_thick
        tmp_h = ret_h + 2 * font_thick + tmp_bsl
        if tmp_w >= bgd_w or tmp_h >= bgd_h:
            break
        else:
            baseline = tmp_bsl
            txt_rect_w = tmp_w
            txt_rect_h = tmp_h
            best_scale = scale
    lt_x, lt_y = round(bgd_w/2-txt_rect_w/2), round(bgd_h/2-txt_rect_h/2)
    rb_x, rb_y = round(bgd_w/2+txt_rect_w/2), round(bgd_h/2+txt_rect_h/2)-baseline
    return (lt_x, lt_y, rb_x, rb_y), best_scale, baseline

Note that the function takes four arguments: s_txt (the string to render), font_face, font_thick and cv_bgd (the background image as an ndarray).

When you call putText(), write the code as follows:

cv2.putText(
    cv_bgd, s_txt, (lt_x, rb_y), font_face,
    best_scale, (0,0,0), font_thick, cv2.LINE_AA)

- user12853438

0

A simple utility function:

import math


def optimal_font_dims(img, font_scale = 2e-3, thickness_scale = 5e-3):
    h, w, _ = img.shape
    font_scale = min(w, h) * font_scale
    thickness = math.ceil(min(w, h) * thickness_scale)
    return font_scale, thickness

Usage:

font_scale, thickness = optimal_font_dims(image)
cv2.putText(image, "LABEL", (x, y), cv2.FONT_HERSHEY_SIMPLEX, font_scale, (255,0,0), thickness)

- Maxi

0

Here is a C# implementation:

    public static void PutText(Mat mat, Rect rect, double scale, string text)
    {
        var textBound = Cv2.GetTextSize(text, HersheyFonts.HersheySimplex, 1, 1, out int baseline);
        var widthScale = (double)textBound.Width / rect.Width;
        var heightScale = (double)textBound.Height / rect.Height;
        var finalScale = scale / Math.Max(widthScale, heightScale);
        textBound = Cv2.GetTextSize(text, HersheyFonts.HersheySimplex,
            finalScale, 1, out int baselineScaled);
        var widthDiff = rect.Width - textBound.Width;
        var heightDiff = rect.Height - textBound.Height;
        mat.PutText(text, new Point(rect.Left + widthDiff / 2, rect.Bottom - heightDiff / 2),
            HersheyFonts.HersheySimplex, finalScale, Scalar.Black);
    }

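rect is where the text is drawn, and scale is the scale of the text within rect. The text is centered in the given rect. For example, if rect is 100x100, the text is perfectly square and scale is 0.8, the text will be drawn within an 80x80 rectangle at an offset of 10x10.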
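
As a rough Python sketch of the same non-iterative idea (an illustration, not code from the original answer): measure the text once at fontScale 1, then divide the desired fill fraction by the larger of the width and height ratios. The names fit_scale and centered_origin and the fill parameter are hypothetical, and the sketch assumes the text size grows about linearly with fontScale, which is only approximately true once thickness is taken into account.

```python
def fit_scale(text_w: float, text_h: float, rect_w: float, rect_h: float,
              fill: float = 0.8) -> float:
    """Return a fontScale so that text measured at fontScale 1 fills `fill` of the rect.

    text_w and text_h are the text size measured at fontScale 1.
    """
    width_ratio = text_w / rect_w
    height_ratio = text_h / rect_h
    return fill / max(width_ratio, height_ratio)


def centered_origin(rect_x: int, rect_y: int, rect_w: int, rect_h: int,
                    text_w: int, text_h: int) -> tuple:
    """Bottom-left text origin that centers a text box of the given size in the rect."""
    x = rect_x + (rect_w - text_w) // 2
    y = rect_y + rect_h - (rect_h - text_h) // 2
    return x, y


if __name__ == "__main__":
    # The example from the answer: square text in a 100x100 rect with scale 0.8.
    print(fit_scale(50, 50, 100, 100, fill=0.8))
    print(centered_origin(0, 0, 100, 100, 80, 80))
```

With OpenCV, the inputs would come from cv2.getTextSize(text, font_face, 1.0, thickness), which returns ((width, height), baseline). The text would then be measured again at the computed scale before calling centered_origin, as the C# code does.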

- Serge SB

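In general, unless the asker has indicated that other languages would be helpful, you should limit your answer to the language and modules specified in the question. - moken

@moken Thanks for pointing that out. In my defense, 1) I don't know Python; 2) my approach involves no iteration and is exact. All the other answers are either inaccurate or iterate over hard-coded values. That is why I decided to share my solution. It is easy to understand even if one doesn't know C#. Sorry again for speaking in a different language (pun intended :)). - Serge SB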

- shabany · Accepted Answer

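Here is a solution that fits the text inside your rectangle. If your rectangle has a variable width, you can get the font scale by looping over candidate scales and measuring how much width (in pixels) the text would take. Once the width drops below the rectangle width, you can retrieve that scale and use it for the actual putText: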

import cv2 as cv


def get_optimal_font_scale(text, width):
    for scale in reversed(range(0, 60, 1)):
        textSize = cv.getTextSize(text, fontFace=cv.FONT_HERSHEY_DUPLEX, fontScale=scale/10, thickness=1)
        new_width = textSize[0][0]
        if (new_width <= width):
            print(new_width)
            return scale/10
    return 1
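
The linear scan above tries up to 60 candidate scales. As a hedged alternative, the sketch below does the same search by bisection. It assumes the measured text width never decreases as the scale grows. The names largest_fitting_scale and cv2_text_width, the bounds 0.1 and 6.0, and the tolerance are illustrative choices rather than part of the answer. The width measurement is passed in as a function, so the search itself does not depend on OpenCV.

```python
from typing import Callable


def largest_fitting_scale(measure_width: Callable[[float], float], max_width: float,
                          lo: float = 0.1, hi: float = 6.0, tol: float = 0.01) -> float:
    """Approximate the largest scale in [lo, hi] whose measured width is <= max_width.

    Assumes measure_width is non-decreasing in the scale.
    Returns lo if even the smallest scale does not fit.
    """
    if measure_width(hi) <= max_width:
        return hi
    if measure_width(lo) > max_width:
        return lo
    # Invariant: the text fits at lo and does not fit at hi.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if measure_width(mid) <= max_width:
            lo = mid
        else:
            hi = mid
    return lo


def cv2_text_width(text: str, thickness: int = 1) -> Callable[[float], float]:
    """Build a width-measuring function backed by cv2.getTextSize."""
    import cv2  # imported here so the search above can be used without OpenCV

    def measure(scale: float) -> float:
        (width, _height), _baseline = cv2.getTextSize(
            text, cv2.FONT_HERSHEY_DUPLEX, scale, thickness)
        return width

    return measure
```

A call such as largest_fitting_scale(cv2_text_width("car"), img.shape[1] // 2) would then return a scale to pass to putText, with the same thickness used for both measuring and drawing.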