Python - How to recognize images and click on them

Question

3

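I want to write a script that clicks on images on request and needs to iterate over a list of images. For example, if the program asks the user to click on the green circle: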

question_list = greencircle, redcircle, bluesquare, redtriangle

if(greencircle == greencircle.png){
    pyautogui.click(greencircle.png)
}

Can anyone help with this?

- Ano1231

1

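If you need image recognition, you should use OpenCV to recognize the images on the screen, then use pyautogui to click on them once you have retrieved their coordinates. - Nastor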

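Welcome, and congratulations on your first question! According to the documentation, pyautogui.click() appears to be the correct syntax. Could you clarify what problem you are running into? https://pyautogui.readthedocs.io/en/latest/ - StephenGodderidge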

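@StephenGodderidge Thanks! Basically, I have an application for a school project: the program asks the user to click on certain things, like the images in ReCaptcha. My program just doesn't work, and I don't know why. - Ano1231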

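So the program needs to know what each image is, right? That way it can choose which image to click? If so, I would go with the approach @Nastor suggested and look into OpenCV for image recognition. Here's a handy tutorial to get you started: https://www.learnopencv.com/image-recognition-and-object-detection-part1/ - StephenGodderidge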

Does OpenCV only accept camera input? I need to capture the screen, so a camera isn't an option. - Ano1231

@StephenGodderidge Oh, and the application needs to work in real time, so OpenCV won't work, since it only accepts images. - Ano1231

1 Answer


- Andrew Stone · Accepted Answer

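PyAutoGUI has a built-in function called locateOnScreen() that takes a screenshot and searches it for the given image. If the image is found on the current screen, it returns the matching region as (left, top, width, height); passing that result to pyautogui.click() clicks the center of the region. (locateCenterOnScreen() returns the center x, y coordinates directly.)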

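The image must match exactly for this to work; i.e., if you want to click a button using button.png, that picture must have exactly the same size and resolution as the button in the program window for it to be recognized. One way to achieve this is to take a screenshot, open it in Paint, and crop out only the button you want to press (or you can have PyAutoGUI do it for you, as I demonstrate in a later example).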
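To make the difference between a located region and its center concrete, here is a minimal sketch. It assumes the region is a (left, top, width, height) tuple; the helper name box_center is hypothetical and not part of PyAutoGUI.

```python
def box_center(box):
    """Return the (x, y) center of a (left, top, width, height) box."""
    left, top, width, height = box
    # Integer division keeps the result on whole pixels.
    return (left + width // 2, top + height // 2)


# Example region borrowed from the Windows 10 logo example in this answer.
print(box_center((0, 1041, 50, 39)))  # (25, 1060)
```

A point computed this way can be passed to pyautogui.click(x, y) when you want to control exactly where inside the region the click lands.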

import pyautogui

question_list = ['greencircle', 'redcircle', 'bluesquare', 'redtriangle']

user_input = input('Where should I click? ')

while user_input not in question_list:
    print('Incorrect input, available options: greencircle, redcircle, bluesquare, redtriangle')
    user_input = input('Where should I click? ')

location = pyautogui.locateOnScreen(user_input + '.png')
pyautogui.click(location)

The example above requires greencircle.png and all the other .png files to already be in your working directory.
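If the image might not be on screen, it may help to handle that case explicitly. The following is only a sketch under a few assumptions: the helper names image_file_for and click_image are hypothetical, and depending on the installed PyAutoGUI version a failed search either returns None or raises pyautogui.ImageNotFoundException, so both cases are covered.

```python
QUESTION_LIST = ['greencircle', 'redcircle', 'bluesquare', 'redtriangle']


def image_file_for(choice, options=QUESTION_LIST):
    """Return '<choice>.png' for a known option, otherwise None."""
    name = choice.strip().lower()
    return name + '.png' if name in options else None


def click_image(filename):
    """Try to click the image on screen; return True if it was found."""
    import pyautogui  # imported here so the helper above needs no display

    try:
        box = pyautogui.locateOnScreen(filename)
    except pyautogui.ImageNotFoundException:
        box = None  # assumption: newer releases raise instead of returning None
    if box is None:
        return False
    pyautogui.click(box)  # a 4-value box is clicked at its center
    return True
```

For instance, click_image(image_file_for('greencircle')) would return False rather than clicking at the current mouse position when the circle is not visible.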

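PyAutoGUI can also take screenshots, and you can specify which region of the screen to capture with pyautogui.screenshot(region=(left, top, width, height)). The first and second values are the x, y coordinates of the top-left corner of the region you want to select, the third is the width extending to the right (x), and the fourth is the height extending downward (y).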
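Because the region is given as a width and height rather than as a second corner, it is easy to mix the two up. Here is a small sketch of the conversion, assuming you know the top-left and bottom-right corners in screen pixels; region_from_corners is a hypothetical helper, not a PyAutoGUI function.

```python
def region_from_corners(left, top, right, bottom):
    """Convert two corner points into a (left, top, width, height) region."""
    if right <= left or bottom <= top:
        raise ValueError('bottom-right corner must lie below and right of top-left')
    return (left, top, right - left, bottom - top)


# A 50x39 area in the bottom-left corner of a 1920x1080 screen.
print(region_from_corners(0, 1041, 50, 1080))  # (0, 1041, 50, 39)
```

The resulting tuple can be passed as the region argument of pyautogui.screenshot().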

The following example takes a screenshot of the Windows 10 logo, saves it to a file, and then clicks the logo by using the specified .png file.

import pyautogui

pyautogui.screenshot('win10_logo.png', region=(0, 1041, 50, 39))
location = pyautogui.locateOnScreen('win10_logo.png')
pyautogui.click(location)

You also don't have to save the screenshot to a file; you can keep it in a variable.

import pyautogui

win10 = pyautogui.screenshot(region=(0, 1041, 50, 39))
location = pyautogui.locateOnScreen(win10)
pyautogui.click(location)

If you want the program to detect whether the user clicked in a specific area (say, the Windows 10 logo), you need another library, such as pynput.

from pynput.mouse import Button, Listener

def on_click(x, y, button, pressed):
    # React only to a left-button press inside the logo's area
    if 0 < x < 50 and 1041 < y < 1080 and button == Button.left and pressed:
        print('You clicked on Windows 10 Logo')
        return False    # returning False stops the listener; remove it for a continuous loop

with Listener(on_click=on_click) as listener:
    listener.join()

Putting it all together

import pyautogui
from pynput.mouse import Listener

win10 = pyautogui.screenshot(region=(0, 1041, 50, 39))
location = pyautogui.locateOnScreen(win10)

# location[0] is the top left x coord
# location[1] is the top left y coord
# location[2] is the distance from left x coord to right x coord
# location[3] is the distance from top y coord to bottom y coord

x_boundary_left = location[0]
y_boundary_top = location[1]
x_boundary_right = location[0] + location[2]
y_boundary_bottom = location[1] + location[3]


def on_click(x, y, button, pressed):
    if x_boundary_left < x < x_boundary_right and y_boundary_bottom > y > y_boundary_top and str(button) == 'Button.left' and pressed:
        print('You clicked on Windows 10 Logo')
        return False    # get rid of return statement if you want a continuous loop


with Listener(on_click=on_click) as listener:
    listener.join()
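The boundary arithmetic above can also be factored into a small reusable check. This is a sketch only: point_in_box is a hypothetical helper, and it keeps the strict inequalities used in the code above, so points exactly on the edge do not count as inside.

```python
def point_in_box(x, y, box):
    """Return True if (x, y) lies strictly inside a (left, top, width, height) box."""
    left, top, width, height = box
    return left < x < left + width and top < y < top + height


logo_box = (0, 1041, 50, 39)  # same region as the Windows 10 logo example
print(point_in_box(25, 1060, logo_box))  # True
print(point_in_box(60, 1060, logo_box))  # False
```

Inside on_click, the long condition could then be written as point_in_box(x, y, location) combined with the button and pressed checks.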