I am trying to run the following code:
import cv2
import pytesseract
img = cv2.imread('/Users/user1/Desktop/folder1/pdf1.pdf')
text = pytesseract.image_to_string(img)
print(text)
This gives me the following error:
Traceback (most recent call last):
  File "/Users/user1/PycharmProjects/project1/python_file.py", line 5, in <module>
    text = pytesseract.image_to_string(img)
  File "/Users/user1/PycharmProjects/project1/venv/lib/python3.8/site-packages/pytesseract/pytesseract.py", line 346, in image_to_string
    return {
  File "/Users/user1/PycharmProjects/project1/venv/lib/python3.8/site-packages/pytesseract/pytesseract.py", line 349, in <lambda>
    Output.STRING: lambda: run_and_get_output(*args),
  File "/Users/user1/PycharmProjects/project1/venv/lib/python3.8/site-packages/pytesseract/pytesseract.py", line 249, in run_and_get_output
    with save(image) as (temp_name, input_filename):
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/contextlib.py", line 113, in __enter__
    return next(self.gen)
  File "/Users/user1/PycharmProjects/project1/venv/lib/python3.8/site-packages/pytesseract/pytesseract.py", line 172, in save
    image, extension = prepare(image)
  File "/Users/user1/PycharmProjects/project1/venv/lib/python3.8/site-packages/pytesseract/pytesseract.py", line 142, in prepare
    raise TypeError('Unsupported image object')
TypeError: Unsupported image object
How can I make this work for PDF files?
txt = pytesseract.image_to_string(page_data).encode("utf-8") - jboi
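A likely cause of the TypeError: cv2.imread does not decode PDF files and returns None when it cannot read its input, so pytesseract ends up receiving None instead of an image. One possible approach, sketched below, is to render each PDF page to an image first and OCR the pages one by one. This is only a sketch under assumptions: it assumes the third-party pdf2image package (which needs Poppler installed) is available, and the function name pdf_to_text and its render/ocr parameters are hypothetical names introduced for illustration, not part of pytesseract or of the original post.

```python
from pathlib import Path


def pdf_to_text(pdf_path, dpi=300, render=None, ocr=None):
    """Render each page of a PDF to an image, OCR it, and join the page texts.

    `render` and `ocr` are hypothetical hooks added for this sketch so the
    rendering and OCR steps can be swapped out; by default they fall back to
    pdf2image.convert_from_path and pytesseract.image_to_string.
    """
    if Path(pdf_path).suffix.lower() != ".pdf":
        raise ValueError(f"expected a .pdf file, got: {pdf_path}")

    if render is None:
        # Assumption: pdf2image is installed and Poppler is on the PATH.
        from pdf2image import convert_from_path

        def render(path):
            # Returns a list of PIL images, one per page.
            return convert_from_path(path, dpi=dpi)

    if ocr is None:
        import pytesseract

        # pytesseract accepts PIL images directly, so cv2 is not needed here.
        ocr = pytesseract.image_to_string

    pages = render(str(pdf_path))
    return "\n".join(ocr(page) for page in pages)


# Example call, using the path from the question:
# print(pdf_to_text('/Users/user1/Desktop/folder1/pdf1.pdf'))
```

A higher dpi usually gives Tesseract more detail to work with at the cost of slower rendering; 300 is just a starting value for this sketch, not a recommendation from the libraries involved.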