Python PDFMiner - KeyError: 'AcroForm'

Question

3

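I have a script that takes an attachment's filename and, if the extension indicates a PDF file, runs it through the code below. However, I am getting the KeyError shown below. I can't find any information on how to correct or troubleshoot it. I have successfully run PDF forms through this script before, and I'm not sure why it isn't working as expected this time.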

    if ext == '.PDF' or ext == '.pdf':
        item_field_list = []
        item_number = str(random.randint(1000000, 9999999))
        #try:
        with tempfile.NamedTemporaryFile() as tmp:
            verify_item = 0
            tmp.write(part.get_payload(decode=True))
            parser = PDFParser(tmp)
            doc = PDFDocument(parser)
            fields = resolve1(doc.catalog['AcroForm'])['Fields']

Here is the traceback:

Traceback (most recent call last):
  File "distributionitemimport.py", line 87, in <module>
    fields = resolve1(doc.catalog['AcroForm'])['Fields']
KeyError: 'AcroForm'

When I run print(doc.catalog), I get the following:

{'MarkInfo': {'Marked': True}, 'Lang': b'en-US', 'Type': /'Catalog', 'StructTreeRoot': <PDFObjRef:162>, 'Pages': <PDFObjRef:2>}

- AlliDeacon

2

If you put print(doc.catalog) just before the line that fails, what gets printed? - alecxe

@alecxe I've updated the question to include it. Thanks! - AlliDeacon

What does resolve1 return when you call resolve1(doc.catalog)? - Ajax1234

@Ajax1234 It looks like it's: <class 'dict'> - AlliDeacon

1

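My guess is that it may be password protected, encrypted, or both. Unfortunately, I don't currently know how to deal with that and am struggling with the same problem. - Matt Cremeens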

1 Answer

- user6798889 · Answer 1

The problem is that some of the PDFs in your collection do not have an AcroForm. First, remove all the PDFs that have no AcroForm.

I suggest creating a new folder, putting in a single PDF that does contain an AcroForm, and checking your code against it. It will work.
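
If removing the form-less PDFs by hand is not practical, another option is to have the script skip them. The catalog printed in the question has no 'AcroForm' key, which is why the lookup raises KeyError. Below is a minimal sketch of a guarded lookup, not a definitive fix. The helper names `get_acroform_fields` and `fields_from_pdf` are made up for illustration, and the sample catalogs are plain dicts standing in for pdfminer's `doc.catalog`. The imports assume the pdfminer.six module layout (`pdfminer.pdfparser`, `pdfminer.pdfdocument`, `pdfminer.pdftypes`).

```python
def get_acroform_fields(catalog, resolve=lambda obj: obj):
    """Return the AcroForm field list from a PDF catalog, or [] if absent.

    `catalog` is a dict-like object such as pdfminer's `doc.catalog`.
    `resolve` dereferences indirect objects; pass pdfminer's `resolve1`
    for real documents (the identity default only suits plain dicts).
    """
    acro_form = catalog.get('AcroForm')
    if acro_form is None:
        return []  # this PDF has no interactive form
    form = resolve(acro_form)
    if not isinstance(form, dict):
        return []  # reference did not resolve to a form dictionary
    fields = resolve(form.get('Fields', []))
    return list(fields or [])


def fields_from_pdf(fp):
    """Parse an open binary PDF stream and return its form fields (may be empty)."""
    # Imported lazily so the helper above can be used without pdfminer installed.
    from pdfminer.pdfparser import PDFParser
    from pdfminer.pdfdocument import PDFDocument
    from pdfminer.pdftypes import resolve1

    doc = PDFDocument(PDFParser(fp))
    return get_acroform_fields(doc.catalog, resolve=resolve1)


# Hypothetical stand-in catalogs: the first mirrors the question (no 'AcroForm').
catalog_without_form = {'MarkInfo': {'Marked': True}, 'Lang': b'en-US'}
catalog_with_form = {'AcroForm': {'Fields': ['field-a', 'field-b']}}

print(get_acroform_fields(catalog_without_form))
print(get_acroform_fields(catalog_with_form))
```

In the original script, `fields = fields_from_pdf(tmp)` could replace the failing line, and an empty list would mean the attachment is not a fillable form and can be skipped.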