How do I save/load a Google Cloud Vision OCR protobuf response to/from disk?

Question

python google-cloud-platform protocol-buffers google-cloud-vision

4

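I'm trying to save the responses from Google Cloud Vision OCR to disk, and I found that gzipping and storing the actual protobuf is the most space-efficient option for later processing. That part was easy! But how do I now retrieve the file from disk and parse it back into the original format?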

My question is: where/how do I rebuild the message_pb2 file so that I can parse the file back into a protobuf?

Following the documentation, this is my code so far:

#!/usr/bin/python3
# coding: utf-8

from google.cloud import vision
import gzip, os, io


def ocr_document(path):
    """
    Detects document features in an image.
    Returns response protobuf from API.
    """
    client = vision.ImageAnnotatorClient()

    with io.open(path, 'rb') as image_file:
        content = image_file.read()

    image = vision.types.Image(content=content)

    response = client.document_text_detection(image=image)

    return response

response = ocr_document('handwritten-scan.jpg')
serialized = response.SerializeToString()


with gzip.open('response.pb.gz', 'wb') as f:
    f.write(serialized)
print(os.path.getsize('response.pb.gz'), 'bytes') # Output: 11032 bytes

# Figure this part out!

with gzip.open('response.pb.gz', 'rb') as f:
    serialized=f.read()
    ### parsed = message_pb2.Message()  # < - Protobuf message I'm missing
    parsed.ParseFromString(serialized)
    print(parsed)

- Al Kari

1 Answer

- Al Kari · Answer 1

After browsing through the library code, here is the answer:

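import gzip

# Import path matches the google-cloud-vision release used in the question
# (the one that still exposes vision.types and the generated *_pb2 modules).
from google.cloud.vision_v1.proto import image_annotator_pb2
from google.protobuf.json_format import MessageToDict

with gzip.open('response.pb.gz', 'rb') as lf:
    loaded = lf.read()

# document_text_detection() returns an AnnotateImageResponse,
# so that is the message type to parse the bytes back into.
parsed = image_annotator_pb2.AnnotateImageResponse()
parsed.ParseFromString(loaded)

print(MessageToDict(parsed))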
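
For repeated use, the save step from the question and the load step from this answer can be wrapped into a pair of small helpers. This is only a minimal sketch: the names save_message and load_message are hypothetical (they are not part of any Google library), and it assumes the object passed in is a classic generated protobuf message exposing SerializeToString() and ParseFromString(), as in the code above.

```python
import gzip


def save_message(message, path):
    """Write a protobuf message to `path` as gzip-compressed wire-format bytes.

    Hypothetical helper: assumes `message` has SerializeToString().
    """
    with gzip.open(path, 'wb') as f:
        f.write(message.SerializeToString())


def load_message(message_cls, path):
    """Read gzip-compressed wire-format bytes from `path` into a new message.

    Hypothetical helper: `message_cls` must be the same message type that was
    saved, because the wire format does not record which type it encodes.
    """
    with gzip.open(path, 'rb') as f:
        data = f.read()
    parsed = message_cls()
    parsed.ParseFromString(data)
    return parsed
```

With the names used above, the round trip would be save_message(response, 'response.pb.gz') followed by load_message(image_annotator_pb2.AnnotateImageResponse, 'response.pb.gz'). As far as I know, google-cloud-vision 2.0 and later wrap responses in proto-plus classes without these two methods, so there the equivalent calls would be vision.AnnotateImageResponse.serialize(response) and vision.AnnotateImageResponse.deserialize(data) instead.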