使用Gmail API从Gmail下载附件

24

我正在使用Gmail API访问我的Gmail数据,以及Google Python API客户端

根据文档,在Python中获取邮件附件的示例代码是这个。但是,我尝试使用相同的代码时出现错误:

AttributeError: 'Resource'对象没有属性'user'

我遇到错误的代码行:

message = service.user().messages().get(userId=user_id, id=msg_id).execute()

于是我尝试通过替换user(),来使用users()

message = service.users().messages().get(userId=user_id, id=msg_id).execute()

但我在 for part in message['payload']['parts'] 中没有得到 part['body']['data']

7个回答

54

在扩展 @Eric 的回答后,我编写了以下经过更正的 GetAttachments 函数版本(来自文档):

# based on Python example from 
# https://developers.google.com/gmail/api/v1/reference/users/messages/attachments/get
# which is licensed under Apache 2.0 License

import base64
from apiclient import errors

def GetAttachments(service, user_id, msg_id):
    """Get and store attachment from Message with given id.

    :param service: Authorized Gmail API service instance.
    :param user_id: User's email address. The special value "me" can be used to indicate the authenticated user.
    :param msg_id: ID of Message containing attachment.
    """
    try:
        message = service.users().messages().get(userId=user_id, id=msg_id).execute()

        for part in message['payload']['parts']:
            if part['filename']:
                if 'data' in part['body']:
                    data = part['body']['data']
                else:
                    att_id = part['body']['attachmentId']
                    att = service.users().messages().attachments().get(userId=user_id, messageId=msg_id,id=att_id).execute()
                    data = att['data']
                file_data = base64.urlsafe_b64decode(data.encode('UTF-8'))
                path = part['filename']

                with open(path, 'w') as f:
                    f.write(file_data)

    except errors.HttpError, error:
        print 'An error occurred: %s' % error

11
对于那些无法写入文件的人,使用'wb',因为有时数据不是字符串,实际上是二进制数据。 - Shashank
1
and inline images ? - PreguntonCojoneroCabrón

11

如果您按照@Ilya V. Schurov@Cam T的回答方式仍然错过附件,原因是电子邮件结构可能基于mimeType而不同。

这个答案的启发,这是我解决问题的方法。

import base64
from apiclient import errors

def GetAttachments(service, user_id, msg_id, store_dir=""):
    """Get and store attachment from Message with given id.
        Args:
            service: Authorized Gmail API service instance.
            user_id: User's email address. The special value "me"
                can be used to indicate the authenticated user.
            msg_id: ID of Message containing attachment.
            store_dir: The directory used to store attachments.
    """
    try:
        message = service.users().messages().get(userId=user_id, id=msg_id).execute()
        parts = [message['payload']]
        while parts:
            part = parts.pop()
            if part.get('parts'):
                parts.extend(part['parts'])
            if part.get('filename'):
                if 'data' in part['body']:
                    file_data = base64.urlsafe_b64decode(part['body']['data'].encode('UTF-8'))
                    #self.stdout.write('FileData for %s, %s found! size: %s' % (message['id'], part['filename'], part['size']))
                elif 'attachmentId' in part['body']:
                    attachment = service.users().messages().attachments().get(
                        userId=user_id, messageId=message['id'], id=part['body']['attachmentId']
                    ).execute()
                    file_data = base64.urlsafe_b64decode(attachment['data'].encode('UTF-8'))
                    #self.stdout.write('FileData for %s, %s found! size: %s' % (message['id'], part['filename'], attachment['size']))
                else:
                    file_data = None
                if file_data:
                    #do some staff, e.g.
                    path = ''.join([store_dir, part['filename']])
                    with open(path, 'w') as f:
                        f.write(file_data)
    except errors.HttpError as error:
        print 'An error occurred: %s' % error

你是如何处理那些丢失的附件的?我只看到一个 file_data = None,然后什么也没做。 - gdvalderrama
看一下 while 语句,这是区别所在。最后的 else: file_data = None 只是为了代码安全。 - Todor
1
啊,我明白了,区别在于你还处理了最上层的数据(payload['body']['data']),而其他答案只关注了部分中的主体(payload['parts'])。 - gdvalderrama
1
'wb' 而不是 'w' - CodeFarmer
我能够使用您的方法下载PDF附件,但是当我打开这些PDF时,它们无法打开。我收到错误消息:“打开此文档时出错。文件已损坏,无法修复”。 - KawaiKx

9

我测试了以上代码,但并没有起作用。对于其他帖子,我更新了一些内容。WriteFileError

    import base64
    from apiclient import errors


    def GetAttachments(service, user_id, msg_id, prefix=""):
       """Get and store attachment from Message with given id.

       Args:
       service: Authorized Gmail API service instance.
       user_id: User's email address. The special value "me"
       can be used to indicate the authenticated user.
       msg_id: ID of Message containing attachment.
       prefix: prefix which is added to the attachment filename on saving
       """
       try:
           message = service.users().messages().get(userId=user_id, id=msg_id).execute()

           for part in message['payload'].get('parts', ''):
              if part['filename']:
                  if 'data' in part['body']:
                     data=part['body']['data']
                  else:
                     att_id=part['body']['attachmentId']
                     att=service.users().messages().attachments().get(userId=user_id, messageId=msg_id,id=att_id).execute()
                     data=att['data']
            file_data = base64.urlsafe_b64decode(data.encode('UTF-8'))
            path = prefix+part['filename']

            with open(path, 'wb') as f:
                f.write(file_data)

        except errors.HttpError as error:
            print('An error occurred: %s' % error)

此实现解决了问题:TypeError: write()参数必须是str,而不是bytes。 - Regis Albuquerque

4

这绝对是users()

响应消息的格式很大程度上取决于您使用的格式参数。如果您使用默认值(FULL),那么部分将具有part ['body'] ['data'] 或者在数据很大时会有一个attachment_id字段,您可以将其传递给messages().attachments().get()

如果您查看附件文档,您将会看到以下内容: https://developers.google.com/gmail/api/v1/reference/users/messages/attachments

(如果在主消息文档页面上也提到了这一点,那就更好了。)


4

感谢@Ilya V. Schurov@Todor的回答。即使搜索相同的字符串,仍然可能会错过具有附件和没有附件的邮件。以下是我获取包含附件和不包含附件的邮件正文的方法。

def get_attachments(service, msg_id):
try:
    message = service.users().messages().get(userId='me', id=msg_id).execute()

    for part in message['payload']['parts']:
        if part['filename']:
            if 'data' in part['body']:
                data = part['body']['data']
            else:
                att_id = part['body']['attachmentId']
                att = service.users().messages().attachments().get(userId='me', messageId=msg_id,id=att_id).execute()
                data = att['data']
            file_data = base64.urlsafe_b64decode(data.encode('UTF-8'))
            path = part['filename']

            with open(path, 'wb') as f:
                f.write(file_data)
    

except errors.HttpError as error:
    print ('An error occurred: %s') % error

def get_message(service,msg_id):
try:
    message = service.users().messages().get(userId='me', id=msg_id).execute()
    if message['payload']['mimeType'] == 'multipart/mixed':
        for part in message['payload']['parts']:
            for sub_part in part['parts']:
                if sub_part['mimeType'] == 'text/plain':
                    data = sub_part['body']['data']
                    break
            if data:
                break           
    else:
        for part in message['payload']['parts']:
            if part['mimeType'] == 'text/plain':
                data = part['body']['data']
                break
    
    content = base64.b64decode(data).decode('utf-8')
    print(content)

    return content    

except errors.HttpError as error:
    print("An error occured : %s") %error

2
我对上面的代码进行了以下更改,对于包含附件文档的每个电子邮件ID都可以完美运行,希望这可以帮助您,因为使用API示例会出现错误密钥。
def GetAttachments(service, user_id, msg_id, store_dir):

"""Get and store attachment from Message with given id.

Args:
service: Authorized Gmail API service instance.
user_id: User's email address. The special value "me"
can be used to indicate the authenticated user.
msg_id: ID of Message containing attachment.
prefix: prefix which is added to the attachment filename on saving
"""
try:
    message = service.users().messages().get(userId=user_id, id=msg_id).execute()
    for part in message['payload']['parts']:
        newvar = part['body']
        if 'attachmentId' in newvar:
            att_id = newvar['attachmentId']
            att = service.users().messages().attachments().get(userId=user_id, messageId=msg_id, id=att_id).execute()
            data = att['data']
            file_data = base64.urlsafe_b64decode(data.encode('UTF-8'))
            print(part['filename'])
            path = ''.join([store_dir, part['filename']])
            f = open(path, 'wb')
            f.write(file_data)
            f.close()
except errors.HttpError, error:
    print 'An error occurred: %s' % error

Google官方API附件


'wb' instead of 'w' - CodeFarmer

1
from __future__ import print_function
import base64
import os.path
import oauth2client
from googleapiclient.discovery import build
from oauth2client import file,tools
SCOPES = ['https://www.googleapis.com/auth/gmail.readonly']
store_dir=os.getcwd()

def attachment_download():

store = oauth2client.file.Storage('credentials_gmail.json')
creds = store.get()
if not creds or creds.invalid:
    flow = oauth2client.client.flow_from_clientsecrets('client_secrets.json', SCOPES)
    creds = oauth2client.tools.run_flow(flow, store)


try:
    service = build('gmail', 'v1', credentials=creds)
    results = service.users().messages().list(userId='me', labelIds=['XXXX']).execute() # XXXX is label id use INBOX to download from inbox
    messages = results.get('messages', [])
    for message in messages:
        msg = service.users().messages().get(userId='me', id=message['id']).execute()
        for part in msg['payload'].get('parts', ''):

            if part['filename']:
                if 'data' in part['body']:
                    data = part['body']['data']
                else:
                    att_id = part['body']['attachmentId']
                    att = service.users().messages().attachments().get(userId='me', messageId=message['id'],id=att_id).execute()
                    data = att['data']
                file_data = base64.urlsafe_b64decode(data.encode('UTF-8'))

                filename = part['filename']
                print(filename)
                path = os.path.join(store_dir + '\\' 'Downloaded files' + '\\' + filename)

                with open(path, 'wb') as f:
                    f.write(file_data)
                    f.close()
except Exception as error:
    print(error)

请看: 获取标签ID https://developers.google.com/gmail/api/v1/reference/users/labels/list

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接