使用Python/Pandas将数据框写入Google表格

3
我正在使用Google表格来保存一个共享项目的数据。使用Google的Sheets API,我可以访问数据,在Python中处理它,并尝试在函数writer中使用batchUpdate更新Sheets文件。
  • 如果我将数据作为列表传递给该函数,则按预期工作。
  • 如果我传递一个数据框(正如我所希望的那样),我会收到以下错误:TypeError:类型为DataFrame的对象无法进行JSON序列化
  • 如果我使用.to_json(),则会得到以下结果:
googleapiclient.errors.HttpError: <HttpError 400 when requesting https://sheets.googleapis.com/v4/spreadsheets/XXX/values:batchUpdate?alt=json returned "Invalid value at 'data[0].values' (type.googleapis.com/google.protobuf.ListValue), "{"0":{"0":1},"1":{"0":2},"2":{"0":3},"3":{"0":4}}"". Details: "[{'@type': 'type.googleapis.com/google.rpc.BadRequest', 'fieldViolations': [{'field': 'data[0].values', 'description': 'Invalid value at 'data[0].values' (type.googleapis.com/google.protobuf.ListValue), "{"0":{"0":1},"1":{"0":2},"2":{"0":3},"3":{"0":4}}"'}]}]">
任何指针都将不胜感激。
import pickle
import os.path
import pandas as pd
from googleapiclient.discovery import build
from google_auth_oauthlib.flow import InstalledAppFlow
from google.auth.transport.requests import Request
from pprint import pprint

# If modifying these scopes, delete the file token.pickle.
SCOPES = ['https://www.googleapis.com/auth/spreadsheets']

# Spreadsheet ID: https://docs.google.com/spreadsheets/d/XXX/edit#gid=0
SPREADSHEET_ID = 'XXX'
RANGE_NAME = 'contacts'

def writer(df):
    service = build('sheets', 'v4', credentials=gsheet_api(SCOPES))
    sheet_name = 'contacts'
    data = [{'range' : sheet_name, 'values' : df}]
    batch_update_values_request_body = {
        'value_input_option': 'RAW',
        'data': data }

    request = service.spreadsheets().values().batchUpdate(spreadsheetId=SPREADSHEET_ID,
                                                          body=batch_update_values_request_body)
    response = request.execute()
    pprint(response)

df = [[1, 2, 3, 4]]
writer(df)
1个回答

8

我了解您的目标和情况如下:

  • 您希望使用Python和googleapis将数据框架放入Google电子表格中。
  • 您已经能够使用Sheets API获取和放置Google电子表格的值。

对于此问题,以下是一些修改点:

修改要点:

  • I'm not sure about the values of the dataframe. So in this answer, I would like to explain about the modification points using the following sample dataframe.

         A  B  C
      0  1  2  3
      1  4  5  6
      2  7  8  9
    
  • Unfortunately, the dataframe cannot be directly used for the request body of the method "spreadsheets.values.batchUpdate". So in this case, it is required to convert from the dataframe to the 2 dimensional array. For this, I used tolist().

当你使用示例数据框修改脚本后,它会变得如下所示。

修改后的脚本:

df = [[1, 2, 3, 4]]
writer(df)

sampleValue = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
df = pd.DataFrame(sampleValue, columns=list('ABC'))
values = [df.columns.values.tolist()]
values.extend(df.values.tolist())
writer(values)

注意:

  • If you don't want to include the header row, please modify as follows.
    • From

        values = [df.columns.values.tolist()]
        values.extend(df.values.tolist())
      
    • To

        values = df.values.tolist()
      

参考资料:


谢谢,Tanaike,非常完美。 - HamishCrichton
@HamishCrichton 谢谢您的回复。我很高兴您的问题得到了解决。 - Tanaike
1
请注意,如果您有日期时间并收到错误消息 TypeError: Object of type Timestamp is not JSON serializable,请先将相关字段转换为 'str'df = df.astype('str')(将转换所有列)。 - conor

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接