使用Python Pandas读取SharePoint Excel文件

Question

使用Python Pandas读取SharePoint Excel文件

3

"我正在尝试使用来自如何在Python中读取SharePoint Online(Office365) Excel文件，特别是带有工作或学校帐户的pandas？的答案，但我收到了XLRDError：Unsupported format，or corrupt file: Expected BOF record; found b'\r\n<!DOCT'的错误。我认为问题在于我放置路径的方式。是否有人知道如何获取这种类型的SharePoint路径，就像下面的例子一样?" 我得到的看起来更像这样"https://company.sharepoint.com/sites/site/Shared%20Documents/Forms/AllItems.aspx"

#import all the libraries
from office365.runtime.auth.authentication_context import AuthenticationContext
from office365.sharepoint.client_context import ClientContext
from office365.sharepoint.files.file import File 
import io
import pandas as pd

#target url taken from sharepoint and credentials
url = 'https://company.sharepoint.com/Shared%20Documents/Folder%20Number1/Folder%20Number2/Folder3/Folder%20Number4/Target_Excel_File_v4.xlsx?cid=_Random_letters_and_numbers-21dbf74c'
username = 'Dumby_account@company.com'
password = 'Password!'

ctx_auth = AuthenticationContext(url)
if ctx_auth.acquire_token_for_user(username, password):
  ctx = ClientContext(url, ctx_auth)
  web = ctx.web
  ctx.load(web)
  ctx.execute_query()
  print("Authentication successful")

response = File.open_binary(ctx, url)

#save data to BytesIO stream
bytes_file_obj = io.BytesIO()
bytes_file_obj.write(response.content)
bytes_file_obj.seek(0) #set file object to start

#read excel file and each sheet into pandas dataframe 
df = pd.read_excel(bytes_file_obj, sheetname = None)

- Hector Castro

这对我很有效，谢谢！ - Laurent

2个回答

0

看起来您正在使用共享链接而不是文件路径。您需要复制正确的路径。以下是方法：

打开SharePoint文件夹
单击文件中的3个点，然后单击详细信息
向下滚动并复制路径路径应该类似于：'/user/folder/Documents/Target_Excel_File_v4.xlsx'

使用SharePoint网址进行身份验证，然后使用已复制的路径打开二进制文件。

#import all the libraries
from office365.runtime.auth.authentication_context import AuthenticationContext
from office365.sharepoint.client_context import ClientContext
from office365.sharepoint.files.file import File 
import io
import pandas as pd

#target url taken from sharepoint and credentials
url = 'https://company.sharepoint.com/user/folder'
path = '/user/folder/Documents/Target_Excel_File_v4.xlsx'
username = 'Dumby_account@company.com'
password = 'Password!'

ctx_auth = AuthenticationContext(url)
if ctx_auth.acquire_token_for_user(username, password):
  ctx = ClientContext(url, ctx_auth)
  web = ctx.web
  ctx.load(web)
  ctx.execute_query()
  print("Authentication successful")

response = File.open_binary(ctx, path)

#save data to BytesIO stream
bytes_file_obj = io.BytesIO()
bytes_file_obj.write(response.content)
bytes_file_obj.seek(0) #set file object to start

#read excel file and each sheet into pandas dataframe 
df = pd.read_excel(bytes_file_obj, sheet_name = None)
print(df)

- Sprookd

类型错误：read_excel（）收到了一个意外的关键字参数'sheetname'。 - doubando

@doubando，有一个_符号丢失了。我已经更新为sheet_name。 - Sprookd

网页内容由stack overflow 提供, 点击上面的

可以查看英文原文，
原文链接

- Hector Castro · Accepted Answer

我是通过在桌面上打开文件，然后进入“文件”>“信息”>“复制路径”来完成的。这条路径应该是可用的。