Python - Download files from a SharePoint site

26
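I need to download and upload files to a SharePoint site using Python. My site URL is https://ourOrganizationName.sharepoint.com/Followed, with further links beyond that. I initially planned to do this with Requests, BeautifulSoup, and similar libraries, but I cannot use "Inspect Element" on the body of the site. I tried libraries such as Sharepoint, HttpNtlmAuth, and office365, but none of them worked; they always return a 403 error. I have searched Google as thoroughly as I can without success, and even YouTube did not help. Can anyone help me solve this? Library suggestions with links to documentation would be much appreciated. Thanks!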

1
Have you looked at the requests library? - JGerulskis
3
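The 403 error is thrown because of an authentication error. You should check whether you have permission to access this site and whether authentication data is supplied in the request. - Konrad
Could you provide a working SharePoint site link? Maybe we can work backwards from the expected page structure. - Nick Settje
I also tried the requests library, but it did not work. The requirement is to upload files to the SharePoint back end so that web users can use them. Is there any documentation for the Python library Sharepoint? - DKS
3 Answers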

36

Have you tried Office365-REST-Python-Client? It supports SharePoint Online authentication and lets you download/upload files as demonstrated below:

Download a file

from office365.runtime.auth.authentication_context import AuthenticationContext
from office365.sharepoint.client_context import ClientContext
from office365.sharepoint.files.file import File

ctx_auth = AuthenticationContext(url)
ctx_auth.acquire_token_for_user(username, password)   
ctx = ClientContext(url, ctx_auth)
response = File.open_binary(ctx, "/Shared Documents/User Guide.docx")
with open("./User Guide.docx", "wb") as local_file:
    local_file.write(response.content)

Upload a file

import os

ctx_auth = AuthenticationContext(url)
ctx_auth.acquire_token_for_user(username, password)
ctx = ClientContext(url, ctx_auth)

path = "./User Guide.docx"  # local path
with open(path, 'rb') as content_file:
    file_content = content_file.read()
target_url = "/Shared Documents/{0}".format(os.path.basename(path))  # target URL of the file
File.save_binary(ctx, target_url, file_content)  # upload the file

Usage

Install the latest version (from GitHub):

pip install git+https://github.com/vgrem/Office365-REST-Python-Client.git

See /examples/sharepoint/files/* for more details.


@VadimGremyachev - the link you provided is empty. - kensai
2
Still not working ::-)) but I found it https://github.com/vgrem/Office365-REST-Python-Client/blob/master/examples/sharepoint/file_operations.py - kensai
I got an SSL error when installing from git; use this command to install instead: pip install Office365-REST-Python-Client. - DonkeyKong
3
I can't edit the answer because the user is not accepting edits. Here are the imports needed to make it work: from office365.runtime.auth.authentication_context import AuthenticationContext from office365.sharepoint.client_context import ClientContext from office365.sharepoint.file import File - Zach
Does this work if the O365 account is protected by MFA? - Rajesh Swarnkar
How can the download code be changed to list all CSV files? - undefined

1
You can try the solution below to upload a file. For me, the first upload solution did not work. Step one: pip3 install Office365-REST-Python-Client==2.3.11
import os
from office365.sharepoint.client_context import ClientContext
from office365.runtime.auth.user_credential import UserCredential

def print_upload_progress(offset):
    print("Uploaded '{0}' bytes from '{1}'...[{2}%]".format(offset, file_size, round(offset / file_size * 100, 2)))


# Load file to upload:
# NOTE: `filename` must already be set to the name of the local file
path = './' + filename # if file to upload is in the same directory
try:
    with open(path, 'rb') as content_file:
        file_content = content_file.read()
except Exception as e:
    print(e)

file_size = os.path.getsize(path)

site_url = "https://YOURDOMAIN.sharepoint.com"
user_credentials = UserCredential('user_login', 'user_password') # this user must be able to log in to the site

ctx = ClientContext(site_url).with_credentials(user_credentials)

size_chunk = 1000000
target_url = "/sites/folder1/folder2/folder3/"
target_folder = ctx.web.get_folder_by_server_relative_url(target_url)


# Upload file to SharePoint:
try:
    uploaded_file = target_folder.files.create_upload_session(path, size_chunk, print_upload_progress).execute_query()
    print('File {0} has been uploaded successfully'.format(uploaded_file.serverRelativeUrl))
except Exception as e:
    print("Error while uploading to SharePoint:\n", e)

Based on: https://github.com/vgrem/Office365-REST-Python-Client/blob/e2b089e7a9cf9a288204ce152cd3565497f77215/examples/sharepoint/files/upload_large_file.py
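
The progress callback above reads `file_size` from a module-level global. As a rough sketch of the same arithmetic without the global, the snippet below builds the callback from a closure. The names `format_progress` and `make_progress_callback` are hypothetical and not part of Office365-REST-Python-Client, and it assumes the callback is handed the uploaded byte offset, as in the code above.

```python
def format_progress(offset, file_size):
    """Return a progress line for `offset` bytes uploaded out of `file_size`."""
    # Guard against empty files so we never divide by zero
    percent = round(offset / file_size * 100, 2) if file_size else 100.0
    return "Uploaded {0} of {1} bytes [{2}%]".format(offset, file_size, percent)


def make_progress_callback(file_size):
    """Build a one-argument callback with the same shape as print_upload_progress."""
    def callback(offset):
        print(format_progress(offset, file_size))
    return callback


# Simulate three chunk notifications for a hypothetical 2.5 MB file
report = make_progress_callback(2_500_000)
for offset in (1_000_000, 2_000_000, 2_500_000):
    report(offset)
```

A callback built this way could be passed in place of `print_upload_progress`, provided the installed library version still calls it with a single offset argument.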


0
This is how you can do it if you have a public SharePoint URL (no authentication required)
import requests, mimetypes

# Specify file sharepoint URL
file_url = 'https://organisarion-my.sharepoint.com/:b:/p/user1/Eej3XCFj7N1AqErjlxrzebgBO7NJMV797ClDPuKkBEi6zg?e=dJf2tJ'

# Specify destination filename
save_path = 'file'

# Make GET request with allow_redirects
res = requests.get(file_url, allow_redirects=True)

if res.status_code == 200:
    # Get redirect url & cookies for using in next request
    new_url = res.url
    cookies = res.cookies.get_dict()
    for r in res.history:
        cookies.update(r.cookies.get_dict())
    
    # Do some magic on redirect url
    new_url = new_url.replace("onedrive.aspx","download.aspx").replace("?id=","?SourceUrl=")

    # Make new redirect request
    response = requests.get(new_url, cookies=cookies)

    if response.status_code == 200:
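        content_type = response.headers.get('Content-Type', '')
        print(content_type)
        # Drop parameters such as "; charset=utf-8" before guessing the extension
        file_extension = mimetypes.guess_extension(content_type.split(';')[0].strip())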
        print(response.content)
        if file_extension:
            destination_with_extension = f"{save_path}{file_extension}"
        else:
            destination_with_extension = save_path

        with open(destination_with_extension, 'wb') as file:
            for chunk in response.iter_content(1024):
                file.write(chunk)
        print("File downloaded successfully!")
    else:
        print("Failed to download the file.")
        print(response.status_code)


In short: collect the cookies and the redirect URL, then make a new GET request using those cookies.
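
The two steps described here (merging cookies across the redirect chain and rewriting the redirect URL) can be pulled out into small helpers that are easy to check without touching the network. This is only a sketch: `to_download_url` and `merge_cookies` are hypothetical names, the example URL is made up, and the `onedrive.aspx` to `download.aspx` rewrite is the same undocumented trick used in the answer, so it may break if SharePoint changes its redirect format.

```python
def to_download_url(redirect_url):
    """Rewrite the viewer URL a public share link redirects to into a download URL.

    Mirrors the string replacement from the answer; this is not an official API.
    """
    return (redirect_url
            .replace("onedrive.aspx", "download.aspx")
            .replace("?id=", "?SourceUrl="))


def merge_cookies(final_cookies, history_cookies):
    """Combine the final response's cookies with those set on each redirect hop.

    Later entries overwrite earlier ones, matching the update order in the answer.
    """
    merged = dict(final_cookies)
    for hop in history_cookies:
        merged.update(hop)
    return merged


# Hypothetical redirect target, for illustration only
example = ("https://example-my.sharepoint.com/personal/user1/_layouts/15/"
           "onedrive.aspx?id=%2Fpersonal%2Fuser1%2FDocuments%2Freport.pdf")
print(to_download_url(example))
```

With requests, the inputs would come from `res.url`, `res.cookies.get_dict()`, and `[r.cookies.get_dict() for r in res.history]`.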
