How to download a single file from a Git repository using Python

Question

7

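I want to download a single file from my Git repository using Python. I am currently using the gitpython library. Cloning with the code below works fine, but I don't want to download the whole directory.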

import os
from git import Repo
git_url = 'stack@127.0.1.7:/home2/git/stack.git'
repo_dir = '/root/gitrepo/'
if __name__ == "__main__":
    Repo.clone_from(git_url, repo_dir, branch='master', bare=True)
    print("OK")

- Ravi Ranjan

What type of file? Which operating system? What is the file path? - Shashank Singh

1

Use git archive --remote. - phd

@ShashankSingh: Any C or CPP source file, on Windows, at the path master/code/repo/ - Ravi Ranjan
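
The `git archive --remote` suggestion from the comments could look roughly like the sketch below. It asks the remote for a tar archive containing only the requested path and reads the file out of that archive in memory. This assumes the server allows `git upload-archive` (a self-hosted repository reached over SSH often does, while hosted services such as GitHub do not), and the function names `build_archive_command`, `read_member`, and `fetch_single_file` are illustrative, not part of any library.

```python
import io
import subprocess
import tarfile


def build_archive_command(remote, ref, path):
    """Return the git command that requests a tar archive of one path."""
    return ["git", "archive", f"--remote={remote}", ref, path]


def read_member(tar_bytes, path):
    """Return the bytes stored under `path` in an in-memory tar archive."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes)) as archive:
        member = archive.extractfile(path)  # raises KeyError if path is absent
        if member is None:  # path exists but is not a regular file
            raise FileNotFoundError(path)
        return member.read()


def fetch_single_file(remote, ref, path):
    """Fetch one file from a remote repository without cloning it."""
    result = subprocess.run(
        build_archive_command(remote, ref, path),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        check=True,  # raise CalledProcessError if git exits with an error
    )
    return read_member(result.stdout, path)
```

With the repository from the question, a call might be `fetch_single_file('stack@127.0.1.7:/home2/git/stack.git', 'master', 'code/repo/main.c')`, where `code/repo/main.c` is a hypothetical file path.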

5 Answers

3

You can also use subprocess in Python:

import subprocess

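args = ['git', 'clone', '--depth=1', 'stack@127.0.1.7:/home2/git/stack.git']
# Capture both streams; without stderr=PIPE, _error would always be None
res = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
output, _error = res.communicate()

# git writes progress messages to stderr, so check the exit code instead
if res.returncode == 0:
    print(output)
else:
    print(_error)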

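However, your main problem remains.

Git does not support downloading part of a repository; you have to download all of it. You can, however, do this through GitHub (see the linked reference).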

- Benyamin Jafari

0

You need to request the raw version of the file! You can get it from raw.github.com.

- Lucifer

I don't think he ever said it was GitHub. - Shashank Singh

My mistake, I assumed it was GitHub. - Lucifer

0

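I don't want to flag this as a direct duplicate, because it doesn't fully capture the scope of this question, but part of what Lucifer mentioned in his answer appears to be the right approach, according to this SO post. In short, git does not allow partial downloads, but some providers (such as GitHub) make them possible through raw content.
That said, Python offers quite a few libraries for downloading, the best known being urllib.request.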
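
As a rough illustration of the raw-content approach with urllib.request, the sketch below builds a raw.githubusercontent.com URL and saves the response to disk. It assumes a public GitHub repository, so it does not apply to the self-hosted repository in the question, and the helper names `build_raw_url` and `download_raw_file` are made up for this example.

```python
import urllib.request
from urllib.parse import quote


def build_raw_url(owner, repo, branch, path):
    """Build the raw-content URL for one file in a public GitHub repository."""
    # quote() keeps "/" intact, so nested paths stay readable
    return (
        "https://raw.githubusercontent.com/"
        f"{owner}/{repo}/{quote(branch)}/{quote(path)}"
    )


def download_raw_file(owner, repo, branch, path, destination):
    """Download one file and write its bytes to `destination`."""
    url = build_raw_url(owner, repo, branch, path)
    with urllib.request.urlopen(url) as response, open(destination, "wb") as out:
        out.write(response.read())
    return destination
```

Writing the response as bytes avoids assumptions about the file's encoding, which matters for source files that are not UTF-8.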

- dennlinger

0

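You can use this function to download the contents of a single file from a specific branch of a GitHub repository. Apart from the standard library, it needs only the requests library.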

import base64
import os
from typing import Optional

import requests


def download_single_file(
    repo_owner: str,
    repo_name: str,
    access_token: str,
    file_path: str,
    branch: str = "main",
    destination_path: Optional[str] = None,
):
    # Default to the same relative path under the current directory
    if destination_path is None:
        destination_path = "./" + file_path

    url = f"https://api.github.com/repos/{repo_owner}/{repo_name}/contents/{file_path}?ref={branch}"

    # Set the headers with the access token
    headers = {
        "Accept": "application/vnd.github+json",
        "Authorization": f"Bearer {access_token}",
    }

    # Send a GET request to the API endpoint
    response = requests.get(url, headers=headers)

    # Check if the request was successful
    if response.status_code == 200:
        # Get the content data from the response
        content_data = response.json()

        # Extract the content and decode it from base64
        content_base64 = content_data.get("content")
        content_bytes = base64.b64decode(content_base64)

        # Create the destination directory if it does not exist yet
        os.makedirs(os.path.dirname(destination_path) or ".", exist_ok=True)

        # Save the raw bytes so that non-UTF-8 files are written unchanged
        with open(destination_path, "wb") as file:
            file.write(content_bytes)

        print("File downloaded successfully.")
    else:
        print(
            "Request failed. Check the repository owner, repository name, access token, and file path."
        )

- Aayush Neupane


- Nils Werner · Accepted Answer

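Don't think of a Git repository as a collection of files, but as a collection of snapshots. Git doesn't let you choose which files to download, but it does let you choose how many snapshots to download: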

git clone stack@127.0.1.7:/home2/git/stack.git

downloads all snapshots of all files, whereas

git clone --depth 1 stack@127.0.1.7:/home2/git/stack.git

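downloads only the latest snapshot of all files. You still download every file, but at least you skip their entire history.

From those files, you can simply pick the one you want and delete the rest: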

import os
import git
import shutil
import tempfile

# Create temporary dir
t = tempfile.mkdtemp()
# Clone into temporary dir
git.Repo.clone_from('stack@127.0.1.7:/home2/git/stack.git', t, branch='master', depth=1)
# Copy desired file from temporary dir
shutil.move(os.path.join(t, 'setup.py'), '.')
# Remove temporary dir
shutil.rmtree(t)