How to fix Git error "broken link from tree to tree"?

Question

17

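I made a commit that was interrupted, and when I tried again I got an error saying an object was empty or corrupted. Following the advice in another question, I deleted all the empty files and then ran: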

git fsck --full

I got this error:

Checking object directories: 100% (256/256), done.
Checking objects: 100% (48774/48774), done.
error: d193ccbc48a30e8961e9a2515a708e228d5ea16d: invalid sha1 pointer in cache-tree
error: df084ac4214f1a981481b40080428950865a6b31: invalid sha1 pointer in cache-tree
broken link from    tree 4bf4869299b294be9dee4ecdcb45d2c204ce623b
          to    tree df084ac4214f1a981481b40080428950865a6b31
broken link from    tree 4bf4869299b294be9dee4ecdcb45d2c204ce623b
          to    tree d193ccbc48a30e8961e9a2515a708e228d5ea16d
missing tree df084ac4214f1a981481b40080428950865a6b31
missing blob a632281618ca6895282031732d28397c18038e35
missing tree d193ccbc48a30e8961e9a2515a708e228d5ea16d
missing blob 70aa143b05d1d7560e22f61fb737a1cab4ff74c6
missing blob c21c0545e08f5cac86ce4dde103708a1642f23fb
missing blob 9f341b8a9fcd26af3c44337ee121e2d6f6814088
missing blob 396aaf36f602018f88ce985df85e73a71dea6f14
missing blob 87b9d1933d37cc9eb7618c7984439e3c2e685a11

How can I fix this?

Git

- SSM89

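With Git 2.10 (Q3 2016), git fsck --name-objects can help. See my answer below. - VonC

I found this question because of a broken link in a recent push on the main branch, which my local main branch could not cleanly match. I found this article: https://blog.pterodactylus.net/2020/10/18/fixing-a-git-repository-with-broken-links/ and it helped me recover the missing/corrupted packs and fix those broken links. - unmultimedio

5 Answers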

6

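I had a very similar problem, in which a broken link from a tree caused some git commands to fail with fatal: bad tree object.

Running the following commands fixed it:

Fix the problem

git stash clear ([optional] just deletes stashes, which may have been corrupted by a rebase or something else)
git reflog expire --expire-unreachable=now --all (removes dangling commits)
git gc --prune=now (also removes commits)

Check that it is fixed

git fsck --full --name-objects (checks integrity, and should report no dangling commits or bad trees)

After that, the fatal: bad tree object error was gone! :tada: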
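
The steps above can be collected into one small script. This is only a sketch: the function name prune_unreachable and the temporary backup location are made up for illustration, and because git stash clear and git gc --prune=now permanently discard data, it copies the .git directory aside before touching anything.

```shell
#!/bin/sh
# Sketch only: prune_unreachable is a hypothetical helper name.
# WARNING: this permanently drops all stashes and every unreachable commit.
prune_unreachable() {
    repo=${1:-.}
    git_dir=$(git -C "$repo" rev-parse --absolute-git-dir) || return 1

    # Keep a copy of the repository metadata in case the cleanup makes things worse.
    backup=$(mktemp -d "${TMPDIR:-/tmp}/git-backup.XXXXXX") || return 1
    cp -R "$git_dir/." "$backup" || return 1
    echo "Backup of $git_dir saved to $backup"

    # Same commands as in the answer; stop at the first one that fails.
    git -C "$repo" stash clear &&
    git -C "$repo" reflog expire --expire-unreachable=now --all &&
    git -C "$repo" gc --prune=now &&
    git -C "$repo" fsck --full --name-objects
}

# Usage (from anywhere): prune_unreachable /path/to/repo
```

If the broken link is reachable from a branch rather than only from a stash or reflog entry, git gc is likely to fail instead, and the missing objects have to be recovered from somewhere else.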

- Ben Winding

6

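Here is what I used to resolve the "broken link" error: the answer that sehe gave in this answer to a question about how to fix an unable to find <insert sha1 code here> error.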

Like Adam said, recover the object from another repository/clone.
On a 'complete' Git database:
git cat-file -p a47058d09b4ca436d65609758a9dba52235a75bd > tempfile
and on the receiving end:
git hash-object -w tempfile

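The important addition is that, between steps 1 and 2, the file must be transferred directly from one location to the other. In my experience, moving the temp file with Git push and pull does not work.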
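
One caveat with the quoted commands: git cat-file -p pretty-prints trees, and git hash-object writes a blob unless told otherwise, so as written they only round-trip blobs. A hedged sketch of a type-aware variant follows; copy_object is a made-up helper name, and both repositories are assumed to be reachable from the same machine (otherwise the temp file is what you copy across).

```shell
#!/bin/sh
# Sketch only: copy_object is a hypothetical helper name.
# usage: copy_object GOOD_REPO BROKEN_REPO OBJECT_ID
copy_object() {
    good=$1
    broken=$2
    oid=$(git -C "$good" rev-parse --verify "$3") || return 1
    type=$(git -C "$good" cat-file -t "$oid") || return 1
    tmp=$(mktemp) || return 1

    # "cat-file <type>" emits the raw object body, which is what hash-object needs.
    git -C "$good" cat-file "$type" "$oid" > "$tmp" || { rm -f "$tmp"; return 1; }

    # When the repositories live on different machines, transfer "$tmp" here.
    # --no-filters keeps end-of-line or attribute conversion from altering the bytes.
    new=$(git -C "$broken" hash-object --no-filters -t "$type" -w "$tmp")
    rm -f "$tmp"

    # The object is restored only if it hashes to the ID that was missing.
    [ "$new" = "$oid" ] && echo "restored $type $oid"
}
```

Running it once per "missing" line reported by git fsck --full restores each object that the good clone still has.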

- Tim Nafziger

3

The git gc --aggressive command cleans up unnecessary files and optimizes the local repository.

You can verify that the problem is resolved with:

git fsck --full

- CodeWizard

6

When there is a broken link, this does not work: "error: Could not read xxxxxxxxx", "fatal: Failed to traverse parents of commit yyyyyyyyyy", "error: failed to run repack". - Guillaume D

-1

$ git push -f origin <last_good_commit>:<branch_name>

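This command force-pushes to the remote repository, overwriting the remote branch so that it points at the given commit. Here <last_good_commit> is the last commit known to be intact, and <branch_name> is the name of the branch.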

- Lemon Sandwich

- VonC · Accepted Answer

Starting with Git 2.10 (Q3 2016), you can find out where those broken links come from.

git fsck --name-objects

See commit 90cf590, commit 1cd772c, commit 7b35efd, commit 993a21b (17 Jul 2016) by Johannes Schindelin (dscho).
(Merged by Junio C Hamano -- gitster -- in commit 9db3979, 25 Jul 2016)

fsck: optionally show more helpful info for broken links

When reporting broken links between commits/trees/blobs, it would be quite helpful at times if the user would be told how the object is supposed to be reachable.

With the new --name-objects option, git-fsck will try to do exactly that:
name the objects in a way that shows how they are reachable.

For example, when some reflog got corrupted and a blob is missing that should not be, the user might want to remove the corresponding reflog entry.
This option helps them find that entry: git fsck --name-objects will now report something like this:
  broken link from    tree b5eb6ff...  (refs/stash@{<date>}~37:)
                to    blob ec5cf80...
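
To pull just the broken links out of a long fsck report, a small filter can join each "from"/"to" pair onto one line. This is a sketch that assumes the two-line layout shown above; list_broken_links is a made-up name.

```shell
#!/bin/sh
# Sketch only: list_broken_links is a hypothetical helper name.
# Reads `git fsck` output on stdin, prints "<from> -> <to>" for each broken link.
list_broken_links() {
    awk '
        /^broken link from/ {
            sub(/^broken link from[ \t]+/, "")   # keep "<type> <id> (<name>)"
            from = $0
            next
        }
        from != "" && /^[ \t]*to[ \t]/ {
            sub(/^[ \t]*to[ \t]+/, "")           # keep "<type> <id>"
            print from " -> " $0
            from = ""
        }
    '
}

# Usage: git fsck --full --name-objects 2>&1 | list_broken_links
```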

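If those broken links do not come from a local stash but from a remote repository, fetching those packed objects again can resolve the problem.
See also "How to recover Git objects damaged by hard disk failure?".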
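
One way to fetch those packed objects again is to take a fresh clone and copy its packs into the damaged repository. The sketch below assumes the remote still has the missing objects; objects from a commit that was never pushed cannot be recovered this way. The name restore_packs and the temporary mirror directory are made up for illustration.

```shell
#!/bin/sh
# Sketch only: restore_packs is a hypothetical helper name.
# usage: restore_packs REMOTE_URL BROKEN_REPO
restore_packs() {
    url=$1
    broken=$2
    git_dir=$(git -C "$broken" rev-parse --absolute-git-dir) || return 1
    fresh=$(mktemp -d) || return 1

    # 1. Mirror-clone the remote so that every ref and its objects are fetched.
    # 2. Repack the clone so all of its objects end up inside pack files.
    # 3. Copy those pack files next to the damaged repository's own packs.
    git clone --quiet --mirror "$url" "$fresh/mirror.git" &&
    git -C "$fresh/mirror.git" repack -a -d -q &&
    mkdir -p "$git_dir/objects/pack" &&
    cp "$fresh"/mirror.git/objects/pack/pack-* "$git_dir/objects/pack/"
    status=$?
    rm -rf "$fresh"
    [ "$status" -eq 0 ] || return "$status"

    # Anything fsck still reports as missing was not available on the remote.
    git -C "$broken" fsck --full
}
```

Copying whole packs is blunt but safe: Git ignores duplicate objects, and a later git gc folds the extra packs together.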

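With Git 2.31 (Q1 2021), "git fsck --name-objects" is fixed; apparently nobody who used it had been motivated enough to report the breakage.

See commit e89f893, commit 8c891ee (10 Feb 2021) by Johannes Schindelin (dscho).
(Merged by Junio C Hamano -- gitster -- in commit 9e634a9, 17 Feb 2021)

fsck --name-objects: be more careful parsing generation numbers

Signed-off-by: Johannes Schindelin

In 7b35efd ("fsck_walk(): optionally name objects on the go", 2016-07-17, Git v2.10.0-rc0 -- merge listed in batch #7), the fsck machinery learned to optionally name the objects, so that it is easier to see which part of the repository is in bad shape, say, when objects are missing. To save on complexity, this machinery uses a parser to determine the name of a parent given a commit's name: any ~<n> suffix is parsed and the parent's name is formed from the prefix together with ~<n+1>. However, this parser has a bug: if it finds a suffix <n> that is not ~<n>, it will mistake the empty string for the prefix and <n> for the generation number. In other words, it will generate a name of the form ~<bogus-number>. Let's fix this.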
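
The naming rule and the bug can be mimicked in a few lines of shell. This is an illustration of the description above, not a port of Git's C code: both function names are made up, and appending ~1 to a name that has no ~<n> suffix is an assumption chosen for the example.

```shell
#!/bin/sh
# Sketch only: both helpers are hypothetical and merely mirror the commit message.

# Buggy variant: treats any trailing digits as the generation number,
# even when no "~" precedes them.
parent_name_buggy() {
    name=$1
    digits=${name##*[!0-9]}      # trailing run of digits, possibly empty
    if [ -z "$digits" ]; then
        printf '%s\n' "$name~1"  # assumed spelling for "no generation yet"
        return
    fi
    rest=${name%"$digits"}
    case $rest in
        *"~") prefix=${rest%?} ;;   # proper "~<n>" suffix: drop the "~"
        *)    prefix= ;;            # the bug: prefix is left empty
    esac
    printf '%s\n' "$prefix~$((digits + 1))"
}

# Fixed variant: only a real "~<n>" suffix is taken as a generation number.
parent_name() {
    name=$1
    digits=${name##*[!0-9]}
    rest=${name%"$digits"}
    case $rest in
        *"~")
            if [ -n "$digits" ]; then
                printf '%s\n' "${rest%?}~$((digits + 1))"
                return
            fi
            ;;
    esac
    printf '%s\n' "$name~1"
}

parent_name_buggy 'HEAD~37'   # prints HEAD~38
parent_name_buggy 'HEAD^2'    # prints ~3 (the bogus name)
parent_name 'HEAD^2'          # prints HEAD^2~1
```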

With Git 2.40 (Q1 2023), "git hash-object" now checks that the resulting object is well-formed, using the same code as git fsck.

See commit 8e43090 (19 Jan 2023), and commit 69bbbe4, commit 35ff327, commit 34959d8, commit ad5dfea, commit 61cc4be, commit 6e26460 (18 Jan 2023) by Jeff King (peff).
(Merged by Junio C Hamano -- gitster -- in commit abf2bb8, 30 Jan 2023)

hash-object: use fsck for object checks

Signed-off-by: Jeff King

Since c879daa ("Make hash-object more robust against malformed objects", 2011-02-05, Git v1.7.5-rc0 -- merge), we've done some rudimentary checks against objects we're about to write by running them through our usual parsers for trees, commits, and tags.

These parsers catch some problems, but they are not nearly as careful as the fsck functions (which makes sense; the parsers are designed to be fast and forgiving, bailing only when the input is unintelligible).
We are better off doing the more thorough fsck checks when writing objects.
Doing so at write time is much better than writing garbage only to find out later (after building more history atop it!) that fsck complains about it, or hosts with transfer.fsckObjects reject it.

This is obviously going to be a user-visible behavior change, and the test changes earlier in this series show the scope of the impact.
But I'd argue that this is OK:

the documentation for hash-object is already vague about which checks we might do, saying that --literally will "allow any garbage[...] which might not otherwise pass standard object parsing or git-fsck checks".
So we are already covered under the documented behavior.

users don't generally run hash-object anyway.
There are a lot of spots in the tests that needed to be updated because creating garbage objects is something that Git's tests disproportionately do.

it's hard to imagine anyone thinking the new behavior is worse.
Any object we reject would be a potential problem down the road for the user.
And if they really want to create garbage, --literally is already the escape hatch they need.

Note that the change here is actually in index_mem(), which handles the HASH_FORMAT_CHECK flag passed by hash-object.
That flag is also used by "git-replace --edit" to sanity-check the result.
Covering that with more thorough checks likewise seems like a good thing.

Besides being more thorough, there are a few other bonuses:
we get rid of some questionable stack allocations of object structs.
These don't seem to currently cause any problems in practice, but they subtly violate some of the assumptions made by the rest of the code (e.g., the "struct commit" we put on the stack and zero-initialize will not have a proper index from alloc_commit_index()).

likewise, those parsed object structs are the source of some small memory leaks
the resulting messages are much better.
For example:
[before]
$ echo 'tree 123' | git hash-object -t commit --stdin
error: bogus commit object 0000000000000000000000000000000000000000
fatal: corrupt commit

[after]
$ echo 'tree 123' | git.compile hash-object -t commit --stdin
error: object fails fsck: badTreeSha1: invalid 'tree' line format - bad sha1
fatal: refusing to create malformed object
