Why doesn't Mercurial need a "recursive merge strategy"?

9
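As far as I know, git's default merge strategy is "recursive", which means that when more than one "common ancestor" turns out to be a "good candidate", git merges them and creates a new "virtual common ancestor" for the contributors. It basically helps in situations where files have already been merged, and avoids merging them again or picking incorrect merge contributors.
My question is: if Mercurial doesn't use "recursive", how does it handle the same situation?
Thanks.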

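I haven't actually tried it, but I heard at the Mercurial Sprint last weekend that recursive can also create bad merges (and that this has actually happened in the past). - tonfa
I mean, Mercurial already avoids re-merging what has already been merged by finding the last common ancestor of the two branches being merged. So if you already have 2 heads + 1 common ancestor, what would the extra ancestors give you? Presumably the changes from those other ancestors are already present in the common ancestor. - Lasse V. Karlsen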
2
@lasse v. karlsen: recursive merge is meant to avoid redoing the same merge over and over in the case of criss-cross merges. - tonfa
2
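I tried to find a concrete example that shows what a criss-cross merge looks like, i.e. with a real file. Could you give me one? Please excuse my ignorance; I'm still learning all the nuances of DVCS, and I'd like to understand this situation and how to avoid it, detect it, and/or resolve it. - Lasse V. Karlsen
I used to be the lead developer of Oracle's internal source control system, so we dealt with this problem all the time. It is one of those problems that cannot be solved in a single merge. The idea of recursive merge only works if every merge you need to perform is an automerge. However, in some trees, to cross-merge between branches you have to perform N three-way merges. If any one of them is not suitable for automerge, you get an error. The way Git does it, like many other things, is wrong and causes blocks of code to be lost in some edge cases. - Jiri Klouda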
2 Answers

4
Most version control systems don't know how to handle a merge that has more than one base. The mathematical merge equation is:
Result = Destination + Sum for I = 1 to N of (Source(I) - Base(I))
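In most cases N=1, and you get a merge of source, destination, and base that a typical three-way merge tool can handle. Even so, many source control systems don't have the right algorithm for finding the base even in this simple case. To do that, you need to trace the merge arrows back up through the version tree until you hit a common ancestor. But sometimes the common ancestor is too far back to fit the N=1 equation above, and in that case you need to find multiple common ancestors of partial merges.
For example, a branch has been merged down and up several times, and then we try to cross-merge the changes on that branch into another branch. In that case N>1, but less than the number of merges on the source branch.
This is one of the hardest things to do in branch merging, and I don't know of any source control system that actually gets it right.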
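To make the "more than one base" case concrete, here is a minimal Python sketch that walks parent links upward and reports the lowest common ancestors of two revisions. The graph, the revision names (`root`, `p`, `q`, `x`, `y`) and the helper names `ancestors` and `merge_bases` are hypothetical, made up for illustration; this is not how any particular version control system implements the search.

```python
def ancestors(parents, node):
    """Return node plus every revision reachable by following parent links."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def merge_bases(parents, left, right):
    """Return the common ancestors that are not ancestors of another common ancestor."""
    common = ancestors(parents, left) & ancestors(parents, right)
    return sorted(
        c for c in common
        if not any(c != other and c in ancestors(parents, other) for other in common)
    )


# Hypothetical criss-cross history: x and y each merge both p and q.
parents = {
    "root": [],
    "p": ["root"],
    "q": ["root"],
    "x": ["p", "q"],
    "y": ["q", "p"],
}

print(merge_bases(parents, "p", "q"))  # a single base: the N=1 case
print(merge_bases(parents, "x", "y"))  # two equally good bases: N>1
```

In the criss-cross history both `p` and `q` qualify, so there is no single base to hand to a three-way merge tool.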

Several years have passed since your answer. From your perspective, what is the best source control system today, and why? Thanks. - mljrg

2
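The original author of Mercurial once wrote about why he did not use a recursive merge strategy (link). Basically, the answer is:

for the cases where ancestor ambiguity is the most interesting [...] recursive merges don't help at all. So I don't think they warrant the extra complexity

But the full answer is really interesting and worth reading. I'll copy it here in case it disappears: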
> Does Mercurial supports recursive merge strategy like git? It is used
> in situation when
> merge has two "common" ancestors (also know as criss-cross merge)
> 
> According to http://codicesoftware.blogspot.com/2011/09/merge-recursive-strategy.html
> Mercurial
> does not support it but I wanted to ask to make sure that nothing has changed.

Indeed. But you shouldn't judge the situation from this blog post as
it's not coherent.

In particular, the example given under "Why merge recursive is better –
a step by step example" doesn't appear to be a recursive merge situation
at all! Notice the key difference in topology as compared with the
initial diagrams: no criss-crossing merges leading up to the merge. Some
kind of bait and switch happening here.

In the example itself, Git will choose the same (single) ancestor in a
merge between nodes 5 and 4 as Mercurial would, 0. And thus both give
the result 'bcdE'. So we've learned precisely nothing about recursive
merge and how it compares to Mercurial from this example. The claim that
Mercurial chooses the "deepest" ancestor: also wrong and nonsensical.
The deepest ancestor is the root.

This seems to be yet another instance of "Git is incomprehensible,
therefore Git is magic, therefore Git magically works better" logic at
work.

Let's _actually_ work his original example diagram which has the
criss-crossing merges (which I guess he copied from someone who knew
what they were talking about). I'm going to ignore the blogger's
nonsensical use of arrows that point the wrong way for branch merges and
thus add cycles into the "directed acyclic graph". Here history flows
from left to right, thus the edges are right to left:

a---b-d-f---?
 \   \ /   / 
  \   X   /
   \ / \ /
    c-e-g

Let's make up a simple set of changes to go with that picture. Again,
think of each character as a line:

a = "a"
b = "a1"
c = "1a"
d = "a2"
e = "2a"
f = merge of d and c = "1a2" 
g = merge of e and b = "2a1"

When we merge f and g, our greatest common ancestor is either b or c. So
we've got the following cases:

b: we had a1 originally, and are looking at 1a2 and 2a1. So we have a
conflict at the start, but can simply choose 2 for the end as only one
side touched the end.

c: we had 1a originally, and are looking at 1a2 and 2a1. So we have a
conflict at the end, but can simply choose 2 for the start as only one
side touched the start.

Mercurial will choose whichever one of these it finds first, so we have
one conflict to resolve. It definitely does not choose 'a' as the
ancestor, which would give two conflicts.

Now what a recursive merge would do would be merging b and c first,
giving us "1a1". So now when we merge, we don't have conflicts at the
front or the back.

So yay, in this simplest of examples, it's a win. But cases where this
actually matters aren't terribly common (let's call it 1% to be
generous) and cases where it actually automatically solves the problem
for you seamlessly are actually less than half of THOSE cases.

Instead, if you've got conflicts in your recursive merge, now you've
made the whole situation more confusing. Take your blog post as Exhibit
A that most people don't understand recursive merge at all which means
when a merge goes wrong, not only do you need an expert to diagnose it,
you need an expert to tell you who the 'experts' even are.

We talk about recursive merge occasionally. But as it happens, for the
cases where ancestor ambiguity is the most interesting (merging with
backouts, exec bit changes), recursive merges don't help at all. So I
don't think they warrant the extra complexity.
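
The worked example in the quoted message can be replayed with a small Python sketch. It assumes a deliberately simplified model that is not Mercurial's or Git's actual algorithm: every version is a tuple of three slots (start, middle, end), and a hypothetical `merge3` helper merges slot by slot, taking a side's value when only that side changed a slot and recording a conflict when both sides changed it differently.

```python
def merge3(base, ours, theirs):
    """Slot-wise three-way merge (toy model, not a real diff3).

    Returns (merged, conflicts): merged holds None in every conflicting
    slot, and conflicts lists the indices of those slots.
    """
    merged, conflicts = [], []
    for i, (b, o, t) in enumerate(zip(base, ours, theirs)):
        if o == t:          # both sides agree
            merged.append(o)
        elif o == b:        # only "theirs" changed this slot
            merged.append(t)
        elif t == b:        # only "ours" changed this slot
            merged.append(o)
        else:               # both changed it, differently
            merged.append(None)
            conflicts.append(i)
    return tuple(merged), conflicts


# Versions from the message, as (start, middle, end) slots.
a = ("", "a", "")
b = ("", "a", "1")    # "a1"
c = ("1", "a", "")    # "1a"
f = ("1", "a", "2")   # "1a2", merge of d and c
g = ("2", "a", "1")   # "2a1", merge of e and b

for name, base in (("b", b), ("c", c), ("a", a)):
    merged, conflicts = merge3(base, f, g)
    print(f"base {name}: merged={merged} conflicting slots={conflicts}")

# Recursive-style merge: first merge the two candidate bases (using a as
# their base), then use that virtual ancestor for the real merge.
virtual, virtual_conflicts = merge3(a, b, c)
merged, conflicts = merge3(virtual, f, g)
print(f"virtual base {''.join(virtual)}: merged={''.join(merged)} conflicting slots={conflicts}")
```

Under these assumptions, bases `b` and `c` each leave one conflicting slot, base `a` leaves two, and the virtual ancestor `1a1` leaves none, which matches the counts given in the message.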
