Dealing with patches that only change whitespace

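In a piece of code I maintain, I sometimes receive pull requests in which the contributor has reflowed a paragraph for no apparent reason. Here is an example: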
diff --git a/knuth.tex b/knuth.tex
index 2f6a2f8..7b0827d 100644
--- a/knuth.tex
+++ b/knuth.tex
@@ -1,6 +1,6 @@
 Thus, I came to the conclusion that the designer of a new
 system must not only be the implementer and first
-large||scale user; the designer should also write the first
+large-scale user; the designer should also write the first
 user manual.

 The separation of any of these four components would have
@@ -9,8 +9,7 @@ all these activities, literally hundreds of improvements
 would never have been made, because I would never have
 thought of them or perceived why they were important.

-But a system cannot be successful if it is too strongly
-influenced by a single person. Once the initial design is
-complete and fairly robust, the real test begins as people
-with many different viewpoints undertake their own
-experiments.
+But a system cannot be successful if it is too strongly influenced by
+a single person. Once the initial design is complete and fairly
+robust, the real test begins as people with many different viewpoints
+undertake their own experiments.

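As you can see, the first hunk introduces an actual change by replacing || with -, whereas the second hunk only changes line wrapping and whitespace. In fact, the word-diff of the second hunk is empty.

Is it possible to set up a policy (for example on GitHub or in my CI) that rejects commits containing such "empty" hunks, or to reformat a patch so that these hunks are omitted entirely, so that I can run git apply without them?

Related: How to apply a git word diff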


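Your question seems to be about GitHub rather than Git. I'd suggest using that tag (and perhaps none of the current ones). - torek
@torek The policy does not have to live on GitHub. I could also enforce it in my CI. That is why I put GitHub in parentheses. - Henri Menke
Have you tried changing core.eol, core.safecrlf, or core.autocrlf to deal with the line-ending issue? Is this what you are looking for: https://dev59.com/J3A75IYBdhLWcg3wBkO1? - Yazeed Sabri
@YazeedSabri Thanks for the pointer, but no, the question you linked does not deal with reflowed text. - Henri Menke
@YazeedSabri The problem with --word-diff is that the patches it produces are not compatible with git apply. - Henri Menke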
1 Answer

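If you are looking for a built-in solution, I am not aware of one. That does not mean this cannot be integrated into a CI system fairly easily, though.
You can pipe the output of an appropriate git diff command into the following script, which exits with status 1 if the input contains a hunk like the second one in the example above.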
#!/usr/bin/env ruby

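# Join the lines, split them into paragraphs at blank lines, and collapse
# every run of whitespace within a paragraph into a single space.
def filter(arr)
  arr.join.split("\n\n").map { |x| x.gsub(/\s+/, ' ') }.join("\n\n")
end

# A hunk is rejected when its old and new text are identical once
# whitespace has been normalized.
def should_reject(before, after)
  return false if before.empty? && after.empty?
  filter(before) == filter(after)
end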

chunk = nil
before = []
after = []
while (line = gets)
  trimmed = line[1..-1]
  case line
  when /^(\+\+\+|---)/
    # Do nothing.
  when /^@@ /
    if should_reject(before, after)
      warn "Useless change to hunk #{chunk}"
      exit 1
    end
    chunk = line
    before = []
    after = []
  when /^ /
    before << trimmed
    after << trimmed
  when /^\+/
    after << trimmed
  when /^-/
    before << trimmed
  end
end

if should_reject(before, after)
  warn "Useless change to hunk #{chunk}"
  exit 1
end

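It basically splits each hunk into paragraphs at blank lines, collapses every run of whitespace into a single space, and compares the results. If the two sides are equal, it prints a warning and exits non-zero. You may want to make it more robust, for example by handling CRLF line endings, but the approach is workable.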
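The same comparison could also be used to drop such hunks from a patch instead of rejecting it, which is closer to the second half of the question. The following is only a minimal sketch under stated assumptions: the input is plain `git diff` output (each file starts with a `diff --git` line and there is no `git format-patch` trailer), and the names `normalize`, `whitespace_only?`, and `strip_whitespace_only_hunks` are hypothetical helpers, not part of the script above. Because `\s` also matches `\r`, CRLF endings are collapsed along with other whitespace.

```ruby
# Split text into paragraphs at blank lines and collapse every run of
# whitespace (spaces, tabs, "\r", "\n") in each paragraph to one space.
def normalize(lines)
  lines.join.split(/\n\s*\n/)
       .map { |paragraph| paragraph.gsub(/\s+/, ' ').strip }
       .reject(&:empty?)
end

# True when the old and new side of a hunk body (the lines after the
# "@@" line) are identical after whitespace normalization.
def whitespace_only?(hunk_body)
  before = []
  after = []
  hunk_body.each do |line|
    text = line.sub(/\A[ +-]/, '') # drop the diff marker column
    case line[0]
    when '+' then after << text
    when '-' then before << text
    when '\\' then next # "\ No newline at end of file"
    else
      # Context line (also covers blank lines whose leading space was lost).
      before << text
      after << text
    end
  end
  normalize(before) == normalize(after)
end

# Assumption: plain `git diff` output. Returns the patch text without
# whitespace-only hunks; a file whose hunks are all dropped is omitted.
def strip_whitespace_only_hunks(patch)
  preamble = []
  files = [] # each entry: { header: [lines], hunks: [[lines], ...] }
  patch.each_line do |line|
    if line.start_with?('diff --git ')
      files << { header: [line], hunks: [] }
    elsif files.empty?
      preamble << line
    elsif line.start_with?('@@ ')
      files.last[:hunks] << [line]
    elsif files.last[:hunks].empty?
      files.last[:header] << line
    else
      files.last[:hunks].last << line
    end
  end
  body = files.flat_map do |file|
    # Keep files that never had hunks (e.g. mode changes) untouched.
    next file[:header] if file[:hunks].empty?
    kept = file[:hunks].reject { |hunk| whitespace_only?(hunk.drop(1)) }
    kept.empty? ? [] : file[:header] + kept.flatten
  end
  (preamble + body).join
end
```

A filter built on this could read the patch from standard input and print `strip_whitespace_only_hunks($stdin.read)`. The line numbers in the remaining `@@` headers are left unchanged, so it is worth running `git apply --check` on the result before relying on it.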
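Another approach worth noting is the one-sentence-per-line style. Every sentence sits on a single line regardless of its length, and every line holds exactly one sentence. This makes changes of any kind much easier to diff and avoids the rewrapping problem entirely.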
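To illustrate what that style does to reflowed text, here is a rough sketch that rewrites paragraphs with one sentence per line. It is deliberately naive: the helper name `one_sentence_per_line` is hypothetical, and it assumes a sentence ends at `.`, `!`, or `?` followed by whitespace, so abbreviations such as "e.g." would be split incorrectly.

```ruby
# Rewrite each paragraph so that every sentence occupies exactly one line.
# Assumption: a sentence ends at ".", "!" or "?" followed by whitespace.
def one_sentence_per_line(text)
  text.split(/\n\s*\n/).map do |paragraph|
    paragraph.gsub(/\s+/, ' ')       # undo the existing line wrapping
             .strip
             .split(/(?<=[.!?])\s+/) # break after sentence-ending punctuation
             .join("\n")
  end.reject(&:empty?).join("\n\n")
end
```

Two differently wrapped copies of the same paragraph produce identical output here, so a later reflow would no longer show up in a diff.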
