What are the groups of four dashes in the .NET reference source code?

12

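While browsing the source code of PluralizationService, I noticed something odd. The class contains several private dictionaries that reflect different pluralization rules. For example: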

    private string[] _uninflectiveWordList =
        new string[] { 
            "bison", "flounder", "pliers", "bream", "gallows", "proceedings", 
            "breeches", "graffiti", "rabies", "britches", "headquarters", "salmon", 
            "carp", "----", "scissors", "ch----is", "high-jinks", "sea-bass", 
            "clippers", "homework", "series", "cod", "innings", "shears", "contretemps", 
            "jackanapes", "species", "corps", "mackerel", "swine", "debris", "measles", 
            "trout", "diabetes", "mews", "tuna", "djinn", "mumps", "whiting", "eland", 
            "news", "wildebeest", "elk", "pincers", "police", "hair", "ice", "chaos",
            "milk", "cotton", "pneumonoultramicroscopicsilicovolcanoconiosis",
            "information", "aircraft", "scabies", "traffic", "corn", "millet", "rice", 
            "hay", "----", "tobacco", "cabbage", "okra", "broccoli", "asparagus", 
            "lettuce", "beef", "pork", "venison", "mutton",  "cattle", "offspring", 
            "molasses", "shambles", "shingles"};
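What are those groups of four dashes in the strings? I didn't see them handled anywhere in the code, so they aren't some kind of template. The only thing I can think of is that they are censored expletives ('ch----is' would be 'chassis'), which in this case actually hurts readability. Has anyone else come across this? If I were interested in the actual complete list, how would I view it?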

Not sure, but I'd guess it's some kind of wildcard placeholder (e.g., a pattern consisting of ch, then four characters, then is would match). - Chris Disley
4
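"pneumonoultramicroscopicsilicovolcanoconiosis": I bet the tester who found that word had a good chuckle while writing the bug report, and the developer who fixed it smiled too... (according to Wikipedia, it's the longest word in the English language) - Ron Beyer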
2
I can think of only one word (trapezium) that matches t----zium (another word from the same file), so it does look like certain words are being censored. - sgmoore
2
https://dev59.com/D10a5IYBdhLWcg3wLWAG#30631947 - Hans Passant
Isn't "two cabbage" less likely to be correct than "two cabbages"? - Jon Hanna
1 Answer

6

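Looking at the decompiled code with Reflector, I can confirm that the compiled version contains no "----", so some form of censorship does seem to be happening somewhere along the line. The constructor contains this: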

this._uninflectiveWordList = new string[] { 
    "bison", "flounder", "pliers", "bream", "gallows", "proceedings", "breeches", "graffiti", "rabies", "britches", "headquarters", "salmon", "carp", "herpes", "scissors", "chassis", 
    "high-jinks", "sea-bass", "clippers", "homework", "series", "cod", "innings", "shears", "contretemps", "jackanapes", "species", "corps", "mackerel", "swine", "debris", "measles", 
    "trout", "diabetes", "mews", "tuna", "djinn", "mumps", "whiting", "eland", "news", "wildebeest", "elk", "pincers", "police", "hair", "ice", "chaos", 
    "milk", "cotton", "pneumonoultramicroscopicsilicovolcanoconiosis", "information", "aircraft", "scabies", "traffic", "corn", "millet", "rice", "hay", "hemp", "tobacco", "cabbage", "okra", "broccoli", 
    "asparagus", "lettuce", "beef", "pork", "venison", "mutton", "cattle", "offspring", "molasses", "shambles", "shingles"
 };

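As you can see, the censored words are "herpes", "chassis" and "hemp" (if I've followed it properly). Personally I don't think any of them needs censoring, which suggests that some kind of automated system is doing it. I'd assume the original source contained these words, rather than them being added in some kind of precompile merge (if nothing else, because "----" really isn't enough to say what should replace it). I'd imagine that, for some reason, the reference source website censors them.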
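The comparison above was done by eye. As a minimal sketch of the same check, the published and decompiled arrays can be compared position by position. The class and method names (`CensorDiff`, `FindCensored`) are made up for illustration, and the two arrays are short excerpts copied from the listings in this thread rather than the full lists.

```csharp
using System;
using System.Collections.Generic;

public static class CensorDiff
{
    // Returns (published, decompiled) pairs for every index where the two lists differ.
    // Assumes both lists hold the same entries in the same order.
    public static List<KeyValuePair<string, string>> FindCensored(
        string[] published, string[] decompiled)
    {
        if (published.Length != decompiled.Length)
            throw new ArgumentException("A positional comparison needs lists of equal length.");

        var differences = new List<KeyValuePair<string, string>>();
        for (int i = 0; i < published.Length; i++)
        {
            if (published[i] != decompiled[i])
                differences.Add(new KeyValuePair<string, string>(published[i], decompiled[i]));
        }
        return differences;
    }

    public static void Main()
    {
        // Excerpts only: the neighbours of each dashed entry, taken from the two listings.
        string[] published = { "carp", "----", "scissors", "ch----is", "hay", "----", "tobacco" };
        string[] decompiled = { "carp", "herpes", "scissors", "chassis", "hay", "hemp", "tobacco" };

        foreach (var pair in FindCensored(published, decompiled))
            Console.WriteLine(pair.Key + " -> " + pair.Value);
    }
}
```

Run against these excerpts, it prints `---- -> herpes`, `ch----is -> chassis` and `---- -> hemp`.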
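Hans Passant also linked in the comments to an answer to a very similar question: What does "----s" mean in the context of StringBuilder.ToString()?. It explains that "the source code as published at the Reference Source web site was passed through a filter that removes objectionable content from the source".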
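To illustrate how such a filter could turn "chassis" into "ch----is", here is a rough sketch of a naive substring-replacement filter. This is not the actual filter, whose rules aren't public as far as I know. The blocklist, the fixed four-dash mask, and the names `NaiveFilter` and `Censor` are all assumptions inferred from the differences between the two listings.

```csharp
using System;

public static class NaiveFilter
{
    // Hypothetical blocklist, inferred from the censored entries; the real list is unknown.
    private static readonly string[] Blocked = { "herpes", "hemp", "ass" };

    // Assumption: every match becomes four dashes, whatever its length.
    private const string Mask = "----";

    // Replaces each blocked substring with the mask (ordinal, case-sensitive).
    public static string Censor(string text)
    {
        foreach (string word in Blocked)
            text = text.Replace(word, Mask);
        return text;
    }

    public static void Main()
    {
        foreach (string word in new[] { "chassis", "herpes", "hemp", "salmon", "sea-bass" })
            Console.WriteLine(word + " -> " + Censor(word));
    }
}
```

Note that this naive version would also mangle "sea-bass" (into "sea-b----") and "molasses", both of which appear intact in the published list, so the real filter presumably uses different or more specific matching rules.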

It's "ass", not "chassis". That might make some people blush. - Hans Passant
3
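You're right that "ass" is what was removed. I was referring to what the full word was. - Chris
@PeterM: It looks that way, but I'm not sure exactly what they did or why it ended up such a mess, since I have no inside knowledge of their process. Most likely a flawed automated process is to blame. At least it reassures me that Skynet is unlikely to be a problem for a while. ;-) - Chris
@PeterM Do you mean "clbuttic" filtering? - Dr Rob Lang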
1
@RobLang Ironically (mentioning clbuttic on SO, I mean): http://blog.codinghorror.com/obscenity-filters-bad-idea-or-incredibly-intercoursing-bad-idea/ - Peter M
2
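I remember an online browser game whose chat room ran a filter that simply stripped out obscenities. It made talking about assassins (or "ins") quite difficult, which was a nuisance because one of the units in the game was called an assassin... - Chris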
