How to detect non-ASCII characters in C# source code

7

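I have a rather silly but tricky problem. I am working with some very bad legacy C# code in which class and method names contain a lot of non-ASCII characters, which causes many problems when objects are serialized to strings and sent over http/tcp/... I am looking for a way to scan the C# code and detect all non-ASCII characters in the names of classes/methods/properties/enums etc., but not in string literals, comments, or anything else that is not C# code: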

public enum Languages
{
    /// <summary>
    /// English
    /// </summary>
    English = 1,

    /// <summary>
    /// Česká         <- acceptable
    /// </summary>
    Česká = 2,     // <- not acceptable
}

Logger.Info("Selected language: Česká"); // <- acceptable

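Any way of achieving this, programmatically or with any tool, would be helpful.

Edit: since many people in the comments suggested that I should fix the serialization rather than get rid of the non-ASCII characters, I want to clarify why that is not an option. Some class/method names mix ASCII and non-ASCII characters. For example, MyСlass looks like a perfectly valid English name, but it actually has the Russian (Cyrillic) letter "С" in the middle, which is clearly a mistake by a developer who forgot to switch their input language. I want to eliminate such errors from the code.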
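To make the mixed-script problem concrete, here is a minimal sketch of the check being asked for, applied to a single name. The class and method names (IdentifierInspector, HasNonAscii, DescribeNonAscii) are made up for illustration, and the sketch only inspects a string it is handed; finding the names in source code is a separate problem.

```csharp
using System;
using System.Linq;

public static class IdentifierInspector
{
    // True if any UTF-16 code unit lies outside the 7-bit ASCII range.
    public static bool HasNonAscii(string name) => name.Any(c => c > 0x7f);

    // Lists each offending character with its UTF-16 code unit in hex.
    public static string DescribeNonAscii(string name) =>
        string.Join(", ", name.Where(c => c > 0x7f)
                              .Select(c => "'" + c + "' U+" + ((int)c).ToString("X4")));
}
```

With the Cyrillic letter written as an escape, IdentifierInspector.DescribeNonAscii("My\u0421lass") returns 'С' U+0421, while HasNonAscii("MyClass") with a Latin C is false.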

4
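You could use Roslyn to analyze the code and check symbol names for non-ASCII characters. Or you could do a quick and dirty regex search in VS that matches non-ASCII characters outside double quotes. That approach won't handle escaped quotes inside strings, but it will give you some results right away. - Panagiotis Kanavos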
4
With simple reflection you can do anything, starting with Assembly.GetTypes(). But oh my, what a laborious solution for a simple encoding bug. Just fix the bug. - Hans Passant
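A rough sketch of the reflection route mentioned above, assuming the legacy code is already compiled into an assembly you can load. ReflectionNameScan and FindNonAsciiNames are hypothetical names. Reflection only sees types and their members, so locals, parameters, labels and namespaces are not covered by this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

public static class ReflectionNameScan
{
    // Everything declared directly on a type, regardless of visibility.
    private const BindingFlags AllDeclared =
        BindingFlags.Public | BindingFlags.NonPublic |
        BindingFlags.Instance | BindingFlags.Static |
        BindingFlags.DeclaredOnly;

    private static bool HasNonAscii(string name) => name.Any(c => c > 0x7f);

    // Returns "Namespace.Type" or "Namespace.Type.Member" for every offending name.
    public static List<string> FindNonAsciiNames(Assembly assembly)
    {
        var hits = new List<string>();
        foreach (Type type in assembly.GetTypes())
        {
            if (HasNonAscii(type.Name))
            {
                hits.Add(type.FullName);
            }

            foreach (MemberInfo member in type.GetMembers(AllDeclared))
            {
                // Nested types are also returned by GetTypes(), so skip them here.
                if (member.MemberType != MemberTypes.NestedType && HasNonAscii(member.Name))
                {
                    hits.Add(type.FullName + "." + member.Name);
                }
            }
        }
        return hits;
    }
}
```

It could be called as ReflectionNameScan.FindNonAsciiNames(typeof(SomeLegacyType).Assembly), where SomeLegacyType stands for any type from the assembly under inspection.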
Visual Studio 2012+ uses the same regular expression syntax as .NET. That means you can specify Unicode ranges and classes, e.g. \p{IsBasicLatin}+ - Panagiotis Kanavos
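As a rough illustration of the quick-and-dirty regex idea, the sketch below blanks out double-quoted literals on a line, drops everything after //, and then reports runs of characters outside the \p{IsBasicLatin} block. QuickRegexScan and FindNonAscii are made-up names, and the same limitations apply as noted in the comment above: escaped quotes, verbatim strings and /* */ comments are not handled.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class QuickRegexScan
{
    // Crude match for a "..." literal; does not understand \" escapes.
    private static readonly Regex StringLiteral = new Regex("\"[^\"]*\"");

    // \P{IsBasicLatin} is the negation of the Basic Latin block (U+0000-U+007F).
    private static readonly Regex NonAsciiRun = new Regex(@"\P{IsBasicLatin}+");

    // Returns each run of non-ASCII characters found in the code part of one line.
    public static List<string> FindNonAscii(string line)
    {
        string code = StringLiteral.Replace(line, "\"\"");

        int comment = code.IndexOf("//", StringComparison.Ordinal);
        if (comment >= 0)
        {
            code = code.Substring(0, comment);
        }

        var hits = new List<string>();
        foreach (Match m in NonAsciiRun.Matches(code))
        {
            hits.Add(m.Value);
        }
        return hits;
    }
}
```

Run over the enum from the question, the `Česká = 2,` line yields two hits (Č and á), while the documentation comment and the Logger.Info line yield none.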
2
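@AndreBorges I agree with Hans Passant that serialization is not a good reason to change the names. .NET's built-in mechanisms don't care about ASCII or Unicode, because every string is Unicode. You have to go out of your way to create a code page error in .NET. I can understand changing Česká to Czech to make the code more readable, but for serialization it is completely unnecessary. - Panagiotis Kanavos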
1
@HansPassant, PanagiotisKanavos, please see my edit. - Andre Borges
1 Answer

2

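It was suggested that you could write an analyzer that checks the names of classes/methods etc.

I was interested in writing an analyzer, so I wrote one. It requires VS 2015/2017.

You need the .NET Compiler Platform SDK. So...

  • New Project, Templates, Visual C#, Extensibility, select .NET 4.6.2 (check the combobox at the top of the window), Analyzer with Code Fix

If you can't find it, you can download it from https://marketplace.visualstudio.com/items?itemName=VisualStudioProductTeam.NETCompilerPlatformSDK.

Then replace the contents of DiagnosticAnalyzer.cs with: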

using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;
using Microsoft.CodeAnalysis.Diagnostics;
using System.Collections.Immutable;
using System.Diagnostics;
using System.Linq;

namespace NonAsciiAnalyzer
{
    [DiagnosticAnalyzer(LanguageNames.CSharp)]
    public class NonAsciiAnalyzerAnalyzer : DiagnosticAnalyzer
    {
        public const string DiagnosticId = "NonAsciiAnalyzer";

        // You can change these strings in the Resources.resx file. If you do not want your analyzer to be localize-able, you can use regular strings for Title and MessageFormat.
        // See https://github.com/dotnet/roslyn/blob/master/docs/analyzers/Localizing%20Analyzers.md for more on localization
        private static readonly LocalizableString Title = new LocalizableResourceString(nameof(Resources.AnalyzerTitle), Resources.ResourceManager, typeof(Resources));
        private static readonly LocalizableString MessageFormat = new LocalizableResourceString(nameof(Resources.AnalyzerMessageFormat), Resources.ResourceManager, typeof(Resources));
        private static readonly LocalizableString Description = new LocalizableResourceString(nameof(Resources.AnalyzerDescription), Resources.ResourceManager, typeof(Resources));
        private const string Category = "Naming";

        private static readonly DiagnosticDescriptor Rule = new DiagnosticDescriptor(DiagnosticId, Title, MessageFormat, Category, DiagnosticSeverity.Warning, isEnabledByDefault: true, description: Description);

        public override ImmutableArray<DiagnosticDescriptor> SupportedDiagnostics { get { return ImmutableArray.Create(Rule); } }

        public override void Initialize(AnalysisContext context)
        {
            context.RegisterSyntaxNodeAction(AnalyzeNamespaceDeclaration, SyntaxKind.NamespaceDeclaration);
            context.RegisterSyntaxNodeAction(AnalyzeInterfaceDeclaration, SyntaxKind.InterfaceDeclaration);
            context.RegisterSyntaxNodeAction(AnalyzeClassDeclaration, SyntaxKind.ClassDeclaration);
            context.RegisterSyntaxNodeAction(AnalyzeStructDeclaration, SyntaxKind.StructDeclaration);
            context.RegisterSyntaxNodeAction(AnalyzeDelegateDeclaration, SyntaxKind.DelegateDeclaration);
            context.RegisterSyntaxNodeAction(AnalyzeEnumDeclaration, SyntaxKind.EnumDeclaration);
            context.RegisterSyntaxNodeAction(AnalyzeEnumMemberDeclaration, SyntaxKind.EnumMemberDeclaration);
            context.RegisterSyntaxNodeAction(AnalyzeFieldDeclaration, SyntaxKind.FieldDeclaration);
            context.RegisterSyntaxNodeAction(AnalyzeEventDeclaration, SyntaxKind.EventDeclaration);
            context.RegisterSyntaxNodeAction(AnalyzePropertyDeclaration, SyntaxKind.PropertyDeclaration);
            context.RegisterSyntaxNodeAction(AnalyzeMethodDeclaration, SyntaxKind.MethodDeclaration);

            context.RegisterSyntaxNodeAction(AnalyzeUsingDirective, SyntaxKind.UsingDirective);
            context.RegisterSyntaxNodeAction(AnalyzeExternAliasDirective, SyntaxKind.ExternAliasDirective);

            context.RegisterSyntaxNodeAction(AnalyzeTypeParameter, SyntaxKind.TypeParameter);
            context.RegisterSyntaxNodeAction(AnalyzeParameter, SyntaxKind.Parameter);
            context.RegisterSyntaxNodeAction(AnalyzeVariableDeclaration, SyntaxKind.VariableDeclaration);
            context.RegisterSyntaxNodeAction(AnalyzeLabelStatement, SyntaxKind.LabeledStatement);
            context.RegisterSyntaxNodeAction(AnalyzeCatchDeclaration, SyntaxKind.CatchDeclaration);
            context.RegisterSyntaxNodeAction(AnalyzeForEachStatement, SyntaxKind.ForEachStatement);
            context.RegisterSyntaxNodeAction(AnalyzeAnonymousObjectMemberDeclarator, SyntaxKind.AnonymousObjectMemberDeclarator);

            context.RegisterSyntaxNodeAction(AnalyzeFromClause, SyntaxKind.FromClause);
            context.RegisterSyntaxNodeAction(AnalyzeLetClause, SyntaxKind.LetClause);
            context.RegisterSyntaxNodeAction(AnalyzeJoinClause, SyntaxKind.JoinClause);
            context.RegisterSyntaxNodeAction(AnalyzeJoinIntoClause, SyntaxKind.JoinIntoClause);
            context.RegisterSyntaxNodeAction(AnalyzeQueryContinuation, SyntaxKind.QueryContinuation);
        }

        private void AnalyzeNamespaceDeclaration(SyntaxNodeAnalysisContext context)
        {
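            var nds = (NamespaceDeclarationSyntax)context.Node;
            // Name is a QualifiedNameSyntax for dotted namespaces such as A.B,
            // so casting it to SimpleNameSyntax would throw; use the full text instead.
            string name = nds.Name.ToString();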
            Debug.WriteLine("Namespace: " + name);
            Check(context, name, Rule);
        }

        private void AnalyzeInterfaceDeclaration(SyntaxNodeAnalysisContext context)
        {
            var ids = (InterfaceDeclarationSyntax)context.Node;
            string name = ids.Identifier.Text;
            Debug.WriteLine("Interface: " + name);
            Check(context, name, Rule);
        }

        private void AnalyzeClassDeclaration(SyntaxNodeAnalysisContext context)
        {
            var cds = (ClassDeclarationSyntax)context.Node;
            string name = cds.Identifier.Text;
            Debug.WriteLine("Class: " + name);
            Check(context, name, Rule);
        }

        private void AnalyzeStructDeclaration(SyntaxNodeAnalysisContext context)
        {
            var sds = (StructDeclarationSyntax)context.Node;
            string name = sds.Identifier.Text;
            Debug.WriteLine("Struct: " + name);
            Check(context, name, Rule);
        }

        private void AnalyzeDelegateDeclaration(SyntaxNodeAnalysisContext context)
        {
            var dds = (DelegateDeclarationSyntax)context.Node;
            string name = dds.Identifier.Text;
            Debug.WriteLine("Delegate: " + name);
            Check(context, name, Rule);
        }

        private void AnalyzeEnumDeclaration(SyntaxNodeAnalysisContext context)
        {
            var eds = (EnumDeclarationSyntax)context.Node;
            string name = eds.Identifier.Text;
            Debug.WriteLine("Enum: " + name);
            Check(context, name, Rule);
        }

        private void AnalyzeEnumMemberDeclaration(SyntaxNodeAnalysisContext context)
        {
            var emds = (EnumMemberDeclarationSyntax)context.Node;
            string name = emds.Identifier.Text;
            Debug.WriteLine("Enum Member: " + name);
            Check(context, name, Rule);
        }

        // Fields are already covered by AnalyzeVariableDeclaration
        private void AnalyzeFieldDeclaration(SyntaxNodeAnalysisContext context)
        {
            //var fds = (FieldDeclarationSyntax)context.Node;

            //foreach (var fds2 in fds.Declaration.Variables)
            //{
            //    string name = fds2.Identifier.Text;
            //    Debug.WriteLine("Field: " + name);
            //    Check(context, name, Rule);
            //}
        }

        private void AnalyzeEventDeclaration(SyntaxNodeAnalysisContext context)
        {
            var eds = (EventDeclarationSyntax)context.Node;
            string name = eds.Identifier.Text;
            Debug.WriteLine("Event: " + name);
            Check(context, name, Rule);
        }

        private void AnalyzePropertyDeclaration(SyntaxNodeAnalysisContext context)
        {
            var pds = (PropertyDeclarationSyntax)context.Node;
            string name = pds.Identifier.Text;
            Debug.WriteLine("Property: " + name);
            Check(context, name, Rule);
        }

        private void AnalyzeMethodDeclaration(SyntaxNodeAnalysisContext context)
        {
            var mds = (MethodDeclarationSyntax)context.Node;
            string name = mds.Identifier.Text;
            Debug.WriteLine("Method: " + name);
            Check(context, name, Rule);
        }

        private void AnalyzeUsingDirective(SyntaxNodeAnalysisContext context)
        {
            var uds = (UsingDirectiveSyntax)context.Node;

            if (uds.Alias == null)
            {
                return;
            }

            string name = uds.Alias.Name.Identifier.Text;
            Debug.WriteLine("Using Alias: " + name);
            Check(context, name, Rule);
        }

        private void AnalyzeExternAliasDirective(SyntaxNodeAnalysisContext context)
        {
            var eads = (ExternAliasDirectiveSyntax)context.Node;
            string name = eads.Identifier.Text;
            Debug.WriteLine("Extern Alias: " + name);
            Check(context, name, Rule);
        }

        private void AnalyzeTypeParameter(SyntaxNodeAnalysisContext context)
        {
            var tps = (TypeParameterSyntax)context.Node;
            string name = tps.Identifier.Text;
            Debug.WriteLine("Type Parameter: " + name);
            Check(context, name, Rule);
        }

        private void AnalyzeParameter(SyntaxNodeAnalysisContext context)
        {
            var ps = (ParameterSyntax)context.Node;
            string name = ps.Identifier.Text;
            Debug.WriteLine("Parameter: " + name);
            Check(context, name, Rule);
        }

        // Fields/const fields/local variables (local functions are LocalFunctionStatement nodes and are not covered here)
        private void AnalyzeVariableDeclaration(SyntaxNodeAnalysisContext context)
        {
            var vds = (VariableDeclarationSyntax)context.Node;

            foreach (var vds2 in vds.Variables)
            {
                string name = vds2.Identifier.Text;
                Debug.WriteLine("Local: " + name);
                Check(context, name, Rule);
            }
        }

        private void AnalyzeLabelStatement(SyntaxNodeAnalysisContext context)
        {
            var lss = (LabeledStatementSyntax)context.Node;
            string name = lss.Identifier.Text;
            Debug.WriteLine("Label: " + name);
            Check(context, name, Rule);
        }

        private void AnalyzeCatchDeclaration(SyntaxNodeAnalysisContext context)
        {
            var cds = (CatchDeclarationSyntax)context.Node;
            string name = cds.Identifier.Text;
            Debug.WriteLine("Catch Variable: " + name);
            Check(context, name, Rule);
        }

        private void AnalyzeForEachStatement(SyntaxNodeAnalysisContext context)
        {
            var fess = (ForEachStatementSyntax)context.Node;
            string name = fess.Identifier.Text;
            Debug.WriteLine("ForEach Variable: " + name);
            Check(context, name, Rule);
        }

        private void AnalyzeAnonymousObjectMemberDeclarator(SyntaxNodeAnalysisContext context)
        {
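            var aqns = (AnonymousObjectMemberDeclaratorSyntax)context.Node;

            // NameEquals is null for projection initializers such as new { x.Foo }.
            if (aqns.NameEquals == null)
            {
                return;
            }

            string name = aqns.NameEquals.Name.Identifier.Text;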
            Debug.WriteLine("Anonymous Field: " + name);
            Check(context, name, Rule);
        }

        private void AnalyzeFromClause(SyntaxNodeAnalysisContext context)
        {
            var fcs = (FromClauseSyntax)context.Node;
            string name = fcs.Identifier.Text;
            Debug.WriteLine("From: " + name);
            Check(context, name, Rule);
        }

        private void AnalyzeLetClause(SyntaxNodeAnalysisContext context)
        {
            var lcs = (LetClauseSyntax)context.Node;
            string name = lcs.Identifier.Text;
            Debug.WriteLine("Let: " + name);
            Check(context, name, Rule);
        }

        private void AnalyzeJoinClause(SyntaxNodeAnalysisContext context)
        {
            var jcs = (JoinClauseSyntax)context.Node;
            string name = jcs.Identifier.Text;
            Debug.WriteLine("Join: " + name);
            Check(context, name, Rule);
        }

        private void AnalyzeJoinIntoClause(SyntaxNodeAnalysisContext context)
        {
            var jics = (JoinIntoClauseSyntax)context.Node;
            string name = jics.Identifier.Text;
            Debug.WriteLine("Join Into: " + name);
            Check(context, name, Rule);
        }

        private void AnalyzeQueryContinuation(SyntaxNodeAnalysisContext context)
        {
            var qcs = (QueryContinuationSyntax)context.Node;
            string name = qcs.Identifier.Text;
            Debug.WriteLine("Into: " + name);
            Check(context, name, Rule);
        }

        private void Check(SyntaxNodeAnalysisContext context, string name, DiagnosticDescriptor rule)
        {
            // .NET Core, with full .NET no .ToCharArray()
            if (name.ToCharArray().Any(x => x > 0x7f))
            {
                // For all such symbols, produce a diagnostic.
                var diagnostic = Diagnostic.Create(rule, context.Node.GetLocation(), name);
                context.ReportDiagnostic(diagnostic);
            }
        }
    }
}

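Note that I was interested in how to check the names of local variables, parameters and so on, so in the end I check nearly everything (of course I may have forgotten something... sadly there is no single "give me the names of all the 'things' defined by the user" method)... and RegisterSymbolAction only returns top-level symbols (no local functions, for example).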
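One possible way around the lack of a single "all names" hook, sketched here as an alternative and not as part of the analyzer above, is to look at every identifier token in the syntax tree instead of every declaration node. IdentifierTokenScan and FindNonAsciiIdentifiers are made-up names, and the sketch assumes the Microsoft.CodeAnalysis.CSharp package is referenced. Unlike the analyzer, it reports usages of a name as well as its declaration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

public static class IdentifierTokenScan
{
    // Returns "line N: name" for every identifier token containing a non-ASCII character.
    public static List<string> FindNonAsciiIdentifiers(string source)
    {
        SyntaxTree tree = CSharpSyntaxTree.ParseText(source);
        var hits = new List<string>();

        // DescendantTokens() does not descend into trivia by default, so comments
        // (including /// documentation) are skipped; string literals are a
        // different token kind and are filtered out by the IsKind test.
        foreach (SyntaxToken token in tree.GetRoot().DescendantTokens())
        {
            if (token.IsKind(SyntaxKind.IdentifierToken) && token.ValueText.Any(c => c > 0x7f))
            {
                // Line numbers from Roslyn are zero-based.
                int line = token.GetLocation().GetLineSpan().StartLinePosition.Line + 1;
                hits.Add("line " + line + ": " + token.ValueText);
            }
        }
        return hits;
    }
}
```

Fed the text of a .cs file (for example via File.ReadAllText), this flags the Česká enum member from the question but not the documentation comment or the string passed to Logger.Info.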
