How to count occurrences of a substring in a string (rather than occurrences of a character)

95
Suppose I have a string like this:
MyString = "OU=Level3,OU=Level2,OU=Level1,DC=domain,DC=com";

I want to know how many times the substring "OU=" occurs in this string. With a single character, something like the following might work:
int count = MyString.Split("OU=").Length - 1;

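But Split only works with char, not with string.
Also, how do I find the position of the nth occurrence? For example, the position of the 2nd "OU=" in the string?
How can I solve this?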

5
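String.Split has several overloads that let you split on a string. See http://msdn.microsoft.com/en-us/library/tabh47cf.aspx. - shf301
Split doesn't only work with chars... - MUG4N
You can do this with multiple delimiters. I'm happy to provide a simple coding example for future reference if needed. Keep in mind that Split() can also split on strings and returns an array: to split on strings you pass a new string[] { "somestring", "someotherString"..etc..} and get a string[] back. - MethodMan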
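To make the comments above concrete, here is a minimal sketch of counting with the string[] overload of Split. The names SplitCountSketch and CountBySplit are made up for illustration and are not from the thread; the sketch assumes non-overlapping matches and ordinal comparison, which is how Split behaves.

```csharp
using System;

public static class SplitCountSketch
{
    // Hypothetical helper: counts non-overlapping occurrences of `value`
    // in `text` by splitting on it.
    public static int CountBySplit(string text, string value)
    {
        if (string.IsNullOrEmpty(text) || string.IsNullOrEmpty(value))
        {
            return 0;
        }

        // StringSplitOptions.None keeps empty entries, so N occurrences
        // always yield N + 1 pieces.
        string[] pieces = text.Split(new[] { value }, StringSplitOptions.None);
        return pieces.Length - 1;
    }
}
```

Calling SplitCountSketch.CountBySplit(MyString, "OU=") on the string from the question should give 3. Note that this allocates an array plus one string per piece just to get a number.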
8 Answers

217
Regex.Matches(input, "OU=").Count

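Thanks. Also, how do I find the position of the nth occurrence? For example, where is the second "OU=" in the string? - KentZhou
I looked into it a bit and found this question on SO, which should help you: https://dev59.com/sUfRa4cB1Zd3GeqP7kbm - tnw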
3
In general, what if the substring to match contains regex metacharacters? Escape them first with Regex.Escape(subString); Regex.Matches(input, escapedSubString).Count then works for any substring. - iwtu
12
In general it is best to use Regex.Escape(...), so the generic version is: Regex.Matches(input, Regex.Escape("a string that is not a regex")).Count - Julian
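A small sketch of the escaped version suggested in the two comments above. RegexCountSketch and CountLiteral are hypothetical names; only Regex.Matches and Regex.Escape come from the thread.

```csharp
using System.Text.RegularExpressions;

public static class RegexCountSketch
{
    // Hypothetical helper: treats `literal` as plain text, not as a pattern.
    public static int CountLiteral(string input, string literal)
    {
        if (string.IsNullOrEmpty(input) || string.IsNullOrEmpty(literal))
        {
            return 0;
        }

        // Regex.Escape turns metacharacters such as '.' into literals.
        return Regex.Matches(input, Regex.Escape(literal)).Count;
    }
}
```

For the input "a.s b.s cas" and the substring ".s", the unescaped pattern also matches "as" and reports 3, while the escaped version reports 2.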
2
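Take the string aaaaaa and the substring aa. This answer produces a count of 3, whereas the actual count of aa is 5 if overlapping occurrences are included. - ProfK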
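If overlapping occurrences should be counted, as in the aaaaaa example above, one option is to advance the search by a single character after each hit instead of by the length of the match. This is only a sketch; OverlapCountSketch and CountOverlapping are made-up names, and ordinal comparison is an assumption.

```csharp
using System;

public static class OverlapCountSketch
{
    // Hypothetical helper: counts occurrences of `value` in `text`,
    // including ones that overlap each other.
    public static int CountOverlapping(string text, string value)
    {
        if (string.IsNullOrEmpty(text) || string.IsNullOrEmpty(value))
        {
            return 0;
        }

        int count = 0;
        int pos = text.IndexOf(value, StringComparison.Ordinal);
        while (pos != -1)
        {
            count++;
            // Step forward by one character only, so the next match
            // may start inside the current one.
            pos = text.IndexOf(value, pos + 1, StringComparison.Ordinal);
        }
        return count;
    }
}
```

With this, "aaaaaa" and "aa" give 5, while the string from the question still gives 3 for "OU=" because those matches never overlap.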

21

You can use the IndexOf method to find all matches and their positions:

string MyString = "OU=Level3,OU=Level2,OU=Level1,DC=domain,DC=com";
string stringToFind = "OU=";

List<int> positions = new List<int>();
int pos = 0;
while ((pos < MyString.Length) && (pos = MyString.IndexOf(stringToFind, pos)) != -1)
{
    positions.Add(pos);
    pos += stringToFind.Length; // Length is a property, not a method
}

Console.WriteLine("{0} occurrences", positions.Count);
foreach (var p in positions)
{
    Console.WriteLine(p);
}
You can get the same result with a regular expression:
var matches = Regex.Matches(MyString, "OU=");
Console.WriteLine("{0} occurrences", matches.Count);
foreach (Match m in matches) // explicit type: on .NET Framework, var would infer object here
{
    Console.WriteLine(m.Index);
}

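Main differences:

  • The Regex code is shorter.
  • The Regex code allocates a collection and several strings.
  • The IndexOf code can be written to output each position immediately, without creating a collection.
  • Measured in isolation, the Regex code may be faster, but if it is used many times, the combined overhead of the string allocations may put a higher load on the garbage collector.

If I were writing this inline, as something that isn't used often, I would probably go with the Regex solution. If I were putting it in a library as something used frequently, I would probably go with the IndexOf solution.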
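As a sketch of the "no collection" variant mentioned in the list above, the IndexOf loop can be wrapped in an iterator that yields each position as it is found. This also covers the "position of the nth occurrence" part of the question. OccurrenceSketch, PositionsOf and IndexOfNth are hypothetical names, and ordinal, non-overlapping matching is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OccurrenceSketch
{
    // Lazily yields the start index of each non-overlapping occurrence.
    public static IEnumerable<int> PositionsOf(string text, string value)
    {
        if (string.IsNullOrEmpty(text) || string.IsNullOrEmpty(value))
        {
            yield break;
        }

        int pos = text.IndexOf(value, StringComparison.Ordinal);
        while (pos != -1)
        {
            yield return pos;
            pos = text.IndexOf(value, pos + value.Length, StringComparison.Ordinal);
        }
    }

    // Returns the index of the nth occurrence (1-based),
    // or -1 if there are fewer than n occurrences.
    public static int IndexOfNth(string text, string value, int n)
    {
        if (n < 1)
        {
            return -1;
        }
        return PositionsOf(text, value).Skip(n - 1).DefaultIfEmpty(-1).First();
    }
}
```

For the string in the question, OccurrenceSketch.IndexOfNth(MyString, "OU=", 2) returns 10, and PositionsOf(MyString, "OU=").Count() gives the total of 3 without building a list first.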


1
Small mistake? List<int> positions = new List<int>[]; should be List<int> positions = new List<int>(); - RickL
1
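I believe this is the better solution. If the string you are matching is something like .s, the regex returns the wrong number. The while loop Jim proposed counts .s correctly. - ashlar64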

12

This extension method needs fewer resources than a regular expression.

public static int CountSubstring(this string text, string value)
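{
    // Guard: with an empty value, IndexOf keeps returning the same index and the loop never ends.
    if (string.IsNullOrEmpty(value)) return 0;

    int count = 0, minIndex = text.IndexOf(value, 0);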
    while (minIndex != -1)
    {
        minIndex = text.IndexOf(value, minIndex + value.Length);
        count++;
    }
    return count;
}

Usage:

MyString = "OU=Level3,OU=Level2,OU=Level1,DC=domain,DC=com";
int count = MyString.CountSubstring("OU=");

7

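(Clippy mode: ON)

It looks like you're parsing an LDAP query!

Would you like to:

  • Parse it manually? Go to "SplittingAndParsing"
  • Parse it automagically via Win32 calls? Go to "Using Win32 via PInvoke"

(Clippy mode: OFF)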

"SplittingAndParsing":

var MyString = "OU=Level3,OU=Level2,OU=Level1,DC=domain,DC=com";
var chunksAsKvps = MyString
    .Split(',')
    .Select(chunk => 
        { 
            var bits = chunk.Split('='); 
            return new KeyValuePair<string,string>(bits[0], bits[1]);
        });

var allOUs = chunksAsKvps
    .Where(kvp => kvp.Key.Equals("OU", StringComparison.OrdinalIgnoreCase));

"Using Win32 via PInvoke":

Usage:

var parsedDn = Win32LDAP.ParseDN(str);    
var allOUs2 = parsedDn
    .Where(dn => dn.Key.Equals("OU", StringComparison.OrdinalIgnoreCase));

Utility code:

// I don't remember where I got this from, honestly...I *think* it came
// from another SO user long ago, but those details I've lost to history...
public class Win32LDAP
{
   #region Constants
   public const int ERROR_SUCCESS = 0;
   public const int ERROR_BUFFER_OVERFLOW = 111;
   #endregion Constants

   #region DN Parsing
   [DllImport("ntdsapi.dll", CharSet = CharSet.Unicode)]
   protected static extern int DsGetRdnW(
       ref IntPtr ppDN, 
       ref int pcDN, 
       out IntPtr ppKey, 
       out int pcKey, 
       out IntPtr ppVal, 
       out int pcVal
   );

   public static KeyValuePair<string, string> GetName(string distinguishedName)
   {
       IntPtr pDistinguishedName = Marshal.StringToHGlobalUni(distinguishedName);
       try
       {
           IntPtr pDN = pDistinguishedName, pKey, pVal;
           int cDN = distinguishedName.Length, cKey, cVal;

           int lastError = DsGetRdnW(ref pDN, ref cDN, out pKey, out cKey, out pVal, out cVal);

           if(lastError == ERROR_SUCCESS)
           {
               string key, value;

               if(cKey < 1)
               {
                   key = string.Empty;
               }
               else
               {
                   key = Marshal.PtrToStringUni(pKey, cKey);
               }

               if(cVal < 1)
               {
                   value = string.Empty;
               }
               else
               {
                   value = Marshal.PtrToStringUni(pVal, cVal);
               }

               return new KeyValuePair<string, string>(key, value);
           }
           else
           {
               throw new Win32Exception(lastError);
           }
       }
       finally
       {
           Marshal.FreeHGlobal(pDistinguishedName);
       }
   }

   public static IEnumerable<KeyValuePair<string, string>> ParseDN(string distinguishedName)
   {
       List<KeyValuePair<string, string>> components = new List<KeyValuePair<string, string>>();
       IntPtr pDistinguishedName = Marshal.StringToHGlobalUni(distinguishedName);
       try
       {
           IntPtr pDN = pDistinguishedName, pKey, pVal;
           int cDN = distinguishedName.Length, cKey, cVal;

           do
           {
               int lastError = DsGetRdnW(ref pDN, ref cDN, out pKey, out cKey, out pVal, out cVal);

               if(lastError == ERROR_SUCCESS)
               {
                   string key, value;

                   if(cKey < 0)
                   {
                       key = null;
                   }
                   else if(cKey == 0)
                   {
                       key = string.Empty;
                   }
                   else
                   {
                       key = Marshal.PtrToStringUni(pKey, cKey);
                   }

                   if(cVal < 0)
                   {
                       value = null;
                   }
                   else if(cVal == 0)
                   {
                       value = string.Empty;
                   }
                   else
                   {
                       value = Marshal.PtrToStringUni(pVal, cVal);
                   }

                   components.Add(new KeyValuePair<string, string>(key, value));

                   pDN = (IntPtr)(pDN.ToInt64() + UnicodeEncoding.CharSize); //skip over comma
                   cDN--;
               }
               else
               {
                   throw new Win32Exception(lastError);
               }
           } while(cDN > 0);

           return components;
       }
       finally
       {
           Marshal.FreeHGlobal(pDistinguishedName);
       }
   }

   [DllImport("ntdsapi.dll", CharSet = CharSet.Unicode)]
   protected static extern int DsQuoteRdnValueW(
       int cUnquotedRdnValueLength,
       string psUnquotedRdnValue,
       ref int pcQuotedRdnValueLength,
       IntPtr psQuotedRdnValue
   );

   public static string QuoteRDN(string rdn)
   {
       if (rdn == null) return null;

       int initialLength = rdn.Length;
       int quotedLength = 0;
       IntPtr pQuotedRDN = IntPtr.Zero;

       int lastError = DsQuoteRdnValueW(initialLength, rdn, ref quotedLength, pQuotedRDN);

       switch (lastError)
       {
           case ERROR_SUCCESS:
               {
                   return string.Empty;
               }
           case ERROR_BUFFER_OVERFLOW:
               {
                   break; //continue
               }
           default:
               {
                   throw new Win32Exception(lastError);
               }
       }

       pQuotedRDN = Marshal.AllocHGlobal(quotedLength * UnicodeEncoding.CharSize);

       try
       {
           lastError = DsQuoteRdnValueW(initialLength, rdn, ref quotedLength, pQuotedRDN);

           switch(lastError)
           {
               case ERROR_SUCCESS:
                   {
                       return Marshal.PtrToStringUni(pQuotedRDN, quotedLength);
                   }
               default:
                   {
                       throw new Win32Exception(lastError);
                   }
           }
       }
       finally
       {
           if(pQuotedRDN != IntPtr.Zero)
           {
               Marshal.FreeHGlobal(pQuotedRDN);
           }
       }
   }


   [DllImport("ntdsapi.dll", CharSet = CharSet.Unicode)]
   protected static extern int DsUnquoteRdnValueW(
       int cQuotedRdnValueLength,
       string psQuotedRdnValue,
       ref int pcUnquotedRdnValueLength,
       IntPtr psUnquotedRdnValue
   );

   public static string UnquoteRDN(string rdn)
   {
       if (rdn == null) return null;

       int initialLength = rdn.Length;
       int unquotedLength = 0;
       IntPtr pUnquotedRDN = IntPtr.Zero;

       int lastError = DsUnquoteRdnValueW(initialLength, rdn, ref unquotedLength, pUnquotedRDN);

       switch (lastError)
       {
           case ERROR_SUCCESS:
               {
                   return string.Empty;
               }
           case ERROR_BUFFER_OVERFLOW:
               {
                   break; //continue
               }
           default:
               {
                   throw new Win32Exception(lastError);
               }
       }

       pUnquotedRDN = Marshal.AllocHGlobal(unquotedLength * UnicodeEncoding.CharSize);

       try
       {
           lastError = DsUnquoteRdnValueW(initialLength, rdn, ref unquotedLength, pUnquotedRDN);

           switch(lastError)
           {
               case ERROR_SUCCESS:
                   {
                       return Marshal.PtrToStringUni(pUnquotedRDN, unquotedLength);
                   }
               default:
                   {
                       throw new Win32Exception(lastError);
                   }
           }
       }
       finally
       {
           if(pUnquotedRDN != IntPtr.Zero)
           {
               Marshal.FreeHGlobal(pUnquotedRDN);
           }
       }
   }
   #endregion DN Parsing
}

public class DNComponent
{
   public string Type { get; protected set; }
   public string EscapedValue { get; protected set; }
   public string UnescapedValue { get; protected set; }
   public string WholeComponent { get; protected set; }

   public DNComponent(string component, bool isEscaped)
   {
       string[] tokens = component.Split(new char[] { '=' }, 2);
       setup(tokens[0], tokens[1], isEscaped);
   }

   public DNComponent(string key, string value, bool isEscaped)
   {
       setup(key, value, isEscaped);
   }

   private void setup(string key, string value, bool isEscaped)
   {
       Type = key;

       if(isEscaped)
       {
           EscapedValue = value;
           UnescapedValue = Win32LDAP.UnquoteRDN(value);
       }
       else
       {
           EscapedValue = Win32LDAP.QuoteRDN(value);
           UnescapedValue = value;
       }

       WholeComponent = Type + "=" + EscapedValue;
   }

   public override bool Equals(object obj)
   {
       if (obj is DNComponent)
       {
           DNComponent dnObj = (DNComponent)obj;
           return dnObj.WholeComponent.Equals(this.WholeComponent, StringComparison.CurrentCultureIgnoreCase);
       }
       return base.Equals(obj);
   }

   public override int GetHashCode()
   {
       return WholeComponent.GetHashCode();
   }
}

public class DistinguishedName
{
   public DNComponent[] Components
   {
       get
       {
           return components.ToArray();
       }
   }

   private List<DNComponent> components;
   private string cachedDN;

   public DistinguishedName(string distinguishedName)
   {
       cachedDN = distinguishedName;
       components = new List<DNComponent>();
       foreach (KeyValuePair<string, string> kvp in Win32LDAP.ParseDN(distinguishedName))
       {
           components.Add(new DNComponent(kvp.Key, kvp.Value, true));
       }
   }

   public DistinguishedName(IEnumerable<DNComponent> dnComponents)
   {
       components = new List<DNComponent>(dnComponents);
       cachedDN = GetWholePath(",");
   }

   public bool Contains(DNComponent dnComponent)
   {
       return components.Contains(dnComponent);
   }

   public string GetDNSDomainName()
   {
       List<string> dcs = new List<string>();
       foreach (DNComponent dnc in components)
       {
           if(dnc.Type.Equals("DC", StringComparison.CurrentCultureIgnoreCase))
           {
               dcs.Add(dnc.UnescapedValue);
           }
       }
       return string.Join(".", dcs.ToArray());
   }

   public string GetDomainDN()
   {
       List<string> dcs = new List<string>();
       foreach (DNComponent dnc in components)
       {
           if(dnc.Type.Equals("DC", StringComparison.CurrentCultureIgnoreCase))
           {
               dcs.Add(dnc.WholeComponent);
           }
       }
       return string.Join(",", dcs.ToArray());
   }

   public string GetWholePath()
   {
       return GetWholePath(",");
   }

   public string GetWholePath(string separator)
   {
       List<string> parts = new List<string>();
       foreach (DNComponent component in components)
       {
           parts.Add(component.WholeComponent);
       }
       return string.Join(separator, parts.ToArray());
   }

   public DistinguishedName GetParent()
   {
       if(components.Count == 1)
       {
           return null;
       }
       List<DNComponent> tempList = new List<DNComponent>(components);
       tempList.RemoveAt(0);
       return new DistinguishedName(tempList);
   }

   public override bool Equals(object obj)
   {
       if(obj is DistinguishedName)
       {
           DistinguishedName objDN = (DistinguishedName)obj;
           if (this.Components.Length == objDN.Components.Length)
           {
               for (int i = 0; i < this.Components.Length; i++)
               {
                   if (!this.Components[i].Equals(objDN.Components[i]))
                   {
                       return false;
                   }
               }
               return true;
           }
           return false;
       }
       return base.Equals(obj);
   }

   public override int GetHashCode()
   {
       return cachedDN.GetHashCode();
   }
}

3
This should work fine:
  MyString = "OU=Level3,OU=Level2,OU=Level1,DC=domain,DC=com";
  int count = Regex.Matches(MyString, "OU=").Count;

2
int count = myString.Split(new []{','})
                    .Count(item => item.StartsWith(
                        "OU=", StringComparison.OrdinalIgnoreCase));

2
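Not to toot my own horn, but this depends on the comma delimiter. Sure, it works in this very specific case, but my regex solution is simpler and more dynamic. - tnw
Yes, I agree, I was just offering an alternative. - Phil
Of course, I understand. +1 - tnw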

1
public static int CountOccurences(string needle, string haystack)
{
    return (haystack.Length - haystack.Replace(needle, "").Length) / needle.Length;
}

I benchmarked this against the other answers (the regex one and the "IndexOf" one), and it was faster.


1
This is just code copied from https://dev59.com/ZXRB5IYBdhLWcg3wtZMV, with a reference to it. - fubo

1
Here are two examples that can help you get the result you need.
var MyString = "OU=Level3,OU=Level2,OU=Level1,DC=domain,DC=com";

Here you will see a list of the split values, but the DC entries will be in there too; this is just to show that splitting on strings works.
var split = MyString.Split(new string[] { "OU=", "," }, StringSplitOptions.RemoveEmptyEntries);

This one splits and returns a list of three items, so if you don't want to rely on a count you can check visually that it returned the three "OU=" levels.
var lstSplit = MyString.Split(new[] { ',' })
        .Where(splitItem => splitItem.StartsWith(
               "OU=", StringComparison.OrdinalIgnoreCase)).ToList();
