How to count string occurrences in a string?

Question

839

How can I count the number of times a particular string occurs in another string? For example, this is what I am trying to do in JavaScript:

var temp = "This is a string.";
alert(temp.count("is")); //should output '2'

- TruMan1

23

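It depends on whether you accept overlapping instances, e.g. var t = "sss"; How many instances of the substring "ss" are in the string above? 1 or 2? Do you leapfrog over each instance, or move the pointer character by character, looking for the substring? - Tim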
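
The distinction raised in this comment can be made concrete. The sketch below is illustrative only and uses hard-coded patterns: a plain global regex consumes each match, while a zero-width lookahead matches at every position where the substring starts. The variable names are made up for this example.

```javascript
var t = "sss";

// Non-overlapping: each match consumes its characters, so the scan resumes after it.
var nonOverlapping = (t.match(/ss/g) || []).length;

// Overlapping: a zero-width lookahead matches at every index where "ss" starts.
var overlapping = (t.match(/(?=ss)/g) || []).length;

console.log(nonOverlapping); // 1
console.log(overlapping);    // 2
```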

4

An improved benchmark for this question: http://jsperf.com/string-ocurrence-split-vs-match/2 (based on Kazzkiq's benchmark). - idmean

Count the total number of occurrences of a specific word in a string in JavaScript: https://stackoverflow.com/a/65036248/4752258 - Farbod Aprin

This video seems related: "Google Coding Interview with a Facebook Software Engineer" - https://www.youtube.com/watch?v=PIeiiceWe_w - Deryck

41 Answers

- Sunil Garg · Answer 1

var temp = "This is a string.";
console.log((temp.match(new RegExp("is", "g")) || []).length);

- H S W · Answer 2

A simple way is to split the string on the word whose occurrences we want to count, and subtract 1 from the number of parts:

function countOccurences(string, word) {
      return string.split(word).length - 1;
}
const text="Let us see. see above, see below, see forward, see backward, see left, see right until we will be right"; 
const count=countOccurences(text,"see "); // 6
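
One edge case worth noting with the split approach: splitting on an empty string breaks the text into individual characters, so the result is not a meaningful count. A minimal sketch, assuming you would rather get 0 in that case (countBySplit is a hypothetical name, not part of the answer above):

```javascript
// Hypothetical helper: counts non-overlapping occurrences of `word` in `text`.
function countBySplit(text, word) {
  // Assumption: an empty search string counts as zero occurrences.
  if (word === "") return 0;
  return text.split(word).length - 1;
}

console.log(countBySplit("This is a string.", "is")); // 2
console.log(countBySplit("sss", "ss"));               // 1
console.log(countBySplit("abc", ""));                 // 0
```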

- Simm · Answer 3

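I think the purpose of regex is quite different from that of indexOf. indexOf simply finds an occurrence of a specific string, whereas with regex you can use patterns such as [A-Z], which match any uppercase character in the word without specifying the actual character.

Example: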

 var index = "This is a string".indexOf("is");
 console.log(index);
 var length = "This is a string".match(/[a-z]/g).length;
 // [a-z] is a regex character class (matches any lowercase letter); that's why it's slower
 console.log(length);
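
To illustrate the point about patterns: a regular expression can count matches that a single indexOf search string cannot express, such as a case-insensitive match. A small sketch with a made-up sample string:

```javascript
var sample = "Is this it? This is it.";

// indexOf only finds the exact, case-sensitive substring and returns one position.
console.log(sample.indexOf("is")); // 5

// A regex with the "g" and "i" flags counts every match regardless of case.
var count = (sample.match(/is/gi) || []).length;
console.log(count); // 4
```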

- balaji sukumaran · Answer 4

3

We can use the JavaScript split function; the length of its result minus 1 is the number of occurrences.

var temp = "This is a string.";
alert(temp.split('is').length-1);

- balaji sukumaran

1

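Welcome. SO works differently from a forum. It is designed so that good answers get upvoted rather than repeated. The answer you posted already exists, so you should upvote it instead. There are also several answers based on the same concept that account for subtler interpretations (for example, should "ss" in "sss" count as 1 or 2?). You can upvote those too if you like. To get oriented, please read the "How to Answer" and "How to Ask" topics in the Help section, linked at the top of every page. We appreciate and look forward to your future contributions. - SherylHohman

That said, it is great that your post was concise and clear and that you tried to provide an explanation. Many of the earliest answers failed to do so. To be clear, code-only answers are discouraged on SO (although this was not well enforced in the past). Looking forward to seeing more great answers from you. - SherylHohman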

- Jason Larke · Answer 5

This post is quite old, but I needed something like this today and only thought to check Stack Overflow afterwards. It runs pretty fast for me.

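String.prototype.count = function(substr, start, overlap) {
    overlap = overlap || false;
    start = start || 0;

    // An empty search string would never let the loop below terminate
    if (substr.length === 0) return 0;

    var count = 0, 
        // step past the whole match, or by one character when overlaps are allowed
        offset = overlap ? 1 : substr.length;

    // indexOf returns -1 when nothing is found, which makes start equal offset - 1
    while((start = this.indexOf(substr, start) + offset) !== (offset - 1))
        ++count;
    return count;
};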
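
Extending String.prototype affects every string in the program, which some codebases prefer to avoid. A rough sketch of the same indexOf-based idea as a standalone function, assuming an empty search string should count as zero (countOccurrences and its parameters are names chosen for this example):

```javascript
// Hypothetical standalone variant of the indexOf-based loop.
function countOccurrences(str, substr, allowOverlap) {
  if (substr.length === 0) return 0; // assumption: nothing to count
  var count = 0;
  // Step by one character for overlapping matches, otherwise skip the whole match.
  var step = allowOverlap ? 1 : substr.length;
  var pos = str.indexOf(substr);
  while (pos !== -1) {
    count++;
    pos = str.indexOf(substr, pos + step);
  }
  return count;
}

console.log(countOccurrences("This is a string.", "is")); // 2
console.log(countOccurrences("sss", "ss"));               // 1
console.log(countOccurrences("sss", "ss", true));         // 2
```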

- Ranju · Answer 6

var myString = "This is a string.";
var foundAtPosition = 0;
var Count = 0;
while (foundAtPosition != -1)
{
    foundAtPosition = myString.indexOf("is", foundAtPosition);
    if (foundAtPosition != -1)
    {
        Count++;
        foundAtPosition++;
    }
}
document.write("There are " + Count + " occurrences of the word IS");

Reference: Count the number of times a substring appears in a string, for a step-by-step explanation.

- Ayo I · Answer 7

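Building on @Vittim.us's answer. I like the control his approach gives me, which makes it easy to extend, but I needed to add case insensitivity and restrict matches to whole words, with support for punctuation (e.g. "bath" is in "take a bath." but not in "bathing").

The punctuation regex came from: https://dev59.com/e2855IYBdhLWcg3weEGJ#25575009 (How can I strip all punctuation from a string in JavaScript using regex?)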

function keywordOccurrences(string, subString, allowOverlapping, caseInsensitive, wholeWord)
{

    string += "";
    subString += "";
    if (subString.length <= 0) return (string.length + 1); //deal with empty strings

    if(caseInsensitive)
    {            
        string = string.toLowerCase();
        subString = subString.toLowerCase();
    }

    var n = 0,
        pos = 0,
        step = allowOverlapping ? 1 : subString.length,
        stringLength = string.length,
        subStringLength = subString.length;

    while (true)
    {
        pos = string.indexOf(subString, pos);
        if (pos >= 0)
        {
            var matchPos = pos;
            pos += step; //slide forward the position pointer no matter what

            if(wholeWord) //only whole word matches are desired
            {
                if(matchPos > 0) //if the string is not at the very beginning we need to check if the previous character is whitespace
                {                        
                    if(!/[\s\u2000-\u206F\u2E00-\u2E7F\\'!"#$%&\(\)*+,\-.\/:;<=>?@\[\]^_`{|}~]/.test(string[matchPos - 1])) //ignore punctuation
                    {
                        continue; //then this is not a match
                    }
                }

                var matchEnd = matchPos + subStringLength;
                if(matchEnd < stringLength) //a character follows the match, so check that it is whitespace or punctuation
                {                        
                    if (!/[\s\u2000-\u206F\u2E00-\u2E7F\\'!"#$%&\(\)*+,\-.\/:;<=>?@\[\]^_`{|}~]/.test(string[matchEnd])) //ignore punctuation
                    {
                        continue; //then this is not a match
                    }
                }
            }

            ++n;                
        } else break;
    }
    return n;
}

Please feel free to modify and refactor this answer if you spot bugs or improvements.
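
For comparison, whole-word, case-insensitive counting can also be sketched with a regular expression and \b word boundaries. This is not the approach used in the answer above, and \b only treats ASCII letters, digits and the underscore as word characters, so it may behave differently on non-English text. escapeRegExp and countWholeWord are hypothetical helper names:

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Count case-insensitive, whole-word, non-overlapping matches of `word`.
// Assumption: `word` starts and ends with a word character; otherwise \b
// will not mark the boundary the way this sketch intends.
function countWholeWord(text, word) {
  if (word === "") return 0;
  var pattern = new RegExp("\\b" + escapeRegExp(word) + "\\b", "gi");
  return (text.match(pattern) || []).length;
}

console.log(countWholeWord("Take a bath. Bathing is fun.", "bath")); // 1
console.log(countWholeWord("Bath, bath; BATH!", "bath"));            // 3
```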

- bcherny · Answer 8

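For anyone who finds this thread in the future, note that the accepted answer will not always return the correct value if you generalize it, because it chokes on regex operators such as $ and . in the search string. Here is a better version that can handle any needle: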

function occurrences (haystack, needle) {
  // Escape every regex metacharacter so the needle is matched literally
  var _needle = needle.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')
  return (
    haystack.match(new RegExp(_needle, 'g')) || []
  ).length
}
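
To see why special characters matter, compare an unescaped and an escaped pattern for a needle that is a single dot. This is an illustrative sketch only, with made-up variable names and a hand-escaped literal:

```javascript
var haystack = "a.b.c";

// Unescaped: "." is a regex wildcard, so every character in the haystack matches.
var naive = (haystack.match(new RegExp(".", "g")) || []).length;

// Escaped: "\." matches only literal dots.
var escaped = (haystack.match(/\./g) || []).length;

console.log(naive);   // 5
console.log(escaped); // 2
```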

- Meghendra S Yadav · Answer 9

Try this:

<?php 
$str = "33,33,56,89,56,56";
echo substr_count($str, '56');
?>

<script type="text/javascript">
var temp = "33,33,56,89,56,56";
var count = (temp.match(/56/g) || []).length; // match() returns null when nothing is found
alert(count);
</script>

- Jorge Alberto · Answer 10

A simple version without regex:

var temp = "This is a string.";

var count = (temp.split('is').length - 1);

alert(count);