How to find all occurrences of a substring?

Question

How to find all occurrences of a substring?

581

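Python has string.find() and string.rfind() to get the index of a substring in a string.

I'm wondering whether there is something like string.find_all() that returns all found indexes (not only the first one from the beginning or the first one from the end).

For example: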

string = "test test test test"

print(string.find('test'))   # 0
print(string.rfind('test'))  # 15

# this is the goal
print(string.find_all('test'))  # [0, 5, 10, 15]

To count the occurrences of a substring in a string, see "Count the number of occurrences of a substring in a string".

- nukl

20

'ttt'.find_all('tt') should raise an error, because string objects in Python have no method named find_all(). - Santiago Alessandri

4

It should return '0'. Of course, in a perfect world there must also be 'ttt'.rfind_all('tt'), which should return '1'. - nukl

32 Answers

Content provided by Stack Overflow.

- Harsha Biyani · Answer 1

You can try the following:

>>> string = "test test test test"
>>> for index, value in enumerate(string):
...     if string[index:index + len("test")] == "test":
...         print(index)
...
0
5
10
15

- Mohammad Amin Eskandari · Answer 2

You can try:

import re
str1 = "This dress looks good; you have good taste in clothes."
substr = "good"
# re.escape() makes sure substr is matched literally, not as a regex pattern
result = [m.start() for m in re.finditer(re.escape(substr), str1)]
# result = [17, 32]
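A note on this approach, with a small sketch: re.finditer interprets its first argument as a regular expression and reports only non-overlapping matches. Assuming the substring should be matched literally, re.escape guards against metacharacters such as "." and a zero-width lookahead makes overlapping matches visible too. The helper name find_all_regex and its overlapping flag are hypothetical, not part of the re module.

```python
import re

def find_all_regex(text, sub, overlapping=False):
    """Return the start indices of sub in text (hypothetical helper)."""
    pattern = re.escape(sub)  # match sub literally, not as a regex
    if overlapping:
        # A lookahead consumes no characters, so matches may overlap.
        pattern = "(?=" + pattern + ")"
    return [m.start() for m in re.finditer(pattern, text)]

print(find_all_regex("test test test test", "test"))  # [0, 5, 10, 15]
print(find_all_regex("ttt", "tt"))                    # [0]
print(find_all_regex("ttt", "tt", overlapping=True))  # [0, 1]
```

Whether overlapping hits should count depends on the use case, which is why the sketch leaves it as a flag.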

- Andrew H · Answer 3

This thread is a bit old, but this worked for me:

numberString = "onetwothreefourfivesixseveneightninefiveten"
testString = "five"

marker = 0
while marker < len(numberString):
    try:
        # index() raises ValueError once no further match exists
        position = numberString.index(testString, marker)
        print(position)
        marker = position + 1
    except ValueError:
        # also reached after the last match has been printed
        print("String not found")
        marker = len(numberString)

- Uri Goren · Answer 4

When looking for a large number of keywords in a document, use flashtext.

from flashtext import KeywordProcessor
words = ['test', 'exam', 'quiz']
txt = 'this is a test'
kwp = KeywordProcessor()
kwp.add_keywords_from_list(words)
result = kwp.extract_keywords(txt, span_info=True)

Flashtext runs faster than regular expressions on large lists of search terms.

- ulas.kesik · Answer 5

I think the cleanest solution is one without libraries and without yield:

def find_all_occurrences(string, sub):
    index_of_occurrences = []
    current_index = 0
    while True:
        current_index = string.find(sub, current_index)
        if current_index == -1:
            return index_of_occurrences
        else:
            index_of_occurrences.append(current_index)
            current_index += len(sub)

string = "test test test test"
substr = "test"
print(find_all_occurrences(string, substr))  # [0, 5, 10, 15]

Note: the find() method returns -1 when it cannot find anything.
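One detail worth spelling out: advancing by len(sub) after each hit skips overlapping occurrences, and an empty sub would never advance at all. A minimal sketch of a variant that makes the step explicit is shown here; the name find_all_step and the overlapping parameter are illustrative assumptions, not part of the answer.

```python
def find_all_step(string, sub, overlapping=False):
    """Collect the start indices of sub in string using str.find()."""
    if not sub:
        raise ValueError("sub must be a non-empty string")
    # Step past the whole match, or by one character to allow overlaps.
    step = 1 if overlapping else len(sub)
    indices = []
    index = string.find(sub)
    while index != -1:  # find() returns -1 when nothing more is found
        indices.append(index)
        index = string.find(sub, index + step)
    return indices

print(find_all_step("test test test test", "test"))  # [0, 5, 10, 15]
print(find_all_step("ttt", "tt"))                    # [0]
print(find_all_step("ttt", "tt", overlapping=True))  # [0, 1]
```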

- mascai · Answer 6

3

src = input() # we will find substring in this string
sub = input() # substring

res = []
pos = src.find(sub)
while pos != -1:
    res.append(pos)
    pos = src.find(sub, pos + 1)

- mascai

2

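While this code may solve the problem, it is better to explain how your code solves it. That way, future visitors can learn from your post and apply it to their own code. SO is not a coding service but a knowledge resource. Also, high-quality, complete answers are more likely to be upvoted. These features, along with the requirement that all posts be self-contained, are some of the strengths of SO as a platform that set it apart from forums. You can edit to add further information and/or to supplement your explanation with source documentation. - SherylHohman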

- Valentin Goikhman · Answer 7

This function does not look at every position inside the string, so it does not waste computing resources.

def findAll(string, word):
    all_positions = []
    next_pos = -1
    while True:
        next_pos = string.find(word, next_pos + 1)
        if next_pos < 0:
            break
        all_positions.append(next_pos)
    return all_positions

To use it, call it like this:

result=findAll('this word is a big word man how many words are there?','word')

- naveen raja · Answer 8

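The solutions provided by others rely entirely on the built-in find() method or on other available methods.

What is the basic core algorithm for finding all occurrences of a substring in a string?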

def find_all(string,substring):
    """
    Function: Returning all the index of substring in a string
    Arguments: String and the search string
    Return:Returning a list
    """
    length = len(substring)
    c=0
    indexes = []
    while c < len(string):
        if string[c:c+length] == substring:
            indexes.append(c)
        c=c+1
    return indexes

You can also inherit from the str class in a new class and use the function below.

class newstr(str):
    def find_all(string, substring):
        """
        Function: Returning all the index of substring in a string
        Arguments: String and the search string
        Return:Returning a list
        """
        length = len(substring)
        c = 0
        indexes = []
        while c < len(string):
            if string[c:c+length] == substring:
                indexes.append(c)
            c = c + 1
        return indexes

Calling the method:

newstr.find_all('Did you find this answer helpful? Then please upvote!', 'this')
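For comparison, here is a sketch of how such a subclass might look when find_all is written as an ordinary instance method, so that it can be called on the string object itself. The class name SearchableStr is hypothetical and chosen only for this illustration.

```python
class SearchableStr(str):
    """A str subclass that adds a find_all() method (hypothetical name)."""

    def find_all(self, substring):
        """Return every index at which substring starts, overlaps included."""
        length = len(substring)
        # Stop where the remaining text is shorter than substring.
        return [i for i in range(len(self) - length + 1)
                if self[i:i + length] == substring]

text = SearchableStr("test test test test")
print(text.find_all("test"))  # [0, 5, 10, 15]
```

One caveat with this design: methods inherited from str, such as upper() or replace(), return plain str objects, so their results no longer have find_all() unless they are wrapped in SearchableStr again.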

- Mike · Answer 9

I came up with a solution that uses assignment expressions (new in Python 3.8):

string = "test test test test"
phrase = "test"
start = -1
result = [(start := string.find(phrase, start + 1)) for _ in range(string.count(phrase))]

Output:

[0, 5, 10, 15]
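Note that string.count() counts non-overlapping occurrences, while the search restarts at start + 1, so the list comprehension can stop early when matches overlap (for "ttt" and "tt" it yields [0]). As a sketch of an alternative that still uses an assignment expression, the loop below simply runs until find() returns -1. The function name find_all_walrus is an assumption made for illustration, and Python 3.8 or later is required.

```python
def find_all_walrus(string, phrase):
    """Return all start indices of phrase in string, overlaps included."""
    result = []
    start = -1
    # Assign and test in one expression; find() returns -1 when done.
    while (start := string.find(phrase, start + 1)) != -1:
        result.append(start)
    return result

print(find_all_walrus("test test test test", "test"))  # [0, 5, 10, 15]
print(find_all_walrus("ttt", "tt"))                    # [0, 1]
```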

- WangSung · Answer 10

If you want to do it without re (regular expressions):

find_all = lambda _str,_w : [ i for i in range(len(_str)) if _str.startswith(_w,i) ]

string = "test test test test"
print( find_all(string, 'test') ) # >>> [0, 5, 10, 15]
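The same idea can be written as a named function, which is easier to document and test than a lambda. This is only a sketch, and the name find_all_startswith is hypothetical. It also stops scanning once the remaining text is shorter than the substring.

```python
def find_all_startswith(text, sub):
    """Return every index i for which text.startswith(sub, i) is true."""
    last_start = len(text) - len(sub)  # no match can begin after this index
    return [i for i in range(last_start + 1) if text.startswith(sub, i)]

print(find_all_startswith("test test test test", "test"))  # [0, 5, 10, 15]
```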