Get the index ranges of all occurrences of a substring in a Python string

Question

3

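What is the best way to get every (start, end) index pair for a substring within a string?

For example, if my string s is "abcdegf", then the substring "bcd" is s[1:4].

This function gives me the answer, but I would be surprised if there were no more elegant solution.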

>>> def substring_range(s, substring):
    for i in range(len(s) - len(substring) + 1):  # + 1 so a match at the very end is not missed
        if s[i:i+len(substring)] == substring:
            yield (i, i+len(substring))


>>> [x for x in substring_range('abcdabcd', 'bc')]
[(1, 3), (5, 7)]

- aberger

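"abcdegf".index("bcd") and "abcdegf".index("bcd") + len("bcd")? - Psidom

str.find and str.index both return the index of the substring. To get the index range, you can build a tuple with that index as the first value and the index plus the substring length as the second. - Random Davis

@RandomDavis That is correct, but it only finds the first occurrence of the substring. I will update the question to make clear that all occurrences must be found. - aberger

I have updated my answer to support multiple occurrences. - Denis Olehov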
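
To make the str.find suggestion from the comments concrete, here is a minimal sketch covering the first occurrence only. The helper name first_substring_range is hypothetical and does not appear in the thread. It returns None when the substring is absent, because str.find returns -1 in that case (str.index would raise ValueError instead).

```python
def first_substring_range(s, substring):
    """Return (start, end) of the first occurrence, or None if absent.

    Hypothetical helper illustrating the comments above; not from the thread.
    """
    start = s.find(substring)  # str.find returns -1 when there is no match
    if start == -1:
        return None
    return (start, start + len(substring))


print(first_substring_range("abcdegf", "bcd"))  # (1, 4)
print(first_substring_range("abcdegf", "xyz"))  # None
```

As the asker notes, this alone does not cover repeated occurrences.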

2 Answers

2

Maybe something like this?

control_s, sub_str = "abcdegfbcd", "bcd"

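def subs_str_finder(control_s, sub_str):
    """
    Finds indexes of all sub_str occurrences in control_s.
    """
    sub_len = len(sub_str)
    if sub_len == 0:
        return  # an empty substring would otherwise loop forever

    # Search from a moving offset rather than deleting each match,
    # so the indexes always refer to the original string.
    first_index = control_s.find(sub_str)
    while first_index != -1:
        second_index = first_index + sub_len
        yield first_index, second_index

        first_index = control_s.find(sub_str, second_index)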

for first_index, second_index in subs_str_finder(control_s, sub_str):
    print(first_index, second_index)

Update: now supports multiple occurrences of the substring.

- Denis Olehov


- Wiktor Stribiżew · Accepted Answer

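You can use a regular expression: match.start() returns the start position and match.end() gives the end position (the search term is a literal string, so it must be passed through re.escape):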

import re
def substring_range(s, substring):
    for i in re.finditer(re.escape(substring), s):
        yield (i.start(), i.end())

s = "abcdegfbcd"
substring = "bcd"
print([x for x in substring_range(s, substring)])

See the Python demo.
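
Two details of the regex approach may be worth a sketch: what re.escape guards against, and the fact that re.finditer only reports non-overlapping matches. The function overlapping_substring_range below is a hypothetical variant, not part of the answer; it swaps in a zero-width lookahead so that overlapping occurrences are reported too, and it assumes an empty substring should yield nothing.

```python
import re


def substring_range(s, substring):
    """The answer's approach: literal, non-overlapping matches."""
    for m in re.finditer(re.escape(substring), s):
        yield (m.start(), m.end())


def overlapping_substring_range(s, substring):
    """Hypothetical variant: a lookahead also finds overlapping matches."""
    if not substring:
        return  # assumption: an empty substring yields no ranges
    pattern = "(?=" + re.escape(substring) + ")"
    for m in re.finditer(pattern, s):
        # The lookahead consumes nothing, so derive the end from the length.
        yield (m.start(), m.start() + len(substring))


# Without re.escape, "." matches any character, including the "b" in "abc".
print(list(substring_range("abc a.c", "a.c")))            # [(4, 7)]
print([m.span() for m in re.finditer("a.c", "abc a.c")])  # [(0, 3), (4, 7)]

# finditer skips the occurrence that overlaps an earlier match.
print(list(substring_range("aaaa", "aa")))                # [(0, 2), (2, 4)]
print(list(overlapping_substring_range("aaaa", "aa")))    # [(0, 2), (1, 3), (2, 4)]
```

Whether overlapping occurrences should count depends on the use case; the sliding-window function in the question reports them, while the regex version in this answer does not.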