How do I remove all characters before a specific character in Python?

Question

79

I want to remove all characters before a designated character or set of characters, for example:

intro = "<>I'm Tom."

Now I want to remove the <> before I'm (or, more specifically, before I). Any suggestions?

- Saroekin

1

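What is the designated character? - Simeon Visser

In this case, it's I. - Saroekin

1

I understand that, but what about other cases? How do we know where the text starts? - Simeon Visser

Well, I'm filtering for what I'm looking for in the text; so you could use a loop, split the text/words, etc. to determine where it starts. - Saroekin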

12 Answers

48

str.find returns the index of the first occurrence of a substring:

intro[intro.find('I'):]

- duan

6

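If the character is missing from the string, this returns only the last character of the input string, because .find returns -1 and some_str[-1:] means "everything from the last character onward". - user3064538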

3

Thanks, this partly helped me. If you also want to remove the found character itself (everything up to and including the found index), use: intro[intro.find('I')+1:]. - Fisal Assubaieye
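The two comments above can be folded into one small helper. This is only a minimal sketch and not part of the original answer: the name drop_before and its include_marker parameter are hypothetical, and it assumes the string should come back unchanged when the marker is absent.

```python
def drop_before(text, marker, include_marker=False):
    """Return text with everything before the first `marker` removed.

    Hypothetical helper: when `marker` is absent the text is returned
    unchanged, which avoids the text[-1:] result described above.
    """
    idx = text.find(marker)
    if idx == -1:  # str.find signals "not found" with -1
        return text
    if include_marker:
        idx += len(marker)  # also cut the marker itself
    return text[idx:]


intro = "<>I'm Tom."
print(drop_before(intro, "I"))                       # I'm Tom.
print(drop_before(intro, "I", include_marker=True))  # 'm Tom.
print(drop_before(intro, "Z"))                       # <>I'm Tom.
```

Returning the input untouched is just one possible policy; raising an error or returning an empty string may suit other callers better.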

30

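Since string.index(char) gives you the index of the first occurrence of the character, you can simply use string[string.index(char):].

For example, in this case intro.index("I") = 2, and intro[2:] = "I'm Tom."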

- Ashkay

1

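No problem. This works for any string. Note: 1) you may need to make sure the character is actually present, since str.find returns -1 when it is missing and str.index raises ValueError; 2) index returns only the first occurrence in the given string. - Ashkay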

25

A concrete example: intro[intro.index('I'):]. This finds the position of the first 'I' in intro and returns the substring from that position to the end of the string. - mattalxndr

1

If the character does not exist in the string, this raises ValueError. - user3064538
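A short sketch of guarding against that ValueError. The function name slice_from is hypothetical, and falling back to the unmodified text is an assumption rather than anything the answer prescribes.

```python
def slice_from(text, char):
    """Slice text from the first `char`; hypothetical helper."""
    try:
        # str.index raises ValueError when `char` is absent
        return text[text.index(char):]
    except ValueError:
        return text  # assumed fallback: leave the text untouched


intro = "<>I'm Tom."
print(slice_from(intro, "I"))  # I'm Tom.
print(slice_from(intro, "Z"))  # <>I'm Tom.
```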

8

If you know the character position at which to cut, you can use slice notation:

intro = intro[2:]

If you don't know the position but do know which characters you want to remove, you can use lstrip():

intro = intro.lstrip("<>")

- Brent Washburne
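One caveat about the lstrip() approach above, sketched here with made-up sample strings: lstrip() treats its argument as a set of characters to strip, not as a literal prefix. When only one exact prefix should go, an explicit startswith check works (Python 3.9+ also offers str.removeprefix).

```python
intro = "<>I'm Tom."

# lstrip removes any leading run of the given characters, in any order
stripped = intro.lstrip("<>")
print(stripped)  # I'm Tom.

# Because the argument is a character set, this strips more than "<>"
greedy = "<>><Tom>".lstrip("<>")
print(greedy)  # Tom>

# To remove one exact prefix instead, check for it explicitly
prefix = "<>"
doubled = "<><>I'm Tom."
exact = doubled[len(prefix):] if doubled.startswith(prefix) else doubled
print(exact)  # <>I'm Tom.
```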

3

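intro = "<>I'm Tom."  # avoid naming a variable `str`: it shadows the built-in
temp = intro.split("I", 1)  # split once, at the first "I"
temp[0] = temp[0].replace("<>", "")  # clean the part before "I"
intro = "I".join(temp)  # put the "I" back between the two parts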

- ahmad valipour

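Not the downvoter, but you could use 'I' + intro.split('I', 1)[1]. - Avinash Raj

@AvinashRaj Neither am I, but how (genuinely curious) would that make it work differently? It looks to me like you're splitting off everything before the I? Also, what does [1] stand for? - Saroekin

1

Index 1 of the list returned by split, i.e. its second element. - Avinash Raj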
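To make the [1] in the comment above concrete, here is a small sketch of what split with maxsplit=1 returns, next to str.partition, which is not mentioned in the thread but behaves similarly and does not fail when the separator is missing.

```python
intro = "<>I'm Tom."

# split("I", 1) cuts at the first "I" only, giving two pieces
pieces = intro.split("I", 1)
print(pieces)           # ['<>', "'m Tom."]
print("I" + pieces[1])  # I'm Tom.

# partition always returns (head, separator, tail)
head, sep, tail = intro.partition("I")
print(sep + tail)       # I'm Tom.

# With a missing separator, partition yields empty sep and tail,
# whereas pieces[1] on the split result would raise IndexError
head, sep, tail = intro.partition("Z")
print(repr(sep + tail))  # ''
```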

2

>>> intro = "<>I'm Tom."
>>> # Just split the string at the special symbol
>>> intro.split("<>")
['', "I'm Tom."]
>>> new = intro.split("<>")
>>> new[1]
"I'm Tom."

- Chethan Raj

2

import re

date_div = "Blah blah\nblah, Updated: Aug. 23, 2012 Blah blah Updated: Feb. 13, 2019"

up_to_word = ":"
rx_to_first = r'^.*?{}'.format(re.escape(up_to_word))
rx_to_last = r'^.*{}'.format(re.escape(up_to_word))

# (Dot.) In the default mode, this matches any character except a newline. 
# If the DOTALL flag has been specified, this matches any character including a newline.

print("Remove all up to the first occurrence of the word including it:")
print(re.sub(rx_to_first, '', date_div, flags=re.DOTALL).strip())

print("Remove all up to the last occurrence of the word including it:")
print(re.sub(rx_to_last, '', date_div, flags=re.DOTALL).strip())

- Rhea Thomas

With this approach, how do I keep the first and last occurrences? - joey11235
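If the question above is read as "remove everything before the word but keep the word itself", one possible sketch is to turn the word into a lookahead so that it is not consumed by the match. The word "Updated:" and the variable names are chosen for illustration and are not from the answer.

```python
import re

date_div = "Blah blah\nblah, Updated: Aug. 23, 2012 Blah blah Updated: Feb. 13, 2019"

word = "Updated:"  # illustrative choice of marker
# (?=...) asserts the word follows without including it in the match
rx_keep_first = r'^.*?(?={})'.format(re.escape(word))
rx_keep_last = r'^.*(?={})'.format(re.escape(word))

from_first = re.sub(rx_keep_first, '', date_div, flags=re.DOTALL)
from_last = re.sub(rx_keep_last, '', date_div, flags=re.DOTALL)

print(from_first)  # Updated: Aug. 23, 2012 Blah blah Updated: Feb. 13, 2019
print(from_last)   # Updated: Feb. 13, 2019
```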

2

I looped over the string by index and skipped the unwanted characters.

intro_list = []

intro = "<>I'm Tom."
for i in range(len(intro)):
    if intro[i] == '<' or intro[i] == '>':
        pass
    else:
        intro_list.append(intro[i])

intro = ''.join(intro_list)
print(intro)

- Mafematic
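Two more compact ways to express the same filtering as the loop above, offered as a sketch rather than as the answer's own code. Like the loop, both remove every < and > anywhere in the string, not only the leading ones; the second sample string is made up to show that.

```python
intro = "<>I'm Tom."

# Generator expression: keep every character that is not < or >
cleaned = "".join(ch for ch in intro if ch not in "<>")
print(cleaned)  # I'm Tom.

# str.translate with a deletion table does the same in one call
table = str.maketrans("", "", "<>")
print(intro.translate(table))  # I'm Tom.

# Caveat: brackets later in the text are removed as well
sample = "<>I'm <b>Tom</b>."
print(sample.translate(table))  # I'm bTom/b.
```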

0

Building on @AvinashRaj's answer, you can use re.sub to replace a substring with a string or character via a regular expression:

import re

input_str = "<>I'm Tom."  # sample input taken from the question
output_str = re.sub(r'^.*?I', 'I', input_str)

- quent

0

You can use itertools.dropwhile to drop all characters until a given character is seen. Then use ''.join() to turn the resulting iterable back into a string:

from itertools import dropwhile
intro = "<>I'm Tom."
stop = "I"  # assumed value: the character(s) at which dropping stops
print(''.join(dropwhile(lambda x: x not in stop, intro)))

This outputs:

I'm Tom.

- BrokenBenchmark

Content sourced from Stack Overflow.

- Avinash Raj · Accepted Answer

70

Use re.sub. Match all characters up to and including the first I, then replace the matched text with I.

re.sub(r'^.*?I', 'I', stri)

- Avinash Raj

I'm still fairly new to re, so I'll dig into it some more; thanks for your answer! - Saroekin

2

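Note that you can switch between the first and the last I: the lazy pattern in re.sub(r'^.*?I', 'I', stri) stops at the first I, while the greedy r'^.*I' runs to the last one. The other answers as written don't offer this. - Avinash Raj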

1

So you're saying re is the best choice? Do you have any good tutorials or articles that explain the basics of re? Thanks for your help. - Saroekin

2

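Which answer you choose is entirely up to you. And yes, learning regular expressions is a must for every developer, since only a few languages don't use regex. - Avinash Raj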

2

Missing import re. - quent
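The first-versus-last switch mentioned in the comments can be checked with a string that contains two I characters. This is a minimal sketch with the import included; the sample string is made up, and note that without re.DOTALL the dot does not cross newlines.

```python
import re

stri = "<>I'm Tom. I code."  # made-up sample with two "I" characters

# Lazy .*? stops at the first "I"
up_to_first = re.sub(r'^.*?I', 'I', stri)
print(up_to_first)  # I'm Tom. I code.

# Greedy .* runs on to the last "I"
up_to_last = re.sub(r'^.*I', 'I', stri)
print(up_to_last)  # I code.
```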
