How to remove special characters from the beginning of a string in Python

Question

How to remove special characters from the beginning of a string in Python

python regex string substring

5

I'm getting my data from XML, and it sometimes starts with special characters, for example:

'This is a sample title or %&*I don't know if this is the text

I tried checking title[0].isstring() or title[0].isdigit() and then removing the character. But what if there is more than one special character at the beginning? Do I need a for loop?

- Simsons

1

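I suggest you check why you are getting "special characters" from the XML document in the first place. Is the document encoded as UTF-8, and are you decoding the XML correctly? Unexpected special characters are often an encoding problem rather than a problem with the XML content. - lifeisstillgood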

4 Answers

1

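Use the strip() function to remove special characters from both the beginning and the end of a string. For example:

text = ").* this is text .("
text.strip(")(.* ")

Output: 'this is text'

To remove characters only from the beginning of the string, use lstrip(). For example:

text = ").* this is text .("
text.lstrip(")(.* ")

Output: 'this is text .('

To remove characters only from the end of the string, use rstrip(). For example:

text = ").* this is text .("
text.rstrip(")(.* ")

Output: ').* this is text'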
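
One detail worth spelling out: the argument to strip(), lstrip() and rstrip() is treated as a set of characters, not as a prefix or suffix. A minimal sketch, with sample strings made up for illustration:

```python
# The argument is a set of characters, not a substring: every leading
# character that belongs to the set is removed, whatever the order.
title = "*&%%&*Sample title"
cleaned = title.lstrip("%&*")
print(cleaned)  # Sample title

# Stripping stops at the first character that is not in the set,
# so special characters later in the string are left alone.
mixed = "%&*Tom & Jerry*"
mixed_cleaned = mixed.lstrip("%&*")
print(mixed_cleaned)  # Tom & Jerry*

# Caveat: any letters placed in the set are stripped as well.
surprising = "title: test".lstrip("title: ")
print(surprising)  # st
```

The last case is why the set passed to lstrip() should normally contain only the unwanted symbols.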

- Umang Suthar

1

>>> import re
>>> re.sub(r'^\W*', '', "%&*I don't know if this is the text")
"I don't know if this is the text"

# or

>>> "%&*I don't know if this is the text".lstrip("!@#$%^&*()")
"I don't know if this is the text"

- Adam Jurczyk

2

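\W+ would be better. \W* also matches the empty string, so a replace operation and a string reassignment are performed even when there is nothing to replace. - Tim Pietzcker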
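
The difference can be observed with re.subn(), which returns the number of substitutions alongside the new string. A small sketch, using an arbitrary sample string that has nothing to strip:

```python
import re

clean_title = "Already clean"

# \W* matches the empty string at position 0, so one (pointless)
# substitution is still performed.
result_star, count_star = re.subn(r"^\W*", "", clean_title)

# \W+ needs at least one non-word character, so nothing is replaced.
result_plus, count_plus = re.subn(r"^\W+", "", clean_title)

print(count_star, count_plus)  # 1 0
print(result_star == result_plus == clean_title)  # True
```

Both patterns produce the same string; they differ only in whether a substitution happens when the input is already clean.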

1

If you only want to remove a specific set of characters, use lstrip() ("left strip").

For example, if you want to remove any leading %, & or * characters, you would use:

actual_title = title.lstrip("%&*")
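
If listing the characters by hand feels fragile, the standard library constant string.punctuation can supply the set. This is only a sketch, assuming that ASCII punctuation and whitespace are all that needs removing; the variable names are illustrative.

```python
import string

# string.punctuation holds the ASCII punctuation characters;
# adding string.whitespace also removes leading spaces, tabs and newlines.
unwanted = string.punctuation + string.whitespace

title = "  %&*I don't know if this is the text"
actual_title = title.lstrip(unwanted)
print(actual_title)  # I don't know if this is the text
```

Note that this set includes the underscore and excludes non-ASCII symbols, so the result can differ from a \W-based regular expression.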

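On the other hand, if you want to remove every character that is not part of a certain set (for example, everything that is not alphanumeric), then the regular expression in Tim Pietzcker's answer is probably the easiest way.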
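
For comparison, the same "drop everything before the first alphanumeric character" idea can be written without a regular expression and without an explicit for loop. This sketch uses itertools.dropwhile with str.isalnum; the function name is hypothetical.

```python
from itertools import dropwhile

def strip_leading_non_alnum(title):
    """Drop characters from the start until the first letter or digit."""
    return "".join(dropwhile(lambda ch: not ch.isalnum(), title))

print(strip_leading_non_alnum("%&*I don't know if this is the text"))
# I don't know if this is the text
print(strip_leading_non_alnum("_draft"))  # draft
print(repr(strip_leading_non_alnum("%%%")))  # ''
```

Unlike \W, str.isalnum does not accept the underscore, so a leading underscore is removed here.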

- Amber

- Tim Pietzcker · Accepted Answer

You could use a regular expression:

import re
mystring = re.sub(r"^\W+", "", mystring)

This removes all non-alphanumeric characters from the start of the string.

Explanation:

^   # Start of string
\W+ # One or more non-word characters (not a letter, digit or underscore)
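
Wrapped in a reusable function, the approach might look like the sketch below. The function and constant names are illustrative and not part of the answer. The examples also show how \W behaves for str patterns in Python 3: non-ASCII letters count as word characters unless re.ASCII is passed.

```python
import re

# Compiled once so the pattern can be reused for many titles.
LEADING_NON_WORD = re.compile(r"^\W+")

def clean_title(title):
    """Remove leading non-word characters; the rest is left untouched."""
    return LEADING_NON_WORD.sub("", title)

print(clean_title("%&*I don't know if this is the text"))
# I don't know if this is the text
print(clean_title("_private"))   # _private  (underscore is a word character)
print(clean_title("%Élan"))      # Élan      (É is a word character by default)

# With re.ASCII only [a-zA-Z0-9_] count as word characters, so É is stripped too.
print(re.sub(r"^\W+", "", "%Élan", flags=re.ASCII))  # lan
```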