How to remove all whitespace from a string

Question

238

How do I remove all whitespace from a Python string? For example, I want to convert a string like strip my spaces into stripmyspaces, but I can't seem to accomplish that with strip():

>>> 'strip my spaces'.strip()
'strip my spaces'

- wrongusername

16

Note that str.strip only affects leading and trailing whitespace. - Roger Pate

It also does not handle characters that are effectively Unicode whitespace, such as the zero-width space. See https://dev59.com/Tm865IYBdhLWcg3wlvqq#3739928 for details. - Greg Dubicki

'strip my spaces'.replace(' ', '') removes the spaces. - juan

14 Answers

86

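For Python 3:

>>> import re
>>> re.sub(r'\s+', '', 'strip my \n\t\r ASCII and \u00A0 \u2003 Unicode spaces')
'stripmyASCIIandUnicodespaces'
>>> # Or, depending on the situation:
>>> re.sub(r'(\s|\u180E|\u200B|\u200C|\u200D|\u2060|\uFEFF)+', '', \
... '\uFEFF\t\t\t strip all \u000A kinds of \u200B whitespace \n')
'stripallkindsofwhitespace'

\s here matches all ASCII whitespace characters:

space
tab
newline (\n)
carriage return (\r)
form feed
vertical tab

In addition:

in Python 2 with re.UNICODE enabled,
in Python 3 without any extra action,

\s also matches Unicode whitespace characters, for example:

non-breaking space
em space
ideographic space

For the complete list, see here, in the section "Unicode characters with White_Space property".

However, \s does NOT match characters that act as whitespace in practice but are not classified as whitespace, for example:

zero-width joiner
Mongolian vowel separator
zero-width non-breaking space (also known as the byte order mark)

For the complete list, see here, in the section "Related Unicode characters without White_Space property".

These 6 characters are therefore covered by the list in the second regular expression: \u180E|\u200B|\u200C|\u200D|\u2060|\uFEFF.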
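
As a rough illustration of the distinction described above, the sketch below contrasts plain \s with a pattern that also covers the six zero-width and related characters. The names EXTRA_INVISIBLE, WHITESPACE_AND_INVISIBLE and remove_all_whitespace are made up for this example, and U+180E is used here as the code point of the Mongolian vowel separator.

```python
import re

# Assumed helper names (not from the answer). The six extra code points are
# characters that behave like whitespace but lack the White_Space property.
EXTRA_INVISIBLE = "\u180E\u200B\u200C\u200D\u2060\uFEFF"
WHITESPACE_AND_INVISIBLE = re.compile("[\\s" + EXTRA_INVISIBLE + "]+")


def remove_all_whitespace(text):
    """Remove everything \\s matches plus the six extra characters."""
    return WHITESPACE_AND_INVISIBLE.sub("", text)


sample = "\uFEFF strip\u00A0all \u200B kinds\n"

# Plain \s removes the no-break space but leaves U+FEFF and U+200B in place.
only_s = re.sub(r"\s+", "", sample)
print(ascii(only_s))

# The extended pattern removes them as well.
print(ascii(remove_all_whitespace(sample)))
```

Whether the extended pattern is appropriate depends on the data: if a leading byte order mark has to be preserved, plain \s may be the safer choice.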

- Tim Yates

6

This solution is more concise than the accepted answer. - Giga Chad Coding

2

This answer is more explicit than the others, so I think it is the best one. - Tristan

1

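The accepted answer is outdated: it dates from a time when almost nobody used Python 3, so it does not cover Unicode strings. It also goes into optimization unnecessarily, which is not what the question is about. That is why I updated this answer and consider it the best one. - Greg Dubicki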

1

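@GregDubicki Thanks for the additions. I re-added the simple option, because the full list may be overkill in some situations, or even harmful if you need to preserve the BOM (though hopefully you don't). I also noted the MVS, because it is my favorite Unicode character and it was still Zs when I originally wrote this. :P - Tim Yates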

1

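That edit looks misleading (\s is sufficient in many cases for Unicode strings, not just ASCII). It emphasizes Python 2 (which is now end-of-life) and adds complexity. I think a brief mention of Python 2 with an explanation of the difference is enough to pick the right approach. - Tim Yates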

42

Alternatively, in Python 2:

import string
"strip my spaces".translate(None, string.whitespace)

Here is the Python 3 version:

import string
"strip my spaces".translate(str.maketrans('', '', string.whitespace))

- Dan Menes
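
One caveat with the translate approach is that string.whitespace contains only ASCII whitespace. The sketch below shows the effect on a string containing U+2028 LINE SEPARATOR and one possible workaround. It assumes that str.isspace() is an acceptable definition of whitespace, and the name UNICODE_WS_TABLE is invented for the example.

```python
import string
import sys

text = "strip\u2028my spaces"

# string.whitespace is ASCII-only, so U+2028 LINE SEPARATOR survives.
ascii_table = str.maketrans("", "", string.whitespace)
ascii_result = text.translate(ascii_table)

# Assumed alternative: delete every code point that str.isspace() accepts.
# Building the table scans all code points once, so keep it at module level.
UNICODE_WS_TABLE = {
    cp: None for cp in range(sys.maxunicode + 1) if chr(cp).isspace()
}
unicode_result = text.translate(UNICODE_WS_TABLE)

print(ascii(ascii_result))
print(ascii(unicode_result))
```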

This seems to be the most Pythonic approach. Why hasn't it been voted to the top? - rbp

The Python 3 code works. @DanMenes's comment is outdated. - igo

3

NameError: name 'string' is not defined. - Zelphir Kaltstahl

3

You need to import the string module. - Shahryar Saljoughi

1

string.whitespace contains only ASCII whitespace, so this does not work on strings that contain, for example, the U+2028 LINE SEPARATOR. - user2357112

19

Remove leading whitespace in Python

string1 = "    This is Test String to strip leading space"
print(string1)
print(string1.lstrip())

Remove trailing whitespace in Python

string2 = "This is Test String to strip trailing space     "
print(string2)
print(string2.rstrip())

Remove whitespace from the beginning and end of a string in Python

string3 = "    This is Test String to strip leading and trailing space      "
print(string3)
print(string3.strip())

Remove all spaces in Python

string4 = "   This is Test String to test all the spaces        "
print(string4)
print(string4.replace(" ", ""))

- JohnSmitoff

17

The simplest approach is to use replace:

"foo bar\t".replace(" ", "").replace("\t", "")

Or, use a regular expression:

import re
re.sub(r"\s", "", "foo bar\t")

- carl

4

As Roger Pate mentioned, the following code worked for me:

>>> s = " \t foo \n bar "
>>> "".join(s.split())
'foobar'

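I am running the following code in a Jupyter Notebook:

new_list = [' Plain   Utthapam  ', ' Masala  Dosa ']  # sample input (assumed; not in the original answer)
i = 0
ProductList = []
while i < len(new_list):
    # e.g. new_list[i] == ' Plain   Utthapam  '
    # temp = new_list[i].strip()         # would give 'Plain   Utthapam'
    temp = "".join(new_list[i].split())  # 'PlainUtthapam'
    temp = temp.upper()                  # 'PLAINUTTHAPAM'
    ProductList.append(temp)
    i = i + 2                            # step of 2 as in the original; use 1 to process every entry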

- Yogesh Awdhut Gadade

3

import re
re.sub(' ','','strip my spaces')

- PrabhuPrakash

4

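Welcome to Stack Overflow. While we appreciate your answer, it would be better if it provided additional value on top of the other answers. In this case, your answer does not add anything, since another user already posted that solution. If a previous answer was helpful to you, please upvote it once you have enough reputation. - Maximilian Peters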

1

This does not answer the question "how to remove all whitespace". It only removes spaces. - Nic

3

The standard techniques for filtering a list apply, although they are not as efficient as the split/join or translate methods.

We need a set of whitespace characters:

>>> import string
>>> ws = set(string.whitespace)

The filter builtin:

>>> "".join(filter(lambda c: c not in ws, "strip my spaces"))
'stripmyspaces'

A list comprehension (yes, use the brackets: see the benchmark below):

>>> import string
>>> "".join([c for c in "strip my spaces" if c not in ws])
'stripmyspaces'

A fold (the initial value "" ensures that a leading whitespace character is also dropped):

>>> import functools
>>> "".join(functools.reduce(lambda acc, c: acc if c in ws else acc+c, "strip my spaces", ""))
'stripmyspaces'

Benchmark:

>>> from timeit import timeit
>>> timeit('"".join("strip my spaces".split())')
0.17734256500003198
>>> timeit('"strip my spaces".translate(ws_dict)', 'import string; ws_dict = {ord(ws):None for ws in string.whitespace}')
0.457635745999994
>>> timeit('re.sub(r"\s+", "", "strip my spaces")', 'import re')
1.017787621000025

>>> SETUP = 'import string, operator, functools, itertools; ws = set(string.whitespace)'
>>> timeit('"".join([c for c in "strip my spaces" if c not in ws])', SETUP)
0.6484303600000203
>>> timeit('"".join(c for c in "strip my spaces" if c not in ws)', SETUP)
0.950212219999969
>>> timeit('"".join(filter(lambda c: c not in ws, "strip my spaces"))', SETUP)
1.3164566040000523
>>> timeit('"".join(functools.reduce(lambda acc, c: acc if c in ws else acc+c, "strip my spaces"))', SETUP)
1.6947649049999995
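
The timings above depend on the machine and the Python version. A minimal sketch for repeating the comparison on your own inputs might look like the following. The CANDIDATES dictionary and the compare function are names invented here, and only three of the approaches are included.

```python
import re
import string
from timeit import timeit

WS_TABLE = str.maketrans("", "", string.whitespace)
WS_RE = re.compile(r"\s+")

# Hypothetical registry of the approaches to compare.
CANDIDATES = {
    "split/join": lambda s: "".join(s.split()),
    "translate": lambda s: s.translate(WS_TABLE),
    "re.sub": lambda s: WS_RE.sub("", s),
}


def compare(text, number=100_000):
    """Return {name: seconds} after checking that all approaches agree on text."""
    results = {name: func(text) for name, func in CANDIDATES.items()}
    assert len(set(results.values())) == 1, results
    return {
        name: timeit(lambda: func(text), number=number)
        for name, func in CANDIDATES.items()
    }


if __name__ == "__main__":
    for name, seconds in compare("strip my spaces").items():
        print(f"{name:12s}{seconds:.3f} s")
```

The agreement check is expected to hold for ASCII input only: with Unicode whitespace in the text, the ASCII-only translation table gives a different result from the other two approaches.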

- jferard

3

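Try a regular expression with re.sub. You can search for all whitespace and replace it with an empty string.

\s in your pattern will match whitespace characters in general (tabs, newlines, etc.), not just a space. You can read more about it in the manual.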
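
For readers unfamiliar with regular expressions, a minimal sketch of this suggestion could look like the following. The function name strip_whitespace is made up for illustration.

```python
import re


def strip_whitespace(text):
    """Delete every run of characters that \\s matches (spaces, tabs, newlines, ...)."""
    return re.sub(r"\s+", "", text)


print(strip_whitespace("strip my spaces"))      # stripmyspaces
print(strip_whitespace("tabs\tand\nnewlines"))  # tabsandnewlines
```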

- Matthew Iselin

I don't know how to use regular expressions :( - wrongusername

@wrongusername: Updated with a link to the re module manual page. - Matthew Iselin

2

Split the string into separate words
Strip whitespace from both sides of each word
Join them with a single space

Final line of code:

' '.join(word.strip() for word in message_text.split())
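
A runnable version of these steps might look like the sketch below, where message_text is given a sample value as an assumption. Note that this approach collapses runs of whitespace into single spaces rather than removing whitespace entirely.

```python
message_text = "  strip   my \t spaces \n"  # sample value, assumed for illustration

# split() with no arguments already discards the surrounding whitespace,
# so word.strip() is a harmless extra step.
normalized = " ".join(word.strip() for word in message_text.split())
print(normalized)  # strip my spaces

# Joining with an empty string instead removes the whitespace completely.
removed = "".join(message_text.split())
print(removed)  # stripmyspaces
```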

- aleveha

A parenthesis is missing. - Ben Law

- Roger Pate · Accepted Answer

Take advantage of the behavior of str.split when it is called with no sep parameter:

>>> s = " \t foo \n bar "
>>> "".join(s.split())
'foobar'
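
The difference between calling str.split with and without a separator can be seen in a short sketch (the helper name remove_whitespace is an assumption made for the example):

```python
def remove_whitespace(text):
    """Drop all whitespace by splitting on whitespace runs and rejoining."""
    return "".join(text.split())


s = " \t foo \n bar "

# With no separator, runs of whitespace are treated as a single delimiter
# and leading/trailing whitespace produces no empty strings.
print(s.split())      # ['foo', 'bar']

# With an explicit separator, only that exact string is a delimiter,
# so tabs and newlines stay in the pieces and empty strings appear.
print(s.split(" "))   # ['', '\t', 'foo', '\n', 'bar', '']

print(remove_whitespace(s))  # foobar
```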

If you only want to remove spaces rather than all whitespace:

>>> s.replace(" ", "")
'\tfoo\nbar'

Premature optimization

Even though the primary goal is clear code rather than efficiency, here are some initial timings:

$ python -m timeit '"".join(" \t foo \n bar ".split())'
1000000 loops, best of 3: 1.38 usec per loop
$ python -m timeit -s 'import re' 're.sub(r"\s+", "", " \t foo \n bar ")'
100000 loops, best of 3: 15.6 usec per loop

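Note that the regular expression is cached, so it is not as slow as you might imagine. Compiling it beforehand helps somewhat, but it only matters in practice if you call it many times:

$ python -m timeit -s 'import re; e = re.compile(r"\s+")' 'e.sub("", " \t foo \n bar ")'
100000 loops, best of 3: 7.76 usec per loop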

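Even though re.sub is slower, remember that your program's bottleneck is almost certainly elsewhere. Most programs would not notice the difference between any of these 3 choices.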