在Python2.7中去除字符串中类似于Unicode \u2026 的字符

42

我有一个Python2.7的字符串,像这样:

 This is some \u03c0 text that has to be cleaned\u2026! it\u0027s annoying!

如何将它转换成这样:

This is some text that has to be cleaned! its annoying!

2
你想根据什么筛选字符?你只想保留 ASCII 吗? - root
@root,是的,我只想保留ASCII。 - Sandeep Raju Prabhakar
1个回答

89

Python 2.x

>>> s
'This is some \\u03c0 text that has to be cleaned\\u2026! it\\u0027s annoying!'
>>> print(s.decode('unicode_escape').encode('ascii','ignore'))
This is some  text that has to be cleaned! it's annoying!

Python 3.x

>>> s = 'This is some \u03c0 text that has to be cleaned\u2026! it\u0027s annoying!'
>>> s.encode('ascii', 'ignore')
b"This is some  text that has to be cleaned! it's annoying!"

2
这就是它的打印方式。 - Burhan Khalid
你为什么写 \\ 而不是 \,使用 \ 输出的结果不是和你的代码一样吗? - zhangxaochen
4
这也会去除一些字符,例如 ü, ä, ö 等,但大多数情况下这是不希望的。 - Floran Gmehlin
1
请阅读 此文档此文档 - Burhan Khalid
这对我不起作用...字符串完全相同。 - coderboi
显示剩余2条评论

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接