Regex to find a substring and replace a character, updating the whole string

Question

3

From

string= this is, not good "type of ,question" to ask, on stackoverflow

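I want to extract the substring "type of ,question" and replace the ',' in it with ' '.

Using re.findall() I get a list of the characters between the " ", while re.search() returns a match object.

Using re.sub() I can replace every ',', but I only want to change the comma inside the double-quoted substring and keep the commas outside it.

Can anyone help me with this?

Thanks in advance!!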

- dumb_coder

It sounds like you have already tried re.findall, re.search, and re.sub, is that right? Please share the code for each attempt. - Kevin

Output: this is, not good "type of question" to ask, on stackoverflow - dumb_coder

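What should happen if the string contains more than two quotes? What if the number of quotes is odd? What if there are two "real" quotes with an escaped quote between them? - Kevin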

    if '""' in new_li1:
        print('still needs processing', new_li1)
        sub_string = re.search(r'\""(.*?)\""', new_li1)
        print(sub_string)

- dumb_coder

1

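If the only data your code needs to handle is the example you provided, then all you need is result = 'this is, not good "type of question" to ask, on stackoverflow'. If you are thinking "very funny, I actually need it to handle a variety of inputs", that is exactly why I asked those clarifying questions :-) - Kevin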

5 Answers

2

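Another, more flexible approach is to do it in two steps:

Find every match enclosed in double quotes,
then find and replace the ',' inside each match.

Example: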

import re

string = 'this is, not good "type of ,question" to ask, on stackoverflow'

# define a pattern that matches each double-quoted span, quotes included
pat = re.compile(r'"[^"]+"')

# re.sub over the quoted spans and remove the ',' inside each match
string = pat.sub(lambda x: x.group(0).replace(',', ''), string)

# 'this is, not good "type of question" to ask, on stackoverflow'

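The flexibility here is that it replaces any number of ',' inside the quotes, and once every double-quoted match has been found you can make other changes to it as well.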
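A minimal sketch of the same two-step idea wrapped in a reusable function. The helper name `strip_commas_in_quotes`, its `replacement` parameter, and the sample input are illustrative assumptions, not part of the original answer. It shows every comma inside each quoted span being changed while the commas outside stay put.

```python
import re

# Matches a double-quoted span with at least one character between the quotes.
QUOTED = re.compile(r'"[^"]+"')

def strip_commas_in_quotes(text, replacement=''):
    """Replace commas only inside double-quoted spans (hypothetical helper)."""
    return QUOTED.sub(lambda m: m.group(0).replace(',', replacement), text)

sample = 'a, b "c, d, e" f, "g,h" i'
result = strip_commas_in_quotes(sample)
print(result)
```

Passing `replacement=' '` would swap each quoted comma for a space instead of deleting it, which is what the question literally asks for.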

- r.ook

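This should be higher up. My solution does not account for multiple ',' inside a single pair of quotes, which was an important detail I overlooked. - Rocky Li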

1

How about combining split() and replace()?

s = 'this is, not good "type of ,question" to ask, on stackoverflow'

splitted = s.split('"')
print(s.replace(splitted[1], splitted[1].replace(',', '')))

# this is, not good "type of question" to ask, on stackoverflow

Note: this works in this case, but it fails when exactly the same text also appears outside the double quotes.
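One possible way around that limitation, sketched under the assumption that the double quotes are balanced and never escaped: after `split('"')`, the pieces at odd indices are the ones that sat between a pair of quotes, so only those are edited before rejoining. The helper name `remove_commas_inside_quotes` and the sample string are made up for illustration.

```python
def remove_commas_inside_quotes(text):
    """Assumes balanced, unescaped double quotes (hypothetical helper)."""
    parts = text.split('"')
    # Pieces at odd indices were enclosed by a pair of quotes.
    for i in range(1, len(parts), 2):
        parts[i] = parts[i].replace(',', '')
    return '"'.join(parts)

# The quoted text 'of ,q' also appears outside the quotes and must stay untouched.
s = 'of ,q, this "of ,q" stays, fine'
fixed = remove_commas_inside_quotes(s)
print(fixed)
```

Because each piece is edited by position rather than by searching for its text, a duplicate of the quoted text elsewhere in the string is not affected.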

- Austin

1

How about this?

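import re

b = """ "hello, howdy". sample text, text then comes "Another, double, quotes" """

for str_match in re.findall(r"\".*?\"", b):
    # escape the matched text so it is treated literally when reused as a pattern
    b = re.sub(re.escape(str_match), re.sub(r",", " ", str_match), b)

print(b)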

Output: "hello  howdy". sample text, text then comes "Another  double  quotes"

- Akhilesha

0

I'm not entirely sure this meets all of your requirements, but for the template you provided, the following returns what you are after.

result = re.sub('("(?:[^"])*),((?:[^"])*")', r"\1 \2", string)

- Daniel F

Rocky Li's answer is more concise, yet powerful. - Daniel F


- Rocky Li · Accepted Answer

Use regex capture groups:

import re
s= 'this is, not good "type of ,question" to ask, on stackoverflow'
re.sub(r'(".*?),(.*?")', r'\1\2', s)

Output:

'this is, not good "type of question" to ask, on stackoverflow'

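Explanation: in a regex, (stuff) denotes a capture group. In the replacement, \1 and \2 stand for the parts of the quoted text before and after the comma, so the comma itself is dropped. Note that this also works for multiple quoted substrings in a single string.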
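A quick check of where the capture-group pattern holds up, offered as an illustration rather than as part of the original answer; the sample strings are made up. It appears to handle several quoted substrings as long as each one contains exactly one comma. A second comma inside the same quotes is left alone, and a quoted substring with no comma can cause a comma outside the quotes to be removed instead.

```python
import re

pattern = re.compile(r'(".*?),(.*?")')

# One comma in each quoted substring: both are handled.
one_each = pattern.sub(r'\1\2', '"a,b" x, y "c,d"')

# Two commas inside one quoted substring: only the first is removed.
two_inside = pattern.sub(r'\1\2', 'x "a,b,c" y')

# A quoted substring without a comma: the match runs past its closing quote,
# so the comma outside the quotes is the one that disappears.
no_comma = pattern.sub(r'\1\2', 'a "b" c, d "e,f"')

print(one_each, two_inside, no_comma, sep='\n')
```

For inputs like the last two, a pattern that cannot cross a quote boundary, such as the `"[^"]+"` approach with a replacement function, is likely the safer choice.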