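I need to replace text such as "north", "south", etc. with "N", "S", etc. in address fields. I thought of using a dictionary to hold the replacements. Suppose we have:
replacements = {'NORTH':'N','SOUTH':'S','EAST':'E','WEST':'W'}
address = "123 north anywhere street"
Can I use the replacements dictionary to do all the replacements, for example by iterating over it? What would the code for this look like?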
address = "123 north anywhere street"
for word, initial in {"NORTH":"N", "SOUTH":"S" }.items():
    address = address.replace(word.lower(), initial)
print(address)
OK, concise and readable. I saw the same approach in import xml.sax.saxutils as su; print(inspect.getsource(su.escape)), which leads us to print(inspect.getsource(su.__dict_replace)). - C8H10N4O2
You're close; it is actually:
dictionary = {"NORTH":"N", "SOUTH":"S" }
for key in dictionary.iterkeys():
    address = address.upper().replace(key, dictionary[key])
Note: Python 3 users should use .keys() instead of .iterkeys():
dictionary = {"NORTH":"N", "SOUTH":"S" }
for key in dictionary.keys():
    address = address.upper().replace(key, dictionary[key])
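address.upper().replace(...) does not modify anything in place; it only returns a value, and that value is not assigned to any variable. - Enrico Borba
for key, value in dictionary.items() iterates over the dictionary's keys and values at the same time. I don't know whether it has a performance advantage, but I think it is more Pythonic. - gionni
If the text is Do you like café? No, I prefer tea. and you run .replace("café", "tea") followed by .replace("tea", "café"), you get Do you like café? No, I prefer café. If the replacement were done in a single pass, "café" would become "tea" and would not be changed back to "café". See, for example, this question: https://dev59.com/tW025IYBdhLWcg3wRDm8#15221068 - mouwsy
One option I don't think anyone has suggested yet is to build a regular expression containing all of the keys and then do a single substitution on the string: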
>>> import re
>>> l = {'NORTH':'N','SOUTH':'S','EAST':'E','WEST':'W'}
>>> pattern = '|'.join(sorted(re.escape(k) for k in l))
>>> address = "123 north anywhere street"
>>> re.sub(pattern, lambda m: l.get(m.group(0).upper()), address, flags=re.IGNORECASE)
'123 N anywhere street'
>>>
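The advantage of this approach is that the regular expression can ignore the case of the input string without modifying it.
If you want to operate only on complete words, you can do that too with a simple change to the pattern: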
>>> pattern = r'\b({})\b'.format('|'.join(sorted(re.escape(k) for k in l)))
>>> address2 = "123 north anywhere southstreet"
>>> re.sub(pattern, lambda m: l.get(m.group(0).upper()), address2, flags=re.IGNORECASE)
'123 N anywhere southstreet'
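The single-pass idea also sidesteps the chained-replacement problem raised in the comments (the café/tea swap). Below is a minimal sketch under my own assumptions: the helper name replace_all and its ignore_case parameter are hypothetical and not from any answer here, and sorting the keys longest-first is my addition so that a longer key wins over a shorter key that is its prefix.

```python
import re

def replace_all(text, mapping, ignore_case=False):
    """Replace every key of `mapping` found in `text` in one pass (hypothetical helper)."""
    if not mapping:
        return text
    flags = re.IGNORECASE if ignore_case else 0
    # Longest keys first, so a longer key is tried before a shorter key that is its prefix.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys), flags)
    # When ignoring case, look matches up by their lower-cased form.
    lookup = {k.lower(): v for k, v in mapping.items()} if ignore_case else mapping

    def _substitute(match):
        found = match.group(0)
        return lookup[found.lower() if ignore_case else found]

    return pattern.sub(_substitute, text)

swapped = replace_all("Do you like café? No, I prefer tea.",
                      {"café": "tea", "tea": "café"})
print(swapped)  # Do you like tea? No, I prefer café.
```

Because each position in the input is matched at most once, text produced by one replacement is never scanned again by another.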
Translating strings with a dictionary is a very common need. I suggest keeping the following function in your toolbox:
def translate(text, conversion_dict, before=None):
    """
    Translate words from a text using a conversion dictionary
    Arguments:
        text: the text to be translated
        conversion_dict: the conversion dictionary
        before: a function to transform the input
        (by default it will convert to lowercase)
    """
    # if empty:
    if not text: return text
    # preliminary transformation:
    before = before or str.lower
    t = before(text)
    for key, value in conversion_dict.items():
        t = t.replace(key, value)
    return t
>>> a = {'hello':'bonjour', 'world':'tout-le-monde'}
>>> translate('hello world', a)
'bonjour tout-le-monde'
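One thing worth noting about the before hook: the default lower-cases the text, so a dictionary with upper-case keys (like the one in the question) will not match unless you pass a different transformation. A small sketch, with the function repeated so the snippet runs on its own; the variable name directions is mine.

```python
def translate(text, conversion_dict, before=None):
    # Same logic as the function above, repeated so this snippet is self-contained.
    if not text:
        return text
    before = before or str.lower
    t = before(text)
    for key, value in conversion_dict.items():
        t = t.replace(key, value)
    return t

directions = {'NORTH': 'N', 'SOUTH': 'S', 'EAST': 'E', 'WEST': 'W'}

# Upper-case keys need an upper-casing `before` function.
upper_result = translate("123 north anywhere street", directions, before=str.upper)
# With the default (str.lower) the upper-case keys are never found.
default_result = translate("123 north anywhere street", directions)

print(upper_result)    # 123 N ANYWHERE STREET
print(default_result)  # 123 north anywhere street
```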
You may be looking for iteritems():
d = {'NORTH':'N','SOUTH':'S','EAST':'E','WEST':'W'}
address = "123 north anywhere street"
# Python 2; in Python 3 use d.items()
for k,v in d.iteritems():
    address = address.upper().replace(k, v)
address is now '123 N ANYWHERE STREET'
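OK, if you want to preserve case, whitespace, and nested words (for example, Southstreet should not be converted to Sstreet), consider this simple list comprehension: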
import re
l = {'NORTH':'N','SOUTH':'S','EAST':'E','WEST':'W'}
address = "North 123 East Anywhere Southstreet West"
new_address = ''.join(l[p.upper()] if p.upper() in l else p for p in re.split(r'(\W+)', address))
The new address is now
N 123 E Anywhere Southstreet W
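The same split-and-rejoin idea can be wrapped in a reusable function. This is only a sketch: the name abbreviate_words is hypothetical, and it assumes the dictionary keys are upper-case, as in the example above.

```python
import re

def abbreviate_words(text, mapping):
    """Replace whole words found in `mapping` (upper-case keys); keep everything else."""
    # The capturing group makes re.split keep the separators in the result,
    # so joining the parts restores the original spacing and punctuation.
    parts = re.split(r'(\W+)', text)
    return ''.join(mapping.get(part.upper(), part) for part in parts)

directions = {'NORTH': 'N', 'SOUTH': 'S', 'EAST': 'E', 'WEST': 'W'}
result = abbreviate_words("North 123 East Anywhere Southstreet West", directions)
print(result)  # N 123 E Anywhere Southstreet W
```

Using mapping.get(part.upper(), part) does one lookup per piece and falls back to the original piece, which is what keeps Southstreet untouched.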
from functools import reduce
str_to_replace = "The string for replacement."
replacement_dict = {"The ": "A new ", "for ": "after "}
str_replaced = reduce(lambda x, y: x.replace(*y), [str_to_replace, *list(replacement_dict.items())])
print(str_replaced)
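For readers unfamiliar with reduce, here is roughly what the one-liner above does, written with reduce's initial-value argument and next to the equivalent explicit loop. The variable names are mine.

```python
from functools import reduce

text = "The string for replacement."
replacement_dict = {"The ": "A new ", "for ": "after "}

# reduce passes the running string and the next (old, new) pair to the lambda;
# `text` is the initial value, so no list needs to be built first.
via_reduce = reduce(lambda s, pair: s.replace(*pair), replacement_dict.items(), text)

# The equivalent explicit loop.
via_loop = text
for old, new in replacement_dict.items():
    via_loop = via_loop.replace(old, new)

print(via_reduce)  # A new string after replacement.
```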
import json
import re
with open('filePath.txt') as f:
    data = f.read()
with open('filePath.json') as f:
    glossar = json.load(f)
for word, initial in glossar.items():
    data = re.sub(r'\b' + word + r'\b', initial, data)
print(data)
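Note that the loop above puts each key into the pattern as-is, so a key containing regex metacharacters (such as a dot) is treated as a pattern rather than as literal text. If the keys are meant literally, a sketch like the following may be safer. The function name apply_glossary and the in-memory glossary are assumptions for illustration; they stand in for the files read above.

```python
import re

def apply_glossary(data, glossary):
    """Replace whole-word occurrences of each glossary key, treating keys as literal text."""
    for word, replacement in glossary.items():
        # re.escape keeps characters such as '.' in a key from acting as regex syntax.
        pattern = r'\b' + re.escape(word) + r'\b'
        # A function replacement avoids backslash processing in the replacement text.
        data = re.sub(pattern, lambda m, r=replacement: r, data)
    return data

glossary = {"north": "N", "a.m": "AM"}  # hypothetical sample data
converted = apply_glossary("north of northern, 9 a.m vs axm", glossary)
print(converted)  # N of northern, 9 AM vs axm
```

Without re.escape, the key a.m would also match axm, because an unescaped dot matches any character.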
def replace_values_in_string(text, args_dict):
    for key in args_dict.keys():
        text = text.replace(key, str(args_dict[key]))
    return text
import re
l = {'NORTH':'N','SOUTH':'S','EAST':'E','WEST':'W'}
address = "123 north anywhere street"
# items() works in both Python 2 and Python 3 (iteritems() is Python 2 only)
for k, v in l.items():
    t = re.compile(re.escape(k), re.IGNORECASE)
    address = t.sub(v, address)
print(address)
Neither replace() nor format() is very precise:
data = '{content} {address}'
for k,v in {"{content}":"some {address}", "{address}":"New York" }.items():
    data = data.replace(k,v)
# results: some New York New York
'{ {content} {address}'.format(**{'content':'str1', 'address':'str2'})
# results: ValueError: unexpected '{' in field name
If you need more precise control, it is better to use re.sub() for the translation:
import re
def translate(text, kw, ignore_case=False):
    search_keys = map(lambda x:re.escape(x), kw.keys())
    if ignore_case:
        kw = {k.lower():kw[k] for k in kw}
        regex = re.compile('|'.join(search_keys), re.IGNORECASE)
        res = regex.sub( lambda m:kw[m.group().lower()], text)
    else:
        regex = re.compile('|'.join(search_keys))
        res = regex.sub( lambda m:kw[m.group()], text)
    return res
#'score: 99.5% name:%(name)s' %{'name':'foo'}
res = translate( 'score: 99.5% name:{name}', {'{name}':'foo'})
print(res)
res = translate( 'score: 99.5% name:{NAME}', {'{name}':'foo'}, ignore_case=True)
print(res)
The replace() method returns a copy of the string with the occurrences replaced; it does not replace in place. - martineau
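A quick illustration of that last point, since it trips people up in several of the snippets in this thread. The variable name unchanged is mine.

```python
address = "123 north anywhere street"

# The return value is discarded here, so `address` still holds the original text.
address.replace("north", "N")
unchanged = address

# Rebinding the name to the returned copy is what keeps the result.
address = address.replace("north", "N")

print(unchanged)  # 123 north anywhere street
print(address)    # 123 N anywhere street
```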