Partial string formatting

Question


180

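Is it possible to do partial string formatting with the advanced string formatting methods, similar to the safe_substitute() function of string templates?

For example: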

s = '{foo} {bar}'
s.format(foo='FOO') #Problem: raises KeyError 'bar'

- P3trus

3

In my opinion, the best answer to this question is not here but in a similar (but closed) question: https://dev59.com/x2Qm5IYBdhLWcg3w8iuD#17215533. - mmj

25 Answers

164

If you know the order in which you will be formatting things:

s = '{foo} {{bar}}'

Use it like this:

ss = s.format(foo='FOO')
print(ss)
# FOO {bar}

print(ss.format(bar='BAR'))
# FOO BAR

You can't specify foo and bar at the same time - you have to do it sequentially.

- aaren

1

What's the point of this? If I specify both foo and bar, s.format(foo='FOO', bar='BAR'), I still get 'FOO {bar}' no matter what. Could you clarify? - n611x007

12

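It's annoying that you can't fill in both at once. This is useful when you have to format your string in stages and you know the order of those stages. - aaren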

1

You should probably design your way out of having to do this, but maybe you're forced to. - aaren

2

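Didn't know about this. I've had several use cases where I wanted to "prime" a string as a mini template. - ejrb

This is very useful when you populate part of a string in one part of your code but leave a placeholder to be filled in later by another part of your code. - Alex Petralia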

73

You can trick it into partial formatting by overriding the mapping:

import string

class FormatDict(dict):
    def __missing__(self, key):
        return "{" + key + "}"

s = '{foo} {bar}'
formatter = string.Formatter()
mapping = FormatDict(foo='FOO')
print(formatter.vformat(s, (), mapping))

prints

FOO {bar}

Of course, this basic implementation only works correctly for basic cases.

- Sven Marnach

11

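This doesn't work for more advanced formats such as {bar:1.2f}. - MaxNoe

I understand the statement that "the most basic implementation only works correctly for basic cases", but is there a way to extend it so that it at least doesn't drop the format spec? - Tadhg McDonald-Jensen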

5

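Yes, there is. Instead of returning a string in __missing__(), return an instance of a custom class that overrides __format__() so that it emits the original placeholder together with the format spec. Here is a proof of concept: http://ideone.com/xykV7R - Sven Marnach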
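A minimal sketch of the idea described in this comment, written independently of the linked proof of concept. The class names FormatPlaceholder and FormatDict are illustrative, and conversions such as !r or !s are not preserved by this approach because they are applied before __format__() is called.

```python
import string


class FormatPlaceholder:
    """Stands in for a missing value and re-emits its own placeholder."""

    def __init__(self, key):
        self.key = key

    def __format__(self, spec):
        # Rebuild "{key}" or "{key:spec}" so the format spec survives.
        if spec:
            return "{" + self.key + ":" + spec + "}"
        return "{" + self.key + "}"


class FormatDict(dict):
    def __missing__(self, key):
        # Called by dict lookup when the key is absent.
        return FormatPlaceholder(key)


formatter = string.Formatter()
mapping = FormatDict(foo="FOO")
result = formatter.vformat("{foo} {bar:1.2f}", (), mapping)
print(result)  # FOO {bar:1.2f}
print(result.format(bar=3.14159))  # FOO 3.14
```

The partially formatted string is still a valid template, so the remaining field can be filled in later with a plain str.format() call.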

2

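@norok2 This is an answer to a question asked in a comment, so I put the reply in a comment. The original question didn't actually include that requirement, and I generally think it's a bit odd to try to partially format a string. - Sven Marnach

ChatGPT copied the code above and wasted my time, while the answer is to use partial, as shown in another answer. - Leo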

63

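This limitation of .format() - the inability to do partial substitution - has been bugging me.

After evaluating writing a custom Formatter class (as described in many answers here) and even considering third-party packages such as lazy_format, I found a much simpler built-in solution: template strings.

They provide similar functionality and also offer partial substitution through the safe_substitute() method. Template strings need a $ prefix (which feels a bit odd, but I think the overall solution is better).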

import string
template = string.Template('${x} ${y}')
try:
    template.substitute({'x':1}) # raises KeyError
except KeyError:
    pass

# but the following raises no error
partial_str = template.safe_substitute({'x':1}) # no error

# partial_str now contains a string with partial substitution
partial_template = string.Template(partial_str)
substituted_str = partial_template.safe_substitute({'y':2}) # no error
print(substituted_str) # prints '1 2'

Based on this, here is a convenience wrapper:

class StringTemplate(object):
    def __init__(self, template):
        self.template = string.Template(template)
        self.partial_substituted_str = None

    def __repr__(self):
        return self.template.safe_substitute()

    def format(self, *args, **kws):
        self.partial_substituted_str = self.template.safe_substitute(*args, **kws)
        self.template = string.Template(self.partial_substituted_str)
        return self.__repr__()


>>> s = StringTemplate('${x}${y}')
>>> s
${x}${y}
>>> s.format(x=1)
'1${y}'
>>> s.format({'y':2})
'12'
>>> print(s)
12

A similar wrapper based on Sven's answer, which uses the default string formatting:

class StringTemplate(object):
    class FormatDict(dict):
        def __missing__(self, key):
            return "{" + key + "}"

    def __init__(self, template):
        self.substituted_str = template
        self.formatter = string.Formatter()

    def __repr__(self):
        return self.substituted_str

    def format(self, *args, **kwargs):
        mapping = StringTemplate.FormatDict(*args, **kwargs)
        self.substituted_str = self.formatter.vformat(self.substituted_str, (), mapping)
        # return the result, matching the Template-based wrapper above
        return self.substituted_str

- Mohan Raj

30

Not sure whether this is acceptable as a quick workaround, but how about

s = '{foo} {bar}'
s.format(foo='FOO', bar='{bar}')

? :)

- Memphis

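I did exactly the same thing, but I'd like to know whether there are any caveats to doing so. - ramgo

1

@ramgo One caveat: this will not work if the "secondary" placeholder uses certain format specifiers. For example, '{foo} {bar:3.6f}'.format(foo='FOO', bar='{bar}') fails with ValueError: Unknown format code 'f' for object of type 'str'. - 0x5453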
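The caveat in this comment can be reproduced in a few lines. The second half shows one possible workaround, which is an assumption on my part rather than something proposed in the thread: keep the field plain in the first pass and carry the format spec inside the replacement text.

```python
template = "{foo} {bar:3.6f}"

# Passing the bare placeholder back in fails: 'f' is not valid for a str.
try:
    template.format(foo="FOO", bar="{bar}")
    raised = False
except ValueError:
    raised = True
print(raised)  # True

# Possible workaround (an assumption, not from the answer): leave the spec
# out of the first-pass template and put it in the replacement text.
first_pass = "{foo} {bar}".format(foo="FOO", bar="{bar:3.6f}")
print(first_pass)  # FOO {bar:3.6f}
print(first_pass.format(bar=1.5))  # FOO 1.500000
```

This only helps when you control the template and know the spec at the time of the first pass.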

11

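If you define your own Formatter and override its get_value method, you can use it to map undefined field names to whatever you want:

For example, you could map bar to "{bar}" if bar isn't in kwargs.

However, this requires using the format() method of your Formatter object rather than the string's format() method. http://docs.python.org/library/string.html#string.Formatter.get_value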
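A minimal sketch of what overriding get_value could look like. The class name KeepMissingFormatter is hypothetical, and this version only handles missing named fields; it does not preserve format specs or conversions on the fields it leaves in place.

```python
import string


class KeepMissingFormatter(string.Formatter):
    """Hypothetical name: leaves unknown named fields as '{name}'."""

    def get_value(self, key, args, kwargs):
        # Only named fields are handled here; positional indexes
        # (integer keys) fall through to the default behaviour.
        if isinstance(key, str) and key not in kwargs:
            return "{" + key + "}"
        return super().get_value(key, args, kwargs)


fmt = KeepMissingFormatter()
partial_result = fmt.format("{foo} {bar}", foo="FOO")
print(partial_result)  # FOO {bar}
print(partial_result.format(bar="BAR"))  # FOO BAR
```

Note that the call goes through fmt.format(...), not through the string's own format() method.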

- Amber

This seems to be a Python >= 2.6 feature. - n611x007

9

>>> 'fd:{uid}:{{topic_id}}'.format(uid=123)
'fd:123:{topic_id}'

Try this.

- Pengfei.X

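Wow, exactly what I need! Could you explain it? - Sergey Chizhik

2

{{ and }} are a way of escaping the formatting markers, so format() does not perform a substitution there and instead replaces {{ and }} with { and } respectively. - 7yl4r

The problem with this solution is that doubled {{ }} only survives one format pass; if you need to apply more passes, you have to add more braces. For example, 'fd:{uid}:{{topic_id}}'.format(uid=123).format(a=1) raises an error because the second format call does not provide a value for topic_id. - Franzi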
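To illustrate the point about adding more braces per pass, here is a small sketch with three stages. The template and variable names are made up for the example.

```python
# Each extra formatting pass needs the braces doubled once more:
# 1 pair -> filled in pass 1, 2 pairs -> pass 2, 4 pairs -> pass 3.
template = "{a} {{b}} {{{{c}}}}"

stage1 = template.format(a=1)
stage2 = stage1.format(b=2)
stage3 = stage2.format(c=3)

print(stage1)  # 1 {b} {{c}}
print(stage2)  # 1 2 {c}
print(stage3)  # 1 2 3
```

The number of braces grows exponentially with the number of stages, which is why this approach is only practical for two or three passes in a known order.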

7

Thanks to Amber's comment, I came up with this:

import string

try:
    # Python 3
    from _string import formatter_field_name_split
except ImportError:
    formatter_field_name_split = str._formatter_field_name_split


class PartialFormatter(string.Formatter):
    def get_field(self, field_name, args, kwargs):
        try:
            val = super(PartialFormatter, self).get_field(field_name, args, kwargs)
        except (IndexError, KeyError, AttributeError):
            first, _ = formatter_field_name_split(field_name)
            val = '{' + field_name + '}', first
        return val

- gatto

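This seems to be a Python >= 2.6 feature. - n611x007

I'm definitely going to use this solution :) Thanks! - astrojuanlu

2

Note that this loses the conversion and format spec (if present), and it actually applies the format spec to the returned value. I.e. ({field!s: >4} becomes {field}). - Brendan Abel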
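The behaviour described in this comment can be observed with a Python 3 only restatement of the PartialFormatter from the answer. It relies on the same private CPython helper, _string.formatter_field_name_split, that the answer imports.

```python
import string
from _string import formatter_field_name_split  # private CPython helper, as in the answer


class PartialFormatter(string.Formatter):
    def get_field(self, field_name, args, kwargs):
        try:
            return super().get_field(field_name, args, kwargs)
        except (IndexError, KeyError, AttributeError):
            first, _ = formatter_field_name_split(field_name)
            return "{" + field_name + "}", first


fmt = PartialFormatter()

# Conversion and spec are dropped from the re-emitted placeholder.
dropped = fmt.format("{field!s: >4}")
print(repr(dropped))  # '{field}'

# The spec is applied to the placeholder text itself (padded to width 10).
padded = fmt.format("{field: >10}")
print(repr(padded))  # '   {field}'

# A numeric spec cannot be applied to the placeholder string at all.
try:
    fmt.format("{field:.2f}")
    numeric_spec_failed = False
except ValueError:
    numeric_spec_failed = True
print(numeric_spec_failed)  # True
```

So this formatter is safe for plain {name} fields, but fields carrying a conversion or spec either lose it or raise.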

6

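All the solutions I found seemed to have problems with more advanced spec or conversion options. @SvenMarnach's FormatPlaceholder is very clever, but it does not handle coercion correctly (e.g. {a!s:>2s}), because it calls the __str__ method (in this example) instead of __format__, so you lose any additional formatting.

Here is what I ended up with, along with some of its key features: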

sformat('The {} is {}', 'answer')
'The answer is {}'

sformat('The answer to {question!r} is {answer:0.2f}', answer=42)
'The answer to {question!r} is 42.00'

sformat('The {} to {} is {:0.{p}f}', 'answer', 'everything', p=4)
'The answer to everything is {:0.4f}'

Provides an interface similar to str.format (not just a mapping)
Supports more complex formatting options:
- coercion {k!s} {!r}
- nesting {k:>{size}}
- getattr {k.foo}
- getitem {k[0]}
- coercion plus formatting {k!s:>{size}}

import string


class SparseFormatter(string.Formatter):
    """
    A modified string formatter that handles a sparse set of format
    args/kwargs.
    """

    # re-implemented this method for python2/3 compatibility
    def vformat(self, format_string, args, kwargs):
        used_args = set()
        result, _ = self._vformat(format_string, args, kwargs, used_args, 2)
        self.check_unused_args(used_args, args, kwargs)
        return result

    def _vformat(self, format_string, args, kwargs, used_args, recursion_depth,
                 auto_arg_index=0):
        if recursion_depth < 0:
            raise ValueError('Max string recursion exceeded')
        result = []
        for literal_text, field_name, format_spec, conversion in \
                self.parse(format_string):

            orig_field_name = field_name

            # output the literal text
            if literal_text:
                result.append(literal_text)

            # if there's a field, output it
            if field_name is not None:
                # this is some markup, find the object and do
                #  the formatting

                # handle arg indexing when empty field_names are given.
                if field_name == '':
                    if auto_arg_index is False:
                        raise ValueError('cannot switch from manual field '
                                         'specification to automatic field '
                                         'numbering')
                    field_name = str(auto_arg_index)
                    auto_arg_index += 1
                elif field_name.isdigit():
                    if auto_arg_index:
                        raise ValueError('cannot switch from manual field '
                                         'specification to automatic field '
                                         'numbering')
                    # disable auto arg incrementing, if it gets
                    # used later on, then an exception will be raised
                    auto_arg_index = False

                # given the field_name, find the object it references
                #  and the argument it came from
                try:
                    obj, arg_used = self.get_field(field_name, args, kwargs)
                except (IndexError, KeyError):
                    # catch issues with both arg indexing and kwarg key errors
                    obj = orig_field_name
                    if conversion:
                        obj += '!{}'.format(conversion)
                    if format_spec:
                        format_spec, auto_arg_index = self._vformat(
                            format_spec, args, kwargs, used_args,
                            recursion_depth, auto_arg_index=auto_arg_index)
                        obj += ':{}'.format(format_spec)
                    result.append('{' + obj + '}')
                else:
                    used_args.add(arg_used)

                    # do any conversion on the resulting object
                    obj = self.convert_field(obj, conversion)

                    # expand the format spec, if needed
                    format_spec, auto_arg_index = self._vformat(
                        format_spec, args, kwargs,
                        used_args, recursion_depth-1,
                        auto_arg_index=auto_arg_index)

                    # format the object and append to the result
                    result.append(self.format_field(obj, format_spec))

        return ''.join(result), auto_arg_index


def sformat(s, *args, **kwargs):
    # type: (str, *Any, **Any) -> str
    """
    Sparse format a string.

    Parameters
    ----------
    s : str
    args : *Any
    kwargs : **Any

    Examples
    --------
    >>> sformat('The {} is {}', 'answer')
    'The answer is {}'

    >>> sformat('The answer to {question!r} is {answer:0.2f}', answer=42)
    'The answer to {question!r} is 42.00'

    >>> sformat('The {} to {} is {:0.{p}f}', 'answer', 'everything', p=4)
    'The answer to everything is {:0.4f}'

    Returns
    -------
    str
    """
    return SparseFormatter().format(s, *args, **kwargs)

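I discovered the problems with the various implementations after writing some tests for how I wanted this method to behave. The tests are below in case anyone finds them useful.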

import pytest


def test_auto_indexing():
    # test basic arg auto-indexing
    assert sformat('{}{}', 4, 2) == '42'
    assert sformat('{}{} {}', 4, 2) == '42 {}'


def test_manual_indexing():
    # test basic arg indexing
    assert sformat('{0}{1} is not {1} or {0}', 4, 2) == '42 is not 2 or 4'
    assert sformat('{0}{1} is {3} {1} or {0}', 4, 2) == '42 is {3} 2 or 4'


def test_mixing_manualauto_fails():
    # test mixing manual and auto args raises
    with pytest.raises(ValueError):
        assert sformat('{!r} is {0}{1}', 4, 2)


def test_kwargs():
    # test basic kwarg
    assert sformat('{base}{n}', base=4, n=2) == '42'
    assert sformat('{base}{n}', base=4, n=2, extra='foo') == '42'
    assert sformat('{base}{n} {key}', base=4, n=2) == '42 {key}'


def test_args_and_kwargs():
    # test mixing args/kwargs with leftovers
    assert sformat('{}{k} {v}', 4, k=2) == '42 {v}'

    # test mixing with leftovers
    r = sformat('{}{} is the {k} to {!r}', 4, 2, k='answer')
    assert r == '42 is the answer to {!r}'


def test_coercion():
    # test coercion is preserved for skipped elements
    assert sformat('{!r} {k!r}', '42') == "'42' {k!r}"


def test_nesting():
    # test nesting works with or with out parent keys
    assert sformat('{k:>{size}}', k=42, size=3) == ' 42'
    assert sformat('{k:>{size}}', size=3) == '{k:>3}'


@pytest.mark.parametrize(
    ('s', 'expected'),
    [
        ('{a} {b}', '1 2.0'),
        ('{z} {y}', '{z} {y}'),
        ('{a} {a:2d} {a:04d} {y:2d} {z:04d}', '1  1 0001 {y:2d} {z:04d}'),
        ('{a!s} {z!s} {d!r}', '1 {z!s} {\'k\': \'v\'}'),
        ('{a!s:>2s} {z!s:>2s}', ' 1 {z!s:>2s}'),
        ('{a!s:>{a}s} {z!s:>{z}s}', '1 {z!s:>{z}s}'),
        ('{a.imag} {z.y}', '0 {z.y}'),
        ('{e[0]:03d} {z[0]:03d}', '042 {z[0]:03d}'),
    ],
    ids=[
        'normal',
        'none',
        'formatting',
        'coercion',
        'formatting+coercion',
        'nesting',
        'getattr',
        'getitem',
    ]
)
def test_sformat(s, expected):
    # test a bunch of random stuff
    data = dict(
        a=1,
        b=2.0,
        c='3',
        d={'k': 'v'},
        e=[42],
    )
    assert expected == sformat(s, **data)

- Sam Bourne

I added an answer that is similar to @SvenMarnach's code but handles the coercion cases in your tests correctly. - Tohiko

3

For me, this was good enough:

>>> ss = 'dfassf {} dfasfae efaef {} fds'
>>> nn = ss.format('f1', '{}')
>>> nn
'dfassf f1 dfasfae efaef {} fds'
>>> n2 = nn.format('whoa')
>>> n2
'dfassf f1 dfasfae efaef whoa fds'

- utilizator


- Saikiran Yerram · Accepted Answer

165

You can use the partial function from functools, which is short, very readable, and also makes the coder's intention clear:

from functools import partial

s = partial("{foo} {bar}".format, foo="FOO")
print s(bar="BAR")
# FOO BAR

- Saikiran Yerram

4

Not only the shortest and most readable solution, but it also describes the coder's intention. Python 3 version:

from functools import partial
s = "{foo} {bar}".format
s_foo = partial(s, foo="FOO")
print(s_foo(bar="BAR")) # FOO BAR
print(s(foo="FOO", bar="BAR")) # FOO BAR

- Paul Brown

17

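Hmm, I'm not sure this is exactly what most people are looking for. partial() won't help me if I need to do some processing on the partially formatted string (i.e. "FOO {bar}"). - Delgan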

2

This is the way to go when you have input that you don't 100% control. Consider "{foo} {{bar}}".format(foo="{bar}").format(bar="123"), built from the other examples. I would expect "{bar} 123", but they output "123 123". - Benjamin Manns
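The difference described in this comment can be checked directly. The sketch below contrasts two-pass formatting with functools.partial; the variable names are illustrative.

```python
from functools import partial

# Two-pass formatting: the value substituted in the first pass is
# re-scanned as a template in the second pass.
two_pass = "{foo} {{bar}}".format(foo="{bar}").format(bar="123")
print(two_pass)  # 123 123

# partial() defers to a single format() call, so values are never re-scanned.
fill_bar = partial("{foo} {bar}".format, foo="{bar}")
single_pass = fill_bar(bar="123")
print(single_pass)  # {bar} 123
```

The trade-off is that fill_bar is a callable rather than a string, so there is no intermediate "partially formatted" text to inspect or store.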

1

When you use partial and set the partial arguments, it does not return a string. - TomSawyer

Wow, this is awesome. - VimNing