Format strings vs. string concatenation

Question

52

I see a lot of people using format strings like this:

root = "sample"
output = "output"
path = "{}/{}".format(root, output)

instead of simply concatenating strings like this:

path = root + '/' + output

Do format strings perform better, or is this just for looks?

- wjk2a1

I would guess they perform worse. But in places that are not performance-critical, I still prefer format strings. - user319799

3

Possible duplicate of "Python string formatting: % vs concatenation". - David Ferenczy Rogožan

7 Answers

38

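I think formatting is mostly about readability, but since f-strings were introduced in 3.6, the picture has changed as far as performance goes. In my opinion f-strings are also more readable and maintainable, because they can be read left to right like most ordinary text, and because they avoid the spacing pitfalls of concatenation, since the variables sit inside the string.

Running this code: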

from timeit import timeit

runs = 1000000


def print_results(time, start_string):
    print(f'{start_string}\n'
          f'Total: {time:.4f}s\n'
          f'Avg: {(time/runs)*1000000000:.4f}ns\n')


t1 = timeit('"%s, %s" % (greeting, loc)',
            setup='greeting="hello";loc="world"',
            number=runs)
t2 = timeit('f"{greeting}, {loc}"',
            setup='greeting="hello";loc="world"',
            number=runs)
t3 = timeit('greeting + ", " + loc',
            setup='greeting="hello";loc="world"',
            number=runs)
t4 = timeit('"{}, {}".format(greeting, loc)',
            setup='greeting="hello";loc="world"',
            number=runs)

print_results(t1, '% replacement')
print_results(t2, 'f strings')
print_results(t3, 'concatenation')
print_results(t4, '.format method')

On my machine this gives the following results:

% replacement
Total: 0.3044s
Avg: 304.3638ns

f strings
Total: 0.0991s
Avg: 99.0777ns

concatenation
Total: 0.1252s
Avg: 125.2442ns

.format method
Total: 0.3483s
Avg: 348.2690ns

A similar answer was given for a different question.

- Eric Ed Lohmar

2

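I think f-strings have three things going for them: readability, brevity (or ease of writing), and being the best optimized. I'm curious what other developers think. - Positive Navid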

14

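As with most things, there will be a performance difference, but ask yourself: "Does it really matter if this is a few nanoseconds slower?" The root + '/' + output approach is quick and easy to type. But it can get hard to read very quickly once you have several variables to print.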

foo = "X = " + myX + " | Y = " + someY + " | Z = " + str(Z)

vs.

foo = "X = {} | Y = {} | Z = {}".format(myX, someY, Z)

Which one makes it easier to see what is going on? Unless you really need the performance gain, choose whichever is easiest for people to read and understand.

- FuriousGeorge

13

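It's not just for "looks", or for powerful lexical type conversions; it's also a must for internationalisation.

You can swap out the format string depending on the language that is selected.

With a long chain of string concatenations baked into the source code, this becomes very hard to do properly.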
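
A minimal sketch of what swapping the format string per language could look like. The `TEMPLATES` table and the `describe` helper are hypothetical names made up for illustration; a real project would more likely load its templates from translation files. Named placeholders let each language put the values in its own word order, which a fixed chain of `+` cannot do.

```python
# Hypothetical per-language templates (illustration only).
TEMPLATES = {
    "en": "{count} files in {folder}",
    "de": "Im Ordner {folder} liegen {count} Dateien",
}


def describe(lang, folder, count):
    """Pick the template for `lang` and fill in the named placeholders."""
    return TEMPLATES[lang].format(folder=folder, count=count)


print(describe("en", "sample", 3))
print(describe("de", "sample", 3))
```

Note that the calling code stays the same for every language; only the template changes.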

- Lightness Races in Orbit

10

As of Python 3.6, you can do literal string interpolation by prefixing the string with f:

foo = "foo"
bar = "bar"
path = f"{foo}/{bar}"
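
As a small extension of the snippet above, the braces of an f-string can hold arbitrary expressions and an optional format spec. The variable names below (`runs`, `upper`, `padded`) are only for illustration.

```python
root = "sample"
output = "output"
runs = 3

path = f"{root}/{output}"                    # plain interpolation
upper = f"{root.upper()}/{output}"           # any expression is allowed
padded = f"{root}/{output}_{runs:03d}.txt"   # format spec: zero-pad to 3 digits

print(path, upper, padded)
```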

- Cyzanfar

9

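String formatting is not tied to the data type of the values being bound, whereas with concatenation we have to cast or convert the data as needed.

For example: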

a = 10
b = "foo"
c = str(a) + " " + b
print(c)
> 10 foo

The same can be done with string formatting:

a = 10
b = "foo"
c = "{} {}".format(a, b)
print(c)
> 10 foo

In this case, the two placeholders {} {} stand for the two values that follow, namely a and b.

- Muhammad Yaseen Khan

1

This is why I use .format: it avoids errors like "cannot concatenate 'str' and 'int' objects". - Bert

1

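@Bert Sometimes that is a drawback: you may want the TypeError from the concatenation to propagate. So if you use .format, you need to add some extra type checking beforehand. I think it is best decided case by case. I would probably use both. - edanfalls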
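
A rough sketch of the trade-off discussed in these comments, assuming Python 3. The helper `strict_label` is a hypothetical name, shown only as one possible way to keep `.format` while still failing loudly on an unexpected type.

```python
count = 10  # an int where a str might have been expected

# Concatenation refuses to mix str and int, so the mistake surfaces here.
try:
    label = "items: " + count
except TypeError as exc:
    print("concatenation raised:", exc)

# .format converts the int silently, so no error is raised.
label = "items: {}".format(count)
print(label)


def strict_label(value):
    """Hypothetical helper: use .format but reject non-str input explicitly."""
    if not isinstance(value, str):
        raise TypeError(f"expected str, got {type(value).__name__}")
    return "items: {}".format(value)
```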

3

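It's for looks and for code maintenance. Editing your code is much easier if you use format. Also, with + you can miss details such as spaces. Using format is good for you and for anyone else who may maintain the code.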
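
To illustrate the point about missed spaces, here is a small sketch with made-up values. With `+`, every separator has to be spelled out by hand; with a format string, the spacing is visible in one place.

```python
first = "Ada"
last = "Lovelace"

# Easy to overlook: no space after the comma or between the names.
joined = "Hello," + first + last + "!"

# The template shows the final layout, spaces included.
formatted = "Hello, {} {}!".format(first, last)

print(joined)
print(formatted)
```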

- Doruk


- Kijewski · Accepted Answer

It's just for looks. You can see at a glance what the format is. Many of us prefer readability over micro-optimization.

Let's see what IPython's %timeit says:

Python 3.7.2 (default, Jan  3 2019, 02:55:40)
IPython 5.8.0
Intel(R) Core(TM) i5-4590T CPU @ 2.00GHz

In [1]: %timeit root = "sample"; output = "output"; path = "{}/{}".format(root, output)
The slowest run took 12.44 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 5: 223 ns per loop

In [2]: %timeit root = "sample"; output = "output"; path = root + '/' + output
The slowest run took 13.82 times longer than the fastest. This could mean that an intermediate result is being cached.
10000000 loops, best of 5: 101 ns per loop

In [3]: %timeit root = "sample"; output = "output"; path = "%s/%s" % (root, output)
The slowest run took 27.97 times longer than the fastest. This could mean that an intermediate result is being cached.
10000000 loops, best of 5: 155 ns per loop

In [4]: %timeit root = "sample"; output = "output"; path = f"{root}/{output}"
The slowest run took 19.52 times longer than the fastest. This could mean that an intermediate result is being cached.
10000000 loops, best of 5: 77.8 ns per loop
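
Outside IPython, a comparable measurement can be sketched with the standard library's `timeit.repeat`, taking the minimum of several rounds to dampen the run-to-run noise that the warnings above point at. This is only a sketch: the names `SETUP`, `STATEMENTS`, and `best_ns` are made up here, and unlike the `%timeit` lines above, the variable assignments are moved into the setup instead of being timed.

```python
from timeit import repeat

SETUP = 'root = "sample"; output = "output"'
STATEMENTS = {
    "format": '"{}/{}".format(root, output)',
    "concat": "root + '/' + output",
    "percent": '"%s/%s" % (root, output)',
    "fstring": 'f"{root}/{output}"',
}


def best_ns(stmt, number=100_000, rounds=5):
    """Return the fastest per-call time in nanoseconds over `rounds` runs."""
    times = repeat(stmt, setup=SETUP, number=number, repeat=rounds)
    return min(times) / number * 1e9


results = {name: best_ns(stmt) for name, stmt in STATEMENTS.items()}

# Print fastest first; absolute numbers depend on the machine and Python build.
for name, ns in sorted(results.items(), key=lambda kv: kv[1]):
    print(f"{name:8s} {ns:8.1f} ns")
```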