Finding the average of a list

Question

Finding the average of a list

673

How do I calculate the average of a list in Python?

[1, 2, 3, 4]  ⟶  2.5

- Carla Dessi

59

numpy.mean, if you can afford to install numpy. - mitch

9

sum(L) / float(len(L)). Handle empty lists in the calling code, for example with if not L: ... - n611x007
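
The empty-list guard mentioned in the comment above could look like the following minimal sketch. The helper name `average_or_none` and the choice of returning `None` for an empty list are assumptions made for illustration; the comment itself only says to check `if not L` in the caller.

```python
def average_or_none(values):
    """Return the arithmetic mean of values, or None for an empty list."""
    # An empty list is falsy, so this check avoids ZeroDivisionError below.
    if not values:
        return None
    # float() keeps the division a floating-point one on Python 2 as well.
    return sum(values) / float(len(values))


print(average_or_none([1, 2, 3, 4]))  # 2.5
print(average_or_none([]))            # None
```

Whether `None`, an exception, or some default value is the right answer for an empty list depends on the calling code.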


6

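@mitch: it's not a question of whether you can afford to install numpy. numpy is a whole toolkit in its own right. The point is whether you actually need numpy. Installing numpy, a 16 MB C extension, just to calculate a mean would be very impractical for someone who doesn't use it for anything else. - n611x007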

4

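On Python 3, instead of installing the whole NumPy package just to compute a mean, we can use the statistics module: "from statistics import mean" is all that is needed. On Python 2.7 or older, the statistics module can be installed from this source: https://hg.python.org/cpython/file/default/Lib/statistics.py Documentation: https://docs.python.org/dev/library/statistics.html - 25mhz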

3

Possible duplicate of "Calculating arithmetic mean in Python". - Ravindra S

25 Answers

592

xs = [15, 18, 2, 36, 12, 78, 5, 6, 9]
sum(xs) / len(xs)

- yprez

35

As a C++ programmer, I think this is really neat, and the float isn't ugly at all! - lahjaton_j

3

If you want to keep only a few digits after the decimal point, this might be useful: float('%.2f' % float(sum(l) / len(l))). - Steinfeld

3

@Steinfeld I don't think converting to a string is the best way to go. You can get the same result more cleanly with round(result, 2). - yprez
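
To make the two rounding suggestions above concrete, here is a small sketch that uses the list from the answer. The variable names are illustrative only.

```python
xs = [15, 18, 2, 36, 12, 78, 5, 6, 9]
average = sum(xs) / len(xs)

# round() keeps the value a float, with no detour through a string.
rounded = round(average, 2)

# String formatting is better suited to display than to further arithmetic.
formatted = f"{average:.2f}"

print(rounded)    # 20.11
print(formatted)  # 20.11
```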

321

Use numpy.mean:

xs = [15, 18, 2, 36, 12, 78, 5, 6, 9]

import numpy as np
print(np.mean(xs))

- Akavall

8

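That's strange. I would have assumed this would be more efficient, but on a random list of floats it appears to take 8 times as long as simply "sum(l)/len(l)". - L. Amber O'Hearn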

12

Oh, but np.array(l).mean() is faster. - L. Amber O'Hearn

10

@L.AmberO'Hearn, I just timed it: np.mean(l) and np.array(l).mean() run at about the same speed, and sum(l)/len(l) is about twice as fast. I used l = list(np.random.rand(1000)); of course, both numpy methods become much faster if l is already a numpy.array. - Akavall

13

Installing a 16 MB C package just to calculate a mean looks very odd at this scale, unless you have other reasons to install NumPy. - n611x007

Also, it is better to use np.nanmean(l) to avoid problems with NaN values and division by zero. - Elias

242

For Python 3.4 and later, use the mean() function from the new statistics module:

from statistics import mean
xs = [15, 18, 2, 36, 12, 78, 5, 6, 9]
mean(xs)

- Marwan Alsabbagh

32

This is the most elegant answer, because it uses a standard library module that has been available since Python 3.4. - Serge Stroobandt

6

And it is numerically more stable. - Antti Haapala -- Слава Україні

If you accidentally pass in an empty list, it produces the clearer error statistics.StatisticsError: mean requires at least one data point, rather than the more cryptic ZeroDivisionError: division by zero that the sum(x) / len(x) solution gives. - user3064538
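
The difference in error behaviour described in the comment above can be checked with a short sketch like this one. The variable names are illustrative, and the exact wording of the messages may differ between Python versions.

```python
import statistics

empty = []

try:
    statistics.mean(empty)
except statistics.StatisticsError as exc:
    # Raised by the statistics module for an empty data set.
    stats_error = exc

try:
    sum(empty) / len(empty)
except ZeroDivisionError as exc:
    # sum([]) is 0 and len([]) is 0, so this is a plain 0 / 0.
    division_error = exc

print(type(stats_error).__name__, stats_error)
print(type(division_error).__name__, division_error)
```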

51

Why would you use reduce() for this when Python has a perfectly cromulent sum() function?

print sum(l) / float(len(l))

(The float() is necessary in Python 2 to force Python to do floating-point division.)

- kindall

35

For those of you hearing the word "cromulent" for the first time. - RolfBly

2

The float() is not needed on Python 3. - user3064538
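
The division behaviour discussed in this answer and its comments can be illustrated on Python 3 as follows. This is only a sketch with illustrative names; the `//` operator stands in for what `/` did with two integers on Python 2.

```python
values = [1, 2, 3, 4]

# On Python 3, "/" is true division and yields a float even for two ints.
true_division = sum(values) / len(values)

# "//" is floor division, which matches Python 2's "/" on two ints.
floor_division = sum(values) // len(values)

print(true_division)   # 2.5
print(floor_division)  # 2
```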

39

If you are using Python >= 3.4, there is a statistics library available.

https://docs.python.org/3/library/statistics.html

You can use its mean function like this. Say you have a list of numbers and want to find their mean:

import statistics as s

numbers = [11, 13, 12, 15, 17]  # not named "list", which would shadow the built-in
s.mean(numbers)

It also has other functions, such as stdev, variance, mode, harmonic_mean, and median, which are very useful too.

- Chetan Sharma
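
As a rough illustration of the other functions named in the answer above, the following sketch applies a few of them to the same list. The variable names are illustrative; `harmonic_mean` requires Python 3.6 or later, and `mode` is left out because its behaviour on data without a single most common value changed in Python 3.8.

```python
import statistics

data = [11, 13, 12, 15, 17]

mean_value = statistics.mean(data)          # arithmetic mean
median_value = statistics.median(data)      # middle value of the sorted data
variance_value = statistics.variance(data)  # sample variance (divides by n - 1)
stdev_value = statistics.stdev(data)        # square root of the sample variance
harmonic_value = statistics.harmonic_mean(data)

print(mean_value)    # 13.6
print(median_value)  # 13
print(variance_value, stdev_value, harmonic_value)
```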

19

You don't need to cast anything to float; just pass 0.0 as the start value of the sum:

def avg(l):
    return sum(l, 0.0) / len(l)

- Maxime Chéramy

17

EDIT:

I added two other ways to get the average of a list (they only apply to Python 3.8+). Here is the comparison I made:

import timeit
import statistics
import numpy as np
from functools import reduce
import pandas as pd
import math

LIST_RANGE = 10
NUMBERS_OF_TIMES_TO_TEST = 10000

l = list(range(LIST_RANGE))

def mean1():
    return statistics.mean(l)


def mean2():
    return sum(l) / len(l)


def mean3():
    return np.mean(l)


def mean4():
    return np.array(l).mean()


def mean5():
    return reduce(lambda x, y: x + y / float(len(l)), l, 0)

def mean6():
    return pd.Series(l).mean()


def mean7():
    return statistics.fmean(l)


def mean8():
    return math.fsum(l) / len(l)


for func in [mean1, mean2, mean3, mean4, mean5, mean6, mean7, mean8 ]:
    print(f"{func.__name__} took: ",  timeit.timeit(stmt=func, number=NUMBERS_OF_TIMES_TO_TEST))

These are the results I got:

mean1 took:  0.09751558300000002
mean2 took:  0.005496791999999973
mean3 took:  0.07754683299999998
mean4 took:  0.055743208000000044
mean5 took:  0.018134082999999968
mean6 took:  0.6663848750000001
mean7 took:  0.004305374999999945
mean8 took:  0.003203333000000086

Interesting! It looks like math.fsum(l) / len(l) is the fastest way, followed by statistics.fmean(l), and only then sum(l) / len(l). Nice!


Thanks to @Asclepius for showing me these two other ways!

OLD ANSWER:
In terms of efficiency and speed, these are the results I got when testing the other answers:
# test mean calculation

import timeit
import statistics
import numpy as np
from functools import reduce
import pandas as pd

LIST_RANGE = 10
NUMBERS_OF_TIMES_TO_TEST = 10000

l = list(range(LIST_RANGE))

def mean1():
    return statistics.mean(l)


def mean2():
    return sum(l) / len(l)


def mean3():
    return np.mean(l)


def mean4():
    return np.array(l).mean()


def mean5():
    return reduce(lambda x, y: x + y / float(len(l)), l, 0)

def mean6():
    return pd.Series(l).mean()



for func in [mean1, mean2, mean3, mean4, mean5, mean6]:
    print(f"{func.__name__} took: ",  timeit.timeit(stmt=func, number=NUMBERS_OF_TIMES_TO_TEST))
The results:
mean1 took:  0.17030245899968577
mean2 took:  0.002183011999932205
mean3 took:  0.09744236000005913
mean4 took:  0.07070840100004716
mean5 took:  0.022754742999950395
mean6 took:  1.6689282460001778
So clearly the winner is:
sum(l) / len(l)

- Alon Gouldman

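I tried these timings on a list of length 100000000: mean2 < 1 s; mean3,4 about 8 s; mean5,6 about 27 s; mean1 about 1 minute. I find this surprising; I would have expected numpy to do best with a large list, but there you go! There seems to be a problem with the statistics package! (This was Python 3.8 on a Mac laptop, with no BLAS as far as I know.) - drevicko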

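Incidentally, if I convert l to an np.array first and then call np.mean, it takes about 0.16 s, roughly 6 times faster than sum(l)/len(l). Conclusion: if you are doing a lot of calculations, it is best to do everything in numpy. - drevicko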

@drevicko Look at mean4, that's what I do there... I guess that if the data is already an np.array, then using np.mean makes sense, but if you have a list you should use sum(l) / len(l). - Alon Gouldman

2

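Exactly! It also depends on what you will do with the data afterwards. In my work I typically run a whole series of calculations, so it makes sense to convert to numpy at the start and take advantage of numpy's fast underlying libraries. - drevicko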

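@AlonGouldman Great. I suggest showing each time in thousandths of a second (as an integer), since the numbers are otherwise hard to read. For example: 170, 2, 97, and so on. That should make the results much easier to read. Let me know when it's done and I'll check. - Asclepius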
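
The suggestion above, reporting each timing as a whole number of milliseconds, might be implemented roughly as follows. This is a sketch that uses only the standard library and a single candidate function; the names `mean_sum` and `number_of_runs` are illustrative, and the measured value will vary from machine to machine.

```python
import timeit

data = list(range(10))
number_of_runs = 10000


def mean_sum():
    return sum(data) / len(data)


# timeit.timeit returns the total time in seconds for all runs.
seconds = timeit.timeit(stmt=mean_sum, number=number_of_runs)

# Convert to milliseconds and round to an integer for readability.
milliseconds = round(seconds * 1000)
print(f"{mean_sum.__name__} took: {milliseconds} ms")
```

For very fast functions the rounded value can be 0, in which case a larger number of runs or a finer unit is needed.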

11

sum(l) / float(len(l)) is the right answer, but just for completeness, you can also compute the average with a single reduce:

>>> reduce(lambda x, y: x + y / float(len(l)), l, 0)
20.111111111111114

Note that this can result in a slight rounding error:

>>> sum(l) / float(len(l))
20.111111111111111

- Andrew Clark

I get that this is just for fun, but returning 0 for an empty list may not be the best thing to do. - Johan Lundberg

1

@JohanLundberg - You could replace the 0 with False as the last argument to reduce(). That gives you False for an empty list and the average as before otherwise. - Andrew Clark

@AndrewClark Why do you force float on len? - EndermanAPM
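
For readers on Python 3, where reduce has to be imported from functools, the reduce-based approach and the empty-list behaviour discussed in the comments above can be sketched like this. The helper name `add_share` is illustrative and does not appear in the answer.

```python
from functools import reduce
import math

values = [15, 18, 2, 36, 12, 78, 5, 6, 9]


def add_share(total, value):
    # Each element contributes value / n to the running mean.
    return total + value / len(values)


running_mean = reduce(add_share, values, 0)
direct_mean = sum(values) / len(values)

# The two results can differ in the last digits because of rounding.
print(running_mean, direct_mean)

# With an empty list the function is never called, so reduce returns
# its initial value unchanged: 0, or False if that is passed instead.
print(reduce(add_share, [], 0))      # 0
print(reduce(add_share, [], False))  # False
```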

11

I tried the options above, but they didn't work for me. Try this:

from statistics import mean

n = [11, 13, 15, 17, 19]

print(n)
print(mean(n))

Tested on Python 3.5.

- Ngury Mangueira


- Herms · Accepted Answer

For Python 3.8+, use statistics.fmean for numerical stability with floats. (Fast.)

For Python 3.4+, use statistics.mean for numerical stability with floats. (Slower.)

xs = [15, 18, 2, 36, 12, 78, 5, 6, 9]

import statistics
statistics.mean(xs)  # = 20.11111111111111

For older versions of Python 3, use

sum(xs) / len(xs)

For Python 2, convert len to a float to get float division:

sum(xs) / float(len(xs))
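
The answer above names statistics.fmean without showing it, so here is a hedged sketch of how it can be called, together with a small demonstration of the numerical-stability point. The `tricky` list is an illustrative input chosen to provoke rounding, and the explicit loop stands in for naive left-to-right float addition. As far as I know, the built-in sum() uses a compensated algorithm for floats from Python 3.12 onward, so it may not lose the small value itself on newer versions.

```python
import math
import statistics

xs = [15, 18, 2, 36, 12, 78, 5, 6, 9]
fast_mean = statistics.fmean(xs)  # Python 3.8+; always returns a float
print(fast_mean)

# Values of very different magnitude expose rounding in naive addition.
tricky = [1e20, 1.0, -1e20]

naive_total = 0.0
for value in tricky:
    naive_total += value  # the 1.0 is absorbed by 1e20 and lost
naive_mean = naive_total / len(tricky)

stable_mean = statistics.fmean(tricky)  # built on math.fsum
exact_mean = statistics.mean(tricky)    # exact arithmetic internally, slower

print(naive_mean)   # 0.0
print(stable_mean)  # approximately 0.3333
print(exact_mean)   # approximately 0.3333
```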