ValueError when applying a function to a Pandas DataFrame (with only one argument)

Question

It seems I can apply some functions to a DataFrame without problems, but others raise a ValueError.

import numpy as np
import pandas as pd

dates = pd.date_range('20130101', periods=6)
data = np.random.randn(6, 4)

df = pd.DataFrame(data,index=dates,columns=list('ABCD'))

def my_max(y):
    return max(y,0)

def times_ten(y):
    return 10*y

df.apply(lambda x:times_ten(x)) # Works fine
df.apply(lambda x:my_max(x)) # Doesn't work

The first apply call runs fine; the second raises the following error:

ValueError: ('The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().', u'occurred at index A')

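I know I can produce "max(df,0)" in other ways (e.g. via df[df<0]=0), so I'm not looking for a solution to this particular problem. Rather, I'm interested in why the apply call above doesn't work.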

- Moose

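From the docs: with a single iterable argument, return its largest item. With two or more arguments, return the largest argument. So in this case max does not understand a pandas Series or a numpy array, which is why you should use something that does. - EdChum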
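
To make the two calling conventions from that comment concrete, here is a minimal sketch using plain Python values only. The variable names are just for illustration and are not from the question.

```python
# Built-in max has two calling conventions.
values = [3, -1, 7]

# One iterable argument: returns the largest item inside it.
largest_item = max(values)

# Two or more arguments: compares the arguments themselves.
largest_arg = max(-2.5, 0)

print(largest_item)  # 7
print(largest_arg)   # 0
```

In the question, max(y, 0) uses the second form, so the whole Series y is treated as one argument to be compared against 0.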

1 Answer

- behzad.nouri · Accepted Answer

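The built-in max cannot handle a scalar together with an array. To pick the larger argument it has to evaluate a comparison, and on a Series that comparison is elementwise, so the result is a boolean Series with no single truth value: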

>>> max(df['A'], 0)
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
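
As a rough illustration of where the error comes from, the sketch below reproduces the comparison on a small hand-made Series (the values are made up for the example, not taken from the question's random data).

```python
import pandas as pd

s = pd.Series([-1.0, 2.0, -3.0])

# max(s, 0) has to decide whether 0 is greater than s.
# On a Series that comparison is elementwise and yields a boolean Series.
comparison = 0 > s
print(comparison.tolist())  # [True, False, True]

# Collapsing that boolean Series into a single True/False is what fails.
try:
    bool(comparison)
    bool_raised = False
except ValueError:
    bool_raised = True

# The same failure surfaces when calling max directly.
try:
    max(s, 0)
    max_raised = False
except ValueError:
    max_raised = True

print(bool_raised, max_raised)  # True True
```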

You can use np.maximum instead, which computes the maximum element by element:

>>> def my_max(y):
...     return np.maximum(y, 0)
... 
>>> df.apply(lambda x:my_max(x))
                A      B      C      D
2013-01-01  0.000  0.000  0.178  0.992
2013-01-02  0.000  1.060  0.000  0.000
2013-01-03  0.528  2.408  2.679  0.000
2013-01-04  0.564  0.573  0.320  1.220
2013-01-05  0.903  0.497  0.000  0.032
2013-01-06  0.505  0.000  0.000  0.000
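
Since np.maximum is already elementwise, the apply wrapper is probably not required. The sketch below, on a small made-up DataFrame, compares the apply version with calling np.maximum on the whole frame and with DataFrame.clip(lower=0), which should give the same result for this particular case.

```python
import numpy as np
import pandas as pd

# Small illustrative frame; the values are assumptions, not the question's data.
df = pd.DataFrame({"A": [-1.5, 0.5], "B": [2.0, -0.25]})

# Column by column, as in the answer above.
via_apply = df.apply(lambda col: np.maximum(col, 0))

# Directly on the whole DataFrame.
direct = np.maximum(df, 0)

# Equivalent for a lower bound of 0: clip negative values.
clipped = df.clip(lower=0)

print(direct)
```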

Alternatively, use .applymap, which operates element by element, so max receives two scalars:

>>> def my_max(y):
...     return max(y,0)
... 
>>> df.applymap(lambda x:my_max(x))
                A      B      C      D
2013-01-01  0.000  0.000  0.178  0.992
2013-01-02  0.000  1.060  0.000  0.000
2013-01-03  0.528  2.408  2.679  0.000
2013-01-04  0.564  0.573  0.320  1.220
2013-01-05  0.903  0.497  0.000  0.032
2013-01-06  0.505  0.000  0.000  0.000
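
One caveat for newer installations: DataFrame.applymap was deprecated in pandas 2.1 in favor of DataFrame.map. A minimal sketch that works on either side of that change could look like the following; the small DataFrame and the `elementwise` name are illustrative assumptions.

```python
import pandas as pd

# Small illustrative frame; the values are assumptions, not the question's data.
df = pd.DataFrame({"A": [-1.5, 0.5], "B": [2.0, -0.25]})

def my_max(y):
    # y is a single scalar here, so the built-in max works.
    return max(y, 0)

# Prefer DataFrame.map where it exists (pandas 2.1+), else fall back to applymap.
elementwise = df.map if hasattr(df, "map") else df.applymap
result = elementwise(my_max)

print(result)
```

This calls my_max once per cell, so on large frames the np.maximum approach is likely to be faster.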