Matplotlib/Pandas error when plotting a histogram

Question

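I can't make a histogram from a pandas Series object, and I don't understand why it fails. This code used to work, but now it doesn't.

Here is part of my code (specifically, the pandas Series object I am trying to make a histogram of):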

type(dfj2_MARKET1['VSPD2_perc'])

Output: pandas.core.series.Series

This is my plotting code:

fig, axes = plt.subplots(1, 7, figsize=(30,4))
axes[0].hist(dfj2_MARKET1['VSPD1_perc'],alpha=0.9, color='blue')
axes[0].grid(True)
axes[0].set_title(MARKET1 + '  5-40 km / h')

Error message:

    AttributeError                            Traceback (most recent call last)
    <ipython-input-75-3810c361db30> in <module>()
      1 fig, axes = plt.subplots(1, 7, figsize=(30,4))
      2 
    ----> 3 axes[1].hist(dfj2_MARKET1['VSPD2_perc'],alpha=0.9, color='blue')
      4 axes[1].grid(True)
      5 axes[1].set_xlabel('Time spent [%]')

    C:\Python27\lib\site-packages\matplotlib\axes.pyc in hist(self, x, bins, range, normed, weights, cumulative, bottom, histtype, align, orientation, rwidth, log, color, label, stacked, **kwargs)
   8322             # this will automatically overwrite bins,
   8323             # so that each histogram uses the same bins
-> 8324             m, bins = np.histogram(x[i], bins, weights=w[i], **hist_kwargs)
   8325             m = m.astype(float) # causes problems later if it's an int
   8326             if mlast is None:

    C:\Python27\lib\site-packages\numpy\lib\function_base.pyc in histogram(a, bins, range, normed, weights, density)
    158         if (mn > mx):
    159             raise AttributeError(
--> 160                 'max must be larger than min in range parameter.')
    161 
    162     if not iterable(bins):

AttributeError: max must be larger than min in range parameter.

- jonas

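Hmm, it works for me. Can you show your dataframe? - Andrey Shokhin

Hmm, when I do it this way I can actually produce a histogram: s = dfj2_MARKET1['VSPD1_perc']; s.hist() - jonas

Yes, but that way you are using pandas' hist function, not matplotlib's. The pandas function handles NaN values as expected. See my update. - joris

2 Answers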

As the accepted answer explains, the error is caused by NaN values. Just use the following command:

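# convert the column to numeric; entries that cannot be parsed become NaN
df['column_name'] = pd.to_numeric(df['column_name'], errors='coerce')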

If some values are not numbers, then apply:

df['column_name'] = df['column_name'].replace(np.nan, your_value)
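
The two steps above can be sketched end to end as follows. This is only an illustration under assumptions: the DataFrame contents, the column name column_name, and the fill value 0.0 are made up for the example and are not taken from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: the column mixes numbers, numeric strings and junk
df = pd.DataFrame({"column_name": [1, "2", "n/a", 4.5, None]})

# errors="coerce" turns anything that cannot be parsed into NaN
# instead of raising an exception
df["column_name"] = pd.to_numeric(df["column_name"], errors="coerce")

# Assumed placeholder; pick a value that makes sense for your data
your_value = 0.0

# Replace the remaining NaN entries with the placeholder
filled = df["column_name"].replace(np.nan, your_value)
print(filled.tolist())
```

Keep in mind that a placeholder value is counted in the histogram like any real observation, so dropping the NaN rows may be the better choice when the goal is only to plot.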

- brainhack

- joris · Accepted Answer

This error occurs when the Series contains NaN values. Could that be the case for you?

Matplotlib's hist function does not handle these NaN values well. For example:
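
One quick way to check is to count the missing values before plotting. A minimal sketch, assuming a small stand-in Series named s rather than the real dfj2_MARKET1 column:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for dfj2_MARKET1['VSPD2_perc']
s = pd.Series([1, 2, 3, 2, 2, 3, 5, 2, 3, 2, np.nan])

# Number of NaN entries in the Series
n_missing = int(s.isna().sum())
print("missing values:", n_missing)

# pandas skips NaN by default when reducing ...
print("pandas min/max:", s.min(), s.max())

# ... while NumPy's plain min propagates it
raw_min = np.min(s.to_numpy())
print("numpy min:", raw_min)
```

If n_missing is greater than zero, NaN values are a likely cause of the failure.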

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
s = pd.Series([1,2,3,2,2,3,5,2,3,2,np.nan])
fig, ax = plt.subplots()
ax.hist(s, alpha=0.9, color='blue')

produces the same error: AttributeError: max must be larger than min in range parameter.

One option is to remove the NaN values before plotting. This works:

ax.hist(s.dropna(), alpha=0.9, color='blue')

Another option is to use the pandas hist method on your Series and pass axes[0] to the ax keyword:

dfj2_MARKET1['VSPD1_perc'].hist(ax=axes[0], alpha=0.9, color='blue')
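
Both options can be tried side by side in a self-contained sketch. It assumes the same toy Series as in the example above instead of the real dfj2_MARKET1 data, and it selects the non-interactive Agg backend only so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, an assumption for headless runs
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Toy data with one NaN, standing in for the real column
s = pd.Series([1, 2, 3, 2, 2, 3, 5, 2, 3, 2, np.nan])

fig, axes = plt.subplots(1, 2, figsize=(8, 3))

# Option 1: drop the NaN values, then call Matplotlib's hist directly
counts, edges, _ = axes[0].hist(s.dropna(), alpha=0.9, color="blue")
axes[0].set_title("ax.hist(s.dropna())")

# Option 2: let pandas draw onto an existing Axes via the ax keyword
s.hist(ax=axes[1], alpha=0.9, color="blue")
axes[1].set_title("s.hist(ax=...)")

print("values counted by option 1:", int(counts.sum()))
```

Option 1 keeps the call on the Matplotlib side, which fits code that already uses axes[i].hist(...) throughout. Option 2 is shorter when the data already lives in a pandas Series.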