Plotting a bar chart and time series on different axes with pandas

3

I have a pandas DataFrame, read from a .csv file, with the following structure:

Date,       Latitude,   Longitude,        Brand,        Pump, AKI,  Trip Miles,  Total Miles, Gallons,  MPG,    PPG,    Total,  Tires,  MPG-D,
11/03/2013, 40° 1.729', -105° 15.516',    Boulder Gas,  2,    87,   134.3,       134.3,       6.563,    20.46,  3.319,  21.78,  Stock,  ,
11/17/2013, 40° 1.729', -105° 15.516',    Boulder Gas,  2,    87,   161.8,       296.0,       7.467,    21.67,  3.279,  24.48,  Stock,  ,
11/27/2013, 40° 0.872', -105° 12.775',    Buffalo Gas,  6,    87,   180.8,       477.0,       8.096,    22.33,  3.359,  27.19,  Stock,  ,
12/07/2013, 40° 1.729', -105° 15.516',    Boulder Gas,  6,    87,   265.1,       742.0,       12.073,   21.96,  3.179,  38.38,  Stock,  ,
12/11/2013, 40° 2.170', -105° 15.522',    Circle K,     4,    87,   240.9,       983.0,       9.868,    24.41,  3.179,  31.37,  Stock,  ,
12/15/2013, 40° 8.995', -105° 7.876',     Shell,        3,    87,   188.7,       1172,        8.596,    21.95,  3.059,  26.30,  ,       ,
12/21/2013, 40° 1.770', -105° 15.481',    Conoco,       3,    87,   113.8,       1286,        5.517,    20.62,  3.139,  17.32,  Winter, ,
01/09/2014, 40° 1.729', -105° 15.516',    Boulder Gas,  2,    87,   139.5,       1426,        7.181,    19.42,  3.279,  23.55,  Winter, 21.3,
01/13/2013, 40° 1.770', -105° 15.481',    Conoco,       7,    87,   260.8,       1688,        11.177,   23.33,  3.239,  36.20,  Winter, 25.5,
01/18/2014, 40° 1.729', -105° 15.516',    Boulder Gas,  2,    87,   102.0,       1790,        4.401,    23.18,  3.239,  14.26,  Winter, 25.5,
02/02/2014, 39° 59.132', -105° 14.962',   King Soopers, 5,    87,   175.3,       1965,        8.436,    20.78,  3.019,  25.47,  Winter, 24.0,
02/03/2014, 40° 1.770', -105° 15.481',    Conoco,       3,    87,   249.9,       2215,        10.452,   23.91,  3.219,  33.64,  Winter, 25.2,
02/08/2014, 40° 2.170', -105° 15.522',    Circle K,     7,    87,   186.4,       2402,        8.565,    21.76,  3.239,  27.74,  Winter, 24.3,
02/13/2014, 40° 1.729', -105° 15.516',    Boulder Gas,  8,    87,    79.6,       2481,        4.125,    19.30,  3.439,  14.19,  Winter, 21.3,
03/06/2014, 40.014460, -105.225034,       Conoco,       5,    87,   172.4,       2654,        8.618,    20.00,  3.779,  32.57,  Winter, 21.9,
03/09/2014, 40.029498, -105.258117,       Conoco,       6,    87,   230.4,       2884,        9.016,    25.55,  3.759,  33.89,  Winter, 27.3,
03/17/2014, 40.036236, -105.258763,       Conoco,       6,    87,   130.1,       3014,        5.368,    24.24,  3.719,  19.96,  Winter, 25.8,
03/24/2014, 40.036236, -105.258763,       Conoco,       1,    87,   282.3,       3297,       11.540,    24.46,  3.719,  42.92,  Winter, 27.3,

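I want to make a plot where the x-axis is the date, the left y-axis is miles per gallon, and the right y-axis is miles. In this plot I would like to show the time series of the "MPG" column in one color, the time series of "MPG-D" in another color, and a bar chart of the "Trip Miles" column in a third color.
I have been trying to follow http://pandas.pydata.org/pandas-docs/stable/visualization.html and have the code below, but it produces a bar chart and two time series all on the same axes, and the y-labels do not show up.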
%matplotlib inline
import pandas as pd
import matplotlib.pyplot as plt

data = pd.read_csv('mpg.csv', skipinitialspace=True,index_col='Date')
plt.figure()
ax = data['Trip Miles'].plot(kind='bar',secondary_y=['Trip Miles'])
ax.right_ax.set_ylabel('Miles')
ax.set_ylabel('Miles/Gallon')
data['MPG'].plot()
data['MPG-D'].plot()

(image: plot I get from the above code)

1 Answer

10

You need to be more explicit about which axes you plot on. Try something like this:

%matplotlib inline
import pandas as pd
import matplotlib.pyplot as plt

fig, tsax = plt.subplots()
barax = tsax.twinx()

data = pd.read_csv('mpg.csv', skipinitialspace=True,index_col='Date')
data['Trip Miles'].plot(kind='bar', ax=barax)
barax.set_ylabel('Miles')
tsax.set_ylabel('Miles/Gallon')
data['MPG'].plot(ax=tsax)
data['MPG-D'].plot(ax=tsax)

Edit

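There is a big problem here: pandas bar plots and line plots are fundamentally different in how they format the x-axis. Specifically, a bar plot uses the ticks and labels to create a qualitative (categorical) scale with one entry per bar. Here, though, it seems you are more interested in something formatted like a typical time series.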
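To make that difference concrete, here is a small sketch that plots the same series as a pandas bar plot and as a pandas line plot and then inspects each x-axis. The series values, the `MS` frequency, and the `Agg` backend are assumptions chosen for illustration; the exact tick positions and limits can vary between pandas versions, so no output is shown.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

# Made-up monthly series, for illustration only
idx = pd.date_range(start="2012-01-01", periods=4, freq="MS")
s = pd.Series([1.0, 3.0, 2.0, 4.0], index=idx)

fig, (bar_ax, line_ax) = plt.subplots(nrows=2)
s.plot(kind="bar", ax=bar_ax)  # categorical-style x-axis: one tick per bar
s.plot(ax=line_ax)             # time-series x-axis with date-aware ticks

bar_ticks = bar_ax.get_xticks()
bar_labels = [t.get_text() for t in bar_ax.get_xticklabels()]

print("bar ticks: ", list(bar_ticks))
print("bar labels:", bar_labels)
print("line xlim: ", line_ax.get_xlim())
```

The bar axes carry one tick and one text label per bar, while the line axes use a continuous date scale, which is why the two do not line up when forced onto a shared x-axis.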

So I recommend abandoning the twin-axes plot. Instead, plot on two completely separate axes, like this:

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.gridspec as mgrid
import pandas as pd

fig = plt.figure(figsize=(12,5))
grid = mgrid.GridSpec(nrows=2, ncols=1, height_ratios=[2, 1])

barax = fig.add_subplot(grid[0])
tsax = fig.add_subplot(grid[1])
# pd.date_range replaces pd.DatetimeIndex(start=..., freq=...), which recent pandas no longer accepts
data = pd.DataFrame(np.random.randn(10,3), columns=list('ABC'), index=pd.date_range(start='2012-01-01', periods=10, freq='MS'))

data['A'] **= 2
data['A'].plot(ax=barax, style='o--')
barax.set_ylabel('Miles')
tsax.set_ylabel('Miles/Gallon')

barax.xaxis.tick_top()

data['B'].plot(ax=tsax)
data['C'].plot(ax=tsax)
fig.tight_layout()

This gives me:

(image: separate axes)

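However, if you really need a bar chart, or you really want everything on the same pair of twinned x-axes, then you have to plot with matplotlib's API directly, like this: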

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

fig, tsax = plt.subplots(figsize=(12,5))
barax = tsax.twinx()

# pd.date_range replaces pd.DatetimeIndex(start=..., freq=...), which recent pandas no longer accepts
data = pd.DataFrame(np.random.randn(10,3), columns=list('ABC'), index=pd.date_range(start='2012-01-01', periods=10, freq='MS'))
data['A'] **= 2

# the `width` is specified in days -- adjust for your data
barax.bar(data.index, data['A'], width=5, facecolor='indianred')

barax.set_ylabel('Miles')
tsax.set_ylabel('Miles/Gallon')

barax.xaxis.tick_top()

tsax.plot(data.index, data['B'])
tsax.plot(data.index, data['C'])

# call tight_layout last so it accounts for everything that was drawn
fig.tight_layout()

Which then gives me:

(image: single axes)
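Applied to data shaped like the CSV in the question, the same approach might look roughly like the sketch below. It is only an illustration under some assumptions: the CSV text is a trimmed, inline copy of a few rows and three columns from the question (trailing commas dropped), the dates are parsed explicitly as month/day/year, and the bar width, transparency, and date format are arbitrary choices.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Trimmed inline sample based on the question's file (assumption: only these columns are needed)
csv_text = """Date, Trip Miles, MPG, MPG-D
11/03/2013, 134.3, 20.46,
12/21/2013, 113.8, 20.62,
01/09/2014, 139.5, 19.42, 21.3
02/02/2014, 175.3, 20.78, 24.0
03/24/2014, 282.3, 24.46, 27.3
"""

data = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)
# Parse the dates explicitly so the index is a DatetimeIndex, not plain strings
data["Date"] = pd.to_datetime(data["Date"], format="%m/%d/%Y")
data = data.set_index("Date")

fig, tsax = plt.subplots(figsize=(12, 5))
barax = tsax.twinx()

# Bars on the right-hand axis; `width` is in days, and the twin axes draw
# on top of tsax, so the bars are made semi-transparent
barax.bar(data.index, data["Trip Miles"], width=5, color="indianred", alpha=0.5)

# Both time series on the left-hand axis; missing MPG-D values are skipped
tsax.plot(data.index, data["MPG"], marker="o", label="MPG")
tsax.plot(data.index, data["MPG-D"], marker="s", label="MPG-D")

tsax.set_ylabel("Miles/Gallon")
barax.set_ylabel("Miles")
tsax.legend(loc="upper left")

# Show calendar dates on the shared x-axis
tsax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m-%d"))
fig.autofmt_xdate()
fig.tight_layout()
```

Because the index holds real dates, both axes share a true time scale and the bars land at their actual dates instead of at evenly spaced categorical positions.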


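How do I get it to show dates on the x-axis? - deltap
@DeltaP, can you post more rows of the data? Please remove the columns we don't need and keep it in CSV format. - Paul H
I have included the whole file. I would rather not trim the unnecessary columns, because they will be useful for other things. As for the CSV format... I assume you mean removing the whitespace? That really isn't necessary, since every function I have come across has a way to strip initial whitespace. - deltap
1
@DeltaP The difference here is that SO isn't concerned with you achieving your end result so much as with how to achieve it. In that case it is better to include a minimal working example (i.e., http://www.sscce.org/), like I did in my example. - Paul H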
