Datetime x-axis matplotlib labels causing uncontrolled overlap

Question

5

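I'm trying to plot a pandas Series with a 'pandas.tseries.index.DatetimeIndex', but the x-axis labels overlap, and I can't make them presentable even with several of the suggested solutions.

I tried the Stack Overflow suggestion to use autofmt_xdate, but it doesn't help.

I also tried the suggestion to use plt.tight_layout(), but it has no effect.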

ax = test_df[(test_df.index.year ==2017) ]['error'].plot(kind="bar")
ax.figure.autofmt_xdate()
#plt.tight_layout()
print(type(test_df[(test_df.index.year ==2017) ]['error'].index))

Update: the problem occurs when I use a bar chart. A regular time-series line plot shows nicely managed labels.

- user3556757

2 Answers

0

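In your case, the simplest approach is to create the labels and their spacing manually and apply them with ax.xaxis.set_major_formatter.

Here is a possible solution:

Since no sample data was provided, I tried to mimic the structure of your dataset in a dataframe with some random numbers.

Setup: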

# imports
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import matplotlib.ticker as ticker

# A dataframe with random numbers to run tests on
np.random.seed(123456)
rows = 100
df = pd.DataFrame(np.random.randint(-10,10,size=(rows, 1)), columns=['error'])
# pd.datetime was removed in pandas 2.0; a date string is enough here
datelist = pd.date_range('2017-01-01', periods=rows).tolist()
df['dates'] = datelist 
df = df.set_index(['dates'])
df.index = pd.to_datetime(df.index)

test_df = df.copy(deep = True)

# Plot of data that mimics the structure of your dataset
# figsize is passed to plot(); calling plt.figure() afterwards would open a second, empty figure
ax = test_df[(test_df.index.year ==2017) ]['error'].plot(kind="bar", figsize=(15,8))
ax.figure.autofmt_xdate()

One possible solution:

test_df = df.copy(deep = True)
# figsize is passed to plot() so that the later plt.gcf() call still refers to this figure
ax = test_df[(test_df.index.year ==2017) ]['error'].plot(kind="bar", figsize=(15,8))

# Make a list of empty myLabels
myLabels = ['']*len(test_df.index)

# Set labels on every 20th element in myLabels
myLabels[::20] = [item.strftime('%Y - %m') for item in test_df.index[::20]]
ax.xaxis.set_major_formatter(ticker.FixedFormatter(myLabels))
plt.gcf().autofmt_xdate()

# Tilt the labels
plt.setp(ax.get_xticklabels(), rotation=30, fontsize=10)
plt.show()

You can easily change the label format by consulting strftime.org.
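As a small illustration of swapping the format string, the sketch below formats one date three ways. The date and the variable names are made up for the example; only the "%Y - %m" pattern comes from the answer's code, and the other two patterns are just common alternatives.

```python
from datetime import datetime

# Hypothetical sample date, chosen only for illustration
d = datetime(2017, 1, 21)

label_year_month = d.strftime("%Y - %m")   # pattern used in the code above -> "2017 - 01"
label_iso = d.strftime("%Y-%m-%d")         # ISO-style date -> "2017-01-21"
label_day_first = d.strftime("%d/%m/%Y")   # day-first date -> "21/01/2017"

print(label_year_month, label_iso, label_day_first)
```

Any of these patterns can replace '%Y - %m' in the list comprehension that builds myLabels.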

- vestland

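The problem with this approach can be seen in the picture. You label some arbitrary dates with their month, which makes "2017-01" appear twice at seemingly random positions. - ImportanceOfBeingErnest

Agreed. Your suggestion has my upvote =) What I like about my own suggestion is the flexibility in label density and label string format. - vestland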

1

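Oh, maybe my answer didn't make it clear, but the flexibility to change positions and formats is exactly the advantage of using a numeric axis. I've updated it to make this clearer. - ImportanceOfBeingErnest

I didn't know that. Very nice! - vestland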

- ImportanceOfBeingErnest · Accepted Answer

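A pandas bar plot is a categorical plot. It shows one bar for each index entry at integer positions on the scale. Hence the first bar is at position 0, the next at 1, and so on. The labels correspond to the dataframe's index. If you have 100 bars, you end up with 100 labels. This makes sense, because pandas cannot know whether they should be treated as categories or as ordinal/numeric data.

If you use a normal matplotlib bar plot instead, the dataframe's index is treated numerically. This means the bars are positioned according to the actual dates, and the labels are placed by the automatic ticker.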
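The categorical behaviour can be checked directly. The following is a minimal sketch, assuming a small made-up series of 10 daily values (the data and variable names are illustrative, not from the question); it inspects where pandas puts the bars and ticks.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical data: 10 daily values starting on 2017-01-01
idx = pd.date_range("2017-01-01", periods=10)
s = pd.Series(np.arange(1, 11), index=idx, name="error")

fig, ax = plt.subplots()
s.plot(kind="bar", ax=ax)

# Tick positions are plain integers, one per bar, not date values
tick_positions = list(ax.get_xticks())
# One label per bar, taken from the index
tick_labels = [t.get_text() for t in ax.get_xticklabels()]
# Each bar is centred on its integer position
bar_centers = [p.get_x() + p.get_width() / 2 for p in ax.patches]

print(tick_positions)
print(len(tick_labels))
```

Because every bar gets its own tick, date locators have nothing to thin out, which is why autofmt_xdate alone cannot remove the overlap.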

import pandas as pd
import numpy as np; np.random.seed(42)
import matplotlib.pyplot as plt

# pd.datetime was removed in pandas 2.0; a date string is enough here
datelist = pd.date_range('2017-01-01', periods=42).tolist()
df = pd.DataFrame(np.cumsum(np.random.randn(42)), 
                  columns=['error'], index=pd.to_datetime(datelist))

plt.bar(df.index, df["error"].values)
plt.gcf().autofmt_xdate()
plt.show()
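To see what "treated numerically" means in practice, here is a rough sketch with made-up data and illustrative variable names. It compares the bar centres with the values returned by matplotlib.dates.date2num and looks at the locator that matplotlib installs on the x-axis.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np
import pandas as pd

# Hypothetical data: 42 daily values starting on 2017-01-01
idx = pd.date_range("2017-01-01", periods=42)
values = np.linspace(-1.0, 1.0, 42)

fig, ax = plt.subplots()
bars = ax.bar(idx, values)

# Bar centres in axis units, which are matplotlib date numbers (days)
centers = np.array([b.get_x() + b.get_width() / 2 for b in bars])
# The same dates converted explicitly to date numbers
expected = mdates.date2num(idx.to_pydatetime())
# Locator that decides where the date ticks go
locator = ax.xaxis.get_major_locator()

print(np.allclose(centers, expected))
print(type(locator).__name__)
```

Since the axis now holds real date values, the tick locator can choose a sensible subset of dates instead of labelling every bar.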

The advantage is that matplotlib.dates locators and formatters can then be used. For example, to label the 1st and 15th of every month with a custom format:

import pandas as pd
import numpy as np; np.random.seed(42)
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# pd.datetime was removed in pandas 2.0; a date string is enough here
datelist = pd.date_range('2017-01-01', periods=93).tolist()
df = pd.DataFrame(np.cumsum(np.random.randn(93)), 
                  columns=['error'], index=pd.to_datetime(datelist))

plt.bar(df.index, df["error"].values)
plt.gca().xaxis.set_major_locator(mdates.DayLocator((1,15)))
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter("%d %b %Y"))
plt.gcf().autofmt_xdate()
plt.show()