How to plot grouped bars with overlaid lines

I am trying to use matplotlib to recreate a chart that I built in Excel from the table below.
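| Category | %_total_dist_1 | event_rate_%_1 | %_total_dist_2 | event_rate_%_2 |
|---|---|---|---|---|
| 00 (-inf, 0.25) | 5.7 | 36.5 | 5.8 | 10 |
| 01 [0.25, 4.75) | 7 | 11.2 | 7 | 11 |
| 02 [4.75, 6.75) | 10.5 | 5 | 10.5 | 4.8 |
| 03 [6.75, 8.25) | 13.8 | 3.9 | 13.7 | 4 |
| 04 [8.25, 9.25) | 9.1 | 3.4 | 9.2 | 3.1 |
| 05 [9.25, 10.75) | 14.1 | 2.5 | 14.2 | 2.4 |
| 06 [10.75, 11.75) | 13.7 | 1.6 | 13.7 | 1.8 |
| 07 [11.75, 13.75) | 16.8 | 1.3 | 16.7 | 1.3 |
| 08 [13.75, inf) | 9.4 | 1 | 9.1 | 1.3 |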
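The problems I am facing are:
1. The bars overlap each other in Matplotlib.
2. I want to rotate the x-axis labels by 45 degrees so they do not overlap, but I don't know how.
3. I want to add markers to the lines.
Here is the code I used: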
import pandas as pd
import matplotlib.pyplot as plt

# Create a Pandas DataFrame with your data
data = {
    "Category": ["00 (-inf, 0.25)", "01 [0.25, 4.75)", "02 [4.75, 6.75)", "03 [6.75, 8.25)",
                 "04 [8.25, 9.25)", "05 [9.25, 10.75)", "06 [10.75, 11.75)", "07 [11.75, 13.75)", "08 [13.75, inf)"],
    "%_total_dist_1": [5.7, 7, 10.5, 13.8, 9.1, 14.1, 13.7, 16.8, 9.4],
    "event_rate_%_1": [36.5, 11.2, 5, 3.9, 3.4, 2.5, 1.6, 1.3, 1],
    "%_total_dist_2": [5.8, 7, 10.5, 13.7, 9.2, 14.2, 13.7, 16.7, 9.1],
    "event_rate_%_2": [10, 11, 4.8, 4, 3.1, 2.4, 1.8, 1.3, 1.3]
}

df = pd.DataFrame(data)

# Create a figure and primary y-axis
fig, ax1 = plt.subplots(figsize=(10, 6))

# Plot percentage distribution on the primary y-axis
ax1.bar(df['Category'], df['%_total_dist_1'], alpha=0.7, label="%_total_dist_1", color='b')
ax1.bar(df['Category'], df['%_total_dist_2'], alpha=0.7, label="%_total_dist_2", color='g')
ax1.set_ylabel('% Distribution', color='b')
ax1.tick_params(axis='y', labelcolor='b')

# Create a secondary y-axis
ax2 = ax1.twinx()

# Plot event rate on the secondary y-axis
ax2.plot(df['Category'], df['event_rate_%_1'], marker='o', label='event_rate_%_1', color='r')
ax2.plot(df['Category'], df['event_rate_%_2'], marker='o', label='event_rate_%_2', color='orange')
ax2.set_ylabel('Event Rate (%)', color='r')
ax2.tick_params(axis='y', labelcolor='r')

# Adding legend
fig.tight_layout()
plt.title('Percentage Distribution and Event Rate')
fig.legend(loc="upper left", bbox_to_anchor=(0.15, 0.85))

# Rotate x-axis labels for better readability
plt.xticks(rotation=45, ha="right")

# Show the plot
plt.show()
2 Answers

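The best way to do this is to use the `pandas` plotting API directly, `pandas.DataFrame.plot`, which uses `matplotlib` as its default backend.
This spaces the grouped bars correctly.
Other formatting can still be applied through the explicit "Axes" interface.
Switching back and forth between the explicit interface and the implicit `pyplot` interface is not recommended. It is best to be explicit.
"How to put the legend outside the plot" has more information about moving the legend, including placing it at the bottom and using multiple columns.
Note that each Axes, `ax1` and `ax2`, has its own legend.
Tested in `python 3.11.4`, `pandas 2.1.0`, `matplotlib 3.7.2`.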
# optionally remove the digits preceding the cut range in the Category column
df.Category = df.Category.str.split(r'\d+ ', regex=True, expand=True)[1]

# plot the bars; add rot=45 to rotate the xtick labels
ax1 = df.plot(kind='bar', x='Category', y=['%_total_dist_1', '%_total_dist_2'], color=['b', 'g'], figsize=(15, 6), ylabel='% Distribution')

# plot the lines on the secondary_y
ax2 = df.plot(x='Category', y=['event_rate_%_1', 'event_rate_%_2'], marker='.', color=['r', 'orange'], secondary_y=True, ax=ax1, ylabel='Event Rate (%)')

# move the legends
ax1.legend(bbox_to_anchor=(1.05, 0.5), loc='center left', frameon=False)
ax2.legend(bbox_to_anchor=(1.05, 0.4), loc='center left', frameon=False)

# set the figure title
ax1.figure.suptitle('Percentage Distribution and Event Rate')
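
Because each Axes keeps its own legend, one way to get a single legend at the bottom with several columns is to collect the handles from both Axes and pass them to `fig.legend`. The following is a minimal sketch in plain matplotlib, not part of the tested answer above; the three data points, the `ncol=2` layout, and the `bottom=0.2` margin are illustrative assumptions.

```python
import matplotlib.pyplot as plt

# Small illustrative subset of the data (assumption: first three rows only)
fig, ax1 = plt.subplots(figsize=(8, 4))
ax1.bar([0, 1, 2], [5.7, 7.0, 10.5], color="b", label="%_total_dist_1")

ax2 = ax1.twinx()
ax2.plot([0, 1, 2], [36.5, 11.2, 5.0], marker=".", color="r", label="event_rate_%_1")

# Each Axes tracks its own handles, so gather both sets before building one legend
handles1, labels1 = ax1.get_legend_handles_labels()
handles2, labels2 = ax2.get_legend_handles_labels()

# Reserve space under the plot, then place one multi-column legend there
fig.subplots_adjust(bottom=0.2)
legend = fig.legend(handles1 + handles2, labels1 + labels2,
                    loc="lower center", ncol=2, frameon=False)
```

Using `fig.legend` rather than `ax1.legend` keeps the legend from being covered by artists drawn on the twin Axes.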




Solution

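To fix the overlapping bars, give each bar series an offset from the tick position equal to half the bar width, one series to the left and one to the right. The pair is then centered on the tick and the bars sit side by side instead of on top of each other. To rotate the x-axis labels, call plt.xticks(...) before creating ax2. plt.xticks acts on the current Axes, which becomes ax2 after twinx(), while the visible x-axis labels belong to the first Axes. Finally, to draw horizontal grid lines, include ax1.grid(which='major', axis='y', linestyle='--', zorder=1). Set zorder to 1 on that line and to 2 when creating the bars and lines, so that the grid lines stay in the background and are not drawn on top of the bars.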
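
The half-width offset generalizes to more than two series. The helper below is a minimal sketch of that arithmetic; `bar_offsets` and its `group_width` parameter are hypothetical names introduced here for illustration, not part of matplotlib.

```python
def bar_offsets(n_series, group_width=0.4):
    """Return (offsets, bar_width) that center n_series bars on each tick.

    Hypothetical helper: group_width is the total width taken by one group.
    """
    bar_width = group_width / n_series
    # Bar i sits (i - (n_series - 1) / 2) bar widths away from the tick
    offsets = [(i - (n_series - 1) / 2) * bar_width for i in range(n_series)]
    return offsets, bar_width


# Two series in a group 0.4 wide: bars 0.2 wide, shifted half a width each way
offsets, width = bar_offsets(2, group_width=0.4)
print(offsets, width)
```

Each offset is added to the integer tick positions, for example `ax1.bar(x + offsets[0], ..., width=width)`.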


Code

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# Create a Pandas DataFrame with your data
data = {
    "Category": ["00 (-inf, 0.25)", "01 [0.25, 4.75)", "02 [4.75, 6.75)", "03 [6.75, 8.25)",
                 "04 [8.25, 9.25)", "05 [9.25, 10.75)", "06 [10.75, 11.75)", "07 [11.75, 13.75)", "08 [13.75, inf)"],
    "%_total_dist_1": [5.7, 7, 10.5, 13.8, 9.1, 14.1, 13.7, 16.8, 9.4],
    "event_rate_%_1": [36.5, 11.2, 5, 3.9, 3.4, 2.5, 1.6, 1.3, 1],
    "%_total_dist_2": [5.8, 7, 10.5, 13.7, 9.2, 14.2, 13.7, 16.7, 9.1],
    "event_rate_%_2": [10, 11, 4.8, 4, 3.1, 2.4, 1.8, 1.3, 1.3]
}

df = pd.DataFrame(data)

# Create a figure and primary y-axis
fig, ax1 = plt.subplots(figsize=(10, 6))

x=np.arange(len(df['Category']))

# THIS LINE MAKES THE HORIZONTAL GRID LINES ON THE PLOT
ax1.grid(which='major', axis='y', linestyle='--',zorder=1)

# THIS PLOTS THE BARS NEXT TO EACH OTHER INSTEAD OF OVERLAPPING
ax1.bar(x+0.1, df['%_total_dist_1'], width=0.2, alpha=1.0, label="%_total_dist_1", color='b',zorder=2)
ax1.bar(x-0.1, df['%_total_dist_2'], width=0.2, alpha=1.0, label="%_total_dist_2", color='g',zorder=2)
ax1.set_ylabel('% Distribution', color='b')
ax1.tick_params(axis='y', labelcolor='b')

# THIS LINE ROTATES THE X-AXIS LABELS
plt.xticks(rotation=45, ha="right")

# Create a secondary y-axis
ax2 = ax1.twinx()

# Plot event rate on the secondary y-axis
ax2.plot(df['Category'], df['event_rate_%_1'], marker='o', label='event_rate_%_1', color='r',zorder=2)
ax2.plot(df['Category'], df['event_rate_%_2'], marker='o', label='event_rate_%_2', color='orange',zorder=2)
ax2.set_ylabel('Event Rate (%)', color='r')
ax2.tick_params(axis='y', labelcolor='r')

# Adding legend
fig.tight_layout()
plt.title('Percentage Distribution and Event Rate')
fig.legend(loc="upper left", bbox_to_anchor=(0.15, 0.85))

# Show the plot
plt.show()
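
For comparison, here is a sketch of the same chart written only against the explicit Axes interface, so the order of calls relative to `twinx()` no longer matters. It is a variant under stated assumptions, not the answer's code: it plots the lines at the numeric positions `x`, sets the category labels with `set_xticks`/`set_xticklabels`, uses `set_axisbelow(True)` instead of `zorder` to keep the grid behind the bars, and writes the result to a file named `grouped_bars_with_lines.png` (an arbitrary name).

```python
import matplotlib.pyplot as plt
import numpy as np

categories = ["00 (-inf, 0.25)", "01 [0.25, 4.75)", "02 [4.75, 6.75)",
              "03 [6.75, 8.25)", "04 [8.25, 9.25)", "05 [9.25, 10.75)",
              "06 [10.75, 11.75)", "07 [11.75, 13.75)", "08 [13.75, inf)"]
dist_1 = [5.7, 7, 10.5, 13.8, 9.1, 14.1, 13.7, 16.8, 9.4]
dist_2 = [5.8, 7, 10.5, 13.7, 9.2, 14.2, 13.7, 16.7, 9.1]
rate_1 = [36.5, 11.2, 5, 3.9, 3.4, 2.5, 1.6, 1.3, 1]
rate_2 = [10, 11, 4.8, 4, 3.1, 2.4, 1.8, 1.3, 1.3]

x = np.arange(len(categories))  # one integer position per category
width = 0.2

fig, ax1 = plt.subplots(figsize=(10, 6))
ax1.set_axisbelow(True)  # draw grid lines behind the bars
ax1.grid(axis="y", linestyle="--")

# Shift each series by half a bar width so the pair is centered on the tick
ax1.bar(x - width / 2, dist_1, width=width, color="b", label="%_total_dist_1")
ax1.bar(x + width / 2, dist_2, width=width, color="g", label="%_total_dist_2")
ax1.set_ylabel("% Distribution")

# Label and rotate the ticks on ax1, which owns the visible x-axis
ax1.set_xticks(x)
ax1.set_xticklabels(categories, rotation=45, ha="right")

# Lines with markers on a secondary y-axis, at the same x positions
ax2 = ax1.twinx()
ax2.plot(x, rate_1, marker="o", color="r", label="event_rate_%_1")
ax2.plot(x, rate_2, marker="o", color="orange", label="event_rate_%_2")
ax2.set_ylabel("Event Rate (%)")

ax1.legend(loc="upper center")
ax2.legend(loc="upper right")
ax1.set_title("Percentage Distribution and Event Rate")
fig.tight_layout()
fig.savefig("grouped_bars_with_lines.png")
```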
