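I'm trying to build a program that works with baseball statistics. I ask the user to enter a team, and the code then searches the pandas DataFrame I created for rows whose "teamID" matches the user's input.
I've tried grouping by "teamID", but I still need to do some indexing before the for loop.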
def AttendancePlot(teams,team_pick):
    fig, ax = plt.subplots()
    group_by_teamID = teams.groupby(by=['teamID'])
    print group_by_teamID
    for i in group_by_teamID.index:
        if i == team_pick:
            ax.scatter(teams['yearID'][i], teams['attendance'][i], color="#4DDB94", s=200)
            ax.annotate(i, (teams['yearID'][i], teams['attendance'][i]),
                        bbox=dict(boxstyle="round", color="#4DDB94"),
                        xytext=(-30, 30), textcoords='offset points',
                        arrowprops=dict(arrowstyle="->", connectionstyle="angle,angleA=0,angleB=90,rad=10"))
How I created the DataFrame:
teams = pd.read_csv('Teams.csv')
salaries = pd.read_csv('Salaries.csv')
names = pd.read_csv('Names.csv')
teams = teams[teams['yearID'] >= 1985]
teams = teams[['yearID', 'teamID', 'Rank', 'R', 'RA', 'G', 'W', 'H', 'BB', 'HBP', 'AB', 'SF', 'HR', '2B', '3B', 'attendance']]
teams = teams.set_index(['yearID', 'teamID'])
salaries_by_yearID_teamID = salaries.groupby(['yearID', 'teamID']) ['salary'].sum()
teams = teams.join(salaries_by_yearID_teamID)
print teams.head(15)
The resulting DataFrame:
               Rank    R   RA    G  ...   2B  3B  attendance      salary
yearID teamID                       ...
1985   ATL        5  632  781  162  ...  213  28   1350137.0  14807000.0
       BAL        4  818  764  161  ...  234  22   2132387.0  11560712.0
       BOS        5  800  720  163  ...  292  31   1786633.0  10897560.0
       CAL        2  732  703  162  ...  215  31   2567427.0  14427894.0
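I'd like to show a scatter plot of attendance per year for the team the user enters. At the moment I only get a blank plot, with no error message.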
1985   ATL        5  632  781  162  ...  213  28   1350137.0  14807000.0
       BAL        4  818  764  161  ...  234  22   2132387.0  11560712.0
       BOS        5  800  720  163  ...  292  31   1786633.0  10897560.0
       CAL        2  732  703  162  ...  215  31   2567427.0  14427894.0
       CHA        3  736  720  163  ...  247  37   1669888.0   9846178.0
- Greg Milani
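Since `teams` is indexed by a (yearID, teamID) MultiIndex, one possible approach is to take a cross-section on the teamID level instead of looping over a groupby. The sketch below is only an illustration under that assumption, not a confirmed fix for the code in the question: the function name `attendance_plot`, the `demo` frame, and the two 1986 attendance values are hypothetical placeholders (the 1985 values are copied from the output above).

```python
import pandas as pd
import matplotlib.pyplot as plt


def attendance_plot(teams, team_pick):
    """Scatter-plot attendance per year for a single team.

    Assumes `teams` has a (yearID, teamID) MultiIndex, as in the question.
    """
    # Cross-section on the teamID level; the result is indexed by yearID only.
    # Raises KeyError if team_pick is not present in the index.
    one_team = teams.xs(team_pick, level="teamID")

    fig, ax = plt.subplots()
    ax.scatter(one_team.index, one_team["attendance"], color="#4DDB94", s=200)
    ax.set_xlabel("yearID")
    ax.set_ylabel("attendance")
    ax.set_title(team_pick)
    return fig, ax, one_team


# Small stand-in for the real data. The 1986 numbers are made-up placeholders.
demo = pd.DataFrame(
    {
        "yearID": [1985, 1985, 1986, 1986],
        "teamID": ["ATL", "BAL", "ATL", "BAL"],
        "attendance": [1350137.0, 2132387.0, 1400000.0, 2000000.0],
    }
).set_index(["yearID", "teamID"])

fig, ax, one_team = attendance_plot(demo, "ATL")
```

Calling `plt.show()` afterwards (or `fig.savefig(...)`) displays the figure. With the real data, the call would be `attendance_plot(teams, team_pick)`, where `team_pick` is the user's input.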