Reordering a Pandas Series by weekday

12

Using Pandas, I imported a CSV file and then created a Series to find which days of the week had the most crashes:

crashes_by_day = bc['DAY_OF_WEEK'].value_counts()


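I then plotted it, but of course the bars appear in the same rank order as the Series, that is, sorted by count rather than by weekday.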

crashes_by_day.plot(kind='bar')


What is the most efficient way to reorder these elements into weekday order?

Do I need to break the Series out into a list? Thanks.


Can you post a small, copy-pasteable version of bc? - Lee
Here is the notebook on GitHub, if that helps: https://github.com/jakc/ExploringBikeCrashes/blob/master/ExploringBikeCrashData.ipynb - jakc
1 Answer

15

You can use an ordered Categorical and then sort_index:

print(bc)
   DAY_OF_WEEK    a    b
0       Sunday  0.7  0.5
1       Monday  0.4  0.1
2      Tuesday  0.3  0.2
3    Wednesday  0.4  0.1
4     Thursday  0.3  0.6
5       Friday  0.4  0.9
6     Saturday  0.3  0.2
7       Sunday  0.7  0.5
8       Monday  0.4  0.1
9      Tuesday  0.3  0.2
10   Wednesday  0.4  0.1
11    Thursday  0.3  0.6
12      Friday  0.4  0.9
13    Saturday  0.3  0.2
14      Sunday  0.7  0.5
15      Monday  0.4  0.1
16     Tuesday  0.3  0.2
17   Wednesday  0.4  0.1
18    Thursday  0.3  0.6
19      Friday  0.4  0.9
20    Saturday  0.3  0.2
import pandas as pd

# Declare the weekday order explicitly so that sorting follows it
bc['DAY_OF_WEEK'] = pd.Categorical(bc['DAY_OF_WEEK'], categories=
    ['Monday','Tuesday','Wednesday','Thursday','Friday','Saturday', 'Sunday'],
    ordered=True)

print(bc['DAY_OF_WEEK'])
0        Sunday
1        Monday
2       Tuesday
3     Wednesday
4      Thursday
5        Friday
6      Saturday
7        Sunday
8        Monday
9       Tuesday
10    Wednesday
11     Thursday
12       Friday
13     Saturday
14       Sunday
15       Monday
16      Tuesday
17    Wednesday
18     Thursday
19       Friday
20     Saturday
Name: DAY_OF_WEEK, dtype: category
Categories (7, object): [Monday < Tuesday < Wednesday < Thursday < Friday < Saturday < Sunday]
crashes_by_day = bc['DAY_OF_WEEK'].value_counts()
crashes_by_day = crashes_by_day.sort_index()
print(crashes_by_day)
Monday       3
Tuesday      3
Wednesday    3
Thursday     3
Friday       3
Saturday     3
Sunday       3
dtype: int64

crashes_by_day.plot(kind='bar')
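
For readers who want to try the ordered Categorical approach without the original CSV, here is a minimal self-contained sketch. The `bc_demo` frame and its contents are made up for illustration and only stand in for the real `bc` data; the steps mirror the ones above.

```python
import pandas as pd

# Hypothetical sample data standing in for the bc DataFrame from the question.
bc_demo = pd.DataFrame({
    "DAY_OF_WEEK": ["Sunday", "Monday", "Friday", "Monday", "Wednesday",
                    "Friday", "Friday", "Tuesday", "Saturday", "Thursday"],
})

days = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

# Convert the column to an ordered categorical so sorting follows weekday order.
bc_demo["DAY_OF_WEEK"] = pd.Categorical(
    bc_demo["DAY_OF_WEEK"], categories=days, ordered=True
)

# Count occurrences, then sort by the index (the categories) instead of by count.
crashes_by_day_demo = bc_demo["DAY_OF_WEEK"].value_counts().sort_index()

print(crashes_by_day_demo)
```

Calling `crashes_by_day_demo.plot(kind='bar')` afterwards should draw the bars from Monday through Sunday, assuming matplotlib is installed.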

If you do not want to use Categorical, another possible solution is to sort via a mapping from day name to position:

crashes_by_day = bc['DAY_OF_WEEK'].value_counts().reset_index()
crashes_by_day.columns = ['DAY_OF_WEEK', 'count']
print(crashes_by_day)
  DAY_OF_WEEK  count
0    Thursday      3
1   Wednesday      3
2      Friday      3
3     Tuesday      3
4      Monday      3
5    Saturday      3
6      Sunday      3

days = ['Monday','Tuesday','Wednesday','Thursday','Friday','Saturday', 'Sunday']
mapping = {day: i for i, day in enumerate(days)}
key = crashes_by_day['DAY_OF_WEEK'].map(mapping)
print(key)
0    3
1    2
2    4
3    1
4    0
5    5
6    6
Name: DAY_OF_WEEK, dtype: int64

crashes_by_day = crashes_by_day.iloc[key.argsort()].set_index('DAY_OF_WEEK')
print(crashes_by_day)
             count
DAY_OF_WEEK       
Monday           3
Tuesday          3
Wednesday        3
Thursday         3
Friday           3
Saturday         3
Sunday           3

crashes_by_day.plot(kind='bar')
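
As a shorter variant of the non-Categorical route, the counts can also be put in weekday order with `Series.reindex`, which is a different technique from the mapping and `argsort` steps above. This is only a sketch: `day_col` is made-up data, and `fill_value=0` is an added assumption so that a weekday with no rows shows up as zero instead of NaN.

```python
import pandas as pd

# Hypothetical sample column standing in for bc['DAY_OF_WEEK'].
day_col = pd.Series(
    ["Sunday", "Monday", "Friday", "Monday", "Wednesday",
     "Friday", "Friday", "Tuesday", "Saturday", "Thursday"],
    name="DAY_OF_WEEK",
)

days = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

# value_counts() orders by frequency; the index holds the day names.
counts = day_col.value_counts()

# Re-select the rows in the desired label order.
# fill_value=0 covers any weekday that never occurs in the data.
ordered_counts = counts.reindex(days, fill_value=0)

print(ordered_counts)
```

This skips the intermediate DataFrame, but it relies on the day names in the data matching the entries of `days` exactly.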



The ordered Categorical approach looks like the most elegant one? I'll read up on it now. Thanks a lot. - jakc
1
The Categorical solution is more elegant and faster. - jezrael
