Finding the longest consecutive increasing sequence in Pandas

Question

3

I have a dataframe:

Date    Price
2021-01-01 29344.67
2021-01-02 32072.08
2021-01-03 33048.03
2021-01-04 32084.61
2021-01-05 34105.46
2021-01-06 36910.18
2021-01-07 39505.51
2021-01-08 40809.93
2021-01-09 40397.52
2021-01-10 38505.49

Date      object
Price    float64
dtype: object

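My goal is to find the longest period of consecutive growth. It should return: the longest consecutive growth period ran from 2021-01-04 to 2021-01-08, with an increase of $8725.32. Honestly, I don't know where to start. These are my first steps in pandas, and I don't know which tools I should use to get this information.

Can anyone help me or point me in the right direction?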

- nikisaku

2 Answers

0

Like Quang did, split the rows into groups, then pick the group with the most rows:

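# Every price drop starts a new group; the running count of drops is the group id
s = df.Price.diff().lt(0).cumsum()
# Keep the rows whose group id has the highest row count
out = df.loc[s == s.value_counts().sort_values().index[-1]]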
Out[514]: 
         Date     Price
3  2021-01-04  32084.61
4  2021-01-05  34105.46
5  2021-01-06  36910.18
6  2021-01-07  39505.51
7  2021-01-08  40809.93
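
For anyone who wants to run this end to end, here is a minimal sketch of the same idea. The frame is rebuilt from the question's data, and `idxmax()` is swapped in for `sort_values().index[-1]`; that substitution and the name `longest_id` are mine, not part of the answer. If two runs tie for longest, I would not rely on which one either version returns.

```python
import pandas as pd

# Data copied from the question; Date is kept as plain strings (dtype object).
df = pd.DataFrame({
    "Date": ["2021-01-01", "2021-01-02", "2021-01-03", "2021-01-04",
             "2021-01-05", "2021-01-06", "2021-01-07", "2021-01-08",
             "2021-01-09", "2021-01-10"],
    "Price": [29344.67, 32072.08, 33048.03, 32084.61, 34105.46,
              36910.18, 39505.51, 40809.93, 40397.52, 38505.49],
})

# Every price drop starts a new group; the running count of drops is the group id.
s = df["Price"].diff().lt(0).cumsum()

# Group id with the most rows (hypothetical helper name).
longest_id = s.value_counts().idxmax()

out = df.loc[s == longest_id]
print(out)
```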

- BENY

- Quang Hoang · Accepted Answer

Detect your increasing sequences by taking a cumsum over the rows where the price decreases:

df['is_increasing'] = df['Price'].diff().lt(0).cumsum()

You will get:

         Date     Price  is_increasing
0  2021-01-01  29344.67              0
1  2021-01-02  32072.08              0
2  2021-01-03  33048.03              0
3  2021-01-04  32084.61              1
4  2021-01-05  34105.46              1
5  2021-01-06  36910.18              1
6  2021-01-07  39505.51              1
7  2021-01-08  40809.93              1
8  2021-01-09  40397.52              2
9  2021-01-10  38505.49              3
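
To see why the labels come out this way, it may help to look at each intermediate step on its own. The sketch below uses the question's prices; the column names `diff`, `dropped` and `run_id` are made up for illustration. Note that `lt(0)` lets a flat day (a difference of exactly 0) stay in the current run; if a flat day should end the run, `le(0)` would probably be the comparison to try.

```python
import pandas as pd

price = pd.Series([29344.67, 32072.08, 33048.03, 32084.61, 34105.46,
                   36910.18, 39505.51, 40809.93, 40397.52, 38505.49])

steps = pd.DataFrame({
    "Price": price,
    "diff": price.diff(),           # change from the previous row; NaN for the first row
    "dropped": price.diff().lt(0),  # True where the price fell; NaN compares as False
})

# Summing the booleans cumulatively bumps the label by one at every drop,
# so all rows between two drops share the same label.
steps["run_id"] = steps["dropped"].cumsum()
print(steps)
```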

Now you can detect the longest sequence with:

sizes=df.groupby('is_increasing')['Price'].transform('size')
df[sizes == sizes.max()]

And you will get:

         Date     Price  is_increasing
3  2021-01-04  32084.61              1
4  2021-01-05  34105.46              1
5  2021-01-06  36910.18              1
6  2021-01-07  39505.51              1
7  2021-01-08  40809.93              1
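
The question also asks for the start date, end date and total increase. A minimal sketch of that last step, assuming the data from the question and a single longest run, could look like this. The names `run_id`, `longest`, `start`, `end` and `increase` are illustrative, and the grouping Series is passed to `groupby` directly rather than stored as a column.

```python
import pandas as pd

# Data copied from the question; Date is kept as plain strings (dtype object).
df = pd.DataFrame({
    "Date": ["2021-01-01", "2021-01-02", "2021-01-03", "2021-01-04",
             "2021-01-05", "2021-01-06", "2021-01-07", "2021-01-08",
             "2021-01-09", "2021-01-10"],
    "Price": [29344.67, 32072.08, 33048.03, 32084.61, 34105.46,
              36910.18, 39505.51, 40809.93, 40397.52, 38505.49],
})

# Label each run: the label increases by one at every price drop.
run_id = df["Price"].diff().lt(0).cumsum().rename("run_id")

# Size of the run each row belongs to, aligned with df's index.
sizes = df.groupby(run_id)["Price"].transform("size")
longest = df[sizes == sizes.max()]

# First and last rows of the run give the period and the total increase.
start, end = longest["Date"].iloc[0], longest["Date"].iloc[-1]
increase = round(longest["Price"].iloc[-1] - longest["Price"].iloc[0], 2)
print(f"Longest increasing period: {start} to {end}, up ${increase:.2f}")
```

If several runs tie for the maximum size, `sizes == sizes.max()` keeps the rows of all of them, so the first and last rows would no longer belong to one run; that case would need an extra step to pick a single run.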