How to calculate the most common time of the daily maximum, per weekday, in pandas?

5

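Using the yfinance (Yahoo Finance) package in Python, I'm able to download the relevant data showing OCHL. My goal is to find out at which time of day, on average, the stock hits its high.

Here is the code that downloads the data: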

import yfinance as yf
import pandas as pd

df = yf.download(
        tickers = "AAPL",
        period = "60d",
        interval = "5m",
        auto_adjust = True,
        group_by = 'ticker',
        prepost = True,
    )

maxTimes = df.groupby([df.index.month, df.index.day, df.index.day_name()])['High'].idxmax()

This gives me something like the following:
Datetime  Datetime  Datetime 
6         2         Tuesday     2020-06-02 19:45:00-04:00
          3         Wednesday   2020-06-03 15:50:00-04:00
          4         Thursday    2020-06-04 10:30:00-04:00
          5         Friday      2020-06-05 11:30:00-04:00
...
8         3         Monday      2020-08-03 14:40:00-04:00
          4         Tuesday     2020-08-04 18:10:00-04:00
          5         Wednesday   2020-08-05 11:10:00-04:00
          6         Thursday    2020-08-06 16:20:00-04:00
          7         Friday      2020-08-07 15:50:00-04:00
Name: High, dtype: datetime64[ns, America/New_York]

The maxTimes object I created should, I believe, give the time at which each day's high occurred. What I need now, however, is:

Monday    12:00
Tuesday   13:25
Wednesday 09:35
Thurs     16:10
Fri       12:05

Can anyone help me work out how to get my data into this shape?

1 Answer

2
This should work:
import yfinance as yf
import pandas as pd

df = yf.download(
        tickers = "AAPL",
        period = "60d",
        interval = "5m",
        auto_adjust = True,
        group_by = 'ticker',
        prepost = True,
    )

maxTimes = df.groupby([df.index.month, df.index.day, df.index.day_name()])['High'].idxmax()

# Drop date
maxTimes = maxTimes.apply(lambda x: x.time())

# Drop unused sub-indexes
maxTimes = maxTimes.droplevel(level=[0,1])

# To seconds
maxTimes = maxTimes.apply(lambda t: (t.hour * 60 + t.minute) * 60 + t.second)

# Get average
maxTimes =  maxTimes.groupby(maxTimes.index).mean()

# Back to time
maxTimes = pd.to_datetime(maxTimes, unit='s').apply(lambda x: x.time())

print(maxTimes)

'''
Output:

Datetime
Friday       11:59:32.727272
Monday              14:15:00
Thursday            13:21:40
Tuesday             10:35:00
Wednesday           11:53:45
Name: High, dtype: object

'''
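
For anyone who wants to check the averaging logic without downloading data, here is a minimal offline sketch of the same idea. The function name `mean_high_time_by_weekday` and the tiny hand-made `high` series are hypothetical and exist only for illustration; they are not part of yfinance or of the code above. It groups by calendar date rather than by (month, day), which is an assumption on my part that avoids mixing the same date from different years when the period is longer.

```python
import pandas as pd


def mean_high_time_by_weekday(high: pd.Series) -> pd.Series:
    """Average time of day (as a Timedelta from midnight) of the daily high, per weekday."""
    # Timestamp of the highest bar within each calendar day (local dates of the index).
    peaks = pd.DatetimeIndex(high.groupby(high.index.date).idxmax())
    # Seconds since local midnight for each daily peak.
    seconds = peaks.hour * 3600 + peaks.minute * 60 + peaks.second
    by_weekday = pd.Series(seconds, index=peaks.day_name(), dtype="float64")
    # Average the seconds per weekday name and convert back to a time offset.
    return pd.to_timedelta(by_weekday.groupby(level=0).mean(), unit="s")


# Hypothetical sample: three bars per day on two Mondays and two Tuesdays.
days = ["2020-06-01", "2020-06-02", "2020-06-08", "2020-06-09"]
times = ["10:00", "12:00", "14:00"]
index = pd.DatetimeIndex([f"{d} {t}" for d in days for t in times])
index = index.tz_localize("America/New_York")
values = [1, 3, 2,   # Mon 2020-06-01: high at 12:00
          5, 4, 3,   # Tue 2020-06-02: high at 10:00
          1, 2, 3,   # Mon 2020-06-08: high at 14:00
          2, 6, 1]   # Tue 2020-06-09: high at 12:00
high = pd.Series(values, index=index, name="High", dtype="float64")

result = mean_high_time_by_weekday(high)
print(result)
```

With this sample the Monday highs fall at 12:00 and 14:00, so the Monday average is 13:00, and the Tuesday average is 11:00. Note that this is a mean, as in the code above; if you literally want the most common time, you would need a mode instead, and ties would have to be resolved somehow.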

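You're a hero! My only question: the time zone in the original project is Eastern Time (UTC-4). Do you know which time zone the output is in? - Ash
Same as the source data, so UTC-4. - M. Abreu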
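
A quick way to convince yourself of the time zone point is to call `.time()` on a single tz-aware timestamp. The sketch below uses one value copied from the question's output; the variable names are mine. As far as I know, `.time()` returns the wall-clock time in the timestamp's own zone and drops the zone information, so nothing is converted to UTC unless you ask for it.

```python
import datetime

import pandas as pd

# One of the daily-high timestamps from the question, in US Eastern time.
ts = pd.Timestamp("2020-06-02 19:45", tz="America/New_York")

# .time() keeps the local wall-clock time and discards the time zone.
local_time = ts.time()

# Converting explicitly shows what the same instant looks like in UTC.
utc_time = ts.tz_convert("UTC").time()

# In June, Eastern time observes daylight saving time, hence UTC-4.
offset = ts.utcoffset()

print(local_time, utc_time, offset)
```

Keep in mind that the offset is UTC-4 only while daylight saving time is in effect; for dates in winter the same zone is UTC-5, so a sample that spans the changeover mixes both offsets even though the wall-clock times stay comparable.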
