The do function in dplyr lets you fit lots of cool models quickly and easily, but I'm struggling to use these models for a good rolling forecast.
# Data illustration
require(dplyr)
require(forecast)
df <- data.frame(
Date = seq.POSIXt(from = as.POSIXct("2015-01-01 00:00:00"),
to = as.POSIXct("2015-06-30 00:00:00"), by = "hour"))
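# n() is used instead of a hard-coded 4320 so the lengths always match the
# number of rows, whatever the local time zone / daylight-saving rules are
df <- df %>% mutate(Hour = as.numeric(format(Date, "%H")) + 1,
                    Wind = runif(n(), min = 1, max = 5000),
                    Temp = runif(n(), min = -20, max = 25),
                    Price = runif(n(), min = -15, max = 45)
)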
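My factor variable is Hour, my exogenous variables are Wind and Temp, and the variable I want to forecast is Price. So basically I have 24 models (one per hour) that I'd like to be able to run rolling forecasts with.
My data frame currently covers 180 days. I'd like to go back 100 days, make a rolling one-day-ahead forecast, and then be able to compare it with the actual Price.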
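To make the scheme concrete, the rolling windows can be written out as plain index arithmetic. This is only a sketch, assuming the data frame has exactly 180 * 24 hourly rows sorted by Date; the names `windows`, `train` and `test` are made up for illustration and do not appear in my code.

```r
hours_per_day <- 24
n_days        <- 180   # total days in the data frame
n_back        <- 100   # days to forecast, one at a time
n_train0      <- n_days - n_back   # 80 days in the first estimation window

# For forecast day k, train on everything before it and test on its 24 hours
windows <- lapply(seq_len(n_back), function(k) {
  train_end <- (n_train0 + k - 1) * hours_per_day
  list(train = seq_len(train_end),
       test  = (train_end + 1):(train_end + hours_per_day))
})

range(windows[[1]]$test)    # rows forecast in the first step
range(windows[[100]]$test)  # rows forecast in the last step
```

Each step extends the estimation window by one day, so this is an expanding window, not a fixed-length one.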
With a brute-force approach, the code would look like this:
# First I fit the data frame to be exactly the right length
# 100 days to start with (2015-03-21 or so), then 99, then 98.., etc.
n <- 100 * 24
# Make the price <- NA so I can replace it with a forecast
# (the + 1 keeps this to exactly 24 rows, i.e. one forecast per hour)
df$Price[(nrow(df) - n + 1):(nrow(df) - n + 24)] <- NA
# Now I make df just 81 days long, the estimation period + the first forecast
df <- df[1 : (nrow(df) - n + 24), ]
# The actual do & fit, later termed fx(df)
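result <- df %>% group_by(Hour) %>% do({
  historical <- .[!is.na(.$Price), ]
  forecasted <- .[is.na(.$Price), c("Date", "Hour", "Wind", "Temp")]
  # Columns 3:4 are Wind and Temp; Arima() expects xreg as a numeric matrix
  fit <- Arima(historical$Price, xreg = as.matrix(historical[, 3:4]),
               order = c(1, 1, 0))
  # forecast() dispatches to the Arima method (forecast.Arima is not exported
  # by current versions of the forecast package)
  data.frame(forecasted[],
             Price = as.numeric(forecast(fit, xreg = as.matrix(forecasted[, 3:4]))$mean))
})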
result
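Now I'd like to change n to 99 * 24. It would be great to do this with a loop or an apply-style function and save every new forecast, but I just can't figure out how.
I tried the following loop, but it hasn't worked so far: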
# 100 days ago, forecast that day, then the next, etc.
for (n in 1:100) {
nx <- n * 24 * 80 # Because I want to start after 80 days
df[nx:(nx + 23), 5] <- NA # Set prices to NA so I can forecast them
fx(df) # do the function
df.results[n] <- # Write the results into a vector / data frame to save them
# and now rinse and repeat for n + 1
}
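For reference, here is roughly the shape of what I'm after, as a hedged sketch and not a working answer I can vouch for. It assumes exactly 180 * 24 rows in UTC and a recent version of forecast in which `forecast()` is called on the fitted model. `fx()` and `rolling_forecast()` are hypothetical helper names, and the Forecast and Actual columns are introduced only for this illustration.

```r
library(dplyr)
library(forecast)

set.seed(1)  # assumption: fixed seed so the toy data is reproducible

# Toy data as above; tz = "UTC" and length.out are assumptions that give
# exactly 180 days of hourly rows regardless of daylight-saving rules
df <- data.frame(
  Date = seq(from = as.POSIXct("2015-01-01 00:00:00", tz = "UTC"),
             by = "hour", length.out = 180 * 24)
) %>%
  mutate(Hour  = as.numeric(format(Date, "%H")) + 1,
         Wind  = runif(n(), min = 1, max = 5000),
         Temp  = runif(n(), min = -20, max = 25),
         Price = runif(n(), min = -15, max = 45))

# Hypothetical helper: one ARIMA(1,1,0) with regressors per Hour, fitted on
# rows with a known Price, forecasting the rows where Price is NA
fx <- function(data) {
  data %>%
    group_by(Hour) %>%
    do({
      historical <- .[!is.na(.$Price), ]
      forecasted <- .[is.na(.$Price), c("Date", "Hour", "Wind", "Temp")]
      fit <- Arima(historical$Price,
                   xreg = as.matrix(historical[, c("Wind", "Temp")]),
                   order = c(1, 1, 0))
      fc <- forecast(fit, xreg = as.matrix(forecasted[, c("Wind", "Temp")]))
      data.frame(forecasted, Forecast = as.numeric(fc$mean))
    }) %>%
    ungroup()
}

# Hypothetical helper: forecast each of the last n_back days in turn, always
# estimating on all data before that day, and keep the actual prices alongside
rolling_forecast <- function(data, n_back = 100, hours_per_day = 24) {
  n_total <- nrow(data)
  lapply(seq_len(n_back), function(k) {
    train_end <- n_total - (n_back - k + 1) * hours_per_day
    held_out  <- (train_end + 1):(train_end + hours_per_day)
    window    <- data[seq_len(train_end + hours_per_day), ]
    window$Price[held_out] <- NA  # hide the day that is being forecast
    fx(window) %>%
      left_join(data[held_out, c("Date", "Price")], by = "Date") %>%
      rename(Actual = Price)
  }) %>%
    bind_rows()
}

# n_back = 5 keeps this run short; n_back = 100 is the setup described above
results <- rolling_forecast(df, n_back = 5)
results %>% summarise(MAE = mean(abs(Forecast - Actual)))
```

The original data frame is never overwritten here; each step works on a copy with one day of prices blanked out, which is what makes the comparison with the actual Price possible afterwards.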
Truly amazing bonus points for a solution similar to :)