我正在开发一个模型,用于预测某一年龄段的生育率。目前我有一个类似这样的数据框,其中行是年龄,列是年份。每个单元格里的数值是该年份对应年龄的生育率:
> df1
iso3 sex age fert1953 fert1954 fert1955
14 AUS female 13 0.000 0.00000 0.00000
15 AUS female 14 0.000 0.00000 0.00000
16 AUS female 15 13.108 13.42733 13.74667
17 AUS female 16 26.216 26.85467 27.49333
18 AUS female 17 39.324 40.28200 41.24000
然而,我希望每一行都是一个群体。因为行和列代表的是各个年份,所以可以通过获取对角线来获得群体数据。我想要的结果如下:
> df2
iso3 sex ageIn1953 fert1953 fert1954 fert1955
14 AUS female 13 0.000 0.00000 13.74667
15 AUS female 14 0.000 13.42733 27.49333
16 AUS female 15 13.108 26.85467 41.24000
17 AUS female 16 26.216 40.28200 [data..]
18 AUS female 17 39.324 [data..] [data..]
这里是 df1
数据框:
df1 <- structure(list(iso3 = c("AUS", "AUS", "AUS", "AUS", "AUS"), sex = c("female",
"female", "female", "female", "female"), age = c(13, 14, 15,
16, 17), fert1953 = c(0, 0, 13.108, 26.216, 39.324), fert1954 = c(0,
0, 13.4273333333333, 26.8546666666667, 40.282), fert1955 = c(0,
0, 13.7466666666667, 27.4933333333333, 41.24)), .Names = c("iso3",
"sex", "age", "fert1953", "fert1954", "fert1955"), class = "data.frame", row.names = 14:18)
编辑:
这是我最终使用的解决方法。它基于David的答案,但我需要为每个iso3
级别执行此操作。
df.ls <- lapply(split(f3, f = f3$iso3), FUN = function(df1) {
n <- ncol(df1) - 4
temp <- mapply(function(x, y) lead(x, n = y), df1[, -seq_len(4)], seq_len(n))
return(cbind(df1[seq_len(4)], temp))
})
f4 <- do.call("rbind", df.ls)