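Thanks very much for any help. I'm trying to relearn some fundamentals.
Here is some sample data, taken from a database of injured workers, that relates to my question.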
Area <- c("Connecticut", "Maine", "Massachusetts", "New Hampshire", "Texas", "Arizona", "California", "Washington")
Region <- c("Northeast", "Northeast", "Northeast", "Northeast", "South", "South", "West", "West")
X2004 <- c(0,1,4,1,3,4,2,2)
X2005 <- c(1,0,6,2,0,1,0,2)
X2006 <- c(0,0,1,1,2,1,0,0)
df1 <- data.frame(Area, Region, X2004, X2005, X2006)
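In base R, I would like to show the percent change from the two-year average for 2004-2005 to 2006 on its own. I have already solved this with the tidyverse, but it feels like leaning on a crutch. Here is the current code: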
df2 <- reshape(df1,
               idvar = "Area",
               v.names = "count",
               varying = c("X2004", "X2005", "X2006"),
               direction = "long",
               times = 2004:2006,
               timevar = "year")
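library(dplyr)  # provides %>%, group_by() and summarise()
# Total count per Region and year
df3 <- df2 %>%
  group_by(Region, year) %>%
  summarise(total_count = sum(count))
# Flag the baseline years (2004-2005) versus 2006
df3$pre <- ifelse(df3$year <= 2005, 1, 0)
# Mean baseline total, 2006 total, and percent change per Region
df3 %>%
  group_by(Region) %>%
  summarise(mean_count_pre = mean(total_count[pre == 1]),
            mean_count_post = mean(total_count[pre == 0]),
            pct_change = 100 * (mean_count_post - mean_count_pre) / mean_count_pre)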
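Is there a way to solve this without relying on the tidyverse or dplyr? Thanks very much for your help. I learned R with the tidyverse and now want to understand the fundamentals better.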
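For reference, one possible base R approach is sketched below. It is an illustration rather than the definitive answer, and it assumes the data can stay in wide format, so the `reshape()` step is skipped. The names `totals` and `result` are introduced here for illustration only. `aggregate()` sums the yearly counts within each `Region`, and `rowMeans()` averages the 2004 and 2005 totals.

```r
# Sample data copied from the question
Area <- c("Connecticut", "Maine", "Massachusetts", "New Hampshire",
          "Texas", "Arizona", "California", "Washington")
Region <- c("Northeast", "Northeast", "Northeast", "Northeast",
            "South", "South", "West", "West")
X2004 <- c(0, 1, 4, 1, 3, 4, 2, 2)
X2005 <- c(1, 0, 6, 2, 0, 1, 0, 2)
X2006 <- c(0, 0, 1, 1, 2, 1, 0, 0)
df1 <- data.frame(Area, Region, X2004, X2005, X2006)

# Sum each year's counts within each Region (one row per Region)
totals <- aggregate(cbind(X2004, X2005, X2006) ~ Region, data = df1, FUN = sum)

# Baseline: mean of the 2004 and 2005 regional totals
mean_count_pre <- rowMeans(totals[, c("X2004", "X2005")])
# Comparison year: the 2006 regional total on its own
mean_count_post <- totals$X2006

# Assemble the same columns the dplyr version produces
result <- data.frame(
  Region = totals$Region,
  mean_count_pre = mean_count_pre,
  mean_count_post = mean_count_post,
  pct_change = 100 * (mean_count_post - mean_count_pre) / mean_count_pre
)
print(result)
```

With the sample data this should give a percent change of about -73.3 for Northeast, -25 for South, and -100 for West, matching the dplyr result. If the long format from `reshape()` is preferred, `aggregate(count ~ Region + year, data = df2, FUN = sum)` would be a plausible starting point for the first step instead.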