I want to count, per country, how many times the status is open and how many times it is closed, and then calculate the close rate for each country.
Data:
customer <- c(1,2,3,4,5,6,7,8,9)
country <- c('BE', 'NL', 'NL','NL','BE','NL','BE','BE','NL')
closeday <- c('2017-08-23', '2017-08-05', '2017-08-22', '2017-08-26',
'2017-08-25', '2017-08-13', '2017-08-30', '2017-08-05', '2017-08-23')
closeday <- as.Date(closeday)
df <- data.frame(customer,country,closeday)
Adding status:
df$status <- ifelse(df$closeday < '2017-08-20', 'open', 'closed')
  customer country   closeday status
1        1      BE 2017-08-23 closed
2        2      NL 2017-08-05   open
3        3      NL 2017-08-22 closed
4        4      NL 2017-08-26 closed
5        5      BE 2017-08-25 closed
6        6      NL 2017-08-13   open
7        7      BE 2017-08-30 closed
8        8      BE 2017-08-05   open
9        9      NL 2017-08-23 closed
Calculating closerate:
closerate <- length(which(df$status == 'closed')) /
(length(which(df$status == 'closed')) + length(which(df$status == 'open')))
[1] 0.6666667
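As an aside, the same overall figure can probably be written more compactly, because the mean of a logical vector is the proportion of TRUE values. The following is only a minimal sketch, assuming status never takes any value other than 'open' or 'closed'; the name overall_closerate is made up for illustration.

```r
# Rebuild the example data so the snippet runs on its own
customer <- 1:9
country  <- c('BE', 'NL', 'NL', 'NL', 'BE', 'NL', 'BE', 'BE', 'NL')
closeday <- as.Date(c('2017-08-23', '2017-08-05', '2017-08-22', '2017-08-26',
                      '2017-08-25', '2017-08-13', '2017-08-30', '2017-08-05',
                      '2017-08-23'))
df <- data.frame(customer, country, closeday)
df$status <- ifelse(df$closeday < as.Date('2017-08-20'), 'open', 'closed')

# mean() of a logical vector is the share of TRUE values
# (assumption: every row is either 'open' or 'closed')
overall_closerate <- mean(df$status == 'closed')
print(overall_closerate)  # 6 closed out of 9 rows
```

This gives the same 0.6666667 as the length(which(...)) version above.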
Obviously, this is the overall closerate. The challenge is to get the closerate for each country. I tried adding the closerate calculation to df like this:
df$closerate <- length(which(df$status == 'closed')) /
(length(which(df$status == 'closed')) + length(which(df$status == 'open')))
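But because I did not group anything, every row gets the same closerate of about 0.67. I don't think length is the right function here, because the counting could be done by grouping instead. I read a bit about using dplyr to count logical outputs by group, but I could not get that to work.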
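To make the grouping idea concrete, here is one possible sketch in base R (not dplyr). It cross-tabulates country against status to get the counts, then derives the rate from them. The names counts, result, open, closed, and the per-row column closerate are assumptions chosen for illustration, and the layout may differ from the output I am actually after.

```r
# Rebuild the example data so the snippet runs on its own
customer <- 1:9
country  <- c('BE', 'NL', 'NL', 'NL', 'BE', 'NL', 'BE', 'BE', 'NL')
closeday <- as.Date(c('2017-08-23', '2017-08-05', '2017-08-22', '2017-08-26',
                      '2017-08-25', '2017-08-13', '2017-08-30', '2017-08-05',
                      '2017-08-23'))
df <- data.frame(customer, country, closeday)
df$status <- ifelse(df$closeday < as.Date('2017-08-20'), 'open', 'closed')

# Cross-tabulate: one row per country, one column per status value
counts <- as.data.frame.matrix(table(df$country, df$status))

# Assemble a per-country summary (column names are illustrative)
result <- data.frame(country = rownames(counts),
                     open    = counts$open,
                     closed  = counts$closed)
result$closerate <- result$closed / (result$open + result$closed)
print(result)  # BE: 1 open, 3 closed, 0.75; NL: 2 open, 3 closed, 0.6

# Alternative: attach the group-level rate to every row of df.
# ave() applies mean() within each country by default.
df$closerate <- ave(as.numeric(df$status == 'closed'), df$country)
```

The table() route gives one row per country, while the ave() route keeps all nine rows and repeats each country's rate, which is closer to what my df$closerate attempt was trying to do.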
This is the desired output: