How do I create a new column by variable?


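My data records different people (ID) on each day of the week, along with the time they spend in different areas of a hospital, or wards. The times are given as minutes:seconds, i.e. as a duration. Here is a sample of my data: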

ShiftData <- data.frame(ID = c("Nelson", "Nelson", "Nelson", "Nelson", "Nelson", 
                      "Justin", "Justin", "Justin", "Justin", "Justin", 
                      "Nelson", "Nelson", "Nelson", "Nelson", "Nelson", 
                      "Justin", "Justin", "Justin", "Justin", "Justin"), 
               Day = c("Monday", "Monday", "Monday", "Monday", "Monday", 
                       "Monday", "Monday", "Monday", "Monday", "Monday",
                      "Tuesday", "Tuesday", "Tuesday", "Tuesday", "Tuesday", 
                      "Tuesday", "Tuesday", "Tuesday", "Tuesday", "Tuesday"), 
               Ward = c("Gen", "Anaesth", "Front Desk", "PreOp", "Front Desk", 
                       "PreOp", "Front Desk", "Anaesth", "Front Desk", "Gen",
                       "Gen", "Anaesth", "PreOp", "Front Desk", "Gen", 
                       "Front Desk", "PreOp", "PostOp", "Front Desk", "Anaesth"),
               Duration = c("5:35", "4:08", "4:30", "6:33", "4:17", 
                            "15:35", "4:28", "9:37", "18:33", "4:20",
                            "9:45", "8:28", "6:37", "2:34", "4:27", 
                            "19:35", "4:20", "9:47", "11:33", "4:26"))

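First, I'd like to add a column that indicates when each ID is on a shift. A "Front Desk" entry in the Ward column means the person is changing shifts. A person may start the day at the "Front Desk", depending on how many hours they worked the day before (that calculation is not needed for the current analysis). My expected output is: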

ShiftData$Shift <- c(1,1,0,2,0,
                     1,0,2,0,3,
                     1,1,1,0,2,
                     0,1,1,0,2)

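My question is similar to this one, except that I want 0 wherever there is a "Front Desk", with each block of activity after it counted up by one.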

How can I create this column?

I know I can set the "Front Desk" rows to 0 with the following:

ShiftData$Shift <- ifelse(ShiftData$Ward=='Front Desk', 0, NA)

But I'm not sure how to fill the rest of the column with a consecutive count.


How do the numbers in rows 2, 12, 13, etc. carry over without any increment? - Ronak Shah
2 Answers

This can be solved with dplyr:
library(dplyr)

ShiftData$Shift <- (ShiftData %>%
                    group_by(ID, Day) %>%
                    mutate(tmp = ifelse(Ward == "Front Desk", 1, 0), # flag Front Desk rows
                           tmp2 = cumsum(tmp),                       # running count of Front Desk rows within the day
                           Ward1 = Ward[1],                          # first ward of the day for this person
                           shift = ifelse(Ward1 == "Front Desk", tmp2, tmp2 + 1)) # add 1 if the day did not start at the Front Desk
                    )$shift
ShiftData$Shift[ShiftData$Ward == "Front Desk"] <- 0                 # Front Desk rows themselves are 0
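
To see what the intermediate columns hold, here is a minimal base R sketch that traces the same arithmetic for a single person-day (Justin on Tuesday, copied from the sample data). The vector names are illustrative only and stand in for the grouped columns in the dplyr code above.

```r
# Wards visited by Justin on Tuesday, copied from the sample data
ward <- c("Front Desk", "PreOp", "PostOp", "Front Desk", "Anaesth")

tmp  <- ifelse(ward == "Front Desk", 1, 0)   # 1 on Front Desk rows, 0 elsewhere
tmp2 <- cumsum(tmp)                          # Front Desk rows seen so far

# This day starts at the Front Desk, so the +1 offset is not applied here
shift <- if (ward[1] == "Front Desk") tmp2 else tmp2 + 1
shift[ward == "Front Desk"] <- 0             # Front Desk rows are reset to 0

data.frame(ward, tmp, tmp2, shift)
```

Because this approach counts Front Desk rows, two consecutive Front Desk rows would advance the counter twice. The sample data contains no such case, so it does not affect the result here.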

Note that your question is very similar to this one.
Here is one way to solve it:
library(dplyr)

ShiftData %>%
  group_by(ID, Day) %>% 
  mutate(Shift = cumsum(Ward != "Front Desk" & lag(Ward) %in% c("Front Desk", NA))) %>% 
  mutate(Shift = ifelse(Ward == "Front Desk", 0, Shift))

# Source: local data frame [20 x 5]
# Groups: ID, Day [4]
# 
#        ID     Day       Ward Duration Shift
#    <fctr>  <fctr>     <fctr>   <fctr> <dbl>
# 1  Nelson  Monday        Gen     5:35     1
# 2  Nelson  Monday    Anaesth     4:08     1
# 3  Nelson  Monday Front Desk     4:30     0
# 4  Nelson  Monday      PreOp     6:33     2
# 5  Nelson  Monday Front Desk     4:17     0
# 6  Justin  Monday      PreOp    15:35     1
# 7  Justin  Monday Front Desk     4:28     0
# 8  Justin  Monday    Anaesth     9:37     2
# 9  Justin  Monday Front Desk    18:33     0
# 10 Justin  Monday        Gen     4:20     3
# 11 Nelson Tuesday        Gen     9:45     1
# 12 Nelson Tuesday    Anaesth     8:28     1
# 13 Nelson Tuesday      PreOp     6:37     1
# 14 Nelson Tuesday Front Desk     2:34     0
# 15 Nelson Tuesday        Gen     4:27     2
# 16 Justin Tuesday Front Desk    19:35     0
# 17 Justin Tuesday      PreOp     4:20     1
# 18 Justin Tuesday     PostOp     9:47     1
# 19 Justin Tuesday Front Desk    11:33     0
# 20 Justin Tuesday    Anaesth     4:26     2

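How it works: after grouping by ID and Day, we build the Shift column by adding 1 each time a non-"Front Desk" row is preceded by a "Front Desk" row, or is the first row of its group (where lag(Ward) is NA). We then replace Shift with 0 on every "Front Desk" row.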
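
The start-of-shift test can be traced by hand on a single group. The sketch below uses base R rather than dplyr, with `c(NA, head(ward, -1))` standing in for `lag(Ward)`; the vector names are illustrative and not part of the answer's code.

```r
# Wards visited by Nelson on Monday, copied from the sample data
ward <- c("Gen", "Anaesth", "Front Desk", "PreOp", "Front Desk")

prev <- c(NA, head(ward, -1))   # previous ward; NA on the first row of the group

# A shift starts on a non-Front Desk row whose previous row is either
# a Front Desk row or missing (the first row of the day)
starts <- ward != "Front Desk" & prev %in% c("Front Desk", NA)

shift <- cumsum(starts)             # number of shifts started so far
shift[ward == "Front Desk"] <- 0    # Front Desk rows are reset to 0

data.frame(ward, prev, starts, shift)
```

Since only the transition into a non-"Front Desk" row is counted, repeated "Front Desk" rows in a row would not advance the counter.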

