Transpose an R matrix including NA values

4

I have data similar to the following:

trackingnumer = c(1,1,2,2,3) 
date = c("2017-08-01", "2017-08-10", "2017-08-02", "2017-08-05", "2017-08-12") 
scan = c("Pickup", "Delivered", "Pickup", "Delivered", "Delivered") 
df = data.frame(trackingnumer, date, scan) 

I want to transpose this data by tracking number:

library(dplyr)
library(data.table)

df2 <- df %>%
  group_by(trackingnumer) %>%
  mutate(n = row_number()) %>%
  {data.table::dcast(data = setDT(.), trackingnumer ~ n, value.var = c('date', 'scan'))}

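I have tried this, but I am not getting the desired result. I want date_1 to be the pickup date and date_2 to be the delivery date. As you can see, tracking number 3 has no pickup record, so I want its date_1 to be NA.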


3 Answers

3

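Try this in base R, using relevel to put the levels of the scan column in the right order. On R 4.0.0 and later, data.frame() keeps strings as character, so scan is wrapped in factor() first:

reshape(
  cbind(df, time=as.numeric(relevel(factor(df$scan), "Pickup"))),
  idvar="trackingnumer", direction="wide", sep="_"
)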

#  trackingnumer     date_1 scan_1     date_2    scan_2
#1             1 2017-08-01 Pickup 2017-08-10 Delivered
#3             2 2017-08-02 Pickup 2017-08-05 Delivered
#5             3       <NA>   <NA> 2017-08-12 Delivered
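
To see why the relevel step matters, here is a minimal sketch of the integer codes that end up in the time column. It assumes scan is turned into a factor with factor(), whose default levels are sorted alphabetically, so Delivered would otherwise get code 1. The variable names default_codes and releveled_codes are only for illustration.

```r
scan <- c("Pickup", "Delivered", "Pickup", "Delivered", "Delivered")

# Default factor levels are sorted alphabetically: Delivered, Pickup
default_codes <- as.numeric(factor(scan))

# relevel() moves "Pickup" to the front, so Pickup = 1 and Delivered = 2
releveled_codes <- as.numeric(relevel(factor(scan), "Pickup"))

print(default_codes)    # [1] 2 1 2 1 1
print(releveled_codes)  # [1] 1 2 1 2 2
```

With the releveled codes, every Pickup row lands in the _1 columns and every Delivered row in the _2 columns, which is what leaves date_1 as NA for tracking number 3.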

2
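The problem is that your mutate call only counts rows and pays no attention to what they contain. case_when() lets you assign a specific value of "n" based on the value of "scan".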
df2 <- df %>% 
  group_by(trackingnumer) %>% 
  mutate(n = case_when(scan == "Pickup" ~ 1,
                       scan == "Delivered" ~ 2)) %>% 
  {data.table::dcast(data = setDT(.), trackingnumer ~ n, value.var = c('date', 'scan'))}
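
The difference between the two ways of computing n can be sketched in base R without any packages. This is only an illustration: ave() stands in for row_number() inside group_by(), and match() stands in for the case_when() mapping; neither call is part of the answer above.

```r
trackingnumer <- c(1, 1, 2, 2, 3)
scan <- c("Pickup", "Delivered", "Pickup", "Delivered", "Delivered")

# Per-group row counter, a base R stand-in for row_number() within group_by()
n_by_row <- ave(seq_along(scan), trackingnumer, FUN = seq_along)

# Content-based slot, a stand-in for the case_when() mapping:
# Pickup is always 1, Delivered is always 2
n_by_scan <- match(scan, c("Pickup", "Delivered"))

print(n_by_row)   # [1] 1 2 1 2 1
print(n_by_scan)  # [1] 1 2 1 2 2
```

Only the last element differs: the row counter puts the lone Delivered scan of tracking number 3 into slot 1, while the content-based mapping keeps it in slot 2 and leaves slot 1 empty.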

1
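Or using tidyr:

library(dplyr)   # group_by, mutate_at, rename_at
library(purrr)   # map_lgl
library(tibble)  # is_tibble, lst
library(tidyr)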
df %>% group_by(trackingnumer,scan2 = scan) %>%
  nest(date,scan) %>%
  spread(scan2,data) %>%
  mutate_at(c("Delivered","Pickup"),~ifelse(map_lgl(.x,is_tibble),.x,lst(tibble(date=NA,scan=NA)))) %>%
  unnest %>%
  rename_at(c("date","scan"),paste0,2)

# # A tibble: 3 x 5
#     trackingnumer      date2     scan2      date1  scan1
#             <dbl>     <fctr>    <fctr>     <fctr> <fctr>
#   1             1 2017-08-10 Delivered 2017-08-01 Pickup
#   2             2 2017-08-05 Delivered 2017-08-02 Pickup
#   3             3 2017-08-12 Delivered       <NA>   <NA>

Nice! But this should be its own answer. - moodymudskipper
