Reshaping an atypical data format from long to wide

3

My data:

    # A tibble: 6 x 4
  X__1            X__6                                                     X__7     X__8        
  <chr>           <chr>                                                    <chr>    <chr>       
1 Emp #:          xxyy                                                    Departm~ Corporate S~
2 Reason of Resi~ I think below are areas of improvement within my team C~ NA       NA          
3 Emp #:          xyyy                                                    Departm~ Corporate S~
4 Reason of Resi~ better oppurtunity                                       NA       NA          

I would like to reshape the data into the following format:
Emp #     Reason                                                 Department
10282     I think below are areas of improvement within my team  Corporate
10308     better oppurtunity                                     Corporate

Reproducible data:

df <- structure(list(X__1 = c("Emp #:", "Reason of Resignation:", "Emp #:", 
"Reason of Resignation:", "Emp #:", "Reason of Resignation:", 
"Emp #:", "Reason of Resignation:", "Emp #:", "Reason of Resignation:"
), X__6 = c("10282", "I think below are areas of improvement within my team CS / SME or my be cross the organization on my level (L1-L2). Lack of career growth specially in my department i.e. CS HOD/RSM/TLs/KAMs are on same position from last 5 years. Many people are here on same position from last 10-12 years. lack in focus on low level staff (L1 / L2) in terms of capacity building and career growth i.e. not a single training for my team on it. No rotation plans (for capacity building) for CS i.e. not a single team member rotated since I joined. Better opportunity in terms of career and financials outside ", 
"10308", "better oppurtunity", "11230", "Moving on another organization for career persuade", 
"13370", "Get a new job outside the company.", "14694", "Health Issues"
), X__7 = c("Department:", NA, "Department:", NA, "Department:", 
NA, "Department:", NA, "Department:", NA), X__8 = c("Corporate Solutions", 
NA, "Corporate Solutions", NA, "Region Central A", NA, "Region North", 
NA, "Finance Operations", NA)), row.names = c(NA, -10L), class = c("tbl_df", 
"tbl", "data.frame"))

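In a bit more detail: the "Emp #:" label in X__1 should become the header of the first column, which takes its values from X__6, and so on for the other fields.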

2 Answers

1

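I added a new column, rid, that gives each pair of rows a shared id. I then filtered out the rows and columns needed for each field and joined the pieces back together by rid with left_join().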

library(dplyr)

# rid numbers each consecutive pair of rows: 1, 1, 2, 2, ...
df <- mutate(df, rid = rep(seq_len(nrow(df) / 2), each = 2))

left_join(
  df %>%
    filter(X__1 == "Emp #:") %>%
    select(rid, X__6) %>%
    rename("Emp #" = "X__6"),
  df %>%
    filter(X__1 == "Reason of Resignation:") %>%
    select(rid, X__6) %>%
    rename("Reason" = "X__6"),
  by = "rid") %>%
  left_join(df %>%
              filter(X__7 == "Department:") %>%
              select(rid, X__8) %>%
              rename("Department" = "X__8"),
            by = "rid") %>%
  select(-rid)

#  `Emp #` Reason                                                    Department     
#   <chr>   <chr>                                                     <chr>          
# 1 10282   I think below are areas of improvement within my team CS~ Corporate Solu~
# 2 10308   better oppurtunity                                        Corporate Solu~
# 3 11230   Moving on another organization for career persuade        Region Central~
# 4 13370   Get a new job outside the company.                        Region North   
# 5 14694   Health Issues                                             Finance Operat~
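
The pairing idea can also be tried without dplyr. The following is a minimal base-R sketch, assuming the rows strictly alternate between "Emp #:" and "Reason of Resignation:" as in the question. The data frame `toy` and its values are invented for illustration, and `merge()` stands in for `left_join()`.

```r
# Toy data in the question's layout (values invented for illustration).
toy <- data.frame(
  X__1 = c("Emp #:", "Reason of Resignation:", "Emp #:", "Reason of Resignation:"),
  X__6 = c("1", "reason a", "2", "reason b"),
  X__7 = c("Department:", NA, "Department:", NA),
  X__8 = c("Dept A", NA, "Dept B", NA),
  stringsAsFactors = FALSE
)

# Give each consecutive pair of rows a shared id: 1, 1, 2, 2, ...
toy$rid <- rep(seq_len(nrow(toy) / 2), each = 2)

# The "Emp #:" rows carry both the employee number and the department.
emp <- toy[toy$X__1 == "Emp #:", c("rid", "X__6", "X__8")]
names(emp) <- c("rid", "Emp #", "Department")

# The "Reason of Resignation:" rows carry the reason text.
reason <- toy[toy$X__1 == "Reason of Resignation:", c("rid", "X__6")]
names(reason) <- c("rid", "Reason")

# Join the two pieces on the shared id, then drop the id.
wide <- merge(emp, reason, by = "rid")[, c("Emp #", "Reason", "Department")]
print(wide)
```

Because the id is derived purely from row position, this sketch only holds if every record occupies exactly two rows in the same order.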

0

Given that your format is strictly as you show it, another (somewhat overfitted) idea could be:

d1 <- df[c(TRUE, FALSE),]
d2 <- df[c(FALSE, TRUE),]

setNames(data.frame(d1[2], d1[4], d2[2]), c(d1[1,1], d1[1,3], d2[1,1]))

which gives,

   Emp #:         Department:                                                       Reason of Resignation:
1  10282 Corporate Solutions I think below are areas of improvement within my team CS / SMEs outside JAZZ
2  10308 Corporate Solutions                                                           better oppurtunity
3  11230    Region Central A                           Moving on another organization for career persuade
4  13370        Region North                                           Get a new job outside the company.
5  14694  Finance Operations                                                                Health Issues
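
As a variation, the column names can be written out by hand instead of being read from the first row of the data. This is a minimal base-R sketch, again assuming the rows strictly alternate. The data frame `toy` and its values are invented for illustration.

```r
# Toy data in the question's layout (values invented for illustration).
toy <- data.frame(
  X__1 = c("Emp #:", "Reason of Resignation:", "Emp #:", "Reason of Resignation:"),
  X__6 = c("1", "reason a", "2", "reason b"),
  X__7 = c("Department:", NA, "Department:", NA),
  X__8 = c("Dept A", NA, "Dept B", NA),
  stringsAsFactors = FALSE
)

odd  <- toy[c(TRUE, FALSE), ]  # rows 1, 3, ...: the "Emp #:" rows
even <- toy[c(FALSE, TRUE), ]  # rows 2, 4, ...: the "Reason of Resignation:" rows

# Column names are typed in manually; check.names = FALSE keeps "Emp #" as is.
wide <- data.frame(
  "Emp #"    = odd$X__6,
  Department = odd$X__8,
  Reason     = even$X__6,
  check.names = FALSE,
  stringsAsFactors = FALSE
)
print(wide)
```

Adding another variable then means adding one more named argument that points at the relevant column of `odd` or `even`.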

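Sotos, thanks for the reply. However, in case I want to include other variables, I would still prefer an approach where I add the column names manually as text. - Rana Usman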
