Create a new column from the rows of another column, based on a condition.


I have the following data frame:

> df
     group val               divisor denom
1  group A   5               group B    NA
2  group A  10               group B    NA
3  group A  12               group B    NA
4  group B   2               group D    NA
5  group B   5               group D    NA
6  group B   3               group D    NA
7  group C   1 need to be determined    NA
8  group C   3 need to be determined    NA
9  group C   5 need to be determined    NA
10 group D   2                 total    10
11 group D   3                 total    10
12 group D  11                 total    10

Desired output

     group val               divisor denom
1  group A   5               group B     2
2  group A  10               group B     5
3  group A  12               group B     3
4  group B   2               group D     2
5  group B   5               group D     3
6  group B   3               group D    11
7  group C   1 need to be determined    NA
8  group C   3 need to be determined    NA
9  group C   5 need to be determined    NA
10 group D   2                 total    10
11 group D   3                 total    10
12 group D  11                 total    10

I tried the following:

df_org %>%
  dplyr::mutate(denom = ifelse(
    divisor %in% "total" , 10, denom
  )) %>%
  dplyr::mutate(denom = case_when(
    divisor %in% "group B" ~ val[group == "group B"] 
  ))

I get this error:

Error in `dplyr::mutate()`:
! Problem while computing `denom = case_when(divisor %in% "group
  B" ~ val[group == "group B"])`.
Caused by error in `case_when()`:
! `divisor %in% "group B" ~ val[group == "group B"]` must be
  length 12 or one, not 3.

Data

> dput(df_org)
structure(list(group = c("group A", "group A", "group A", "group B", 
"group B", "group B", "group C", "group C", "group C", "group D", 
"group D", "group D"), val = c(5L, 10L, 12L, 2L, 5L, 3L, 1L, 
3L, 5L, 2L, 3L, 11L), divisor = c("group B", "group B", "group B", 
"group D", "group D", "group D", "need to be determined", "need to be determined", 
"need to be determined", "total", "total", "total"), denom = c(NA, 
NA, NA, NA, NA, NA, NA, NA, NA, 10L, 10L, 10L)), class = "data.frame", row.names = c(NA, 
-12L))
1 Answer

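case_when requires every right-hand side to be either length 1 or the same length as the conditions (here 12), which is why a length-3 value fails. replace can be used here instead, assuming the number of rows with divisor "group B" equals the number of rows with group "group B".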
library(dplyr)
df_org %>%
  dplyr::ungroup() %>%
  # rows whose divisor is "total" get a fixed denominator of 10
  dplyr::mutate(denom = ifelse(divisor %in% "total", 10, denom)) %>%
  # fill the remaining rows from the val entries of the referenced group
  dplyr::mutate(
    denom = replace(denom, divisor %in% "group B", val[group == "group B"]),
    denom = replace(denom, divisor %in% "group D", val[group == "group D"])
  )

-output

     group val               divisor denom
1  group A   5               group B     2
2  group A  10               group B     5
3  group A  12               group B     3
4  group B   2               group D     2
5  group B   5               group D     3
6  group B   3               group D    11
7  group C   1 need to be determined    NA
8  group C   3 need to be determined    NA
9  group C   5 need to be determined    NA
10 group D   2                 total    10
11 group D   3                 total    10
12 group D  11                 total    10
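
To make the length rule concrete, here is a minimal base R sketch on toy vectors (the values are hypothetical, not the question's data). It shows that replace fills the selected positions in order, so the replacement vector only needs as many elements as there are selected positions. The stopifnot guard is an addition of this sketch, not part of the answer above; it protects against silent recycling when the counts differ.

```r
# Toy vectors (hypothetical illustration only)
group   <- c("A", "A", "B", "B")
divisor <- c("B", "B", "total", "total")
val     <- c(5L, 10L, 2L, 3L)
denom   <- c(NA, NA, 10L, 10L)

target <- divisor %in% "B"   # positions to fill
source <- group == "B"       # positions to take values from

# Guard: replace() would recycle the values if the counts differed
stopifnot(sum(target) == sum(source))

# The 2 selected positions receive the 2 source values, in order
filled <- replace(denom, target, val[source])
print(filled)
```

In the question's data each group has three rows, so the same guard would pass for both "group B" and "group D".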

Another option is to nest the values and match the labels

library(dplyr)
library(tidyr)
df_org %>%
  dplyr::ungroup() %>%
  select(-denom) %>%
  rename(denom = val) %>%
  nest(data = c(denom)) %>%
  mutate(ind = match(divisor, group),
         data = coalesce(data[ind], data),
         ind = NULL) %>%
  unnest(data) %>%
  mutate(denom = case_when(
           divisor %in% c("need to be determined", "total") ~ df_org$denom,
           TRUE ~ denom),
         val = df_org$val,
         .before = 2)

-output

# A tibble: 12 × 4
   group     val divisor               denom
   <chr>   <int> <chr>                 <int>
 1 group A     5 group B                   2
 2 group A    10 group B                   5
 3 group A    12 group B                   3
 4 group B     2 group D                   2
 5 group B     5 group D                   3
 6 group B     3 group D                  11
 7 group C     1 need to be determined    NA
 8 group C     3 need to be determined    NA
 9 group C     5 need to be determined    NA
10 group D     2 total                    10
11 group D     3 total                    10
12 group D    11 total                    10
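
When many groups reference each other, writing one replace call per group gets tedious. The following is a sketch of a match-based lookup in base R that handles every divisor at once. It assumes that rows are paired by their position within each group (first row of group A with first row of group B, and so on), which is how the desired output above is laid out; the names pos, key, target and idx are introduced here for illustration only.

```r
# Data rebuilt from the dput() output in the question
df_org <- data.frame(
  group   = rep(c("group A", "group B", "group C", "group D"), each = 3),
  val     = c(5L, 10L, 12L, 2L, 5L, 3L, 1L, 3L, 5L, 2L, 3L, 11L),
  divisor = rep(c("group B", "group D", "need to be determined", "total"),
                each = 3),
  denom   = c(rep(NA, 9), 10L, 10L, 10L)
)

# Position of each row within its own group (1, 2, 3, ...)
pos <- ave(seq_along(df_org$group), df_org$group, FUN = seq_along)

# Key of each row, and the key of the row it should take its value from
key    <- paste(df_org$group,   pos, sep = "|")
target <- paste(df_org$divisor, pos, sep = "|")

# Row index of the matching source row; NA when the divisor is not a group
idx <- match(target, key)

# Keep an existing denom, otherwise use the looked-up val (NA if no match)
result <- df_org
result$denom <- ifelse(is.na(df_org$denom), df_org$val[idx], df_org$denom)
print(result)
```

Divisors that are not group labels, such as "total" or "need to be determined", simply find no match, so those rows keep whatever denom they already had.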

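Group B needs to take the values of group D. - user5249203
@user5249203, but your code only shows the comparison for group B. - akrun
@user5249203 I guess you need a self-join/match to handle multiple cases. - akrun
It looks like my original data carries grouping information, which may be causing the error. Can I update the question and add the grouping information? - user5249203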
@user5249203 You can use ungroup, or nest, before the mutate step. - akrun