dplyr: across + mutate + conditional column selection

Question

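I believe the solution is a one-liner, but I have been fumbling around. Please see the short reprex at the end of this post: how do I tell dplyr to double only the columns that contain no NA values? Thanks.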

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union


df <- tibble(x=1:10, y=101:110,
             w=c(6,NA,4,NA, 5,0,NA,4,8,17 ),
             z=c(2,3,4,NA, 5,10,22,34,58,7 ),
             k=rep("A",10))


df
#> # A tibble: 10 x 5
#>        x     y     w     z k    
#>    <int> <int> <dbl> <dbl> <chr>
#>  1     1   101     6     2 A    
#>  2     2   102    NA     3 A    
#>  3     3   103     4     4 A    
#>  4     4   104    NA    NA A    
#>  5     5   105     5     5 A    
#>  6     6   106     0    10 A    
#>  7     7   107    NA    22 A    
#>  8     8   108     4    34 A    
#>  9     9   109     8    58 A    
#> 10    10   110    17     7 A


df %>% mutate(across(where(is.numeric), ~.x*2))
#> # A tibble: 10 x 5
#>        x     y     w     z k    
#>    <dbl> <dbl> <dbl> <dbl> <chr>
#>  1     2   202    12     4 A    
#>  2     4   204    NA     6 A    
#>  3     6   206     8     8 A    
#>  4     8   208    NA    NA A    
#>  5    10   210    10    10 A    
#>  6    12   212     0    20 A    
#>  7    14   214    NA    44 A    
#>  8    16   216     8    68 A    
#>  9    18   218    16   116 A    
#> 10    20   220    34    14 A


##now double the value of all the columns without NA. How to fix this...

df %>% mutate(across(where(sum(is.na(.x))==0), ~.x*2))
#> Error: Problem with `mutate()` input `..1`.
#> ✖ object '.x' not found
#> ℹ Input `..1` is `across(where(sum(is.na(.x)) == 0), ~.x * 2)`.

Created on 2020-10-27 by the reprex package (v0.3.0.9001)

- larry77

2 Answers

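Note that the goal is to select columns that are numeric and have no missing values. Remember that the input to where must be a function. In your case, simply do the following: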

df %>% mutate(across(where(~is.numeric(.x) & sum(is.na(.x))==0), ~.x*2))
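
The same predicate can also be written as an ordinary named function, which makes it easier to see that where receives a function and calls it once per column, rather than receiving an expression that has already been evaluated. The following is a minimal sketch using the df from the question; the helper name is_complete_numeric is hypothetical and not part of dplyr.

```r
library(dplyr)

df <- tibble(x = 1:10, y = 101:110,
             w = c(6, NA, 4, NA, 5, 0, NA, 4, 8, 17),
             z = c(2, 3, 4, NA, 5, 10, 22, 34, 58, 7),
             k = rep("A", 10))

# Hypothetical helper: TRUE only for numeric columns with no missing values.
is_complete_numeric <- function(col) is.numeric(col) && !anyNA(col)

# where() applies the helper to each column; across() doubles the ones that pass.
res <- df %>% mutate(across(where(is_complete_numeric), ~ .x * 2))
res
```

With this df, only x and y should be doubled, while w, z and k are left untouched.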

Here is another way:

df %>% mutate(across(where(~!anyNA(.) & is.numeric(.)), ~.*2))
# A tibble: 10 x 5
       x     y     w     z k    
   <dbl> <dbl> <dbl> <dbl> <chr>
 1     2   202     6     2 A    
 2     4   204    NA     3 A    
 3     6   206     4     4 A    
 4     8   208    NA    NA A    
 5    10   210     5     5 A    
 6    12   212     0    10 A    
 7    14   214    NA    22 A    
 8    16   216     4    34 A    
 9    18   218     8    58 A    
10    20   220    17     7 A

If you know how to use the Negate function:

df %>% mutate(across(where(~Negate(anyNA)(.) & is.numeric(.)), ~.*2))
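
For readers unfamiliar with it, base R's Negate takes a predicate function and returns a new function that gives the opposite logical answer, so Negate(anyNA) behaves like "has no NA". A small sketch, where the name has_no_na is only for illustration:

```r
# Negate() wraps a predicate and returns its logical complement as a new function.
has_no_na <- Negate(anyNA)  # hypothetical name, for illustration only

has_no_na(c(1, NA, 3))
#> [1] FALSE
has_no_na(1:3)
#> [1] TRUE
```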

- onyambu

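Both your answer and Ekoam's are perfect, so I picked one based on personal preference. Among other things, I had forgotten that I need a "~" to pass a function to where. - larry77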

- ekoam · Accepted Answer

Here is the one-liner you are looking for:

df %>% mutate(across(where(~is.numeric(.) && all(!is.na(.))), ~.x*2))

Output

# A tibble: 10 x 5
       x     y     w     z k    
   <dbl> <dbl> <dbl> <dbl> <chr>
 1     2   202     6     2 A    
 2     4   204    NA     3 A    
 3     6   206     4     4 A    
 4     8   208    NA    NA A    
 5    10   210     5     5 A    
 6    12   212     0    10 A    
 7    14   214    NA    22 A    
 8    16   216     4    34 A    
 9    18   218     8    58 A    
10    20   220    17     7 A
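
To check which columns a predicate picks before mutating anything, the same where call can be passed to select. This is a small sketch that rebuilds the df from the question; the variable name picked is only for illustration.

```r
library(dplyr)

df <- tibble(x = 1:10, y = 101:110,
             w = c(6, NA, 4, NA, 5, 0, NA, 4, 8, 17),
             z = c(2, 3, 4, NA, 5, 10, 22, 34, 58, 7),
             k = rep("A", 10))

# Same predicate as in the one-liner, used here only to list the matching columns.
picked <- df %>%
  select(where(~ is.numeric(.x) && all(!is.na(.x)))) %>%
  names()

picked
#> [1] "x" "y"
```

The predicate uses && because where expects a single TRUE or FALSE per column; for the character column k, is.numeric is FALSE, so the second condition is never evaluated.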