Moving from mutate_all to across() in dplyr 1.0

Question


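With the release of the new dplyr version, I am refactoring quite a lot of code and removing functions that are now deprecated or retired. I have a function that looks like this: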

processingAggregatedLoad <- function (df) {
  defined <- ls()
  passed <- names(as.list(match.call())[-1])
  
  if (any(!defined %in% passed)) {
    stop(paste("Missing values for the following arguments:", paste(setdiff(defined, passed), collapse=", ")))
  }
  
  df_isolated_load <- df %>% select(matches("snsr_val")) %>% mutate(global_demand = rowSums(.)) # we get isolated load
  df_isolated_load_qlty <- df %>% select(matches("qlty_good_ind")) # we get isolated quality
  df_isolated_load_qlty <- df_isolated_load_qlty %>% mutate_all(~ factor(.), colnames(df_isolated_load_qlty)) %>%
  mutate_each(funs(as.numeric(.)), colnames(df_isolated_load_qlty)) # we convert the qlty to factors and then to numeric
  df_isolated_load_qlty[df_isolated_load_qlty[]==1] <- 1  # 1 is bad
  df_isolated_load_qlty[df_isolated_load_qlty[]==2] <- 0 # 0 is good we mask to calculate the global index quality
  df_isolated_load_qlty <- df_isolated_load_qlty %>% mutate(global_quality = rowSums(.)) %>% select(global_quality)
  df <- bind_cols(df, df_isolated_load, df_isolated_load_qlty)
  return(df)
}

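Basically, the function does the following:

1. It selects all the values of a pivoted data frame and aggregates them.

2. It selects the quality indicators (character) of the pivoted data frame.

3. I convert the quality characters to factors and then to numeric, which gives 2 levels (1 or 2).

4. I replace the numeric values in each individual column with 0 or 1 depending on the level.

5. I take the row sum of the individual qualities: if all values are good the result is 0, otherwise the global quality is bad.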

The problem is that I get the following warnings:

1: `funs()` is deprecated as of dplyr 0.8.0.
Please use a list of either functions or lambdas: 

  # Simple named list: 
  list(mean = mean, median = median)

  # Auto named with `tibble::lst()`: 
  tibble::lst(mean, median)

  # Using lambdas
  list(~ mean(., trim = .2), ~ median(., na.rm = TRUE))
This warning is displayed once every 8 hours.
Call `lifecycle::last_warnings()` to see where this warning was generated. 
2: `mutate_each_()` is deprecated as of dplyr 0.7.0.
Please use `across()` instead.

I have made several attempts, for example:

 df_isolated_load_qlty %>% mutate(across(.fns = ~ as.factor(), .names = colnames(df_isolated_load_qlty)))
Error: Problem with `mutate()` input `..1`.
x All unnamed arguments must be length 1
ℹ Input `..1` is `across(.fns = ~as.factor(), .names = colnames(df_isolated_load_qlty))`.

But I am still a bit confused by the new dplyr syntax. Could someone point me in the right direction on how to do this?

- tfkLSTM

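Dear Ronak, your answer solved the problem perfectly. Thank you very much for your help, and apologies for not providing a reproducible example. Best regards, /E - tfkLSTM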

1 Answer


- Ronak Shah · Accepted Answer

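mutate_each is deprecated; mutate_all was its replacement.
mutate_all has in turn been superseded by across().
The default .cols in across() is everything(), so when .cols is not specified explicitly the behaviour matches mutate_all, which is what is needed here.
You can apply several functions within the same mutate call, so factor and as.numeric can be applied together in one step.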
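As a minimal sketch of that equivalence, the snippet below runs the superseded and the across() forms side by side. The `qlty` tibble and its "Y"/"N" values are made up for illustration and are not taken from the question's data. `.cols` is written out as `everything()` here. That is equivalent to the default described above, and, as far as I know, dplyr 1.1.0 and later prefer it to be explicit.

```r
library(dplyr)

# Hypothetical toy data: two quality columns stored as character
qlty <- tibble(
  a_qlty_good_ind = c("N", "Y", "Y"),
  b_qlty_good_ind = c("Y", "Y", "N")
)

# Superseded form: apply the lambda to every column
old <- qlty %>% mutate_all(~ as.numeric(factor(.)))

# across() form: same lambda, columns selected explicitly with everything()
new <- qlty %>% mutate(across(everything(), ~ as.numeric(factor(.x))))

# Both forms should produce the same integer-coded columns
all.equal(old, new)
```

Factor levels are sorted alphabetically by default, so in this toy data "N" is coded as 1 and "Y" as 2, which lines up with the "1 is bad, 2 is good" convention in the question.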

With all of that in mind, you can change your existing function to:

library(dplyr)

processingAggregatedLoad <- function(df) {
  # Stop if any formal argument was not supplied in the call
  defined <- ls()
  passed <- names(as.list(match.call())[-1])

  if (any(!defined %in% passed)) {
    stop(paste("Missing values for the following arguments:",
               paste(setdiff(defined, passed), collapse = ", ")))
  }

  # Isolated load: row-wise sum of the sensor value columns
  df_isolated_load <- df %>%
    select(matches("snsr_val")) %>%
    mutate(global_demand = rowSums(.))

  # Isolated quality: character -> factor -> numeric codes (1 or 2)
  df_isolated_load_qlty <- df %>% select(matches("qlty_good_ind"))
  df_isolated_load_qlty <- df_isolated_load_qlty %>%
    mutate(across(.fns = ~ as.numeric(factor(.))))

  df_isolated_load_qlty[df_isolated_load_qlty == 1] <- 1  # 1 is bad
  df_isolated_load_qlty[df_isolated_load_qlty == 2] <- 0  # 0 is good

  # Global quality index: 0 only when every column is good
  df_isolated_load_qlty <- df_isolated_load_qlty %>%
    mutate(global_quality = rowSums(.)) %>%
    select(global_quality)

  df <- bind_cols(df, df_isolated_load, df_isolated_load_qlty)
  return(df)
}
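As for why the attempt in the question failed, two things stand out. The lambda `~ as.factor()` never receives the column (it needs `.` or `.x` as its argument), and `.names` expects a single glue template such as "{.col}" rather than a vector of column names, which is where the "All unnamed arguments must be length 1" error comes from. The sketch below shows both fixes on made-up data. The `qlty` tibble and the "_num" suffix are hypothetical and only there for illustration.

```r
library(dplyr)

# Hypothetical toy data standing in for the quality columns
qlty <- tibble(
  a_qlty_good_ind = c("N", "Y", "Y"),
  b_qlty_good_ind = c("Y", "Y", "N")
)

# .names takes one glue template; "{.col}" keeps the original column names,
# so the columns are overwritten in place (this is also the default behaviour
# when a single function is supplied)
fixed <- qlty %>%
  mutate(across(everything(), ~ as.factor(.x), .names = "{.col}"))

# A different template adds new columns next to the originals instead
suffixed <- qlty %>%
  mutate(across(everything(), ~ as.numeric(factor(.x)), .names = "{.col}_num"))

names(suffixed)
```

When the columns should simply be overwritten, `.names` can be left out altogether, as in the function above.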