In R, extract the numeric value before a % sign from strings in a data frame column

3
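In the data frame below, I am trying to build a new column called "avcomp" that extracts the numeric value between the space and the % sign in the "code" column, but only when duty_nature is "M" or "C". I tried the code below, but it only returns the first two digits and the percent sign. Could someone please take a look?
The avcomp column should look like this:
> [1] "30", "0.50", ""

Thank you!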
code <- c("Greater than 30% of something","Less than 0.50% of something","30%")
duty_nature<- c("M","C","A")
test <-data.frame(code,duty_nature)
  
test$avcomp <- ifelse(test$duty_nature == "M" | test$duty_nature == "C",str_sub(str_match(test$code,"\\s*(.*?)%\\s*"),-4,-1),"")
2 Answers

4
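The regex pattern [0-9]+[.]?[0-9]*(?=%) matches a number, with an optional decimal part, that is immediately followed by a percent sign. The (?=%) lookahead checks for the % without including it in the match: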
library(tidyverse)
code <- c("Greater than 30% of something", "Less than 0.50% of something", "30%")
duty_nature <- c("M", "C", "A")
test <- data.frame(code, duty_nature)

test %>%
  mutate(
    avcomp = ifelse(
      duty_nature %in% c("M", "C"),
      code %>% str_extract("[0-9]+[.]?[0-9]*(?=%)") %>% as.numeric(),
      NA
    )
  )
#>                            code duty_nature avcomp
#> 1 Greater than 30% of something           M   30.0
#> 2  Less than 0.50% of something           C    0.5
#> 3                           30%           A     NA

Created on 2022-03-21 by the reprex package (v2.0.0)
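
For readers who would rather not load the tidyverse, the same lookahead pattern can be sketched in base R. This is only an illustration, not part of the answer above: the fourth input string ("no percent here") and the variable names `pattern` and `m` are added here to show what happens when nothing matches.

```r
# Base R sketch of the same lookahead pattern; no packages required.
# The fourth string is an added example with no percent sign.
code <- c("Greater than 30% of something",
          "Less than 0.50% of something",
          "30%",
          "no percent here")

pattern <- "[0-9]+[.]?[0-9]*(?=%)"

# regexpr() locates the first match in each string (-1 means no match);
# perl = TRUE is needed because lookahead is a PCRE feature.
m <- regexpr(pattern, code, perl = TRUE)

# Start with NA everywhere, then fill in only the strings that matched,
# since regmatches() returns values for matched elements only.
avcomp <- rep(NA_character_, length(code))
avcomp[m != -1] <- regmatches(code, m)

# avcomp is now c("30", "0.50", "30", NA)
avcomp_num <- as.numeric(avcomp)
```

The duty_nature filter from the answer can then be applied on top of `avcomp` in the same way, for example with `ifelse()`.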


0
Use %in% to create a logical index, then fill in the new column only where the index is TRUE.
code <- c("Greater than 30% of something","Less than 0.50% of something","30%")
duty_nature<- c("M","C","A")
test <-data.frame(code,duty_nature)

test$avcomp <- ""
i <- test$duty_nature %in% c("M", "C")
test$avcomp[i] <- stringr::str_match(test$code, "(\\d+\\.*\\d*)%")[i, 2]
test
#>                            code duty_nature avcomp
#> 1 Greater than 30% of something           M     30
#> 2  Less than 0.50% of something           C   0.50
#> 3                           30%           A

Created on 2022-03-21 by the reprex package (v2.0.1)
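
The `[i, 2]` indexing in this answer may be easier to follow once the shape of the `str_match()` result is visible. The short sketch below is an illustration only: the "no percent here" string is an added example, and the pattern uses `\\.?` (at most one decimal point) as a slightly stricter variation on the `\\.*` in the answer, which would also accept repeated dots.

```r
library(stringr)

# The third string is an added example with no percent sign.
code <- c("Greater than 30% of something",
          "Less than 0.50% of something",
          "no percent here")

# str_match() returns a character matrix with one row per input string.
m <- str_match(code, "(\\d+\\.?\\d*)%")

# Column 1 holds the full match, including the "%".
full_match <- m[, 1]   # "30%", "0.50%", NA

# Column 2 holds the first capture group: the number without the "%".
number_only <- m[, 2]  # "30", "0.50", NA
```

Rows with no match come back as NA in every column, which is why the answer pre-fills the column with "" and overwrites only the rows selected by `i`.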


A tidyverse solution.

suppressPackageStartupMessages(library(tidyverse))

code <- c("Greater than 30% of something","Less than 0.50% of something","30%")
duty_nature<- c("M","C","A")
test <-data.frame(code,duty_nature)

test %>%
  mutate(
    avcomp = if_else(duty_nature %in% c("M", "C"), str_match(code, "(\\d+\\.*\\d*)%")[, 2], "")
  )
#>                            code duty_nature avcomp
#> 1 Greater than 30% of something           M     30
#> 2  Less than 0.50% of something           C   0.50
#> 3                           30%           A

Created on 2022-03-21 by the reprex package (v2.0.1)

