Matching character strings that are not exactly identical in R

Consider the following data frame, named estimates_df.
....
                                Item            section
7596                5 Gal Samandoque      Cacti/Accents
7597       5 Gal Purple Prickly Pear      Cacti/Accents
7598              5 Gal Banana Yucca      Cacti/Accents
7599                5 Gal Yucca Vine              Vines
7600             5 Gal Red Three Awn            Grasses
7601 3/4" Screened To Match Existing      Decomposed Granite
...

I also have a character vector of cactus/succulent names, called cactus_names:

[1]"Prickly Pear"
[2]"Samandoque"
[3]"Banana Yucca"
...

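I don't want to change the full names in the Item column, but I do want to change the section column based on the names that appear in my cactus/succulent vector. What makes this difficult is that the names in the vector don't exactly match the names in the column. For example, I tried something like this: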

estimates_df %>%
  mutate(section = ifelse(cactus_names %in% Item, "Cacti/Succulents", section))

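Obviously, this doesn't match anything, because %in% only finds exact matches and none of the names is identical to a full Item value. I'd like the final result to look like this: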
....
                                Item            section
7596                5 Gal Samandoque      Cacti/Succulents
7597       5 Gal Purple Prickly Pear      Cacti/Succulents
7598              5 Gal Banana Yucca      Cacti/Succulents
7599                5 Gal Yucca Vine              Vines
7600             5 Gal Red Three Awn            Grasses
7601 3/4" Screened To Match Existing      Decomposed Granite
...
1 Answer


Are you looking for something like this?

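library(dplyr)
library(stringr)

cactus_names <- c("Prickly Pear", "Yucca Vine", "Banana Yucca")

# Combine the names into one regex alternation:
# "Prickly Pear|Yucca Vine|Banana Yucca"
pattern <- paste(cactus_names, collapse = "|")

# Relabel every row whose Item contains any of the names
df %>%
  mutate(section = ifelse(str_detect(Item, pattern), "Cacti/Succulents", section))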


    id                           Item            section
1 7596               5 Gal Samandoque      Cacti/Accents
2 7597      5 Gal Purple Prickly Pear   Cacti/Succulents
3 7598             5 Gal Banana Yucca   Cacti/Succulents
4 7599               5 Gal Yucca Vine   Cacti/Succulents
5 7600            5 Gal Red Three Awn            Grasses
6 7601 3/4 Screened To Match Existing Decomposed Granite
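
For anyone who wants to check the idea without dplyr or stringr, here is a minimal base R sketch of the same approach. It assumes a small data frame rebuilt by hand from the first five rows in the question, and it uses the cactus_names vector from the question (with "Samandoque") rather than the one in the answer above. The names exact_hits and is_cactus are only illustrative.

```r
# Sample rows copied from the question (rebuilt by hand for illustration)
estimates_df <- data.frame(
  Item = c("5 Gal Samandoque", "5 Gal Purple Prickly Pear",
           "5 Gal Banana Yucca", "5 Gal Yucca Vine", "5 Gal Red Three Awn"),
  section = c("Cacti/Accents", "Cacti/Accents", "Cacti/Accents",
              "Vines", "Grasses"),
  stringsAsFactors = FALSE
)
cactus_names <- c("Prickly Pear", "Samandoque", "Banana Yucca")

# %in% tests whole-string equality, so no short name equals a full Item value
exact_hits <- cactus_names %in% estimates_df$Item

# Collapsing with "|" builds a single regular expression with alternation
pattern <- paste(cactus_names, collapse = "|")

# grepl() returns one logical per row: TRUE when any name occurs inside Item
is_cactus <- grepl(pattern, estimates_df$Item)

# Overwrite section only for the matching rows; Item is left untouched
estimates_df$section[is_cactus] <- "Cacti/Succulents"
```

Because the names are pasted into a regular expression, a name containing a metacharacter such as `.` or `(` would need escaping first. Matching is also case-sensitive as written.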
