How to match multiple patterns in a string?

4

How can I match multiple patterns and get the corresponding value for each one?

I have a table like this:

library(data.table)
set.seed(1)
table_1 <- data.table(names = c('bluecdsd','red321','yellowVsds523','423_black','ewrwblack'),
                      value = runif(5))

And the pattern table looks like this:

table_2 <- data.table(category = c('black','blue','red','white'),
                      size = c('small','little','large','huge'))

The result I want:

           names     value   size
1:      bluecdsd 0.2655087 little
2:        red321 0.3721239  large
3: yellowVsds523 0.5728534     NA
4:     423_black 0.9082078  small
5:     ewrwblack 0.2016819  small

I know I should use a regular expression, but I don't know how to match multiple patterns. Please help.

2 Answers

3
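We can extract the matching substring from each name with str_extract, then use match to look up its position in table_2$category and pull the corresponding size.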
library(stringr)
table_1[, size := table_2$size[match(str_extract(names, 
            paste(table_2$category, collapse="|")), table_2$category)]]
table_1
#          names     value   size
#1:      bluecdsd 0.2655087 little
#2:        red321 0.3721239  large
#3: yellowVsds523 0.5728534     NA
#4:     423_black 0.9082078  small
#5:     ewrwblack 0.2016819  small
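
The one-liner above packs several steps into a single expression. As a rough sketch of what happens at each step (the intermediate names `pattern`, `hit`, and `idx` are introduced here for illustration only and are not part of the original answer), it can be unrolled like this:

```r
library(data.table)
library(stringr)

set.seed(1)
table_1 <- data.table(names = c('bluecdsd','red321','yellowVsds523','423_black','ewrwblack'),
                      value = runif(5))
table_2 <- data.table(category = c('black','blue','red','white'),
                      size = c('small','little','large','huge'))

# Step 1: join all categories into one alternation pattern, "black|blue|red|white"
pattern <- paste(table_2$category, collapse = "|")

# Step 2: extract the first substring of each name that matches the pattern
# (NA when no category occurs in the name)
hit <- str_extract(table_1$names, pattern)

# Step 3: find the row of table_2 that each extracted category belongs to
idx <- match(hit, table_2$category)

# Step 4: use those row positions to look up the size; NA positions give NA
table_1[, size := table_2$size[idx]]
table_1
```

Note that the categories are used as regular expressions here, so this sketch assumes they contain no regex metacharacters, and that a name containing more than one category only gets the size of the first one found.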

1
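Use grep to find, for each category in table_2, the names in table_1 that contain it, and store those names in table_2. Once both tables have a names column, we can join them with on = 'names' and bring the size column from table_2 into table_1.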
  library(data.table)      
  table_2 <- table_2[, .(names = grep( unique(category), table_1[, names], value =  TRUE  ), size = size ),
                     by = category ]
  table_2 <- table_2[!is.na(names), ]

  table_1[table_2, `:=` ( size = i.size), on = c('names')]
  table_1
  #            names     value   size
  # 1:      bluecdsd 0.2655087 little
  # 2:        red321 0.3721239  large
  # 3: yellowVsds523 0.5728534     NA
  # 4:     423_black 0.9082078  small
  # 5:     ewrwblack 0.2016819  small

Data:

set.seed(1)
table_1 <- data.table(names = c('bluecdsd','red321','yellowVsds523','423_black','ewrwblack'),
                        value = runif(5))

table_2 <- data.table(category = c('black','blue','red','white'),
                        size = c('small','little','large','huge'))
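
The grep-and-join answer above overwrites table_2 with the intermediate result. As a possible variation (a sketch only, not the answerer's code; the name `lookup` is hypothetical), the name-to-size mapping can be kept in a separate table so that table_2 stays intact:

```r
library(data.table)

set.seed(1)
table_1 <- data.table(names = c('bluecdsd','red321','yellowVsds523','423_black','ewrwblack'),
                      value = runif(5))
table_2 <- data.table(category = c('black','blue','red','white'),
                      size = c('small','little','large','huge'))

# One row per (category, matching name) pair. Grouping by category and size
# means each call to grep sees a single category; fixed = TRUE treats it as
# a literal string rather than a regular expression.
lookup <- table_2[, .(names = grep(category, table_1$names, value = TRUE, fixed = TRUE)),
                  by = .(category, size)]

# Update join: copy size into table_1 for every name found in lookup.
# Names with no matching category keep NA.
table_1[lookup, size := i.size, on = "names"]
table_1
```

Because a category with no matching names contributes no rows to `lookup`, this version should not need the is.na filter. It still assumes each name matches at most one category.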
