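Update: pattern matching based on the new pattern the OP showed in the comments. Here, we use str_extract_all to extract either one or more digits that follow an opening parenthesis (a regex lookaround), or any single character that is not a parenthesis ([^()]).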
library(stringr)
str_extract_all(str1, "(?<=[(])\\d+|[^()]")
[[1]]
[1] "2" "10" "1" "12"
[[2]]
[1] "2" "0" "6" "9"
[[3]]
[1] "2" "15"
[[4]]
[1] "2" "1" "3" "1"
- Testing on the OP's additional pattern
str_extract_all(str2, "(?<=[(])\\d+|[^()]")
[[1]]
[1] "2" "10" "1" "12"
[[2]]
[1] "2" "0" "6" "9"
[[3]]
[1] "2" "15"
[[4]]
[1] "2" "1" "3" "1"
[[5]]
[1] "10" "0" "2" "0" "1"
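If stringr is not available, the same pattern can probably be used in base R. This is only a minimal sketch, assuming gregexpr with perl = TRUE (needed for the lookbehind) combined with regmatches; the names pat and res are made up for illustration and are not part of the original answer.

```r
# Same test vector as str2 in the data section, repeated so the block runs on its own.
str2 <- c("2(10)1(12)", "2069", "2(15)", "2131", "(10)0201")

# Digits directly after "(", or any single character that is not a parenthesis.
pat <- "(?<=[(])\\d+|[^()]"

# gregexpr finds all match positions; regmatches pulls out the matched text.
# perl = TRUE is required because (?<=...) is PCRE syntax.
res <- regmatches(str2, gregexpr(pat, str2, perl = TRUE))
res[[5]]
# [1] "10" "0"  "2"  "0"  "1"
```

The result is a list with one character vector per input string, the same shape that str_extract_all returns.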
- Earlier solution (based on the assumption that every number greater than 9 is wrapped in parentheses)
We can split on the parentheses in base R.
unlist(strsplit(str1[1], "\\(|\\)"))
[1] "2" "10" "1" "12"
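If both cases are present, one option is to get the index of the elements that contain parentheses and handle the two groups separately.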
i1 <- grepl("\\(|\\)", str1)
lst1 <- vector('list', length(str1))
lst1[i1] <- strsplit(str1[i1], "\\(|\\)")
lst1[!i1] <- strsplit(str1[!i1], "")
unlist(lst1)
[1] "2" "10" "1" "12" "2" "0" "6" "9" "2" "15" "2" "1" "3" "1"
Another option is to use ifelse with grepl to create a single delimiter, and then use strsplit.
lst1 <- strsplit(trimws(ifelse(grepl("\\(|\\)", str1),
    gsub("\\(|\\)", ",", str1),
    gsub("(?<=.)(?=.)", ",", str1, perl = TRUE)), whitespace = ","), ",")
lst1
[[1]]
[1] "2" "10" "1" "12"
[[2]]
[1] "2" "0" "6" "9"
[[3]]
[1] "2" "15"
[[4]]
[1] "2" "1" "3" "1"
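The earlier approach relies on each element being either fully parenthesised-style or plain digits. As a rough illustration of where that seems to break down, the sketch below splits the extra element from str2 on parentheses; the names x and early are hypothetical and used only here.

```r
# The element added later: it starts with a parenthesised number
# and ends with a run of single digits.
x <- "(10)0201"

# Splitting on "(" or ")" leaves an empty first piece (the string starts
# with a delimiter) and keeps the trailing digits together.
early <- unlist(strsplit(x, "\\(|\\)"))
early
# [1] ""     "10"   "0201"
```

This is why the lookaround pattern in the update is the safer choice when parenthesised numbers and single digits are mixed within one string.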
data
str1 <- c("2(10)1(12)", "2069", "2(15)", "2131")
str2 <- c(str1, "(10)0201")
() case. Could you please check the second part of my solution? - akrun