Split a character vector into individual characters? (opposite of paste or stringr::str_c)

23

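A very basic question in R, but the solution is not obvious.

How do you split a character vector into its individual characters, i.e. the opposite of paste(..., sep='') or stringr::str_c()?

Is there anything less clunky than this: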

sapply(1:26, function(i) { substr("ABCDEFGHIJKLMNOPQRSTUVWXYZ",i,i) } )
"A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K" "L" "M" "N" "O" "P" "Q" "R" "S" "T" "U" "V" "W" "X" "Y" "Z"

Can it be done some other way, e.g. with strsplit(), stringr::* or anything else?


My goal is to generate content for an iterator: it = iter(sapply(1:26, function(i) { substr("ABCDEFGHIJKLMNOPQRSTUVWXYZ",i,i) } )) ... nextElem(it) - smci
@Henrik Thanks a lot, but this is just one instance of a more general case. - smci
4 Answers

32
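Yes, strsplit will do it. strsplit returns a list, so you can either use unlist to flatten the result into a single character vector, or use the list index [[1]] to access the first element.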
x <- paste(LETTERS, collapse = "")

unlist(strsplit(x, split = ""))
# [1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K" "L" "M" "N" "O" "P" "Q" "R" "S"
#[20] "T" "U" "V" "W" "X" "Y" "Z"

Or (note that it is not necessary to name the split argument)

strsplit(x, "")[[1]]
# [1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K" "L" "M" "N" "O" "P" "Q" "R" "S"
#[20] "T" "U" "V" "W" "X" "Y" "Z"

You can also split on NULL or character(0) and get the same result.
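The equivalence of the three split values, and the reason strsplit returns a list in the first place, can be checked with a short base R sketch. The `words` vector is a hypothetical input added only for illustration; it is not part of the original answer.

```r
x <- paste(LETTERS, collapse = "")

# "", NULL and character(0) all split the string into single characters
by_empty <- strsplit(x, "")[[1]]
by_null  <- strsplit(x, NULL)[[1]]
by_char0 <- strsplit(x, character(0))[[1]]

identical(by_empty, by_null)   # TRUE
identical(by_empty, by_char0)  # TRUE

# With several input strings, strsplit() returns one list element per string,
# which is why [[1]] or unlist() is needed in the single-string case
words  <- c("ab", "cde")       # hypothetical example input
pieces <- strsplit(words, "")
lengths(pieces)                # 2 3
```

For a single string, [[1]] and unlist give the same vector; with several input strings, unlist would merge all characters into one vector, while indexing keeps them separate per string.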


5

str_extract_all() from stringr offers a nice way to do this:

library(stringr)
str_extract_all("ABCDEFGHIJKLMNOPQRSTUVWXYZ", boundary("character"))

[[1]]
 [1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K" "L" "M" "N" "O" "P" "Q" "R" "S" "T" "U"
[22] "V" "W" "X" "Y" "Z"

1

Starting with stringr 1.5.0, you can use str_split_1, a version of str_split designed for splitting a single string:

library(stringr)
x <- paste(LETTERS, collapse = "")
str_split_1(x, "")
# [1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K" "L" "M" "N" "O" "P" "Q" "R" "S"
#[20] "T" "U" "V" "W" "X" "Y" "Z"

-1

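For clarity, this is presented step by step; in practice, you would create a function.

Find the characters that are repeated consecutively in a string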

the_string <- "BaaaaaaH"
# split string into characters
the_runs <- strsplit(the_string, "")[[1]]
# find runs
result <- rle(the_runs)
# find values that are repeated
result$values[which(result$lengths > 1)]
#> [1] "a"
# retest with more runs
the_string <- "BaabbccH"
# split string into characters
the_runs <- strsplit(the_string, "")[[1]]
# find runs
result <- rle(the_runs)
# find values that are repeated
result$values[which(result$lengths > 1)]
#> [1] "a" "b" "c"
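
Since the answer says that in practice these steps would be wrapped in a function, here is a minimal sketch of what that might look like. The function name `repeated_chars` is hypothetical and does not come from the original answer.

```r
# Return the characters that occur in runs of length > 1 within a string.
# `repeated_chars` is a hypothetical name used for illustration only.
repeated_chars <- function(the_string) {
  # split string into characters
  the_runs <- strsplit(the_string, "")[[1]]
  # run-length encode the characters
  result <- rle(the_runs)
  # keep only the values whose run is longer than one character
  result$values[result$lengths > 1]
}

repeated_chars("BaaaaaaH")
repeated_chars("BaabbccH")
```

The two calls reproduce the step-by-step results above ("a", and then "a" "b" "c").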

1
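No, I did not ask for run-length encoding; I only said "split a character vector into its individual characters". So "BaabbccH" should give 'B', 'a', 'a', 'b', 'b', 'c', 'c', 'H'. - smci
@smci Yes, I don't know what distracted me. - Richard Careaga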
