As Ben Bolker mentioned, the approaches above can fail when several NAs appear in a row. I tried a different approach that seems to overcome this.
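paste4 <- function(x, sep = ", ") {
  # trim leading and trailing whitespace from every element
  x <- gsub("^\\s+|\\s+$", "", x)
  # drop NAs and empty strings, then join what is left
  ret <- paste(x[!is.na(x) & !(x %in% "")], collapse = sep)
  # if nothing was left, return NA instead of ""
  is.na(ret) <- ret == ""
  return(ret)
}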
The second line strips the extra whitespace that is introduced when text and numbers are concatenated.
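To see where that extra whitespace can come from, here is a small sketch. It assumes the data passes through as.matrix(), which is what apply() does to a data frame; the data frame and its column names are made up for illustration.

```r
# Hypothetical data: one character column, one numeric column
df <- data.frame(txt = c("a", "b"), num = c(1, 10), stringsAsFactors = FALSE)

# Coercing a mixed data frame to a matrix formats the numbers to a
# common width, so the shorter number gains a leading space
m <- as.matrix(df)
padded <- unname(m[, "num"])
padded
# [1] " 1" "10"

# The same regular expression used in paste4 removes the padding
trimmed <- gsub("^\\s+|\\s+$", "", padded)
trimmed
# [1] "1"  "10"
```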
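The function above can be used with the apply command to concatenate multiple columns (or rows) of a data frame, or it can be repackaged to first coerce the data into a data frame if needed.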
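As a rough illustration of the apply usage, the sketch below runs paste4 over the rows and then the columns of a made-up data frame. The data and the names df, by_row and by_col are assumptions for the example only; paste4 is repeated so the snippet runs on its own.

```r
# paste4 as defined in the answer, repeated so the sketch is self-contained
paste4 <- function(x, sep = ", ") {
  x <- gsub("^\\s+|\\s+$", "", x)
  ret <- paste(x[!is.na(x) & !(x %in% "")], collapse = sep)
  is.na(ret) <- ret == ""
  return(ret)
}

# Hypothetical data with NAs, an empty string and a numeric column
df <- data.frame(
  letter = c("a", "b", NA, NA),
  word   = c("x", NA, NA, ""),
  number = c(1, 10, NA, 4),
  stringsAsFactors = FALSE
)

# MARGIN = 1 pastes across each row; the all-NA row comes back as NA
by_row <- unname(apply(df, 1, paste4))
by_row
# [1] "a, x, 1" "b, 10"   NA        "4"

# MARGIN = 2 pastes down each column; extra arguments go on to paste4
by_col <- unname(apply(df, 2, paste4, sep = "; "))
by_col
# [1] "a; b"     "x"        "1; 10; 4"
```

Note that several missing values in a row (third and fourth rows) do not leave stray separators behind.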
EDIT
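After a few more hours of thought, I think the following code incorporates the suggestions above and allows the collapse and na.rm options to be specified.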
paste5 <- function(..., sep = " ", collapse = NULL, na.rm = FALSE) {
  # without na.rm this is plain paste()
  if (!na.rm) {
    return(paste(..., sep = sep, collapse = collapse))
  }
  # paste one vector, dropping NAs; return NA if nothing is left
  paste.na <- function(x, sep) {
    x <- gsub("^\\s+|\\s+$", "", x)
    ret <- paste(na.omit(x), collapse = sep)
    is.na(ret) <- ret == ""
    return(ret)
  }
  # line the arguments up as columns and paste across each row
  df <- data.frame(..., stringsAsFactors = FALSE)
  ret <- apply(df, 1, FUN = function(x) paste.na(x, sep))
  if (is.null(collapse)) {
    ret
  } else {
    paste.na(ret, sep = collapse)
  }
}
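As noted above, na.omit(x) can be replaced with x[!is.na(x) & !(x %in% "")] if empty strings should be dropped as well. Note that using na.rm = TRUE together with collapse returns a string without any "NA" in it, but this can be changed by replacing the last line of the code with paste(ret, collapse = collapse).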
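The two substitutions described above can be compared on small vectors. This is only a sketch on made-up input (x and ret are illustrative values, not taken from the function itself):

```r
x <- c("a", "", NA, "b")

# na.omit() drops only the NA, so the empty string leaves a doubled separator
keep_empty <- paste(na.omit(x), collapse = ", ")
keep_empty
# [1] "a, , b"

# the stricter filter drops both the NA and the empty string
drop_empty <- paste(x[!is.na(x) & !(x %in% "")], collapse = ", ")
drop_empty
# [1] "a, b"

# Collapsing per-row results: plain paste() turns an all-NA row into the
# text "NA", while omitting NAs first leaves it out
ret <- c("Jan: 1st", NA, "6th")
with_na <- paste(ret, collapse = "; ")
with_na
# [1] "Jan: 1st; NA; 6th"
without_na <- paste(na.omit(ret), collapse = "; ")
without_na
# [1] "Jan: 1st; 6th"
```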
nth <- paste0(1:12, c("st", "nd", "rd", rep("th", 9)))
mnth <- month.abb
nth[4:5] <- NA
mnth[5:6] <- NA
paste5(mnth, nth)
[1] "Jan 1st" "Feb 2nd" "Mar 3rd" "Apr NA" "NA NA" "NA 6th" "Jul 7th" "Aug 8th" "Sep 9th" "Oct 10th" "Nov 11th" "Dec 12th"
paste5(mnth, nth, sep = ": ", collapse = "; ", na.rm = T)
[1] "Jan: 1st; Feb: 2nd; Mar: 3rd; Apr; 6th; Jul: 7th; Aug: 8th; Sep: 9th; Oct: 10th; Nov: 11th; Dec: 12th"
paste3(c("a","b", "c", NA), c("A","B", NA, NA), c(1,2,NA,4), c(5,6,7,8))
[1] "a, A, 1, 5" "b, B, 2, 6" "c, , 7" "4, 8"
paste5(c("a","b", "c", NA), c("A","B", NA, NA), c(1,2,NA,4), c(5,6,7,8), sep = ", ", na.rm = T)
[1] "a, A, 1, 5" "b, B, 2, 6" "c, 7" "4, 8"
stringr::str_replace_na(c(NA, "abc", "def"), replacement="") - the 2018 way of doing it. - Ufos
With paste(1:4, stringr::str_replace_na(foo, replacement=""), sep=", ") you get "1, A" "2, B" "3, C" "4, ". - Dannid