Non-alphanumeric characters in R

3

For uppercase letters, lowercase letters, and the 10 digits, I can generate a vector containing all the letters or all 10 digits, like this:

A <- LETTERS[1:26]  # R vectors are 1-indexed; an index of 0 is silently dropped
B <- letters[1:26]
C <- seq(0, 9)

I would like to know whether there is a similar facility for non-alphanumeric characters:
~!@#$%^&*_-+=`|\(){}[]:;"'<>,.?/

I tried

D <- c("~","!","@","#","$","%","^", "&","*","_","-","+","=","`","|","\","(",")","{","}","[","]",":",";",""","'","<",">",",",".","?","/")
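As written, the call above does not parse: inside an R string literal the backslash starts an escape sequence, so "\" never closes, and """ ends the string one quote too early. A minimal sketch of the same vector with only those two elements escaped (everything else is unchanged):

```r
# Same 32 characters as in the attempt above; only the backslash and the
# double quote need escaping inside double-quoted R string literals.
D <- c("~", "!", "@", "#", "$", "%", "^", "&", "*", "_", "-", "+", "=",
       "`", "|", "\\", "(", ")", "{", "}", "[", "]", ":", ";", "\"", "'",
       "<", ">", ",", ".", "?", "/")

length(D)           # number of elements
all(nchar(D) == 1)  # each element is a single character
```

Note that print() shows the two escaped elements as "\\" and "\"", while cat(D) writes the raw characters.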

Thanks


Hi @RichardScriven, sorry, I don't quite follow. - useR
1
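If you want all ASCII characters, rawToChar(as.raw(1:127), multiple=T) should work. It's not clear how you are choosing your list. There are many non-printable characters. It also depends on your particular encoding: extended code pages and encodings such as UTF-8 give you many more characters. - MrFlick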
1
What do you actually want to do? If you want to store them in a vector, several of those characters need to be escaped with a backslash (\). - A5C1D2H2I1M1N2O1R2T1
rawToChar(as.raw(c(32:47, 58:64,91,93:96,123:126)), multiple=T) is what I wanted. - useR
4 Answers

4

Here is another option. Generate all ASCII characters, then use a regular expression to filter out everything that is not punctuation.

ascii <- rawToChar(as.raw(0:127), multiple=TRUE)
ascii[grepl('[[:punct:]]', ascii)]

# [1] "!"  "\"" "#"  "$"  "%"  "&"  "'"  "("  ")"  "*"  "+"  ","  "-"  "."  "/"  ":"  ";"  "<"  "="  ">"  "?"  "@" 
# [23] "["  "\\" "]"  "^"  "_"  "`"  "{"  "|"  "}"  "~" 
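To check this result against the list in the question, the two sets can be compared directly. The sketch below is an illustration rather than part of the original answer; the names punct and wanted are made up here, and code 0 is skipped to stay clear of the NUL byte.

```r
# All 7-bit ASCII characters except NUL, one per vector element.
ascii <- rawToChar(as.raw(1:127), multiple = TRUE)
punct <- ascii[grepl("[[:punct:]]", ascii)]

# The characters listed in the question, split into single characters.
# The backslash and the double quote are escaped in the literal.
wanted <- strsplit("~!@#$%^&*_-+=`|\\(){}[]:;\"'<>,.?/", "")[[1]]

setequal(punct, wanted)                 # do both hold the same characters?
utf8ToInt(paste(punct, collapse = ""))  # the ASCII codes that were matched
```

What [[:punct:]] matches can depend on the locale and the regex engine, so it is worth running the comparison on your own setup.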

1
This is a bit long-winded, and there are probably better sites (and better ways to get the same result), but here goes.
library(XML); library(RCurl)
doc <- htmlParse(getURL("https://wci.llnl.gov/codes/basis/manual/node161.html"))
xp <- xpathSApply(doc, "//tr/td", xmlValue, trim = TRUE) 
xp[nzchar(xp) & nchar(xp) == 1]
#  [1] "!" "[" "%" "," "]" "&" "-" "|" "'" "." "=" "~" "("
# [14] "/" ")" "*" "=" "{" "?" "`" "}" "@" ":" ";" "^" " "

Also, a more complete result can be obtained from the site mentioned in another answer.
> URL <- "http://datadebrief.blogspot.com/2011/03/ascii-code-table-in-r.html"
> r <- readLines(URL, warn = FALSE)[780:874]
> s <- sapply(strsplit(r, "\\s+"), "[", 1) 
> s[!s %in% c(letters, LETTERS, 0:9)]
#  [1] ""     "!"    "\""   "#"    "$"    "%"    "&"    "'"    "("   
# [10] ")"    "*"    "+"    ","    "-"    "."    "/"    ":"    ";"   
# [19] "<"    "="    ">"    "?"    "@"    "["    "\\\\" "]"    "^"   
# [28] "_"    "`"    "{"    "|"    "}"    "~" 

You can also do it with rawToChar(as.raw(...)), as MrFlick suggested.
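A minimal sketch of the rawToChar(as.raw(...)) route, assuming only printable ASCII punctuation is wanted: those characters occupy four code ranges (33 to 47, 58 to 64, 91 to 96, 123 to 126), so no web scraping is needed. The name punct_codes is purely illustrative.

```r
# ASCII code ranges for punctuation: everything printable that is not
# a space (32), a digit (48:57), or a letter (65:90, 97:122).
punct_codes <- c(33:47, 58:64, 91:96, 123:126)

# Convert each code to a one-character string.
punct <- rawToChar(as.raw(punct_codes), multiple = TRUE)
punct
```

Add 32 to punct_codes if the space character should be included as well.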


1

1
This answer is just for fun: list the characters you want and use strsplit to build your vector.
> D <- strsplit('!"#$%&\'()*+,-./\\:;<=>?@[]^_`{|}~', '(?=.)', perl=T)[[1]]
##  [1] "!"  "\"" "#"  "$"  "%"  "&"  "'"  "("  ")"  "*"  "+"  ","  "-"  "."  "/" 
## [16] "\\" ":"  ";"  "<"  "="  ">"  "?"  "@"  "["  "]"  "^"  "_"  "`"  "{"  "|" 
## [31] "}"  "~" 
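The lookahead pattern '(?=.)' is one way to split between every character. A plainer sketch, which should give the same vector, passes an empty split pattern instead, since strsplit treats "" as "split into single characters". The variable chars is introduced here only for readability.

```r
# The same characters as above, written as a double-quoted literal,
# so the double quote and the backslash are escaped.
chars <- "!\"#$%&'()*+,-./\\:;<=>?@[]^_`{|}~"

# An empty split pattern breaks the string into single characters.
D <- strsplit(chars, "")[[1]]
D
```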

Or filter for the characters you want.
> D <- gsub('[^\\pP\\pS]', '', rawToChar(as.raw(1:127), multiple=T), perl=T)
> D[D != ""]
##  [1] "!"  "\"" "#"  "$"  "%"  "&"  "'"  "("  ")"  "*"  "+"  ","  "-"  "."  "/" 
## [16] ":"  ";"  "<"  "="  ">"  "?"  "@"  "["  "\\" "]"  "^"  "_"  "`"  "{"  "|" 
## [31] "}"  "~" 
