My script contains the following line:
lines <- readLines("~/data")
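I would like to keep the contents of the data file in the script itself (verbatim). Is there a "read_the_following_lines" function in R? Something like the "here document" feature of the bash shell?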
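Multi-line strings are the closest you will get. They are not quite the same (you have to take care with quotes), but they work well for what you are trying to do (and with more than just read.table):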
here_lines <- 'line 1
line 2
line 3
'
readLines(textConnection(here_lines))
## [1] "line 1" "line 2" "line 3" ""
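If the trailing empty element is unwanted, one option is to split the string yourself with strsplit(), which drops the final empty piece. This is only a sketch and not part of the answer above; the names lines_via_con and lines_clean are made up for illustration. It also closes the text connection explicitly instead of leaving that to the garbage collector.

```r
here_lines <- 'line 1
line 2
line 3
'

# Same approach as above, but keep a handle so the connection can be closed.
con <- textConnection(here_lines)
lines_via_con <- readLines(con)
close(con)

# Alternative: split on newlines; strsplit() does not return a trailing "".
lines_clean <- strsplit(here_lines, "\n", fixed = TRUE)[[1]]
print(lines_clean)
```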
here_csv <- 'thing,val
one,1
two,2
'
read.table(text=here_csv, sep=",", header=TRUE, stringsAsFactors=FALSE)
## thing val
## 1 one 1
## 2 two 2
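For comma-separated text, read.csv() passes the same text argument through to read.table(), so sep and header do not need to be spelled out. A minimal sketch, assuming the here_csv string from above (df is just an illustrative name):

```r
here_csv <- 'thing,val
one,1
two,2
'

# read.csv() defaults to sep = "," and header = TRUE.
df <- read.csv(text = here_csv, stringsAsFactors = FALSE)
str(df)
```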
here_json <- '{
"a" : [ 1, 2, 3 ],
"b" : [ 4, 5, 6 ],
"c" : { "d" : { "e" : [7, 8, 9]}}
}
'
jsonlite::fromJSON(here_json)
## $a
## [1] 1 2 3
##
## $b
## [1] 4 5 6
##
## $c
## $c$d
## $c$d$e
## [1] 7 8 9
here_xml <- '<CATALOG>
<PLANT>
<COMMON>Bloodroot</COMMON>
<BOTANICAL>Sanguinaria canadensis</BOTANICAL>
<ZONE>4</ZONE>a
<LIGHT>Mostly Shady</LIGHT>
<PRICE>$2.44</PRICE>
<AVAILABILITY>031599</AVAILABILITY>
</PLANT>
<PLANT>
<COMMON>Columbine</COMMON>
<BOTANICAL>Aquilegia canadensis</BOTANICAL>
<ZONE>3</ZONE>
<LIGHT>Mostly Shady</LIGHT>
<PRICE>$9.37</PRICE>
<AVAILABILITY>030699</AVAILABILITY>
</PLANT>
</CATALOG>
'
str(xml <- XML::xmlParse(here_xml))
## Classes 'XMLInternalDocument', 'XMLAbstractDocument' <externalptr>
print(xml)
## <?xml version="1.0"?>
## <CATALOG>
## <PLANT><COMMON>Bloodroot</COMMON><BOTANICAL>Sanguinaria canadensis</BOTANICAL><ZONE>4</ZONE>a
## <LIGHT>Mostly Shady</LIGHT><PRICE>$2.44</PRICE><AVAILABILITY>031599</AVAILABILITY></PLANT>
## <PLANT>
## <COMMON>Columbine</COMMON>
## <BOTANICAL>Aquilegia canadensis</BOTANICAL>
## <ZONE>3</ZONE>
## <LIGHT>Mostly Shady</LIGHT>
## <PRICE>$9.37</PRICE>
## <AVAILABILITY>030699</AVAILABILITY>
## </PLANT>
## </CATALOG>
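Since R 4.0.0 there is also a raw string syntax, which allows heredoc-style text in a script. It is described in help(Quotes):
file_raw_string <-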
r"(#!/bin/bash
echo $@
for word in $@;
do
echo "This is the word: '${word}'."
done
exit 0
)"
writeLines(file_raw_string, "print_words.sh")
system("bash print_words.sh Word/1 w@rd2 LongWord composite-word")
file_raw_string <- r"(
x <- lapply(mtcars[,1:4], mean)
cat(
paste(
"Mean for column", names(x), "is", format(x,digits = 2),
collapse = "\n"
)
)
cat("\n")
cat(r"{ - This is a raw string where \n, "", '', /, \ are allowed.}")
)"
writeLines(file_raw_string, "print_means.R")
source("print_means.R")
#> Mean for column mpg is 20
#> Mean for column cyl is 6.2
#> Mean for column disp is 231
#> Mean for column hp is 147
#> - This is a raw string where \n, "", '', /, \ are allowed.
Created on 2021-08-01 by the reprex package (v2.0.0)
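help(Quotes) also says that dashes may be placed between the opening quote and the opening delimiter, as long as the same number appears before the closing quote. That matters when the embedded text itself contains )", which would otherwise end the raw string early. A small sketch, assuming R >= 4.0.0; script_text and script_file are hypothetical names:

```r
# Requires R >= 4.0.0 (raw string syntax).
# The dashes in r"---( ... )---" let the text contain )" without ending the string.
script_text <- r"---(msg <- r"(a "quoted" \path)"
cat(msg, "\n"))---"

script_file <- tempfile(fileext = ".R")  # scratch file, name chosen for illustration
writeLines(script_text, script_file)
source(script_file)                      # runs the embedded two-line script
```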
Pages 90 ff. of An Introduction to R mention that an R script can be written like this (quoted from there, slightly modified):
chem <- scan()
2.90 3.10 3.40 3.40 3.70 3.70 2.80 2.50 2.40 2.40 2.70 2.20
5.28 3.37 3.03 3.03 28.95 3.77 3.40 2.20 3.50 3.60 3.70 3.70

print(chem)
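Save these lines in a file called heredoc.R. If you run the script non-interactively by typing the following command in a terminal
Rscript heredoc.R
you get this output: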
Read 24 items
[1] 2.90 3.10 3.40 3.40 3.70 3.70 2.80 2.50 2.40 2.40 2.70 2.20
[13] 5.28 3.37 3.03 3.03 28.95 3.77 3.40 2.20 3.50 3.60 3.70 3.70
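You can see that the data provided in the file end up in the variable chem. The function scan(.) reads from the connection stdin() by default. In interactive mode (R called without a script), stdin() refers to user input from the console, but when an input script is being read, the following lines of the script are read *). The empty line after the data is important because it marks the end of the data.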
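If the script should not depend on how it is run, scan() also accepts a text argument, so the numbers can be passed as a string instead of being read from stdin(). This is a sketch of an alternative, not what the book describes; quiet = TRUE only suppresses the "Read 24 items" message.

```r
# Inline data passed via the text argument instead of stdin().
chem <- scan(text = "2.90 3.10 3.40 3.40 3.70 3.70 2.80 2.50 2.40 2.40 2.70 2.20
5.28 3.37 3.03 3.03 28.95 3.77 3.40 2.20 3.50 3.60 3.70 3.70", quiet = TRUE)
print(chem)
```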
This also works for tabular data:
tab <- read.table(file=stdin(), header=T)
A B C
1 1 0
2 1 0
3 2 9

summary(tab)
When using readLines(.), you have to specify the number of lines to read; the empty-line approach does not work here:
txt <- readLines(con=stdin(), n=5)
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Morbi ultricies diam
sed felis mattis, id commodo enim hendrerit. Suspendisse iaculis bibendum eros,
ut mattis eros interdum sit amet. Pellentesque condimentum eleifend blandit. Ut
commodo ligula quis varius faucibus. Aliquam accumsan tortor velit, et varius
sapien tristique ut. Sed accumsan, tellus non iaculis luctus, neque nunc
print(txt)
One way around this is to read one line at a time in a loop until an empty line (or some other end marker) is reached:
txt <- c()
repeat{
x <- readLines(con=stdin(), n=1)
if(length(x) == 0 || x == "") break # stop at end of input or at an empty line; you can use any EOF string you want here
txt = c(txt, x)
}
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Morbi ultricies diam
sed felis mattis, id commodo enim hendrerit. Suspendisse iaculis bibendum eros,
ut mattis eros interdum sit amet. Pellentesque condimentum eleifend blandit. Ut
commodo ligula quis varius faucibus. Aliquam accumsan tortor velit, et varius
sapien tristique ut. Sed accumsan, tellus non iaculis luctus, neque nunc

print(txt)
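*) If you want to read from standard input in an R script, for example because you want a reusable script that can be called with any input data (Rscript reusablescript.R < input.txt or some-data-generating-command | Rscript reusablescript.R), do not use stdin(); use file("stdin") instead.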
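A reusable script along those lines could be sketched as follows. The helper read_input() is a made-up name, not a base R function; its default connection is file("stdin"), and the demonstration passes a text connection instead so the snippet can be run without piping anything in.

```r
# Hypothetical helper: read all lines from a connection, then close it.
# By default it reads the process-level standard input via file("stdin").
read_input <- function(con = file("stdin")) {
  on.exit(close(con))
  readLines(con)
}

# In reusablescript.R you would call read_input() with no arguments and run
#   Rscript reusablescript.R < input.txt
# Here a text connection stands in for piped input.
demo_lines <- read_input(textConnection(c("first line", "second line")))
print(demo_lines)
```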
How about some more recent tidyverse syntax?
library(magrittr) # provides the %>% pipe used below
SQL <- c("
SELECT * FROM patient
LEFT OUTER JOIN projectpatient ON patient.patient_id = projectpatient.patient_id
WHERE projectpatient.project_id = 16;
") %>% stringr::str_replace_all("[\r\n]"," ")
One way to write multi-line strings without worrying about quotes (only about backticks) is:
as.character(quote(`
all of the crazy " ' ) characters, except
backtick and bare backslashes that aren't
printable, e.g. \n works but a \ and c with no space between them would fail`))
See the text argument of the read.table function. - Bhas