Using read.table in R to read incomplete data

Question

4

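I have a large table to read into R; the file is saved in .txt format. I used the read.table function, but it fails to read the file and gives the following error message: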

Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  : 
  line 28 did not have 23 elements

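It looks like row 28 (counting from the first data row, not including the header, since I specified skip=) has missing elements. I am looking for a way to correct this automatically by filtering out that row. Right now I cannot even read the file in, so I cannot work on it in R... Any suggestions are greatly appreciated :)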

- alittleboy

2 Answers

2

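This error also appears when your data contains a hash sign (#), because read.table treats everything after a # on a line as a comment by default.

If that is the case, change the comment.char option to comment.char = "".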

read.table("file.txt", comment.char = "")
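To see why a hash sign triggers the error, here is a minimal sketch. The file contents and the variable names (path, default_counts, full_counts) are made up for illustration; only count.fields, read.table, and comment.char come from base R.

```r
# Hypothetical three-line file; the second row contains a '#'.
path <- tempfile(fileext = ".txt")
writeLines(c("1 John A", "2 Paul #1", "3 Pierre C"), path)

# With the default comment.char = "#", the rest of line 2 is treated as
# a comment, so that line appears to have one field fewer.
default_counts <- count.fields(path)

# With comment handling disabled, '#1' is counted as an ordinary field.
full_counts <- count.fields(path, comment.char = "")

# Reading with comment.char = "" keeps every row complete.
data <- read.table(path, comment.char = "")
```

Comparing default_counts with full_counts is a quick way to check whether comment characters are responsible before changing anything else.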

- Enrique Ramos

- Jealie · Accepted Answer

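Here is my approach: call read.table with the option fill=TRUE, then exclude the rows that do not have all fields filled (identified with a call to count.fields).

Example: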

# 1. Generate sample data and save it in 'tempfile'
cat("1 John", "2 Paul", "7 Pierre", "9", file = "tempfile", sep = "\n")

# 2. Read the data; fill = TRUE pads short rows with blank fields
data = read.table('tempfile', fill = TRUE)

# 3. Exclude incomplete rows (fewer fields than the widest row).
#    Logical indexing also works when no row is incomplete.
c.fields = count.fields('tempfile')
data = data[c.fields == max(c.fields), ]

(Edited to get the row numbers automatically.)
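The same idea can be adapted to a file with a header and an explicit separator, as in the question. The sketch below is only an illustration: the tab-separated file and the names n_fields, expected, bad_lines, and complete are assumptions, not part of the original answer. The point it shows is that count.fields should receive the same sep and skip arguments as read.table, so that the counts line up with the rows that are read.

```r
# Hypothetical tab-separated file: one header line and one short row.
path <- tempfile(fileext = ".txt")
writeLines(c("id\tname\tcity",
             "1\tJohn\tParis",
             "2\tPaul",
             "3\tPierre\tLyon"), path)

# Count fields using the same sep and skip that read.table will get.
n_fields <- count.fields(path, sep = "\t", skip = 1)
expected <- max(n_fields)

# Positions of short lines, counted from the first non-skipped line.
bad_lines <- which(n_fields != expected)

# Read everything with fill = TRUE, then keep only the complete rows.
data <- read.table(path, sep = "\t", skip = 1, fill = TRUE)
complete <- data[n_fields == expected, ]
```

Inspecting bad_lines before dropping anything makes it easier to tell whether rows are genuinely incomplete or whether a separator, quote, or comment character is being misread.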