How do I change the first row into the header in R?

27

I have the following table:

     X.5       X.6       X.7       X.8          X.9 X.10         X.11  X.12   X.13
17   Zip CuCurrent PaCurrent PoCurrent      Contact  Ext          Fax email Status
18  74136         0         1         0 918-491-6998    0 918-491-6659            1
19  30329         1         0         0 404-321-5711                              1
20  74136         1         0         0 918-523-2516    0 918-523-2522            1
21  80203         0         1         0 303-864-1919    0                         1
22  80120         1         0         0 345-098-8890  456                         1

How can I make the first row, "Zip, CuCurrent, PaCurrent, ...", the column headers?

Thanks,

Here is dput(dat):

structure(list(X.5 = structure(c(26L, 14L, 6L, 14L, 17L, 16L), .Label = c("", 
"1104", "1234 I don't know Ave.", "139.98", "300 Morgan St.", 
"30329", "312.95", "4101 S. 4th Street, Traff", "500 Highway 89 North", 
"644.04", "656.73", "72160", "72336-7000", "74136", "75501", 
"80120", "80203", "877.87", "Address1", "BZip", "General Svcs Admin (WPY)", 
"InvFileName2", "LDC_Org_Cost", "N/A", "NULL", "Zip"), class = "factor"), 
    X.6 = structure(c(7L, 2L, 3L, 3L, 2L, 3L), .Label = c("", 
    "0", "1", "301 7th St. SW", "800-688-6160", "Address2", "CuCurrent", 
    "Emergency", "LDC_Cost_Adj", "Mtelemetry", "N/A", "NULL", 
    "Suite 1402"), class = "factor"), X.7 = structure(c(8L, 3L, 
    2L, 2L, 3L, 2L), .Label = c("", "0", "1", "Address3", "Cucustomer", 
    "LDC_Misc_Fee", "NULL", "PaCurrent", "Room 7512"), class = "factor"), 
    X.8 = structure(c(14L, 2L, 2L, 2L, 2L, 2L), .Label = c("", 
    "0", "100.98", "237.02", "242.33", "335.04", "50.6", "City", 
    "Durham", "LDC_FinalVolume", "Leavenwoth", "Pacustomer", 
    "Petersburg", "PoCurrent", "Prescott", "Washington"), class = "factor"), 
    X.9 = structure(c(18L, 16L, 10L, 17L, 7L, 9L), .Label = c("", 
    "0", "1", "139.98", "20024", "27701", "303-864-1919", "312.95", 
    "345-098-8890", "404-321-5711", "644.04", "656.73", "66048", 
    "86313", "877.87", "918-491-6998", "918-523-2516", "Contact", 
    "LDC_FinalCost", "PoCustomer", "Zip"), class = "factor"), 
    X.10 = structure(c(14L, 2L, 1L, 2L, 2L, 9L), .Label = c("", 
    "0", "2.620194604", "2.710064788", "2.717239052", "2.766403162", 
    "202-708-4995", "3.09912854", "456", "804-504-7200", "913-682-2000", 
    "919-956-5541", "928-717-7472", "Ext", "InvoicesNeeded", 
    "LDC_UnitPrice", "NULL", "Phone"), class = "factor"), X.11 = structure(c(7L, 
    4L, 1L, 5L, 1L, 1L), .Label = c("", " ", "1067", "918-491-6659", 
    "918-523-2522", "Ext", "Fax", "InvoiceMonths", "LDC_UnitPrice_Original", 
    "NULL", "x2951"), class = "factor"), X.12 = structure(c(13L, 
    1L, 1L, 1L, 1L, 1L), .Label = c("", "0", "100.98", "202-401-3722", 
    "237.02", "242.33", "335.04", "50.6", "716- 344-3303", "804-504-7227", 
    "913- 758-4230", "919- 956-7152", "email", "Fax", "GSA", 
    "Supp_Vol"), class = "factor"), X.13 = structure(c(10L, 2L, 
    2L, 2L, 2L, 2L), .Label = c("", "1", "15", "202-497-6164", 
    "3", "804-504-7200", "Emergency", "MajorTypeId", "NULL", 
    "Status", "Supp_Vol_Adj"), class = "factor")), .Names = c("X.5", 
"X.6", "X.7", "X.8", "X.9", "X.10", "X.11", "X.12", "X.13"), row.names = 17:22, class = "data.frame")

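A dput() of the table would be helpful. But you can use colnames(dat) <- as.character(dat[1,]) to set the column names, and ordinary R syntax to "drop" the first row. - hrbrmstr
@RichardScriven Good point. In hindsight, this looks like a missing header=TRUE in the read.… function, but why do the row names start at 17? - hrbrmstr
That is a bit odd. I made sure it did what I thought it did on a huge data table of netflow records. Ah, but none of those columns were factors (just checked). I wish we had a dput() to work with :-) - hrbrmstr
@PerriMa, can you type dput(gas) in the console and paste the result here? That way you will get good answers. - RockScience
@RockScience - I just added it to my original post. Thanks! - PMa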
7 Answers

40

This can be done in two simple steps:

Step 1: Copy the first row to the header:

names(dat) <- dat[1,]

Step 2: Delete the first row:

dat <- dat[-1,]
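
One caveat, offered as a hedged sketch: when the columns are factors (as in the dput in the question), assigning a one-row data frame to names() can produce the integer factor codes instead of the labels. Converting each cell of the first row to character avoids that. The toy data frame `df` below is made up for illustration and is not the asker's data.

```r
# Toy data whose first row holds the intended header (illustrative only).
df <- data.frame(
  X.1 = c("Zip", "74136", "30329"),
  X.2 = c("Status", "1", "1"),
  stringsAsFactors = TRUE  # mimic the factor columns from the question
)

# Convert each cell of row 1 to its label rather than its factor code.
header <- unname(vapply(df[1, ], as.character, character(1)))

names(df) <- header  # step 1: promote the first row to the header
df <- df[-1, ]       # step 2: drop the first row
print(df)
```

The columns are still factors afterwards (and still carry the old header text as an unused level), so a type-conversion step may be wanted as well.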

1
This is clearly the simplest and best answer. Thanks. - Sam

28

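If you don't want to re-read the data into R (which seems to be the case from the comments), you can do the following. I had to add some zeros to get your data to read in completely, so please ignore them.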

dat
##       V2        V3        V4        V5           V6  V7           V8    V9    V10
## 17   Zip CuCurrent PaCurrent PoCurrent      Contact Ext          Fax email Status
## 18 74136         0         1         0 918-491-6998   0 918-491-6659     0      1
## 19 30329         1         0         0 404-321-5711   0            0     0      1
## 20 74136         1         0         0 918-523-2516   0 918-523-2522     0      1
## 21 80203         0         1         0 303-864-1919   0            0     0      1
## 22 80120         1         0         0 345-098-8890 456            0     0      1

First, use the first row as the column names. Next, remove the first row. Finally, convert the columns to their appropriate types.

names(dat) <- as.matrix(dat[1, ])
dat <- dat[-1, ]
dat[] <- lapply(dat, function(x) type.convert(as.character(x)))
dat
##     Zip CuCurrent PaCurrent PoCurrent      Contact Ext          Fax email Status
## 1 74136         0         1         0 918-491-6998   0 918-491-6659     0      1
## 2 30329         1         0         0 404-321-5711   0            0     0      1
## 3 74136         1         0         0 918-523-2516   0 918-523-2522     0      1
## 4 80203         0         1         0 303-864-1919   0            0     0      1
## 5 80120         1         0         0 345-098-8890 456            0     0      1
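
A small sketch of the type-conversion step on its own, assuming a recent R version, where calling type.convert() on a character vector without as.is may emit a warning. Passing as.is = TRUE keeps non-numeric text as character instead of turning it into factors. The data frame `dat_chr` is invented for illustration.

```r
# Toy data after the header row has been promoted (illustrative only).
dat_chr <- data.frame(
  Zip     = c("74136", "30329"),
  Contact = c("918-491-6998", "404-321-5711"),
  Status  = c("1", "1"),
  stringsAsFactors = FALSE
)

# Re-infer each column's type; as.is = TRUE leaves non-numeric text as character.
dat_chr[] <- lapply(dat_chr, function(x) type.convert(as.character(x), as.is = TRUE))

str(dat_chr)
```

Note that converting ZIP codes to integers drops any leading zeros, so it may be preferable to leave such a column as character.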

A useful tool is the clean_names function from the janitor package, which avoids the lapply statement. The code could then be written as dat <- janitor::clean_names(aum) - roarkz

18

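The cleanest approach is to use a simple function that was designed for exactly this. You need the janitor package.

janitor::row_to_names(dat, row_number = 1)

The second argument, row_number, is the row to use as the column names, so you can pass n to use the nth row. Pass it explicitly, since depending on the janitor version it may not have a default.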
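
Without janitor, the same idea can be sketched in base R. The helper `row_to_header` below is hypothetical (it is not part of janitor or base R); this sketch drops row `n` and any rows above it, and the toy data frame `raw` is invented for illustration.

```r
# Hypothetical helper: promote row `n` to the header, dropping it and the rows above.
row_to_header <- function(df, n = 1) {
  # as.character() per cell, so factor columns give labels, not codes
  header <- unname(vapply(df[n, ], as.character, character(1)))
  out <- df[-seq_len(n), , drop = FALSE]
  names(out) <- header
  rownames(out) <- NULL  # renumber rows from 1
  out
}

# Toy input: one junk row, then the real header in row 2.
raw <- data.frame(
  V1 = c("report", "Zip", "74136", "30329"),
  V2 = c("", "Status", "1", "1"),
  stringsAsFactors = FALSE
)

clean <- row_to_header(raw, n = 2)
print(clean)
```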


2
The default doesn't seem to be 1 for me. You may need row_number = 1 - dca

5
If you are getting the data from a CSV file, use the 'header' argument of read.csv.
dat=read.csv("gas.csv", header=TRUE)
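
To see what the header argument changes, here is a minimal sketch that reads the same text with and without it. The inline CSV string is invented for illustration; read.csv's text argument is used so that no file is needed.

```r
# Inline CSV standing in for a file (illustrative data only).
csv_text <- "Zip,CuCurrent,Status\n74136,0,1\n30329,1,1"

# header = TRUE: the first line becomes the column names.
with_header <- read.csv(text = csv_text, header = TRUE)

# header = FALSE: the first line is treated as data and columns are named V1, V2, ...
no_header <- read.csv(text = csv_text, header = FALSE)

print(names(with_header))
print(names(no_header))
```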

If you already have the data and do not want to, or cannot, obtain it in a clean way, you can always do

dat=structure(list(X.5 = structure(c(26L, 14L, 6L, 14L, 17L, 16L), .Label = c("", "1104", "1234 I don't know Ave.", "139.98", "300 Morgan St.", "30329", "312.95", "4101 S. 4th Street, Traff", "500 Highway 89 North", "644.04", "656.73", "72160", "72336-7000", "74136", "75501", "80120", "80203", "877.87", "Address1", "BZip", "General Svcs Admin (WPY)", "InvFileName2", "LDC_Org_Cost", "N/A", "NULL", "Zip"), class = "factor"), X.6 = structure(c(7L, 2L, 3L, 3L, 2L, 3L), .Label = c("", "0", "1", "301 7th St. SW", "800-688-6160", "Address2", "CuCurrent", "Emergency", "LDC_Cost_Adj", "Mtelemetry", "N/A", "NULL", "Suite 1402"), class = "factor"), X.7 = structure(c(8L, 3L, 2L, 2L, 3L, 2L), .Label = c("", "0", "1", "Address3", "Cucustomer", "LDC_Misc_Fee", "NULL", "PaCurrent", "Room 7512"), class = "factor"), X.8 = structure(c(14L, 2L, 2L, 2L, 2L, 2L), .Label = c("", "0", "100.98", "237.02", "242.33", "335.04", "50.6", "City", "Durham", "LDC_FinalVolume", "Leavenwoth", "Pacustomer", "Petersburg", "PoCurrent", "Prescott", "Washington"), class = "factor"), X.9 = structure(c(18L, 16L, 10L, 17L, 7L, 9L), .Label = c("", "0", "1", "139.98", "20024", "27701", "303-864-1919", "312.95", "345-098-8890", "404-321-5711", "644.04", "656.73", "66048", "86313", "877.87", "918-491-6998", "918-523-2516", "Contact", "LDC_FinalCost", "PoCustomer", "Zip"), class = "factor"), X.10 = structure(c(14L, 2L, 1L, 2L, 2L, 9L), .Label = c("", "0", "2.620194604", "2.710064788", "2.717239052", "2.766403162", "202-708-4995", "3.09912854", "456", "804-504-7200", "913-682-2000", "919-956-5541", "928-717-7472", "Ext", "InvoicesNeeded", "LDC_UnitPrice", "NULL", "Phone"), class = "factor"), X.11 = structure(c(7L, 4L, 1L, 5L, 1L, 1L), .Label = c("", " ", "1067", "918-491-6659", "918-523-2522", "Ext", "Fax", "InvoiceMonths", "LDC_UnitPrice_Original", "NULL", "x2951"), class = "factor"), X.12 = structure(c(13L, 1L, 1L, 1L, 1L, 1L), .Label = c("", "0", "100.98", "202-401-3722", "237.02", "242.33", "335.04", "50.6", 
"716- 344-3303", "804-504-7227", "913- 758-4230", "919- 956-7152", "email", "Fax", "GSA", "Supp_Vol"), class = "factor"), X.13 = structure(c(10L, 2L, 2L, 2L, 2L, 2L), .Label = c("", "1", "15", "202-497-6164", "3", "804-504-7200", "Emergency", "MajorTypeId", "NULL", "Status", "Supp_Vol_Adj"), class = "factor")), .Names = c("X.5", "X.6", "X.7", "X.8", "X.9", "X.10", "X.11", "X.12", "X.13"), row.names = 17:22, class = "data.frame")
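dat2 = dat[2:6,]
# The columns are factors, so convert each cell of row 1 to character;
# assigning dat[1,] directly would yield the factor codes instead of the labels
colnames(dat2) = unname(sapply(dat[1,], as.character))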
dat2

The data frame you provided doesn't have 22 rows... Your question is unclear. Now that you have provided the dput, I have changed my answer. - RockScience

1
Please use header=TRUE when importing the data into R.

I did use header=T when importing the csv file - PMa

1
If you can re-read the data into R from the file, you can also add the "skip" argument to read.csv to skip the first 16 lines and use line 17 as the header:
dat=read.csv("contacts.csv", skip=16, nrows=5, header=TRUE)
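
A self-contained sketch of how skip, nrows, and header interact, using an inline string instead of a file. The contents are invented, and only 2 lines are skipped here rather than 16.

```r
# Inline text standing in for a file with junk lines above the real header.
csv_text <- paste(
  "junk line 1",
  "junk line 2",
  "Zip,CuCurrent,Status",
  "74136,0,1",
  "30329,1,1",
  "trailing,row,ignored",
  sep = "\n"
)

# Skip the 2 junk lines, treat the next line as the header, then read 2 data rows.
dat_skip <- read.csv(text = csv_text, skip = 2, nrows = 2, header = TRUE)
print(dat_skip)
```

Because the stray rows are never read, the columns are typed from the real data alone.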

0

Shalini Baranwal's answer is the best one, so I upvoted it. However, future readers may get an error message when running that solution. The error I got was:

"Error in setnames(x, value) : Passed a vector of type 'list'. Needs to be type 'character'."

To fix this, I added an as.character() wrapper in the first step. The complete solution is as follows:

Step 1: Copy the first row to the header:

dat <- mtcars
names(dat) <- as.character(dat[1,])

Step 2: Delete the first row:

dat <- dat[-1,]

3
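This should have been posted as a comment on, or a suggested edit to, the answer it refers to. Useful as it is, it does not provide a self-contained answer to the original question. - Jeremy Caney
I don't have enough reputation to comment on any post. My earlier answer has been edited to be self-contained. Thanks. - davidbaseball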
