Why does melt from reshape2 return NA values?

4


This works fine with the reshape package, but not with reshape2.

Here is a sample of the data file:

"","station_id","year","month","day","h1","h2","h3","h4","h5","h6","h7","h8","h9","h10","h11","h12","h13","h14","h15","h16","h17","h18","h19","h20","h21","h22","h23","h24"
"1",1,2004,1,1,46,46,45,41,39,35,33,33,36,47,53,54,55,55,55,55,52,46,40,40,39,38,40,41
"2",1,2004,1,2,43,44,46,46,47,47,47,47,47,47,47,49,52,56,54,56,57,53,50,47,46,45,45,45
"3",1,2004,1,3,45,46,46,44,43,46,46,47,51,55,56,59,65,68,69,68,68,65,64,63,62,63,63,62
"4",1,2004,1,4,63,62,62,62,60,60,60,62,60,64,64,66,71,70,71,72,71,68,67,67,65,64,65,64
"5",1,2004,1,5,64,63,65,64,64,64,64,64,65,66,66,67,68,68,66,66,66,66,63,54,52,49,47,47
"6",1,2004,1,6,47,46,45,43,41,41,39,39,40,43,45,44,45,46,46,46,45,39,39,39,38,36,32,32

Assuming the file is saved as /tmp/foo.csv, then:

Using reshape:
$ R
...
Type 'q()' to quit R.

> library("reshape")
Loading required package: plyr

Attaching package: ‘reshape’

The following object(s) are masked from ‘package:plyr’:

    rename, round_any

> hlist <- NULL; for(z in 1:24) { hlist <- cbind(hlist, sprintf("h%d",z)) }
> 
> thh <- read.csv('/tmp/foo.csv')
> thm <- melt(thh,measure.vars=hlist,variable="hour")
> head(thm)
  station_id year month day hour value
1          1 2004     1   1   h1    46
2          1 2004     1   2   h1    43
3          1 2004     1   3   h1    45
4          1 2004     1   4   h1    63
5          1 2004     1   5   h1    64
6          1 2004     1   6   h1    47
> q()

Using reshape2:

$ R
...
Type 'q()' to quit R.

> library("reshape2")
> hlist <- NULL; for(z in 1:24) { hlist <- cbind(hlist, sprintf("h%d",z)) }
> 
> thh <- read.csv('/tmp/foo.csv')
> thm <- melt(thh,measure.vars=hlist,variable="hour")
> head(thm)
  station_id year month day hour value
1          1 2004     1   1   h1    NA
2          1 2004     1   2   h1    NA
3          1 2004     1   3   h1    NA
4          1 2004     1   4   h1    NA
5          1 2004     1   5   h1    NA
6          1 2004     1   6   h1    NA
> q()

As you can see, with library("reshape") the value column contains numbers, but with library("reshape2") the same data yields NA.


2
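Please try to illustrate the problem with (made-up) data that can be copied and pasted directly into an R session. - Roland
Following up on @Roland's comment: see my updated answer, which includes the kind of example he is asking for. - A5C1D2H2I1M1N2O1R2T1
1 Answer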

6

There are better ways to do what you are trying to do.

All of the following work with melt() from reshape2:

# Not using hlist
melt(th, measure.vars=5:ncol(th), variable="hour")
melt(th, id.vars=1:4, variable="hour")

# Using your hlist
hlist <- NULL; for(z in 1:24) { hlist <- cbind(hlist, sprintf("h%d",z)) }
melt(th, measure.vars=as.vector(hlist), variable="hour")

# Using an alternative hlist
hlist <- paste0("h", 1:24)
melt(th, measure.vars=hlist, variable="hour")

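It appears that melt() in "reshape" accepts a matrix as input for measure.vars, whereas melt() in "reshape2" does not (which I think is more sensible).

Update: a reproducible example of the problem

FYI, here is how you could share this problem in a complete form that other Stack Overflow users can copy and paste: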
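To see why the two ways of building hlist differ, the following base-R sketch compares them without loading either package. The names hlist_loop and hlist_vec are made up for this illustration; it only inspects the objects and does not show what melt() does with them internally.

```r
# Build the names the way the question does: cbind() onto NULL in a loop.
hlist_loop <- NULL
for (z in 1:24) {
  hlist_loop <- cbind(hlist_loop, sprintf("h%d", z))
}

# Build the same names as a plain character vector.
hlist_vec <- paste0("h", 1:24)

# The loop version is a 1 x 24 character matrix, not a vector.
loop_is_matrix <- is.matrix(hlist_loop)
loop_dim <- dim(hlist_loop)

# The paste0() version has no dim attribute.
vec_is_matrix <- is.matrix(hlist_vec)

# as.vector() drops the dim attribute, so both forms hold the same names.
same_names <- identical(as.vector(hlist_loop), hlist_vec)

print(loop_is_matrix)
print(loop_dim)
print(vec_is_matrix)
print(same_names)
```

This is why wrapping the question's hlist in as.vector(), or building it with paste0() in the first place, gives reshape2 a plain character vector for measure.vars.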
# Use set.seed when you want to use random numbers 
#   but want others to have the same data as you.
set.seed(1) 

# Make up some data that mimics your actual dataset
# Does not have to be your exact dataset
th <- cbind(
  data.frame(station = rep(LETTERS[1:3], each = 3),
             year = 2004, month = rep(1:3, times = 3)), 
  setNames(data.frame(
    matrix(sample(100, 45, replace = TRUE), nrow = 9)),
           paste0("h", 1:5)))

hlist <- NULL; for(z in 1:5) { hlist <- cbind(hlist, sprintf("h%d",z)) }
# Cleanup any unnecessary stuff that your code leaves behind in the workspace
rm(z) 

Now, demonstrate your problem. You can use detach(package:package_name) instead of quitting and restarting R.

library(reshape)
head(melt(th, measure.vars = hlist, variable = "hour"))
#   station year month hour value
# 1       A 2004     1   h1    27
# 2       A 2004     2   h1    38
# 3       A 2004     3   h1    58
# 4       B 2004     1   h1    91
# 5       B 2004     2   h1    21
# 6       B 2004     3   h1    90
detach(package:reshape)

library(reshape2)
head(melt(th, measure.vars = hlist, variable = "hour"))
#   station year month hour value
# 1       A 2004     1   h1  <NA>
# 2       A 2004     2   h1  <NA>
# 3       A 2004     3   h1  <NA>
# 4       B 2004     1   h1  <NA>
# 5       B 2004     2   h1  <NA>
# 6       B 2004     3   h1  <NA>
detach(package:reshape2)
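
If you want to confirm that detach() really removed a package from the search path before loading the next one, a small check along these lines may help. It uses the tools package only because it ships with R; the variable names are placeholders for this sketch.

```r
# Attach a package that ships with R, purely for illustration.
library(tools)
attached_before <- "package:tools" %in% search()

# Detach it by name; character.only = TRUE lets us pass a string.
detach("package:tools", character.only = TRUE)
attached_after <- "package:tools" %in% search()

print(attached_before)
print(attached_after)
```

The same check with "package:reshape" or "package:reshape2" can rule out a leftover melt() from the other package masking the one you mean to test.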

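Hope this helps!

Ah, I followed your data-generation approach step by step. Very helpful, thanks! I would like to give you extra points, but each answer can only get one! - Hugh Perkins
I ran into the same problem; after adding measure.vars = c(...), it now works. - ah bon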
