Creating a multi-dimensional NetCDF file in R

8
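I am trying to create a multi-dimensional NetCDF file using the R package ncdf. I am working with daily climate observations for a set of 1500 points, with about 18250 observations per point. The problem is that the structure of the NetCDF file (create.ncdf) occupies 4 GB, and each point increases the file size by more than 3 GB (put.var.ncdf).
This is the code I am using: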
# Make a few dimensions we can use
dimX <- dim.def.ncdf( "Long", "degrees", Longvector )
dimY <- dim.def.ncdf( "LAT", "degrees", Latvector )
dimT <- dim.def.ncdf( "Time", "days", 1:18250, unlim=FALSE )

# Make variables of various dimensionality, for illustration purposes
mv <- -9999 # missing value to use
var1d <- var.def.ncdf( "var1d", "units", dimX, mv,prec="double" )
var2d <- var.def.ncdf( "var2d", "units", list(dimX,dimY), mv,prec="double" )
var3d <- var.def.ncdf( "var3d", "units", list(dimX,dimY,dimT), mv,prec="double" )

# Create the test file
nc <- create.ncdf( "writevals.nc", list(var1d,var2d,var3d) )
# !! Creates an nc file of more than 4 GB

# Adding the complete time series for one point (the first point in the list of the dataset)
put.var.ncdf( nc, var3d,dataset[[1]], start=c(Longvector[1],Latvector[1],1),         count=c(1,1,-1))

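Longvector and Latvector are vectors taken from a matrix that holds the longitude and latitude values for each point. The dataset is in list format, with a vector of numeric values for each point.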

dataset[[1]]=c(0,0,0,9.7,0,7.5,3.6,2.9,0,0.5,....) 

Am I missing something? Or should I try other packages?


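What is the length of Longvector and Latvector? Can you provide them? Perhaps recreate them with a call to seq(), or just dump them with dput(). - mdsumner
Please edit the question to include the missing information. - mdsumner
Suggest moving the accepted answer to the ncdf4 solution, since ncdf is now deprecated - most software now uses netcdf4 conventions. - ClimateUnboxed
2 Answers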

8

There are some errors in your code, which is not reproducible as posted. By my reckoning the file size should be 219 MB (1500 * 18250 * 8 bytes).

library(ncdf)

Provide vectors for the first two dimensions, and a dataset that matches at least one slice.
Longvector = seq(-180, 180, length = 50)
Latvector = seq(-90, 90, length = 30)
dataset <- list(1:18250)

dimX <- dim.def.ncdf("Long", "degrees", Longvector)
dimY <- dim.def.ncdf("LAT", "degrees", Latvector)
dimT <- dim.def.ncdf("Time", "days", 1:18250, unlim = FALSE)

mv <- -9999 
var1d <- var.def.ncdf( "var1d", "units", dimX, mv,prec="double")
var2d <- var.def.ncdf( "var2d", "units", list(dimX,dimY), mv,prec="double")
var3d <- var.def.ncdf( "var3d", "units", list(dimX,dimY,dimT), mv,prec="double")

nc <- create.ncdf( "writevals.nc", list(var1d,var2d,var3d))

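start and count are expressed as indices along each dimension, not as axis coordinate values, so we correct start to 1 and use the count (length) of the third dimension (rather than -1).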

put.var.ncdf(nc, var3d, dataset[[1]], start = c(1, 1, 1),  count = c(1, 1, length(dataset[[1]])))

close.ncdf(nc)
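
Because start takes index positions, a point's coordinates have to be converted to grid indices before writing. The following is a minimal sketch, assuming each point lies on (or near) a grid node; the helper name nearest_index and the example point are hypothetical and not part of the original answer.

```r
# Grid axes as defined in the answer above
Longvector <- seq(-180, 180, length = 50)
Latvector  <- seq(-90, 90, length = 30)

# Hypothetical helper: index of the axis value closest to a coordinate
nearest_index <- function(axis, value) which.min(abs(axis - value))

# Hypothetical point, chosen here to coincide with grid nodes
pt_lon <- Longvector[10]
pt_lat <- Latvector[5]

i <- nearest_index(Longvector, pt_lon)  # index along Long
j <- nearest_index(Latvector, pt_lat)   # index along LAT

# start for writing this point's full time series would be c(i, j, 1)
start <- c(i, j, 1)
print(start)
```

With this start, the count argument stays c(1, 1, length of the time series), as in the put.var.ncdf call above.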

Check the file size (in MB).

file.info("writevals.nc")$size/1e6
[1] 219.0866
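
The reported size can be checked by hand. This sketch simply multiplies out the dimension lengths used above, assuming 8 bytes per value for prec="double"; the variable names are illustrative only.

```r
n_lon  <- 50      # length of Longvector
n_lat  <- 30      # length of Latvector
n_time <- 18250   # length of the Time dimension
bytes_per_value <- 8  # prec = "double"

var1d_bytes <- n_lon * bytes_per_value
var2d_bytes <- n_lon * n_lat * bytes_per_value
var3d_bytes <- n_lon * n_lat * n_time * bytes_per_value

var3d_mb <- var3d_bytes / 1e6
total_mb <- (var1d_bytes + var2d_bytes + var3d_bytes) / 1e6

print(var3d_mb)  # 219
print(total_mb)  # 219.0124
```

The small remainder up to 219.0866 is presumably the coordinate variables and header metadata. Note that the size follows from the product of the dimension lengths, not from how many points have been written: the file above is already 219 MB with a single time series in it.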

5

This is an updated version of mdsumner's answer that works with the NetCDF4 package for R (ncdf4).

# Open library
library(ncdf4)

# Get x and y vectors (dimensions)
Longvector = seq(-180, 180, length = 50)
Latvector = seq(-90, 90, length = 30)
# Define data
dataset = list(1:18250)

# Define the dimensions
dimX = ncdim_def("Long", "degrees", Longvector)
dimY = ncdim_def("Lat", "degrees", Latvector)
dimT = ncdim_def("Time", "days", 1:18250)

# Define missing value
mv = -9999

# Define the data
var1d = ncvar_def( "var1d", "units", dimX, mv, prec="double")
var2d = ncvar_def( "var2d", "units", list(dimX,dimY), mv, prec="double")
var3d = ncvar_def( "var3d", "units", list(dimX,dimY,dimT), mv, prec="double")

# Create the NetCDF file
# If you want a NetCDF4 file, explicitly add force_v4=TRUE
nc = nc_create("writevals.nc", list(var1d, var2d, var3d))

# Write data to the NetCDF file
ncvar_put(nc, var3d, dataset[[1]], start=c(1, 1, 1),
    count=c(1, 1, length(dataset[[1]])))

# Close your new file to finish writing
nc_close(nc)
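
To write more than one point, the same ncvar_put call can be repeated with a different start for each point. The sketch below uses a deliberately tiny grid, a temporary file, and two made-up points (the list pts and its fields i, j and values are hypothetical), then reads one series back as a check. It assumes the ncdf4 package is installed.

```r
library(ncdf4)

# Small hypothetical grid so the example runs quickly
lon <- seq(-180, 180, length = 5)
lat <- seq(-90, 90, length = 3)
nt  <- 10

dimX <- ncdim_def("Long", "degrees", lon)
dimY <- ncdim_def("Lat", "degrees", lat)
dimT <- ncdim_def("Time", "days", 1:nt)
var3d <- ncvar_def("var3d", "units", list(dimX, dimY, dimT), -9999,
                   prec = "double")

# Hypothetical points: grid indices (i, j) plus a time series for each
pts <- list(
  list(i = 1, j = 1, values = as.numeric(1:nt)),
  list(i = 4, j = 2, values = as.numeric(101:(100 + nt)))
)

path <- file.path(tempdir(), "points_example.nc")
nc <- nc_create(path, list(var3d))
for (p in pts) {
  # One grid cell, all time steps
  ncvar_put(nc, var3d, p$values,
            start = c(p$i, p$j, 1), count = c(1, 1, nt))
}
nc_close(nc)

# Read the second point's series back to check it landed in the right cell
nc <- nc_open(path)
readback <- as.vector(ncvar_get(nc, "var3d",
                                start = c(4, 2, 1), count = c(1, 1, nt)))
nc_close(nc)
print(readback)
```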

So in "# Define data", with dataset = list(1:18250), are we passing the list of grids? - LuluPor
Yes. Sort of. It's the index of the time steps. - MikeRSpencer
