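I am trying to write a multidimensional structured numpy array to an HDF5 file one field at a time using h5py, but I get an error about broadcasting arrays of different shapes. As the example shows, I do need to keep creating the dataset and writing data into it as separate steps.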
Here is my minimal working example:
writehdf5.py
#!/bin/env python
import h5py
from numpy.random import randn
print 'Creating Test Data'
mass = randn(10)
altitude = randn(10)
position = randn(10, 3)
velocity = randn(10, 3)
print 'Write 1 dimensional arrays'
hdf5 = h5py.File('test1.hdf', 'w')
dataset = hdf5.create_dataset('test dataset', (10,),
                              dtype=[('mass', '<f8'),
                                     ('altitude', '<f8')])
dataset['mass'] = mass
dataset['altitude'] = altitude
hdf5.close()
print 'Write 2 dimensional arrays'
hdf5 = h5py.File('test2.hdf', 'w')
dataset = hdf5.create_dataset('test dataset', (10,),
                              dtype=[('position', '<f8', 3),
                                     ('velocity', '<f8', 3)])
print dataset['position'].shape
print position.shape
dataset['position'] = position # <-- Error Occurs Here
dataset['velocity'] = velocity
hdf5.close()
Running the script produces the following output.
>> python writehdf5.py
Creating Test Data
Write 1 dimensional arrays
Write 2 dimensional arrays
(10, 3)
(10, 3)
Traceback (most recent call last):
  File "mwe.py", line 27, in <module>
    dataset['position'] = position # <-- Error Occurs Here
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (/builddir/build/BUILD/h5py-2.5.0/h5py/_objects.c:2450)
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (/builddir/build/BUILD/h5py-2.5.0/h5py/_objects.c:2407)
  File "/usr/lib64/python2.7/site-packages/h5py/_hl/dataset.py", line 514, in __setitem__
    val = numpy.asarray(val, dtype=dtype, order='C')
  File "/usr/lib64/python2.7/site-packages/numpy/core/numeric.py", line 462, in asarray
    return array(a, dtype, copy=False, order=order)
ValueError: could not broadcast input array from shape (10,3) into shape (10,3,3)
>> h5dump test1.hdf
HDF5 "test1.hdf" {
GROUP "/" {
DATASET "test dataset" {
DATATYPE H5T_COMPOUND {
H5T_IEEE_F64LE "mass";
H5T_IEEE_F64LE "altitude";
}
DATASPACE SIMPLE { ( 10 ) / ( 10 ) }
DATA {
(0): {
0.584402,
1.50107
},
(1): {
-0.284148,
-0.521783
},
(2): {
-0.461751,
0.53352
},
(3): {
2.06525,
-0.0364377
},
(4): {
-0.835377,
1.35912
},
(5): {
-1.31011,
1.21051
},
(6): {
0.103971,
-0.669617
},
(7): {
0.244425,
-0.654791
},
(8): {
0.468478,
2.60204
},
(9): {
0.837614,
1.21362
}
}
}
}
}
>> h5dump test2.hdf
HDF5 "test2.hdf" {
GROUP "/" {
DATASET "test dataset" {
DATATYPE H5T_COMPOUND {
H5T_ARRAY { [3] H5T_IEEE_F64LE } "position";
H5T_ARRAY { [3] H5T_IEEE_F64LE } "velocity";
}
DATASPACE SIMPLE { ( 10 ) / ( 10 ) }
DATA {
(0): {
[ 0, 0, 0 ],
[ 0, 0, 0 ]
},
(1): {
[ 0, 0, 0 ],
[ 0, 0, 0 ]
},
(2): {
[ 0, 0, 0 ],
[ 0, 0, 0 ]
},
(3): {
[ 0, 0, 0 ],
[ 0, 0, 0 ]
},
(4): {
[ 0, 0, 0 ],
[ 0, 0, 0 ]
},
(5): {
[ 0, 0, 0 ],
[ 0, 0, 0 ]
},
(6): {
[ 0, 0, 0 ],
[ 0, 0, 0 ]
},
(7): {
[ 0, 0, 0 ],
[ 0, 0, 0 ]
},
(8): {
[ 0, 0, 0 ],
[ 0, 0, 0 ]
},
(9): {
[ 0, 0, 0 ],
[ 0, 0, 0 ]
}
}
}
}
}
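I can see that my dataset is initialized correctly, but I do not understand the error I get when I try to fill in one of the fields. The dataset field and the data being written to it clearly have the same shape.
Any help with what I might be missing (quite possibly something simple) would be greatly appreciated!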
h5py
I am on version 2.6.0, and it looks like you are using 2.5. But I don't know whether there is a bug that affects structured arrays. - hpaulj
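To make the shapes in the traceback easier to reason about, here is a minimal NumPy-only sketch. It does not touch h5py at all, so it is not a reproduction of the error. It only inspects the dtype of a single field and shows that the same field-by-field assignment works on an in-memory structured array. The names `record_dtype`, `field_dtype`, `target_shape` and `buf` are made up for illustration, and the code uses Python 3 `print()` rather than the Python 2 statements in the question.

```python
import numpy as np

# Same compound layout as the 'test2.hdf' dataset in the question.
record_dtype = np.dtype([('position', '<f8', (3,)),
                         ('velocity', '<f8', (3,))])

position = np.random.randn(10, 3)
velocity = np.random.randn(10, 3)

# The dtype of a single field is a sub-array dtype:
# a float64 base type that carries its own shape (3,).
field_dtype = record_dtype['position']
print(field_dtype.base, field_dtype.shape)

# The target shape in the traceback, (10, 3, 3), equals the input shape
# with the sub-array shape appended to it.
target_shape = position.shape + field_dtype.shape
print(target_shape)

# In-memory staging buffer (hypothetical name): plain NumPy field
# assignment accepts the (10, 3) arrays directly.
buf = np.zeros(10, dtype=record_dtype)
buf['position'] = position
buf['velocity'] = velocity
print(buf['position'].shape)
```

If the sub-array dtype is indeed what trips up `numpy.asarray` inside `__setitem__`, one possible workaround would be to fill a buffer like `buf` field by field and then write it to the already-created dataset in a single assignment such as `dataset[...] = buf`. That last step is an assumption and is not exercised by the sketch above.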