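Can anyone explain what kind of loop I would need to write a 4x11x14 numpy array to a file?
The array consists of four 11 x 14 arrays, so I would like to format it with sensible line breaks to make the file easier for others to read.
Edit: So I've tried the numpy.savetxt function. Strangely, it gives the following error:
TypeError: float argument required, not numpy.ndarray
I assume this is because the function doesn't support multidimensional arrays? Is there a workaround, since I would like all four arrays in one file?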
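If you want to write the array to disk so that it is easy to read back in as a numpy array, look into numpy.save. Pickling it will work fine as well, but it is less efficient for large arrays (yours isn't large, so either approach is fine). If you want it to be human-readable, look into numpy.savetxt.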
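Edit: It looks like savetxt isn't quite as good an option for arrays with more than 2 dimensions... but just to spell everything out: numpy.savetxt fails on ndarrays with more than 2 dimensions. This is probably by design, as there is no inherently defined way to indicate additional dimensions in a text file. For example, this (a 2D array) works fine:
import numpy as np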
x = np.arange(20).reshape((4,5))
np.savetxt('test.txt', x)
The same operation fails for a 3D array (with a rather uninformative error, TypeError: float argument required, not numpy.ndarray; recent NumPy versions raise a clearer ValueError instead):
import numpy as np
x = np.arange(200).reshape((4,5,10))
np.savetxt('test.txt', x)
One workaround is to break the 3D (or higher-dimensional) array into 2D slices. For example:
x = np.arange(200).reshape((4,5,10))
with open('test.txt', 'w') as outfile:
    for slice_2d in x:
        np.savetxt(outfile, slice_2d)
However, the goal is to be clearly human-readable while still being easy to read back in with numpy.loadtxt. Therefore, we can be a bit more verbose and differentiate the slices using commented-out lines. By default, numpy.loadtxt ignores any line that starts with # (or whichever character is specified by the comments kwarg). (This looks more verbose than it actually is...)
import numpy as np

# Generate some test data
data = np.arange(200).reshape((4,5,10))

# Write the array to disk
with open('test.txt', 'w') as outfile:
    # I'm writing a header here just for the sake of readability
    # Any line starting with "#" will be ignored by numpy.loadtxt
    outfile.write('# Array shape: {0}\n'.format(data.shape))

    # Iterating through an n-dimensional array produces slices along
    # the first axis. This is equivalent to data[i,:,:] in this case
    for data_slice in data:

        # The formatting string indicates that I'm writing out
        # the values in left-justified columns 7 characters in width
        # with 2 decimal places.
        np.savetxt(outfile, data_slice, fmt='%-7.2f')

        # Writing out a break to indicate different slices...
        outfile.write('# New slice\n')
This yields:
# Array shape: (4, 5, 10)
0.00    1.00    2.00    3.00    4.00    5.00    6.00    7.00    8.00    9.00
10.00   11.00   12.00   13.00   14.00   15.00   16.00   17.00   18.00   19.00
20.00   21.00   22.00   23.00   24.00   25.00   26.00   27.00   28.00   29.00
30.00   31.00   32.00   33.00   34.00   35.00   36.00   37.00   38.00   39.00
40.00   41.00   42.00   43.00   44.00   45.00   46.00   47.00   48.00   49.00
# New slice
50.00   51.00   52.00   53.00   54.00   55.00   56.00   57.00   58.00   59.00
60.00   61.00   62.00   63.00   64.00   65.00   66.00   67.00   68.00   69.00
70.00   71.00   72.00   73.00   74.00   75.00   76.00   77.00   78.00   79.00
80.00   81.00   82.00   83.00   84.00   85.00   86.00   87.00   88.00   89.00
90.00   91.00   92.00   93.00   94.00   95.00   96.00   97.00   98.00   99.00
# New slice
100.00  101.00  102.00  103.00  104.00  105.00  106.00  107.00  108.00  109.00
110.00  111.00  112.00  113.00  114.00  115.00  116.00  117.00  118.00  119.00
120.00  121.00  122.00  123.00  124.00  125.00  126.00  127.00  128.00  129.00
130.00  131.00  132.00  133.00  134.00  135.00  136.00  137.00  138.00  139.00
140.00  141.00  142.00  143.00  144.00  145.00  146.00  147.00  148.00  149.00
# New slice
150.00  151.00  152.00  153.00  154.00  155.00  156.00  157.00  158.00  159.00
160.00  161.00  162.00  163.00  164.00  165.00  166.00  167.00  168.00  169.00
170.00  171.00  172.00  173.00  174.00  175.00  176.00  177.00  178.00  179.00
180.00  181.00  182.00  183.00  184.00  185.00  186.00  187.00  188.00  189.00
190.00  191.00  192.00  193.00  194.00  195.00  196.00  197.00  198.00  199.00
# New slice
Reading it back in is very easy, as long as we know the shape of the original array. We can just do numpy.loadtxt('test.txt').reshape((4,5,10)). As an example (you can do this in one line; I'm just being verbose to clarify things):
# Read the array from disk
new_data = np.loadtxt('test.txt')

# Note that this returned a 2D array!
print(new_data.shape)

# However, going back to 3D is easy if we know the
# original shape of the array
new_data = new_data.reshape((4,5,10))

# Just to check that they're the same...
assert np.all(new_data == data)
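The header line written above already records the shape, so the reader does not have to hard-code it. A minimal sketch, assuming the file starts with the '# Array shape: (...)' comment produced by the writing code above; the helper name load_with_header is made up for illustration:

```python
import ast
import numpy as np

def load_with_header(path):
    """Load a sliced text file whose first line is '# Array shape: (...)'."""
    with open(path) as infile:
        header = infile.readline()
    # Everything after the colon is the tuple that str(data.shape) produced
    shape = ast.literal_eval(header.split(':', 1)[1].strip())
    # loadtxt skips the '#' comment lines and returns one stacked 2D array
    return np.loadtxt(path).reshape(shape)

# Write a small file in the same layout as above, then read it back
data = np.arange(200).reshape((4, 5, 10))
with open('test.txt', 'w') as outfile:
    outfile.write('# Array shape: {0}\n'.format(data.shape))
    for data_slice in data:
        np.savetxt(outfile, data_slice, fmt='%-7.2f')
        outfile.write('# New slice\n')

restored = load_with_header('test.txt')
print(restored.shape)
```

This only works for files that carry such a header; a file written by plain savetxt calls has no shape information to recover.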
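I'm not certain this meets your requirements, since I think you are interested in making the file readable by people, but if that's not a primary concern, just pickle it.
To save it:
import pickle

my_data = {'a': [1, 2.0, 3, 4+6j],
           'b': ('string', u'Unicode string'),
           'c': None}

# Pickle files must be opened in binary mode
with open('data.pkl', 'wb') as output:
    pickle.dump(my_data, output)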
To read it back:
import pprint, pickle

with open('data.pkl', 'rb') as pkl_file:
    data1 = pickle.load(pkl_file)

pprint.pprint(data1)
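The same two steps apply directly to a NumPy array, since ndarrays can be pickled. A minimal sketch, using a 4x11x14 array like the one in the question (the file name array.pkl is arbitrary):

```python
import pickle
import numpy as np

arr = np.arange(4 * 11 * 14, dtype=float).reshape((4, 11, 14))

# Pickle the whole array; shape and dtype are stored with the data
with open('array.pkl', 'wb') as f:
    pickle.dump(arr, f, protocol=pickle.HIGHEST_PROTOCOL)

with open('array.pkl', 'rb') as f:
    arr_back = pickle.load(f)

print(arr_back.shape, arr_back.dtype)
```

Only unpickle files you trust, since loading a pickle can execute arbitrary code.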
... pprint to print the dictionary. - zyy
import numpy as np
import scipy.io
# Some test data
x = np.arange(200).reshape((4,5,10))
# Specify the filename of the .mat file
matfile = 'test_mat.mat'
# Write the array to the mat file. For this to work, the array must be the value
# corresponding to a key name of your choice in a dictionary
scipy.io.savemat(matfile, mdict={'out': x}, oned_as='row')
# For the above line, I specified the kwarg oned_as since python (2.7 with
# numpy 1.6.1) throws a FutureWarning. Here, this isn't really necessary
# since oned_as is a kwarg for dealing with 1-D arrays.
# Now load in the data from the .mat that was just saved
matdata = scipy.io.loadmat(matfile)
# And just to check if the data is the same:
assert np.all(x == matdata['out'])
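If you forget the key under which the array is stored in the .mat file, you can always do:
print(matdata.keys())
And of course you can store many arrays using many more keys.
So yes, it won't be human-readable, but it takes only two lines to write and read the data, which I think is a fair trade-off.
Take a look at the docs for scipy.io.savemat and scipy.io.loadmat, and also this tutorial page: scipy.io File IO Tutorial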
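Storing several arrays under different keys could look like the following. This is a minimal sketch, assuming two small test arrays and a temporary file; the key names 'a' and 'b' are arbitrary:

```python
import os
import tempfile
import numpy as np
import scipy.io

a = np.arange(200).reshape((4, 5, 10))
b = np.linspace(0.0, 1.0, 6).reshape((2, 3))

matfile = os.path.join(tempfile.mkdtemp(), 'several_arrays.mat')

# Each dictionary key becomes a variable name inside the .mat file
scipy.io.savemat(matfile, {'a': a, 'b': b})

matdata = scipy.io.loadmat(matfile)

# loadmat also returns metadata entries such as '__header__'; skip those
names = sorted(k for k in matdata if not k.startswith('__'))
print(names)
```

The 3D shape of `a` survives the round trip, so no reshape is needed after loading.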
ndarray.tofile() should also work.
For example, if your array is called a:
a.tofile('yourfile.txt',sep=" ",format="%s")
Not sure how to get newline formatting, though.
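Edit (credit to Kevin J. Black's comment here):
Since version 1.5.0, np.savetxt() (not tofile(), which has no such parameter) takes an optional parameter newline='\n' to allow multi-line output. https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.savetxt.html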
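Note that tofile() writes the values as one flat sequence and stores no shape information, so the shape has to be supplied again when reading. A minimal sketch, assuming a small integer array and arbitrary file names; the second half shows one possible way to get line breaks, by collapsing the leading axes to 2D before calling savetxt:

```python
import numpy as np

a = np.arange(24).reshape((2, 3, 4))

# tofile in text mode: all 24 values on a single line, separated by spaces
a.tofile('flat.txt', sep=' ', format='%s')
flat = np.fromfile('flat.txt', dtype=a.dtype, sep=' ')
a_from_flat = flat.reshape(a.shape)  # the shape must be known in advance

# Alternative with line breaks: merge the leading axes so that each line
# of the file holds one row along the last axis
np.savetxt('rows.txt', a.reshape(-1, a.shape[-1]), fmt='%d')
a_from_rows = np.loadtxt('rows.txt', dtype=a.dtype).reshape(a.shape)
```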
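tofile doesn't have newline='\n'. - Nico Schlömer
File I/O can often be a bottleneck in your code. That's why it's important to know that ASCII I/O is generally much slower than binary I/O. I compared some of the suggested solutions with perfplot: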
Code to reproduce the plot:
import json
import pickle

import numpy as np
import perfplot
import scipy.io


def numpy_save(data):
    np.save("test.dat", data)


def numpy_savetxt(data):
    np.savetxt("test.txt", data)


def numpy_savetxt_fmt(data):
    np.savetxt("test.txt", data, fmt="%-7.2f")


def pickle_dump(data):
    with open("data.pkl", "wb") as f:
        pickle.dump(data, f)


def scipy_savemat(data):
    scipy.io.savemat("test.dat", mdict={"out": data})


def numpy_tofile(data):
    data.tofile("test.txt", sep=" ", format="%s")


def json_dump(data):
    with open("test.json", "w") as f:
        json.dump(data.tolist(), f)


perfplot.save(
    "out.png",
    setup=np.random.rand,
    n_range=[2 ** k for k in range(20)],
    kernels=[
        numpy_save,
        numpy_savetxt,
        numpy_savetxt_fmt,
        pickle_dump,
        scipy_savemat,
        numpy_tofile,
        json_dump,
    ],
    equality_check=None,
)
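For a quick check without perfplot, the standard library is enough. A minimal sketch, assuming a 1000x100 random array and temporary files; the helper time_writer is made up for illustration, and the absolute numbers depend on the machine and disk:

```python
import os
import tempfile
import timeit
import numpy as np

data = np.random.rand(1000, 100)
tmpdir = tempfile.mkdtemp()
npy_path = os.path.join(tmpdir, 'test.npy')
txt_path = os.path.join(tmpdir, 'test.txt')

def time_writer(writer, repeat=3):
    """Return the best wall-clock time in seconds over a few runs."""
    return min(timeit.repeat(writer, number=1, repeat=repeat))

t_binary = time_writer(lambda: np.save(npy_path, data))
t_text = time_writer(lambda: np.savetxt(txt_path, data))

print('np.save:    {0:.4f} s, {1} bytes'.format(t_binary, os.path.getsize(npy_path)))
print('np.savetxt: {0:.4f} s, {1} bytes'.format(t_text, os.path.getsize(txt_path)))
```

Besides the timing, the file sizes show the other cost of text output: every float64 takes 8 bytes in the .npy file but about 25 characters with savetxt's default format.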
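You can also store a NumPy multidimensional array in a .npy file (a binary format).
Use the save() function to store the data in a file:
import numpy as np
a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])  # shape (3, 3)
np.save('filename.npy', a)
Load it back with the load() function:
b = np.load('filename.npy')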
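Unlike savetxt, save() has no restriction on the number of dimensions, so a 4x11x14 array like the one in the question can be stored directly. A minimal sketch (the file name array3d.npy is arbitrary):

```python
import numpy as np

arr = np.random.rand(4, 11, 14)

# .npy files record dtype and shape in their header
np.save('array3d.npy', arr)
arr_back = np.load('array3d.npy')

print(arr_back.shape)
```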
There are special libraries for exactly this (plus Python wrappers for them).
netCDF4 Python interface: http://www.unidata.ucar.edu/software/netcdf/software.html#Python
Hope this helps.
import numpy as np
import ndsave  # third-party helper module, not part of NumPy

shape = (5,2,3)
X = np.arange(np.prod(shape)).reshape(shape)
# Coordinate values for the first two axes
time = [10.0, 10.1, 10.2, 10.3, 10.4]
height = [110, 115]
ndsave.savetxt('test.txt', X, fmt='%i', axisdata=[time, height],
               axisnames=['time (s)', 'height (cm)', 'sample'])
X, axisdata, axisnames = ndsave.loadtxt('test.txt', dtype=np.uint32)
test.txt:
# (5, 2, 3)
# axis 0 (time (s))
# [10. ,10.1,10.2,10.3,10.4]
# axis 1 (height (cm))
# [110,115]
# axis 2 (sample)
# [0,1,2]
0 1 2
3 4 5
6 7 8
9 10 11
12 13 14
15 16 17
18 19 20
21 22 23
24 25 26
27 28 29
# from util.npa2csv import Visualarr; Visualarr(x)
import numpy as np
import torch


def Visualarr(arr, out = 'array_out.txt'):
    dim = arr.ndim
    if isinstance(arr, np.ndarray):
        # (#Images, #Channels, #Row, #Column)
        if dim == 4:
            arr = arr.transpose(3,2,0,1)
        if dim == 3:
            arr = arr.transpose(2,0,1)

    if isinstance(arr, torch.Tensor):
        arr = arr.numpy()

    with open(out, 'w') as outfile:
        outfile.write('# Array shape: {0}\n'.format(arr.shape))

        if dim == 1 or dim == 2:
            np.savetxt(outfile, arr, fmt='%-7.3f')

        elif dim == 3:
            for i, arr2d in enumerate(arr):
                outfile.write('# {0}-th channel\n'.format(i))
                np.savetxt(outfile, arr2d, fmt='%-7.3f')

        elif dim == 4:
            for j, arr3d in enumerate(arr):
                outfile.write('\n# {0}-th Image\n'.format(j))
                for i, arr2d in enumerate(arr3d):
                    outfile.write('# {0}-th channel\n'.format(i))
                    np.savetxt(outfile, arr2d, fmt='%-7.3f')
        else:
            print("Out of dimension!")


def test_va():
    arr = np.random.rand(4,2)
    tens = torch.rand(2,5,6,3)
    Visualarr(arr)


test_va()
... numpy.loadtxt (http://docs.scipy.org/doc/numpy/reference/generated/numpy.loadtxt.html). - Dominic Rodger