What is the equivalent of Matlab's 'fread' in Python?

Question

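I have practically no knowledge of Matlab and need to translate some parsing routines into Python. They are for large files that are themselves divided into 'blocks', and I'm having difficulty right from the start with the checksum at the top of the file.

What exactly is going on here in Matlab?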

status = fseek(fid, 0, 'cof');
fposition = ftell(fid);
disp(' ');
disp(['** Block ',num2str(iBlock),' File Position = ',int2str(fposition)]);

% ----------------- Block Start ------------------ %
[A, count] = fread(fid, 3, 'uint32');
if(count == 3)
    magic_l = A(1);
    magic_h = A(2);
    block_length = A(3);
else
    if(fposition == file_length)
        disp(['** End of file OK']);
    else
        disp(['** Cannot read block start magic !  Note File Length = ',num2str(file_length)]);
    end
    ok = 0;
    break;
end

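fid is the file currently being looked at. iBlock is a counter for which 'block' you are in within the file.

magic_l and magic_h are to do with checksums later on. Here is the code for that (it follows straight on from the code above):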

disp(sprintf('  Magic_L = %08X, Magic_H = %08X, Length = %i', magic_l, magic_h, block_length));
correct_magic_l = hex2dec('4D445254');
correct_magic_h = hex2dec('43494741');

if(magic_l ~= correct_magic_l | magic_h ~= correct_magic_h)
    disp(['** Bad block start magic !']);
    ok = 0;
    return;
end

remaining_length = block_length - 3*4 - 3*4;   % We read Block Header, and we expect a footer
disp(sprintf('  Remaining Block bytes = %i', remaining_length));

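What's the deal with the %08X and the hex2dec stuff?
Also, why specify 3*4 instead of 12?

Really though, I want to know how to replicate [A, count] = fread(fid, 3, 'uint32'); in Python, as io.readline() just pulls the first 3 characters of the file. Apologies if I'm missing the point somewhere here. It's just that using io.readline(3) on the file seems to return something it shouldn't, and I don't understand how block_length can fit in a single byte when it could potentially be very long.

Thanks for reading this ramble. I hope you can understand roughly what I want to know! (Any insight at all would be appreciated.)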

- Duncan Tait

You might want to consider splitting the question and moving the second part into another question; the title is a bit misleading. - Torsten Marek

4 Answers

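According to the documentation of fread, it is a function for reading binary data. The second argument specifies the size of the output vector, and the third one the size/type of the items read.

To recreate this in Python, you can use the array module: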

import array
f = open(...)  # open the file in binary mode ("rb")
a = array.array("I")  # "I" is usually a 4-byte unsigned int; "L" can be 8 bytes, so check a.itemsize
a.fromfile(f, 3)

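This will read three uint32 values from the file f, which are available in a afterwards. From the documentation of fromfile:

Read n items (as machine values) from the file object f and append them to the end of the array. If less than n items are available, EOFError is raised, but the items that were available are still inserted into the array. f must be a real built-in file object; something else with a read() method won't do.

Arrays implement the sequence protocol and therefore support the same operations as lists, but you can also use the .tolist() method to create a normal list from the array.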
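
As a rough sketch of how [A, count] = fread(fid, 3, 'uint32') might be mirrored with the array module: the helper name read_uint32 is hypothetical, and the code assumes the file stores its values in the machine's native byte order. It picks the typecode by item size rather than hard-coding one, because the width of "I" and "L" is platform dependent.

```python
import array
import struct
import tempfile

def read_uint32(f, n):
    """Hypothetical helper: read up to n native-endian uint32 values from f.

    Returns (values, count), loosely like [A, count] = fread(fid, n, 'uint32').
    """
    # Choose an unsigned typecode that is exactly 4 bytes wide on this platform.
    typecode = next(tc for tc in ("I", "L") if array.array(tc).itemsize == 4)
    a = array.array(typecode)
    try:
        a.fromfile(f, n)
    except EOFError:
        pass  # fewer than n items were available; the ones read are kept in a
    return a.tolist(), len(a)

# Demo: the file holds only two values, but three are requested.
with tempfile.TemporaryFile() as f:
    f.write(struct.pack("=2I", 7, 8))
    f.seek(0)
    values, count = read_uint32(f, 3)
```

Comparing count against the requested number plays the same role as the if(count == 3) test in the Matlab code.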

- Torsten Marek

Somehow I get different results when using a = array.array('i'), a.fromfile(fid, count) and numpy.fromfile(fid, numpy.int16): 33947761, -157220022 versus 113, 518. This is with the TEXBAT cleanStatic.bin file as fid (http://radionavlab.ae.utexas.edu/datastore/texbat/). Is there anything I can change to get the same result? - Cloud Cho
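
One plausible explanation for the differing numbers, offered as a guess from the values alone rather than from inspecting the file: the typecode 'i' is usually 4 bytes wide, whereas numpy.int16 is 2 bytes, so each 'i' value would span two consecutive int16 samples. The sketch below assumes little-endian data, packs 113 and 518 as int16, and reinterprets the same four bytes as one 32-bit integer.

```python
import struct

# Two consecutive 16-bit samples, as numpy.int16 would read them
# (little-endian byte order is an assumption here).
raw = struct.pack("<hh", 113, 518)

# The same four bytes read as a single 32-bit signed integer,
# which is what array.array('i') typically does.
(as_int32,) = struct.unpack("<i", raw)
print(as_int32)  # 33947761
```

If that is indeed the cause, array.array('h') (a 2-byte signed integer on common platforms) should give the same values as numpy.int16.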

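"Really though, I want to know how to replicate [A, count] = fread(fid, 3, 'uint32');"

In Matlab, one of fread()'s signatures is fread(fileID, sizeA, precision). This reads in the first sizeA elements (not bytes) of a file, each of a size sufficient for precision. In this case, since you are reading uint32, each element is 32 bits, or 4 bytes, which is why block_length does not have to fit in a single byte.

So, instead, read 12 bytes to get the first 3 4-byte elements from the file. Calling read(12) on a file opened in binary mode is safer than io.readline(12), because readline stops early if it meets a newline byte.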
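
A minimal sketch of that idea using the standard struct module to turn the 12 raw bytes into three integers. The helper name read_block_header is hypothetical, and little-endian byte order is an assumption about the file; swap "<" for ">" if the data turns out to be big-endian.

```python
import io
import struct

def read_block_header(f):
    """Hypothetical helper: read three uint32 values from binary file object f.

    Returns a tuple of three ints, or None if fewer than 12 bytes remain.
    """
    data = f.read(3 * 4)  # 3 elements x 4 bytes each
    if len(data) < 3 * 4:
        return None
    return struct.unpack("<3I", data)  # "<" = little-endian (assumed)

# Demo with an in-memory stand-in for a file opened with open(path, "rb").
demo = io.BytesIO(struct.pack("<3I", 0x4D445254, 0x43494741, 64))
header = read_block_header(demo)
```

Returning None on a short read corresponds to the else branch of the Matlab count check.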

- John Feminella

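The first part is covered by Torsten's answer... you are going to need array or numarray (since superseded by NumPy) to do anything with this data anyway.

As for the %08X and hex2dec stuff, %08X is just the print format for those uint32 numbers (8-digit hex, exactly the same as in Python), and hex2dec('4D445254') is Matlab for 0x4D445254.

Finally, ~= in Matlab means "not equal", so the Python equivalent is !=. The | in that condition is a logical OR, which is written or in Python.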
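
To make those correspondences concrete, here is a hedged line-by-line sketch of the magic check in Python. The function name check_block_start and its return convention are hypothetical; the constants and format string come from the Matlab code in the question.

```python
CORRECT_MAGIC_L = int("4D445254", 16)  # what hex2dec does; same as 0x4D445254
CORRECT_MAGIC_H = 0x43494741

def check_block_start(magic_l, magic_h, block_length):
    """Hypothetical helper: return the remaining block bytes, or None on bad magic."""
    print("  Magic_L = %08X, Magic_H = %08X, Length = %i"
          % (magic_l, magic_h, block_length))
    # Matlab's ~= becomes != and | becomes or.
    if magic_l != CORRECT_MAGIC_L or magic_h != CORRECT_MAGIC_H:
        print("** Bad block start magic !")
        return None
    # 3*4 spells out "three values of 4 bytes each": once for the header
    # already read, once for the footer the Matlab comment says is expected.
    return block_length - 3 * 4 - 3 * 4

remaining = check_block_start(0x4D445254, 0x43494741, 100)
```

Writing 3*4 instead of 12 changes nothing numerically; it only documents where the 12 comes from.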

- Andrew McGregor

- Matthew Rankin · Accepted Answer

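Python code for reading a 1-dimensional array

When replacing Matlab with Python, I wanted to read binary data into a numpy.array, so I used numpy.fromfile to read the data into a 1-dimensional array: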

import numpy as np

with open(inputfilename, 'rb') as fid:
    data_array = np.fromfile(fid, np.int16)

Some advantages of using numpy.fromfile over other Python solutions include:

Not having to manually determine the number of items to be read. You can specify them using the count= argument, but it defaults to -1 which indicates reading the entire file.
Being able to specify either an open file object (as I did above with fid) or you can specify a filename. I prefer using an open file object, but if you wanted to use a filename, you could replace the two lines above with:
```
data_array = np.fromfile(inputfilename, np.int16)
```
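
For the header read in the original question, the count= argument mentioned above can be combined with a uint32 dtype. This is only a sketch: the demo file contents are made up, and the little-endian dtype "<u4" is an assumption about the data.

```python
import os
import tempfile

import numpy as np

with tempfile.TemporaryDirectory() as tmpdir:
    # Write a small made-up demo file so the sketch is self-contained.
    demo_path = os.path.join(tmpdir, "demo.bin")
    np.array([0x4D445254, 0x43494741, 64, 99], dtype="<u4").tofile(demo_path)

    with open(demo_path, "rb") as fid:
        # Roughly [A, count] = fread(fid, 3, 'uint32')
        A = np.fromfile(fid, dtype="<u4", count=3)  # "<u4" = little-endian uint32
        count = A.size  # compare with 3 to detect a short read
```

Because an open file object is used, a later np.fromfile call on fid continues from the current file position, much like successive fread calls.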

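Matlab code for a 2-dimensional array

Matlab's fread can read data into a matrix of the form [m, n] instead of just reading it into a column vector. For instance, to read data into a matrix with 2 rows, use: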

fid = fopen(inputfilename, 'r');
data_array = fread(fid, [2, inf], 'int16');
fclose(fid);

Equivalent Python code for a 2-dimensional array

You can handle this scenario in Python using NumPy's reshape and transpose (.T):

import numpy as np

with open(inputfilename, 'rb') as fid:
    data_array = np.fromfile(fid, np.int16).reshape((-1, 2)).T

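The -1 tells numpy.reshape to infer the length of that dimension from the other dimension, which is the equivalent of Matlab's inf.
The .T transposes the array so that it is a 2-dimensional array whose first dimension (axis) has length 2.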
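
A small check of why the transpose is needed, using made-up values in place of file data: Matlab fills matrices column by column, so consecutive values in the file end up in the same column. Reshaping with order="F" is an alternative that, as far as I can tell, yields the same layout.

```python
import numpy as np

# Stand-in for the values read from the file, in file order.
flat = np.arange(1, 7, dtype=np.int16)

via_transpose = flat.reshape((-1, 2)).T
via_fortran = flat.reshape((2, -1), order="F")  # column-major fill, as in Matlab

print(via_transpose)
# [[1 3 5]
#  [2 4 6]]
```

Both results have shape (2, 3), with the first two file values forming the first column, which matches fread(fid, [2, inf], 'int16').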