用Fortran读取由Python代码创建的二进制文件

Question

用Fortran读取由Python代码创建的二进制文件

7

我有一个二进制文件，是由Python代码创建的。该代码脚本化了一堆任务以预处理一组数据文件。我现在希望在Fortran中读取这个二进制文件。二进制文件的内容是点的坐标，格式很简单：点数、x0、y0、z0、x1、y1、z1等。这些二进制文件是使用numpy中的“tofile”函数创建的。到目前为止，在Fortran中我有以下代码：

integer:: intValue
double precision:: dblValue
integer:: counter
integer:: check
open(unit=10, file='file.bin', form='unformatted', status='old', access='stream')

counter = 1

do 

  if ( counter == 1 ) then
    read(unit=10, iostat=check) intValue
    if ( check < 0 ) then
      print*,"End Of File"
      stop
    else if ( check > 0 ) then
      print*, "Error Detected"
      stop
    else if ( check == 0 ) then
      counter = counter + 1
      print*, intValue
    end if
  else if ( counter > 1 ) then
    read(unit=10, iostat=check) dblValue
    if ( check < 0 ) then
      print*,"End Of File"
      stop
    else if ( check > 0 ) then
      print*, "Error Detected"
      stop
    else if ( check == 0 ) then
      counter = counter + 1
      print*,dblValue
    end if
  end if

end do

close(unit=10)

很不幸，这并不起作用，我得到了一些垃圾数值（例如6.4731191026611484E+212，2.2844499004808491E-279等）。有人能否指点一下如何正确地做到这一点？此外，在Python和Fortran之间可互换地编写和读取二进制文件的好方法是什么——因为这似乎是我应用程序的要求之一。

谢谢

- computanjohn

1

我添加了一个详细而友好的答案，告诉你要使用access=stream :) 我刚意识到你已经在这样做了，所以我暂时删除了我的答案。所以问题是：你确定你的Python int和Fortran integer的字节大小是相同的吗？你应该检查两者。如果有一字节的差异，数据的错位将导致读取后的垃圾值。你使用的Fortran编译器是什么？你是如何声明你的integer变量的？你的Python int具体类型是什么？ - Andras Deak -- Слава Україні

1

关于可互换性的问题：我更愿意将元数据（如点数）放入单独的ASCII头文件中，这样可以轻松读取，并且只将相同类型的数据放入单个二进制文件中，这也允许相对容易地进行字节序转换。 - haraldkl

展示你的声明。 - agentp

1

为了实现可移植性和互换性，我会使用HDF5来读写数据。它支持Python、Fortran和许多其他语言。 - casey

1

Kind numbers通常是不可移植的，但是我知道的所有编译器都不支持kind=32。请使用kind=real32和kind=real64（这些常量来自模块iso_fortran_env https://gcc.gnu.org/onlinedocs/gfortran/ISO_005fFORTRAN_005fENV.html）。它们应该等同于NumPy的float32和float64。 - Vladimir F Героям слава

显示剩余8条评论

2个回答

1

这是一个使用numpy生成数据并以Fortran二进制方式处理的简单示例。

我计算了在[0,2π)上的360个sin值。

#!/usr/bin/env python3
import numpy as np

with open('sin.dat', 'wb') as outfile:
    np.sin(np.arange(0., 2*np.pi, np.pi/180.,
                     dtype=np.float32)).tofile(outfile)

使用 tofile 导出二进制文件 'sin.dat'，该文件大小为 1440 字节 (360 * sizeof(float32))，使用以下 Fortran95 程序（gfortran -O3 -Wall -pedantic）读取该文件，程序输出 x 在 [0,2π) 范围内的值 1. - (val ** 2 + cos(x) ** 2)。

program numpy_import
    integer,         parameter                    :: REAL_KIND = 4
    integer,         parameter                    :: UNIT = 10
    integer,         parameter                    :: SAMPLE_LENGTH = 360
    real(REAL_KIND), parameter                    :: PI = acos(-1.)
    real(REAL_KIND), parameter                    :: DPHI = PI/180.

    real(REAL_KIND), dimension(0:SAMPLE_LENGTH-1) :: arr
    real(REAL_KIND)                               :: r
    integer                                       :: i


    open(UNIT, file="sin.dat", form='unformatted',&
                access='direct', recl=4)

    do i = 0,ubound(arr, 1)
        read(UNIT, rec=i+1, err=100) arr(i)  
    end do

    do i = 0,ubound(arr, 1)
        r = 1. - (arr(i)**2. + cos(real(i*DPHI, REAL_KIND))**2) 
        write(*, '(F6.4, " ")', advance='no')&
            real(int(r*1E6+1)/1E6, REAL_KIND)
    end do

100 close(UNIT)    
    write(*,*)
end program numpy_import

因此，如果val == sin(x)，则数值结果必须在float32类型中很好地近似为零。

而事实上：

输出：

360 x 0.0000

- decltype_auto

网页内容由stack overflow 提供, 点击上面的

可以查看英文原文，
原文链接

- computanjohn · Accepted Answer

感谢这个伟大的社区，从我得到的所有建议中，再加上一点点尝试，我认为我找到了一个稳定的解决方案来解决这个问题，并且我想与大家分享这个答案。我将在此提供一个最小示例，其中我想从Python将可变大小的数组写入二进制文件，并使用Fortran读取它。我假设行数numRows和列数numCols也与完整数组datatArray一起写入。以下Python脚本writeBin.py编写该文件：

import numpy as np
# Read in the numRows and numCols value 
# Read in the array values
numRowArr = np.array([numRows], dtype=np.float32)
numColArr = np.array([numCols], dtype=np.float32)
fileObj   = open('pybin.bin', 'wb')
numRowArr.tofile(fileObj)
numColArr.tofile(fileObj)
for i in range(numRows):
    lineArr = dataArray[i,:]
    lineArr.tofile(fileObj)
fileObj.close()

接下来，从文件中读取数组的Fortran代码可以编写如下：

program readBin

    use iso_fortran_env

    implicit none

    integer:: nR, nC, i

    real(kind=real32):: numRowVal, numColVal
    real(kind=real32), dimension(:), allocatable:: rowData
    real(kind=real32), dimension(:,:), allocatable:: fullData

    open(unit=10,file='pybin.bin',form='unformatted',status='old',access='stream')

    read(unit=10) numRowVal
    nR = int(numRowVal)

    read(unit=10) numColVal
    nC = int(numColVal)

    allocate(rowData(nC))
    allocate(fullData(nR,nC))

    do i = 1, nR

        read(unit=10) rowData
        fullData(i,:) = rowData(:)

    end do

    close(unit=10)

end program readBin

我从这个讨论串中得到的主要观点是尽可能匹配读和写，并精确指定要读取的数据类型、它们的写入方式等。正如您所注意到的，这只是一个虚构的例子，因此可能有一些不完美的地方。然而，我现在已经用它编写了一个有限元程序，网格数据就是我使用这种二进制读/写的地方 - 它运行得非常好。

附注：如果您发现有些错别字，请告诉我，我会立即进行编辑。

非常感谢。