Getting data from ctypes array into numpy

Question

python numpy ctypes

43

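I am using a Python-wrapped (via ctypes) C library to run a series of computations. At different stages of the run, I want to get data into Python, and specifically into numpy arrays.

The wrapper I am using has two different return types for array data (which is of particular interest to me):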

ctypes Array: When I do type(x) (where x is the ctypes array), I get <class 'module_name.wrapper_class_name.c_double_Array_12000'> in return. I know from the documentation that this data is a copy of the internal data, and I am able to get it into a numpy array easily:
```
>>> np.ctypeslib.as_array(x)
```

This gives me a 1D numpy array of the data.

ctypes pointer to data: In this case, I understand from the library's documentation that I am getting a pointer to the data stored and used directly by the library. When I do type(y) (where y is the pointer) I get <class 'module_name.wrapper_class_name.LP_c_double'>. In this case I am still able to index through the data like y[0][2], but I was only able to get it into numpy via a super awkward:
```
>>> np.frombuffer(np.core.multiarray.int_asbuffer(
    ctypes.addressof(y.contents), array_length*np.dtype(float).itemsize))
```

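I found this in an old numpy mailing list thread from Travis Oliphant, but not in the numpy documentation. If I try the approach used for the ctypes array instead of this one, I get the following: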

>>> np.ctypeslib.as_array(y)
...
...  BUNCH OF STACK INFORMATION
...
AttributeError: 'LP_c_double' object has no attribute '__array_interface__'

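Is this np.frombuffer approach the best or only way to do this? I am open to other suggestions, but I would still like to use numpy, as I have a lot of other post-processing code that relies on numpy functionality and that I want to use with this data.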

- dtlussier

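Do you have control over the C library? Could you change the library's API? - Sven Marnach

Yes - I have the source. I'm not sure which way to go, though, as the pointer approach allows Python to act directly on the data, which I suppose could be an advantage in some cases. In my case, though, it would indeed be an advantage to have everything come out as a ctypes array. Any recommendations? - dtlussier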

2

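I would suggest making the library use a (NumPy) array that you allocate in Python and pass on to the library. That way, you can act on the same memory, but you don't have to bother with any awkward conversions. You already have a NumPy array, and passing it to a library is well supported by using numpy.ctypeslib.ndpointer as the argument type in the ctypes wrapper of your function. (If this isn't clear, just ask...) - Sven Marnach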

6 Answers

14

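Another possibility is to convert from the pointer to the array. This may require more recent versions of the libraries than were available when the first answer was written; I tested something similar with ctypes 1.1.0 and numpy 1.5.0b2.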

np.ctypeslib.as_array(
    (ctypes.c_double * array_length).from_address(ctypes.addressof(y.contents)))

This still seems to have shared ownership semantics, so you probably need to make sure that the underlying buffer is freed eventually.
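To make the shared-memory behaviour concrete, here is a minimal sketch. It assumes a plain ctypes array (called `backing` here, a made-up name) stands in for the buffer the C library would normally own, and it builds `y` from that array so the snippet runs without the library.

```python
import ctypes
import numpy as np

array_length = 4

# Stand-in for memory owned by the C library (assumption: in real code the
# pointer y would come from the wrapped library instead).
backing = (ctypes.c_double * array_length)(1.0, 2.0, 3.0, 4.0)
y = ctypes.cast(backing, ctypes.POINTER(ctypes.c_double))

# Reinterpret the pointed-to memory as a fixed-length ctypes array, then view it.
view = np.ctypeslib.as_array(
    (ctypes.c_double * array_length).from_address(ctypes.addressof(y.contents)))

view[0] = 42.0               # write through the NumPy view
print(backing[0])            # prints 42.0: the original buffer sees the change
print(view.flags.owndata)    # the NumPy array does not own this memory
```

Because no copy is made, the array is only valid for as long as the underlying buffer is alive.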

- seeker

2

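Or without special support from numpy: you could convert the y pointer to a pointer to an array type: ap = ctypes.cast(y, ctypes.POINTER(ArrayType)), where ArrayType = ctypes.c_double * array_length, and then create a numpy array from that: a = np.frombuffer(ap.contents). See "How to convert pointer to C array to Python array". - jfs

I am trying this, but the ap object has no member "contents". - Totte Karlsson

@TotteKarlsson: the code in the link works (I've tested it). It is most likely a bug in your code (it could also be a difference between Python versions, but that is less likely). If you haven't solved the issue yet, create a minimal but complete code example, specify your OS and Python version, and post it as a new SO question. - jfs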
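For reference, the cast described in the first comment of this thread can be sketched as a self-contained snippet. The `source` array is an assumption that stands in for library-owned memory; everything else follows the comment.

```python
import ctypes
import numpy as np

array_length = 5

# Stand-in for the library's data (assumption for the sake of a runnable example).
source = (ctypes.c_double * array_length)(0.0, 1.0, 2.0, 3.0, 4.0)
y = ctypes.cast(source, ctypes.POINTER(ctypes.c_double))

# Cast the element pointer to a pointer to a fixed-length array type.
ArrayType = ctypes.c_double * array_length
ap = ctypes.cast(y, ctypes.POINTER(ArrayType))

# ap.contents supports the buffer protocol; frombuffer defaults to float64.
a = np.frombuffer(ap.contents)
print(a)

a[1] = 10.0          # the array is a view, so this writes into source
print(source[1])
```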

12

np.ctypeslib.as_array is all you need here. From an array:

 import ctypes
 import numpy as np
 c_arr = (ctypes.c_float * 8)()
 np.ctypeslib.as_array(c_arr)

From a pointer:

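 c_arr = (ctypes.c_float * 8)()
 # c_arr[0] is a plain Python float, so cast the array to get a real pointer
 ptr = ctypes.cast(c_arr, ctypes.POINTER(ctypes.c_float))
 np.ctypeslib.as_array(ptr, shape=(8,))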
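The shape argument is not limited to one dimension. The following sketch, which assumes the eight floats should be read as a row-major 2 x 4 grid, views the same memory as a 2-D array. It uses ctypes.cast to obtain the pointer, and the name `grid` is made up for illustration.

```python
import ctypes
import numpy as np

c_arr = (ctypes.c_float * 8)(*[float(i) for i in range(8)])
ptr = ctypes.cast(c_arr, ctypes.POINTER(ctypes.c_float))

# View the 8 floats behind the pointer as 2 rows of 4 columns (C order).
grid = np.ctypeslib.as_array(ptr, shape=(2, 4))
print(grid.shape, grid.dtype)

grid[1, 0] = -1.0    # element (1, 0) is flat index 4 in row-major order
print(c_arr[4])      # prints -1.0
```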

- Eric

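This worked very well for me, thanks. I pass a Python ctypes.Structure with array pointer members to a C function that processes the data, then read the contents of the member arrays in Python. Python 3.8, Numpy 1.19, ctypes 1.1.0 - coulan88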

11

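Neither of these worked for me in Python 3. As a general solution for converting a ctypes pointer into a numpy ndarray in Python 2 and 3, I found that the following works (by obtaining a read-only buffer):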

import sys
import ctypes
import numpy as np

def make_nd_array(c_pointer, shape, dtype=np.float64, order='C', own_data=True):
    # Total size in bytes of the memory block behind c_pointer
    arr_size = int(np.prod(shape[:])) * np.dtype(dtype).itemsize
    if sys.version_info.major >= 3:
        buf_from_mem = ctypes.pythonapi.PyMemoryView_FromMemory
        buf_from_mem.restype = ctypes.py_object
        # C signature: PyMemoryView_FromMemory(char *mem, Py_ssize_t size, int flags)
        buf_from_mem.argtypes = (ctypes.c_void_p, ctypes.c_ssize_t, ctypes.c_int)
        buffer = buf_from_mem(c_pointer, arr_size, 0x100)  # 0x100 == PyBUF_READ
    else:
        buf_from_mem = ctypes.pythonapi.PyBuffer_FromMemory
        buf_from_mem.restype = ctypes.py_object
        buffer = buf_from_mem(c_pointer, arr_size)
    arr = np.ndarray(tuple(shape[:]), dtype, buffer, order=order)
    # The buffer-backed array does not own its data; copy to get an independent array
    if own_data and not arr.flags.owndata:
        return arr.copy()
    else:
        return arr
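As a usage illustration, here is a Python 3 only sketch of the same idea with the copy step left out, so the read-only nature of the buffer is visible. The helper name `readonly_view` and the stand-in `data` array are assumptions made for this example.

```python
import ctypes
import numpy as np

PyBUF_READ = 0x100  # flag value defined in CPython's C API headers

def readonly_view(c_pointer, shape, dtype=np.float64):
    """Hypothetical helper: wrap the memory at c_pointer in a read-only ndarray."""
    nbytes = int(np.prod(shape)) * np.dtype(dtype).itemsize
    from_memory = ctypes.pythonapi.PyMemoryView_FromMemory
    from_memory.restype = ctypes.py_object
    from_memory.argtypes = (ctypes.c_void_p, ctypes.c_ssize_t, ctypes.c_int)
    buf = from_memory(c_pointer, nbytes, PyBUF_READ)
    return np.ndarray(tuple(shape), dtype, buf)

# Stand-in for C-owned memory (assumption: real code gets the pointer from the library).
data = (ctypes.c_double * 6)(0.0, 1.0, 2.0, 3.0, 4.0, 5.0)
ptr = ctypes.cast(data, ctypes.POINTER(ctypes.c_double))

arr = readonly_view(ptr, (2, 3))
print(arr)
print(arr.flags.writeable)   # the memoryview was created read-only
```

Calling .copy() on the result, as the function above does when own_data is true, yields a writable array that no longer depends on the C memory.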

- wordy

This is great! The array comes out rotated, though. Weird :) - jtlz2

5

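Using `np.ndarrays` as `ctypes` arguments

The preferable approach is to use ndpointer, as mentioned in the numpy docs.

This approach is more flexible than using, for example, POINTER(c_double), since several restrictions can be specified, which are verified upon calling the ctypes function. These include data type, number of dimensions, shape and flags. If a given array does not satisfy the specified restrictions, a TypeError is raised.

Minimal, reproducible example

Calling memcpy from Python. The filename of the standard C library, libc.so.6, may need to be adjusted for your system.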

import ctypes
import numpy as np

n_bytes_f64 = 8
nrows = 2
ncols = 5

clib = ctypes.cdll.LoadLibrary("libc.so.6")

clib.memcpy.argtypes = [
    np.ctypeslib.ndpointer(dtype=np.float64, ndim=2, flags='C_CONTIGUOUS'),
    np.ctypeslib.ndpointer(dtype=np.float64, ndim=1, flags='C_CONTIGUOUS'),
    ctypes.c_size_t]
clib.memcpy.restype = ctypes.c_void_p

arr_from = np.arange(nrows * ncols).astype(np.float64)
arr_to = np.empty(shape=(nrows, ncols), dtype=np.float64)

print('arr_from:', arr_from)
print('arr_to:', arr_to)

print('\ncalling clib.memcpy ...\n')
clib.memcpy(arr_to, arr_from, nrows * ncols * n_bytes_f64)

print('arr_from:', arr_from)
print('arr_to:', arr_to)

Output

arr_from: [0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]
arr_to: [[0.0e+000 4.9e-324 9.9e-324 1.5e-323 2.0e-323]
 [2.5e-323 3.0e-323 3.5e-323 4.0e-323 4.4e-323]]

calling clib.memcpy ...

arr_from: [0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]
arr_to: [[0. 1. 2. 3. 4.]
 [5. 6. 7. 8. 9.]]

If you modify the ndim=1/2 arguments of ndpointer so that they are inconsistent with the dimensions of arr_from / arr_to, the code fails with an ArgumentError.
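The validation can also be observed without loading any C library, because the class returned by ndpointer exposes a from_param classmethod that ctypes calls for each argument. The sketch below calls it directly, which raises the TypeError mentioned in the numpy docs; when the check fails inside a real ctypes call, ctypes reports it as an ArgumentError. The names `Matrix` and `rejected` are made up for this example.

```python
import numpy as np

# Same restriction as used for arr_to in the memcpy example.
Matrix = np.ctypeslib.ndpointer(dtype=np.float64, ndim=2, flags='C_CONTIGUOUS')

good = np.zeros((2, 5), dtype=np.float64)
bad_ndim = np.zeros(10, dtype=np.float64)        # wrong number of dimensions
bad_dtype = np.zeros((2, 5), dtype=np.float32)   # wrong data type

Matrix.from_param(good)   # passes validation, no exception

rejected = []
for candidate in (bad_ndim, bad_dtype):
    try:
        Matrix.from_param(candidate)
    except TypeError as exc:
        rejected.append(str(exc))
        print('rejected:', exc)
```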

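As the title of this question is quite general, ...

Constructing a `np.ndarray` from a `ctypes.c_void_p` result

Minimal, reproducible example

In the following example, some memory is allocated by malloc and filled with zeros by memset. Then a numpy array is constructed to access this memory. Of course, this raises ownership issues, as Python is not going to free memory that was allocated in C. To avoid memory leaks, the allocated memory has to be freed again via ctypes. The copy method of np.ndarray can be used to obtain an array that owns its data.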

import ctypes
import numpy as np

n_bytes_int = 4
size = 7

clib = ctypes.cdll.LoadLibrary("libc.so.6")

clib.malloc.argtypes = [ctypes.c_size_t]
clib.malloc.restype = ctypes.c_void_p

clib.memset.argtypes = [
    ctypes.c_void_p,
    ctypes.c_int,
    ctypes.c_size_t]
clib.memset.restype = np.ctypeslib.ndpointer(
    dtype=np.int32, ndim=1, flags='C_CONTIGUOUS')

clib.free.argtypes = [ctypes.c_void_p]
clib.free.restype = ctypes.c_void_p


pntr = clib.malloc(size * n_bytes_int)
ndpntr = clib.memset(pntr, 0, size * n_bytes_int)
print(type(ndpntr))
ctypes_pntr = ctypes.cast(ndpntr, ctypes.POINTER(ctypes.c_int))
print(type(ctypes_pntr))
print()
arr_noowner = np.ctypeslib.as_array(ctypes_pntr, shape=(size,))
arr_owner = np.ctypeslib.as_array(ctypes_pntr, shape=(size,)).copy()
# arr_owner = arr_noowner.copy()


print('arr_noowner (at {:}): {:}'.format(arr_noowner.ctypes.data, arr_noowner))
print('arr_owner (at {:}): {:}'.format(arr_owner.ctypes.data, arr_owner))

print('\nfree allocated memory again ...\n')
_ = clib.free(pntr)

print('arr_noowner (at {:}): {:}'.format(arr_noowner.ctypes.data, arr_noowner))
print('arr_owner (at {:}): {:}'.format(arr_owner.ctypes.data, arr_owner))

print('\njust for fun: free some python-memory ...\n')
_ = clib.free(arr_owner.ctypes.data_as(ctypes.c_void_p))

print('arr_noowner (at {:}): {:}'.format(arr_noowner.ctypes.data, arr_noowner))
print('arr_owner (at {:}): {:}'.format(arr_owner.ctypes.data, arr_owner))

Output

<class 'numpy.ctypeslib.ndpointer_<i4_1d_C_CONTIGUOUS'>
<class '__main__.LP_c_int'>

arr_noowner (at 104719884831376): [0 0 0 0 0 0 0]
arr_owner (at 104719884827744): [0 0 0 0 0 0 0]

free allocated memory again ...

arr_noowner (at 104719884831376): [ -7687536     24381 -28516336     24381         0         0         0]
arr_owner (at 104719884827744): [0 0 0 0 0 0 0]

just for fun: free some python-memory ...

arr_noowner (at 104719884831376): [ -7687536     24381 -28516336     24381         0         0         0]
arr_owner (at 104719884827744): [ -7779696     24381 -28516336     24381         0         0         0]
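The ownership distinction shown above can be checked programmatically, without touching freed memory. This is a minimal sketch that assumes a ctypes array (`foreign`, a made-up name) stands in for the malloc'ed block, so it runs on any platform.

```python
import ctypes
import numpy as np

size = 7

# Stand-in for C-allocated memory (assumption: no libc is loaded here).
foreign = (ctypes.c_int32 * size)()
ptr = ctypes.cast(foreign, ctypes.POINTER(ctypes.c_int32))

arr_view = np.ctypeslib.as_array(ptr, shape=(size,))   # no ownership
arr_copy = arr_view.copy()                             # owns its data

print(arr_view.flags.owndata, arr_copy.flags.owndata)
print(np.shares_memory(arr_view, arr_copy))

foreign[0] = 99                   # change the underlying buffer
print(arr_view[0], arr_copy[0])   # only the view follows the change
```

Checking flags.owndata before the C side releases its buffer is a cheap way to find arrays that would otherwise be left pointing at freed memory.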

- Markus Dutschke

0

If you are OK with creating the arrays in Python, the following example with a 2D array works in Python 3:

import numpy as np
import ctypes

OutType = (ctypes.c_float * 4) * 6
out = OutType()
YourCfunction = ctypes.CDLL('./yourlib.so').voidreturningfunctionwithweirdname
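# Assumed reconstruction: input1 and input2 are ctypes float arrays of length 3 and 5, defined elsewhere
YourCfunction.argtypes = [ctypes.POINTER(ctypes.c_float * 3), ctypes.POINTER(ctypes.c_float * 5), OutType]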
YourCfunction(input1, input2, out)
out = np.array(out) # convert it to numpy

print(out)

The versions of numpy and ctypes are 1.11.1 and 1.1.0, respectively.
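The conversion step can be tried without the C function. In this sketch, one element is assigned by hand as a stand-in for what the library would write. It also contrasts np.array, which copies, with np.ctypeslib.as_array, which returns a view.

```python
import ctypes
import numpy as np

OutType = (ctypes.c_float * 4) * 6   # 6 rows of 4 floats
out = OutType()
out[2][3] = 1.5                      # stand-in for data written by the C function

as_copy = np.array(out)                 # independent copy
as_view = np.ctypeslib.as_array(out)    # shares memory with out
print(as_copy.shape, as_copy.dtype)

out[0][0] = 7.0                      # later change on the ctypes side
print(as_copy[0, 0], as_view[0, 0])  # the copy keeps 0.0, the view shows 7.0
```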

- Ilya Prokin

- Sven Marnach · Accepted Answer

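Creating NumPy arrays from a ctypes pointer object is a problematic operation. It is unclear who actually owns the memory the pointer is pointing to. When will it be freed again? How long is it valid? Whenever possible, I would try to avoid this kind of construct. It is much easier and safer to create arrays in the Python code and pass them to the C function than to use memory allocated by a Python-unaware C function. By doing the latter, you negate to some extent the advantages of having a high-level language take care of the memory management.

If you are really sure that someone takes care of the memory, you can create an object exposing the Python "buffer protocol" and then create a NumPy array using this buffer object. You gave one way of creating the buffer object in your post, via the undocumented int_asbuffer() function: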

buffer = numpy.core.multiarray.int_asbuffer(
    ctypes.addressof(y.contents), 8*array_length)

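(Note that I substituted 8 for np.dtype(float).itemsize. It's always 8, on any platform.) A different way to create the buffer object would be to call the PyBuffer_FromMemory() function from the Python C API via ctypes: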

buffer_from_memory = ctypes.pythonapi.PyBuffer_FromMemory
buffer_from_memory.restype = ctypes.py_object
buffer = buffer_from_memory(y, 8*array_length)

For both of these ways, you can create a NumPy array from buffer by

a = numpy.frombuffer(buffer, float)

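(I actually do not understand why you use .astype() instead of a second parameter to frombuffer; furthermore, I wonder why you use np.int when you said earlier that the array contains doubles.)

I'm afraid it won't get much easier than this, but it isn't that bad, don't you think? You could bury all the ugly details in a wrapper function and not worry about it any more.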
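One possible shape for such a wrapper function is sketched below. Note that it swaps in np.ctypeslib.as_array with a shape argument instead of the buffer-object calls shown above, and that the name `pointer_to_array` and the stand-in `storage` array are made up for this example.

```python
import ctypes
import numpy as np

def pointer_to_array(ptr, length, copy=True):
    """Hypothetical wrapper: expose `length` items behind a typed ctypes pointer.

    With copy=True the result owns its data and stays valid after the C
    library frees or reuses the buffer; with copy=False it is only a view.
    """
    view = np.ctypeslib.as_array(ptr, shape=(length,))
    return view.copy() if copy else view

# Stand-in for the library's buffer (assumption: the real y comes from the C library).
storage = (ctypes.c_double * 3)(1.0, 2.0, 3.0)
y = ctypes.cast(storage, ctypes.POINTER(ctypes.c_double))

safe = pointer_to_array(y, 3)               # independent copy
live = pointer_to_array(y, 3, copy=False)   # shares memory with storage

storage[0] = -5.0
print(safe[0], live[0])   # the copy keeps 1.0, the view shows -5.0
```

Defaulting to a copy trades a little speed for not having to reason about the lifetime of the C buffer, which is the concern raised at the start of this answer.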
