Linear interpolation using numpy.interp

Question

python numpy interpolation linear-interpolation

11

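I have a one-dimensional array A of floats that is mostly complete, but a few of the values are missing. The missing values have been replaced with nan (not a number). I need to fill in the missing values by linear interpolation from the surrounding valid values. For example: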

F7(np.array([10.,20.,nan,40.,50.,nan,30.]))

should return

np.array([10.,20.,30.,40.,50.,40.,30.]).

What is the best way of doing this in Python?

Any help would be much appreciated.

Thanks.

- user1789657

3

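Do you really want linear interpolation, or do you actually mean an average? Also, may I assume that the first and last values are guaranteed not to be NaN? - mgilson

The values only happen to be averages in this example. Linear interpolation should indeed find the missing values from a linear equation. And the first and last values are not NaN. - user1789657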
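
For reference, the linear equation mentioned in the comment above can be written out as follows. This is the standard two-point interpolation formula, not something stated in the thread, and it assumes the array index plays the role of the x-coordinate, with $(x_0, y_0)$ and $(x_1, y_1)$ the nearest valid neighbours on either side of the gap:

```latex
y = y_0 + (y_1 - y_0)\,\frac{x - x_0}{x_1 - x_0}

% index 2, between (1, 20) and (3, 40):
y = 20 + (40 - 20)\,\frac{2 - 1}{3 - 1} = 30

% index 5, between (4, 50) and (6, 30):
y = 50 + (30 - 50)\,\frac{5 - 4}{6 - 4} = 40
```

When a gap is exactly one element wide, as in this example, the result coincides with the average of the two neighbours.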

3 Answers

9

I would go with pandas. A minimalistic approach, with a one-liner:

import numpy as np
from pandas import Series

a = np.array([10., 20., np.nan, 40., 50., np.nan, 30.])
Series(a).interpolate()

Out[219]:
0    10
1    20
2    30
3    40
4    50
5    40
6    30

Or, if you want to keep it as an array:

Series(a).interpolate().values

Out[221]:
array([ 10.,  20.,  30.,  40.,  50.,  40.,  30.])
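
One caveat that may be worth keeping in mind: the default `method="linear"` of `Series.interpolate` treats the values as equally spaced and ignores the index. If the samples are not evenly spaced, `method="index"` uses the index values as x-coordinates instead. A minimal sketch with made-up data (the positions 0, 1 and 3 are an assumption for illustration, not taken from the question):

```python
import numpy as np
import pandas as pd

# Hypothetical data: samples at unevenly spaced positions 0, 1 and 3.
s = pd.Series([10., np.nan, 40.], index=[0, 1, 3])

# Default method="linear": values are treated as equally spaced.
by_position = s.interpolate()

# method="index": the index values are used as x-coordinates.
by_index = s.interpolate(method="index")

print(by_position.to_numpy())  # gap filled halfway between 10 and 40
print(by_index.to_numpy())     # gap filled one third of the way from 10 to 40
```

For the evenly spaced array in the question, both methods give the same result.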

- root

@larsmans: I was just about to suggest .values, which also returns an array :) - root

Saw it, deleted my comment. Pandas is still on my list of "libraries to learn" :) - Fred Foo

0

To avoid creating a new Series object, or new items in a Series, every time you want to interpolate data, use RedBlackPy. See the code example below:

import redblackpy as rb

# we do not include missing data
index = [0,1,3,4,6]
data = [10,20,40,50,30]
# create Series object
series = rb.Series(index=index, values=data, dtype='float32',
                   interpolate='linear')

# Now you can access any key using linear interpolation
# Interpolation does not create new items in the Series
print(series[2]) # prints 30
print(series[5]) # prints 40
# print Series and see that keys 2 and 5 do not exist in series
print(series)

The last print produces the following output:

Series object Untitled
0: 10.0
1: 20.0
3: 40.0
4: 50.0
6: 30.0

- Кирилл Солодских

- Fred Foo · Accepted Answer

You could use scipy.interpolate.interp1d:

>>> from scipy.interpolate import interp1d
>>> import numpy as np
>>> x = np.array([10., 20., np.nan, 40., 50., np.nan, 30.])
>>> not_nan = np.logical_not(np.isnan(x))
>>> indices = np.arange(len(x))
>>> interp = interp1d(indices[not_nan], x[not_nan])
>>> interp(indices)
array([ 10.,  20.,  30.,  40.,  50.,  40.,  30.])

Edit: it took me a while to figure out how to use np.interp, but that does the job too:

>>> np.interp(indices, indices[not_nan], x[not_nan])
array([ 10.,  20.,  30.,  40.,  50.,  40.,  30.])
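
The np.interp call above could be wrapped in a small reusable helper. The sketch below is one possible way to do it; the function name `fill_nan_linear` is hypothetical and not part of NumPy. Note that np.interp holds the nearest valid value for points outside the known range, so leading or trailing NaNs would be filled with the first or last valid value rather than extrapolated.

```python
import numpy as np

def fill_nan_linear(values):
    """Return a copy of `values` with NaNs filled by linear interpolation.

    Hypothetical helper: the array index is used as the x-coordinate.
    """
    values = np.asarray(values, dtype=float)
    indices = np.arange(len(values))
    not_nan = ~np.isnan(values)
    # Evaluate the piecewise-linear function through the valid points
    # at every index, including the positions that held NaN.
    return np.interp(indices, indices[not_nan], values[not_nan])

filled = fill_nan_linear([10., 20., np.nan, 40., 50., np.nan, 30.])
print(filled)
```

For the array in the question this reproduces the expected values 10, 20, 30, 40, 50, 40, 30.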