Trying to do linear interpolation in Python

Question

4

I have three arrays, a, b, and c, each of length 15.

a=[950, 850, 750, 675, 600, 525, 460, 400, 350, 300, 250, 225, 200, 175, 150] 

b = [16, 12, 9, -35, -40, -40, -40, -45, -50, -55, -60, -65, -70, -75, -80]

c=[32.0, 22.2, 12.399999999999999, 2.599999999999998, -7.200000000000003, -17.0, -26.800000000000004, -36.60000000000001, -46.400000000000006, -56.2, -66.0, -75.80000000000001, -85.60000000000001, -95.4, -105.20000000000002]

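I am trying to find the value of a at the index where b=c.

The problem is that there is no index where b=c exactly, so I need to linearly interpolate between values in the arrays to find the value of a where b=c. Does that make sense?

I was thinking about using scipy.interpolate to do the interpolation.

I am having a hard time wrapping my head around how to solve this. Any ideas would be great!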

- HM14

Can you add some example arrays a, b, and c? - Christian Ternus

b = [16, 12, 9, -35, -40, -40, -40, -45, -50, -55, -60, -65, -70, -75, -80]; c = [32.0, 22.2, 12.399999999999999, 2.599999999999998, -7.200000000000003, -17.0, -26.800000000000004, -36.60000000000001, -46.400000000000006, -56.2, -66.0, -75.80000000000001, -85.60000000000001, -95.4, -105.20000000000002]; a = [950, 850, 750, 675, 600, 525, 460, 400, 350, 300, 250, 225, 200, 175, 150] - HM14

Please define "linear interpolation" (what do you mean? A linear regression over all points? Piecewise linear? Is that a valid assumption for the data? ...). Also: don't add this example as a comment; edit your question and format it! - sascha

Linear interpolation: https://en.wikipedia.org/wiki/Linear_interpolation. In particular, see "Interpolation of a data set": https://en.wikipedia.org/wiki/Linear_interpolation#Interpolation_of_a_data_set - Warren Weckesser

3 Answers

1

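This is not necessarily a solution to your problem, since your data does not appear to be linear, but it might give you some ideas. If you assume that a, b, and c are each linear in the sample index, then the following idea works:

Perform a linear regression on a, b, and c to get their respective slopes (m_a, m_b, m_c) and y-intercepts (b_a, b_b, b_c). Then solve the equation "y_b = y_c" for x, and evaluate y = m_a * x + b_a to get your result.

Since each regression approximates its data by a line y = m * x + b, setting y_b = y_c gives m_b * x + b_b = m_c * x + b_c, which can be solved by hand: x = (b_b-b_c) / (m_c-m_b).

Using Python, you get: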

>>> from scipy import stats
>>> m_a, b_a, r_a, p_a, err_a = stats.linregress(range(15), a)
>>> m_b, b_b, r_b, p_b, err_b = stats.linregress(range(15), b)
>>> m_c, b_c, r_c, p_c, err_c = stats.linregress(range(15), c)
>>> x = (b_b-b_c) / (m_c-m_b)
>>> m_a * x + b_a
379.55151515151516

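Since your data is not linear, you probably need to go through your vectors one interval at a time and search for overlapping y intervals. Then you can apply the method above, but using only the endpoints of the two overlapping intervals as the b and c inputs to the linear regression. In this case you should get an exact result, since a least-squares fit through only two points interpolates them perfectly (although in this simple case there are more efficient ways to solve for the intersection). Cheers.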
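The per-interval idea in the last paragraph can be sketched without any regression at all, since two points determine a line exactly. This is only an illustrative sketch under stated assumptions: the helper name `crossings` is hypothetical, the sample index is used as the x-axis (as in the regression code above), and only strict sign changes of b - c are treated as crossings.

```python
a = [950, 850, 750, 675, 600, 525, 460, 400, 350, 300, 250, 225, 200, 175, 150]
b = [16, 12, 9, -35, -40, -40, -40, -45, -50, -55, -60, -65, -70, -75, -80]
c = [32.0, 22.2, 12.399999999999999, 2.599999999999998, -7.200000000000003,
     -17.0, -26.800000000000004, -36.60000000000001, -46.400000000000006,
     -56.2, -66.0, -75.80000000000001, -85.60000000000001, -95.4,
     -105.20000000000002]


def crossings(a, b, c):
    """Hypothetical helper: values of a where the piecewise-linear b and c cross."""
    results = []
    for i in range(len(a) - 1):
        d0 = b[i] - c[i]
        d1 = b[i + 1] - c[i + 1]
        if d0 * d1 < 0:  # b - c strictly changes sign inside this interval
            m_b = b[i + 1] - b[i]  # slope of b per index step
            m_c = c[i + 1] - c[i]  # slope of c per index step
            # Solve b[i] + m_b*(x - i) == c[i] + m_c*(x - i) for x.
            # m_b - m_c equals d1 - d0, which is nonzero after a sign change.
            x = i + (c[i] - b[i]) / (m_b - m_c)
            # Interpolate a linearly at the same fractional index.
            results.append(a[i] + (a[i + 1] - a[i]) * (x - i))
    return results


print(crossings(a, b, c))
```

For the data in the question there is a single sign change, between a = 350 and a = 300, and the interpolated value of a comes out at approximately 312.5 rather than the 379.55 given by the global regression.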

- davhoo

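Don't let Warren see your solution ;-). I like it. It's debatable whether linear regression is the right approach (as you questioned yourself), but the OP didn't give much information. Our approaches are the same and give the same result, but yours is more elegant (no general-purpose optimizer needed)! - sascha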

0

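Another simple solution uses:

one linear regressor per vector (done with scikit-learn because the scipy docs were unavailable to me; it is easy to switch to a numpy/scipy-based linear regression)
general-purpose minimization with scipy.optimize.minimize

Code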

a=[950, 850, 750, 675, 600, 525, 460, 400, 350, 300, 250, 225, 200, 175, 150]
b = [16, 12, 9, -35, -40, -40, -40, -45, -50, -55, -60, -65, -70, -75, -80]
c=[32.0, 22.2, 12.399999999999999, 2.599999999999998, -7.200000000000003, -17.0, -26.800000000000004, -36.60000000000001, -46.400000000000006, -56.2, -66.0, -75.80000000000001, -85.60000000000001, -95.4, -105.20000000000002]

from sklearn.linear_model import LinearRegression
from scipy.optimize import minimize
import numpy as np

reg_a = LinearRegression().fit(np.arange(len(a)).reshape(-1,1), a)
reg_b = LinearRegression().fit(np.arange(len(b)).reshape(-1,1), b)
reg_c = LinearRegression().fit(np.arange(len(c)).reshape(-1,1), c)

funA = lambda x: reg_a.predict(x.reshape(-1,1))
funB = lambda x: reg_b.predict(x.reshape(-1,1))
funC = lambda x: reg_c.predict(x.reshape(-1,1))

opt_crossing = lambda x: (funB(x) - funC(x))**2
x0 = 1
res = minimize(opt_crossing, x0, method='SLSQP', tol=1e-6)
print(res)
print('Solution: ', funA(res.x))

import matplotlib.pyplot as plt

x = np.linspace(0, 15, 100)
a_ = reg_a.predict(x.reshape(-1,1))
b_ = reg_b.predict(x.reshape(-1,1))
c_ = reg_c.predict(x.reshape(-1,1))

plt.plot(x, a_, color='blue')
plt.plot(x, b_, color='green')
plt.plot(x, c_, color='cyan')
plt.scatter(np.arange(15), a, color='blue')
plt.scatter(np.arange(15), b, color='green')
plt.scatter(np.arange(15), c, color='cyan')

plt.axvline(res.x, color='red', linestyle='solid')
plt.axhline(funA(res.x), color='red', linestyle='solid')

plt.show()

Output

fun: array([  7.17320622e-15])
jac: array([ -3.99479864e-07,   0.00000000e+00])
message: 'Optimization terminated successfully.'
nfev: 8
nit: 2
njev: 2
status: 0
success: True
  x: array([ 8.37754008])
Solution:  [ 379.55151658]

Plot

- sascha

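Cool, but 379? Plot the data and see if you think that answer is satisfactory. :) - Warren Weckesser

@WarrenWeckesser It's all about the model. I asked him what he means by linear interpolation but got no answer, so my approach uses a global linear regression. That may or may not be valid. It's a modeling decision! So I have to admit: I find it satisfactory, and the difference between our models is now easy to see (now that you have added a plot too). - sascha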

- Warren Weckesser · Accepted Answer

Here is a simpler version of a function that I used in another answer:

from __future__ import division

import numpy as np


def find_roots(t, y):
    """
    Given the input signal `y` with samples at times `t`,
    find the times where `y` is 0.

    `t` and `y` must be 1-D numpy arrays.

    Linear interpolation is used to estimate the time `t` between
    samples at which sign changes in `y` occur.
    """
    # Find where y crosses 0.
    transition_indices = np.where(np.sign(y[1:]) != np.sign(y[:-1]))[0]

    # Linearly interpolate the time values where the transition occurs.
    t0 = t[transition_indices]
    t1 = t[transition_indices + 1]
    y0 = y[transition_indices]
    y1 = y[transition_indices + 1]
    slope = (y1 - y0) / (t1 - t0)
    transition_times = t0 - y0/slope

    return transition_times

The function can be used with t = a and y = b - c. For example, here is your data, entered as numpy arrays:

In [354]: a = np.array([950, 850, 750, 675, 600, 525, 460, 400, 350, 300, 250, 225, 200, 175, 150])

In [355]: b = np.array([16, 12, 9, -35, -40, -40, -40, -45, -50, -55, -60, -65, -70, -75, -80])

In [356]: c = np.array([32.0, 22.2, 12.399999999999999, 2.599999999999998, -7.200000000000003, -17.0, -26.800000000000004, -36.60000000000001, -46.400000000000006, -56.2, -66.0, -75.80000000000001, -85.60000000000001, -95.4, -105.20000000000002])

The place where "b=c" is the place where "b-c=0", so we pass b-c for y:

In [357]: find_roots(a, b - c)
Out[357]: array([ 312.5])

So the linearly interpolated value of a is 312.5.
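As a quick sanity check on that value, one could interpolate b and c themselves at a = 312.5 and confirm that they agree. The sketch below uses `np.interp`, which expects increasing x-coordinates, so the arrays are reversed first; the variable names `a_inc`, `b_inc`, `c_inc`, `b_at` and `c_at` are made up for illustration and are not part of the answer above.

```python
import numpy as np

a = np.array([950, 850, 750, 675, 600, 525, 460, 400, 350, 300, 250, 225, 200, 175, 150])
b = np.array([16, 12, 9, -35, -40, -40, -40, -45, -50, -55, -60, -65, -70, -75, -80])
c = np.array([32.0, 22.2, 12.399999999999999, 2.599999999999998, -7.200000000000003,
              -17.0, -26.800000000000004, -36.60000000000001, -46.400000000000006,
              -56.2, -66.0, -75.80000000000001, -85.60000000000001, -95.4,
              -105.20000000000002])

# np.interp needs increasing sample points; a is decreasing, so reverse everything.
a_inc, b_inc, c_inc = a[::-1], b[::-1], c[::-1]

a_cross = 312.5  # value returned by find_roots(a, b - c) above
b_at = np.interp(a_cross, a_inc, b_inc)
c_at = np.interp(a_cross, a_inc, c_inc)

# If 312.5 is the crossing, the two interpolated values should match.
print(b_at, c_at)
```

Both interpolated values come out at about -53.75, so the piecewise-linear b and c do meet at a = 312.5.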

With the following matplotlib commands:

In [391]: plot(a, b, label="b")
Out[391]: [<matplotlib.lines.Line2D at 0x11eac8780>]

In [392]: plot(a, c, label="c")
Out[392]: [<matplotlib.lines.Line2D at 0x11f23aef0>]

In [393]: roots = find_roots(a, b - c)

In [394]: [axvline(root, color='k', alpha=0.2) for root in roots]
Out[394]: [<matplotlib.lines.Line2D at 0x11f258208>]

In [395]: grid()

In [396]: legend(loc="best")
Out[396]: <matplotlib.legend.Legend at 0x11f260ba8>

In [397]: xlabel("a")
Out[397]: <matplotlib.text.Text at 0x11e71c470>

I get a plot of b and c against a, with a vertical line marking the root.