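I have many samples (y_i, (a_i, b_i, c_i)), where y is assumed to vary as a polynomial in a, b, c up to a certain degree. For example, for a given set of data and degree 2 I might produce the model
y = a^2 + 2ab - 3cb + c^2 + .5ac
This can be done using least squares and is a slight extension of numpy's polyfit routine. Is there a standard implementation somewhere in the Python ecosystem?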
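To make the least-squares formulation concrete, here is a minimal sketch using only numpy. The helper name `quadratic_design_matrix` and the synthetic data are assumptions made for illustration. Only the pure degree-2 monomials are included, matching the example model above.

```python
import numpy as np

def quadratic_design_matrix(a, b, c):
    """Hypothetical helper: one column per degree-2 monomial in a, b, c."""
    return np.column_stack([a**2, b**2, c**2, a*b, a*c, b*c])

# Synthetic samples generated from the example model (illustration only).
rng = np.random.default_rng(0)
a, b, c = rng.normal(size=(3, 50))
y = a**2 + 2*a*b - 3*c*b + c**2 + 0.5*a*c

# Ordinary least squares: minimise ||A @ coeffs - y||^2.
A = quadratic_design_matrix(a, b, c)
coeffs, residual_sum, rank, _ = np.linalg.lstsq(A, y, rcond=None)

# Pair each coefficient with the monomial it multiplies.
labels = ["a^2", "b^2", "c^2", "ab", "ac", "bc"]
print(dict(zip(labels, coeffs.round(6))))
```

Because the model is linear in its coefficients, this is a plain linear least-squares problem and needs no initial guess.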
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn import linear_model
#X is the independent variable (bivariate in this case)
X = np.array([[0.44, 0.68], [0.99, 0.23]])
#vector is the dependent data
vector = np.array([109.85, 155.72])
#predict is an independent variable for which we'd like to predict the value
predict= np.array([[0.49, 0.18]])
#generate a model of polynomial features
poly = PolynomialFeatures(degree=2)
#transform the x data for proper fitting (for single variable type it returns,[1,x,x**2])
X_ = poly.fit_transform(X)
#transform the prediction with the already-fitted transformer (no refit needed)
predict_ = poly.transform(predict)
#here we can remove polynomial orders we don't want
#for instance I'm removing the `x` component
X_ = np.delete(X_,(1),axis=1)
predict_ = np.delete(predict_,(1),axis=1)
#generate the regression object
clf = linear_model.LinearRegression()
#perform the actual regression
clf.fit(X_, vector)
print("X_ = ",X_)
print("predict_ = ",predict_)
print("Prediction = ",clf.predict(predict_))
>>> X_ = [[ 0.44 0.68 0.1936 0.2992 0.4624]
>>> [ 0.99 0.23 0.9801 0.2277 0.0529]]
>>> predict_ = [[ 0.49 0.18 0.2401 0.0882 0.0324]]
>>> Prediction = [ 126.84247142]
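What about the implementation of the `delete` function? Thanks! - Shivam Gaur
What is `PolynomialFeatures`? What does it do? Can I see the code? - Charlie Parker
`fit_transform` returns both the polynomial feature matrix (the Vandermonde matrix) and the prediction? :/ - Charlie Parker
How does this compare to `c_pinv = np.dot(np.linalg.pinv( Kern_train ),Y_train)`? - Charlie Parker
The `predict` variable should be a 2D array: [[0.49, 0.18]] - Luca
sklearn has a nice example using their Pipeline here. This is the core of their example: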
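from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures
# degrees, i, X (a 1-D array) and y come from the full sklearn example
polynomial_features = PolynomialFeatures(degree=degrees[i],
                                         include_bias=False)
linear_regression = LinearRegression()
pipeline = Pipeline([("polynomial_features", polynomial_features),
                     ("linear_regression", linear_regression)])
pipeline.fit(X[:, np.newaxis], y)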
You don't need to transform the data yourself; just pass it into the Pipeline.
from kapteyn import kmpfit

def model(p, v, x, w):
    a, b, c, d, e, f, g, h, i, j = p  # coefficients of the polynomial
    return (a*v**2 + b*x**2 + c*w**2 + d*v*x + e*v*w + f*x*w
            + g*v + h*x + i*w + j)

def residuals(p, data):  # function needed by the fit routine
    v, x, w, z = data  # the values for v, x, w and the measured hypersurface z
    # this should be (z - model(p, v, x, w)) / err if
    # there are error bars on the measured z values
    return z - model(p, v, x, w)  # returns an array of residuals

# initial guess at the parameters; avoid using 0.0 as an initial guess
par0 = [1.0] * 10
# create a fitting object; data should be in the form the functions
# above expect, i.e. a tuple (v, x, w, z) of equal-length arrays
fitobj = kmpfit.Fitter(residuals=residuals, data=data)
# call the fitter
fitobj.fit(params0=par0)
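The success of fits like this depends heavily on the starting values, so choose them carefully if possible. With so many free parameters, it can be difficult to arrive at a solution.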
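If the Kapteyn package is not installed, the same residual-based approach can be sketched with `scipy.optimize.least_squares`. This swaps in a different fitter from the one used above. The synthetic data and the `true_p` coefficients are assumptions for illustration.

```python
import numpy as np
from scipy.optimize import least_squares

def model(p, v, x, w):
    # Full quadratic in three variables: 10 coefficients.
    a, b, c, d, e, f, g, h, i, j = p
    return (a*v**2 + b*x**2 + c*w**2 + d*v*x + e*v*w + f*x*w
            + g*v + h*x + i*w + j)

def residuals(p, v, x, w, z):
    # Difference between the measured z and the model prediction.
    return z - model(p, v, x, w)

# Synthetic, noise-free data (illustration only).
rng = np.random.default_rng(0)
v, x, w = rng.uniform(-1, 1, size=(3, 100))
true_p = np.array([1.0, 0.0, 1.0, 2.0, 0.5, -3.0, 0.0, 0.0, 0.0, 0.0])
z = model(true_p, v, x, w)

# Start from all ones, as in the kmpfit snippet.
result = least_squares(residuals, x0=np.ones(10), args=(v, x, w, z))
fitted_p = result.x
print(fitted_p.round(4))
```

In this sketch the model is linear in its coefficients, so the starting values matter far less than they would for a genuinely nonlinear model.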