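I am trying to fit an SGDRegressor to my data and then check its accuracy. The fit works fine, but the predictions do not seem to be the same data type as the original target data (?), and I get the following error:
ValueError: Can't handle mix of multiclass and continuous
when calling
print "Accuracy:", ms.accuracy_score(y_test,predictions)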
The data looks like this (only a little over 200,000 rows):
Product_id/Date/product_group1/Price/Net price/Purchase price/Hour/Quantity/product_group2
0 107 12/31/2012 10 300 236 220 10 1 108
The code is as follows:
from sklearn.preprocessing import StandardScaler
import numpy as np
from sklearn.linear_model import SGDRegressor
from sklearn import metrics as ms

# Random split of the beers DataFrame: roughly 80% train, 20% test
msk = np.random.rand(len(beers)) < 0.8
train = beers[msk]
test = beers[~msk]

# Training features and target (Quantity), flattened to a 1-D array
X = train[['Price', 'Net price', 'Purchase price', 'Hour', 'Product_id', 'product_group2']]
y = train[['Quantity']]
y = y.as_matrix().ravel()

# Test features and target
X_test = test[['Price', 'Net price', 'Purchase price', 'Hour', 'Product_id', 'product_group2']]
y_test = test[['Quantity']]
y_test = y_test.as_matrix().ravel()

clf = SGDRegressor(n_iter=2000)
clf.fit(X, y)
predictions = clf.predict(X_test)

# This is the line that raises the ValueError
print "Accuracy:", ms.accuracy_score(y_test, predictions)
Is there something I should be doing differently? Thanks!
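A note on the error itself: accuracy_score is a classification metric, so it expects discrete labels on both sides. Here y_test holds integer quantities (which scikit-learn treats as multiclass labels) while SGDRegressor returns continuous floats, which would explain the "mix of multiclass and continuous" message. A minimal sketch of scoring with regression metrics instead follows; the small arrays are made-up stand-ins for y_test and predictions, not values from the data above.

```python
import numpy as np
from sklearn import metrics as ms

# Hypothetical stand-ins for y_test and predictions from the question.
y_test = np.array([1.0, 2.0, 1.0, 3.0])
predictions = np.array([1.2, 1.8, 0.9, 2.5])

# Regression metrics accept continuous predictions.
mse = ms.mean_squared_error(y_test, predictions)   # mean of squared errors
mae = ms.mean_absolute_error(y_test, predictions)  # mean of absolute errors
r2 = ms.r2_score(y_test, predictions)              # 1 - SS_res / SS_tot

print("MSE:", mse)
print("MAE:", mae)
print("R^2:", r2)
```

Which of these to report depends on the use case; R^2 is probably the closest analogue to an "accuracy" figure for a regressor, since 1.0 means a perfect fit.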
y_preds = y_preds > 0.5
to convert them to discrete values. You can set your own threshold here. - Shark Deng
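The thresholding suggested in the comment can be sketched as below. This assumes a binary 0/1 target, which is an assumption of the sketch and not true of the Quantity column in the question; the arrays y_true and y_preds are hypothetical example values.

```python
import numpy as np

# Hypothetical binary labels and continuous model outputs.
y_true = np.array([0, 1, 1, 0, 1])
y_preds = np.array([0.1, 0.8, 0.4, 0.3, 0.9])

# Anything above the threshold becomes class 1, the rest class 0.
threshold = 0.5  # adjustable cut-off
y_labels = (y_preds > threshold).astype(int)

# With discrete labels on both sides, accuracy is well defined:
# the fraction of positions where prediction and truth agree.
accuracy = np.mean(y_labels == y_true)
print("Accuracy:", accuracy)
```

For a count-valued target such as a quantity, a single threshold loses most of the information, so a regression metric is likely the better fit there.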