Getting a "continuous is not supported" error with RandomForestRegressor

39

I'm just trying to run a simple RandomForestRegressor example, but when I test the accuracy I get this error:

/Users/noppanit/anaconda/lib/python2.7/site-packages/sklearn/metrics/classification.pyc in accuracy_score(y_true, y_pred, normalize, sample_weight)
    177
    178     # Compute accuracy for each possible representation
--> 179     y_type, y_true, y_pred = _check_targets(y_true, y_pred)
    180     if y_type.startswith('multilabel'):
    181         differing_labels = count_nonzero(y_true - y_pred, axis=1)

/Users/noppanit/anaconda/lib/python2.7/site-packages/sklearn/metrics/classification.pyc in _check_targets(y_true, y_pred)
     90     if (y_type not in ["binary", "multiclass", "multilabel-indicator",
     91                        "multilabel-sequences"]):
---> 92         raise ValueError("{0} is not supported".format(y_type))
     93
     94     if y_type in ["binary", "multiclass"]:

ValueError: continuous is not supported

Here is a sample of the data layout. I can't show the real data.

target, func_1, func_2, func_2, ... func_200
float, float, float, float, ... float

Here is my code.

import pandas as pd
import numpy as np
from sklearn.preprocessing import Imputer
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor, ExtraTreesRegressor, GradientBoostingRegressor
from sklearn.cross_validation import train_test_split
from sklearn.metrics import accuracy_score
from sklearn import tree

train = pd.read_csv('data.txt', sep='\t')

labels = train.target
train.drop('target', axis=1, inplace=True)
cat = ['cat']
train_cat = pd.get_dummies(train[cat])

train.drop(train[cat], axis=1, inplace=True)
train = np.hstack((train, train_cat))

imp = Imputer(missing_values='NaN', strategy='mean', axis=0)
imp.fit(train)
train = imp.transform(train)

x_train, x_test, y_train, y_test = train_test_split(train, labels.values, test_size = 0.2)

clf = RandomForestRegressor(n_estimators=10)

clf.fit(x_train, y_train)
y_pred = clf.predict(x_test)
accuracy_score(y_test, y_pred) # This is where I get the error.

3 Answers

88

This happens because accuracy_score is meant for classification tasks only. For regression you should use a different metric, for example:

clf.score(X_test, y_test)

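Here X_test holds the samples and y_test the corresponding ground-truth values. score computes the predictions internally and, for a regressor, returns R-squared (the coefficient of determination).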
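As a minimal sketch of the point above, the snippet below checks that a regressor's score matches r2_score applied to explicit predictions, and also reports two common error metrics. It assumes a recent scikit-learn in which train_test_split lives in sklearn.model_selection (the question's sklearn.cross_validation is from an older release), and it uses synthetic data from make_regression as a stand-in for the asker's file. The name reg and the random_state values are illustrative choices, not part of the original post.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the asker's data: float features and a float target.
X, y = make_regression(n_samples=200, n_features=10, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

reg = RandomForestRegressor(n_estimators=10, random_state=0)
reg.fit(X_train, y_train)

# score() predicts internally; r2_score() takes explicit predictions.
y_pred = reg.predict(X_test)
r2_from_score = reg.score(X_test, y_test)
r2_from_metric = r2_score(y_test, y_pred)

# Error metrics expressed in the units of the target (MSE in squared units).
mse = mean_squared_error(y_test, y_pred)
mae = mean_absolute_error(y_test, y_pred)

print("R^2 via score():   ", r2_from_score)
print("R^2 via r2_score():", r2_from_metric)
print("MSE:", mse, "MAE:", mae)
```

The two R-squared values should agree, so calling score is just a shortcut when you do not need the predictions themselves.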


Does anyone know how to compare predicted values against test values in regression, the way you would for classification? - Priyansh
1
@Priyansh In regression you can compare predicted and test values with R-squared (the coefficient of determination). - ThReSholD
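To make the comparison in that comment concrete, here is a small dependency-free sketch of R-squared computed by hand as 1 - SS_res / SS_tot. The function name r_squared and the sample numbers are illustrative only. Raising an error for a constant y_true is a choice made for this sketch and is not necessarily what scikit-learn's r2_score does in that case.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: how far the predictions are from the truth.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: how far the truth is from its own mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


# Illustrative values only.
y_test = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]
print(r_squared(y_test, y_pred))
```

A value of 1.0 means perfect predictions, 0.0 means no better than always predicting the mean of y_test, and negative values mean worse than that.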

5

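Since you are doing a regression task, you should use the R-squared metric (coefficient of determination) instead of the accuracy score, which is meant for classification problems.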

R-squared can be computed by calling the score function provided by RandomForestRegressor, for example:

rfr.score(X_test,Y_test)
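To see why accuracy_score rejects these targets in the first place, the sketch below uses type_of_target from sklearn.utils.multiclass, which classifies a target array by kind. It assumes a reasonably recent scikit-learn. The sample arrays are made up, and the exact wording of the error message may differ between versions, so the code only captures it rather than relying on it.

```python
from sklearn.metrics import accuracy_score
from sklearn.utils.multiclass import type_of_target

# Made-up targets: integer class labels versus real-valued regression targets.
class_labels = [0, 1, 1, 0]
float_targets = [0.5, 1.7, 3.2, 2.9]

label_kind = type_of_target(class_labels)
float_kind = type_of_target(float_targets)
print("class labels ->", label_kind)
print("float targets ->", float_kind)

# accuracy_score only accepts classification-style targets, so passing
# real-valued arrays is expected to raise ValueError.
try:
    accuracy_score(float_targets, float_targets)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print("accuracy_score error:", error_message)
```

The float target in the question falls into the same category as float_targets here, which is what the traceback reports.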

0

Try this:

tree_clf.score(x_train, y_train)

You can't use a confusion matrix either.


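First, evaluating a model on the dataset it was trained on is bad practice. Doing so mostly measures overfitting. Second, once that is fixed, this answer is just a copy of an existing upvoted answer. - chrslg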
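The concern in that comment can be illustrated with a short sketch that scores the same forest on its training split and on a held-out split. It assumes a recent scikit-learn (sklearn.model_selection) and synthetic data from make_regression. The names reg, train_r2, and test_r2 and the noise level are illustrative, and the size of the gap will vary with the data.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Noisy synthetic regression problem, used only for illustration.
X, y = make_regression(n_samples=200, n_features=10, noise=20.0, random_state=0)
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

reg = RandomForestRegressor(n_estimators=10, random_state=0)
reg.fit(x_train, y_train)

# R^2 on data the model has already seen versus data it has not.
train_r2 = reg.score(x_train, y_train)
test_r2 = reg.score(x_test, y_test)

print("R^2 on training split:", train_r2)
print("R^2 on held-out split:", test_r2)
```

The training score is typically much higher than the held-out score, so only the held-out score says something about how the model generalizes.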
