Classification metrics can't handle a mix of continuous-multioutput and multilabel-indicator targets

Question

21

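I have created an ANN with numerical inputs and a single categorical output that is one-hot encoded into one of 19 categories. I set my output layer to have 19 units. I don't know how to build the confusion matrix in this case, or how to use classifier.predict() when the output is not a single binary value. I keep getting an error saying that classification metrics can't handle a mix of continuous-multioutput and multilabel-indicator targets. Not sure how to proceed.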

#Importing Datasets
import pandas as pd
dataset=pd.read_csv('Data.csv')
x = dataset.iloc[:,1:36].values # lower bound independent variable to upper bound in a matrix (in this case only 1 column 'NC')
y = dataset.iloc[:,36:].values # dependent variable vector
print(x.shape)
print(y.shape)

#One Hot Encoding fuel rail column
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelencoder_y= LabelEncoder()
y[:,0]=labelencoder_y.fit_transform(y[:,0])
onehotencoder= OneHotEncoder(categorical_features=[0])
y = onehotencoder.fit_transform(y).toarray()
print(y[:,0:])

print(x.shape)
print (y.shape)


#splitting data into Training and Test Data
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x,y,test_size=0.1,random_state=0)

#Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
#x_train = sc.fit_transform(x_train)
#x_test=sc.transform(x_test)
y_train = sc.fit_transform(y_train)
y_test=sc.transform(y_test)

# PART2 - Making ANN, deep neural network

#Importing the Keras libraries and packages
import keras
from keras.models import Sequential
from keras.layers import Dense


#Initialising ANN
classifier = Sequential()
#Adding the input layer and first hidden layer
classifier.add(Dense(activation= 'relu', input_dim =35, units=2, kernel_initializer="uniform"))#rectifier activation function, include all input with one hot encoding
#Adding second hidden layer
classifier.add(Dense(activation= 'relu', units=2, kernel_initializer="uniform")) #rectifier activation function
#Adding the Output Layer
classifier.add(Dense(activation='softmax', units=19, kernel_initializer="uniform")) 
#Compiling ANN - stochastic gradient descent
classifier.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])#stochastic gradient descent

#Fit ANN to training set

#PART 3 - Making predictions and evaluating the model
#Fitting classifier to the training set
classifier.fit(x_train, y_train, batch_size=10, epochs=100)#original batch is 10 and epoch is 100

#Predicting the Test set results
y_pred = classifier.predict(x_test)
y_pred = (y_pred > 0.5) #greater than 0.50 on scale 0 to 1
print(y_pred)

#Making confusion matrix that checks accuracy of the model
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)

- user8512104

1

Keras needs a one-hot encoded y for multi-class classification. - Jakub Bartczuk

Thanks. As far as I can tell I have already one-hot encoded this variable, so maybe that is the problem. How should I structure my output layer? - user8512104

Did you use softmax in the last layer? - Jakub Bartczuk

Yes, it doesn't like outputting 19 potential options instead of a simple 1 or 0. I have set y_pred=classifier.predict(x_test) and then y_pred=(y_pred>0.5). - user8512104

What loss function are you using in model.compile? By the way, it's hard to say anything without seeing the code. - Jakub Bartczuk

2 Answers

12

To summarize: with this code you should be able to get the matrix.

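import numpy as np
from sklearn.metrics import confusion_matrix

y_pred = model.predict(X_test)         # class probabilities, shape (n_samples, n_classes)
y_pred = np.argmax(y_pred, axis=1)     # predicted class index for each sample
y_test = np.argmax(y_test, axis=1)     # one-hot targets back to class indices
cm = confusion_matrix(y_test, y_pred)
print(cm)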
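
The snippet above relies on an already trained model and a one-hot encoded y_test. As a minimal self-contained sketch of the same steps, the example below uses made-up data for 3 classes instead of 19. The arrays `probs` and `y_true_onehot` are hypothetical stand-ins for the network output and the test targets, not the asker's data.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical stand-in for classifier.predict(x_test): one row of class
# probabilities per sample (3 classes here instead of 19, for brevity).
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
    [0.6, 0.3, 0.1],
])

# Hypothetical one-hot test targets for the same four samples.
y_true_onehot = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [0, 1, 0],
])

# Collapse both matrices to 1-D vectors of class indices.
y_pred_labels = np.argmax(probs, axis=1)
y_true_labels = np.argmax(y_true_onehot, axis=1)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true_labels, y_pred_labels, labels=[0, 1, 2])
print(cm)
```

Passing `labels` explicitly keeps the matrix at a fixed size even if some class never shows up in a small test split.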

- ericheindl

Content provided by Stack Overflow.

- Jakub Bartczuk · Accepted Answer

40

y_pred = (y_pred > 0.5)

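outputs a boolean matrix. The problem is that it has the same shape as before, whereas evaluating accuracy requires a vector of labels.

To get one, use np.argmax(y_pred, axis=1), which outputs the correct labels.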
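
To make the shape difference concrete, here is a small sketch with invented probabilities for 4 classes. The array `probs` is an assumption for illustration, standing in for the output of `classifier.predict`.

```python
import numpy as np

# Invented softmax outputs: 3 samples, 4 classes (stand-in for classifier.predict).
probs = np.array([
    [0.90, 0.05, 0.03, 0.02],
    [0.10, 0.15, 0.70, 0.05],
    [0.05, 0.60, 0.25, 0.10],
])

# Thresholding keeps the 2-D shape: one boolean per sample and class.
thresholded = probs > 0.5
print(thresholded.shape)

# argmax over the class axis returns one label index per sample.
labels = np.argmax(probs, axis=1)
print(labels.shape, labels)
```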

- Jakub Bartczuk

Thanks, where do I write this line? Do I now get rid of: y_pred = (y_pred > 0.5) - user8512104

Yes, you can do that. - Jakub Bartczuk

2

So do I write it like this: y_pred=classifier.predict(x_test), then on the next line y_pred=np.argmax(y_pred, axis=1), and finally build the confusion matrix? - user8512104

1

I also had to change y_test, and now it works!! Thank you - user8512104

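With more than 2 classes, the condition y_pred > 0.5 does not always leave a sample with a class predicted as 1. sklearn therefore assumes you are doing multilabel classification, which it cannot readily mix with multioutput. - Anatoly Alekseev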
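
A small sketch of this point with invented numbers: when no class probability exceeds 0.5, thresholding leaves that row without any predicted class, while `argmax` still picks one. `type_of_target` from `sklearn.utils.multiclass` reports how scikit-learn categorizes each array. The `scaled_targets` array is an assumption meant to mimic what the `StandardScaler` call in the question does to the one-hot targets. The two reported types appear to match the pair named in the error message.

```python
import numpy as np
from sklearn.utils.multiclass import type_of_target

# Invented softmax outputs for 3 classes; the second row has no value above 0.5.
probs = np.array([
    [0.80, 0.10, 0.10],
    [0.30, 0.45, 0.25],
    [0.10, 0.20, 0.70],
])

thresholded = probs > 0.5
labels = np.argmax(probs, axis=1)
print(thresholded.sum(axis=1))   # number of classes flagged per sample
print(labels)                    # argmax always returns exactly one class

# A 2-D boolean matrix is categorized as a multilabel indicator.
print(type_of_target(thresholded))

# Assumed example of one-hot targets after standard scaling: a 2-D array of
# non-integer floats is categorized as continuous multioutput.
scaled_targets = np.array([
    [1.2, -0.7, -0.5],
    [-0.8, 1.4, -0.5],
    [-0.8, -0.7, 2.0],
])
print(type_of_target(scaled_targets))

# A 1-D vector of class indices is the form confusion_matrix can work with.
print(type_of_target(labels))
```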