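I'd like to know how to save a OneVsRest classifier model so that I can use it for predictions later.
I'm having trouble saving it because the vectorizer needs to be saved as well. I have already gone through this post.
Here is the model I created: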
from sklearn.feature_extraction.text import TfidfVectorizer
vectorizer = TfidfVectorizer(strip_accents='unicode', analyzer='word', ngram_range=(1,3), norm='l2')
# Fit on the training text only; a second fit on test_text would replace the training vocabulary
vectorizer.fit(train_text)
x_train = vectorizer.transform(train_text)
y_train = train.drop(labels = ['id','comment_text'], axis=1)
x_test = vectorizer.transform(test_text)
y_test = test.drop(labels = ['id','comment_text'], axis=1)
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.metrics import accuracy_score
from sklearn.multiclass import OneVsRestClassifier
%%time
# Using pipeline for applying logistic regression and one vs rest classifier
LogReg_pipeline = Pipeline([
    ('clf', OneVsRestClassifier(LogisticRegression(solver='sag'), n_jobs=-1)),
])
for category in categories:
    printmd('**Processing {} comments...**'.format(category))
    # Training logistic regression model on train data
    LogReg_pipeline.fit(x_train, train[category])
    # calculating test accuracy
    prediction = LogReg_pipeline.predict(x_test)
    print('Test accuracy is {}'.format(accuracy_score(test[category], prediction)))
    print("\n")
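To make the goal concrete, here is a minimal sketch of the kind of save-and-reload round trip I am after. It assumes joblib is installed (it ships as a scikit-learn dependency) and that the vectorizer can be moved inside the pipeline so both are stored in one object. The toy texts, labels, the `tfidf` step name, and the file name `toxic_clf.joblib` are made up for illustration and are not part of my real data or code.

```python
import os
import tempfile

import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import Pipeline

# Toy stand-ins (assumed) for train_text and one label column such as train[category].
toy_text = ["you are awful", "have a nice day", "awful awful person", "nice work, thanks"]
toy_labels = [1, 0, 1, 0]

# With the vectorizer as the first pipeline step, a single object holds both
# the fitted vocabulary and the fitted classifier.
text_clf = Pipeline([
    ("tfidf", TfidfVectorizer(strip_accents="unicode", analyzer="word",
                              ngram_range=(1, 3), norm="l2")),
    ("clf", OneVsRestClassifier(LogisticRegression())),
])
text_clf.fit(toy_text, toy_labels)

# Save to a hypothetical file name, then load it back the way a later session would.
model_path = os.path.join(tempfile.mkdtemp(), "toxic_clf.joblib")
joblib.dump(text_clf, model_path)
restored = joblib.load(model_path)

# The restored pipeline takes raw text directly; no separate vectorizer is needed.
new_comments = ["what an awful thing to say"]
original_pred = text_clf.predict(new_comments)
restored_pred = restored.predict(new_comments)
```

Since the loop above refits the same pipeline for every category, only the last fit would survive. I assume each category would need its own fitted pipeline and its own file, for example by calling `joblib.dump` inside the loop with the category name in the path.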
I would really appreciate your help.
Sincerely,