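When extracting features from text, how can I check whether a vectorizer (e.g. TfidfVectorizer or CountVectorizer) has already been fitted on the training data?
In particular, I want the code to work out automatically whether the vectorizer has already been fitted.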
from sklearn.feature_extraction.text import TfidfVectorizer

vectorizer = TfidfVectorizer()

def vectorize_data(texts):
    # Desired behavior (the fitted-state check is the part I don't know how to write):
    # if the vectorizer has not been fitted yet:
    #     return vectorizer.fit_transform(texts)
    # else:
    #     return vectorizer.transform(texts)
    ...
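
One possible way to fill in that check is sketched here. It assumes the fitted state can be detected through the `vocabulary_` attribute, which CountVectorizer and TfidfVectorizer set once they have learned a vocabulary. It uses scikit-learn's `check_is_fitted` and `NotFittedError`. The helper name `is_fitted` is hypothetical and not part of scikit-learn.

```python
from sklearn.exceptions import NotFittedError
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.utils.validation import check_is_fitted

vectorizer = TfidfVectorizer()


def is_fitted(vec):
    """Hypothetical helper: True if `vec` already has a learned vocabulary."""
    try:
        # Assumption: a fitted Count/Tfidf vectorizer exposes `vocabulary_`.
        check_is_fitted(vec, "vocabulary_")
    except NotFittedError:
        return False
    return True


def vectorize_data(texts):
    # First call: learn the vocabulary and IDF weights, then transform.
    if not is_fitted(vectorizer):
        return vectorizer.fit_transform(texts)
    # Later calls: reuse the fitted vocabulary so the columns stay aligned.
    return vectorizer.transform(texts)
```

A shorter variant of the same idea is `hasattr(vectorizer, "vocabulary_")`. Either way, the first call fits on whatever texts it receives, so the training data should be passed first.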