I think the code below is missing something.
import pandas as pd
from sklearn.model_selection import train_test_split
from imblearn.over_sampling import SMOTE
# Split into training and test sets
# Testing Count Vectorizer
X = df[['Spam']]   # raw text column, still strings at this point
y = df['Value']    # 0/1 labels
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=40)
# Oversample the minority class in the training set only
X_resampled, y_resampled = SMOTE().fit_resample(X_train, y_train)
sm = pd.concat([X_resampled, y_resampled], axis=1)
because I get the following error:
ValueError: could not convert string to float: ---> 19 X_resampled, y_resampled = SMOTE().fit_resample(X_train, y_train)
A sample of the data:
Spam                                      Value
Your microsoft account was compromised    1
Manchester United lost against PSG        0
I like cooking                            0
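I was thinking of transforming both the training and test sets to fix whatever is causing the error, but I don't know how to apply the transformation to both. I tried a few examples I found on Google, but they did not solve the problem.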