Export feature importances from a random forest to a CSV file

Question

python csv pandas scikit-learn random-forest

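Hello, I want to create a .csv file with two columns: the feature importances of a random forest model and the name of each feature. I also want to be sure that each numeric value is matched to the correct variable name.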

Here is an example, but I cannot export it to a .csv file correctly.

test_features = test[["area","product", etc.]].values

# Create the target 
target = test["churn"].values

pred_forest = my_forest.predict(test_features)

# Print the score of the fitted random forest
print(my_forest.score(test_features, target))


importance = my_forest.feature_importances_


pd.DataFrame({"IMP": importance, "features":test_features }).to_csv('forest_0407.csv',index=False)

- progster

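How does this fail? It looks suspicious to me: you are trying to match the feature importances against the feature df itself, which is not correct, because the feature importances correspond to the columns. - EdChum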

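I am confused because when I print "importance" I only see an array, and I am not sure which feature each value corresponds to, so I want to check the names against the values. The error message is: Exception: Data must be 1-dimensional. - progster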

@shivsn The lazy typist's version is list(df), which returns the columns as a list. - EdChum

@EdChum Nice, I didn't know that one, thank you. - shivsn

I think what you want is something like feat_imp = pd.Series(importance, index=df.columns). - EdChum
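
The Series approach from the comment above could look roughly like this. It is only a sketch: the toy data, the column names, and `out_path` are hypothetical stand-ins for the asker's `test` frame and output file, and it assumes the model was fitted on the columns of `df` in their original order.

```python
import os
import tempfile

import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Hypothetical toy data standing in for the asker's `test` DataFrame.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "area": rng.normal(size=200),
    "product": rng.normal(size=200),
    "tenure": rng.normal(size=200),
})
target = (df["area"] + 0.1 * rng.normal(size=200) > 0).astype(int)

my_forest = RandomForestClassifier(n_estimators=50, random_state=0)
my_forest.fit(df.values, target.values)

# feature_importances_ is 1-D with one value per column, whereas df.values is
# 2-D (rows x columns). Indexing by df.columns pairs each value with its name.
importance = my_forest.feature_importances_
feat_imp = pd.Series(importance, index=df.columns, name="IMP")

# Written to a temporary directory here; any writable path works.
out_path = os.path.join(tempfile.mkdtemp(), "forest_importances_series.csv")

# The Series index becomes the first CSV column, labelled "features".
feat_imp.to_csv(out_path, index_label="features")
```

Passing the 2-D feature array as a DataFrame column, as in the question, is what triggers the "Data must be 1-dimensional" error. The column names are the 1-D object that lines up with the importances.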

1 Answer

- Abhishek Sharma · Accepted Answer

Use the following:

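import pandas
# feature_names (placeholder name): the list of features you are using, in the order used to fit my_forest
x = list(zip(my_forest.feature_importances_, feature_names))
x = pandas.DataFrame(x, columns=["Importance", "Feature_Name"])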
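
For completeness, the answer's approach can be run end to end roughly as follows. This is a sketch under assumptions: the synthetic data from `make_classification`, the names `feature_names`, `imp_df`, and `out_path`, and the sorting step are illustrative additions, not part of the original answer.

```python
import os
import tempfile

import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data; the feature names are hypothetical.
feature_names = ["area", "product", "tenure", "price"]
X, y = make_classification(n_samples=300, n_features=4, n_informative=2,
                           n_redundant=0, random_state=0)
train = pd.DataFrame(X, columns=feature_names)

my_forest = RandomForestClassifier(n_estimators=50, random_state=0)
my_forest.fit(train[feature_names].values, y)

# Importances come back in the same order as the training columns,
# so zipping with the same name list keeps values and names aligned.
rows = list(zip(my_forest.feature_importances_, feature_names))
imp_df = pd.DataFrame(rows, columns=["Importance", "Feature_Name"])
imp_df = imp_df.sort_values("Importance", ascending=False)

# Written to a temporary directory here; any writable path works.
out_path = os.path.join(tempfile.mkdtemp(), "forest_importances.csv")
imp_df.to_csv(out_path, index=False)

# Read the file back to check that both columns were exported.
print(pd.read_csv(out_path))
```

The key point is that the name list passed to `zip` must be the same list, in the same order, that was used to select the training columns. Otherwise the importances will be paired with the wrong names.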