如何使用Julia训练支持向量机(SVM)?

5

有人在 Julia (1.4.1) 中训练过支持向量机(SVM)吗?

我尝试了 LIBSVM 接口,但是 GitHub 页面上的示例出现了错误:

# Load Fisher's classic iris data
iris = dataset("datasets", "iris")
# LIBSVM handles multi-class data automatically using a one-against-one strategy
labels = convert(Vector, iris[:Species])
# First dimension of input data is features; second is instances
instances = convert(Array, iris[:, 1:4])'
# Train SVM on half of the data using default parameters. See documentation
# of svmtrain for options
model = svmtrain(instances[:, 1:2:end], labels[1:2:end]);```

ERROR: MethodError: no method matching LIBSVM.SupportVectors(::Int32, ::Array{Int32,1}, ::CategoricalArray{String,1,UInt8,String,CategoricalValue{String,UInt8},Union{}}, ::Array{Float64,2}, ::Array{Int32,1}, ::Array{LIBSVM.SVMNode,1})
Closest candidates are:
LIBSVM.SupportVectors(::Int32, ::Array{Int32,1}, ::Array{T,1}, ::AbstractArray{U,2}, ::Array{Int32,1}, ::Array{LIBSVM.SVMNode,1}) where {T, U} at /home/benny/.julia/packages/LIBSVM/5Z99T/src/LIBSVM.jl:18
LIBSVM.SupportVectors(::LIBSVM.SVMModel, ::Any, ::Any) at /home/benny/.julia/packages/LIBSVM/5Z99T/src/LIBSVM.jl:27 
2个回答

6

看起来 LIBSVM.jl 的文档已经过时了,该软件包没有得到适当的更新,因此值得提出问题(或者至少提交一个拉请求以更新 README)。

您看到的错误与软件包本身无关,而是由于在当前版本的 DataFrames.jlRDatasets.jl 中,labels 列不再是 Vector (如 LIBSVM.jl 开发时所示),而是 CategoricalArray。您可以通过将CategoricalArray转换为普通的 Vector {String} 来避免此问题。完整的例子如下:

using RDatasets, LIBSVM
using StatsBase, Printf # `mean` and `printf` are no longer in Base, and should be used explicitly

# Load Fisher's classic iris data
iris = dataset("datasets", "iris")

# LIBSVM handles multi-class data automatically using a one-against-one strategy
labels = string.(convert(Vector, iris[:Species]))

# First dimension of input data is features; second is instances
instances = convert(Array, iris[:, 1:4])'

# Train SVM on half of the data using default parameters. See documentation
# of svmtrain for options
model = svmtrain(instances[:, 1:2:end], labels[1:2:end]);

# Test model on the other half of the data.
(predicted_labels, decision_values) = svmpredict(model, instances[:, 2:2:end]);

# Compute accuracy
@printf "Accuracy: %.2f%%\n" mean((predicted_labels .== labels[2:2:end]))*100

另外,您可以使用MLJ.jlScikitLearn.jl自行正确包装LIBSVM.jl。


3

Oskin的回答适用于旧版本。

在当前版本中,应进行修改为,

using RDatasets, LIBSVM
using StatsBase, Printf # `mean` and `printf` are no longer in Base, and should be used explicitly

# Load Fisher's classic iris data
iris = dataset("datasets", "iris")

# LIBSVM handles multi-class data automatically using a one-against-one strategy
labels = string.(convert(Vector, iris[:,:Species]))

# First dimension of input data is features; second is instances
instances = Matrix(iris[:, 1:4])'

# Train SVM on half of the data using default parameters. See documentation
# of svmtrain for options
model = svmtrain(instances[:, 1:2:end], labels[1:2:end]);

# Test model on the other half of the data.
(predicted_labels, decision_values) = svmpredict(model, instances[:, 2:2:end]);

# Compute accuracy
@printf "Accuracy: %.2f%%\n" mean((predicted_labels .== labels[2:2:end]))*100


1
这是一个很大的改进,应该被接受为答案。 - Andrej Oskin

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接