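You already have the three most important tools at your disposal. hoggify creates a list of HOG descriptors, one per image. Note that the expected input for computing the descriptor is a grayscale image, and that the descriptor is returned as a 2D array with 1 column, so each element of the HOG descriptor has its own row. However, you are using np.squeeze to remove the singleton column and replace it with a 1D numpy array, so we are fine here. You would then use list_to_matrix to convert the list into a numpy array. Once you do this, you can use svmClassify to finally train on your data. This assumes you already have your labels stored in a 1D numpy array. After you train your SVM, you would use the SVC.predict method, which, given input HOG features, classifies whether the image belongs to a chair or not.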
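To make the shapes concrete, here is a minimal sketch using made-up numbers in place of real HOG output. The stack_descriptors helper is a hypothetical stand-in for list_to_matrix, whose actual implementation is not shown here; the feature count of 8 is also arbitrary.

```python
import numpy as np

# Stand-in for a HOG descriptor returned as a 2D column: shape (n_features, 1).
n_features = 8  # arbitrary size, for illustration only
column_descriptor = np.arange(n_features, dtype=np.float32).reshape(-1, 1)

# np.squeeze drops the singleton axis, leaving a 1D vector of shape (n_features,).
flat_descriptor = np.squeeze(column_descriptor)


def stack_descriptors(descriptors):
    """Hypothetical stand-in for list_to_matrix: one row per image."""
    return np.vstack(descriptors)


# Three fake "images", each with its own 1D descriptor.
descriptors = [flat_descriptor, flat_descriptor + 1, flat_descriptor + 2]
data = stack_descriptors(descriptors)

print(column_descriptor.shape, flat_descriptor.shape, data.shape)
```

The resulting matrix has one row per image and one column per HOG feature, which is the layout an SVM's training and prediction methods expect.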
Therefore, the steps you need to do are:
1. Use hoggify to create your list of HOG descriptors, one per image. It looks like the input x is a prefix to whatever you named your chair images, while z denotes the total number of images you want to load. Remember that range is exclusive of the ending value, so you may want to add a + 1 after int(z) (i.e. int(z) + 1) to ensure that you include the end. I'm not sure if this is the case, but I wanted to throw it out there.
x = '...'
z = 100
lst = hoggify(x, z)
2. Convert the list of HOG descriptors into an actual matrix:
data = list_to_matrix(lst)
3. Train your SVM classifier. This assumes your labels are already stored in labels as a 1D numpy array, where a value of 0 denotes not a chair and 1 denotes a chair:
labels = ...
clf = svmClassify(data, labels)
4. Use your SVM classifier to perform predictions. If you have test images you want to run through your classifier, you need to apply the same processing steps that you applied to your training data. I'm assuming that's what hoggify does, where you can specify a different x to select a different set of images. Define a new variable xtest for this different directory or prefix, along with the number of images you need, then use hoggify combined with list_to_matrix to get your features:
xtest = '...'
ztest = 50
lst_test = hoggify(xtest, ztest)
test_data = list_to_matrix(lst_test)
pred = clf.predict(test_data)
pred will contain an array of predicted labels, one for each test image. If you want, you can also see how well your SVM did on the training data. Since you already have it at your disposal, just reuse the data matrix built in step 2:
pred_training = clf.predict(data)
pred_training will contain an array of predicted labels, one for each training image.
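Once you have predicted labels, comparing them against the true labels gives a quick measure of how well the classifier did. The sketch below uses made-up arrays in place of the real labels and the output of clf.predict(data), so the numbers themselves mean nothing; only the comparison is the point.

```python
import numpy as np

# Made-up values standing in for the real labels and for clf.predict(data).
labels = np.array([0, 0, 1, 1, 1, 0, 1, 0])
pred_training = np.array([0, 1, 1, 1, 0, 0, 1, 0])

# Fraction of images whose predicted label matches the true label.
accuracy = np.mean(pred_training == labels)

# Chairs predicted as not-chairs, and not-chairs predicted as chairs.
missed_chairs = int(np.sum((labels == 1) & (pred_training == 0)))
false_chairs = int(np.sum((labels == 0) & (pred_training == 1)))

print("accuracy:", accuracy)
print("missed chairs:", missed_chairs, "false chairs:", false_chairs)
```

Keep in mind that accuracy on the training data tends to be optimistic. The same comparison applied to pred, using labels for test images the classifier never saw, is a more honest estimate.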
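If you ultimately want to use this with a webcam, the process is to create a VideoCapture object and specify the ID of the device connected to your computer. There is usually only one webcam connected, so use ID 0. Once you do this, you loop: grab a frame, convert it to grayscale because HOG descriptors require a grayscale image, compute the descriptor, then classify the image.
Assuming you have already trained your model and created a HOG descriptor object beforehand, something like the following would work: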
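import cv2

# Assumes hog (the HOG descriptor object) and clf (the trained SVM) already exist
cap = cv2.VideoCapture(0)
dim = 128  # side length each frame is resized to
while True:
    # Grab the next frame; stop if the camera returned nothing
    ret, frame = cap.read()
    if not ret:
        break
    cv2.imshow('Webcam', frame)
    # HOG descriptors require a grayscale image
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    gray = cv2.resize(gray, (dim, dim), interpolation=cv2.INTER_AREA)
    # Compute the descriptor and shape it as a single row for the SVM
    features = hog.compute(gray)
    features = features.reshape(1, -1)
    pred = clf.predict(features)
    print("The label of the image is: " + str(pred))
    # Wait 25 ms between frames; quit when q is pressed
    if cv2.waitKey(25) == ord('q'):
        break
# Free the camera and close the preview window
cap.release()
cv2.destroyAllWindows()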
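The above process reads in an image, displays it on the screen, converts it to grayscale so that we can compute its HOG descriptor, ensures the data is a single row compatible with the SVM, then predicts its label. We print the label to the screen and wait 25 ms before reading the next frame so that we don't overload the CPU. You can also quit the program at any time by pressing the q key on your keyboard. Otherwise, the program loops forever. Once we finish, we release the camera resource so that it is available to other processes.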
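The classifier needs each frame's descriptor as a single row. A transpose produces that only when the descriptor is a 2D column; as far as I can tell, some OpenCV versions return a flat vector from hog.compute instead, in which case a transpose changes nothing. The sketch below, using zero-filled arrays as stand-ins for real descriptors and a hypothetical helper named as_single_sample, shows that reshape(1, -1) handles both cases.

```python
import numpy as np


def as_single_sample(features):
    """Return features as a 2D array with exactly one row (hypothetical helper)."""
    return np.asarray(features).reshape(1, -1)


# Stand-ins for the two descriptor layouts; 36 is an arbitrary feature count.
column = np.zeros((36, 1), dtype=np.float32)  # 2D column
flat = np.zeros(36, dtype=np.float32)         # flat 1D vector

print(as_single_sample(column).shape, as_single_sample(flat).shape)

# A transpose leaves a 1D array unchanged, so it is not enough for the flat case.
print(flat.T.shape)
```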