Applying HOG+SVM training to a webcam for object detection

I have trained my SVM classifier by extracting HOG features from positive and negative datasets.
from sklearn.svm import SVC
import cv2
import numpy as np

# Default HOG descriptor (64x128 detection window)
hog = cv2.HOGDescriptor()


def hoggify(x, z):
    # Load images file1.jpg .. file(z-1).jpg under prefix x and
    # return a list of 1D HOG descriptors, one per image
    data = []

    for i in range(1, int(z)):
        image = cv2.imread("/Users/munirmalik/cvprojek/cod/"+x+"/"+"file"+str(i)+".jpg", 0)  # read as grayscale
        dim = 128
        img = cv2.resize(image, (dim, dim), interpolation=cv2.INTER_AREA)
        img = hog.compute(img)
        img = np.squeeze(img)  # drop the singleton column so each descriptor is 1D
        data.append(img)

    return data

def svmClassify(features, labels):
    # Note: gamma has no effect with a linear kernel
    clf = SVC(C=10000, kernel="linear", gamma=0.000001)
    clf.fit(features, labels)

    return clf

def list_to_matrix(lst):
    # Stack the list of 1D descriptors into a 2D (n_samples, n_features) array
    return np.stack(lst)

I want to apply what I've learned so that the program can recognize my custom object (a chair).

I have already added labels to each set; what do I need to do next?


Are you using scikit-learn's Support Vector Classification module? This code doesn't run on its own, mainly because you haven't shown the modules you imported. - rayryeng
@rayryeng Sorry, I've included them in my edit. Do I need to use the SVM functions from OpenCV instead? - MM3
No, you don't have to use OpenCV. I only asked because it lets me write an answer :P - rayryeng
@rayryeng Haha, so how should I proceed? - MM3
I'm already writing an answer - give me a few minutes. - rayryeng
1 Answer

You already have the three most important pieces available at your disposal. hoggify creates a list of HOG descriptors - one for each image. Note that the expected input for computing the descriptor is a grayscale image, and the descriptor is returned as a 2D array with 1 column, which means that each element in the HOG descriptor has its own row. However, you are using np.squeeze to remove the singleton column and replace it with a 1D numpy array instead, so we're fine here. You would then use list_to_matrix to convert the list into a numpy array. Once you do this, you can use svmClassify to finally train your data. This assumes that you already have your labels stored in a 1D numpy array. After you train your SVM, you would use the SVC.predict method which, given input HOG features, classifies whether the image belongs to a chair or not.
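As a quick sanity check of those shapes, you can inspect a single image's descriptor before and after the squeeze. The path below is a hypothetical example (substitute one of your own training images), and the exact descriptor length depends on your HOGDescriptor parameters and OpenCV version:

sample = cv2.imread("/Users/munirmalik/cvprojek/cod/chair/file1.jpg", 0)  # hypothetical path, read as grayscale
sample = cv2.resize(sample, (128, 128), interpolation=cv2.INTER_AREA)
desc = hog.compute(sample)
print(desc.shape)              # (N, 1) on older OpenCV builds; newer builds may already return (N,)
print(np.squeeze(desc).shape)  # (N,) - the 1D vector that hoggify appends to its list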
Therefore, you will need to do the following steps:
  1. Use hoggify to create your list of HOG descriptors, one per image. It looks like the input x is a prefix to whatever you called your chair images as, while z denotes the total number of images you want to load in. Remember that range is exclusive of the ending value, so you may want to add a + 1 after int(z) (i.e. int(z) + 1) to ensure that you include the end. I'm not sure if this is the case, but I wanted to throw it out there.

    x = '...' # Whatever prefix you called your chairs
    z = 100 # Load in 100 images for example
    lst = hoggify(x, z)
    
  2. Convert the list of HOG descriptors into an actual matrix:

    data = list_to_matrix(lst)
    
  3. Train your SVM classifier. Assuming you already have your labels stored in labels where a value 0 denotes not a chair and 1 denotes a chair and it is a 1D numpy array:

    labels = ... # Define labels here as a numpy array
    clf = svmClassify(data, labels)
    
  4. Use your SVM classifer to perform predictions. Assuming you have a test image you want to test with your classifier, you will need to do the same processing steps like you did with your training data. I'm assuming that's what hoggify does where you can specify a different x to denote different sets to use. Specify a new variable xtest to specify this different directory or prefix, as well as the number of images you need, then use hoggify combined with list_to_matrix to get your features:

    xtest = '...' # Define new test prefix here
    ztest = 50 # 50 test images
    lst_test = hoggify(xtest, ztest)
    test_data = list_to_matrix(lst_test)
    pred = clf.predict(test_data)
    

    pred will contain an array of predicted labels, one for each test image that you have. If you want, you can see how well your SVM did with the training data; since you already have this at your disposal, just use data again from step #2:

    pred_training = clf.predict(data)
    

    pred_training will contain an array of predicted labels, one for each training image.
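Putting the steps above together, a minimal end-to-end sketch might look like the following. The prefixes, image counts, and the way labels is built are illustrative assumptions - here the positive (chair) and negative (not chair) images are assumed to live under two different prefixes and are loaded separately, so the label array is simply a block of ones followed by a block of zeros in the same order as the rows of data:

pos_lst = hoggify('chair', 100)      # hypothetical prefix; loads file1..file99 (range is exclusive)
neg_lst = hoggify('notchair', 100)   # hypothetical prefix for the negative examples

data = list_to_matrix(pos_lst + neg_lst)

# 1 = chair, 0 = not a chair, matching the order the descriptors were stacked in
labels = np.concatenate([np.ones(len(pos_lst)), np.zeros(len(neg_lst))])

clf = svmClassify(data, labels)

# Optional: check how well the SVM fits its own training data (step #4 above)
pred_training = clf.predict(data)
print("Training accuracy:", np.mean(pred_training == labels))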


If you eventually want to use this with your webcam, the process would be to use a VideoCapture object and specify the ID of the device that is connected to your computer. Usually there's only one webcam connected to your computer, so use an ID of 0. Once you do this, the process would be to use a loop: grab a frame, convert it to grayscale (as HOG descriptors require a grayscale image), compute the descriptor, then classify the image.
Something like the following would work, assuming that you've already trained your model and you've created a HOG descriptor object from before:
cap = cv2.VideoCapture(0)
dim = 128 # For HOG

while True:
    # Capture the frame
    ret, frame = cap.read()

    # Show the image on the screen
    cv2.imshow('Webcam', frame)

    # Convert the image to grayscale
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Convert the image into a HOG descriptor
    gray = cv2.resize(gray, (dim, dim), interpolation = cv2.INTER_AREA)
    features = hog.compute(gray)
    features = features.reshape(1, -1) # Reshape so that the feature is in a single row (works whether compute returns a (N, 1) or (N,) array)

    # Predict the label
    pred = clf.predict(features)

    # Show the label on the screen
    print("The label of the image is: " + str(pred))

    # Pause for 25 ms and keep going until you push q on the keyboard
    if cv2.waitKey(25) == ord('q'):
        break

cap.release() # Release the camera resource
cv2.destroyAllWindows() # Close the image window

The above process reads in an image, displays it on the screen, converts the image into grayscale so that we can compute its HOG descriptor, ensures that the data is in a single row compatible with the SVM you trained, and then predicts its label. We print this to the screen, and we wait 25 ms before reading the next frame so that we don't overload the CPU. In addition, you can quit the program at any time by pushing the q key on your keyboard; otherwise, this program will loop forever. Once we're done, we release the camera resource so that it can be made available to other processes.
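If you would rather see the prediction on the video window itself instead of in the terminal, one small variation (not part of the original code above) is to draw the label onto the frame with cv2.putText. This requires computing the prediction before calling cv2.imshow, so you would move the imshow line to the end of the loop body:

    label_text = 'chair' if pred[0] == 1 else 'not a chair'  # assumes labels were 1 = chair, 0 = not a chair
    cv2.putText(frame, label_text, (10, 30), cv2.FONT_HERSHEY_SIMPLEX,
                1, (0, 255, 0), 2)
    cv2.imshow('Webcam', frame)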

Thank you! But what if I want to test it via my webcam? i.e. detect the chair through my live webcam video stream? - MM3
Not a problem at all. It's a fun exercise. Good luck! - rayryeng
