What is the HOG feature data layout in OpenCV?

Question

c++ opencv computer-vision histogram feature-extraction

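I'm working with OpenCV's CPU implementation of Histogram of Oriented Gradients (HOG). I'm using a 32x32-pixel image with 4x4-pixel cells, 4x4-pixel blocks, no overlap between blocks, and 15 orientation bins. OpenCV's HOGDescriptor gives me a 1D feature vector of length 960. That makes sense, because (32*32 pixels) * (15 orientations) / (4*4 pixels per cell) = 960.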
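As a sanity check on that arithmetic, here is a minimal sketch of how the length follows from the parameters. The helper name `hogDescriptorLength` is hypothetical (it is not an OpenCV function), and it assumes a single square window whose sizes divide evenly.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of OpenCV): expected HOG descriptor length
// for one square window. All arguments are side lengths in pixels, except
// nbins, which is the number of orientation bins per cell.
std::size_t hogDescriptorLength(int win, int block, int stride, int cell, int nbins)
{
    // Assumption: blocks tile the window exactly and cells tile each block exactly.
    assert(block % cell == 0);
    assert((win - block) % stride == 0);

    const int blocksPerSide = (win - block) / stride + 1; // block positions along one axis
    const int cellsPerBlockSide = block / cell;           // 1 when block == cell

    const std::size_t blocks =
        static_cast<std::size_t>(blocksPerSide) * blocksPerSide;
    const std::size_t cellsPerBlock =
        static_cast<std::size_t>(cellsPerBlockSide) * cellsPerBlockSide;

    return blocks * cellsPerBlock * static_cast<std::size_t>(nbins);
}
```

With the values from this question, `hogDescriptorLength(32, 4, 4, 4, 15)` works out to 8 * 8 blocks * 1 cell per block * 15 bins = 960.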

However, I'm not sure how these 960 numbers are laid out in memory. My guess is something like this:

vector<float> descriptorsValues =
[15 bins for cell 0, 0] 
[15 bins for cell 0, 1]
...
[15 bins for cell 0, 7]
....
[15 bins for cell 7, 0] 
[15 bins for cell 7, 1]
...
[15 bins for cell 7, 7]

Of course, this is a 2D problem flattened into 1D, so it would actually look like this:

[cell 0, 0] [cell 0, 1] ... [cell 7, 0] ... [cell 7, 7]

So, do I have the right idea for the data layout? Or is it something else?

Here's my example code for this problem:

#include <cstdio>
#include <vector>
#include <opencv2/core.hpp>
#include <opencv2/objdetect.hpp>

// 32x32 image, 4x4 blocks, 4x4 cells, 4x4 blockStride
std::vector<float> hogExample(cv::Mat img)
{
    img = img.rowRange(0, 32).colRange(0, 32); // trim image to 32x32
    bool gamma_corr = true;
    cv::Size win_size(img.cols, img.rows); // cv::Size is (width, height); using just one window
    int c = 4;
    cv::Size block_size(c, c);
    cv::Size block_stride(c, c); // no overlapping blocks
    cv::Size cell_size(c, c);
    int nOri = 15; // number of orientation bins

    cv::HOGDescriptor d(win_size, block_size, block_stride, cell_size, nOri, 1, -1,
                        cv::HOGDescriptor::L2Hys, 0.2, gamma_corr, cv::HOGDescriptor::DEFAULT_NLEVELS);

    std::vector<float> descriptorsValues;
    std::vector<cv::Point> locations;
    d.compute(img, descriptorsValues, cv::Size(0, 0), cv::Size(0, 0), locations);

    // size() returns size_t, so use %zu rather than %d
    std::printf("descriptorsValues.size() = %zu\n", descriptorsValues.size()); // prints 960
    return descriptorsValues;
}

Related resources: this StackOverflow post and this tutorial helped me get started with the OpenCV HOGDescriptor.

- solvingPuzzles

1 Answer

- herohuyongtao · Accepted Answer

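I believe you have the right idea.

The original paper, Histograms of Oriented Gradients for Human Detection (page 2), says:

[...] The detector window is tiled with a grid of overlapping blocks in which Histogram of Oriented Gradient feature vectors are extracted. [...]

[...] tiling the detection window with a dense (in fact, overlapping) grid of HOG descriptors and using the combined feature vector [...]

All it talks about is tiling them together. It gives no detail on exactly how they are tiled, though. My guess is that nothing fancy happens here (otherwise they would have mentioned it), i.e. the blocks are simply concatenated in the normal way (left to right, top to bottom).

After all, that is a reasonable and the simplest way to lay out the data.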

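Edit: You'll be more confident if you look at how people access and visualize the data. Note that this snippet puts x in the outer loop and y in the inner loop, so it walks the blocks column by column (top to bottom within a column, then on to the next column), not row by row as guessed above.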

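int descriptorDataIdx = 0; // running index into the flat descriptor vector

for (int blockx = 0; blockx < blocks_in_x_dir; blockx++)
{
    for (int blocky = 0; blocky < blocks_in_y_dir; blocky++)
    {
        // 4 cells per block assumes 2x2 cells per block; with block == cell
        // (as in the question) there is only 1 cell per block
        for (int cellNr = 0; cellNr < 4; cellNr++)
        {
            for (int bin = 0; bin < gradientBinSize; bin++)
            {
                float gradientStrength = descriptorValues[descriptorDataIdx];
                descriptorDataIdx++;

                // ... ...

            } // for (all bins)
        } // for (all cells)
    } // for (all block y pos)
} // for (all block x pos)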
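
The loop nest above implies a simple index formula. Here is a minimal sketch under that assumption (block x outermost, then block y, then cell within the block, then bin). `HogLayout` and its members are hypothetical names, not OpenCV API, and the ordering is worth checking against your own OpenCV version before relying on it.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not OpenCV API): flat indexing for the layout implied
// by the loop nest: block x outermost, then block y, then cell, then bin.
struct HogLayout
{
    int blocksX;       // number of block positions along x
    int blocksY;       // number of block positions along y
    int cellsPerBlock; // 4 for 2x2 cells per block, 1 when block == cell
    int nbins;         // orientation bins per cell

    // Total number of floats in the descriptor.
    std::size_t size() const
    {
        return static_cast<std::size_t>(blocksX) * blocksY * cellsPerBlock * nbins;
    }

    // Flat index of one histogram bin inside the descriptor vector.
    std::size_t index(int blockX, int blockY, int cell, int bin) const
    {
        assert(blockX >= 0 && blockX < blocksX);
        assert(blockY >= 0 && blockY < blocksY);
        assert(cell >= 0 && cell < cellsPerBlock);
        assert(bin >= 0 && bin < nbins);

        const std::size_t block = static_cast<std::size_t>(blockX) * blocksY + blockY;
        return (block * cellsPerBlock + cell) * nbins + bin;
    }
};
```

For the 32x32 setup in the question (8x8 blocks, one cell per block, 15 bins), `HogLayout{8, 8, 1, 15}.index(1, 0, 0, 0)` is 120: under this assumed ordering, the block at x=1, y=0 comes only after all eight blocks of column x=0.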