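Which OpenCV function can be used to compute a BEV (bird's-eye view) perspective transform, given a point's coordinates and the camera's extrinsics/intrinsics?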

Question

python opencv computer-vision camera-calibration homography

4

I obtained the camera's 3x3 intrinsic matrix and 3x4 extrinsic matrix from cv2.calibrateCamera().

Now I want to use these parameters to compute the BEV (Bird Eye View) transform of given coordinates in the camera frame.

Which OpenCV function can compute the BEV perspective transform for given point coordinates and the camera's 3x3 intrinsic and/or extrinsic matrices?

I found related material in this post: https://deepnote.com/article/social-distancing-detector/, which is based on https://www.pyimagesearch.com/2014/08/25/4-point-opencv-getperspective-transform-example/

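They use cv2.getPerspectiveTransform() to get a 3x3 matrix, but I do not know whether that matrix represents the camera's intrinsics, extrinsics, or something else. They then use the matrix to transform a list of points as follows: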

#Assuming list_downoids is the list of points to be transformed and matrix is the one obtained above
list_points_to_detect = np.float32(list_downoids).reshape(-1, 1, 2)
transformed_points = cv2.perspectiveTransform(list_points_to_detect, matrix)

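I really need to know whether I can use cv2.perspectiveTransform to compute the transform, or whether there is a better way that uses the extrinsics, the intrinsics, or both, without going back to the frames, since I already have the detected/selected coordinates saved in an array.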

- Maf

3 Answers

2

After digging into this, I found a good solution:

The projection matrix is the product of the intrinsic and extrinsic camera matrices (intrinsics times extrinsics).

https://medium.com/analytics-vidhya/using-homography-for-pose-estimation-in-opencv-a7215f260fdd
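Since the extrinsic matrix is 3x4 and the intrinsic matrix is 3x3, but we need the projection matrix to be 3x3, we have to reduce the extrinsics to a 3x3 matrix before multiplying.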
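The reduction to 3x3 can be written out as follows. This is a sketch that assumes every point of interest lies on the world ground plane $Z = 0$; the symbols $K$, $r_1, r_2, r_3$, $t$ and $H$ are notation introduced here for the intrinsic matrix, the rotation columns, the translation vector and the resulting homography.

```latex
s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
  = K \begin{bmatrix} r_1 & r_2 & r_3 & t \end{bmatrix}
    \begin{bmatrix} X \\ Y \\ 0 \\ 1 \end{bmatrix}
  = K \begin{bmatrix} r_1 & r_2 & t \end{bmatrix}
    \begin{bmatrix} X \\ Y \\ 1 \end{bmatrix},
\qquad
H = K \begin{bmatrix} r_1 & r_2 & t \end{bmatrix}
```

Under that assumption $H$ maps ground-plane coordinates $(X, Y)$ to pixels $(u, v)$, so going from pixels to the ground plane uses $H^{-1}$.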

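When we do not have the camera parameters, cv2.getPerspectiveTransform() gives us an equivalent 3x3 projection matrix, estimated from four point correspondences: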

https://towardsdatascience.com/a-hands-on-application-of-homography-ipm-18d9e47c152f

cv2.warpPerspective() applies a perspective transform to an image.

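For the problem above we need neither of these two functions, because we already have the extrinsics, the intrinsics, and the coordinates of the points in the image.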

Considering the above, I wrote a function that converts a point point_x_y to BEV given the intrinsics and extrinsics:

    import cv2
    import numpy as np

    def compute_point_perspective_transformation(intrinsics, extrinsics, point_x_y):
        """Auxiliary function to project a specific image point to BEV

        Assumes the point lies on the world ground plane (Z = 0).

        Parameters
        ----------
        intrinsics (array)     : The 3x3 camera intrinsics matrix
        extrinsics (array)     : The 3x4 camera extrinsics matrix [R | t]
        point_x_y (tuple[x,y]) : The pixel coordinates of the point to be projected to BEV

        Returns
        ----------
        tuple[x,y] : the ground-plane coordinates of the point
        """
        # The intrinsics hold parameters such as the focal length and the principal point
        intrinsics_matrix = np.array(intrinsics, dtype=np.float64)

        # The extrinsics store the camera pose in global space:
        # the first 3 columns are the rotation matrix and the last is the translation vector
        extrinsics_matrix = np.array(extrinsics, dtype=np.float64)

        # Drop the 3rd rotation column: it multiplies the z coordinate, which is 0 on the ground
        extrinsics_matrix = extrinsics_matrix[:, [0, 1, 3]]

        # 3x3 homography that maps ground-plane coordinates (X, Y, 1) to image pixels
        ground_to_image = np.matmul(intrinsics_matrix, extrinsics_matrix)
        # Going from image pixels to the ground plane needs the inverse mapping
        image_to_ground = np.linalg.inv(ground_to_image)

        # cv2.perspectiveTransform expects an array of shape (N, 1, 2)
        list_points_to_detect = np.array([[point_x_y]], dtype=np.float64)
        transformed_points = cv2.perspectiveTransform(list_points_to_detect, image_to_ground)
        return transformed_points[0][0][0], transformed_points[0][0][1]

- Maf

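Could you explain why this is the answer to your question (the part after "Considering the above")? I see that you return part of the perspective transform matrix, but how does that answer the question of "computing the BEV perspective transform of given point coordinates"? - KansaiRobot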

1

If you have a camera model, a ready-made but incomplete solution is to use the getTopViewOfImage function from the CameraTransform library.

See here for details on the function.

- TripleS

- Joseph Budin · Accepted Answer

The answer is: it is impossible to compute a bird's-eye view of a scene if you have no distance information for the image pixels.

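Think about it: imagine you take a picture of a vertical screen. The bird's-eye view would be a line. Now suppose the screen is showing a picture of a landscape, and the photo of the screen is indistinguishable from a photo of the actual landscape. The bird's-eye view would still be a line (a colored one, but still a line).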

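Now imagine you take exactly the same image, but this time it is a photo of the landscape rather than of a screen. Then the bird's-eye view is no longer a line and is closer to what we usually picture as a bird's-eye view.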

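Finally, let me point out that OpenCV has no way of knowing whether your picture depicts a plane or something else (even given the camera parameters), so it cannot compute the bird's-eye view of your scene. The function cv2.perspectiveTransform needs you to pass it a homography matrix (you can obtain one with cv2.findHomography(), but you will also need some information about distances in the image).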

Sorry for the negative answer, but your problem cannot be solved from the camera's intrinsic and extrinsic calibration matrices alone.