ARKit barcode tracking with the Vision framework

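I have been trying to draw a bounding box around a QR code detected during an ARSession. The results look like this: boundingbox 1, boundingbox 2. The barcode is being tracked, but the geometry of the bounding box is wrong.
How can I get the correct bounding box coordinates?
The source code is: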
 public func session(_ session: ARSession, didUpdate frame: ARFrame) {

     // Only run one Vision request at a time
     if self.processing {
         return
     }

    self.processing = true

    let request = VNDetectBarcodesRequest { (request, error) in

        // Only proceed when at least one barcode observation came back
        if let results = request.results, results.first is VNBarcodeObservation {

            DispatchQueue.main.async {

                let path = CGMutablePath()

                for result in results {
                    guard let barcode = result as? VNBarcodeObservation else { continue }
                    let topLeft = self.convert(point: barcode.topLeft)
                    path.move(to: topLeft)
                    let topRight = self.convert(point: barcode.topRight)
                    path.addLine(to: topRight)
                    let bottomRight = self.convert(point: barcode.bottomRight)
                    path.addLine(to: bottomRight)
                    let bottomLeft = self.convert(point: barcode.bottomLeft)
                    path.addLine(to: bottomLeft)
                    path.addLine(to: topLeft)
                }                   
                self.drawLayer.path = path
                self.processing = false
            }
        } else {
            self.processing = false
        }
    }

    DispatchQueue.global(qos: .userInitiated).async {
        do {
            request.symbologies = [.QR]
            let imageRequestHandler = VNImageRequestHandler(cvPixelBuffer: frame.capturedImage, orientation: .right, options: [:])                
            try imageRequestHandler.perform([request])
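        } catch {
            // Log the failure and reset the flag so later frames are still processed
            print("Vision request failed: \(error)")
            self.processing = false
        }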
    }
}

 private func convert(point: CGPoint) -> CGPoint {
     return CGPoint(x: point.x * view.bounds.size.width,
                   y: (1 - point.y) * view.bounds.size.height)
 }

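If you Command-click on VNRectangleObservation, the class documentation tells you to use CIPerspectiveTransform to rectify it. I am not sure that will solve the problem, though; it may instead be related to the latency between the current frame and the frame in which the code was found. - Maxim Volgin
The only thing that worked for me was taking a snapshot: let snapshot = self.sceneView.snapshot().rotate(radians: -.pi/2) But this approach is not good, because I have to take a snapshot of a frame that was already captured during tracking, and the snapshot resolution is low. I assume a proper way must exist. - Дмитрий Акимов
Is this related to "orientation: .right"? Maybe it should be changed to ".up"? - Maxim Volgin
I tried different orientations, but orientation is not the problem. When I take the image from the ARFrame and from the snapshot, the two images have different dimensions and content, as if they were taken from different viewpoints. - Дмитрий Акимов
I would try saving some ARFrames as CGImages to the file system and running VNDetectBarcodesRequest on the saved images to find out what is going on. - Maxim Volgin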
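One possible explanation for the mismatch described in the comments is that the captured image and the view have different aspect ratios, so the camera image is cropped on screen while convert(point:) stretches the normalized coordinates across the full view. The following is only a sketch under that assumption: it supposes the image is drawn aspect-fill and centered, and that imageSize is the size of the upright image (with orientation .right, the pixel buffer's height and width swapped). The helper name convertAspectFill is hypothetical and not part of the original post. ARKit also provides ARFrame.displayTransform(for:viewportSize:) for mapping image coordinates to the viewport, which may be the more robust route on a device.

```swift
import Foundation

/// Maps a normalized Vision point (origin bottom-left, upright image) to
/// view coordinates (origin top-left).
/// Assumption: the image is drawn aspect-fill and centered in the view.
/// Hypothetical helper for illustration, not taken from the original post.
func convertAspectFill(point: CGPoint, imageSize: CGSize, viewSize: CGSize) -> CGPoint {
    // Aspect-fill uses the larger scale factor, so one axis overflows the view.
    let scale = max(viewSize.width / imageSize.width,
                    viewSize.height / imageSize.height)
    let displayed = CGSize(width: imageSize.width * scale,
                           height: imageSize.height * scale)

    // Amount of the scaled image that is cropped off on each side.
    let offsetX = (displayed.width - viewSize.width) / 2
    let offsetY = (displayed.height - viewSize.height) / 2

    // Flip y (Vision's origin is bottom-left), scale, then remove the crop offset.
    return CGPoint(x: point.x * displayed.width - offsetX,
                   y: (1 - point.y) * displayed.height - offsetY)
}
```

With a 1080 x 1920 upright image and a 375 x 812 view, the image center still lands on the view center, but points near the left and right image edges fall outside the view, which a plain multiplication by the view size does not account for.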
1 Answer


I just migrated the barcode recognition in my app from AVFoundation to Vision; here is the outlining logic that worked for me:

import UIKit
import Vision

extension CVPixelBuffer {
    var size: CGSize {
        get {
            let width = CGFloat(CVPixelBufferGetWidth(self))
            let height = CGFloat(CVPixelBufferGetHeight(self))
            return CGSize(width: width, height: height)
        }
    }
}
extension VNRectangleObservation {    
    func outline(in cvPixelBuffer: CVPixelBuffer, with color: UIColor) -> CALayer {
        let outline = CAShapeLayer()
        outline.path = self.path(in: cvPixelBuffer).cgPath
        outline.fillColor = UIColor.clear.cgColor
        outline.strokeColor =  color.cgColor
        return outline
    }
    
    func path(in cvPixelBuffer: CVPixelBuffer) -> UIBezierPath {
        let size = cvPixelBuffer.size
        let transform = CGAffineTransform.identity
            .scaledBy(x: 1, y: -1)
            .translatedBy(x: 0, y: -size.height)
            .scaledBy(x: size.width, y: size.height)
        
        let convertedTopLeft = self.topLeft.applying(transform)
        let convertedTopRight = self.topRight.applying(transform)
        let convertedBottomLeft = self.bottomLeft.applying(transform)
        let convertedBottomRight = self.bottomRight.applying(transform)
        
        let path = UIBezierPath()
        path.move(to: convertedTopLeft)
        path.addLine(to: convertedTopRight)
        path.addLine(to: convertedBottomRight)
        path.addLine(to: convertedBottomLeft)
        path.close()
        
        path.lineWidth = 2.0
        return path
    }
}

Next, I apply one additional scale transform to fit the size of the view in which the outline is displayed.
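The answer does not show that extra scaling step, so the following is only a minimal sketch of what it might look like. It assumes the view displays the whole pixel buffer stretched to its bounds; if the preview is cropped or letterboxed, an offset would be needed as well. The function name scale(point:from:to:) is hypothetical.

```swift
import Foundation

/// Scales a point from pixel-buffer coordinates into a view's coordinate space.
/// Assumption: the view shows the entire buffer stretched to its bounds.
/// Hypothetical helper for illustration, not taken from the original answer.
func scale(point: CGPoint, from bufferSize: CGSize, to viewSize: CGSize) -> CGPoint {
    // Independent scale factors per axis (buffer pixels to view points).
    let sx = viewSize.width / bufferSize.width
    let sy = viewSize.height / bufferSize.height
    return CGPoint(x: point.x * sx, y: point.y * sy)
}
```

On iOS the same scaling could probably be applied to the whole outline at once by passing a CGAffineTransform(scaleX:y:) built from these two factors to UIBezierPath's apply(_:) method.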

I am using the https://github.com/maxvol/RxVision library, which makes it very easy to pass along the processed image (a CVPixelBuffer in my case).

