How do I rotate a CVImageBuffer image directly in iOS 4 without converting to a UIImage?

14

I'm using OpenCV 2.2 to detect faces on the iPhone, and iOS 4's AVCaptureSession to get the camera stream, as shown in the code below.

My challenge is that the video frames come in as CVBufferRef objects (pointing to a CVImageBuffer), oriented 480 pixels wide by 300 pixels high. That's fine if you're holding the phone sideways, but when the phone is held upright I want to rotate those frames 90° clockwise so that OpenCV can find the faces correctly.

It's possible to convert the CVBufferRef to a CGImage, then to a UIImage, and rotate that, as this person did: Rotate CGImage taken from video frame

But that burns a lot of CPU. If possible, I'm looking for a faster way to rotate the incoming images, ideally using the GPU for this processing.

Any ideas?

Ian

Code sample:

 -(void) startCameraCapture {
  // Start up the face detector

  faceDetector = [[FaceDetector alloc] initWithCascade:@"haarcascade_frontalface_alt2" withFileExtension:@"xml"];

  // Create the AVCapture Session
  session = [[AVCaptureSession alloc] init];

  // create a preview layer to show the output from the camera
  AVCaptureVideoPreviewLayer *previewLayer = [AVCaptureVideoPreviewLayer layerWithSession:session];
  previewLayer.frame = previewView.frame;
  previewLayer.videoGravity = AVLayerVideoGravityResizeAspectFill;

  [previewView.layer addSublayer:previewLayer];

  // Get the default camera device
  AVCaptureDevice* camera = [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeVideo];

  // Create a AVCaptureInput with the camera device
  NSError *error=nil;
  AVCaptureInput* cameraInput = [[AVCaptureDeviceInput alloc] initWithDevice:camera error:&error];
  if (cameraInput == nil) {
   NSLog(@"Error creating camera capture: %@", error);
  }

  // Set the output
  AVCaptureVideoDataOutput* videoOutput = [[AVCaptureVideoDataOutput alloc] init];
  videoOutput.alwaysDiscardsLateVideoFrames = YES;

  // create a queue besides the main thread queue to run the capture on
  dispatch_queue_t captureQueue = dispatch_queue_create("captureQueue", NULL);

  // setup our delegate
  [videoOutput setSampleBufferDelegate:self queue:captureQueue];

  // release the queue. -setSampleBufferDelegate:queue: retains the queue,
  // so the capture output keeps it alive after we drop our reference.
  dispatch_release(captureQueue);

  // configure the pixel format
  videoOutput.videoSettings = [NSDictionary dictionaryWithObjectsAndKeys:
          [NSNumber numberWithUnsignedInt:kCVPixelFormatType_32BGRA], 
          (id)kCVPixelBufferPixelFormatTypeKey,
          nil];

  // and the size of the frames we want
  // try AVCaptureSessionPresetLow if this is too slow...
  [session setSessionPreset:AVCaptureSessionPresetMedium];

  // If you wish to cap the frame rate to a known value, such as 10 fps, set 
  // minFrameDuration.
  videoOutput.minFrameDuration = CMTimeMake(1, 10);

  // Add the input and output
  [session addInput:cameraInput];
  [session addOutput:videoOutput];

  // Start the session
  [session startRunning];  
 }

 - (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection {
  // only run if we're not already processing an image
  if (!faceDetector.imageNeedsProcessing) {

   // Get CVImage from sample buffer
   CVImageBufferRef cvImage = CMSampleBufferGetImageBuffer(sampleBuffer);

   // Send the CVImage to the FaceDetector for later processing
   [faceDetector setImageFromCVPixelBufferRef:cvImage];

   // Trigger the image processing on the main thread
   [self performSelectorOnMainThread:@selector(processImage) withObject:nil waitUntilDone:NO];
  }
 }
4 Answers

17

vImage is a pretty fast way to do it; it requires iOS 5, though. The call says ARGB, but it works for the BGRA you get from the buffer as well.

It also has the advantage that you can cut out part of the buffer and rotate just that part. See my answer here.

- (unsigned char*) rotateBuffer: (CMSampleBufferRef) sampleBuffer
{
 CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
 CVPixelBufferLockBaseAddress(imageBuffer,0);

 size_t bytesPerRow = CVPixelBufferGetBytesPerRow(imageBuffer);
 size_t width = CVPixelBufferGetWidth(imageBuffer);
 size_t height = CVPixelBufferGetHeight(imageBuffer);
 size_t currSize = bytesPerRow*height*sizeof(unsigned char); 
 size_t bytesPerRowOut = 4*height*sizeof(unsigned char); 

 void *srcBuff = CVPixelBufferGetBaseAddress(imageBuffer); 
 unsigned char *outBuff = (unsigned char*)malloc(currSize);  // caller must free()

 vImage_Buffer ibuff = { srcBuff, height, width, bytesPerRow};
 vImage_Buffer ubuff = { outBuff, width, height, bytesPerRowOut};

 uint8_t rotConst = 1;   // 0, 1, 2, 3 is equal to 0, 90, 180, 270 degrees rotation

 vImage_Error err= vImageRotate90_ARGB8888 (&ibuff, &ubuff, NULL, rotConst, NULL,0);
 if (err != kvImageNoError) NSLog(@"%ld", err);

 CVPixelBufferUnlockBaseAddress(imageBuffer, 0);

 return outBuff;
}
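For reference, the index mapping that a quarter-turn rotation performs can be sketched in plain C. This is a hypothetical helper, not the vImage API; which direction each rotConst value maps to should be double-checked against the Accelerate headers, since the comment above only says "0, 90, 180, 270 degrees".

```c
/* Rotate a w x h array of 32-bit pixels by rotConst quarter turns
 * (here: 0, 1, 2, 3 -> 0, 90, 180, 270 degrees clockwise).
 * dst must hold w * h pixels; for 90/270 its dimensions become h x w. */
static void rotate_quarter(const unsigned *src, unsigned *dst,
                           int w, int h, int rotConst)
{
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            unsigned p = src[y * w + x];
            switch (rotConst & 3) {
            case 0: dst[y * w + x] = p; break;                     /* 0   */
            case 1: dst[x * h + (h - 1 - y)] = p; break;           /* 90  */
            case 2: dst[(h - 1 - y) * w + (w - 1 - x)] = p; break; /* 180 */
            case 3: dst[(w - 1 - x) * h + y] = p; break;           /* 270 */
            }
        }
    }
}
```

vImage does the same remapping with SIMD-optimized code, which is why it beats a naive loop like this.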

3
I used a similar approach to manipulate each sample buffer frame before writing the video to a file. One caveat: the vImageRotate... prototype has changed; my call looks like vImageRotate90_ARGB8888(&inbuff, &outbuff, rotationConstant, bgColor, 0); (where uint8_t bgColor[4] = {0, 0, 0, 0};). You also need to create a CVPixelBufferRef manually to pass the resulting image data to an AVAssetWriterInputPixelBufferAdaptor. Just don't forget to create a CVPixelBufferReleaseBytesCallback to free the data buffer allocated in this function. - Mr. T

3
Perhaps an easier way is to set the video orientation directly:
connection.videoOrientation = AVCaptureVideoOrientationPortrait

That way you won't need any rotation gymnastics at all.

7
That approach doesn't physically rotate the image buffer. - ozz
1
What type of object is connection? - Parth Patel

3
I know this is a fairly old question, but I've been working on a similar problem recently, and maybe someone will find my solution useful.
I needed to extract the raw image data from a YCbCr-format image buffer provided by the iPhone camera (obtained from [AVCaptureVideoDataOutput.availableVideoCVPixelFormatTypes firstObject]), discarding headers, metadata and so on, in order to pass it on for further processing.
Also, I only needed a small area in the center of the captured video frame, so some cropping was required.
My conditions only allowed capturing video in landscape, but when the device is in landscape-left orientation the image is delivered upside down, so I had to flip it on both axes. My idea was to copy the data from the source image buffer in reverse row order, and reverse the bytes in each row as it is read, to flip the image. The idea works, and since I had to copy the data out of the source buffer anyway, there seems to be little performance penalty whether you read from the beginning or from the end (bigger images take longer to process, of course, but the sizes I deal with are very small).
I'd like to know what others think of this solution, and of course any tips for improving the code:
/// Lock pixel buffer
CVPixelBufferLockBaseAddress(imageBuffer, 0);

/// Address where image buffer starts
uint8_t *baseAddress = (uint8_t *)CVPixelBufferGetBaseAddress(imageBuffer);

/// Read image parameters
size_t width = CVPixelBufferGetWidth(imageBuffer);
size_t height = CVPixelBufferGetHeight(imageBuffer);

/// See whether image is flipped upside down
BOOL isFlipped = (_previewLayer.connection.videoOrientation == AVCaptureVideoOrientationLandscapeLeft);

/// Calculate cropping frame. Crop to scanAreaSize (defined as CGSize constant elsewhere) from the center of an image
CGRect cropFrame = CGRectZero;
cropFrame.size = scanAreaSize;
cropFrame.origin.x = (width / 2.0f) - (scanAreaSize.width / 2.0f);
cropFrame.origin.y = (height / 2.0f) - (scanAreaSize.height / 2.0f);

/// Update proportions to cropped size
width = (size_t)cropFrame.size.width;
height = (size_t)cropFrame.size.height;

/// Allocate memory for output image data. W*H for Y component, W*H/2 for CbCr component
size_t bytes = width * height + (width * height / 2);

uint8_t *outputDataBaseAddress = (uint8_t *)malloc(bytes);

if(outputDataBaseAddress == NULL) {

    /// Memory allocation failed, unlock buffer and give up
    CVPixelBufferUnlockBaseAddress(imageBuffer, 0);

    return NULL;
}

/// Get parameters of YCbCr pixel format
CVPlanarPixelBufferInfo_YCbCrBiPlanar *bufferInfo = (CVPlanarPixelBufferInfo_YCbCrBiPlanar *)baseAddress;

NSUInteger bytesPerRowY = EndianU32_BtoN(bufferInfo->componentInfoY.rowBytes);
NSUInteger offsetY = EndianU32_BtoN(bufferInfo->componentInfoY.offset);

NSUInteger bytesPerRowCbCr = EndianU32_BtoN(bufferInfo->componentInfoCbCr.rowBytes);
NSUInteger offsetCbCr = EndianU32_BtoN(bufferInfo->componentInfoCbCr.offset);

/// Copy image data only, skipping headers and metadata. Create single buffer which will contain Y component data
/// followed by CbCr component data.

/// Process Y component
/// Pointer to the source buffer
uint8_t *src;

/// Pointer to the destination buffer
uint8_t *destAddress;

/// Calculate crop rect offset. Crop offset is number of rows (y * bytesPerRow) + x offset.
/// If image is flipped, then read buffer from the end to flip image vertically. End address is height-1!
int flipOffset = (isFlipped) ? (int)((height - 1) * bytesPerRowY) : 0;

int cropOffset = (int)((cropFrame.origin.y * bytesPerRowY) + flipOffset + cropFrame.origin.x);

/// Set source pointer to Y component buffer start address plus crop rect offset
src = baseAddress + offsetY + cropOffset;

for(int y = 0; y < height; y++) {

    /// Copy one row of pixel data from source into the output buffer.
    destAddress = (outputDataBaseAddress + y * width);

    memcpy(destAddress, src, width);

    if(isFlipped) {

        /// Reverse bytes in row to flip image horizontally
        [self reverseBytes:destAddress bytesSize:(int)width];

        /// Move one row up
        src -= bytesPerRowY;
    }
    else {

        /// Move to the next row
        src += bytesPerRowY;
    }
}

/// Calculate crop offset for CbCr component
flipOffset = (isFlipped) ? (int)(((height - 1) / 2) * bytesPerRowCbCr) : 0;
cropOffset = (int)(((cropFrame.origin.y / 2) * bytesPerRowCbCr) + flipOffset + cropFrame.origin.x); // CbCr plane has half as many rows as Y, so halve the luma y origin

/// Set source pointer to the CbCr component offset + crop offset
src = (baseAddress + offsetCbCr + cropOffset);

for(int y = 0; y < (height / 2); y++) {

    /// Copy one row of pixel data from source into the output buffer.
    destAddress = (outputDataBaseAddress + (width * height) + y * width);

    memcpy(destAddress, src, width);

    if(isFlipped) {

        /// Reverse bytes in row to flip image horizontally. Note: on this
        /// interleaved CbCr plane a plain byte reversal also swaps Cb and Cr
        /// within each pair; swap 2-byte pairs instead if colors matter.
        [self reverseBytes:destAddress bytesSize:(int)width];

        /// Move one row up
        src -= bytesPerRowCbCr;
    }
    else {

        src += bytesPerRowCbCr;
    }
}

/// Unlock pixel buffer
CVPixelBufferUnlockBaseAddress(imageBuffer, 0);

/// Continue with image data in outputDataBaseAddress;
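The -reverseBytes:bytesSize: helper used above isn't shown in the answer. Presumably it is a plain in-place byte reversal, which amounts to a horizontal mirror on the one-byte-per-pixel Y plane. A C sketch of that assumed helper:

```c
/* Reverse size bytes of buf in place by swapping from both ends inward. */
static void reverse_bytes(unsigned char *buf, int size)
{
    for (int i = 0, j = size - 1; i < j; i++, j--) {
        unsigned char tmp = buf[i];
        buf[i] = buf[j];
        buf[j] = tmp;
    }
}
```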

3
If you're rotating in 90° increments, you can do it all in memory. Here is sample code that simply copies the data into a new pixel buffer; doing a brute-force rotation from there should be easy.
- (CVPixelBufferRef) rotateBuffer: (CMSampleBufferRef) sampleBuffer
{
    CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    CVPixelBufferLockBaseAddress(imageBuffer,0);

    size_t bytesPerRow = CVPixelBufferGetBytesPerRow(imageBuffer);
    size_t width = CVPixelBufferGetWidth(imageBuffer);
    size_t height = CVPixelBufferGetHeight(imageBuffer);

    void *src_buff = CVPixelBufferGetBaseAddress(imageBuffer);

    NSDictionary *options = [NSDictionary dictionaryWithObjectsAndKeys:
                             [NSNumber numberWithBool:YES], kCVPixelBufferCGImageCompatibilityKey,
                             [NSNumber numberWithBool:YES], kCVPixelBufferCGBitmapContextCompatibilityKey,
                             nil];

    CVPixelBufferRef pxbuffer = NULL;
    //CVReturn status = CVPixelBufferPoolCreatePixelBuffer (NULL, _pixelWriter.pixelBufferPool, &pxbuffer);
    CVReturn status = CVPixelBufferCreate(kCFAllocatorDefault, width,
                                          height, kCVPixelFormatType_32BGRA, (CFDictionaryRef) options, 
                                          &pxbuffer);

    NSParameterAssert(status == kCVReturnSuccess && pxbuffer != NULL);

    CVPixelBufferLockBaseAddress(pxbuffer, 0);
    void *dest_buff = CVPixelBufferGetBaseAddress(pxbuffer);
    NSParameterAssert(dest_buff != NULL);

    int *src = (int*) src_buff ;
    int *dest= (int*) dest_buff ;
    size_t count = (bytesPerRow * height) / 4 ;
    while (count--) {
        *dest++ = *src++;
    }

    //Test straight copy.
    //memcpy(pxdata, baseAddress, width * height * 4) ;
    CVPixelBufferUnlockBaseAddress(pxbuffer, 0);
    CVPixelBufferUnlockBaseAddress(imageBuffer, 0);
    return pxbuffer;
}
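One caveat with the straight int copy above: it assumes the source and destination share the same bytesPerRow. CVPixelBuffers frequently pad rows for alignment, and mismatched strides produce exactly the kind of scrambled output described in the comments below, so a stride-aware copy is safer. A plain-C sketch (hypothetical helper, copying only the meaningful bytes of each row):

```c
#include <string.h>
#include <stddef.h>

/* Copy a w x h 32-bit (e.g. BGRA) image between buffers whose rows may be
 * padded, i.e. srcBytesPerRow and dstBytesPerRow may each exceed w * 4. */
static void copy_pixels(const unsigned char *src, size_t srcBytesPerRow,
                        unsigned char *dst, size_t dstBytesPerRow,
                        size_t w, size_t h)
{
    for (size_t y = 0; y < h; y++) {
        memcpy(dst + y * dstBytesPerRow, src + y * srcBytesPerRow, w * 4);
    }
}
```

CVPixelBufferGetBytesPerRow supplies each buffer's actual stride at runtime.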

If you're writing this back out to an AVAssetWriterInput, you can use an AVAssetWriterInputPixelBufferAdaptor.
The above is not optimized. You may want to look for a more efficient copy algorithm; a good starting point is in-place matrix transposition. You'll also want to use a pixel buffer pool rather than creating a new buffer each time.
Edit: You could use the GPU for this, though it sounds like a lot of data to be pushing around. CVPixelBufferRef has the flag kCVPixelBufferOpenGLCompatibilityKey; I assume you could create an OpenGL-compatible image from the CVImageBufferRef (which is just a pixel buffer ref) and push it through a shader. Again, I think that's overkill. You might check whether BLAS or LAPACK have "out-of-place" transpose methods; if they do, you can be sure they're highly optimized.
90° CW, where new_width is the height of the source image and new_height its width… this will give you a portrait-oriented image.
// NB: width must be the source row stride in pixels (bytesPerRow / 4),
// which can be larger than the visible width when rows are padded.
for (int i = 0; i < new_height; i++) {
    for (int j = new_width - 1; j >= 0; j--) {
        *dest++ = *(src + (j * width) + i);
    }
}

1
Steve, thanks for your answer. Currently I'm using the transpose and flip methods in OpenCV, which is the fastest of all the image-rotation approaches I've tried. I found that although I could push this into OpenGL, unless I can do all of the image processing in OpenGL (including the face detection), I don't get a big performance win. For now I'll stick with various combinations of transpose() and flip() to rotate in 90° increments. I think you gave the best answer within the constraints, so I'll consider this question answered. - Ian Charnas
Nice idea using OpenCV to rotate the image... I forked niw's project and added real-time face tracking... I may tidy it up later, but at least it's a starting point for anyone looking for a complete solution - https://github.com/gitaaron/iphone_opencv_test - surtyaar
I used the direct data copy to rotate the image and it worked fine at low resolution (AVCaptureSessionPresetLow), but when I tried it at AVCaptureSessionPresetMedium the image came out scrambled. I'm probably missing something silly... does anyone know what the problem is? - Ilya K.
To be honest, this approach is really slow. Rotating the image through an OpenGL shader would be far more efficient; you can modify the model/view matrix to rotate (and scale, if needed). You could also do it via Core Image. Take a look at Brad Larson's GPUImage project on GitHub. - Steve McFarlin
Yes, I know it's slow, but in this case I just want to understand, out of curiosity, why the same algorithm doesn't work at a higher resolution (480x360 versus 192x144). Can you help me? - Ilya K.
I don't see why it wouldn't work; the code above appears to be resolution-independent. Remember that width and height are swapped. Why it works at low resolution but not at high is beyond me. Are you dispatching or running the rotation on a thread other than the one the AVCaptureSession is on? - Steve McFarlin
