Audio stutters after converting with AVAudioConverter and an AVAudioConverterInputBlock

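I'm trying to convert audio buffers to a different format, and I'm using AVAudioConverter for that. AVAudioConverter does the job when the sample rate stays the same and you don't need an AVAudioConverterInputBlock.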

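But when the sample rates differ, I get weird stuttering in my audio data. I have a feeling I'm not handling the input block properly. The output repeats itself two or three times. Here is the full method: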

func sendAudio(audioFile: URL, completionHandler: @escaping (Bool, Bool, Data?)->Void) {

    createSession(){ sessionUrl, observeURL, session in
        let file = try! AVAudioFile(forReading: audioFile)
        let formatOfAudio = file.processingFormat
        self.engine = AVAudioEngine()
        guard let input = self.engine.inputNode else {
            print("no input")
            return
        }
        //The audio in format in this case is: <AVAudioFormat 0x61800009d010:  2 ch,  44100 Hz, Float32, non-inter>
        let formatIn = formatOfAudio
        let formatOut = AVAudioFormat(commonFormat: .pcmFormatInt16, sampleRate: 16000, channels: 1, interleaved: true)
        let mixer = AVAudioMixerNode()
        self.engine.attach(mixer)
        mixer.volume = 0.0
        self.engine.attach(self.audioPlayerNode)
        self.engine.connect(self.audioPlayerNode, to: mixer, format: formatIn)
        self.engine.connect(input, to: mixer, format: input.outputFormat(forBus: 0))
        self.engine.connect(mixer, to: self.engine.mainMixerNode, format: formatIn)
        let audioConverter = AVAudioConverter(from: formatIn, to: formatOut)
        mixer.installTap(onBus: 0, bufferSize: 32000, format: formatIn, block: {
            (buffer: AVAudioPCMBuffer!, time: AVAudioTime!) -> Void in
                let convertedBuffer = AVAudioPCMBuffer(pcmFormat: formatOut, frameCapacity: buffer.frameCapacity)
                let inputBlock: AVAudioConverterInputBlock = { inNumPackets, outStatus in
                    outStatus.pointee = AVAudioConverterInputStatus.haveData
                    return buffer
                }
                var error: NSError? = nil
                let status = audioConverter.convert(to: convertedBuffer, error: &error, withInputFrom: inputBlock)
                let myData = convertedBuffer.toData()
                completionHandler(true, false, myData)
        })
        self.audioPlayerNode.scheduleFile(file, at: nil){
            self.delayWithSeconds(3.0){
            self.engine.stop()
            mixer.removeTap(onBus: 0)
            completionHandler(true, true, nil)
            }
        }
        do {
            try self.engine.start()
        } catch {
            print(error)
        }
        self.audioPlayerNode.play()
    }
}

Any thoughts? I took the input block from this snippet on an Apple slide:

// Create an input block that’s called when converter needs input
let inputBlock : AVAudioConverterInputBlock = {inNumPackets, outStatus in 
    if (<no_data_available>) {   
        outStatus.memory = AVAudioConverterInputStatus.NoDataNow; 
        return nil;  
    } else if (<end_of_stream>) {   
        outStatus.memory = AVAudioConverterInputStatus.EndOfStream; 
        return nil;  
    } else {
        outStatus.memory = AVAudioConverterInputStatus.HaveData;
        return inBuffer; // fill and return input buffer 
    }  
}
2 Answers

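For anyone who finds this post: the actual root cause is incorrect use of the AVAudioConverterInputBlock. The destination buffer capacity doesn't matter as long as it is large enough, but the block will be called repeatedly until the destination buffer is full.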

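If your source buffer contains ABC, the converter will fill the destination buffer with ABCABCABC... Then, if you pipe that into real-time playback, the chunks get cut off at arbitrary points to fit the playback timing, which causes the weird crackling.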
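To make the repetition concrete, here is a toy model in plain Swift. It is not AVAudioConverter itself: `fill(capacity:input:)` is a hypothetical stand-in that, like the converter described above, keeps asking its input closure for data until the output is full or the closure reports that nothing more is available.

```swift
/// Toy stand-in for a converter (hypothetical, for illustration only):
/// keeps calling `input` until `capacity` items are collected
/// or `input` returns nil.
func fill(capacity: Int, input: () -> [Character]?) -> String {
    var out: [Character] = []
    while out.count < capacity, let chunk = input() {
        out.append(contentsOf: chunk.prefix(capacity - out.count))
    }
    return String(out)
}

let source: [Character] = ["A", "B", "C"]

// Always returning the same buffer: the source repeats until the output is full.
let repeated = fill(capacity: 9) { source }

// Handing the buffer over once, then reporting "no data": no repetition.
var gotData = false
let once = fill(capacity: 9) {
    if gotData { return nil }
    gotData = true
    return source
}

print(repeated) // ABCABCABC
print(once)     // ABC
```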

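The actual solution is to set AVAudioConverterInputStatus.noDataNow once the buffer has been handed to the converter. Beware that returning .endOfStream will lock the converter object permanently.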

var gotData = false
self.converter.convert(to: convertedBuffer, error: nil, withInputFrom: { (_, outStatus) -> AVAudioBuffer? in
    if gotData {
        outStatus.pointee = .noDataNow
        return nil
    }
    gotData = true
    outStatus.pointee = .haveData
    return inputBuffer
})            
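For completeness, the same idea can be wrapped in a helper that also checks the result. This is only a sketch under a few assumptions: `convertOnce` is a made-up name, the output capacity is estimated from the two sample rates and the buffer's `frameLength`, and the converter was created for the same input format as the buffer. Newer toolchains may warn about mutating a captured variable inside the block.

```swift
import AVFoundation

/// Converts one buffer, handing it to the converter exactly once.
/// `convertOnce` is a hypothetical helper name, not part of AVFoundation.
func convertOnce(_ inputBuffer: AVAudioPCMBuffer,
                 using converter: AVAudioConverter) -> AVAudioPCMBuffer? {
    // Assumption: size the output from the sample-rate ratio, rounded up.
    let frames = Double(inputBuffer.frameLength)
        * converter.outputFormat.sampleRate
        / converter.inputFormat.sampleRate
    let capacity = AVAudioFrameCount(frames.rounded(.up))

    guard let output = AVAudioPCMBuffer(pcmFormat: converter.outputFormat,
                                        frameCapacity: capacity) else {
        return nil
    }

    var gotData = false
    var error: NSError?
    let status = converter.convert(to: output, error: &error, withInputFrom: { _, outStatus in
        if gotData {
            // The buffer was already delivered; nothing more for now.
            outStatus.pointee = .noDataNow
            return nil
        }
        gotData = true
        outStatus.pointee = .haveData
        return inputBuffer
    })

    if status == .error {
        print("Conversion failed: \(String(describing: error))")
        return nil
    }
    return output
}
```

Inside the tap block you would then call something like `convertOnce(buffer, using: audioConverter)` and send the returned buffer's data onward.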

I tried this and it works well. It's always nice to get the right answer rather than a hack. - MScottWaller

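I believe I've figured it out. The frame capacity of the converted buffer has to be divided by the sample rate conversion ratio. So the complete answer looks like this: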

func sendAudio(audioFile: URL, completionHandler: @escaping (Bool, Bool, Data?)->Void) {

    createSession(){ sessionUrl, observeURL, session in
        let file = try! AVAudioFile(forReading: audioFile)
        let formatOfAudio = file.processingFormat
        self.engine = AVAudioEngine()
        guard let input = self.engine.inputNode else {
            print("no input")
            return
        }
        //The audio in format in this case is: <AVAudioFormat 0x61800009d010:  2 ch,  44100 Hz, Float32, non-inter>
        let formatIn = formatOfAudio
        let formatOut = AVAudioFormat(commonFormat: .pcmFormatInt16, sampleRate: 16000, channels: 1, interleaved: true)
        let mixer = AVAudioMixerNode()
        self.engine.attach(mixer)
        mixer.volume = 0.0
        self.engine.attach(self.audioPlayerNode)
        self.engine.connect(self.audioPlayerNode, to: mixer, format: formatIn)
        self.engine.connect(input, to: mixer, format: input.outputFormat(forBus: 0))
        self.engine.connect(mixer, to: self.engine.mainMixerNode, format: formatIn)
        let audioConverter = AVAudioConverter(from: formatIn, to: formatOut)
        //Here is where I adjusted for the sample rate. It's hard coded here, but you would want to adjust so that you're dividing the input sample rate by your chosen sample rate.
        let sampleRateConversionRatio: Float = 44100.0/16000.0

        mixer.installTap(onBus: 0, bufferSize: 32000, format: formatIn, block: {
            (buffer: AVAudioPCMBuffer!, time: AVAudioTime!) -> Void in
                //And this is where you set the appropriate capacity!
                let capacity = UInt32(Float(buffer.frameCapacity)/sampleRateConversionRatio)
                let convertedBuffer = AVAudioPCMBuffer(pcmFormat: formatOut, frameCapacity: capacity)
                let inputBlock: AVAudioConverterInputBlock = { inNumPackets, outStatus in
                    outStatus.pointee = AVAudioConverterInputStatus.haveData
                    return buffer
                }
                var error: NSError? = nil
                let status = audioConverter.convert(to: convertedBuffer, error: &error, withInputFrom: inputBlock)
                let myData = convertedBuffer.toData()
                completionHandler(true, false, myData)
        })
        self.audioPlayerNode.scheduleFile(file, at: nil){
            self.delayWithSeconds(3.0){
            self.engine.stop()
            mixer.removeTap(onBus: 0)
            completionHandler(true, true, nil)
            }
        }
        do {
            try self.engine.start()
        } catch {
            print(error)
        }
        self.audioPlayerNode.play()
    }
}
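As the comment in the code says, the ratio is hard-coded there. A minimal sketch of computing the capacity from the actual rates instead might look like this; `outputFrameCapacity` is a hypothetical helper, and rounding up is my own choice so the buffer is never one frame too small. In the tap you could pass `buffer.frameLength`, `formatIn.sampleRate` and the output format's sample rate.

```swift
/// Estimates how many output frames are needed for `inputFrames` input frames
/// when resampling from `inputRate` to `outputRate` (both in Hz).
/// Hypothetical helper; rounds up so the result is never too small.
func outputFrameCapacity(inputFrames: UInt32,
                         inputRate: Double,
                         outputRate: Double) -> UInt32 {
    precondition(inputRate > 0 && outputRate > 0, "sample rates must be positive")
    // Multiply before dividing so exact ratios stay exact.
    let frames = Double(inputFrames) * outputRate / inputRate
    return UInt32(frames.rounded(.up))
}

// One second of 44.1 kHz audio needs 16000 frames at 16 kHz.
print(outputFrameCapacity(inputFrames: 44100, inputRate: 44100, outputRate: 16000)) // 16000

// A 19200-frame tap buffer at 44.1 kHz needs 6966 frames at 16 kHz.
print(outputFrameCapacity(inputFrames: 19200, inputRate: 44100, outputRate: 16000)) // 6966
```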

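I ran into a similar problem with this converter, and the documentation is thin on this point. Something else that may help others: even though you set bufferSize: 32000 on the tap, you have to compute the capacity from the actual buffer's frameCapacity, i.e. buffer.frameCapacity. You can't assume the tap will actually use the buffer size you requested; it can ignore your request. I tried a larger size (48000), but the actual buffer was only 19200, so you have to handle it the way you did! - Jesse Pangburn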
