Extracting the sound pressure level from an AVAudioPCMBuffer

Question

ios audio volume avaudioengine avaudiopcmbuffer

8

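I know almost nothing about signal processing, and I am currently trying to implement a function in Swift that triggers an event when the sound pressure level increases (for example, when a person screams).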

I tap into the input node of an AVAudioEngine with a callback like this:

let recordingFormat = inputNode.outputFormat(forBus: 0)
inputNode.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat){
 (buffer : AVAudioPCMBuffer?, when : AVAudioTime) in 
    let arraySize = Int(buffer.frameLength)
    let samples = Array(UnsafeBufferPointer(start: buffer.floatChannelData![0], count:arraySize))

   //do something with samples
    let volume = 20 * log10(floatArray.reduce(0){ $0 + $1} / Float(arraySize))
    if(!volume.isNaN){
       print("this is the current volume: \(volume)")
    }
}

After converting the buffer to a float array, I tried to get a rough estimate of the sound pressure level by computing the mean.

But this gives me values that fluctuate a lot, even when the iPad is just sitting in a quiet room:

this is the current volume: -123.971
this is the current volume: -119.698
this is the current volume: -147.053
this is the current volume: -119.749
this is the current volume: -118.815
this is the current volume: -123.26
this is the current volume: -118.953
this is the current volume: -117.273
this is the current volume: -116.869
this is the current volume: -110.633
this is the current volume: -130.988
this is the current volume: -119.475
this is the current volume: -116.422
this is the current volume: -158.268
this is the current volume: -118.933

The value does increase significantly if I clap near the microphone.

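So I could first compute the mean of these volumes during a preparing phase, and then check whether the volume rises significantly above it during the event-triggering phase: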

 if(!volume.isNaN){
    if(isInThePreparingPhase){
        print("this is the current volume: \(volume)")
        volumeSum += volume
        volumeCount += 1
     }else if(isInTheEventTriggeringPhase){
         if(volume > meanVolume){
             //triggers an event
         }
      }
 }

where the mean volume is computed during the transition from the preparing phase to the event-triggering phase: meanVolume = volumeSum / Float(volumeCount)

....

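However, there appears to be no significant increase if I play loud music next to the microphone. And on rare occasions, volume is greater than meanVolume even when there is no audible increase in the volume of the environment.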

So what is the proper way of extracting the sound pressure level from an AVAudioPCMBuffer?

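Wikipedia gives the following formula:

L_p = 20 \log_{10}\left(\frac{p}{p_0}\right) \text{ dB}

where p is the root-mean-square sound pressure and p_0 is the reference sound pressure.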

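But I have no idea what the float values in AVAudioPCMBuffer.floatChannelData represent. The Apple page only says:

The buffer's audio samples as floating point values.

How should I work with them?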

- Archy Will He 何魏奇

Hi Arch, I assume you figured out the answer to this question? Do you have any code you could share? - Logan

What is floatArray here? ... let volume = 20 * log10(floatArray.reduce(0){ $0 + $1} / Float(arraySize)) .... - MikeMaus

2 Answers

6

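I think the first step is to get the envelope of the sound. You could use simple averaging to calculate an envelope, but you need to add a rectification step (usually this means using abs() or square() to make all samples positive).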
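To make the rectify-then-average idea concrete, here is a minimal sketch that squares each sample, averages the squares, and converts the result to decibels. It assumes the samples are normalized floats in roughly -1...1, so the result is in dB relative to full scale rather than a calibrated sound pressure level. The function name rmsLevelInDecibels and the floor parameter are made up for illustration.

```swift
import Foundation

/// Returns the RMS level of `samples` in dB relative to full scale (1.0).
/// Hypothetical helper; `floor` is an assumed lower bound that avoids log10(0).
func rmsLevelInDecibels(_ samples: [Float], floor: Float = -160) -> Float {
    guard !samples.isEmpty else { return floor }
    // Rectify by squaring so positive and negative samples cannot cancel out.
    let meanSquare = samples.reduce(0) { $0 + $1 * $1 } / Float(samples.count)
    let rms = meanSquare.squareRoot()
    guard rms > 0 else { return floor }
    return max(floor, 20 * log10(rms))
}
```

Without the rectification step, a symmetric waveform such as [1, -1, 1, -1] averages to zero even though it is as loud as the format allows, which is one plausible reason a plain mean of the raw samples jumps around so much.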

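More commonly, a simple IIR filter is used instead of averaging, with different constants for attack and decay; here is a lab. Note that these constants depend on the sampling frequency; you can use this formula to calculate them: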

1 - exp(-timePerSample*2/smoothingTime)
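As a worked example of that formula, here is a small sketch. The function name smoothingConstant is hypothetical, and timePerSample is taken to be 1 / sampleRate, which is my reading of the formula rather than something stated above.

```swift
import Foundation

/// Smoothing coefficient for a one-pole filter, following
/// 1 - exp(-timePerSample * 2 / smoothingTime).
/// `sampleRate` is in Hz and `smoothingTime` in seconds (assumed units).
func smoothingConstant(sampleRate: Double, smoothingTime: Double) -> Float {
    let timePerSample = 1.0 / sampleRate
    return Float(1 - exp(-timePerSample * 2 / smoothingTime))
}
```

For instance, smoothingConstant(sampleRate: 44_100, smoothingTime: 0.001) gives a fairly large constant suited to a fast attack, while a smoothingTime of 0.05 gives a much smaller one suited to a slow decay. Both smoothing times are illustrative values, not recommendations.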

Step 2

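Once you have the envelope, you can smooth it with an additional filter and then compare the two envelopes to find a sound that is louder than the base level; here is a more complete lab.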
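One way to read "compare the two envelopes" is to run a fast follower and a much slower one over the same rectified signal, and flag the first sample where the fast one clearly exceeds the slow one. The sketch below is only an assumption about what that could look like, not the code from the lab; EnvelopeFollower, firstLoudEventIndex, the coefficients, the ratio and the minimum level are all illustrative.

```swift
/// One-pole envelope follower with separate attack and decay coefficients.
struct EnvelopeFollower {
    let attack: Float
    let decay: Float
    private(set) var level: Float = 0

    init(attack: Float, decay: Float) {
        self.attack = attack
        self.decay = decay
    }

    /// Rectifies `sample` with abs() and moves `level` towards it.
    mutating func process(_ sample: Float) -> Float {
        let rectified = abs(sample)
        let coefficient = rectified > level ? attack : decay
        level += coefficient * (rectified - level)
        return level
    }
}

/// Returns the index of the first sample where the fast envelope is above
/// `minimumLevel` and more than `ratio` times the slow (baseline) envelope.
func firstLoudEventIndex(in samples: [Float],
                         ratio: Float = 4,
                         minimumLevel: Float = 0.01) -> Int? {
    var fast = EnvelopeFollower(attack: 0.16, decay: 0.003)
    var slow = EnvelopeFollower(attack: 0.0005, decay: 0.0005)
    for (index, sample) in samples.enumerated() {
        let fastLevel = fast.process(sample)
        let slowLevel = slow.process(sample)
        if fastLevel > minimumLevel && fastLevel > ratio * slowLevel {
            return index
        }
    }
    return nil
}
```

Because the slow envelope starts at zero, a sound that is already loud when processing begins will trigger immediately; in practice you would probably let the baseline settle for a while before acting on the result.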

Note that detecting audio "events" can be quite tricky and hard to predict, so make sure you have plenty of debugging aids!

- teadrinker

Thanks for the lab demos! Very helpful :D - Archy Will He 何魏奇

- davebcn87 · Accepted Answer

Thanks to the response from @teadrinker, I finally found a solution to this problem. I am sharing my Swift code, which outputs the volume of the AVAudioPCMBuffer input:

private func getVolume(from buffer: AVAudioPCMBuffer, bufferSize: Int) -> Float {
    guard let channelData = buffer.floatChannelData?[0] else {
        return 0
    }

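    // Never read past the frames the buffer actually holds.
    let frameCount = min(bufferSize, Int(buffer.frameLength))
    let channelDataArray = Array(UnsafeBufferPointer(start: channelData, count: frameCount))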

    var outEnvelope = [Float]()
    var envelopeState:Float = 0
    let envConstantAtk:Float = 0.16
    let envConstantDec:Float = 0.003

    for sample in channelDataArray {
        let rectified = abs(sample)

        if envelopeState < rectified {
            envelopeState += envConstantAtk * (rectified - envelopeState)
        } else {
            envelopeState += envConstantDec * (rectified - envelopeState)
        }
        outEnvelope.append(envelopeState)
    }

    // Treat envelope peaks at or below 0.015 as microphone noise
    // and report silence instead.
    if let maxVolume = outEnvelope.max(),
        maxVolume > Float(0.015) {
        return maxVolume
    } else {
        return 0.0
    }
}