Swift FFT - 复数分离问题

14
我正在尝试使用加速框架在音频文件上执行FFT以查找频率。我从此问题中(可能是错误的)适应了代码:Spectrogram from AVAudioPCMBuffer using Accelerate framework in Swift。尽管'spectrum'的幅度可能为'0'、'inf'或'nan',而复数分割的'real'和'imag'成分打印出类似的结果;这表明这是问题的原因:'magnitude = sqrt(pow(real,2)+pow(imag,2))'。请纠正我,但我认为代码的其余部分没问题。为什么我会收到这些结果,我该如何解决它(分割组件应该是什么),我做错了什么?请记住,我对FFT和采样非常陌生,不知道如何为音频文件设置这个,所以任何帮助都将不胜感激。谢谢。以下是我正在使用的代码:
    // get audio file
    let fileURL:NSURL = NSBundle.mainBundle().URLForResource("foo", withExtension: "mp3")!
    let audioFile = try!  AVAudioFile(forReading: fileURL)
    let fileFormat = audioFile.processingFormat
    let frameCount = UInt32(audioFile.length)

    let buffer = AVAudioPCMBuffer(PCMFormat: fileFormat, frameCapacity: frameCount)
    let audioEngine = AVAudioEngine()
    let playerNode = AVAudioPlayerNode()
    audioMixerNode = audioEngine.mainMixerNode

    let bufferSize = Int(frameCount)
    let channels: NSArray = [Int(buffer.format.channelCount)]
    let channelCount = channels.count
    let floats1 = [Int(buffer.frameLength)]
    for var i=0; i<channelCount; ++i {
        channelSamples.append([])
        let firstSample = buffer.format.interleaved ? i : i*bufferSize
        for var j=firstSample; j<bufferSize; j+=buffer.stride*2 {
            channelSamples[i].append(DSPComplex(real: buffer.floatChannelData.memory[j], imag: buffer.floatChannelData.memory[j+buffer.stride]))
        }
    }

    // connect node
    audioEngine.attachNode(playerNode)
    audioEngine.connect(playerNode, to: audioMixerNode, format: playerNode.outputFormatForBus(0))

    // Set up the transform
    let log2n = UInt(round(log2(Double(bufferSize))))
    let fftSetup = vDSP_create_fftsetup(log2n, Int32(kFFTRadix2))

    // Create the complex split value to hold the output of the transform
    // why doesn't this work?
    var realp = [Float](count: bufferSize/2, repeatedValue: 0)
    var imagp = [Float](count: bufferSize/2, repeatedValue: 0)
    var output = DSPSplitComplex(realp: &realp, imagp: &imagp)

    vDSP_ctoz(UnsafePointer(channelSamples), 2, &output, 1, UInt(bufferSize / 2))

    // Do the fast Fourier forward transform
    vDSP_fft_zrip(fftSetup, &output, 1, log2n, Int32(FFT_FORWARD))

    // Convert the complex output to magnitude
    var fft = [Float](count:Int(bufferSize / 2), repeatedValue:0.0)
    let bufferOver2: vDSP_Length = vDSP_Length(bufferSize / 2)
    vDSP_zvmags(&output, 1, &fft, 1, bufferOver2)

    var spectrum = [Float]()
    for var i=0; i<bufferSize/2; ++i {
        let imag = output.imagp[i]
        let real = output.realp[i]
        let magnitude = sqrt(pow(real,2)+pow(imag,2))
        spectrum.append(magnitude) }

    // Release the setup
    vDSP_destroy_fftsetup(fftSetup)
1个回答

8
您的代码存在以下几个问题:
  1. 您没有读入音频文件样本
  2. channelSamples 打包不正确
  3. vDSP_fft_zrip已读取超出数组末尾。它需要2^log2n个样本
  4. vDSP_fft_zrip的输出是打包的,而您的计算期望解包

Swift 4版本现已实际修复第3点。

let fileURL = Bundle.main.url(forResource: "foo", withExtension: "mp3")!
let audioFile = try!  AVAudioFile(forReading: fileURL as URL)
let frameCount = UInt32(audioFile.length)

let log2n = UInt(round(log2(Double(frameCount))))
let bufferSizePOT = Int(1 << log2n)

let buffer = AVAudioPCMBuffer(pcmFormat: audioFile.processingFormat, frameCapacity: AVAudioFrameCount(bufferSizePOT))!
try! audioFile.read(into: buffer, frameCount:frameCount)

// Not sure if AVAudioPCMBuffer zero initialises extra frames, so when in doubt...
let leftFrames = buffer.floatChannelData![0]
for i in Int(frameCount)..<Int(bufferSizePOT) {
    leftFrames[i] = 0
}

// Set up the transform
let fftSetup = vDSP_create_fftsetup(log2n, Int32(kFFTRadix2))!

// create packed real input
var realp = [Float](repeating: 0, count: bufferSizePOT/2)
var imagp = [Float](repeating: 0, count: bufferSizePOT/2)
var output = DSPSplitComplex(realp: &realp, imagp: &imagp)

leftFrames.withMemoryRebound(to: DSPComplex.self, capacity: bufferSizePOT / 2) {
    vDSP_ctoz($0, 2, &output, 1, UInt(bufferSizePOT / 2))
}

// Do the fast Fourier forward transform, packed input to packed output
vDSP_fft_zrip(fftSetup, &output, 1, log2n, Int32(FFT_FORWARD))

// you can calculate magnitude squared here, with care
// as the first result is wrong! read up on packed formats
var fft = [Float](repeating:0.0, count:Int(bufferSizePOT / 2))
vDSP_zvmags(&output, 1, &fft, 1, vDSP_Length(bufferSizePOT / 2))

// Release the setup
vDSP_destroy_fftsetup(fftSetup)

你可以帮忙让这个在Swift 4.0以上版本正常工作吗?我已经完成了转换,除了vDSP_ctoz之外其他都是自动的,但输出结果与预期不符。 - Sam Bing
所有的值都是nan,还在调试中。 - Sam Bing
哎呀,这段代码超出了AVAudioPCMBuffer内存的末尾索引。也许我要用修好的Swift 4版本替换这个答案。 - Rhythmic Fistman
太棒了,错过了。非常感谢! - Sam Bing

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接