iOS: How to read an audio file into a float buffer

Question

ios file core-audio

10

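I have a really short audio file, say a tenth of a second long, in (say) .PCM format.

I want to use RemoteIO to loop through the file repeatedly to produce a continuous musical tone. So how do I read it into an array of floats?

EDIT: While I could probably dig out the file format, extract the file into an NSData and process it manually, I'm guessing there is a more sensible generic approach (one that copes with different formats, for example).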

- P i

Why isn't the file's NSData enough? - August Lilleaas

2

I'm guessing every audio file format has some header information. Otherwise how would it know the sample rate, data format, and so on? - P i

4 Answers

6

This is the code I used to convert my audio data (an audio file) into a floating-point representation and save it into an array.

-(void) PrintFloatDataFromAudioFile {

    NSString *name = @"Filename";  // your file name
    NSString *source = [[NSBundle mainBundle] pathForResource:name ofType:@"m4a"]; // your file extension
    if (source == nil) {
        NSLog(@"Audio file not found in the bundle");
        return;
    }

    // Build the file URL straight from the path; no C-string round trip is needed.
    NSURL *inputFileURL = [NSURL fileURLWithPath:source];

    ExtAudioFileRef fileRef = NULL;
    OSStatus err = ExtAudioFileOpenURL((__bridge CFURLRef)inputFileURL, &fileRef);
    if (err != noErr) {
        NSLog(@"ExtAudioFileOpenURL failed: %d", (int)err);
        return;
    }

    // Client format: what ExtAudioFile should hand back (mono, packed 32-bit float).
    AudioStreamBasicDescription audioFormat = {0};
    audioFormat.mSampleRate = 44100;   // your sampling rate
    audioFormat.mFormatID = kAudioFormatLinearPCM;
    audioFormat.mFormatFlags = kAudioFormatFlagsNativeFloatPacked;
    audioFormat.mBitsPerChannel = sizeof(Float32) * 8;
    audioFormat.mChannelsPerFrame = 1; // mono
    audioFormat.mBytesPerFrame = audioFormat.mChannelsPerFrame * sizeof(Float32);  // == sizeof(Float32)
    audioFormat.mFramesPerPacket = 1;
    audioFormat.mBytesPerPacket = audioFormat.mFramesPerPacket * audioFormat.mBytesPerFrame; // == sizeof(Float32)

    // Apply the client format to the extended audio file
    err = ExtAudioFileSetProperty(fileRef,
                                  kExtAudioFileProperty_ClientDataFormat,
                                  sizeof(AudioStreamBasicDescription),
                                  &audioFormat);
    if (err != noErr) {
        NSLog(@"Setting the client format failed: %d", (int)err);
        ExtAudioFileDispose(fileRef);
        return;
    }

    const UInt32 numSamples = 1024; // how many frames to read at a time
    const UInt32 outputBufferSize = numSamples * audioFormat.mBytesPerPacket; // 1024 frames * 4 bytes
    Float32 *outputBuffer = malloc(outputBufferSize);

    // Heap-allocate the result; an array this large does not fit on the stack on iOS.
    const size_t maxSamples = 882000; // your data limit; must be >= the number of frames in the file
    double *floatDataArray = malloc(maxSamples * sizeof(double));
    if (outputBuffer == NULL || floatDataArray == NULL) {
        free(outputBuffer);
        free(floatDataArray);
        ExtAudioFileDispose(fileRef);
        return;
    }

    AudioBufferList convertedData;
    convertedData.mNumberBuffers = 1;    // one buffer for mono
    convertedData.mBuffers[0].mNumberChannels = audioFormat.mChannelsPerFrame;  // also 1
    convertedData.mBuffers[0].mData = outputBuffer;

    size_t j = 0;
    while (j < maxSamples) {
        // Reset the request on every pass; the read call updates frameCount
        // (and may update mDataByteSize) to what was actually delivered.
        UInt32 frameCount = numSamples;
        convertedData.mBuffers[0].mDataByteSize = outputBufferSize;

        err = ExtAudioFileRead(fileRef, &frameCount, &convertedData);
        if (err != noErr || frameCount == 0) {
            break; // error or end of file
        }

        // Copy only the frames that were actually read
        for (UInt32 i = 0; i < frameCount && j < maxSamples; i++) {
            floatDataArray[j] = (double)outputBuffer[i];
            printf("%f\n", floatDataArray[j]);  // values range from -1 to +1
            j++;
        }
    }

    free(floatDataArray);
    free(outputBuffer);
    ExtAudioFileDispose(fileRef);
}

- Ankush

1

Thanks for this code. On iOS I had to use malloc to allocate floatDataArray. Other than that, everything works great. - VaporwareWolf
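
To expand on the malloc remark: when the number of frames is not known in advance, one option is to let a heap buffer grow as each chunk arrives. Below is a minimal sketch in plain C, not part of the original answer. SampleBuffer, sample_buffer_append and sample_buffer_free are hypothetical names introduced only for illustration, and overflow checks on the capacity arithmetic are left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer of float samples (illustration only). */
typedef struct {
    float *data;      /* heap storage, NULL while empty */
    size_t count;     /* samples currently stored */
    size_t capacity;  /* samples the storage can hold */
} SampleBuffer;

/*
 * Append n samples, doubling the capacity until they fit.
 * Returns 0 on success, -1 if the allocation fails (buffer left unchanged).
 */
int sample_buffer_append(SampleBuffer *buf, const float *samples, size_t n) {
    assert(buf != NULL);
    if (n == 0) {
        return 0;
    }
    if (n > buf->capacity - buf->count) {
        size_t new_cap = buf->capacity ? buf->capacity : 1024;
        while (new_cap - buf->count < n) {
            new_cap *= 2;
        }
        float *grown = realloc(buf->data, new_cap * sizeof(float));
        if (grown == NULL) {
            return -1;
        }
        buf->data = grown;
        buf->capacity = new_cap;
    }
    memcpy(buf->data + buf->count, samples, n * sizeof(float));
    buf->count += n;
    return 0;
}

/* Release the storage and reset the buffer to its empty state. */
void sample_buffer_free(SampleBuffer *buf) {
    free(buf->data);
    buf->data = NULL;
    buf->count = 0;
    buf->capacity = 0;
}
```

In a chunked read loop, each successful read would be followed by one sample_buffer_append call, passing the pointer to the converted floats and the frame count that the read reported, so only frames actually delivered are copied.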

5

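I'm not familiar with RemoteIO, but I am familiar with the WAV format and thought I'd share some format information. If you need to, you should be able to parse out information such as the duration and bit rate fairly easily.

First, here is an excellent website detailing the WAVE PCM soundfile format. The site also does a good job of illustrating what the different byte addresses inside the "fmt" subchunk refer to.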

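WAVE file format

A WAVE file consists of a "RIFF" chunk followed by subchunks
Every chunk is at least 8 bytes
The first 4 bytes are the chunk ID
The next 4 bytes are the chunk size (the chunk size gives the size of the remainder of the chunk, excluding the 8 bytes used for the chunk ID and chunk size)
Every WAVE has the following chunks/subchunks
- "RIFF" (the first and only chunk; everything else is a subchunk)
- "fmt " (the ID includes a trailing space; usually the first subchunk after "RIFF", but it can be anywhere between "RIFF" and "data"; this chunk holds information about the WAV such as the number of channels, sample rate, and byte rate)
- "data" (normally the last subchunk; it contains all of the sound data)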
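
The chunk layout above can be walked with a few lines of code. The following is a minimal sketch in plain C that operates on a file already loaded into memory; read_le32 and wav_find_chunk are hypothetical helper names, not part of any Apple API. It assumes little-endian sizes and the RIFF rule that odd-sized chunks are followed by one pad byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a little-endian 32-bit value; RIFF stores chunk sizes little-endian. */
static uint32_t read_le32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Find a subchunk by its 4-character ID inside an in-memory WAVE file.
 * On success, returns a pointer to the chunk payload and stores the payload
 * size in *size_out. Returns NULL if the buffer is not RIFF/WAVE, the chunk
 * is missing, or a chunk claims more bytes than the buffer holds.
 */
const uint8_t *wav_find_chunk(const uint8_t *buf, size_t len,
                              const char id[4], uint32_t *size_out) {
    assert(buf != NULL && size_out != NULL);

    /* "RIFF" + 4-byte size + "WAVE" make up the 12-byte outer header. */
    if (len < 12 || memcmp(buf, "RIFF", 4) != 0 ||
        memcmp(buf + 8, "WAVE", 4) != 0) {
        return NULL;
    }

    size_t pos = 12;
    while (pos + 8 <= len) {
        uint32_t size = read_le32(buf + pos + 4);
        if (size > len - pos - 8) {
            return NULL; /* truncated or corrupt chunk */
        }
        if (memcmp(buf + pos, id, 4) == 0) {
            *size_out = size;
            return buf + pos + 8;
        }
        /* Skip ID, size field, payload, and the pad byte after odd sizes. */
        pos += 8 + (size_t)size + (size & 1u);
    }
    return NULL;
}
```

Calling wav_find_chunk(bytes, length, "data", &size) would return the start of the sample data, and the same call with "fmt " would return the format block. For a real file, the format fields should still be checked before interpreting the samples.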

Common WAVE audio formats:

PCM
IEEE_Float
PCM_EXTENSIBLE（其子格式为PCM或IEEE_FLOAT）

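WAVE duration and size

The duration of a WAVE file can be calculated as follows:

seconds = DataChunkSize / ByteRate

where

ByteRate = SampleRate * NumChannels * BitsPerSample/8

and DataChunkSize does not include the 8 bytes reserved for the ID and size of the "data" subchunk.

Knowing this, you can calculate DataChunkSize if you know the duration of the WAV and its ByteRate:

DataChunkSize = seconds * ByteRate

This is useful for estimating the size of the WAV data when converting from formats such as mp3 or wma. Note that a typical WAV header is 44 bytes, followed by DataChunkSize bytes of sample data (this is always the case if the WAV was converted with the Normalizer tool, at least as of this writing).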
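
The formulas above translate directly into code. Here is a minimal sketch in plain C; the function names are hypothetical and chosen only for illustration, and the size estimate assumes the typical 44-byte header mentioned above, which not every WAV file has.

```c
#include <assert.h>
#include <stdint.h>

/* ByteRate = SampleRate * NumChannels * BitsPerSample / 8 */
uint32_t wav_byte_rate(uint32_t sample_rate, uint16_t num_channels,
                       uint16_t bits_per_sample) {
    uint64_t bits_per_second =
        (uint64_t)sample_rate * num_channels * bits_per_sample;
    return (uint32_t)(bits_per_second / 8u);
}

/* seconds = DataChunkSize / ByteRate */
double wav_duration_seconds(uint32_t data_chunk_size, uint32_t byte_rate) {
    assert(byte_rate > 0);
    return (double)data_chunk_size / (double)byte_rate;
}

/* DataChunkSize = seconds * ByteRate, rounded to the nearest byte. */
uint32_t wav_data_chunk_size(double seconds, uint32_t byte_rate) {
    return (uint32_t)(seconds * (double)byte_rate + 0.5);
}

/* Estimated file size, assuming a 44-byte header before the data. */
uint64_t wav_estimated_file_size(uint32_t data_chunk_size) {
    return 44u + (uint64_t)data_chunk_size;
}
```

For the tenth-of-a-second clip from the question, stored as 16-bit stereo at 44100 Hz, the byte rate works out to 176400, so the data chunk is 17640 bytes and the estimated file size is 17684 bytes.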

- Sam

2

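Swift 5 update

Here is a simple function that converts an audio file to an array of floats. It works for both mono and stereo audio. To get the second channel of a stereo file, uncomment the samples2 lines.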


import AVFoundation

//..

do {
    guard let url = Bundle.main.url(forResource: "audio_example", withExtension: "wav") else { return }
    let file = try AVAudioFile(forReading: url)

    // Deinterleaved Float32 at the file's own sample rate and channel count
    if let format = AVAudioFormat(commonFormat: .pcmFormatFloat32,
                                  sampleRate: file.fileFormat.sampleRate,
                                  channels: file.fileFormat.channelCount,
                                  interleaved: false),
       let buf = AVAudioPCMBuffer(pcmFormat: format, frameCapacity: AVAudioFrameCount(file.length)) {

        // Read the whole file into the buffer
        try file.read(into: buf)
        guard let floatChannelData = buf.floatChannelData else { return }
        let frameLength = Int(buf.frameLength)

        // Channel 0 (the only channel for mono, the first channel for stereo)
        let samples = Array(UnsafeBufferPointer(start: floatChannelData[0], count: frameLength))
        // Channel 1 (stereo files only)
//        let samples2 = Array(UnsafeBufferPointer(start: floatChannelData[1], count: frameLength))

        print("samples")
        print(samples.count)
        print(samples.prefix(10))
//        print(samples2.prefix(10))
    }
} catch {
    print("Audio Error: \(error)")
}

- Ibrahim

- sbooth · Accepted Answer

You can use ExtAudioFile to read data from any supported data format in numerous client formats. Here is an example that reads a file as 16-bit integers:

CFURLRef url = /* ... */;
ExtAudioFileRef eaf;
OSStatus err = ExtAudioFileOpenURL(url, &eaf);
if(noErr != err) {
  /* handle error */
}

/* Client format: interleaved stereo, packed signed 16-bit integers */
AudioStreamBasicDescription format = {0};
format.mSampleRate = 44100;
format.mFormatID = kAudioFormatLinearPCM;
format.mFormatFlags = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagsNativeEndian | kAudioFormatFlagIsPacked;
format.mBitsPerChannel = 16;
format.mChannelsPerFrame = 2;
format.mBytesPerFrame = format.mChannelsPerFrame * 2;
format.mFramesPerPacket = 1;
format.mBytesPerPacket = format.mFramesPerPacket * format.mBytesPerFrame;

err = ExtAudioFileSetProperty(eaf, kExtAudioFileProperty_ClientDataFormat, sizeof(format), &format);
if(noErr != err) {
  /* handle error */
}

/* Read the file contents using ExtAudioFileRead */

If you want Float32 data, you would set up format like this:

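format.mFormatID = kAudioFormatLinearPCM;
format.mFormatFlags = kAudioFormatFlagsNativeFloatPacked;
format.mBitsPerChannel = 32;
/* The per-frame and per-packet sizes must follow the wider sample type */
format.mBytesPerFrame = format.mChannelsPerFrame * sizeof(Float32);
format.mBytesPerPacket = format.mFramesPerPacket * format.mBytesPerFrame;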