Extracting amplitude data from linear PCM on the iPhone

Question


iphone ios core-audio

10

I'm having a hard time extracting amplitude data from linear PCM recorded on the iPhone and stored in audio.caf.

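My questions are:

1) Linear PCM stores amplitude samples as 16-bit values. Is this correct?
2) How is amplitude stored in the packets returned by AudioFileReadPacketData()? When recording mono linear PCM, isn't the data just an array of SInt16, one sample per frame and one frame per packet? What is the byte order: big-endian or little-endian?
3) What does each step in linear PCM amplitude mean physically?
4) When linear PCM is recorded on the iPhone, is the center point 0 (SInt16) or 32768 (UInt16)? What do the max and min values mean in terms of the physical waveform/air pressure?

And a bonus question: are there sound/air pressure waveforms that the iPhone mic can't measure?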

My code is as follows:

// get the audio file proxy object for the audio
AudioFileID fileID;
AudioFileOpenURL((CFURLRef)audioURL, kAudioFileReadPermission, kAudioFileCAFType, &fileID);

// get the number of packets of audio data contained in the file
UInt64 totalPacketCount = [self packetCountForAudioFile:fileID];

// get the size of each packet for this audio file
UInt32 maxPacketSizeInBytes = [self packetSizeForAudioFile:fileID];

// setup to extract the audio data
Boolean inUseCache = false;
UInt32 numberOfPacketsToRead = 4410; // 0.1 seconds of data
UInt32 ioNumPackets = numberOfPacketsToRead;
UInt32 ioNumBytes = maxPacketSizeInBytes * ioNumPackets;
char *outBuffer = malloc(ioNumBytes);
memset(outBuffer, 0, ioNumBytes);

SInt16 signedMinAmplitude = -32768;
SInt16 signedCenterpoint = 0;
SInt16 signedMaxAmplitude = 32767;

SInt16 minAmplitude = signedMaxAmplitude;
SInt16 maxAmplitude = signedMinAmplitude;

// process each and every packet
for (UInt64 packetIndex = 0; packetIndex < totalPacketCount; packetIndex = packetIndex + ioNumPackets)
{
   // reset the number of packets to get
   ioNumPackets = numberOfPacketsToRead;

   AudioFileReadPacketData(fileID, inUseCache, &ioNumBytes, NULL, packetIndex, &ioNumPackets, outBuffer);

   for (UInt32 batchPacketIndex = 0; batchPacketIndex < ioNumPackets; batchPacketIndex++)
   {
      SInt16 packetData = outBuffer[batchPacketIndex * maxPacketSizeInBytes];
      SInt16 absoluteValue = abs(packetData);

      if (absoluteValue < minAmplitude) { minAmplitude = absoluteValue; }
      if (absoluteValue > maxAmplitude) { maxAmplitude = absoluteValue; }
   }
}

NSLog(@"minAmplitude: %hi", minAmplitude);
NSLog(@"maxAmplitude: %hi", maxAmplitude);

With this code I almost always get a min of 0 and a max of 128! That makes no sense to me.

I'm recording the audio with AVAudioRecorder as follows:

// specify mono, 44.1 kHz, Linear PCM with Max Quality as recording format
NSDictionary *recordSettings = [[NSDictionary alloc] initWithObjectsAndKeys:
   [NSNumber numberWithFloat: 44100.0], AVSampleRateKey,
   [NSNumber numberWithInt: kAudioFormatLinearPCM], AVFormatIDKey,
   [NSNumber numberWithInt: 1], AVNumberOfChannelsKey,
   [NSNumber numberWithInt: AVAudioQualityMax], AVEncoderAudioQualityKey,
   nil];

// store the sound file in the app doc folder as calibration.caf
NSString *documentsDir = [NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES) lastObject];
NSURL *audioFileURL = [NSURL fileURLWithPath:[documentsDir stringByAppendingPathComponent: @"audio.caf"]];

// create the audio recorder
NSError *createAudioRecorderError = nil;
AVAudioRecorder *newAudioRecorder = [[AVAudioRecorder alloc] initWithURL:audioFileURL settings:recordSettings error:&createAudioRecorderError];
[recordSettings release];

if (newAudioRecorder)
{
   // record the audio
   self.recorder = newAudioRecorder;
   [newAudioRecorder release];

   self.recorder.delegate = self;
   [self.recorder prepareToRecord];
   [self.recorder record];
}
else
{
   NSLog(@"%@", [createAudioRecorderError localizedDescription]);
}

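Any help would be greatly appreciated. This is my first project using Core Audio, so feel free to critique my approach!

P.S. I tried to search the Core Audio list archives, but the request keeps returning an error: (http://search.lists.apple.com/?q=linear+pcm+amplitude&cmd=Search%21&ul=coreaudio-api)

P.P.S. I have looked at: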

http://en.wikipedia.org/wiki/Sound_pressure

http://en.wikipedia.org/wiki/Linear_PCM

http://wiki.multimedia.cx/index.php?title=PCM

Get amplitude at a given time within a sound file?

http://music.columbia.edu/pipermail/music-dsp/2002-April/048341.html

I've read the entire Core Audio Overview and most of the Audio Session Programming Guide, but my questions remain.

- David Weiss

2 Answers

2

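If you ask for 16-bit samples in your recording format, then you get 16-bit samples. But other formats exist in many of the Core Audio record/play APIs, and in the possible caf file formats. In mono, you just get an array of signed 16-bit ints. You can specifically ask for big-endian or little-endian in some of the Core Audio recording APIs. Unless you want to calibrate for your particular device model's mic or your external mic (and make sure audio processing/AGC is turned off), you might want to consider the audio levels to be arbitrarily scaled. The response also varies with mic directionality and audio frequency. The center point for 16-bit audio samples is commonly 0 (range about -32k to 32k). There is no bias.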

- hotpaw2


- justin · Accepted Answer

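1) The OS X/iPhone file read routines allow you to determine the sample format, which for LPCM is typically one of SInt8, SInt16, SInt32, Float32, Float64, or contiguous 24-bit signed int.

2) For int formats, MIN_FOR_TYPE represents the maximum amplitude in the negative phase, and MAX_FOR_TYPE represents the maximum amplitude in the positive phase. 0 equals silence. Floating-point formats modulate within [-1...1], with 0 again representing silence. When reading, writing, recording, or working with a specific format, endianness matters - a file may require a specific format, and you typically want to manipulate the data in the native endianness. Some routines in the Apple audio file libraries let you pass a flag denoting the source endianness, rather than converting it manually. CAF is a bit more complicated - it acts like a meta wrapper for one or more audio files, and supports many types.

3) The amplitude representation for LPCM is just a brute-force linear amplitude representation (no conversion/decoding is required for playback, and the amplitude steps are equal).

4) See #2. The values are not related to air pressure; they are related to 0 dBFS. For example, if you're outputting the stream straight to a DAC, then the int max (or -1/1 if floating point) represents the level at which an individual sample will clip.

Bonus) Like every ADC and component chain, it has limits on the input voltage it can handle. Additionally, the sampling rate defines the highest frequency that may be captured (the highest being half of the sampling rate). The ADC may use a fixed or selectable bit depth, but the maximum input voltage does not generally change when choosing another bit depth.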
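Points 2 and 4 can be made concrete with a small plain-C sketch. It assumes mono 16-bit signed LPCM, and every function name below is hypothetical (none of them is a Core Audio API). It shows how two stored bytes become a sample under either byte order, how an integer sample maps onto the float convention, and how a peak can be expressed relative to full scale.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode one signed 16-bit sample stored little-endian
   (low byte first). */
static int16_t sample_from_little_endian(const unsigned char bytes[2])
{
    int32_t value = (int32_t)bytes[0] | ((int32_t)bytes[1] << 8);
    if (value >= 32768) value -= 65536;   /* restore the sign */
    return (int16_t)value;
}

/* Hypothetical helper: decode one signed 16-bit sample stored big-endian
   (high byte first). */
static int16_t sample_from_big_endian(const unsigned char bytes[2])
{
    int32_t value = ((int32_t)bytes[0] << 8) | (int32_t)bytes[1];
    if (value >= 32768) value -= 65536;   /* restore the sign */
    return (int16_t)value;
}

/* Map an integer sample onto the float convention:
   -32768 becomes -1.0 and 0 stays 0.0 (silence). */
static float sample_to_float(int16_t sample)
{
    return (float)sample / 32768.0f;
}

/* Peak of a buffer as a fraction of full scale:
   0.0 is silence, 1.0 corresponds to 0 dBFS. */
static double peak_fraction_of_full_scale(const int16_t *samples, size_t count)
{
    assert(samples != NULL || count == 0);
    int32_t peak = 0;
    for (size_t i = 0; i < count; i++) {
        /* Widen before negating so that -32768 does not overflow. */
        int32_t magnitude = samples[i] < 0 ? -(int32_t)samples[i]
                                           : (int32_t)samples[i];
        if (magnitude > peak) peak = magnitude;
    }
    return (double)peak / 32768.0;
}
```

A level in dBFS follows from the fraction as 20 * log10(fraction), using log10 from math.h. Since the recording level is arbitrarily scaled unless the device is calibrated, this says how close the signal is to clipping, not what the sound pressure was.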

One mistake you're making at the code level: you're manipulating "outBuffer" as chars, not as SInt16.
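The char remark can be illustrated with a minimal plain-C sketch. It is not the asker's corrected code: it assumes the buffer holds native-endian, mono, 16-bit signed samples, and amplitude_range is a hypothetical helper name. Indexing a char buffer yields a single byte per read, so where char is signed each value is limited to -128...127 and its absolute value to 0...128, which would be consistent with the 0 and 128 reported in the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not a Core Audio API): scans a raw byte buffer holding
   native-endian, mono, 16-bit signed samples and reports the smallest and
   largest absolute amplitude. For an empty buffer it leaves *minAbs at 32768
   and *maxAbs at 0. */
static void amplitude_range(const unsigned char *bytes, size_t byteCount,
                            int32_t *minAbs, int32_t *maxAbs)
{
    assert(bytes != NULL && minAbs != NULL && maxAbs != NULL);

    *minAbs = 32768;   /* no 16-bit sample has a larger magnitude */
    *maxAbs = 0;

    size_t sampleCount = byteCount / sizeof(int16_t);
    for (size_t i = 0; i < sampleCount; i++) {
        /* Copy two bytes into a 16-bit sample instead of reading one char. */
        int16_t sample;
        memcpy(&sample, bytes + i * sizeof sample, sizeof sample);

        /* Use a wider type for the magnitude: |-32768| does not fit in 16 bits. */
        int32_t magnitude = sample < 0 ? -(int32_t)sample : (int32_t)sample;

        if (magnitude < *minAbs) *minAbs = magnitude;
        if (magnitude > *maxAbs) *maxAbs = magnitude;
    }
}
```

In the question's loop, the equivalent change would be to view the bytes filled in by AudioFileReadPacketData() as 16-bit samples and to hold the absolute value in a type wider than SInt16.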