我正在尝试将Android语音识别服务听到的音频数据保存到文件中。
实际上,我按照这里所述实现了RecognitionListener
:
在Android上进行语音转文本
按照这里所示的方法将数据保存到缓冲区: 捕获发送到Google语音识别服务器的音频
并像这里一样将缓冲区写入Wav文件。 Android将原始字节记录到用于HTTP流媒体的WAVE文件中
我的问题是如何获取适当的音频设置以保存在WAV文件头中。 实际上,当我播放WAV文件时只能听到奇怪的噪音,使用这些参数:
short nChannels=2;// audio channels
int sRate=44100; // Sample rate
short bSamples = 16;// byteSample
或者用这个什么也不做:
short nChannels=1;// audio channels
int sRate=8000; // Sample rate
short bSamples = 16;// byteSample
令人困惑的是,从logcat查看语音识别任务的参数时,我首先找到 将PLAYBACK采样率设置为44100 HZ:
12-20 14:41:34.007: DEBUG/AudioHardwareALSA(2364): Set PLAYBACK PCM format to S16_LE (Signed 16 bit Little Endian)
12-20 14:41:34.007: DEBUG/AudioHardwareALSA(2364): Using 2 channels for PLAYBACK.
12-20 14:41:34.007: DEBUG/AudioHardwareALSA(2364): Set PLAYBACK sample rate to 44100 HZ
12-20 14:41:34.007: DEBUG/AudioHardwareALSA(2364): Buffer size: 2048
12-20 14:41:34.007: DEBUG/AudioHardwareALSA(2364): Latency: 46439
然后在播放要发送到谷歌服务器的文件时,aInfo.SampleRate = 8000
12-20 14:41:36.152: DEBUG/(2364): PV_Wav_Parser::InitWavParser
12-20 14:41:36.152: DEBUG/(2364): File open Succes
12-20 14:41:36.152: DEBUG/(2364): File SEEK End Succes
...
12-20 14:41:36.152: DEBUG/(2364): PV_Wav_Parser::ReadData
12-20 14:41:36.152: DEBUG/(2364): Data Read buff = RIFF?
12-20 14:41:36.152: DEBUG/(2364): Data Read = RIFF?
12-20 14:41:36.152: DEBUG/(2364): PV_Wav_Parser::ReadData
12-20 14:41:36.152: DEBUG/(2364): Data Read buff = fmt
...
12-20 14:41:36.152: DEBUG/(2364): PVWAVPARSER_OK
12-20 14:41:36.156: DEBUG/(2364): aInfo.AudioFormat = 1
12-20 14:41:36.156: DEBUG/(2364): aInfo.NumChannels = 1
12-20 14:41:36.156: DEBUG/(2364): aInfo.SampleRate = 8000
12-20 14:41:36.156: DEBUG/(2364): aInfo.ByteRate = 16000
12-20 14:41:36.156: DEBUG/(2364): aInfo.BlockAlign = 2
12-20 14:41:36.156: DEBUG/(2364): aInfo.BitsPerSample = 16
12-20 14:41:36.156: DEBUG/(2364): aInfo.BytesPerSample = 2
12-20 14:41:36.156: DEBUG/(2364): aInfo.NumSamples = 2258
那么,我该如何找到正确的参数以便将音频缓冲区保存为高质量的wav音频文件?