How do I process microphone input in real time?

Question

winapi audio signal-processing microphone directsound

3

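I'm starting to put together a proof of concept for an idea, and I need some guidance on how to begin.

I need to sample the microphone input and process the signal in real time (think Auto-Tune, but working live), as opposed to "recording" for a while.

What I'm building is a "mic input to MIDI converter", so it needs to respond quite fast.

I investigated a bit online, and apparently the way to go is either DirectSound or the WaveIn* API functions. Now, according to what I read, the WaveIn API lets me fill a buffer of a certain size, which is fine for recording and post-processing, but I'm wondering... how do I do real-time processing?

Do I use 10 ms buffers, keep a circular 50 ms or 100 ms array myself, and have a function trigger the analysis every 10 ms? (Only the newest 10 ms would be new.)

Am I missing something here?

Also, how is this done with DirectSound? Does it offer more powerful capabilities than the regular Win32 API?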

- Daniel Magliola

1 Answer

- Stu Mackellar · Accepted Answer

Both DirectSound and the Wave API ultimately give you buffers of audio data that you can process. The size of these buffers can be adjusted, but realistically you need to keep latency under 10 milliseconds for useful real-time processing. That means processing the data within 10 milliseconds of it arriving in the buffer, minus the time between the audio reaching the hardware and reaching the buffer, which depends on the driver. For this reason, I'd recommend processing no more than 5 milliseconds of data at a time.
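To put numbers on the 5 millisecond figure, here is a minimal Python sketch of the buffer-size arithmetic. The function names and the default format (16-bit mono) are assumptions chosen for illustration; they are not part of either API.

```python
def frames_per_buffer(sample_rate_hz, buffer_ms):
    """Number of sample frames covering buffer_ms of audio (rounded down)."""
    return int(sample_rate_hz * buffer_ms / 1000)


def buffer_bytes(sample_rate_hz, buffer_ms, channels=1, bits_per_sample=16):
    """Size in bytes of a PCM buffer holding buffer_ms of audio."""
    bytes_per_frame = channels * (bits_per_sample // 8)
    return frames_per_buffer(sample_rate_hz, buffer_ms) * bytes_per_frame


if __name__ == "__main__":
    # 5 ms of 44.1 kHz, 16-bit mono audio
    print(frames_per_buffer(44100, 5), buffer_bytes(44100, 5))
```

At 44.1 kHz, 5 ms is only about 220 frames, so whatever analysis runs per buffer has to finish well inside that window.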

The main difference between the two is that with DirectSound, you allocate a circular buffer that is filled by the DirectSound audio driver, while the Wave API takes a queue of pre-allocated WAVEHDR buffers that are filled, returned to the app, and then recycled. Both APIs have various notification methods, such as window messages or events. However, for low-latency processing, it's best to maintain a dedicated streaming thread and wait for new data to arrive.
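The buffer-recycling model with a dedicated streaming thread can be sketched without any Windows calls. The Python simulation below is an illustration only: `fake_driver`, `streaming_thread`, and `run_capture` are hypothetical names, and the two queues stand in for the pool of pre-allocated buffers and for the notifications a real capture API would deliver.

```python
import queue
import threading


def fake_driver(free_buffers, filled_buffers, total_blocks):
    """Stand-in for the audio driver: fill a free buffer, hand it to the app."""
    for block_index in range(total_blocks):
        buf = free_buffers.get()      # waits if the app has not recycled one yet
        for i in range(len(buf)):
            buf[i] = block_index      # fake "audio": every sample is the block number
        filled_buffers.put(buf)
    filled_buffers.put(None)          # sentinel meaning "capture stopped"


def streaming_thread(free_buffers, filled_buffers, results):
    """Dedicated thread: wait for data, process it, recycle the buffer."""
    while True:
        buf = filled_buffers.get()    # blocks until new data arrives
        if buf is None:
            break
        results.append(buf[0])        # placeholder for the real processing
        free_buffers.put(buf)         # return the buffer to the pool


def run_capture(num_buffers=4, frames=220, total_blocks=20):
    """Run the simulated capture and return one result per processed block."""
    free_buffers = queue.Queue()
    filled_buffers = queue.Queue()
    for _ in range(num_buffers):
        free_buffers.put([0] * frames)   # pre-allocate the buffer pool
    results = []
    worker = threading.Thread(
        target=streaming_thread, args=(free_buffers, filled_buffers, results))
    driver = threading.Thread(
        target=fake_driver, args=(free_buffers, filled_buffers, total_blocks))
    worker.start()
    driver.start()
    driver.join()
    worker.join()
    return results
```

If the processing step is slower than the driver, the free queue runs dry and the simulated driver stalls. With real hardware that situation would typically mean dropped audio, which is why the pool needs more than one buffer.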

For various reasons, DirectSound is recommended over the Wave API for new development - achieving lower latency will certainly be easier.

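Whichever capture method you choose, once you have the data you simply pass it to your processing algorithm and wait for the next buffer to be ready. As long as you can process the data faster than it arrives, you'll have your (pseudo) real-time analysis.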
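The sliding-window idea from the question (short blocks arriving, analysis over a longer recent history) fits this loop naturally. Below is a minimal Python sketch under stated assumptions: `SlidingWindow`, `rms`, and `process_stream` are hypothetical names, the blocks are plain lists of floats rather than real capture buffers, and RMS level is only a placeholder for a real analysis such as pitch detection.

```python
import math
import time
from collections import deque


class SlidingWindow:
    """Keeps only the most recent window_frames samples."""

    def __init__(self, window_frames):
        self._samples = deque(maxlen=window_frames)

    def push(self, block):
        """Append a new block; the oldest samples fall off automatically."""
        self._samples.extend(block)
        return list(self._samples)


def rms(samples):
    """Root-mean-square level of a list of samples (0.0 if empty)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def process_stream(blocks, window_frames, block_seconds):
    """Analyze each block as it arrives and count blocks processed too slowly."""
    window = SlidingWindow(window_frames)
    levels = []
    overruns = 0
    for block in blocks:
        start = time.perf_counter()
        levels.append(rms(window.push(block)))
        # Falling behind: processing took longer than the block's duration.
        if time.perf_counter() - start > block_seconds:
            overruns += 1
    return levels, overruns
```

With 10 ms blocks and a 50 ms window, `window_frames` would be five times the block length. A non-zero `overruns` count signals that the analysis cannot keep up with the incoming data.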

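There are also alternative APIs that may be more suitable. Have a look at ASIO, Kernel Streaming (XP only - I wouldn't bother), and the Core Audio APIs introduced in Vista.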