Sound recognition in Android

14
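I want my Android app to recognize sounds. For example, I want to know whether the sound coming from the microphone is a clap, a knock, or something else.
Do I need to use math, or can I just use a library?
If there are any libraries for sound analysis, please let me know. Thanks.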

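Take a look at this post: https://dev59.com/uXE95IYBdhLWcg3wkeqq - coder
Yes, I already know about the AudioRecord class. Its read() method returns raw data that has to be analyzed with math. But is there a third-party API that can analyze sound without my having to do the math myself? - Elephant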
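To give a sense of the "math" that raw AudioRecord data calls for, here is a minimal sketch in plain Java that measures the loudness of one buffer of 16-bit PCM samples, such as the `short[]` filled by `AudioRecord.read()`. The class and method names (`PcmLevel`, `peak`, `rms`, `dbfs`) are made up for illustration, and this only measures level; it does not identify what kind of sound was heard.

```java
// Illustrative helper, not part of any library. No Android dependencies,
// so it can be tried on a desktop JVM.
final class PcmLevel {
    private PcmLevel() {}

    /** Largest absolute sample value among the first count samples. */
    static int peak(short[] samples, int count) {
        int max = 0;
        for (int i = 0; i < count; i++) {
            // Widen to int first so that -32768 does not overflow.
            int v = Math.abs((int) samples[i]);
            if (v > max) {
                max = v;
            }
        }
        return max;
    }

    /** Root-mean-square amplitude of the first count samples. */
    static double rms(short[] samples, int count) {
        if (count == 0) {
            return 0.0;
        }
        double sumSquares = 0.0;
        for (int i = 0; i < count; i++) {
            sumSquares += (double) samples[i] * samples[i];
        }
        return Math.sqrt(sumSquares / count);
    }

    /** Level in decibels relative to full scale (32768); negative infinity for silence. */
    static double dbfs(double rms) {
        return 20.0 * Math.log10(rms / 32768.0);
    }

    public static void main(String[] args) {
        // Stand-in for a buffer filled by AudioRecord.read(buffer, 0, buffer.length).
        short[] buffer = {1000, -2000, 3000, -4000};
        double level = rms(buffer, buffer.length);
        System.out.println("peak = " + peak(buffer, buffer.length));
        System.out.println("rms  = " + level);
        System.out.println("dBFS = " + dbfs(level));
    }
}
```

On a device, `count` would be the number of samples that `read()` reports it actually delivered, which can be smaller than the buffer length.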
4 Answers

11

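The musicg library is useful for whistle detection. For claps I would not recommend it, because it reacts to every loud sound (even speech).

For detecting claps and other percussive sounds I recommend TarsosDSP. It has a simple API and rich functionality (pitch detection and so on). For clap detection you can use something like the following (if you use TarsosDSPAndroid-v3):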

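// SAMPLE_RATE, BUFFER_SIZE and BUFFER_OVERLAP are constants defined elsewhere in your class.
MicrophoneAudioDispatcher mDispatcher = new MicrophoneAudioDispatcher((int) SAMPLE_RATE, BUFFER_SIZE, BUFFER_OVERLAP);
double threshold = 8;     // valid range: 0-20
double sensitivity = 20;  // valid range: 0-100
// The sample rate and buffer size given here should match the dispatcher's.
mPercussionDetector = new PercussionOnsetDetector(22050, 1024,
        new OnsetHandler() {

            @Override
            public void handleOnset(double time, double salience) {
                // Called each time the detector reports a percussive onset.
                Log.d(TAG, "Clap detected!");
            }
        }, sensitivity, threshold);
mDispatcher.addAudioProcessor(mPercussionDetector);
// Run the dispatcher on its own thread so it does not block the caller.
new Thread(mDispatcher).start();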

You can tune the detector by adjusting the sensitivity (0-100) and the threshold (0-20).

Good luck!


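I can't detect claps with this program; it only detects whistles... Can you help me? I want to detect claps, whistles, and finger snaps. - Arpit Patel
@ArpitPatel, did you manage to detect whistles with the musicg API? I'm getting an error, please help me. http://stackoverflow.com/questions/37925382/detectionapi-supports-mono-wav-only - Sagar Nayak
Can I get a library for the following requirement? https://android.stackexchange.com/questions/237271/sound-detection-library-for-android-without-using-mic - KIRAN K J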

4

2
You don't need math and you don't need AudioRecord. Just check MediaRecorder.getMaxAmplitude() every 1000 milliseconds.
This code and this code may be helpful.
Here is some of the code you need.
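import java.io.IOException;

import android.content.Context;
import android.media.MediaRecorder;
import android.util.Log;

// AudioUtil, AmplitudeClipListener and RecordingFailedException are helper
// types that are not shown in this answer.
public class Clapper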
{
    private static final String TAG = "Clapper";

    private static final long DEFAULT_CLIP_TIME = 1000;
    private long clipTime = DEFAULT_CLIP_TIME;
    private AmplitudeClipListener clipListener;

    private boolean continueRecording;

    /**
     * how much louder than the starting amplitude a sound must be to count
     * as a clap; 10000, 18000 and 25000 are good values
     */
    private int amplitudeThreshold;

    /**
     * requires only a little noise from the user to trigger; background
     * noise may trigger it
     */
    public static final int AMPLITUDE_DIFF_LOW = 10000;
    public static final int AMPLITUDE_DIFF_MED = 18000;
    /**
     * requires a lot of noise by the user to trigger. background noise isn't
     * likely to be this loud
     */
    public static final int AMPLITUDE_DIFF_HIGH = 25000;

    private static final int DEFAULT_AMPLITUDE_DIFF = AMPLITUDE_DIFF_MED;

    private MediaRecorder recorder;

    private String tmpAudioFile;

    public Clapper() throws IOException
    {
        this(DEFAULT_CLIP_TIME, "/tmp.3gp", DEFAULT_AMPLITUDE_DIFF, null, null);
    }

    public Clapper(long snipTime, String tmpAudioFile,
            int amplitudeDifference, Context context, AmplitudeClipListener clipListener)
            throws IOException
    {
        this.clipTime = snipTime;
        this.clipListener = clipListener;
        this.amplitudeThreshold = amplitudeDifference;
        this.tmpAudioFile = tmpAudioFile;
    }

    public boolean recordClap()
    {
        Log.d(TAG, "record clap");
        boolean clapDetected = false;

        try
        {
            recorder = AudioUtil.prepareRecorder(tmpAudioFile);
        }
        catch (IOException io)
        {
            Log.d(TAG, "failed to prepare recorder ", io);
            throw new RecordingFailedException("failed to create recorder", io);
        }

        recorder.start();
        // the first call to getMaxAmplitude() returns 0 and starts the
        // measurement window
        int startAmplitude = recorder.getMaxAmplitude();
        Log.d(TAG, "starting amplitude: " + startAmplitude);

        do
        {
            Log.d(TAG, "waiting while recording...");
            waitSome();
            int finishAmplitude = recorder.getMaxAmplitude();
            if (clipListener != null)
            {
                clipListener.heard(finishAmplitude);
            }

            int ampDifference = finishAmplitude - startAmplitude;
            if (ampDifference >= amplitudeThreshold)
            {
                Log.d(TAG, "heard a clap!");
                clapDetected = true;
            }
            Log.d(TAG, "finishing amplitude: " + finishAmplitude + " diff: "
                    + ampDifference);
        } while (continueRecording || !clapDetected);

        Log.d(TAG, "stopped recording");
        done();

        return clapDetected;
    }

    private void waitSome()
    {
        try
        {
            // wait a while
            Thread.sleep(clipTime);
        } catch (InterruptedException e)
        {
            Log.d(TAG, "interrupted");
        }
    }

    /**
     * need to call this when completely done with recording
     */
    public void done()
    {
        Log.d(TAG, "stop recording");
        if (recorder != null)
        {
            if (isRecording())
            {
                stopRecording();
            }
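            // now stop the media recorder and free its resources
            recorder.stop();
            recorder.release();
            // clear the reference so a second call to done() is a no-op
            recorder = null;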
        }
    }

    public boolean isRecording()
    {
        return continueRecording;
    }

    public void stopRecording()
    {
        continueRecording = false;
    }
}

4
Your sample code will react to any loud sound, not just clapping. It can't recognize what kind of sound it is. Am I right? - Elephant
1
This code can only tell loud noise from quiet noise, based on a threshold. It is very simple, but it is useful for many applications. - gregm

1

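I know this post is a year old, but I stumbled across it. I'm pretty sure that general, open-domain sound recognition is not a solved problem. So you won't find any library on Android that does what you want, because such code doesn't exist yet. If you pick a restricted domain, you can train a classifier to recognize the kinds of sounds you're interested in, but that takes a lot of math and a lot of examples of each potential sound. It would be very cool if the library you want existed, but as far as I know the technology isn't there yet.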
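As a rough illustration of the "restricted domain" route, the sketch below trains a toy nearest-centroid classifier on two hand-picked features, zero-crossing rate and RMS level, using synthetic tones in place of real recordings. Everything in it (the `TinySoundClassifier` name, the choice of features, the `tone` generator) is an assumption made for this example. Telling a clap from a knock would very likely need richer spectral features and many real recordings of each sound.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy example only: names and feature choices are illustrative assumptions.
final class TinySoundClassifier {
    // One mean feature vector (centroid) per label.
    private final Map<String, double[]> centroids = new LinkedHashMap<>();

    /** Returns {zero-crossing rate, RMS normalized to the range 0..1}. */
    static double[] features(short[] samples) {
        int crossings = 0;
        double sumSquares = 0.0;
        for (int i = 0; i < samples.length; i++) {
            sumSquares += (double) samples[i] * samples[i];
            // Count a crossing whenever the sign differs from the previous sample.
            if (i > 0 && (samples[i] >= 0) != (samples[i - 1] >= 0)) {
                crossings++;
            }
        }
        double zcr = samples.length > 1 ? (double) crossings / (samples.length - 1) : 0.0;
        double rms = samples.length > 0 ? Math.sqrt(sumSquares / samples.length) / 32768.0 : 0.0;
        return new double[] {zcr, rms};
    }

    /** Stores the average feature vector of the labeled examples. */
    void train(String label, List<short[]> examples) {
        double[] mean = new double[2];
        for (short[] example : examples) {
            double[] f = features(example);
            mean[0] += f[0];
            mean[1] += f[1];
        }
        mean[0] /= examples.size();
        mean[1] /= examples.size();
        centroids.put(label, mean);
    }

    /** Returns the label whose centroid is closest, or null if untrained. */
    String classify(short[] samples) {
        double[] f = features(samples);
        String best = null;
        double bestDistance = Double.MAX_VALUE;
        for (Map.Entry<String, double[]> entry : centroids.entrySet()) {
            double d0 = f[0] - entry.getValue()[0];
            double d1 = f[1] - entry.getValue()[1];
            double distance = d0 * d0 + d1 * d1; // squared Euclidean distance
            if (distance < bestDistance) {
                bestDistance = distance;
                best = entry.getKey();
            }
        }
        return best;
    }

    /** Generates a sine tone; stands in for a real recording in this sketch. */
    static short[] tone(double freqHz, double amplitude, int sampleRate, int length) {
        short[] out = new short[length];
        for (int i = 0; i < length; i++) {
            double value = amplitude * 32767 * Math.sin(2 * Math.PI * freqHz * i / sampleRate);
            out[i] = (short) Math.round(value);
        }
        return out;
    }

    public static void main(String[] args) {
        TinySoundClassifier classifier = new TinySoundClassifier();
        classifier.train("low", Arrays.asList(tone(200, 0.5, 22050, 2048), tone(250, 0.5, 22050, 2048)));
        classifier.train("high", Arrays.asList(tone(3000, 0.5, 22050, 2048), tone(3500, 0.5, 22050, 2048)));
        System.out.println(classifier.classify(tone(220, 0.5, 22050, 2048)));
        System.out.println(classifier.classify(tone(3200, 0.5, 22050, 2048)));
    }
}
```

The two features are not on the same scale, so a real classifier would normally normalize them before measuring distance. That step is left out here to keep the sketch short.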


Content provided by Stack Overflow.