Android: Encoding audio and video with MediaCodec

7
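I am trying to encode video from the camera and audio from the microphone using MediaCodec and MediaMuxer. While recording, I use OpenGL to overlay text on the image.
I used these classes as examples: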

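I wrote a main class that performs the encoding. It spawns two threads, one to record audio and one to record video. With both threads running it does not work (the generated file is invalid), but if I comment out either thread (audio or video) and set TRACK_COUNT to 1, it works fine. Here is the code of the main class: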

import android.graphics.SurfaceTexture;
import android.media.AudioFormat;
import android.media.AudioRecord;
import android.media.MediaCodec;
import android.media.MediaCodecInfo;
import android.media.MediaFormat;
import android.media.MediaMuxer;
import android.media.MediaRecorder;

import com.google.common.base.Throwables;

import java.io.IOException;
import java.nio.ByteBuffer;

import static com.google.common.base.Preconditions.checkNotNull;

/**
 * Class for recording a reply including a text message.
 */
public class ReplyRecorder {
    // Encoding state
    // volatile: written by stop() on the caller's thread, read by both encoder threads
    private volatile boolean encoding;
    long startWhen;

    // Muxer
    private static final int TRACK_COUNT = 2;
    private Muxer mMuxer;

    // Video
    private static final String VIDEO_MIME_TYPE = "video/avc"; // H.264 Advanced Video Coding
    private static final int FRAME_RATE = 15;                  // 15 fps
    private static final int IFRAME_INTERVAL = 10;             // 10 seconds between I-frames
    private static final int BIT_RATE = 2000000;

    private Encoder mVideoEncoder;
    private CodecInputSurface mInputSurface;

    private SurfaceTextureManager mStManager;

    // Audio
    private static final String AUDIO_MIME_TYPE = "audio/mp4a-latm";
    private static final int SAMPLE_RATE = 44100;
    private static final int SAMPLES_PER_FRAME = 1024;
    private static final int CHANNEL_CONFIG = AudioFormat.CHANNEL_IN_MONO;
    private static final int AUDIO_FORMAT = AudioFormat.ENCODING_PCM_16BIT;

    private Encoder mAudioEncoder;
    private AudioRecord audioRecord;

    public void start(final CameraManager cameraManager, final String messageText, final String filePath) {
        checkNotNull(cameraManager);
        checkNotNull(messageText);
        checkNotNull(filePath);

        try {
            // Create a MediaMuxer.  We can't add the video track and start() the muxer here,
            // because our MediaFormat doesn't have the Magic Goodies.  These can only be
            // obtained from the encoder after it has started processing data.
            mMuxer = new Muxer(new MediaMuxer(filePath, MediaMuxer.OutputFormat.MUXER_OUTPUT_MPEG_4), TRACK_COUNT);
            startWhen = System.nanoTime();
            encoding = true;
            new Thread(new Runnable() {
                @Override
                public void run() {
                    initVideoComponents(cameraManager, messageText);
                    encodeVideo(cameraManager);
                }
            }).start();
            new Thread(new Runnable() {
                @Override
                public void run() {
                    initAudioComponents();
                    encodeAudio();
                }
            }).start();
        } catch (IOException e) {
            release();
            throw Throwables.propagate(e);
        }
    }

    private void initVideoComponents(CameraManager cameraManager,
                                     String messageText) {
        try {
            MediaFormat format = MediaFormat.createVideoFormat(VIDEO_MIME_TYPE, cameraManager.getEncWidth(), cameraManager.getEncHeight());

            // Set some properties.  Failing to specify some of these can cause the MediaCodec
            // configure() call to throw an unhelpful exception.
            format.setInteger(MediaFormat.KEY_COLOR_FORMAT,
                    MediaCodecInfo.CodecCapabilities.COLOR_FormatSurface);
            format.setInteger(MediaFormat.KEY_BIT_RATE, BIT_RATE);
            format.setInteger(MediaFormat.KEY_FRAME_RATE, FRAME_RATE);
            format.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, IFRAME_INTERVAL);

            // Create a MediaCodec encoder, and configure it with our format.  Get a Surface
            // we can use for input and wrap it with a class that handles the EGL work.
            //
            // If you want to have two EGL contexts -- one for display, one for recording --
            // you will likely want to defer instantiation of CodecInputSurface until after the
            // "display" EGL context is created, then modify the eglCreateContext call to
            // take eglGetCurrentContext() as the share_context argument.
            mVideoEncoder = new Encoder(VIDEO_MIME_TYPE, format, mMuxer);
            mInputSurface = new CodecInputSurface(mVideoEncoder.getEncoder().createInputSurface());
            mVideoEncoder.getEncoder().start();

            mInputSurface.makeCurrent();
            mStManager = new SurfaceTextureManager(messageText, cameraManager.getEncWidth(), cameraManager.getEncHeight());
        } catch (RuntimeException e) {
            releaseVideo();
            throw e;
        }
    }

    private void encodeVideo(CameraManager cameraManager) {
        try {

            SurfaceTexture st = mStManager.getSurfaceTexture();
            cameraManager.record(st);

            while (encoding) {
                // Feed any pending encoder output into the muxer.
                mVideoEncoder.drain(false);

                // Acquire a new frame of input, and render it to the Surface.  If we had a
                // GLSurfaceView we could switch EGL contexts and call drawImage() a second
                // time to render it on screen.  The texture can be shared between contexts by
                // passing the GLSurfaceView's EGLContext as eglCreateContext()'s share_context
                // argument.
                mStManager.awaitNewImage();
                mStManager.drawImage();

                // Set the presentation time stamp from the SurfaceTexture's time stamp.  This
                // will be used by MediaMuxer to set the PTS in the video.
                mInputSurface.setPresentationTime(st.getTimestamp() - startWhen);

                // Submit it to the encoder.  The eglSwapBuffers call will block if the input
                // is full, which would be bad if it stayed full until we dequeued an output
                // buffer (which we can't do, since we're stuck here).  So long as we fully drain
                // the encoder before supplying additional input, the system guarantees that we
                // can supply another frame without blocking.
                mInputSurface.swapBuffers();
            }

            // send end-of-stream to encoder, and drain remaining output
            mVideoEncoder.drain(true);
        } finally {
            releaseVideo();
        }
    }

    private void initAudioComponents() {
        try {
            int min_buffer_size = AudioRecord.getMinBufferSize(SAMPLE_RATE, CHANNEL_CONFIG, AUDIO_FORMAT);
            int buffer_size = SAMPLES_PER_FRAME * 10;
            if (buffer_size < min_buffer_size)
                buffer_size = ((min_buffer_size / SAMPLES_PER_FRAME) + 1) * SAMPLES_PER_FRAME * 2;

            audioRecord = new AudioRecord(
                    MediaRecorder.AudioSource.MIC,       // source
                    SAMPLE_RATE,                         // sample rate, hz
                    CHANNEL_CONFIG,                      // channels
                    AUDIO_FORMAT,                        // audio format
                    buffer_size);                        // buffer size (bytes)

            /////////////////

            MediaFormat format = new MediaFormat();
            format.setString(MediaFormat.KEY_MIME, AUDIO_MIME_TYPE);
            format.setInteger(MediaFormat.KEY_AAC_PROFILE, MediaCodecInfo.CodecProfileLevel.AACObjectLC);
            format.setInteger(MediaFormat.KEY_SAMPLE_RATE, 44100);
            format.setInteger(MediaFormat.KEY_CHANNEL_COUNT, 1);
            format.setInteger(MediaFormat.KEY_BIT_RATE, 128000);
            format.setInteger(MediaFormat.KEY_MAX_INPUT_SIZE, 16384);

            mAudioEncoder = new Encoder(AUDIO_MIME_TYPE, format, mMuxer);
            mAudioEncoder.getEncoder().start();
        } catch (RuntimeException e) {
            releaseAudio();
            throw e;
        }
    }

    private void encodeAudio() {
        try {
            audioRecord.startRecording();
            while (encoding) {
                mAudioEncoder.drain(false);
                sendAudioToEncoder(false);
            }
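            // signalEndOfInputStream() only works for encoders fed from an input Surface, so
            // signal end-of-stream on a queued input buffer instead, then drain what is pending.
            sendAudioToEncoder(true);
            mAudioEncoder.drain(false);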
        } finally {
            releaseAudio();
        }
    }

    public void sendAudioToEncoder(boolean endOfStream) {
        // send current frame data to encoder
        ByteBuffer[] inputBuffers = mAudioEncoder.getEncoder().getInputBuffers();
        int inputBufferIndex = mAudioEncoder.getEncoder().dequeueInputBuffer(-1);
        if (inputBufferIndex >= 0) {
            ByteBuffer inputBuffer = inputBuffers[inputBufferIndex];
            inputBuffer.clear();
            long presentationTimeNs = System.nanoTime();
            int inputLength = audioRecord.read(inputBuffer, SAMPLES_PER_FRAME);
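            // Back-date the timestamp by the duration of the data just read.
            // read() returns a byte count; 16-bit mono PCM uses 2 bytes per sample.
            presentationTimeNs -= (inputLength / 2) * 1000000000L / SAMPLE_RATE;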

            long presentationTimeUs = (presentationTimeNs - startWhen) / 1000;
            if (endOfStream) {
                mAudioEncoder.getEncoder().queueInputBuffer(inputBufferIndex, 0, inputLength, presentationTimeUs, MediaCodec.BUFFER_FLAG_END_OF_STREAM);
            } else {
                mAudioEncoder.getEncoder().queueInputBuffer(inputBufferIndex, 0, inputLength, presentationTimeUs, 0);
            }
        }
    }

    public void stop() {
        encoding = false;
    }

    /**
     * Releases encoder resources.
     */
    public void release() {
        releaseVideo();
        releaseAudio();
    }

    private void releaseVideo() {
        if (mVideoEncoder != null) {
            mVideoEncoder.release();
            mVideoEncoder = null;
        }
        if (mInputSurface != null) {
            mInputSurface.release();
            mInputSurface = null;
        }
        if (mStManager != null) {
            mStManager.release();
            mStManager = null;
        }
        releaseMuxer();
    }

    private void releaseAudio() {
        if (audioRecord != null) {
            audioRecord.stop();
            audioRecord = null;
        }
        if (mAudioEncoder != null) {
            mAudioEncoder.release();
            mAudioEncoder = null;
        }
        releaseMuxer();
    }

    private void releaseMuxer() {
        if (mMuxer != null && mVideoEncoder == null && mAudioEncoder == null) {
            mMuxer.release();
            mMuxer = null;
        }
    }

    public boolean isRecording() {
        return mMuxer != null;
    }
}

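The class that wraps the muxer and waits until all tracks have been added before starting it looks like this (I added some synchronization for testing):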
import android.media.MediaCodec;
import android.media.MediaFormat;
import android.media.MediaMuxer;

import com.google.common.base.Throwables;

import java.nio.ByteBuffer;

import static com.google.common.base.Preconditions.checkNotNull;
import static com.google.common.base.Preconditions.checkState;

/**
 * Class responsible for muxing. Wraps a MediaMuxer.
 */
public class Muxer {
    private final MediaMuxer muxer;
    private final int totalTracks;
    private int trackCounter;

    public Muxer(MediaMuxer muxer, int totalTracks) {
        this.muxer = checkNotNull(muxer);
        this.totalTracks = totalTracks;
    }

    synchronized public int addTrack(MediaFormat format) {
        checkState(!isStarted(), "Muxer already started");
        int trackIndex = muxer.addTrack(format);
        trackCounter++;
        if (isStarted()) {
            muxer.start();
            notifyAll();
        } else {
            while (!isStarted()) {
                try {
                    wait();
                } catch (InterruptedException e) {
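                    // Restore the interrupt flag and actually throw, instead of looping again.
                    Thread.currentThread().interrupt();
                    throw Throwables.propagate(e);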
                }
            }
        }
        return trackIndex;
    }

    synchronized public void writeSampleData(int trackIndex, ByteBuffer byteBuf,
                                MediaCodec.BufferInfo bufferInfo) {
        checkState(isStarted(), "Muxer not started");
        muxer.writeSampleData(trackIndex, byteBuf, bufferInfo);
    }

    public void release() {
        if (muxer != null) {
            try {
                muxer.stop();
            } catch (Exception e) {
            }
            muxer.release();
        }
    }

    private boolean isStarted() {
        return trackCounter == totalTracks;
    }
}

The class responsible for writing to the MediaCodec encoder is:

import android.media.MediaCodec;
import android.media.MediaFormat;

import com.google.common.base.Throwables;

import java.io.IOException;
import java.nio.ByteBuffer;

import static com.google.common.base.Preconditions.checkNotNull;
import static com.google.common.base.Preconditions.checkState;

/**
 * Class responsible for encoding.
 */
public class Encoder {
    private final MediaCodec encoder;
    private final Muxer muxer;
    private final MediaCodec.BufferInfo bufferInfo;
    private int trackIndex;


    public Encoder(String mimeType, MediaFormat format, Muxer muxer) {
        checkNotNull(mimeType);
        checkNotNull(format);
        checkNotNull(muxer);

        try {
            encoder = MediaCodec.createEncoderByType(mimeType);
            encoder.configure(format, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);

            this.muxer = muxer;
            bufferInfo = new MediaCodec.BufferInfo();
        } catch (IOException e) {
            throw Throwables.propagate(e);
        }
    }

    public MediaCodec getEncoder() {
        return encoder;
    }

    /**
     * Extracts all pending data from the encoder and forwards it to the muxer.
     * <p/>
     * If endOfStream is not set, this returns when there is no more data to drain.  If it
     * is set, we send EOS to the encoder, and then iterate until we see EOS on the output.
     * Calling this with endOfStream set should be done once, right before stopping the muxer.
     * <p/>
     * We're just using the muxer to get a .mp4 file (instead of a raw H.264 stream).
     */
    public void drain(boolean endOfStream) {
        final int TIMEOUT_USEC = 10000;

        if (endOfStream) {
            encoder.signalEndOfInputStream();
        }

        ByteBuffer[] encoderOutputBuffers = encoder.getOutputBuffers();
        while (true) {
            int encoderStatus = encoder.dequeueOutputBuffer(bufferInfo, TIMEOUT_USEC);
            if (encoderStatus == MediaCodec.INFO_TRY_AGAIN_LATER) {
                // no output available yet
                if (!endOfStream) {
                    break;      // out of while
                }
            } else if (encoderStatus == MediaCodec.INFO_OUTPUT_BUFFERS_CHANGED) {
                // not expected for an encoder
                encoderOutputBuffers = encoder.getOutputBuffers();
            } else if (encoderStatus == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
                // now that we have the Magic Goodies, start the muxer
                trackIndex = muxer.addTrack(encoder.getOutputFormat());
            } else if (encoderStatus < 0) {
                // let's ignore it
            } else {
                ByteBuffer encodedData = encoderOutputBuffers[encoderStatus];
                checkState(encodedData != null, "encoderOutputBuffer %s was null", encoderStatus);

                if ((bufferInfo.flags & MediaCodec.BUFFER_FLAG_CODEC_CONFIG) != 0) {
                    // The codec config data was pulled out and fed to the muxer when we got
                    // the INFO_OUTPUT_FORMAT_CHANGED status.  Ignore it.
                    bufferInfo.size = 0;
                }

                if (bufferInfo.size != 0) {
                    // adjust the ByteBuffer values to match BufferInfo (not needed?)
                    encodedData.position(bufferInfo.offset);
                    encodedData.limit(bufferInfo.offset + bufferInfo.size);

                    muxer.writeSampleData(trackIndex, encodedData, bufferInfo);
                }

                encoder.releaseOutputBuffer(encoderStatus, false);

                if ((bufferInfo.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM) != 0) {
                    break;      // out of while
                }
            }
        }
    }

    public void release() {
        if (encoder != null) {
            try {
                encoder.stop();
            } catch (Exception e) {
            }
            encoder.release();
        }
    }
}

Any ideas why it fails when both run at the same time?

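Actually, sometimes it does work. I get (bad) videos like this one: https://www.youtube.com/watch?v=NQWc2Zo2MXo - user992029
Did you ever find out what was causing the video to stutter? - Devsil
No, I actually gave up. I think it may be a processing-power issue, because I did something similar in batch mode and got good results. - user992029
That could be, although I don't think it should be a problem, since the GPU does most of the work for the video frames while the CPU handles the microphone recording. It might still occasionally miss or "skip" a frame, but overall it shouldn't matter. The other reason I doubt processing power is the cause is that the video shows the same problem regardless of resolution. In fact, I am close to pinning the problem down, and I believe it lies in the presentation times of the audio track. I will post my solution here once I have refined it. - Devsil
2 Answers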

5

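OK, I finally achieved what the original poster was ultimately after. The problem was, as I expected, that the timestamps generated for the audio track did not quite match the timestamps given to the video track.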

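My solution was to pass the surface timestamp that our VideoEncoder stores in its BufferInfo on to the AudioEncoder. Instead of computing timestamps from the thread's running time as the original poster did, I take the timestamp directly from the surface and use it as the timestamp in the AudioEncoder's BufferInfo. You must make sure your audio recorder's buffer is large enough to handle this, because audio will no longer be delivered in chunks paced by the sample rate but once per video frame. The required size is easy to work out.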
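To make the buffer sizing concrete, here is a minimal sketch of the arithmetic, assuming the settings from the question (44100 Hz, mono, 16-bit PCM, 15 fps). The class and method names are hypothetical and not taken from my actual code.

```java
/** Hypothetical helper: how many PCM bytes accumulate during one video frame. */
final class AudioChunkSize {

    /**
     * @param sampleRate     audio sample rate in Hz
     * @param channels       number of audio channels
     * @param bytesPerSample 2 for 16-bit PCM
     * @param frameRate      video frames per second
     */
    static int bytesPerVideoFrame(int sampleRate, int channels, int bytesPerSample, int frameRate) {
        // Round up so the buffer is never undersized when the division is not exact.
        int samplesPerVideoFrame = (sampleRate + frameRate - 1) / frameRate;
        return samplesPerVideoFrame * channels * bytesPerSample;
    }

    public static void main(String[] args) {
        // Settings assumed from the question: 44100 Hz, mono, 16-bit, 15 fps.
        System.out.println(bytesPerVideoFrame(44100, 1, 2, 15)); // prints 5880
    }
}
```

With these numbers, one chunk per video frame is 5880 bytes, which is below both the 10240-byte default AudioRecord buffer and the 16384-byte KEY_MAX_INPUT_SIZE requested in the question. A lower frame rate or stereo input would need the check repeated.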

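To be clear, audio and video encoding still happen on separate threads, but whenever I call mVideoEncoder.onFrameAvailable to send the video encoder thread a message carrying the surface timestamp, I do the same for the AudioEncoder thread, using the SurfaceTexture's timestamp. This yields a fully functional MP4 with both audio and video tracks and without the stutter seen originally. I hope this helps anyone who has, or has had, a similar problem.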
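The idea of sharing the surface timestamp between the two threads can be sketched roughly as follows. This is not my production code: SharedPresentationClock and its methods are hypothetical names, and bumping a repeated timestamp by one microsecond is an assumption made here so that the audio track's timestamps never repeat or go backwards.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical holder for the most recent SurfaceTexture timestamp. */
final class SharedPresentationClock {
    // Written by the video thread, read by the audio thread.
    private final AtomicLong latestVideoPtsNs = new AtomicLong(-1);
    // Only touched by the audio thread.
    private long lastAudioPtsUs = -1;

    /** Video thread: call once per frame with the surface timestamp (nanoseconds). */
    void onVideoFrame(long surfaceTimestampNs) {
        latestVideoPtsNs.set(surfaceTimestampNs);
    }

    /**
     * Audio thread: timestamp in microseconds for the next audio buffer,
     * or -1 if no video frame has been seen yet.
     */
    long nextAudioPtsUs() {
        long ns = latestVideoPtsNs.get();
        if (ns < 0) {
            return -1;
        }
        long us = ns / 1000;
        // Assumption: keep audio timestamps strictly increasing within the track.
        if (us <= lastAudioPtsUs) {
            us = lastAudioPtsUs + 1;
        }
        lastAudioPtsUs = us;
        return us;
    }

    public static void main(String[] args) {
        SharedPresentationClock clock = new SharedPresentationClock();
        clock.onVideoFrame(66_666_000L);            // first frame at about 66.7 ms
        System.out.println(clock.nextAudioPtsUs()); // prints 66666
    }
}
```

In a recorder like the one in the question, the video loop would call onVideoFrame(st.getTimestamp()) and the audio loop would pass nextAudioPtsUs() to queueInputBuffer in place of a System.nanoTime()-based value.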


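Thanks, I'll give it a try! - user992029
@LautaroBrasseur Is this the correct answer? Did it work for you? - Inoy
Not sure, because I couldn't test it. The on-the-fly encoding requirement was dropped from the project. - user992029
@Inoy, which part of the code do you need? I use a bunch of large classes for this, so posting everything here is probably not what we want. If you can give me a specific requirement, I can try to answer it with some code. If you create a new question, please link it to me. Thanks. - Devsil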

0

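The problem may be that you call muxer.addTrack(encoder.getOutputFormat()) while the other thread has already started writing sample data (muxer.writeSampleData(trackIndex, encodedData, bufferInfo)). That causes an IllegalStateException in MediaMuxer, but you don't catch it; you only call releaseAudio() in the finally block.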

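  • You should try synchronizing the threads. Wait until both threads have called muxer.addTrack(encoder.getOutputFormat()), and only then let them write samples via muxer.writeSampleData(trackIndex, encodedData, bufferInfo).

  • Or run the audio encoding and the video encoding on the same thread.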
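The first option could be sketched as a small gate that every encoder thread passes through right after adding its track. This is only an illustration under assumptions: TrackGate, trackAdded, and runDemo are hypothetical names, and the Runnable stands in for the real MediaMuxer.start() call. The addTrack calls themselves would still need their own synchronization, as the question's Muxer already provides.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical gate: holds each encoder thread until all tracks are added and the muxer has started. */
final class TrackGate {
    private final AtomicInteger remainingTracks;
    private final CountDownLatch muxerStarted = new CountDownLatch(1);
    private final Runnable startMuxer;

    TrackGate(int totalTracks, Runnable startMuxer) {
        this.remainingTracks = new AtomicInteger(totalTracks);
        this.startMuxer = startMuxer;
    }

    /** Call right after adding a track; returns only once the muxer has been started. */
    void trackAdded() {
        if (remainingTracks.decrementAndGet() == 0) {
            startMuxer.run();          // exactly one thread observes zero
            muxerStarted.countDown();  // release the threads that arrived earlier
        }
        try {
            muxerStarted.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for muxer start", e);
        }
    }

    /** Demo with two plain threads standing in for the audio and video encoder threads. */
    static int runDemo() {
        AtomicInteger starts = new AtomicInteger();
        TrackGate gate = new TrackGate(2, starts::incrementAndGet);
        Thread audio = new Thread(gate::trackAdded);
        Thread video = new Thread(gate::trackAdded);
        audio.start();
        video.start();
        try {
            audio.join();
            video.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return starts.get();
    }

    public static void main(String[] args) {
        System.out.println("muxer starts: " + runDemo()); // prints "muxer starts: 1"
    }
}
```

Note that in this sketch, if the start action throws, the waiting threads are never released; real code would need to handle that case, for example by recording the failure before counting down.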


The first option is what you need if you are recording from the microphone/camera. With the second option you will run into stuttering. - JCutting8
