Decoding audio from memory - C++

4
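I have two functions:
  • A network socket function that fetches MP3 data and writes it to a file.
  • A function that decodes MP3 files.
However, I would rather have the data that is currently written to disk decoded in memory by the decode function.
My decode function looks like this, and it is initialized via: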
    avformat_open_input(AVCodecContext, filename, NULL, NULL) 

How can I read into the AVCodecContext from just a memory buffer, without using a filename?


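If I were you, I would rather use the libmp3lame library, which is more lightweight and easier to use for simple MP3 decoding. See https://dev59.com/YWs05IYBdhLWcg3wR_22. - SirDarius
Thanks, but I would like to support other formats later, such as AAC. - user2492388
2 Answers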

3

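I thought I would post some code to illustrate how to achieve this. I have tried to comment it but am pressed for time; still, it should all be fairly straightforward. The return values are based on rendering the associated message in 1337 speak as hex and converting that to decimal, and I have tried to keep the tone fairly light:)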

#include <iostream>
#include <string>

extern "C"
{
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
#include <libavutil/avutil.h>
}

std::string tooManyChannels =   "The audio stream (and its frames) has too many channels to fit in\n"
                                "frame->data. Therefore, to access the audio data, you need to use\n"
                                "frame->extended_data.\n"
                                "It is a planar store, so\neach channel is in a different element.\n"
                                " E.G.: frame->extended_data[0] has the data for channel 1\n"
                                "       frame->extended_data[1] has the data for channel 2\n"
                                "And so on.\n";

std::string nonPlanar = "Either the audio data is not planar, or there is not enough room in\n"
                        "frame->data to store all the channel data.  Either use\n"
                        "frame->data\n or \nframe->extended_data to access the audio data\n"
                        "both should just point to the same data in this instance.\n";

std::string information1 =  "If the frame is planar, each channel is in a separate element:\n"
                            "frame->data[0]/frame->extended_data[0] contains data for channel 1\n"
                            "frame->data[1]/frame->extended_data[1] contains data for channel 2\n";

std::string information2 =  "If the frame is in packed format( and therefore not planar),\n"
                            "then all the data is contained within:\n"
                            "frame->data[0]/frame->extended_data[0] \n"
                            "Similar to the manner in which some image formats have RGB(A) pixel data packed together,\n"
                            "rather than containing separate R G B (and A) data.\n";

void printAudioFrameInfo(const AVCodecContext* codecContext, const AVFrame* frame)
{
    /*
     This url: http://ffmpeg.org/doxygen/trunk/samplefmt_8h.html#af9a51ca15301871723577c730b5865c5
     contains information on the type you will need to utilise to access the audio data.
    */
    // format the tabs etc. in this string to suit your font, they line up for mine but may not for yours:)
    std::cout << "Audio frame info:\n"
              << "\tSample count:\t\t" << frame->nb_samples << '\n'
              << "\tChannel count:\t\t" << codecContext->channels << '\n'
              << "\tFormat:\t\t\t" << av_get_sample_fmt_name(codecContext->sample_fmt) << '\n'
              << "\tBytes per sample:\t" << av_get_bytes_per_sample(codecContext->sample_fmt) << '\n'
              << "\tPlanar storage format?:\t" << av_sample_fmt_is_planar(codecContext->sample_fmt) << '\n';


    std::cout << "frame->linesize[0] tells you the size (in bytes) of each plane\n";

    if (codecContext->channels > AV_NUM_DATA_POINTERS && av_sample_fmt_is_planar(codecContext->sample_fmt))
    {
        std::cout << tooManyChannels;
    }
    else
    {
        std::cout << nonPlanar;
    }
    std::cout << information1 << information2;
}

int main()
{
    // You can change the filename for any other filename/supported format
    std::string filename = "../my file.ogg";
    // Initialize FFmpeg
    av_register_all();

    AVFrame* frame = avcodec_alloc_frame();
    if (!frame)
    {
        std::cout << "Error allocating the frame.  Let's try again shall we?\n";
        return 666;  // fail at start: 666 = number of the beast
    }

    // you can change the file name to whatever you need:)
    AVFormatContext* formatContext = NULL;
    // avformat_open_input() expects a C string, so pass filename.c_str()
    if (avformat_open_input(&formatContext, filename.c_str(), NULL, NULL) != 0)
    {
        av_free(frame);
        std::cout << "Error opening file " << filename<< "\n";
        return 800; // can't open file.  800 = Boo!
    }

    if (avformat_find_stream_info(formatContext, NULL) < 0)
    {
        av_free(frame);
        avformat_close_input(&formatContext);
        std::cout << "Error finding the stream information.\nCheck your paths/connections and the details you supplied!\n";
        return 57005; // stream info error.  0xDEAD in hex is 57005 in decimal
    }

    // Find the audio stream
    AVCodec* cdc = nullptr;
    int streamIndex = av_find_best_stream(formatContext, AVMEDIA_TYPE_AUDIO, -1, -1, &cdc, 0);
    if (streamIndex < 0)
    {
        av_free(frame);
        avformat_close_input(&formatContext);
        std::cout << "Could not find any audio stream in the file.  Come on! I need data!\n";
        return 165; // no(0) (a)udio s(5)tream:  0A5 in hex = 165 in decimal
    }

    AVStream* audioStream = formatContext->streams[streamIndex];
    AVCodecContext* codecContext = audioStream->codec;
    codecContext->codec = cdc;

    if (avcodec_open2(codecContext, codecContext->codec, NULL) != 0)
    {
        av_free(frame);
        avformat_close_input(&formatContext);
        std::cout << "Couldn't open the context with the decoder.  I can decode but I need to have something to decode.\nAs I couldn't find anything I have surmised the decoded output is 0!\n (Well, can't have you thinking I am doing nothing, can we?)\n";
        return 1057; // can't find/open context 1057 = lost
    }

    std::cout << "This stream has " << codecContext->channels << " channels with a sample rate of " << codecContext->sample_rate << "Hz\n";
    std::cout << "The data presented in format: " << av_get_sample_fmt_name(codecContext->sample_fmt) << std::endl;

    AVPacket readingPacket;
    av_init_packet(&readingPacket);

    // Read the packets in a loop
    while (av_read_frame(formatContext, &readingPacket) == 0)
    {
        if (readingPacket.stream_index == audioStream->index)
        {
            AVPacket decodingPacket = readingPacket;

            // Audio packets can have multiple audio frames in a single packet
            while (decodingPacket.size > 0)
            {
                // Try to decode the packet into a frame(s)
                // Some frames rely on multiple packets, so we have to make sure the frame is finished
                // before utilising it
                int gotFrame = 0;
                int result = avcodec_decode_audio4(codecContext, frame, &gotFrame, &decodingPacket);

                if (result >= 0 && gotFrame)
                {
                    decodingPacket.size -= result;
                    decodingPacket.data += result;

                    // et voila! a decoded audio frame!
                    printAudioFrameInfo(codecContext, frame);
                }
                else
                {
                    decodingPacket.size = 0;
                    decodingPacket.data = nullptr;
                }
            }
        }

        // You MUST call av_free_packet() after each call to av_read_frame()
        // or you will leak so much memory on a large file you will need a memory-plumber!
        av_free_packet(&readingPacket);
    }

    // Some codecs will cause frames to be buffered in the decoding process. 
    // If the CODEC_CAP_DELAY flag is set, there can be buffered frames that need to be flushed
    // therefore flush them now....
    if (codecContext->codec->capabilities & CODEC_CAP_DELAY)
    {
        av_init_packet(&readingPacket);
        // Decode all the remaining frames in the buffer
        int gotFrame = 0;
        while (avcodec_decode_audio4(codecContext, frame, &gotFrame, &readingPacket) >= 0 && gotFrame)
        {
            // Again: a fully decoded audio frame!
            printAudioFrameInfo(codecContext, frame);
        }
    }

    // Clean up! (unless you have a quantum memory machine with infinite RAM....)
    av_free(frame);
    avcodec_close(codecContext);
    avformat_close_input(&formatContext);

    return 0;  // success!!!!!!!!
}
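
To make the planar versus packed messages above more concrete, here is a minimal sketch that copies planar samples into one interleaved (packed) buffer. It does not depend on FFmpeg: `interleavePlanar` is a hypothetical helper, and treating `frame->extended_data` as an array of `float` planes is an assumption that would only hold for a planar float format such as AV_SAMPLE_FMT_FLTP.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of FFmpeg): copies planar float samples into
// one interleaved (packed) buffer. With FFmpeg, `planes` would correspond to
// frame->extended_data cast to float pointers for a planar float format;
// that mapping is an assumption of this sketch.
std::vector<float> interleavePlanar(const float* const* planes,
                                    int channels, int samplesPerChannel)
{
    std::vector<float> packed;
    if (channels <= 0 || samplesPerChannel <= 0)
        return packed;  // nothing to copy

    packed.resize(static_cast<std::size_t>(channels) * samplesPerChannel);
    for (int s = 0; s < samplesPerChannel; ++s)
    {
        for (int c = 0; c < channels; ++c)
        {
            // Packed layout: sample s of every channel sits side by side.
            packed[static_cast<std::size_t>(s) * channels + c] = planes[c][s];
        }
    }
    return packed;
}
```

For two channels the result is ordered L0 R0 L1 R1 and so on, which is the layout a packed frame would already have in `frame->data[0]`.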

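Hope this helps. If you need more information, I will do my best to help:) There is also some very good tutorial material at dranger.com that may be useful to you.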

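That is audio decoding, and my decoding part looks similar to this. But another function fetches the MP3 stream from the network and writes it to a file: f.write(buf,bytes); I want to feed this "buf" into my decode function, so that the data coming from the network is written straight to disk as a raw audio file. - user2492388
Ah, I see:) I will look into it and get back to you:) - GMasucci
Thanks. My workaround would be to have the network function write the data to a file while the decode function reads it, but that is not possible :/ - user2492388
To achieve what you describe, you can set up two file handles: one read-only and one read-write, so that one can dump data into the file while the other reads from it. That way you read through one handle and write through the other, each at its own position, without any problems. - GMasucci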
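
A rough sketch of the two-handle idea, using standard C++ streams only. The helper names `appendChunk` and `readNewData` are hypothetical, and the sketch assumes the platform lets one file be open for writing and reading at the same time; it illustrates the approach and is not tested against a live socket.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: appends one chunk (e.g. bytes received from the socket)
// and flushes so that a second, independent handle can see the data.
void appendChunk(std::ofstream& writer, const std::string& chunk)
{
    writer.write(chunk.data(), static_cast<std::streamsize>(chunk.size()));
    writer.flush();
}

// Hypothetical helper: returns everything written since `offset` and advances
// `offset`. The reader keeps its own position, independent of the writer.
std::string readNewData(std::ifstream& reader, std::streamoff& offset)
{
    reader.clear();                       // drop any earlier EOF state
    reader.seekg(offset, std::ios::beg);  // resume where the last read stopped

    std::string out;
    char buf[4096];
    // read() fails on a short read, so also accept a non-empty partial chunk.
    while (reader.read(buf, sizeof buf) || reader.gcount() > 0)
        out.append(buf, static_cast<std::size_t>(reader.gcount()));

    offset += static_cast<std::streamoff>(out.size());
    return out;
}
```

The writer would be opened as `std::ofstream` and the reader as `std::ifstream` on the same path, both in binary mode. The decoder side then has to cope with reads that return nothing yet, which is one reason a custom in-memory I/O context may be the simpler route.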
OK, I have now found this example: http://www.codeproject.com/Tips/489450/Creating-Custom-FFmpeg-IO-Context but I have my char buf and int bufsize; how do I put these two in and get my new AVFormatContext? - user2492388

1

How do I set the pb field? There is no documentation, and I cannot find a really good description of AVIOContext. - user2492388
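
One possible way to approach the pb question is a read callback over the in-memory buffer, handed to `avio_alloc_context()`. The sketch below compiles without FFmpeg: `MemorySource`, `readFromMemory` and `kEndOfData` are hypothetical names, and `kEndOfData` is only a placeholder for FFmpeg's `AVERROR_EOF`. The FFmpeg wiring is shown as comments, untested here and with error handling left out.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Placeholder so the sketch compiles on its own. When building against
// FFmpeg, return AVERROR_EOF instead (this value is NOT FFmpeg's constant).
const int kEndOfData = -1;

// Hypothetical state handed to the callback through the `opaque` pointer.
struct MemorySource
{
    const std::uint8_t* data;  // the bytes received from the network
    int size;                  // total number of bytes in `data`
    int pos;                   // how many bytes have been consumed so far
};

// Shaped like the read_packet callback of avio_alloc_context():
//   int (*read_packet)(void* opaque, uint8_t* buf, int buf_size)
int readFromMemory(void* opaque, std::uint8_t* buf, int bufSize)
{
    MemorySource* src = static_cast<MemorySource*>(opaque);
    const int remaining = src->size - src->pos;
    if (remaining <= 0)
        return kEndOfData;  // nothing left to hand to the demuxer

    const int n = std::min(bufSize, remaining);
    std::memcpy(buf, src->data + src->pos, static_cast<std::size_t>(n));
    src->pos += n;
    return n;  // number of bytes actually copied
}

// Wiring into FFmpeg (not compiled here; error handling omitted):
//   MemorySource source = { buf, bufsize, 0 };
//   unsigned char* ioBuffer = static_cast<unsigned char*>(av_malloc(4096));
//   AVIOContext* io = avio_alloc_context(ioBuffer, 4096, 0, &source,
//                                        readFromMemory, NULL, NULL);
//   AVFormatContext* formatContext = avformat_alloc_context();
//   formatContext->pb = io;  // this is the pb field
//   avformat_open_input(&formatContext, "", NULL, NULL);
```

No seek callback is supplied, so this assumes the format can be probed and demuxed from a forward-only stream. That is usually plausible for MP3 but may not hold for every container.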
