Problem decoding H264 video over RTP with ffmpeg (libavcodec)


I set the profile_idc, level_idc, extradata and extradata_size fields of AVCodecContext from the profile-level-id and sprop-parameter-sets of the SDP.

I handle the decoding of Coded Slice, SPS, PPS and NAL_IDR_SLICE packets separately:

Init:

    uint8_t start_sequence[] = {0, 0, 1};
    int size = recv(id_de_la_socket, (char*) rtpReceive, 65535, 0);

Coded Slice:

    char *z = new char[size-16+sizeof(start_sequence)];
    memcpy(z, start_sequence, sizeof(start_sequence));
    memcpy(z+sizeof(start_sequence), rtpReceive+16, size-16);
    ConsumedBytes = avcodec_decode_video(codecContext, pFrame, &GotPicture, (uint8_t*)z, size-16+sizeof(start_sequence));
    delete[] z; // memory from new[] must be released with delete[]

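Result: ConsumedBytes > 0 and GotPicture > 0 (often)

SPS and PPS:

Identical code. Result: ConsumedBytes > 0 and GotPicture = 0

I think that is normal.

When I find a new SPS/PPS pair, I update extradata and extradata_size with the payloads of these packets and their sizes.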

NAL_IDR_SLICE:

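The NAL unit type is 28 => the IDR frame is fragmented, so I tried two methods to decode it.

1) I prefix the first fragment (without its RTP header) with the sequence 0x000001 and send it to avcodec_decode_video. Then I send the remaining fragments to the same function.

2) I prefix the first fragment (without its RTP header) with the sequence 0x000001 and concatenate the remaining fragments to it. I send this buffer to the decoder.

In both cases I get no error (ConsumedBytes > 0), but no frame is detected (GotPicture = 0)...

What is the problem?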


Any comments from the FFmpeg mailing list? - neuro
Why 0x000001? This is h264, not MPEG4. - Cipi
Read my answer again, I explained some things that might be troubling you. - Cipi
3 Answers


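In RTP, H264 I-frames (IDRs) are usually fragmented. When you receive an RTP packet you must first skip the header (usually the first 12 bytes) to reach the NAL unit header (the first payload byte). If its type, the low 5 bits, is 28 (0x1C), the payload that follows is one fragment (FU-A) of a larger NAL unit, typically an H264 IDR (I-frame), and you need to collect all of the fragments to reconstruct it.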
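
As a rough sketch of the header-skipping step (not code from the answer), the function below computes where the payload starts. It assumes the RFC 3550 header layout: 12 fixed bytes, then 4 bytes per CSRC entry, then an optional extension when the X bit is set. The name `rtp_payload_offset` is made up for this example, and trailing padding (the P bit) is deliberately ignored.

```cpp
#include <cstddef>
#include <cstdint>

// Returns the offset of the first payload byte in an RTP packet,
// or -1 if the packet is too short or is not RTP version 2.
// Assumption: trailing padding (P bit) is not handled here.
long rtp_payload_offset(const uint8_t* pkt, size_t len) {
    if (len < 12) return -1;                      // fixed header is 12 bytes
    if ((pkt[0] >> 6) != 2) return -1;            // version field must be 2
    size_t csrc_count = pkt[0] & 0x0F;            // CC field
    bool has_extension = (pkt[0] & 0x10) != 0;    // X bit
    size_t offset = 12 + 4 * csrc_count;
    if (has_extension) {
        if (len < offset + 4) return -1;
        // Extension length is a 16-bit count of 32-bit words after its 4-byte header
        size_t ext_words = (static_cast<size_t>(pkt[offset + 2]) << 8) | pkt[offset + 3];
        offset += 4 + 4 * ext_words;
    }
    if (offset >= len) return -1;                 // no payload left
    return static_cast<long>(offset);
}
```

A hard-coded offset such as the 16 used in the question is only right if every packet happens to carry exactly one CSRC entry or an empty header extension; computing it per packet avoids that assumption.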

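Fragmentation happens because the MTU is limited and an IDR is much larger. A fragment can look like this:

Fragment with START BIT = 1:

First byte:  [ 3 NAL UNIT BITS | 5 FRAGMENT TYPE BITS]
Second byte: [ START BIT | END BIT | RESERVED BIT | 5 NAL UNIT BITS]
Other bytes: [... IDR FRAGMENT DATA...]

Other fragments:

First byte:  [ 3 NAL UNIT BITS | 5 FRAGMENT TYPE BITS]
Second byte: [ START BIT = 0 | END BIT | RESERVED BIT | 5 NAL UNIT BITS]
Other bytes: [... IDR FRAGMENT DATA...]

To reconstruct the IDR you must collect this information: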

int fragment_type = Data[0] & 0x1F;
int nal_type = Data[1] & 0x1F;
int start_bit = Data[1] & 0x80;
int end_bit = Data[1] & 0x40;
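
If fragment_type == 28, the payload that follows is one fragment of the IDR. Next, check whether start_bit is set; if it is, this fragment is the first one in the sequence. Use it to reconstruct the IDR's NAL byte: take the first 3 bits of the first payload byte (3 NAL UNIT BITS) and combine them with the last 5 bits of the second payload byte (5 NAL UNIT BITS), giving a byte of the form [3 NAL UNIT BITS | 5 NAL UNIT BITS]. Write that NAL byte first into a clean buffer, followed by all the remaining bytes of that fragment. Remember to skip the first two payload bytes, since they are not part of the IDR and only describe the fragment.

If start_bit and end_bit are both 0, just write the payload (skipping the two payload bytes that describe the fragment) to the buffer.

If start_bit is 0 and end_bit is 1, this is the last fragment: write its payload (again skipping the two bytes that describe the fragment) to the buffer, and the IDR is reconstructed.

If you need some code, just ask in a comment and I'll post it, but I think it is pretty clear how to do this... =)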
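
The reassembly steps above can be sketched roughly as follows. This is an illustration rather than the answerer's own code: `FuaAssembler` and `push` are hypothetical names, the input is assumed to be the RTP payload with the RTP header already removed, and both FU bytes are skipped in every fragment as RFC 6184 lays them out. A 00 00 01 start code is written in front of the rebuilt NAL header on the assumption that the decoder is fed an Annex B stream.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Accumulates FU-A fragments into one NAL unit preceded by a start code.
struct FuaAssembler {
    std::vector<uint8_t> nal;   // start code + NAL header + payload collected so far
    bool in_progress = false;   // true once a start fragment has been seen

    // 'p' points at the RTP payload (RTP header already skipped).
    // Returns true when the end fragment has completed the NAL unit in 'nal'.
    bool push(const uint8_t* p, size_t len) {
        if (len < 3 || (p[0] & 0x1F) != 28) return false;   // not a usable FU-A
        bool start = (p[1] & 0x80) != 0;
        bool end   = (p[1] & 0x40) != 0;
        if (start) {
            nal.assign({0x00, 0x00, 0x01});                  // start code
            // Rebuilt NAL header: F/NRI bits from byte 0, type from byte 1
            nal.push_back(static_cast<uint8_t>((p[0] & 0xE0) | (p[1] & 0x1F)));
            in_progress = true;
        } else if (!in_progress) {
            return false;                                    // start fragment was missed
        }
        nal.insert(nal.end(), p + 2, p + len);               // skip both FU bytes
        if (end) {
            in_progress = false;
            return true;
        }
        return false;
    }
};
```

The sketch does not look at RTP sequence numbers, so a lost middle fragment would go unnoticed; a real receiver would probably want to drop the partial NAL unit in that case.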
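
About decoding:

It crossed my mind today why you might get an error when decoding the IDR (I presume you have reconstructed it correctly). How are you building your AVC Decoder Configuration Record? Does the library you use automate that? If not, and you haven't heard of it, read on...

The AVCDCR is specified so that a decoder can quickly parse all the data it needs to decode an H264 (AVC) video stream. The data is the following: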
  • ProfileIDC
  • ProfileIOP
  • LevelIDC
  • SPS(序列参数集)
  • PPS(图像参数集)
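
All of this data is sent in the RTSP session, in the SDP, under the fields profile-level-id and sprop-parameter-sets.

Decoding PROFILE-LEVEL-ID:

The profile-level-id string is divided into 3 substrings, each 2 characters long:

[PROFILE IDC][PROFILE IOP][LEVEL IDC]

Each substring represents one byte in base16! So if the Profile IDC is written as 28, it is actually 40 in base10. Later you will use these numeric byte values to construct the AVC Decoder Configuration Record.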
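
A minimal sketch of that hex decoding, assuming the attribute value has already been extracted from the SDP as a six-character string. The struct and function names are invented for this example.

```cpp
#include <cstdint>
#include <string>

struct ProfileLevelId {
    uint8_t profile_idc;
    uint8_t profile_iop;
    uint8_t level_idc;
};

// Parses the six hex digits of profile-level-id into three bytes.
// Returns false if the string is not exactly six hex digits.
bool parse_profile_level_id(const std::string& s, ProfileLevelId& out) {
    if (s.size() != 6) return false;
    uint8_t bytes[3];
    for (int i = 0; i < 3; ++i) {
        int value = 0;
        for (int j = 0; j < 2; ++j) {
            char c = s[2 * i + j];
            int digit;
            if (c >= '0' && c <= '9') digit = c - '0';
            else if (c >= 'a' && c <= 'f') digit = c - 'a' + 10;
            else if (c >= 'A' && c <= 'F') digit = c - 'A' + 10;
            else return false;               // not a hex digit
            value = value * 16 + digit;
        }
        bytes[i] = static_cast<uint8_t>(value);
    }
    out = {bytes[0], bytes[1], bytes[2]};
    return true;
}
```

For instance, the string "42E01F" would yield profile_idc 66, profile_iop 0xE0 and level_idc 31.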
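
Decoding SPROP-PARAMETER-SETS:

The sprops are usually 2 strings (there could be more), separated by commas and base64 encoded! You could parse their contents, but there is no need to. Your job here is only to convert them from base64 strings into byte arrays for later use. Now you have 2 byte arrays: the first is the SPS, the second is the PPS.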
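
One possible way to do that conversion without an external library is sketched below. The function names are hypothetical, the decoder handles only the standard base64 alphabet, and it silently skips characters it does not recognize, which a stricter implementation might prefer to reject.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Decodes standard base64; stops at '=' padding and skips unknown characters.
std::vector<uint8_t> base64_decode(const std::string& in) {
    static const std::string alphabet =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::vector<uint8_t> out;
    uint32_t acc = 0;   // bit accumulator
    int bits = 0;       // number of not yet emitted bits in 'acc'
    for (char c : in) {
        if (c == '=') break;
        std::size_t pos = alphabet.find(c);
        if (pos == std::string::npos) continue;
        acc = (acc << 6) | static_cast<uint32_t>(pos);
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            out.push_back(static_cast<uint8_t>((acc >> bits) & 0xFF));
        }
    }
    return out;
}

// Splits sprop-parameter-sets on commas and decodes each entry in order.
std::vector<std::vector<uint8_t>> decode_sprop_parameter_sets(const std::string& sprop) {
    std::vector<std::vector<uint8_t>> sets;
    std::size_t begin = 0;
    while (begin <= sprop.size()) {
        std::size_t comma = sprop.find(',', begin);
        if (comma == std::string::npos) comma = sprop.size();
        sets.push_back(base64_decode(sprop.substr(begin, comma - begin)));
        begin = comma + 1;
    }
    return sets;
}
```

Rather than relying on the order alone, it may be safer to check the low 5 bits of the first byte of each decoded array: 7 marks an SPS and 8 a PPS.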
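
Building the AVCDCR

Now you have everything you need to build the AVCDCR. Start with a new, clean buffer and write the following into it, in this order:

1 - A byte with the value 1, representing the version
2 - The Profile IDC byte
3 - The Profile IOP byte
4 - The Level IDC byte
5 - A byte with the value 0xFF (search for the AVC Decoder Configuration Record to see what it means)
6 - A byte with the value 0xE1
7 - A short int holding the length of the SPS array
8 - The SPS byte array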
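
The sketch below shows one way the record could be assembled for a single SPS and a single PPS. The list above stops after the SPS bytes, so the fields written after them (a PPS count byte, a 16-bit PPS length and the PPS bytes) follow the AVCDecoderConfigurationRecord layout from ISO/IEC 14496-15 rather than anything stated in this answer. The function names are hypothetical, and the 16-bit lengths are written big-endian.

```cpp
#include <cstdint>
#include <vector>

// Appends a 16-bit big-endian length followed by the bytes of one parameter set.
static void append_parameter_set(std::vector<uint8_t>& rec, const std::vector<uint8_t>& set) {
    rec.push_back(static_cast<uint8_t>((set.size() >> 8) & 0xFF));
    rec.push_back(static_cast<uint8_t>(set.size() & 0xFF));
    rec.insert(rec.end(), set.begin(), set.end());
}

// Builds an AVC Decoder Configuration Record holding exactly one SPS and one PPS.
std::vector<uint8_t> build_avcdcr(uint8_t profile_idc, uint8_t profile_iop, uint8_t level_idc,
                                  const std::vector<uint8_t>& sps,
                                  const std::vector<uint8_t>& pps) {
    std::vector<uint8_t> rec;
    rec.push_back(1);             // step 1: version
    rec.push_back(profile_idc);   // step 2
    rec.push_back(profile_iop);   // step 3
    rec.push_back(level_idc);     // step 4
    rec.push_back(0xFF);          // step 5: 6 reserved bits + NAL length size minus one (3)
    rec.push_back(0xE1);          // step 6: 3 reserved bits + number of SPS (1)
    append_parameter_set(rec, sps);   // steps 7 and 8
    rec.push_back(1);             // number of PPS (per ISO/IEC 14496-15, not from the list above)
    append_parameter_set(rec, pps);
    return rec;
}
```

The resulting buffer is what would be handed to the decoder as extradata. As far as I understand, libavcodec treats extradata in this form as a signal that NAL units arrive with length prefixes instead of start codes, so that interaction is worth verifying against the FFmpeg version in use.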

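The library can build it by itself, but I built it myself. With ffmpeg these parameters are stored in a struct (AVCodecContext). I will try to build the AVCDCR with your method. Thanks. - bben
OK, then you are not reconstructing the IDR as you should... Check the whole process again. Hope I helped... =) - Cipi
Good: the AVCDCR is recognized by the decoder, which sets its parameters. The decoder does not decode the rest, but I think that is due to another FFmpeg parameter. Thank you very much for your help: I have made significant progress. - bben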
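This is a very good answer; unfortunately you got the second byte of the FU-A wrong. It should be [ START | END | RESERVED | TYPE ], i.e. END and RESERVED should switch places. See RFC3984 (http://www.ietf.org/rfc/rfc3984.txt). - Alexander Olsson
Yes, I see, thanks for the comment! Good thing I got the bit masks for start_bit and end_bit right... :P - Cipi
@Cipi, I saw you said: "If you need some code, just ask in a comment and I'll post it, but I think it is pretty clear how to do this..." Could you please post the code for reconstructing the IDR? Thank you very much. - Frank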

I have an implementation for C# at https://net7mma.codeplex.com/, but the same process applies anywhere.

Here is the relevant code.

    /// <summary>
        /// Implements Packetization and Depacketization of packets defined in <see href="https://tools.ietf.org/html/rfc6184">RFC6184</see>.
        /// </summary>
        public class RFC6184Frame : Rtp.RtpFrame
        {
            /// <summary>
            /// Start code prefix (0x000001) written before each NAL unit
            /// </summary>
            static byte[] NalStart = { 0x00, 0x00, 0x01 };
    
            public RFC6184Frame(byte payloadType) : base(payloadType) { }
    
            public RFC6184Frame(Rtp.RtpFrame existing) : base(existing) { }
    
            public RFC6184Frame(RFC6184Frame f) : this((Rtp.RtpFrame)f) { Buffer = f.Buffer; }
    
            public System.IO.MemoryStream Buffer { get; set; }
    
            /// <summary>
            /// Creates any <see cref="Rtp.RtpPacket"/>'s required for the given nal
            /// </summary>
            /// <param name="nal">The nal</param>
            /// <param name="mtu">The mtu</param>
            public virtual void Packetize(byte[] nal, int mtu = 1500)
            {
                if (nal == null) return;
    
                int nalLength = nal.Length;
    
                int offset = 0;
    
                if (nalLength >= mtu)
                {
                    //FU indicator: F and NRI bits of the original nal header, type 28 (FU - A)
                    byte fuIndicator = (byte)((nal[0] & 0xE0) | 28);
    
                    //The original nal unit type goes in the low 5 bits of every FU header
                    byte nalType = (byte)(nal[0] & 0x1F);
    
                    //The original nal header is not sent; the receiver rebuilds it from the two FU bytes
                    offset = 1;
    
                    bool marker = false;
    
                    while (offset < nalLength)
                    {
                        byte fuHeader = nalType;
    
                        //Set the start bit on the first fragment only
                        if (offset == 1) fuHeader |= (byte)(1 << 7);
    
                        //Set the end bit if no more data remains
                        if (offset + mtu >= nalLength)
                        {
                            fuHeader |= (byte)(1 << 6);
                            marker = true;
                        }
    
                        //Add the packet
                        Add(new Rtp.RtpPacket(2, false, false, marker, PayloadTypeByte, 0, SynchronizationSourceIdentifier, HighestSequenceNumber + 1, 0, new byte[] { fuIndicator, fuHeader }.Concat(nal.Skip(offset).Take(mtu)).ToArray()));
    
                        //Move the offset
                        offset += mtu;
                    }
                } //Should check for first byte to be 1 - 23?
                else Add(new Rtp.RtpPacket(2, false, false, true, PayloadTypeByte, 0, SynchronizationSourceIdentifier, HighestSequenceNumber + 1, 0, nal));
            }
    
            /// <summary>
            /// Creates <see cref="Buffer"/> with a H.264 RBSP from the contained packets
            /// </summary>
            public virtual void Depacketize() { bool sps, pps, sei, slice, idr; Depacketize(out sps, out pps, out sei, out slice, out idr); }
    
            /// <summary>
            /// Parses all contained packets and writes any contained Nal Units in the RBSP to <see cref="Buffer"/>.
            /// </summary>
            /// <param name="containsSps">Indicates if a Sequence Parameter Set was found</param>
            /// <param name="containsPps">Indicates if a Picture Parameter Set was found</param>
            /// <param name="containsSei">Indicates if Supplemental Enhancement Information was found</param>
            /// <param name="containsSlice">Indicates if a Slice was found</param>
            /// <param name="isIdr">Indicates if a IDR Slice was found</param>
            public virtual void Depacketize(out bool containsSps, out bool containsPps, out bool containsSei, out bool containsSlice, out bool isIdr)
            {
                containsSps = containsPps = containsSei = containsSlice = isIdr = false;
    
                DisposeBuffer();
    
                this.Buffer = new MemoryStream();
    
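                //Get all packets in the frame and accumulate the flags,
                //because ProcessPacket resets its out parameters on every call
                foreach (Rtp.RtpPacket packet in m_Packets.Values.Distinct())
                {
                    bool sps, pps, sei, slice, idr;
                    ProcessPacket(packet, out sps, out pps, out sei, out slice, out idr);
                    containsSps |= sps;
                    containsPps |= pps;
                    containsSei |= sei;
                    containsSlice |= slice;
                    isIdr |= idr;
                }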
    
                //Order by DON?
                this.Buffer.Position = 0;
            }
    
            /// <summary>
            /// Depacketizes a single packet.
            /// </summary>
            /// <param name="packet"></param>
            /// <param name="containsSps"></param>
            /// <param name="containsPps"></param>
            /// <param name="containsSei"></param>
            /// <param name="containsSlice"></param>
            /// <param name="isIdr"></param>
            internal protected virtual void ProcessPacket(Rtp.RtpPacket packet, out bool containsSps, out bool containsPps, out bool containsSei, out bool containsSlice, out bool isIdr)
            {
                containsSps = containsPps = containsSei = containsSlice = isIdr = false;
    
                //Starting at offset 0
                int offset = 0;
    
                //Obtain the data of the packet (without source list or padding)
                byte[] packetData = packet.Coefficients.ToArray();
    
                //Cache the length
                int count = packetData.Length;
    
                //Must have at least 2 bytes
                if (count <= 2) return;
    
                //Determine if the forbidden bit is set and the type of nal from the first byte
                byte firstByte = packetData[offset];
    
                //bool forbiddenZeroBit = ((firstByte & 0x80) >> 7) != 0;
    
                byte nalUnitType = (byte)(firstByte & Common.Binary.FiveBitMaxValue);
    
                //o  The F bit MUST be cleared if all F bits of the aggregated NAL units are zero; otherwise, it MUST be set.
                //if (forbiddenZeroBit && nalUnitType <= 23 && nalUnitType > 29) throw new InvalidOperationException("Forbidden Zero Bit is Set.");
    
                //Determine what to do
                switch (nalUnitType)
                {
                    //Reserved - Ignore
                    case 0:
                    case 30:
                    case 31:
                        {
                            return;
                        }
                    case 24: //STAP - A
                    case 25: //STAP - B
                    case 26: //MTAP - 16
                    case 27: //MTAP - 24
                        {
                            //Move to Nal Data
                            ++offset;
    
                            //Todo Determine if need to Order by DON first.
                            //EAT DON for ALL BUT STAP - A
                            if (nalUnitType != 24) offset += 2;
    
                            //Consume the rest of the data from the packet
                            while (offset < count)
                            {
                                //Determine the nal unit size which does not include the nal header
                                int tmp_nal_size = Common.Binary.Read16(packetData, offset, BitConverter.IsLittleEndian);
                                offset += 2;
    
                                //If the nal had data then write it
                                if (tmp_nal_size > 0)
                                {
                                    //For DOND and TSOFFSET
                                    switch (nalUnitType)
                                    {
                                        case 26:// MTAP - 16
                                            {
                                                //SKIP DOND (1 byte) and TSOFFSET (2 bytes)
                                                offset += 3;
                                                goto default;
                                            }
                                        case 27:// MTAP - 24
                                            {
                                                //SKIP DOND (1 byte) and TSOFFSET (3 bytes)
                                                offset += 4;
                                                goto default;
                                            }
                                        default:
                                            {
                                                //Read the nal header but don't move the offset
                                                byte nalHeader = (byte)(packetData[offset] & Common.Binary.FiveBitMaxValue);
    
                                                if (nalHeader > 5)
                                                {
                                                    if (nalHeader == 6)
                                                    {
                                                        Buffer.WriteByte(0);
                                                        containsSei = true;
                                                    }
                                                    else if (nalHeader == 7) //SPS
                                                    {
                                                        Buffer.WriteByte(0);
                                                        containsSps = true;
                                                    }
                                                    else if (nalHeader == 8) //PPS
                                                    {
                                                        Buffer.WriteByte(0);
                                                        containsPps = true;
                                                    }
                                                }
    
                                                if (nalHeader == 1) containsSlice = true;
    
                                                if (nalHeader == 5) isIdr = true;
    
                                                //Done reading
                                                break;
                                            }
                                    }
    
                                    //Write the start code
                                    Buffer.Write(NalStart, 0, 3);
    
                                    //Write the nal header and data
                                    Buffer.Write(packetData, offset, tmp_nal_size);
    
                                    //Move the offset past the nal
                                    offset += tmp_nal_size;
                                }
                            }
    
                            return;
                        }
                    case 28: //FU - A
                    case 29: //FU - B
                        {
                            /*
                             Informative note: When an FU-A occurs in interleaved mode, it
                             always follows an FU-B, which sets its DON.
                             * Informative note: If a transmitter wants to encapsulate a single
                              NAL unit per packet and transmit packets out of their decoding
                              order, STAP-B packet type can be used.
                             */
                            //Need 2 bytes
                            if (count > 2)
                            {
                                //Read the Header
                                byte FUHeader = packetData[++offset];
    
                                bool Start = ((FUHeader & 0x80) >> 7) > 0;
    
                                //bool End = ((FUHeader & 0x40) >> 6) > 0;
    
                                //bool Receiver = (FUHeader & 0x20) != 0;
    
                                //if (Receiver) throw new InvalidOperationException("Receiver Bit Set");
    
                                //Move to data
                                ++offset;
    
                                //Todo Determine if need to Order by DON first.
                                //DON Present in FU - B
                                if (nalUnitType == 29) offset += 2;
    
                                //Determine the fragment size
                                int fragment_size = count - offset;
    
                                //If the size was valid
                                if (fragment_size > 0)
                                {
                                    //If the start bit was set
                                    if (Start)
                                    {
                                        //Reconstruct the nal header
                                        //Use the first 3 bits of the first byte and last 5 bits of the FU Header
                                        byte nalHeader = (byte)((firstByte & 0xE0) | (FUHeader & Common.Binary.FiveBitMaxValue));
    
                                        //The nal unit type is the low 5 bits only (the header also carries F and NRI)
                                        byte fragmentNalType = (byte)(FUHeader & Common.Binary.FiveBitMaxValue);
    
                                        //Could have been SPS / PPS / SEI
                                        if (fragmentNalType > 5)
                                        {
                                            if (fragmentNalType == 6)
                                            {
                                                Buffer.WriteByte(0);
                                                containsSei = true;
                                            }
                                            else if (fragmentNalType == 7) //SPS
                                            {
                                                Buffer.WriteByte(0);
                                                containsSps = true;
                                            }
                                            else if (fragmentNalType == 8) //PPS
                                            {
                                                Buffer.WriteByte(0);
                                                containsPps = true;
                                            }
                                        }
    
                                        if (fragmentNalType == 1) containsSlice = true;
    
                                        if (fragmentNalType == 5) isIdr = true;
    
                                        //Write the start code
                                        Buffer.Write(NalStart, 0, 3);
    
                                        //Write the re-constructed header
                                        Buffer.WriteByte(nalHeader);
                                    }
    
                                    //Write the data of the fragment.
                                    Buffer.Write(packetData, offset, fragment_size);
                                }
                            }
                            return;
                        }
                    default:
                        {
                            // 6 SEI, 7 and 8 are SPS and PPS
                            if (nalUnitType > 5)
                            {
                                if (nalUnitType == 6)
                                {
                                    Buffer.WriteByte(0);
                                    containsSei = true;
                                }
                                else if (nalUnitType == 7) //SPS
                                {
                                    Buffer.WriteByte(0);
                                    containsSps = true;
                                }
                                else if (nalUnitType == 8) //PPS
                                {
                                    Buffer.WriteByte(0);
                                    containsPps = true;
                                }
                            }
    
                            if (nalUnitType == 1) containsSlice = true;
    
                            if (nalUnitType == 5) isIdr = true;
    
                            //Write the start code
                            Buffer.Write(NalStart, 0, 3);
    
                            //Write the nal header and data
                            Buffer.Write(packetData, offset, count - offset);
    
                            return;
                        }
                }
            }
    
            internal void DisposeBuffer()
            {
                if (Buffer != null)
                {
                    Buffer.Dispose();
                    Buffer = null;
                }
            }
    
            public override void Dispose()
            {
                if (Disposed) return;
                base.Dispose();
                DisposeBuffer();
            }
    
            //To go to an Image...
            //Look for a SliceHeader in the Buffer
            //Decode Macroblocks in Slice
            //Convert Yuv to Rgb
        }
    

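There are also implementations of other RFCs, which help with playing the media in a MediaElement or other software, or with just saving it to disk.

Writing to a container format is underway.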



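I don't know about the rest of your implementation, but it seems likely that the "fragments" you are receiving are NAL units. If so, each one may need the NALU start code (00 00 01 or 00 00 00 01) prepended when you reconstruct the bitstream, before you send it to ffmpeg.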
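
To illustrate what prepending the start code could look like, here is a small sketch that joins already complete NAL units into one byte stream. The function name `to_annex_b` is invented for this example, and the four-byte form of the start code is an arbitrary choice; the three-byte form would be built the same way.

```cpp
#include <cstdint>
#include <vector>

// Concatenates NAL units into one byte stream, each preceded by 00 00 00 01.
std::vector<uint8_t> to_annex_b(const std::vector<std::vector<uint8_t>>& nal_units) {
    static const uint8_t start_code[] = {0x00, 0x00, 0x00, 0x01};
    std::vector<uint8_t> stream;
    for (const auto& nal : nal_units) {
        if (nal.empty()) continue;   // nothing to frame
        stream.insert(stream.end(), start_code, start_code + sizeof(start_code));
        stream.insert(stream.end(), nal.begin(), nal.end());
    }
    return stream;
}
```

Note that this applies to whole NAL units only; FU-A fragments would first have to be reassembled into a single NAL unit.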

At any rate, you may find the RFC for H264 RTP packetization useful:

http://www.rfc-editor.org/rfc/rfc3984.txt

Hope this helps!


I don't have the reputation to comment under your question or answer, but are you adding the NALU start code before each "fragment"? - Scott
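You don't need to do that... A fragment is part of ONE IDR. The NALU is transmitted only in the first fragment, not in every fragment. And to decode it you don't need to add a start code at all, since the NAL unit defines the H264 payload that follows it (its lower 5 bits do that). - Cipi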
