Green screen in ffplay when streaming the desktop (DirectX surface) as H264 video over RTP using Live555


I am trying to stream the desktop (a DirectX surface in NV12 format) as H264 video over RTP, using the Windows Media Foundation hardware encoder and Live555 on Windows 10, and expecting it to be rendered by ffplay (ffmpeg 4.2). But I only get a green screen like the one shown below.

[Screenshots: ffplay renders the stream as a solid green frame]

I referred to the MFWebCamToRTP media foundation sample and to "Encoding DirectX surface using hardware MFT" for implementing Live555's FramedSource, and changed the input source to a DirectX surface instead of a webcam.

Here is an excerpt of my implementation of Live555's doGetNextFrame callback, which feeds input samples from the DirectX surface:

virtual void doGetNextFrame()
{
    if (!_isInitialised)
    {
        if (!initialise()) {
            printf("Video device initialisation failed, stopping.");
            return;
        }
        else {
            _isInitialised = true;
        }
    }

    //if (!isCurrentlyAwaitingData()) return;

    DWORD processOutputStatus = 0;
    HRESULT mftProcessOutput = S_OK;
    MFT_OUTPUT_STREAM_INFO StreamInfo;
    IMFMediaBuffer *pBuffer = NULL;
    IMFSample *mftOutSample = NULL;
    DWORD mftOutFlags;
    bool frameSent = false;
    bool bTimeout = false;

    // Create sample
    CComPtr<IMFSample> videoSample = NULL;

    // Create buffer
    CComPtr<IMFMediaBuffer> inputBuffer;
    // Get next event
    CComPtr<IMFMediaEvent> event;
    HRESULT hr = eventGen->GetEvent(0, &event);
    CHECK_HR(hr, "Failed to get next event");

    MediaEventType eventType;
    hr = event->GetType(&eventType);
    CHECK_HR(hr, "Failed to get event type");


    switch (eventType)
    {
    case METransformNeedInput:
        {
            hr = MFCreateDXGISurfaceBuffer(__uuidof(ID3D11Texture2D), surface, 0, FALSE, &inputBuffer);
            CHECK_HR(hr, "Failed to create IMFMediaBuffer");

            hr = MFCreateSample(&videoSample);
            CHECK_HR(hr, "Failed to create IMFSample");
            hr = videoSample->AddBuffer(inputBuffer);
            CHECK_HR(hr, "Failed to add buffer to IMFSample");

            if (videoSample)
            {
                _frameCount++;

                CHECK_HR(videoSample->SetSampleTime(mTimeStamp), "Error setting the video sample time.\n");
                CHECK_HR(videoSample->SetSampleDuration(VIDEO_FRAME_DURATION), "Error setting the video sample duration.\n");

                // Pass the video sample to the H.264 transform.

                hr = _pTransform->ProcessInput(inputStreamID, videoSample, 0);
                CHECK_HR(hr, "The resampler H264 ProcessInput call failed.\n");

                mTimeStamp += VIDEO_FRAME_DURATION;
            }
        }

        break;

    case METransformHaveOutput:

        {
            CHECK_HR(_pTransform->GetOutputStatus(&mftOutFlags), "H264 MFT GetOutputStatus failed.\n");

            if (mftOutFlags == MFT_OUTPUT_STATUS_SAMPLE_READY)
            {
                MFT_OUTPUT_DATA_BUFFER _outputDataBuffer;
                memset(&_outputDataBuffer, 0, sizeof _outputDataBuffer);
                _outputDataBuffer.dwStreamID = outputStreamID;
                _outputDataBuffer.dwStatus = 0;
                _outputDataBuffer.pEvents = NULL;
                _outputDataBuffer.pSample = nullptr;

                mftProcessOutput = _pTransform->ProcessOutput(0, 1, &_outputDataBuffer, &processOutputStatus);

                if (mftProcessOutput != MF_E_TRANSFORM_NEED_MORE_INPUT)
                {
                    if (_outputDataBuffer.pSample) {

                        //CHECK_HR(_outputDataBuffer.pSample->SetSampleTime(mTimeStamp), "Error setting MFT sample time.\n");
                        //CHECK_HR(_outputDataBuffer.pSample->SetSampleDuration(VIDEO_FRAME_DURATION), "Error setting MFT sample duration.\n");

                        IMFMediaBuffer *buf = NULL;
                        DWORD bufLength;
                        CHECK_HR(_outputDataBuffer.pSample->ConvertToContiguousBuffer(&buf), "ConvertToContiguousBuffer failed.\n");
                        CHECK_HR(buf->GetCurrentLength(&bufLength), "Get buffer length failed.\n");
                        BYTE * rawBuffer = NULL;

                        fFrameSize = bufLength;
                        fDurationInMicroseconds = 0;
                        gettimeofday(&fPresentationTime, NULL);

                        buf->Lock(&rawBuffer, NULL, NULL);
                        memmove(fTo, rawBuffer, fFrameSize);

                        FramedSource::afterGetting(this);

                        buf->Unlock();
                        SafeRelease(&buf);

                        frameSent = true;
                        _lastSendAt = GetTickCount();

                        _outputDataBuffer.pSample->Release();
                    }

                    if (_outputDataBuffer.pEvents)
                        _outputDataBuffer.pEvents->Release();
                }

                //SafeRelease(&pBuffer);
                //SafeRelease(&mftOutSample);

                break;
            }
        }

        break;
    }

    if (!frameSent)
    {
        envir().taskScheduler().triggerEvent(eventTriggerId, this);
    }

    return;

done:

    printf("MediaFoundationH264LiveSource doGetNextFrame failed.\n");
    envir().taskScheduler().triggerEvent(eventTriggerId, this);
}

The initialise method:

bool initialise()
{
    HRESULT hr;
    D3D11_TEXTURE2D_DESC desc = { 0 };

    HDESK CurrentDesktop = nullptr;
    CurrentDesktop = OpenInputDesktop(0, FALSE, GENERIC_ALL);
    if (!CurrentDesktop)
    {
        // We do not have access to the desktop so request a retry
        return false;
    }

    // Attach desktop to this thread
    bool DesktopAttached = SetThreadDesktop(CurrentDesktop) != 0;
    CloseDesktop(CurrentDesktop);
    CurrentDesktop = nullptr;
    if (!DesktopAttached)
    {
        printf("SetThreadDesktop failed\n");
    }

    UINT32 activateCount = 0;

    // h264 output
    MFT_REGISTER_TYPE_INFO info = { MFMediaType_Video, MFVideoFormat_H264 };

    UINT32 flags =
        MFT_ENUM_FLAG_HARDWARE |
        MFT_ENUM_FLAG_SORTANDFILTER;

    // ------------------------------------------------------------------------
    // Initialize D3D11
    // ------------------------------------------------------------------------

    // Driver types supported
    D3D_DRIVER_TYPE DriverTypes[] =
    {
        D3D_DRIVER_TYPE_HARDWARE,
        D3D_DRIVER_TYPE_WARP,
        D3D_DRIVER_TYPE_REFERENCE,
    };
    UINT NumDriverTypes = ARRAYSIZE(DriverTypes);

    // Feature levels supported
    D3D_FEATURE_LEVEL FeatureLevels[] =
    {
        D3D_FEATURE_LEVEL_11_0,
        D3D_FEATURE_LEVEL_10_1,
        D3D_FEATURE_LEVEL_10_0,
        D3D_FEATURE_LEVEL_9_1
    };
    UINT NumFeatureLevels = ARRAYSIZE(FeatureLevels);

    D3D_FEATURE_LEVEL FeatureLevel;

    // Create device
    for (UINT DriverTypeIndex = 0; DriverTypeIndex < NumDriverTypes; ++DriverTypeIndex)
    {
        hr = D3D11CreateDevice(nullptr, DriverTypes[DriverTypeIndex], nullptr,
            D3D11_CREATE_DEVICE_VIDEO_SUPPORT,
            FeatureLevels, NumFeatureLevels, D3D11_SDK_VERSION, &device, &FeatureLevel, &context);
        if (SUCCEEDED(hr))
        {
            // Device creation success, no need to loop anymore
            break;
        }
    }

    CHECK_HR(hr, "Failed to create device");

    // Create device manager
    UINT resetToken;
    hr = MFCreateDXGIDeviceManager(&resetToken, &deviceManager);
    CHECK_HR(hr, "Failed to create DXGIDeviceManager");

    hr = deviceManager->ResetDevice(device, resetToken);
    CHECK_HR(hr, "Failed to assign D3D device to device manager");


    // ------------------------------------------------------------------------
    // Create surface
    // ------------------------------------------------------------------------
    desc.Format = DXGI_FORMAT_NV12;
    desc.Width = surfaceWidth;
    desc.Height = surfaceHeight;
    desc.MipLevels = 1;
    desc.ArraySize = 1;
    desc.SampleDesc.Count = 1;

    hr = device->CreateTexture2D(&desc, NULL, &surface);
    CHECK_HR(hr, "Could not create surface");

    hr = MFTEnumEx(
        MFT_CATEGORY_VIDEO_ENCODER,
        flags,
        NULL,
        &info,
        &activateRaw,
        &activateCount
    );
    CHECK_HR(hr, "Failed to enumerate MFTs");

    CHECK(activateCount, "No MFTs found");

    // Choose the first available encoder
    activate = activateRaw[0];

    for (UINT32 i = 0; i < activateCount; i++)
        activateRaw[i]->Release();

    // Activate
    hr = activate->ActivateObject(IID_PPV_ARGS(&_pTransform));
    CHECK_HR(hr, "Failed to activate MFT");

    // Get attributes
    hr = _pTransform->GetAttributes(&attributes);
    CHECK_HR(hr, "Failed to get MFT attributes");

    // Unlock the transform for async use and get event generator
    hr = attributes->SetUINT32(MF_TRANSFORM_ASYNC_UNLOCK, TRUE);
    CHECK_HR(hr, "Failed to unlock MFT");

    eventGen = _pTransform;
    CHECK(eventGen, "Failed to QI for event generator");

    // Get stream IDs (expect 1 input and 1 output stream)
    hr = _pTransform->GetStreamIDs(1, &inputStreamID, 1, &outputStreamID);
    if (hr == E_NOTIMPL)
    {
        inputStreamID = 0;
        outputStreamID = 0;
        hr = S_OK;
    }
    CHECK_HR(hr, "Failed to get stream IDs");

    // ------------------------------------------------------------------------
    // Configure hardware encoder MFT
    // ------------------------------------------------------------------------
    CHECK_HR(_pTransform->ProcessMessage(MFT_MESSAGE_SET_D3D_MANAGER, reinterpret_cast<ULONG_PTR>(deviceManager.p)), "Failed to set device manager.\n");

    // Set low latency hint
    hr = attributes->SetUINT32(MF_LOW_LATENCY, TRUE);
    CHECK_HR(hr, "Failed to set MF_LOW_LATENCY");

    hr = MFCreateMediaType(&outputType);
    CHECK_HR(hr, "Failed to create media type");

    hr = outputType->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Video);
    CHECK_HR(hr, "Failed to set MF_MT_MAJOR_TYPE on H264 output media type");

    hr = outputType->SetGUID(MF_MT_SUBTYPE, MFVideoFormat_H264);
    CHECK_HR(hr, "Failed to set MF_MT_SUBTYPE on H264 output media type");

    hr = outputType->SetUINT32(MF_MT_AVG_BITRATE, TARGET_AVERAGE_BIT_RATE);
    CHECK_HR(hr, "Failed to set average bit rate on H264 output media type");

    hr = MFSetAttributeSize(outputType, MF_MT_FRAME_SIZE, desc.Width, desc.Height);
    CHECK_HR(hr, "Failed to set frame size on H264 MFT out type");

    hr = MFSetAttributeRatio(outputType, MF_MT_FRAME_RATE, TARGET_FRAME_RATE, 1);
    CHECK_HR(hr, "Failed to set frame rate on H264 MFT out type");

    hr = outputType->SetUINT32(MF_MT_INTERLACE_MODE, 2);
    CHECK_HR(hr, "Failed to set MF_MT_INTERLACE_MODE on H.264 encoder MFT");

    hr = outputType->SetUINT32(MF_MT_ALL_SAMPLES_INDEPENDENT, TRUE);
    CHECK_HR(hr, "Failed to set MF_MT_ALL_SAMPLES_INDEPENDENT on H.264 encoder MFT");

    hr = _pTransform->SetOutputType(outputStreamID, outputType, 0);
    CHECK_HR(hr, "Failed to set output media type on H.264 encoder MFT");

    hr = MFCreateMediaType(&inputType);
    CHECK_HR(hr, "Failed to create media type");

    for (DWORD i = 0;; i++)
    {
        inputType = nullptr;
        hr = _pTransform->GetInputAvailableType(inputStreamID, i, &inputType);
        CHECK_HR(hr, "Failed to get input type");

        hr = inputType->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Video);
        CHECK_HR(hr, "Failed to set MF_MT_MAJOR_TYPE on H264 MFT input type");

        hr = inputType->SetGUID(MF_MT_SUBTYPE, MFVideoFormat_NV12);
        CHECK_HR(hr, "Failed to set MF_MT_SUBTYPE on H264 MFT input type");

        hr = MFSetAttributeSize(inputType, MF_MT_FRAME_SIZE, desc.Width, desc.Height);
        CHECK_HR(hr, "Failed to set MF_MT_FRAME_SIZE on H264 MFT input type");

        hr = MFSetAttributeRatio(inputType, MF_MT_FRAME_RATE, TARGET_FRAME_RATE, 1);
        CHECK_HR(hr, "Failed to set MF_MT_FRAME_RATE on H264 MFT input type");

        hr = _pTransform->SetInputType(inputStreamID, inputType, 0);
        CHECK_HR(hr, "Failed to set input type");

        break;
    }

    CheckHardwareSupport();

    CHECK_HR(_pTransform->ProcessMessage(MFT_MESSAGE_COMMAND_FLUSH, NULL), "Failed to process FLUSH command on H.264 MFT.\n");
    CHECK_HR(_pTransform->ProcessMessage(MFT_MESSAGE_NOTIFY_BEGIN_STREAMING, NULL), "Failed to process BEGIN_STREAMING command on H.264 MFT.\n");
    CHECK_HR(_pTransform->ProcessMessage(MFT_MESSAGE_NOTIFY_START_OF_STREAM, NULL), "Failed to process START_OF_STREAM command on H.264 MFT.\n");

    return true;

done:

    printf("MediaFoundationH264LiveSource initialisation failed.\n");
    return false;
}


    HRESULT CheckHardwareSupport()
    {
        IMFAttributes *attributes;
        HRESULT hr = _pTransform->GetAttributes(&attributes);
        UINT32 dxva = 0;

        if (SUCCEEDED(hr))
        {
            hr = attributes->GetUINT32(MF_SA_D3D11_AWARE, &dxva);
        }

        if (SUCCEEDED(hr))
        {
            hr = attributes->SetUINT32(CODECAPI_AVDecVideoAcceleration_H264, TRUE);
        }

#if defined(CODECAPI_AVLowLatencyMode) // Win8 only

        hr = _pTransform->QueryInterface(IID_PPV_ARGS(&mpCodecAPI));

        if (SUCCEEDED(hr))
        {
            VARIANT var = { 0 };

            // FIXME: encoder only
            var.vt = VT_UI4;
            var.ulVal = 0;

            hr = mpCodecAPI->SetValue(&CODECAPI_AVEncMPVDefaultBPictureCount, &var);

            var.vt = VT_BOOL;
            var.boolVal = VARIANT_TRUE;
            hr = mpCodecAPI->SetValue(&CODECAPI_AVEncCommonLowLatency, &var);
            hr = mpCodecAPI->SetValue(&CODECAPI_AVEncCommonRealTime, &var);

            hr = attributes->SetUINT32(CODECAPI_AVLowLatencyMode, TRUE);

            if (SUCCEEDED(hr))
            {
                var.vt = VT_UI4;
                var.ulVal = eAVEncCommonRateControlMode_Quality;
                hr = mpCodecAPI->SetValue(&CODECAPI_AVEncCommonRateControlMode, &var);

                // This property controls the quality level when the encoder is not using a constrained bit rate. The AVEncCommonRateControlMode property determines whether the bit rate is constrained.
                VARIANT quality;
                InitVariantFromUInt32(50, &quality);
                hr = mpCodecAPI->SetValue(&CODECAPI_AVEncCommonQuality, &quality);
            }
        }
#endif

        return hr;
    }

ffplay command:

ffplay -protocol_whitelist file,udp,rtp -i test.sdp -x 800 -y 600 -profile:v baseline

SDP:

v=0
o=- 0 0 IN IP4 127.0.0.1
s=No Name
t=0 0
c=IN IP4 127.0.0.1
m=video 1234 RTP/AVP 96
a=rtpmap:96 H264/90000
a=fmtp:96 packetization-mode=1

I don't know what I'm missing. I have been trying to fix this for almost a week without any progress, and have tried almost everything I could. In addition, the online resources for encoding a DirectX surface as video are very limited.
Any help would be appreciated.

I think you are wrongly expecting doGetNextFrame to be called again after METransformNeedInput. Maybe you should loop inside it until you get a valid ProcessOutput call. - VuVirt
The hr = event->GetType(&eventType); switch(eventType) {....} block, plus triggering the event with envir().taskScheduler().triggerEvent(eventTriggerId, this); when no frame is sent, already takes care of calling ProcessInput repeatedly until we get output from the encoder. I have verified that. @VuVirt - iamrameshkumar
You may need to check whether you receive MF_E_TRANSFORM_STREAM_CHANGE from ProcessOutput and handle the format change accordingly. - VuVirt
Does this mean the encoder and the renderer work fine, and the problem may be with the image source (the DirectX surface)? - iamrameshkumar
Yes. You can try decoding it and displaying it. - VuVirt
3 Answers


This is harder than it seems.

If you want to use the encoder the way you do now, by calling the IMFTransform interface directly, you have to convert the RGB frames to NV12. If you want good performance, you should do it on the GPU. It can be done with pixel shaders: render a full-size frame with the brightness into a DXGI_FORMAT_R8_UNORM render target and a half-size frame with the colour into a DXGI_FORMAT_R8G8_UNORM target, writing two pixel shaders that produce the NV12 values. Both render targets can render into the two planes of the same NV12 texture, but only since Windows 8.
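For illustration, here is a minimal sketch of the per-plane render-target part of that approach, assuming a D3D 11.1-capable device and an NV12 texture created with D3D11_BIND_RENDER_TARGET (the nv12Texture name and the shader details are placeholders, not from the question's code):

// Hypothetical sketch: one render target view per NV12 plane (Windows 8+ / D3D 11.1).
// Viewing the texture as DXGI_FORMAT_R8_UNORM selects the luma plane,
// DXGI_FORMAT_R8G8_UNORM selects the interleaved chroma plane.
CComPtr<ID3D11RenderTargetView> lumaRTV, chromaRTV;

D3D11_RENDER_TARGET_VIEW_DESC rtvDesc = {};
rtvDesc.ViewDimension = D3D11_RTV_DIMENSION_TEXTURE2D;
rtvDesc.Texture2D.MipSlice = 0;

rtvDesc.Format = DXGI_FORMAT_R8_UNORM;            // Y plane, full resolution
hr = device->CreateRenderTargetView(nv12Texture, &rtvDesc, &lumaRTV);

rtvDesc.Format = DXGI_FORMAT_R8G8_UNORM;          // UV plane, half resolution
hr = device->CreateRenderTargetView(nv12Texture, &rtvDesc, &chromaRTV);

// Bind lumaRTV with a full-size viewport and run a pixel shader that outputs
// Y = 0.257*R + 0.504*G + 0.098*B + 16/255; bind chromaRTV with a half-size
// viewport and a second shader that outputs the U and V values.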

The other way is to use a sink writer. It can host several MFTs at the same time, so you can supply RGB textures in VRAM, the sink writer will first convert them into NV12 with one MFT (quite likely a proprietary hardware MFT implemented by the GPU driver, just like the encoder), then pass them to the encoder MFT. It is relatively easy to encode into an mp4 file, using the MFCreateSinkWriterFromURL API to create the writer. It is much harder to get raw samples out of the sink writer, however: you have to implement a custom media sink and a custom stream sink for the video stream, and call MFCreateSinkWriterFromMediaSink to create the writer.
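A rough, hedged sketch of the file-writing variant of that route (writer and attribute names are placeholders, error handling is omitted, and the media-type constants mirror the ones already used in the question):

// Hypothetical sketch: let IMFSinkWriter host both the colour converter and the
// hardware H.264 encoder, so BGRA textures from desktop duplication can be fed directly.
CComPtr<IMFAttributes> writerAttr;
MFCreateAttributes(&writerAttr, 2);
writerAttr->SetUnknown(MF_SINK_WRITER_D3D_MANAGER, deviceManager);
writerAttr->SetUINT32(MF_READWRITE_ENABLE_HARDWARE_TRANSFORMS, TRUE);

CComPtr<IMFSinkWriter> writer;
MFCreateSinkWriterFromURL(L"capture.mp4", NULL, writerAttr, &writer);

// Output stream: H.264.
CComPtr<IMFMediaType> outType;
MFCreateMediaType(&outType);
outType->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Video);
outType->SetGUID(MF_MT_SUBTYPE, MFVideoFormat_H264);
outType->SetUINT32(MF_MT_AVG_BITRATE, TARGET_AVERAGE_BIT_RATE);
MFSetAttributeSize(outType, MF_MT_FRAME_SIZE, surfaceWidth, surfaceHeight);
MFSetAttributeRatio(outType, MF_MT_FRAME_RATE, TARGET_FRAME_RATE, 1);
outType->SetUINT32(MF_MT_INTERLACE_MODE, MFVideoInterlace_Progressive);
DWORD streamIndex = 0;
writer->AddStream(outType, &streamIndex);

// Input stream: the BGRA layout desktop duplication produces; the sink writer
// inserts a converter MFT in front of the encoder.
CComPtr<IMFMediaType> inType;
MFCreateMediaType(&inType);
inType->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Video);
inType->SetGUID(MF_MT_SUBTYPE, MFVideoFormat_ARGB32);
MFSetAttributeSize(inType, MF_MT_FRAME_SIZE, surfaceWidth, surfaceHeight);
MFSetAttributeRatio(inType, MF_MT_FRAME_RATE, TARGET_FRAME_RATE, 1);
writer->SetInputMediaType(streamIndex, inType, NULL);

writer->BeginWriting();
// Per frame: wrap the captured texture with MFCreateDXGISurfaceBuffer into an
// IMFSample (as in doGetNextFrame), set its time/duration, then:
// writer->WriteSample(streamIndex, videoSample);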
There's more.
Regardless of the encoding method, you can't reuse the frame texture. Each frame you get from DD, you should create a new texture and pass it to MF.
Video encoders expect a constant frame rate. DD doesn't give you that; it only gives you a frame when something changes on screen. That can be 144 FPS if you have a gaming monitor, or 2 FPS if the only change is a blinking cursor. Ideally, you should submit frames to MF at the constant frame rate specified in your video media type.
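For illustration, a hedged sketch of how the capture side could keep a constant rate with desktop duplication (duplication, lastTexture and SubmitToEncoder are placeholder names, not part of the question's code):

// Hypothetical pacing step, executed once per VIDEO_FRAME_DURATION tick:
// ask desktop duplication for a new frame with a short timeout; if nothing
// changed, resubmit the previous texture so the encoder still sees a
// constant frame rate.
CComPtr<IDXGIResource> desktopResource;
DXGI_OUTDUPL_FRAME_INFO frameInfo = {};
HRESULT hr = duplication->AcquireNextFrame(8 /* ms */, &frameInfo, &desktopResource);

if (SUCCEEDED(hr))
{
    CComPtr<ID3D11Texture2D> acquired;
    desktopResource->QueryInterface(IID_PPV_ARGS(&acquired));
    context->CopyResource(lastTexture, acquired);   // keep our own copy
    duplication->ReleaseFrame();
}
else if (hr != DXGI_ERROR_WAIT_TIMEOUT)
{
    // Real error (e.g. DXGI_ERROR_ACCESS_LOST): reinitialise duplication.
}

// Feed a sample at the nominal rate whether or not a new frame arrived.
SubmitToEncoder(lastTexture, mTimeStamp);
mTimeStamp += VIDEO_FRAME_DURATION;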
If you want to stream to a network, more often than not you also have to supply the parameter sets. Unless you are using the Intel hardware h265 encoder, which is broken with no comment from Intel, MF gives you that data in the MF_MT_MPEG_SEQUENCE_HEADER attribute of the media type, by calling SetCurrentMediaType on the IMFMediaTypeHandler interface; you can implement that interface to get notified. You only get that data after you start encoding, and that applies if you use a sink writer. For the IMFTransform method it is easier: you should get the MF_E_TRANSFORM_STREAM_CHANGE code from the ProcessOutput method, then call GetOutputAvailableType to get the updated media type with that magic blob.
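A sketch of what that could look like in the question's ProcessOutput path (same _pTransform/outputStreamID variables as above; error handling trimmed, so treat it as an outline under assumptions rather than drop-in code):

// Hypothetical handling of MF_E_TRANSFORM_STREAM_CHANGE: renegotiate the
// output type and read the SPS/PPS blob that comes with it.
if (mftProcessOutput == MF_E_TRANSFORM_STREAM_CHANGE)
{
    CComPtr<IMFMediaType> newOutputType;
    hr = _pTransform->GetOutputAvailableType(outputStreamID, 0, &newOutputType);

    // The H.264 sequence header (SPS + PPS) is attached to the new type.
    UINT32 headerSize = 0;
    newOutputType->GetBlobSize(MF_MT_MPEG_SEQUENCE_HEADER, &headerSize);
    BYTE *seqHeader = new BYTE[headerSize];
    newOutputType->GetBlob(MF_MT_MPEG_SEQUENCE_HEADER, seqHeader, headerSize, NULL);
    // ... hand seqHeader off to the RTSP/SDP side, then delete[] it ...

    // Accept the updated type, then continue pulling output samples.
    hr = _pTransform->SetOutputType(outputStreamID, newOutputType, 0);
}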

@Ram Desktop duplication always delivers RGB frames in DXGI_FORMAT_B8G8R8A8_UNORM. The H264 and h265 encoder MFTs only support NV12 and a couple of other, equally weird formats. Someone has to do the conversion. You are using desktop duplication, so you already can't support Windows 7. Use a sink writer. I'm pretty sure those nVidia/Intel hardware MFTs that convert RGB to NV12 are more power efficient than pixel shader ALUs; they are probably implemented purely in hardware. - Soonts
You are right. The colour conversion has to be done explicitly. https://github.com/GPUOpen-LibrariesAndSDKs/AMF/issues/92. I'm heading in that direction. - iamrameshkumar
@Ram It should work, I've done it before. When DD refuses to give you a new frame because there were no updates, you can save a lot of VRAM by submitting the same texture to the encoder again. Only create new textures when DD has a new frame for you. But the code for detecting when to submit frames and how long to wait is not trivial. I used QueryPerformanceCounter to measure time, and some kind of rolling average over the last few frames to figure out whether I should capture or sleep. BTW, the correct way to sleep is the IDXGIOutput::WaitForVBlank method. - Soonts
I have managed to find the problem. It is not the varying frame rate, but the encoder buffering input until the GOP is filled (I guess it buffers up to 30 frames). I also tried setting CODECAPI_AVLowLatencyMode, but nothing changed. Feeding the previous sample again at a constant rate solves the problem, but I think it would still consume a small amount of data even when the screen content doesn't change (probably due to key frames). - iamrameshkumar
https://learn.microsoft.com/zh-cn/windows/win32/medfound/codecapi-avlowlatencymode https://learn.microsoft.com/zh-cn/windows/win32/medfound/mf-low-latency As defined there ("one input sample should produce one output sample"), this did not work for me. - iamrameshkumar


I'll try to give you everything you need to solve your problem.

First, you need a format conversion between DXGI_FORMAT_B8G8R8A8_UNORM and MFVideoFormat_NV12:

Format conversion

Format conversion information

I think it is better to do the format conversion with a shader, because all the textures stay on the GPU (better for performance).

This is the first step you need to take. You will have other things to do to improve your program after that.


A 2x4 image takes only 12 bytes in NV12, not 24: there are 8 brightness values, but the colour image is downsampled to 1x2 pixels, so it only needs 4 bytes for the colour information, 2 bytes for U and 2 for V. - Soonts
Yes, you are right, I overlooked that the NV12 format is downsampled to 4:2:0. I'll try to make a more suitable diagram. - mofo77


Since ffplay is complaining about the stream parameters, I assume it cannot pick up the SPS/PPS. You have not set them in your hard-coded SDP - see RFC 3984 and look for sprop-parameter-sets. An example from the RFC:

m=video 49170 RTP/AVP 98
a=rtpmap:98 H264/90000
a=fmtp:98 profile-level-id=42A01E; sprop-parameter-sets=Z0IACpZTBYmI,aMljiA==

I strongly assume ffplay expects these in the SDP. I don't remember off the top of my head how to get the SPS/PPS from the Media Foundation encoder, but they are probably in the sample payload and you need to extract them by looking for the right NAL units, or search for how to extract the extra data from the encoder - the first hit I got looked promising.
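For illustration, a hedged sketch of scanning an Annex B payload for the SPS (NAL type 7) and PPS (NAL type 8); the Base64 step needed for sprop-parameter-sets is left out, and the helper name is made up:

#include <cstdint>
#include <vector>

// Hypothetical helper: find SPS (NAL type 7) and PPS (NAL type 8) in an
// Annex B buffer so they can be Base64-encoded into sprop-parameter-sets.
static void ExtractSpsPps(const uint8_t* data, size_t size,
                          std::vector<uint8_t>& sps, std::vector<uint8_t>& pps)
{
    size_t i = 0;
    while (i + 3 < size)
    {
        // Look for a 00 00 01 start code (a 4-byte 00 00 00 01 also matches here).
        if (!(data[i] == 0 && data[i + 1] == 0 && data[i + 2] == 1)) { ++i; continue; }
        size_t nalStart = i + 3;

        // The NAL unit ends at the next start code or at the end of the buffer.
        size_t nalEnd = size;
        for (size_t j = nalStart; j + 2 < size; ++j)
        {
            if (data[j] == 0 && data[j + 1] == 0 && data[j + 2] == 1)
            {
                nalEnd = (j > nalStart && data[j - 1] == 0) ? j - 1 : j;
                break;
            }
        }

        uint8_t nalType = data[nalStart] & 0x1F;
        if (nalType == 7) sps.assign(data + nalStart, data + nalEnd);
        if (nalType == 8) pps.assign(data + nalStart, data + nalEnd);
        i = nalEnd;
    }
}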


That's a valid point. I also suspect the SPS/PPS. I haven't verified it yet, though. Thanks for pointing me to the MSDN thread, it gives me some hope. - iamrameshkumar
@Ram there is a good chance the SPS/PPS is in the sample payload, so I'd suggest checking that first. - Rudolfs Bundulis
Yes, I understand. I already gained some knowledge of retrieving and parsing the SPS/PPS directly from the Media Foundation encoder when I tried writing samples into a file via Mpeg4MediaSink. I'll keep moving in this direction. - iamrameshkumar
