How do I synchronize the CPU and GPU with a fence in DirectX / Direct3D 12?

Question


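I've started learning Direct3D 12 and am having trouble understanding CPU-GPU synchronization. As far as I understand, a fence (ID3D12Fence) is nothing more than a UINT64 (unsigned long long) value used as a counter, but its methods confuse me. Below is part of the source code of a D3D12 sample (https://github.com/d3dcoder/d3d12book).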

void D3DApp::FlushCommandQueue()
{
    // Advance the fence value to mark commands up to this fence point.
    mCurrentFence++;

    // Add an instruction to the command queue to set a new fence point.  Because we 
    // are on the GPU timeline, the new fence point won't be set until the GPU finishes
    // processing all the commands prior to this Signal().
    ThrowIfFailed(mCommandQueue->Signal(mFence.Get(), mCurrentFence));

    // Wait until the GPU has completed commands up to this fence point.
    if(mFence->GetCompletedValue() < mCurrentFence)
    {
        HANDLE eventHandle = CreateEventEx(nullptr, false, false, EVENT_ALL_ACCESS);

        // Fire event when GPU hits current fence.  
        ThrowIfFailed(mFence->SetEventOnCompletion(mCurrentFence, eventHandle));

        // Wait until the GPU hits current fence event is fired.
        WaitForSingleObject(eventHandle, INFINITE);
        CloseHandle(eventHandle);
    }
}

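As I understand it, this code is meant to "flush" the command queue, that is, to make the CPU wait until the GPU reaches the given fence value, so that the CPU and GPU agree on the same fence value.

Q. If Signal() is a function that makes the GPU update the fence value inside the given ID3D12Fence, why is the mCurrentFence value needed?

The Microsoft documentation says it "updates a fence to a specified value." What specified value? What I need is to "get the value of the last completed command list", not to set or specify one. What is this specified value?

To me, it seems it should work like this: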

// Suppose mCurrentFence is 1 after submitting 1 command list (Index 0), and the thread reached to here for the FIRST time
ThrowIfFailed(mCommandQueue->Signal(mFence.Get()));
// At this point Fence value inside mFence is updated
if (m_Fence->GetCompletedValue() < mCurrentFence)
{
...
}

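If m_Fence->GetCompletedValue() is 0,

if (0 < 1)

The GPU has not yet executed the command list (index 0), so the CPU must wait for the GPU to catch up. Only then does it make sense to call SetEventOnCompletion, WaitForSingleObject, and so on.

if (1 < 1)

The GPU has completed the command list (index 0), so the CPU does not need to wait.

mCurrentFence would then be incremented wherever a command list is executed: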

mCommandQueue->ExecuteCommandLists(_countof(cmdsLists), cmdsLists);
mCurrentFence++;

- YoonSeok OH

2 Answers


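As a supplement to Felix's answer:

Tracking the fence value (e.g. mCurrentFence) is useful for waiting on more specific points in the command queue.

For example, suppose we are using the following setup: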

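ComPtr<ID3D12CommandQueue> queue;
ComPtr<ID3D12Fence> queueFence;
HANDLE fenceEv = nullptr; // assumed: auto-reset event created once elsewhere, e.g. CreateEvent(nullptr, FALSE, FALSE, nullptr)
UINT64 fenceVal = 0;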

UINT64 incrementFence()
{
    fenceVal++;
    queue->Signal(queueFence.Get(), fenceVal); // CHECK HRESULT
    return fenceVal;
}

void waitFor(UINT64 fenceVal, DWORD timeout = INFINITE)
{
    if (queueFence->GetCompletedValue() < fenceVal)
    {
        queueFence->SetEventOnCompletion(fenceVal, fenceEv); // CHECK HRESULT
        WaitForSingleObject(fenceEv, timeout);
    }
}

We can then do the following (pseudocode):

SUBMIT COMMANDS 1
cmds1Complete = incrementFence();
    .
    . <- CPU STUFF
    .
SUBMIT COMMANDS 2
cmds2Complete = incrementFence();
    .
    . <- CPU STUFF
    .
waitFor(cmds1Complete)
    .
    . <- CPU STUFF (that needs COMMANDS 1 to be complete,
      but COMMANDS 2 is NOT required to be completed [but also could be])
    .
waitFor(cmds2Complete)
    .
    . <- EVERYTHING COMPLETE
    .

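Since we track fenceVal, we can also write a flush function that simply waits on the tracked fenceVal (rather than on a value returned earlier from incrementFence). This is essentially what FlushCommandQueue does: because it inlines the Signal call, the value it waits on is always the latest one (which is why Felix says tracking it only saves an API call):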

void flushCmdQueue()
{
    waitFor(incrementFence());
}

This example is a little more involved than the original question, but I think it matters when asking why mCurrentFence is tracked.
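The ordering in the pseudocode above can be checked with a small CPU-only model. This is only a sketch in plain C++, not Direct3D code: TimelineModel, Entry, submit, completed and batchesDone are hypothetical names, and "waiting" is modelled by letting the simulated GPU consume queue entries until the fence reaches the requested value.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>

// Hypothetical model: a queue entry is either a batch of work or a fence signal.
struct Entry {
    bool isSignal;
    std::uint64_t value; // only meaningful when isSignal is true
};

struct TimelineModel {
    std::deque<Entry> queue;       // pending "GPU" commands, in submission order
    std::uint64_t fenceVal = 0;    // CPU side: last value handed out
    std::uint64_t completed = 0;   // GPU side: what GetCompletedValue() would return
    std::uint64_t batchesDone = 0; // number of work batches the "GPU" has finished

    // Stand-in for ExecuteCommandLists: queue one batch of work.
    void submit() { queue.push_back({false, 0}); }

    // Mirrors incrementFence(): queue a signal and return the value to wait on.
    std::uint64_t incrementFence() {
        ++fenceVal;
        queue.push_back({true, fenceVal});
        return fenceVal;
    }

    // Stand-in for blocking: run the "GPU" only until the fence reaches target.
    void waitFor(std::uint64_t target) {
        assert(target <= fenceVal); // a value that was never signaled would hang
        while (completed < target) {
            Entry e = queue.front();
            queue.pop_front();
            if (e.isSignal) completed = e.value;
            else ++batchesDone;
        }
    }
};
```

In this model, submitting two batches with a signal after each and then calling waitFor on the first returned value leaves the second batch still queued, which is the "COMMANDS 2 is NOT required to be completed" case from the pseudocode.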

- nedb


- Felix Brüll · Accepted Answer

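mCommandQueue->Signal(mFence.Get(), mCurrentFence) sets the fence to mCurrentFence once all commands previously queued on the command queue have been executed. In this case, the "specified value" is mCurrentFence.

At the start, both the fence value and mCurrentFence are 0. Next, mCurrentFence is incremented to 1. Then we call mCommandQueue->Signal(mFence.Get(), 1), which sets the fence to 1 once everything previously submitted to that queue has finished executing. Finally, we call mFence->SetEventOnCompletion(1, eventHandle) and wait until the fence has been set to 1.

In the next iteration, 1 is replaced by 2, and so on.

Note that mCommandQueue->Signal is a non-blocking operation and does not set the fence value immediately; the value is set only after all previously queued GPU commands have been executed. In this example you can assume that mFence->GetCompletedValue() < mCurrentFence will almost always be true at the point where it is checked.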
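To make the non-blocking behaviour concrete, here is a minimal single-threaded sketch in plain C++. It does not use the Direct3D 12 API: ToyFence, ToyQueue, Execute and Step are hypothetical stand-ins, and Step() plays the role of the GPU consuming one queued command.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <functional>
#include <utility>

// Toy stand-in for ID3D12Fence: just a counter that the "GPU" updates.
struct ToyFence {
    std::uint64_t completed = 0;
    std::uint64_t GetCompletedValue() const { return completed; }
};

// Toy stand-in for ID3D12CommandQueue: commands run strictly in submission order.
class ToyQueue {
public:
    // Enqueue ordinary work (assumed to model an executed command list).
    void Execute(std::function<void()> work) { pending_.push_back(std::move(work)); }

    // Enqueue a fence update. Returns immediately; the fence is NOT changed yet.
    // The fence must outlive the queued command.
    void Signal(ToyFence& fence, std::uint64_t value) {
        pending_.push_back([&fence, value] {
            assert(value > fence.completed); // this model expects increasing values
            fence.completed = value;
        });
    }

    // Simulate the GPU consuming one queued command. Returns false when idle.
    bool Step() {
        if (pending_.empty()) return false;
        std::function<void()> cmd = std::move(pending_.front());
        pending_.pop_front();
        cmd();
        return true;
    }

private:
    std::deque<std::function<void()>> pending_;
};
```

After Execute(...) followed by Signal(fence, 1), GetCompletedValue() in this model still returns 0; it only becomes 1 once Step() has processed both the work and the signal, which is why the CPU has to compare against its own mCurrentFence and wait.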

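Why is the mCurrentFence value needed?

I suppose it isn't strictly necessary, but tracking the fence value this way avoids an additional API call. In this case you could also do: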

// retrieve last value of the fence and increment by one (Additional API call)
auto nextFence = mFence->GetCompletedValue() + 1;
ThrowIfFailed(mCommandQueue->Signal(mFence.Get(), nextFence));

// Wait until the GPU has completed commands up to this fence point.
if(mFence->GetCompletedValue() < nextFence)
{
    HANDLE eventHandle = CreateEventEx(nullptr, false, false, EVENT_ALL_ACCESS);  
    ThrowIfFailed(mFence->SetEventOnCompletion(nextFence, eventHandle));
    WaitForSingleObject(eventHandle, INFINITE);
    CloseHandle(eventHandle);
}
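One caveat about the variant above, offered as a hedged sketch rather than a statement about the original sample: deriving the next value from GetCompletedValue() + 1 relies on no earlier Signal still being pending, which holds in FlushCommandQueue because every call waits before returning. The toy model below (plain C++, no Direct3D; NextValueModel and its members are hypothetical names) shows how the two strategies diverge when two values are requested before the GPU makes progress.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the two ways to pick the next fence value.
struct NextValueModel {
    std::uint64_t completed = 0;  // what GetCompletedValue() would return
    std::uint64_t cpuCounter = 0; // CPU-side counter, like mCurrentFence

    // Strategy from the sample: bump a CPU-side counter.
    std::uint64_t nextFromCounter() { return ++cpuCounter; }

    // Strategy from the variant: read back the fence and add one.
    std::uint64_t nextFromCompleted() const { return completed + 1; }

    // Simulate the GPU reaching a signaled value.
    void gpuReaches(std::uint64_t v) {
        assert(v >= completed); // the model only moves forward
        completed = v;
    }
};
```

With two submissions in flight and no GPU progress, nextFromCounter() yields 1 and then 2, while nextFromCompleted() yields 1 both times, so a wait on the second value could return before the second batch has finished. Once gpuReaches(1) has run, both strategies agree on 2 as the next value.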