Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vulkan AS rebuild-on-replay: On queue submission completion events #3411

Merged
merged 2 commits into from
Sep 6, 2024

Conversation

cmannett85-arm
Copy link
Collaborator

@cmannett85-arm cmannett85-arm commented Aug 17, 2024

If an AS build command is called on an already built AS (i.e. an update build) then we need to safely destroy the previously copied input buffers and their memory. However at the point this happens, we do not own the command buffer so we don't know if the original data has actually been copied yet.

Using the mechanism in this patch we associate function objects with the command buffer that will be executed when the queue the command buffer is submitted to, has finished executing on the GPU.

This works by carying the function objects in command buffer and once it is submitted, we poll on a VkEvent we inserted. Once signalled, we call the function objects in the order they were inserted.

Copy link
Owner

@baldurk baldurk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather have a different solution for this which I think should be simpler/more direct and more tolerant of common application behaviours.

  • We can't expect each submission to complete immediately (or within 5ms) so instead we should check after each submission to see which things we're waiting on have been completed, and add new things into that list. Any which have completed should be processed, any that haven't can continue waiting.
  • Since we're not waiting immediately, we can't use the user's fence as we can't reference it after the submit call returns. In theory the ideal solution would be to add our own semaphore to the submission's signalled semaphore, but I don't see any way in base vulkan to check a semaphore's status from the CPU and although it might be safe to rely on timeline semaphores I'd rather not depend on another extension. Instead I think it's best to set a VkEvent in the command buffer and poll on that.
  • Instead of divorcing the registration of 'callback after X command buffer completes' with its processing via a map lookup, store the callbacks to be called directly in the command buffer record and during queue submit we can move them out into the array of events we're waiting for. We can store a {event, callback} pair.
  • Rather than adding a new class for this which adds another indirection and I find makes things harder to navigate/organise, keep this within WrappedVulkan or at most in VulkanAccelerationStructureManager. Having it in a separate file is fine if you want but I think if the above changes are made there's not much to go there.

If you look in the D3D12 implementation at WrappedID3D12GraphicsCommandList::AddSubmissionASBuildCallback through to WrappedID3D12CommandQueue::ExecuteCommandListsInternal you can see where this kind of callback is recorded in the command buffer, then taken by the queue and passed to the RT-specific manager for polling.

As a minor note - there's no need to store a VkDevice. We only handle a single device and that's unlikely to ever change in future, if it does there will need to be a pass done throughout the codebase to handle it.

@cmannett85-arm
Copy link
Collaborator Author

Ah balls, this is wrong. I'm calling vkCmdSetEvent on the command buffer after vkEndCommandBuffer has been called which bizarrely works on Mali but (quite fairly) causes the VVL to have a meltdown.

Copy link
Owner

@baldurk baldurk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel that this could be simplified a little further overall - the lifetime of the event and needing to track it vs the command buffer in two places makes this a little awkward.

In practice we could just delete the event as soon as we see it's set, but the spec disallows that until we can prove that the command buffer has finished executing and hooking onto the reset/free is a good plan for that. I even wondered if we could pool the events such that we don't have to worry about their lifetime but the spec also bans submitting a use of an event while another use of it is in flight - even though this would again be ordered in practice by the CPU check :(.

I think putting the command buffer more explicitly in control of the event's lifetime would work nicely - since you have one event per command buffer it seems to me to make more sense to tie them together. Then you only have to handle being able to check the event potentially after the command buffer has been reset.

I'd suggest adding a simple refcount for the event, have both the command buffer and the overall global array of callbacks holding a ref to it. Once both have release their ref it deletes itself. You could leave the native VkEvent to be entirely tied to the lifetime of the command buffer and simply have a helper refcounted struct which knows that a deleted event means implicitly that it's been set.

renderdoc/driver/vulkan/vk_core.cpp Outdated Show resolved Hide resolved
renderdoc/driver/vulkan/wrappers/vk_cmd_funcs.cpp Outdated Show resolved Hide resolved
@cmannett85-arm
Copy link
Collaborator Author

Apologies for the delay, I've just got back from being abroad.

I'd suggest adding a simple refcount for the event, have both the command buffer and the overall global array of callbacks holding a ref to it.

Any objections to just using an equivalent to std::shared_ptr<VkEvent>(event, [d](VkEvent *event){ ObjDisp(d)->DestroyEvent(Unwrap(d), *event, NULL); delete event; } for this? I can't see std::shared_ptr being used elsewhere in the codebase but it saves me reinventing the refcounting wheel.

@baldurk
Copy link
Owner

baldurk commented Sep 2, 2024

I'm glad you were able to get away, that's nothing to apologise for :).

Any objections to just using an equivalent to std::shared_ptr

No please don't use a std:: class for this, you can do the same as ResourceRecord for its refcounting:

  int32_t RefCount = 1;
  void AddRef() { Atomic::Inc32(&RefCount); }
  void Release()
  {
    int32_t ref = Atomic::Dec32(&RefCount);
    RDCASSERT(ref >= 0);
    if(ref <= 0)
      delete this;
  }

@cmannett85-arm cmannett85-arm force-pushed the on_q_submission branch 2 times, most recently from a01bc38 to 8ea828f Compare September 2, 2024 21:45
@cmannett85-arm
Copy link
Collaborator Author

I'm not convinced this is much of an improvement... But it's a starting point for iteration!

renderdoc/driver/vulkan/wrappers/vk_cmd_funcs.cpp Outdated Show resolved Hide resolved
renderdoc/driver/vulkan/vk_resources.h Outdated Show resolved Hide resolved
renderdoc/driver/vulkan/vk_resources.h Outdated Show resolved Hide resolved
renderdoc/driver/vulkan/vk_core.cpp Show resolved Hide resolved
renderdoc/driver/vulkan/vk_resources.h Outdated Show resolved Hide resolved
renderdoc/driver/vulkan/vk_core.cpp Outdated Show resolved Hide resolved
Copy link
Owner

@baldurk baldurk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks about ready - I've left a couple of minor notes only.

renderdoc/driver/vulkan/wrappers/vk_queue_funcs.cpp Outdated Show resolved Hide resolved
renderdoc/driver/vulkan/vk_core.cpp Outdated Show resolved Hide resolved
renderdoc/driver/vulkan/vk_core.cpp Show resolved Hide resolved
If an AS build command is called on an already built AS (i.e. an update build) then we need to safely destroy the previously copied input buffers and their memory.  However at the point this happens, we do not own the command buffer so we don't know if the original data has actually been copied yet.

Using the mechanism in this patch we associate function objects with the command buffer that will be executed when the queue the command buffer is submitted to, has finished executing on the GPU.

This works by carying the function objects in command buffer and once it is submitted, we poll on a VkEvent we inserted.  Once signalled, we call the function objects in the order they were inserted.
@baldurk baldurk merged commit f632719 into baldurk:v1.x Sep 6, 2024
16 checks passed
@cmannett85-arm cmannett85-arm deleted the on_q_submission branch September 6, 2024 15:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants