The blit shader path for buffer to image copies is slow, since it has to produce a tiled image from the linear buffer before it can emit the blit copy.
This patch adds a new preferred path that implements the copy on the CPU, similar to what the GL driver does for texture uploads. This makes vkQuake2 at least 4x faster when dynamic lights are enabled, since they trigger dynamic texture updates.
We also tested a GPU path that uses a shader which takes the linear buffer as a UBO and copies directly from it. This also shows a clear performance gain, but it is still slower than the CPU implementation.
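
As a rough illustration of what a CPU buffer to image copy involves, the sketch below writes a linear buffer into a tiled destination. It is not the code from this patch: the 4x4 tile layout, TILE_W, TILE_H, tiled_offset() and copy_linear_to_tiled() are all hypothetical names and assumptions chosen for illustration, and the real V3D tiling formats are more involved.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical tile geometry, for illustration only. This is NOT the
 * real V3D tiling layout. */
#define TILE_W 4
#define TILE_H 4

/* Byte offset of pixel (x, y) in an assumed layout where tiles are stored
 * one after another in row-major order, and pixels inside each tile are
 * also row-major. */
static size_t
tiled_offset(uint32_t x, uint32_t y, uint32_t width, uint32_t cpp)
{
   uint32_t tiles_per_row = (width + TILE_W - 1) / TILE_W;
   uint32_t tile_index = (y / TILE_H) * tiles_per_row + (x / TILE_W);
   uint32_t in_tile = (y % TILE_H) * TILE_W + (x % TILE_W);
   return ((size_t)tile_index * TILE_W * TILE_H + in_tile) * cpp;
}

/* Copy a linear buffer into the tiled destination on the CPU.
 * Each memcpy moves the part of one source row that falls inside one tile.
 * dst must be sized for whole tiles in both dimensions. */
static void
copy_linear_to_tiled(uint8_t *dst, const uint8_t *src,
                     uint32_t src_stride, uint32_t width, uint32_t height,
                     uint32_t cpp)
{
   for (uint32_t y = 0; y < height; y++) {
      for (uint32_t x = 0; x < width; x += TILE_W) {
         /* Clamp the span at the right edge of the image. */
         uint32_t span = width - x < TILE_W ? width - x : TILE_W;
         memcpy(dst + tiled_offset(x, y, width, cpp),
                src + (size_t)y * src_stride + (size_t)x * cpp,
                (size_t)span * cpp);
      }
   }
}
```

In this sketch the copy writes the tiled destination directly, so no intermediate tiled image has to be produced first, which is the step that makes the blit shader path expensive.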
1f8343b8752 v3dv: add a CPU path for buffer to image copies
src/broadcom/vulkan/v3dv_meta_copy.c | 68 ++++++++++++++++++++++++++++++++++++
src/broadcom/vulkan/v3dv_private.h | 27 ++++++++++----
src/broadcom/vulkan/v3dv_queue.c | 49 ++++++++++++++++++++++++++
3 files changed, 138 insertions(+), 6 deletions(-)