aco: implement 16-bit literals

Graphics / Mesa 3D Graphics Library / Mesa - Rhys Perry [gmail.com] - 15 October 2020 11:33 UTC

We can copy any value into a 16-bit subregister with a 3-dword
v_pack_b32_f16 on GFX10 or a v_and_b32 + v_or_b32 pair on GFX9.
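
As a rough illustration of the two strategies named above, the following standalone C++ sketch models only their bit-level effect on a 32-bit register. It is not ACO code: the helper names (write_half_and_or, pack_halves, write_half_pack) are hypothetical, the operand ordering is an assumption, and v_pack_b32_f16 is simplified to a plain bit pack of two 16-bit halves.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers: they model the bit-level result only, not ACO's
// actual lowering code or the exact operands the hardware instructions take.

// GFX9-style sequence: v_and_b32 clears the destination half, then
// v_or_b32 merges the (possibly shifted) 16-bit literal into it.
constexpr uint32_t write_half_and_or(uint32_t reg, uint16_t literal, bool high_half)
{
   const uint32_t keep_mask = high_half ? 0x0000ffffu : 0xffff0000u;
   const uint32_t shifted = high_half ? uint32_t(literal) << 16 : uint32_t(literal);
   const uint32_t cleared = reg & keep_mask; // models v_and_b32
   return cleared | shifted;                 // models v_or_b32
}

// Simplified model of v_pack_b32_f16: combine two 16-bit values into a dword.
constexpr uint32_t pack_halves(uint16_t lo, uint16_t hi)
{
   return uint32_t(lo) | (uint32_t(hi) << 16);
}

// GFX10-style: one pack instruction whose other operand is the half of the
// register that must be preserved.
constexpr uint32_t write_half_pack(uint32_t reg, uint16_t literal, bool high_half)
{
   const uint16_t old_lo = uint16_t(reg & 0xffffu);
   const uint16_t old_hi = uint16_t(reg >> 16);
   return high_half ? pack_halves(old_lo, literal) : pack_halves(literal, old_hi);
}

// Both strategies should leave the untouched half intact and agree on the result.
inline void self_check()
{
   assert(write_half_and_or(0xaaaabbbbu, 0x1234, false) == 0xaaaa1234u);
   assert(write_half_and_or(0xaaaabbbbu, 0x1234, true) == 0x1234bbbbu);
   assert(write_half_pack(0xaaaabbbbu, 0x1234, false) == 0xaaaa1234u);
   assert(write_half_pack(0xaaaabbbbu, 0x1234, true) == 0x1234bbbbu);
}
```

In this model the preserved half comes from the destination register itself, so the sequence that is needed depends on which half of which register the literal lands in.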

Because the generated code can depend on the register assignment, and
to improve constant propagation, Builder::copy creates a
p_create_vector for sub-dword literals.

1a652244e4b aco: implement 16-bit literals
 src/amd/compiler/aco_builder_h.py           |   2 +-
 src/amd/compiler/aco_lower_to_hw_instr.cpp  |  41 ++++++++++
 src/amd/compiler/aco_validate.cpp           |   1 -
 src/amd/compiler/tests/test_to_hw_instr.cpp | 113 ++++++++++++++++++++++++++++
4 files changed, 155 insertions(+), 2 deletions(-)

Upstream: cgit.freedesktop.org

