nir: add partial loop unrolling support

Timothy Arceri [itsqueeze.com] - 12 March 2019 00:52 EDT

This adds partial loop unrolling support and makes use of a guessed trip count based on array access.

The code is written so that we could use partial unrolling more generally, but for now it's only used when we have guessed the trip count.

We use partial unrolling for this guessed trip count because it's possible that any out-of-bounds array access doesn't otherwise affect the shader, e.g. the stores/loads to/from the array are unused. So we insert a copy of the loop in the innermost continue branch of the unrolled loop. Later on it's possible for nir_opt_dead_cf() to then remove the loop in some cases.
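As a rough illustration of the shape of this transform (plain C, not NIR and not the code in nir_opt_loop_unroll.c), the sketch below assumes a loop whose counter indexes a 4-element array, so the guessed trip count is 4. The function names, the summing loop body and the value 4 are all hypothetical and chosen only for the example.

```c
#include <stddef.h>

/* Original form: the limit is not known at compile time.
 * Hypothetical example loop, not taken from the patch. */
static int sum_rolled(const int *data, size_t limit)
{
    int sum = 0;
    for (size_t i = 0; i < limit; i++)
        sum += data[i];
    return sum;
}

/* Partially unrolled form, assuming a guessed trip count of 4.
 * Each unrolled iteration keeps its own exit check, and the
 * continue branch of the last one holds a copy of the original
 * loop in case the guess was too small. */
static int sum_partially_unrolled(const int *data, size_t limit)
{
    int sum = 0;
    size_t i = 0;

    if (i < limit) {                    /* unrolled iteration 1 */
        sum += data[i++];
        if (i < limit) {                /* unrolled iteration 2 */
            sum += data[i++];
            if (i < limit) {            /* unrolled iteration 3 */
                sum += data[i++];
                if (i < limit) {        /* unrolled iteration 4 */
                    sum += data[i++];
                    /* Innermost continue branch: fallback copy of
                     * the original loop for any remaining trips. */
                    for (; i < limit; i++)
                        sum += data[i];
                }
            }
        }
    }
    return sum;
}
```

Both functions return the same result for any limit. When the guess holds, the fallback loop never executes, and if a later pass can show it has no effect it can be dropped entirely.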

A RenderDoc capture from the Rise of the Tomb Raider benchmark reports the following change in an affected compute shader:

GPU duration: 350 -> 325 microseconds

shader-db results radeonsi VEGA (NIR backend):

SGPRS: 1008 -> 816 (-19.05 %)
VGPRS: 684 -> 432 (-36.84 %)
Spilled SGPRs: 539 -> 0 (-100.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 39708 -> 45812 (15.37 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 105 -> 144 (37.14 %)
Wait states: 0 -> 0 (0.00 %)

shader-db results i965 SKL:

total instructions in shared programs: 13098265 -> 13103359 (0.04%)
instructions in affected programs: 5126 -> 10220 (99.38%)
helped: 0
HURT: 21

total cycles in shared programs: 332039949 -> 331985622 (-0.02%)
cycles in affected programs: 289252 -> 234925 (-18.78%)
helped: 12
HURT: 9

vkpipeline-db results VEGA:

Totals from affected shaders:
SGPRS: 184 -> 184 (0.00 %)
VGPRS: 448 -> 448 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 26076 -> 24428 (-6.32 %) bytes
LDS: 6 -> 6 (0.00 %) blocks
Max Waves: 5 -> 5 (0.00 %)
Wait states: 0 -> 0 (0.00 %)

e8a8937a04f nir: add partial loop unrolling support
src/compiler/nir/nir_opt_loop_unroll.c | 207 +++++++++++++++++++++++++++++++--
1 file changed, 199 insertions(+), 8 deletions(-)

Upstream: cgit.freedesktop.org

