i965: Switch to scalar TCS by default

Mesa - Kenneth Graunke [whitecape.org] - 5 May 2016 16:24 UTC

Normally, we expect SIMD8 shaders to have more instructions than SIMD4x2 shaders, as it takes four instructions to operate on a vec4, rather than a single instruction. However, the benefit is that SIMD8 can process 8 objects per shader thread instead of 2.
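The trade-off above can be sketched as simple arithmetic. This is only an illustrative model, not anything from the driver: the function names and the count of 10 vec4 operations are made up for the example, and it ignores URB messages, scheduling, and register pressure.

```python
def thread_instructions(instr_per_vec4_op, vec4_ops):
    """Instructions one shader thread executes for `vec4_ops` vec4 operations."""
    return instr_per_vec4_op * vec4_ops


def instructions_per_object(instr_per_vec4_op, objects_per_thread, vec4_ops):
    """Thread instruction count amortized over the objects the thread handles."""
    return thread_instructions(instr_per_vec4_op, vec4_ops) / objects_per_thread


VEC4_OPS = 10  # hypothetical shader: 10 vec4 operations

# SIMD4x2: one instruction per vec4 op, 2 objects per thread.
simd4x2_thread = thread_instructions(1, VEC4_OPS)
simd4x2_per_object = instructions_per_object(1, 2, VEC4_OPS)

# SIMD8: four instructions per vec4 op (one per channel), 8 objects per thread.
simd8_thread = thread_instructions(4, VEC4_OPS)
simd8_per_object = instructions_per_object(4, 8, VEC4_OPS)

print(simd4x2_thread, simd4x2_per_object)
print(simd8_thread, simd8_per_object)
```

In this naive model the SIMD8 program is four times longer, but the per-object cost comes out the same, so any saving in per-thread overhead tips the balance toward SIMD8.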

Surprisingly, the shader-db statistics show an improvement in both instruction and cycle counts in most cases:

Synmark: -31.25% instructions, -29.27% cycles, 0 hurt.
Tessmark: -36.92% instructions, -37.81% cycles, 0 hurt.
Unigine Heaven: -3.42% instructions, -17.95% cycles, 0 hurt.
Shadow of Mordor: +13.24% instructions (26 with fewer instructions, 45 with more), -5.23% cycles (44 with fewer cycles, 27 with more cycles).

Presumably, this is because the SIMD8 URB messages are a much more natural fit than the SIMD4x2 URB messages - there's far less header setup.

I benchmarked Shadow of Mordor and Unigine Heaven on my Skylake GT3e, and the performance seems to be the same or increase ever so slightly (< 1 FPS difference). So I believe it's strictly superior.

There's also a lot more optimization potential in scalar mode.

This will also help us finish fp64 support, as scalar support is going to land much sooner than vec4-mode support.

b593737 i965: Switch to scalar TCS by default.
src/mesa/drivers/dri/i965/brw_compiler.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

Upstream: cgit.freedesktop.org

