This effectively does the opposite of nir_lower_alus_to_scalar, trying to combine per-component ALU operations with the same sources but different swizzles into one larger ALU operation. It uses a similar model as CSE, where we do a depth-first approach and keep around a hash set of instructions to be combined, but there are a few major differences:
1. For now, we only support entirely per-component ALU operations. 2. Since it's not always guaranteed that we'll be able to combine equivalent instructions, we keep a stack of equivalent instructions around, trying to combine new instructions with instructions on the stack.
The pass isn't comprehensive by far; it can't handle operations where some of the sources are per-component and others aren't, and it can't handle phi nodes. But it should handle the more common cases, and it should be reasonably efficient.
[Alyssa: Rebase on latest master, updating with respect to typeless moves]
47e7c6961a3 nir: add a vectorization pass
src/compiler/Makefile.sources | 1 +
src/compiler/nir/meson.build | 1 +
src/compiler/nir/nir.h | 2 +
src/compiler/nir/nir_opt_vectorize.c | 453 +++++++++++++++++++++++++++++++++++
4 files changed, 457 insertions(+)