lavu/x86/lls: add fma3 optimizations for update_lls

Multimedia / FFmpeg - Ganesh Ajjanagadde [gmail.com] - 15 January 2016 15:46 UTC

This improves accuracy (very slightly) and speed on processors that support FMA3; in the sample benchmark below, ff_lpc_calc_coefs takes roughly 12% fewer decicycles.

Sample benchmark (fate flac-16-lpc-cholesky, Haswell):
old:
5993610 decicycles in ff_lpc_calc_coefs, 64 runs, 0 skips
5951528 decicycles in ff_lpc_calc_coefs, 128 runs, 0 skips

new:
5252410 decicycles in ff_lpc_calc_coefs, 64 runs, 0 skips
5232869 decicycles in ff_lpc_calc_coefs, 128 runs, 0 skips

Tested with FATE and with --disable-fma3; also examined the contents of lavu/lls-test.

5989add lavu/x86/lls: add fma3 optimizations for update_lls
libavutil/x86/lls.asm | 59 ++++++++++++++++++++++++++++++++++++++++++++--
libavutil/x86/lls_init.c | 4 ++++
2 files changed, 61 insertions(+), 2 deletions(-)
