aarch64: Optimized implementation of strnlen

System Internals / glibc - Xuelei Zhang [huawei.com] - 19 December 2019 19:31 UTC

Optimize the strnlen implementation by using vector operations and loop unrolling in the main loop. Compared to the existing aarch64/strnlen.S, it reduces latency in the bench-strnlen cases by 11%~24% when the length of src is greater than 64 bytes, with gains throughout the benchmark.
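
As a rough illustration of the loop-unrolling idea only, here is a minimal portable C sketch. It is not the code from this patch: the function name sketch_strnlen and the unroll factor of four are assumptions made for the example, and it checks bytes one at a time rather than using the vector loads of the aarch64 assembly, since portable C cannot safely read past the terminating NUL.

```c
#include <stddef.h>

/* Hypothetical helper for illustration; not the glibc symbol and not
   the assembly from this patch. */
size_t sketch_strnlen(const char *s, size_t maxlen)
{
    size_t i = 0;

    /* Main loop, unrolled by four: the bound is tested once per four
       bytes instead of once per byte. Each byte is still checked in
       order, so nothing past the NUL is read. */
    while (maxlen - i >= 4) {
        if (s[i] == '\0')     return i;
        if (s[i + 1] == '\0') return i + 1;
        if (s[i + 2] == '\0') return i + 2;
        if (s[i + 3] == '\0') return i + 3;
        i += 4;
    }

    /* Tail: fewer than four bytes remain within maxlen. */
    while (i < maxlen && s[i] != '\0')
        i++;

    return i;
}
```

The sketch only shows why unrolling helps: longer inputs spend more time in the main loop, where the per-byte loop overhead is amortized. That is in line with the gains reported above for lengths over 64 bytes, although the sketch makes no claim about the actual speedup.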

Checked on aarch64-linux-gnu.

2911cb68ed aarch64: Optimized implementation of strnlen
sysdeps/aarch64/strnlen.S | 52 ++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 51 insertions(+), 1 deletion(-)

Upstream: sourceware.org

