aarch64: Optimized memchr specific to AmpereComputing emag

System Internals / glibc - Feng Xue [os.amperecomputing.com] - 1 February 2019 13:14 EST

This version uses general-register-based memory instructions to load data, because vector-register-based loads are slightly slower on emag.

Each iteration performs character matching in parallel on a 16-byte memory block (16 bytes in both size and alignment).
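The idea, in rough C rather than the committed AArch64 assembly (the helper names are illustrative, and the real code also locates the exact matching byte and handles unaligned heads and tails): two 64-bit general-register loads cover the block, XOR turns matching bytes into zero bytes, and the classic zero-byte test checks all eight lanes of each word at once.

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    /* Broadcast byte c into every byte lane of a 64-bit word.  */
    static uint64_t broadcast (unsigned char c)
    {
      return 0x0101010101010101ULL * c;
    }

    /* Nonzero iff some byte lane of v is zero (classic SWAR test).  */
    static uint64_t has_zero_byte (uint64_t v)
    {
      return (v - 0x0101010101010101ULL) & ~v & 0x8080808080808080ULL;
    }

    /* Test one aligned 16-byte block with two general-register loads;
       XOR with the broadcast pattern turns matching bytes into zero
       bytes, which has_zero_byte then detects in all lanes at once.  */
    static int block_has_match (const unsigned char *p, unsigned char c)
    {
      uint64_t lo, hi, pat = broadcast (c);
      memcpy (&lo, p, 8);       /* a single ldp pair in the assembly */
      memcpy (&hi, p + 8, 8);
      return has_zero_byte (lo ^ pat) || has_zero_byte (hi ^ pat);
    }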

- sysdeps/aarch64/memchr.S (__memchr): Rename to MEMCHR. [!MEMCHR](MEMCHR): Set to __memchr.
- sysdeps/aarch64/multiarch/Makefile (sysdep_routines): Add memchr_generic and memchr_nosimd.
- sysdeps/aarch64/multiarch/ifunc-impl-list.c (__libc_ifunc_impl_list): Add memchr ifuncs.
- sysdeps/aarch64/multiarch/memchr.c: New file.
- sysdeps/aarch64/multiarch/memchr_generic.S: Likewise.
- sysdeps/aarch64/multiarch/memchr_nosimd.S: Likewise.
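For context on the ifunc entries above: the new multiarch/memchr.c binds the exported symbol to one of the two implementations at relocation time. A minimal, self-contained sketch of that dispatch pattern (the stand-in function bodies, the cpu_is_emag check, and the my_memchr name are all placeholders; the real file uses glibc's internal IFUNC macros and identifies the CPU via MIDR_EL1 through its cpu-features machinery):

    #include <stddef.h>

    /* Stand-ins for the two committed implementations (the real ones
       are the assembly files memchr_generic.S and memchr_nosimd.S).  */
    static void *
    memchr_generic (const void *s, int c, size_t n)
    {
      const unsigned char *p = s;
      for (; n > 0; p++, n--)
        if (*p == (unsigned char) c)
          return (void *) p;
      return NULL;
    }

    static void *
    memchr_nosimd (const void *s, int c, size_t n)
    {
      /* Placeholder: would be the general-register version.  */
      return memchr_generic (s, c, n);
    }

    /* Placeholder CPU check.  */
    static int
    cpu_is_emag (void)
    {
      return 0;
    }

    typedef void *(*memchr_fn) (const void *, int, size_t);

    /* ifunc resolver: runs once at relocation time and returns the
       function the public symbol should bind to.  */
    static memchr_fn
    my_memchr_resolver (void)
    {
      return cpu_is_emag () ? memchr_nosimd : memchr_generic;
    }

    void *my_memchr (const void *, int, size_t)
      __attribute__ ((ifunc ("my_memchr_resolver")));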

83d1cc42d8 aarch64: Optimized memchr specific to AmpereComputing emag
ChangeLog | 12 ++
sysdeps/aarch64/memchr.S | 10 +-
sysdeps/aarch64/multiarch/Makefile | 1 +
sysdeps/aarch64/multiarch/ifunc-impl-list.c | 3 +
sysdeps/aarch64/multiarch/memchr.c | 41 +++++
sysdeps/aarch64/multiarch/memchr_generic.S | 33 ++++
sysdeps/aarch64/multiarch/memchr_nosimd.S | 223 ++++++++++++++++++++++++++++
7 files changed, 320 insertions(+), 3 deletions(-)

Upstream: sourceware.org

