X86-64: Remove previous default/SSE2/AVX2 memcpy/memmove

System Internals / glibc - H.J. Lu [gmail.com] - 8 June 2016 15:58 UTC

Since the new SSE2/AVX2 memcpy/memmove are faster than the previous ones, we can remove the previous SSE2/AVX2 memcpy/memmove and replace them with the new ones.

No change in IFUNC selection if SSE2 and AVX2 memcpy/memmove weren't used before. If SSE2 or AVX2 memcpy/memmove were used, the new SSE2 or AVX2 memcpy/memmove optimized with Enhanced REP MOVSB will be used for processors with ERMS. The new AVX512 memcpy/memmove will be used for processors with AVX512 which prefer vzeroupper.
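The selection described above can be sketched in C. This is a simplified model, not the resolver in sysdeps/x86_64/multiarch/memcpy.S: the `cpu_flags` struct and `select_memcpy` function are hypothetical, the SSSE3 and other unchanged paths are omitted, and two points are assumptions: that ERMS also picks the `_erms` variant on the AVX512 path, and that `__memcpy_avx_unaligned` is the non-ERMS AVX name after the suffix change.

```c
#include <stdbool.h>

/* Hypothetical feature flags for illustration only; glibc keeps the
   real ones in its own CPU feature data. */
struct cpu_flags {
    bool avx512_usable;        /* AVX512 can be used */
    bool prefer_no_vzeroupper; /* processor prefers code without vzeroupper */
    bool avx_fast_unaligned;   /* the AVX2 unaligned variant was chosen before */
    bool erms;                 /* Enhanced REP MOVSB is available */
};

/* Return the name of the variant this simplified model would pick. */
const char *select_memcpy(const struct cpu_flags *f)
{
    /* AVX512 processors that prefer vzeroupper get the new AVX512 code.
       Assumption: ERMS selects the _erms variant here as well. */
    if (f->avx512_usable && !f->prefer_no_vzeroupper)
        return f->erms ? "__memcpy_avx512_unaligned_erms"
                       : "__memcpy_avx512_unaligned";

    /* Where the AVX2 variant was used before, ERMS upgrades it. */
    if (f->avx_fast_unaligned)
        return f->erms ? "__memcpy_avx_unaligned_erms"
                       : "__memcpy_avx_unaligned"; /* assumed name */

    /* SSE2 path: ERMS upgrades it, otherwise the default is used. */
    return f->erms ? "__memcpy_sse2_unaligned_erms"
                   : "__memcpy_sse2_unaligned";
}
```

Calling `select_memcpy` with all flags cleared returns the default `__memcpy_sse2_unaligned`; setting only `erms` moves it to `__memcpy_sse2_unaligned_erms`.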

Since the new SSE2 memcpy/memmove are faster than the previous default memcpy/memmove used in libc.a and ld.so, we also remove the previous default memcpy/memmove and make the new SSE2 versions the default memcpy/memmove, except that non-temporal store isn't used in ld.so.

Together, these changes reduce the size of libc.so by about 6 KB and the size of ld.so by about 2 KB.

[BZ #19776]
- sysdeps/x86_64/memcpy.S: Make it dummy.
- sysdeps/x86_64/mempcpy.S: Likewise.
- sysdeps/x86_64/memmove.S: New file.
- sysdeps/x86_64/memmove_chk.S: Likewise.
- sysdeps/x86_64/multiarch/memmove.S: Likewise.
- sysdeps/x86_64/multiarch/memmove_chk.S: Likewise.
- sysdeps/x86_64/memmove.c: Removed.
- sysdeps/x86_64/multiarch/memcpy-avx-unaligned.S: Likewise.
- sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S: Likewise.
- sysdeps/x86_64/multiarch/memmove-avx-unaligned.S: Likewise.
- sysdeps/x86_64/multiarch/memmove-sse2-unaligned-erms.S: Likewise.
- sysdeps/x86_64/multiarch/memmove.c: Likewise.
- sysdeps/x86_64/multiarch/memmove_chk.c: Likewise.
- sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Remove memcpy-sse2-unaligned, memmove-avx-unaligned, memcpy-avx-unaligned and memmove-sse2-unaligned-erms.
- sysdeps/x86_64/multiarch/ifunc-impl-list.c (__libc_ifunc_impl_list): Replace __memmove_chk_avx512_unaligned_2 with __memmove_chk_avx512_unaligned. Remove __memmove_chk_avx_unaligned_2. Replace __memmove_chk_sse2_unaligned_2 with __memmove_chk_sse2_unaligned. Remove __memmove_chk_sse2 and __memmove_avx_unaligned_2. Replace __memmove_avx512_unaligned_2 with __memmove_avx512_unaligned. Replace __memmove_sse2_unaligned_2 with __memmove_sse2_unaligned. Remove __memmove_sse2. Replace __memcpy_chk_avx512_unaligned_2 with __memcpy_chk_avx512_unaligned. Remove __memcpy_chk_avx_unaligned_2. Replace __memcpy_chk_sse2_unaligned_2 with __memcpy_chk_sse2_unaligned. Remove __memcpy_chk_sse2. Remove __memcpy_avx_unaligned_2. Replace __memcpy_avx512_unaligned_2 with __memcpy_avx512_unaligned. Remove __memcpy_sse2_unaligned_2 and __memcpy_sse2. Replace __mempcpy_chk_avx512_unaligned_2 with __mempcpy_chk_avx512_unaligned. Remove __mempcpy_chk_avx_unaligned_2. Replace __mempcpy_chk_sse2_unaligned_2 with __mempcpy_chk_sse2_unaligned. Remove __mempcpy_chk_sse2. Replace __mempcpy_avx512_unaligned_2 with __mempcpy_avx512_unaligned. Remove __mempcpy_avx_unaligned_2. Replace __mempcpy_sse2_unaligned_2 with __mempcpy_sse2_unaligned. Remove __mempcpy_sse2.
- sysdeps/x86_64/multiarch/memcpy.S (__new_memcpy): Support __memcpy_avx512_unaligned_erms and __memcpy_avx512_unaligned. Use __memcpy_avx_unaligned_erms and __memcpy_sse2_unaligned_erms if processor has ERMS. Default to __memcpy_sse2_unaligned. (ENTRY): Removed. (END): Likewise. (ENTRY_CHK): Likewise. (libc_hidden_builtin_def): Likewise. Don't include ../memcpy.S.
- sysdeps/x86_64/multiarch/memcpy_chk.S (__memcpy_chk): Support __memcpy_chk_avx512_unaligned_erms and __memcpy_chk_avx512_unaligned. Use __memcpy_chk_avx_unaligned_erms and __memcpy_chk_sse2_unaligned_erms if processor has ERMS. Default to __memcpy_chk_sse2_unaligned.
- sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S: Change function suffix from unaligned_2 to unaligned.
- sysdeps/x86_64/multiarch/mempcpy.S (__mempcpy): Support __mempcpy_avx512_unaligned_erms and __mempcpy_avx512_unaligned. Use __mempcpy_avx_unaligned_erms and __mempcpy_sse2_unaligned_erms if processor has ERMS. Default to __mempcpy_sse2_unaligned. (ENTRY): Removed. (END): Likewise. (ENTRY_CHK): Likewise. (libc_hidden_builtin_def): Likewise. Don't include ../mempcpy.S. (mempcpy): New. Add a weak alias.
- sysdeps/x86_64/multiarch/mempcpy_chk.S (__mempcpy_chk): Support __mempcpy_chk_avx512_unaligned_erms and __mempcpy_chk_avx512_unaligned. Use __mempcpy_chk_avx_unaligned_erms and __mempcpy_chk_sse2_unaligned_erms if processor has ERMS. Default to __mempcpy_chk_sse2_unaligned.
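
The mempcpy.S entry above adds a weak alias named mempcpy. The patch does this in assembly; as a rough illustration of the same idea, the sketch below uses the GCC/Clang `alias` attribute on an ELF target. Both function names are hypothetical stand-ins, not glibc symbols.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical strong definition, standing in for an internal name
   such as __mempcpy. Copies n bytes and returns the end of dst. */
void *demo_internal_mempcpy(void *dst, const void *src, size_t n)
{
    memcpy(dst, src, n);
    return (char *)dst + n;
}

/* Weak alias: demo_mempcpy refers to the same code, but another
   strong definition of demo_mempcpy may override it at link time. */
void *demo_mempcpy(void *dst, const void *src, size_t n)
    __attribute__((weak, alias("demo_internal_mempcpy")));
```

Callers use `demo_mempcpy` as an ordinary function; the alias adds no extra call or copy of the code.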

c867597 X86-64: Remove previous default/SSE2/AVX2 memcpy/memmove
ChangeLog | 80 +++
sysdeps/x86_64/memcpy.S | 585 +-------------------
sysdeps/x86_64/memmove.S | 71 +++
sysdeps/x86_64/memmove.c | 26 -
sysdeps/x86_64/memmove_chk.S | 33 ++
sysdeps/x86_64/mempcpy.S | 9 +-
sysdeps/x86_64/multiarch/Makefile | 6 +-
sysdeps/x86_64/multiarch/ifunc-impl-list.c | 63 +--
sysdeps/x86_64/multiarch/memcpy-avx-unaligned.S | 391 -------------
sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S | 175 ------
sysdeps/x86_64/multiarch/memcpy.S | 62 +--
sysdeps/x86_64/multiarch/memcpy_chk.S | 40 +-
sysdeps/x86_64/multiarch/memmove-avx-unaligned.S | 22 -
.../x86_64/multiarch/memmove-sse2-unaligned-erms.S | 13 -
.../x86_64/multiarch/memmove-vec-unaligned-erms.S | 24 +-
sysdeps/x86_64/multiarch/memmove.S | 98 ++++
sysdeps/x86_64/multiarch/memmove.c | 73 ---
sysdeps/x86_64/multiarch/memmove_chk.S | 71 +++
sysdeps/x86_64/multiarch/memmove_chk.c | 46 --
sysdeps/x86_64/multiarch/mempcpy.S | 74 +--
sysdeps/x86_64/multiarch/mempcpy_chk.S | 38 +-
21 files changed, 492 insertions(+), 1508 deletions(-)

Upstream: sourceware.org

