Add option to force indirect calls for x86

Programming / Compilers / GCC - ak [138bc75d-0d04-0410-961f-82ee72b054a4] - 9 November 2017 05:42 EST

This patch adds a -mforce-indirect-call option that forces all calls and tail calls between functions on x86_64 to be indirect. This is similar to the large code model, but it does not affect jumps inside functions, so it has much less run-time overhead.
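As a rough illustration of what the option acts on, here is a minimal sketch with one ordinary call and one tail call. The function names are hypothetical and this is not one of the tests added by the patch. The assembly one would expect is an assumption based on the description above; the exact output depends on the GCC version and the other flags in use.

```c
/* Sketch only. To inspect the generated code, something like
 *   gcc -O2 -mforce-indirect-call -S example.c
 * should work on a GCC that includes this patch. */

/* noinline keeps the calls below from being inlined away at -O2. */
__attribute__((noinline)) int add_one(int x)
{
    return x + 1;
}

/* Ordinary call: the result is used after the callee returns.
 * Without the option this is normally a direct "call add_one";
 * with it, the call is expected to go through a register or
 * memory operand instead (an indirect call). */
int call_then_use(int x)
{
    return add_one(x) * 2;
}

/* Tail call: at -O2 this is normally a direct "jmp add_one";
 * with the option it is expected to become an indirect jump. */
int tail_call(int x)
{
    return add_one(x);
}
```

Comparing the assembly produced with and without the option should show only the call and tail-call sites changing, while branches inside each function stay as they were.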

This is useful with Intel Processor Trace (PT). PT has precise timing for indirect calls/jumps, but not for direct ones. Forcing calls to be indirect therefore makes it possible to time every function relatively accurately (minus the overhead of the indirect branch).

Without this, short functions often don't see a timing update and cannot be measured.

The timing requires at least Skylake or Goldmont based CPUs.

I made it an option. Originally I tried to make it a new code model, but since it can be combined with other code models (medium, pic, kernel etc.) this turned out to be too many combinations.

For example, tracing gcc. The first column of the output is a nanosecond timestamp for each function.

$ perf record -e intel_pt/noretcomp=1,cyc=1,cyc_thresh=1/u ./cc1 -O3 hello.c
$ perf script --itrace=cr -F callindent,time,sym,addr --ns | sed -n 180000,182000p | less

1184596.432756920: build_int_cst => 79c9de c_common_nodes_and_builtins
1184596.432756921: tree_cons => ee2080 tree_cons
1184596.432756938: ggc_internal_alloc => 80f3e0 ggc_internal_alloc
1184596.432756951: memset@plt => 598af0 memset@plt
1184596.432756967: __memset_avx2_unaligned_erms => 80f605 ggc_internal_alloc
1184596.432756969: ggc_internal_alloc => ee20a2 tree_cons
1184596.432756973: tree_cons => 79c9f4 c_common_nodes_and_builtins
1184596.432756974: build_int_cst => ef9a40 build_int_cst
1184596.432756996: wide_int_to_tree => ef93a0 wide_int_to_tree
1184596.432757000: wi::force_to_size => f48f70 wi::force_to_size
1184596.432757005: canonize => ef94de wide_int_to_tree
1184596.432757021: get_int_cst_ext_nunits => ee1960 get_int_cst_ext_nunits
1184596.432757026: get_int_cst_ext_nunits => ef94fe wide_int_to_tree
1184596.432757042: tree_int_cst_elt_check => 83e310 tree_int_cst_elt_check
1184596.432757044: tree_int_cst_elt_check => ef9761 wide_int_to_tree
1184596.432757046: wide_int_to_tree => ef9a9b build_int_cst
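The per-function timing comes from the differences between consecutive timestamps in a trace like this one. A minimal sketch of that arithmetic follows. The helper names are hypothetical, and it assumes each line starts with a `seconds.nanoseconds:` field with exactly nine fractional digits, as in the sample output. Integers are used rather than floating point because a double cannot reliably hold this many significant digits.

```c
#include <stdio.h>

/* Parse the leading "seconds.nanoseconds:" field of one trace line
 * into a single nanosecond count.
 * Returns 0 on success, -1 if the line does not start with a timestamp.
 * Assumes nine fractional digits (nanosecond resolution). */
static int parse_timestamp_ns(const char *line, long long *out_ns)
{
    long long sec, nsec;

    if (sscanf(line, "%lld.%9lld:", &sec, &nsec) != 2)
        return -1;
    *out_ns = sec * 1000000000LL + nsec;
    return 0;
}

/* Nanoseconds elapsed between two trace lines.
 * Returns -1 if either line cannot be parsed; this sketch assumes
 * "later" is not earlier than "earlier", so -1 is never a real delta. */
long long delta_ns(const char *earlier, const char *later)
{
    long long t0, t1;

    if (parse_timestamp_ns(earlier, &t0) != 0 ||
        parse_timestamp_ns(later, &t1) != 0)
        return -1;
    return t1 - t0;
}
```

Applied to the ggc_internal_alloc and memset@plt lines of the sample, `delta_ns` yields 13, the number of nanoseconds between those two trace events.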

gcc/: 2017-11-08 Andi Kleen

- config/i386/i386.opt: Add -mforce-indirect-call.
- config/i386/predicates.md: Check for flag_force_indirect_call.
- doc/invoke.texi: Document -mforce-indirect-call.

gcc/testsuite/: 2017-11-08 Andi Kleen

- gcc.target/i386/force-indirect-call-1.c: New test.
- gcc.target/i386/force-indirect-call-2.c: New test.
- gcc.target/i386/force-indirect-call-3.c: New test.

37db795769b Add option to force indirect calls for x86
gcc/ChangeLog | 6 ++++++
gcc/config/i386/i386.opt | 4 ++++
gcc/config/i386/predicates.md | 3 ++-
gcc/doc/invoke.texi | 8 +++++++-
gcc/testsuite/ChangeLog | 6 ++++++
.../gcc.target/i386/force-indirect-call-1.c | 23 ++++++++++++++++++++++
.../gcc.target/i386/force-indirect-call-2.c | 5 +++++
.../gcc.target/i386/force-indirect-call-3.c | 5 +++++
8 files changed, 58 insertions(+), 2 deletions(-)

Upstream: gcc.gnu.org
