Explicitly copy internal::wordmask to the stack array to work around a compiler

optimization with LLVM that converts const stack arrays to global arrays.  This
is a temporary change and should be reverted once https://reviews.llvm.org/D30759
lands.

In position-independent executables (PIE), accessing a stack array is cheaper
than accessing a global array, which is why wordmask was moved to the stack.
However, LLVM automatically converts stack arrays that it detects as constant
into global arrays, and this transformation hurts PIE performance with LLVM.

We are working to fix this in the LLVM compiler, via
https://reviews.llvm.org/D30759, so that it does not do this conversion in PIE
mode.  Until that patch lands, please consider this source change a temporary
workaround to keep this array on the stack.  This source change is important
to allow some projects to flip the default compiler from GCC to LLVM for
optimized builds.

This change works for the following reason.  The LLVM compiler does not convert
non-const stack arrays to global arrays and explicitly copying the elements is
enough to make the compiler assume that this is a non-const array.
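
As a rough illustration of the two forms discussed above, the sketch below
places the memcpy-based copy next to the element-wise copy.  The function
names and the stand-in table are hypothetical and only mimic the shape of
internal::wordmask; whether a given LLVM version promotes the first array to
a global depends on the compiler and flags, so treat this as a sketch of the
source-level difference, not a statement about generated code.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

namespace internal {
// Stand-in for the real table (assumed values: masks for the low 0..4 bytes).
static const uint32_t wordmask[] = {0u, 0xffu, 0xffffu, 0xffffffu, 0xffffffffu};
}  // namespace internal

// Before: bulk copy. Per the commit message, LLVM may detect the stack array
// as constant and turn it back into a global array.
uint32_t MaskWithMemcpy(uint32_t value, unsigned n) {
  uint32_t wordmask[sizeof(internal::wordmask) / sizeof(uint32_t)];
  std::memcpy(wordmask, internal::wordmask, sizeof(wordmask));
  return value & wordmask[n];
}

// After: explicit element-wise copy, intended to keep the array on the stack.
uint32_t MaskWithElementCopy(uint32_t value, unsigned n) {
  uint32_t wordmask[sizeof(internal::wordmask) / sizeof(uint32_t)];
  wordmask[0] = internal::wordmask[0];
  wordmask[1] = internal::wordmask[1];
  wordmask[2] = internal::wordmask[2];
  wordmask[3] = internal::wordmask[3];
  wordmask[4] = internal::wordmask[4];
  return value & wordmask[n];
}

// Both variants must compute the same result; only code generation may differ.
void CheckVariantsAgree() {
  for (unsigned n = 0; n < 5; ++n) {
    assert(MaskWithMemcpy(0x12345678u, n) ==
           MaskWithElementCopy(0x12345678u, n));
  }
}
```

The two functions are semantically identical, so the effect of the change can
only be observed in the generated assembly or in benchmarks.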

With GCC, this change does not affect code-gen in any significant way.  The
array initialization code is slightly different as it copies the constants
directly to the stack.

With LLVM, this keeps the array on the stack.

No change in performance with GCC (within noise range). With LLVM, ~0.7%
improvement in optimized mode (no FDO) and ~1.75% improvement in FDO
mode.
tmsriram 2017-06-15 14:24:18 -07:00 committed by Victor Costan
parent 82deffcde7
commit f24f9d2d97
1 changed files with 10 additions and 1 deletions


@@ -662,7 +662,16 @@ class SnappyDecompressor {
 // For position-independent executables, accessing global arrays can be
 // slow. Move wordmask array onto the stack to mitigate this.
 uint32 wordmask[sizeof(internal::wordmask)/sizeof(uint32)];
-memcpy(wordmask, internal::wordmask, sizeof(wordmask));
+// Do not use memcpy to copy internal::wordmask to
+// wordmask. LLVM converts stack arrays to global arrays if it detects
+// const stack arrays and this hurts the performance of position
+// independent code. This change is temporary and can be reverted when
+// https://reviews.llvm.org/D30759 is approved.
+wordmask[0] = internal::wordmask[0];
+wordmask[1] = internal::wordmask[1];
+wordmask[2] = internal::wordmask[2];
+wordmask[3] = internal::wordmask[3];
+wordmask[4] = internal::wordmask[4];
 // We could have put this refill fragment only at the beginning of the loop.
 // However, duplicating it at the end of each branch gives the compiler more