mirror of https://github.com/google/snappy.git
Explicitly copy internal::wordmask to the stack array to work around a compiler optimization with LLVM that converts const stack arrays to global arrays.

This is a temporary change and should be reverted when https://reviews.llvm.org/D30759 is fixed.

With PIE, accessing stack arrays is more efficient than global arrays and wordmask was moved to the stack due to that. However, the LLVM compiler automatically converts stack arrays, detected as constant, to global arrays and this transformation hurts PIE performance with LLVM.

We are working to fix this in the LLVM compiler, via https://reviews.llvm.org/D30759, to not do this conversion in PIE mode. Until this patch is finished, please consider this source change as a temporary work around to keep this array on the stack. This source change is important to allow some projects to flip the default compiler from GCC to LLVM for optimized builds.

This change works for the following reason. The LLVM compiler does not convert non-const stack arrays to global arrays and explicitly copying the elements is enough to make the compiler assume that this is a non-const array.

With GCC, this change does not affect code-gen in any significant way. The array initialization code is slightly different as it copies the constants directly to the stack. With LLVM, this keeps the array on the stack.

No change in performance with GCC (within noise range). With LLVM, ~0.7% improvement in optimized mode (no FDO) and ~1.75% improvement in FDO mode.
parent 82deffcde7
commit f24f9d2d97
11
snappy.cc
@@ -662,7 +662,16 @@ class SnappyDecompressor {
     // For position-independent executables, accessing global arrays can be
     // slow. Move wordmask array onto the stack to mitigate this.
     uint32 wordmask[sizeof(internal::wordmask)/sizeof(uint32)];
-    memcpy(wordmask, internal::wordmask, sizeof(wordmask));
+    // Do not use memcpy to copy internal::wordmask to
+    // wordmask. LLVM converts stack arrays to global arrays if it detects
+    // const stack arrays and this hurts the performance of position
+    // independent code. This change is temporary and can be reverted when
+    // https://reviews.llvm.org/D30759 is approved.
+    wordmask[0] = internal::wordmask[0];
+    wordmask[1] = internal::wordmask[1];
+    wordmask[2] = internal::wordmask[2];
+    wordmask[3] = internal::wordmask[3];
+    wordmask[4] = internal::wordmask[4];
 
     // We could have put this refill fragment only at the beginning of the loop.
     // However, duplicating it at the end of each branch gives the compiler more
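For readers who want to try the two copy strategies in isolation, here is a minimal standalone sketch, not code from snappy itself. The namespace `sketch`, the table `kWordmask` (a stand-in for internal::wordmask whose values are assumed to be low-byte masks for 0 to 4 bytes), and both function names are hypothetical. It uses `uint32_t` from `<cstdint>` in place of snappy's `uint32` typedef. The sketch only shows that the two variants compute the same result. Whether a particular compiler keeps either local array on the stack depends on the compiler and flags and is not demonstrated here.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

namespace sketch {

// Hypothetical stand-in for internal::wordmask. The values are an assumption:
// entry n keeps the low n bytes of a 32-bit word.
static const uint32_t kWordmask[] = {
    0u, 0xffu, 0xffffu, 0xffffffu, 0xffffffffu};

// Variant removed by the commit: fill the local array with memcpy.
inline uint32_t LowBytesMemcpy(uint32_t value, int n) {
  assert(n >= 0 && n <= 4);
  uint32_t wordmask[sizeof(kWordmask) / sizeof(uint32_t)];
  std::memcpy(wordmask, kWordmask, sizeof(wordmask));
  return value & wordmask[n];
}

// Variant added by the commit: fill the local array element by element.
inline uint32_t LowBytesElementwise(uint32_t value, int n) {
  assert(n >= 0 && n <= 4);
  uint32_t wordmask[sizeof(kWordmask) / sizeof(uint32_t)];
  wordmask[0] = kWordmask[0];
  wordmask[1] = kWordmask[1];
  wordmask[2] = kWordmask[2];
  wordmask[3] = kWordmask[3];
  wordmask[4] = kWordmask[4];
  return value & wordmask[n];
}

// Returns true when both variants agree for every valid byte count.
inline bool VariantsAgree(uint32_t value) {
  for (int n = 0; n <= 4; ++n) {
    if (LowBytesMemcpy(value, n) != LowBytesElementwise(value, n)) return false;
  }
  return true;
}

}  // namespace sketch
```

To see what a given toolchain does with each variant, one option is to compile this file with and without position-independent code enabled and compare the generated assembly for the two functions.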