Change a few ORs to additions where they don't matter. This helps the
compiler use the LEA instruction more efficiently, since e.g.
a + (b << 2) can be encoded as one instruction. Even more importantly,
it can constant-fold the COPY_* enums together with the shifted
negative constants, which also saves some instructions. (We don't need
it for LITERAL, since it happens to be 0.)
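
For illustration only, here is a small stand-alone sketch of the fold
this enables. It is not the Snappy source: the helper names are made
up, and the tag values (LITERAL = 0, COPY_1_BYTE_OFFSET = 1) are
assumed from the Snappy tag encoding. Distributing the shift over the
"- 4" and merging the result with the enum gives a single
displacement, so the tag byte becomes base + index*4 + disp, i.e. one
LEA on x86:

  #include <cassert>
  #include <cstddef>
  #include <cstdint>

  enum { LITERAL = 0, COPY_1_BYTE_OFFSET = 1 };  // assumed Snappy tag values

  // Mirrors the new expression for the 1-byte-offset copy tag.
  static inline uint8_t tag_byte_add(size_t len, size_t offset) {
    size_t len_minus_4 = len - 4;
    return COPY_1_BYTE_OFFSET + (len_minus_4 << 2) + ((offset >> 8) << 5);
  }

  // What constant folding reduces it to: (len - 4) << 2 == (len << 2) - 16
  // for unsigned len, and the -16 merges with the enum into the constant -15.
  static inline uint8_t tag_byte_folded(size_t len, size_t offset) {
    return (len << 2) + ((offset >> 8) << 5) + (COPY_1_BYTE_OFFSET - 16);
  }

  int main() {
    for (size_t len = 4; len < 12; ++len)
      for (size_t offset = 0; offset < 2048; offset += 97)
        assert(tag_byte_add(len, offset) == tag_byte_folded(len, offset));
    return 0;
  }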

I am unsure why the compiler couldn't do this itself, but the theory is
that it cannot prove that len-1 and len-4 never underflow/wrap, and thus
cannot perform the optimization safely.
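
For what it's worth, the identity that fold relies on does hold for
every unsigned value, wrap or no wrap (len_minus_4 is a size_t); a tiny
stand-alone check, illustrative only and not Snappy code:

  #include <cassert>
  #include <cstddef>
  #include <initializer_list>

  int main() {
    // Includes values where x - 1 and x - 4 wrap around zero.
    for (size_t x : {size_t{0}, size_t{1}, size_t{3}, size_t{4},
                     size_t{11}, ~size_t{0}}) {
      assert(((x - 4) << 2) == ((x << 2) - 16));
      assert(((x - 1) << 2) == ((x << 2) - 4));
    }
    return 0;
  }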

The gains are small but measurable: 0.5-1.0% over the BM_Z* benchmarks
(measured on Westmere, Sandy Bridge and Istanbul).

R=sanjay


git-svn-id: https://snappy.googlecode.com/svn/trunk@69 03e5f5b5-db94-4691-08a0-1a8bf15f6143
snappy.mirrorbot@gmail.com 2013-01-04 11:54:20 +00:00
parent 55209f9b92
commit 698af469b4
1 changed file with 2 additions and 2 deletions

@@ -202,10 +202,10 @@ static inline char* EmitCopyLessThan64(char* op, size_t offset, int len) {
   if ((len < 12) && (offset < 2048)) {
     size_t len_minus_4 = len - 4;
     assert(len_minus_4 < 8);  // Must fit in 3 bits
-    *op++ = COPY_1_BYTE_OFFSET | ((len_minus_4) << 2) | ((offset >> 8) << 5);
+    *op++ = COPY_1_BYTE_OFFSET + ((len_minus_4) << 2) + ((offset >> 8) << 5);
     *op++ = offset & 0xff;
   } else {
-    *op++ = COPY_2_BYTE_OFFSET | ((len-1) << 2);
+    *op++ = COPY_2_BYTE_OFFSET + ((len-1) << 2);
     LittleEndian::Store16(op, offset);
     op += 2;
   }