Change a few ORs to additions where they don't matter. This helps the
compiler use the LEA instruction more efficiently, since e.g.
a + (b << 2) can be encoded as one instruction. Even more importantly,
it can constant-fold the COPY_* enums together with the shifted
negative constants, which also saves some instructions. (We don't need
it for LITERAL, since it happens to be 0.)
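
For illustration only, here is a small stand-alone sketch of the fold
this enables. It is not the Snappy source: the helper names are made
up, and the tag values (LITERAL = 0, COPY_1_BYTE_OFFSET = 1) are
assumed from the Snappy tag encoding. Distributing the shift over the
"- 4" and merging the result with the enum gives a single
displacement, so the tag byte becomes base + index*4 + disp, i.e. one
LEA on x86:

  #include <cassert>
  #include <cstddef>
  #include <cstdint>

  enum { LITERAL = 0, COPY_1_BYTE_OFFSET = 1 };  // assumed Snappy tag values

  // Mirrors the new expression for the 1-byte-offset copy tag.
  static inline uint8_t tag_byte_add(size_t len, size_t offset) {
    size_t len_minus_4 = len - 4;
    return COPY_1_BYTE_OFFSET + (len_minus_4 << 2) + ((offset >> 8) << 5);
  }

  // What constant folding reduces it to: (len - 4) << 2 == (len << 2) - 16
  // for unsigned len, and the -16 merges with the enum into the constant -15.
  static inline uint8_t tag_byte_folded(size_t len, size_t offset) {
    return (len << 2) + ((offset >> 8) << 5) + (COPY_1_BYTE_OFFSET - 16);
  }

  int main() {
    for (size_t len = 4; len < 12; ++len)
      for (size_t offset = 0; offset < 2048; offset += 97)
        assert(tag_byte_add(len, offset) == tag_byte_folded(len, offset));
    return 0;
  }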

I am unsure why the compiler couldn't do this itself, but the theory is
that it cannot prove that len-1 and len-4 never underflow/wrap, and thus
cannot perform the optimization safely.
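
For what it's worth, the identity that fold relies on does hold for
every unsigned value, wrap or no wrap (len_minus_4 is a size_t); a tiny
stand-alone check, illustrative only and not Snappy code:

  #include <cassert>
  #include <cstddef>
  #include <initializer_list>

  int main() {
    // Includes values where x - 1 and x - 4 wrap around zero.
    for (size_t x : {size_t{0}, size_t{1}, size_t{3}, size_t{4},
                     size_t{11}, ~size_t{0}}) {
      assert(((x - 4) << 2) == ((x << 2) - 16));
      assert(((x - 1) << 2) == ((x << 2) - 4));
    }
    return 0;
  }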

The gains are small but measurable: 0.5-1.0% over the BM_Z* benchmarks
(measured on Westmere, Sandy Bridge and Istanbul).

R=sanjay


git-svn-id: https://snappy.googlecode.com/svn/trunk@69 03e5f5b5-db94-4691-08a0-1a8bf15f6143
snappy.mirrorbot@gmail.com 2013-01-04 11:54:20 +00:00
parent 55209f9b92
commit 698af469b4
1 changed file with 2 additions and 2 deletions

@@ -202,10 +202,10 @@ static inline char* EmitCopyLessThan64(char* op, size_t offset, int len) {
   if ((len < 12) && (offset < 2048)) {
     size_t len_minus_4 = len - 4;
     assert(len_minus_4 < 8);  // Must fit in 3 bits
-    *op++ = COPY_1_BYTE_OFFSET | ((len_minus_4) << 2) | ((offset >> 8) << 5);
+    *op++ = COPY_1_BYTE_OFFSET + ((len_minus_4) << 2) + ((offset >> 8) << 5);
     *op++ = offset & 0xff;
   } else {
-    *op++ = COPY_2_BYTE_OFFSET | ((len-1) << 2);
+    *op++ = COPY_2_BYTE_OFFSET + ((len-1) << 2);
     LittleEndian::Store16(op, offset);
     op += 2;
   }