The #if predicate evaluates to false if the macro is undefined, or
defined to 0. #ifdef (and its synonym #if defined) evaluates to false
only if the macro is undefined.
The new setup allows differentiating between setting a macro to 0 (to
express that the capability definitely does not exist / should not be
used) and leaving a macro undefined (to express not knowing whether a
capability exists / not caring if a capability is used).
PiperOrigin-RevId: 391094241
This lets us remove main() from snappy_bench.cc and snappy_unittest.cc,
which simplifies integrating these tests and benchmarks with other
suites.
PiperOrigin-RevId: 347857427
Snappy includes a testing framework, which implements a subset of the
Google Test API, and can be used when Google Test is not available.
Snappy also includes a micro-benchmark framework, which implements an
old version of the Google Benchmark API.
This CL replaces the custom test and micro-benchmark frameworks with
google/googletest and google/benchmark. The code is vendored in
third_party/ via git submodules. The setup is similar to google/crc32c
and google/leveldb.
This CL also updates the benchmarking code to the modern Google
Benchmark API.
Benchmark results are expected to be more precise, as the old framework
ran each benchmark with a fixed number of iterations, whereas Google
Benchmark keeps iterating until the noise is low.
PiperOrigin-RevId: 347456142
2) Replace offset extraction with a lookup mask. This is less uops and is needed because we need to special case type 3 to always return 0 as to properly trigger the fallback.
3) Unroll the loop twice, this removes some loop-condition checks AND it improves the generated assembly. The loop variables tend to end up in a different register requiring mov's having two consecutive copies allows the elision of the mov's.
PiperOrigin-RevId: 346663328
When SSSE3 is available:
- Use PSHUFB (_mm_shuffle_epi8) to handle pattern size 1 to 15 (previously it handled size 1 to 7).
- This enables us to do 16 byte copies instead of 8 bytes copies because we know that the pattern size >= 16.
- Use shuffle-reshuffle strategy to generate the next pattern after loading the initial pattern. This enables us to write 4 conditionals (similar to when pattern size >= 16) which would allow FDO to layout the code with respect to actual probabilities of each length.
- The PSHUFB masks are now generated programmatically at compile-time.
When SSSE3 is unavailable:
- No change.
In both cases:
- assert(op < op_limit) in IncrementalCopy so that we can check 'op_limit <= buf_limit - 15' instead of 'op_limit <= buf_limit - 16'. All existing call sites of IncrementalCopy guarantee this.
'bin' case is notably >20% faster because it has many repeated character patterns (i.e. pattern_size = 1).
PiperOrigin-RevId: 346454471
I also made the compression happen only once per benchmark. This way we get a cleaner measurement of #branch-misses using "perf stat". Compression suffers naturally from a large number of branch misses which was polluting the measurements.
This showed that with the new decompression the branch misses is actually much lower then initially reported, only .2% and very stable, ie. doesn't really fluctuate with how you execute the benchmarks.
PiperOrigin-RevId: 342628576
* Replace post-increment with pre-increment in for loops.
* Replace unsigned int counters with precise types, like uint8_t.
* Switch to C++11 iterating loops when possible.
PiperOrigin-RevId: 309724233
This CL makes the following substitutions.
* assert.h -> cassert
* math.h -> cmath
* stdarg.h -> cstdarg
* stdio.h -> cstdio
* stdlib.h -> cstdlib
* string.h -> cstring
stddef.h and stdint.h are not migrated to C++ headers.
PiperOrigin-RevId: 309074805
The following changes are done via find/replace.
* int8 -> int8_t
* int16 -> int16_t
* int32 -> int32_t
* int64 -> int64_t
The aliases were removed from snappy-stubs-public.h.
PiperOrigin-RevId: 306141557
This CL replaces memcpy() with std::memcpy()
and memmove() with std::memmove(), and #includes
<cstring> in files that use either function.
PiperOrigin-RevId: 306067788
1) It shaves of a few cycles from the data dependency chain. By using "shrd" instead of a load.
2) The important loop is finding small copies (4-12) which are either "copy 1", or "copy 2" depending if the offset fits <2048. It turns out that this is a branch that is mispredicted often. Due to the long dependency chain the CPU is running with IPC~1 anyway so we can freely add instructions to instead emit copies branchfree. This reduces the branch misspredicts from 15% to 11% (for BM_ZFlat/6 txt1) and from 5.6% to 4% (for BM_ZFlat/10 or pb).
PiperOrigin-RevId: 303328967
Copybara transforms code slightly different than MOE. One
example is the TODO username stripping where Copybara
produces different results than MOE did. This change
moves the Copybara versions of comments to the public
repository.
Note: These changes didn't originate in cl/247950252.
PiperOrigin-RevId: 247950252
A previous CL removed use of Google-specific random number generating
functionality, such as ACMRandom, and used the C++11 standard library
instead. The CL used std::uniform_distribution<uint8_t> to generate
random bytes, which seems to be unsupported by the standard [1, 2].
For better or for worse, our toolchain does not complain. However,
Visual Studio errors out with "invalid template argument for
uniform_int_distribution: N4659 29.6.1.1 [rand.req.genl]/1e requires one
of short, int, long, long long, unsigned short, unsigned int, unsigned
long, or unsigned long long".
This CL replaces std::uniform_distribution<uint8_t> with
std::uniform_distribution<int>(0, 255) and appropriate static_cast<>s.
[1] http://eel.is/c++draft/rand.req.genl#1.6
[2] be83c0b472/source/numerics.tex (L1807-L1817)
An earlier CL introduced absl::Uniform, which is not yet open sourced,
and therefore unavailable in the open source build.
This CL removes absl::Uniform and ACMRandom in favor of equivalent C++11
standard random generators. Abseil promises to be faster than the
standard library, but we can afford a speed hit in tests in return for
an easier open sourcing story.
would incorrectly accept some invalid varints that the other path would not,
causing potential CHECK-failures if the unit test were run with
--write_uncompressed and a corrupted input file.
Found by the afl fuzzer.
automatically together with the other tests, and also removes the stray
function ComputeTable() (which was never referenced by anything else
in the open-source version, causing compiler warnings for some)
out of the core library.
Fixes public issue 96.
A=sesse
R=sanjay
This code was not compiling under Visual Studio 2013 with warnings being treated
as errors. Specifically:
1. Changed int -> size_t to eliminate signed/unsigned mismatch warning.
2. Added some missing return values to functions.
3. Inserting character instead of integer literals into strings to avoid type
conversions.
A=cmumford
R=jeff