Commit Graph

64 Commits

Author SHA1 Message Date
Marcin Kowalczyk 984b191f0f Fix the remaining occurrence of non-const `std::string::data()`.
PiperOrigin-RevId: 479818960
2022-10-08 21:59:12 +02:00
Matt Callanan 974fcc49e8 Fix compilation errors under C++11.
`std::string::data()` is const-only until C++17.
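For illustration, a minimal sketch of the usual pre-C++17 workaround (the helper name is hypothetical, not part of the Snappy API):

  #include <string>

  // Before C++17, std::string::data() returns only const char*, so take the
  // address of the first element to obtain a writable pointer instead.
  char* MutableData(std::string* s) {
    return s->empty() ? nullptr : &(*s)[0];
  }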

PiperOrigin-RevId: 479708109
2022-10-08 08:41:35 +02:00
Matt Callanan 9758c9dfd7 Add `snappy::CompressFromIOVec`.
This reads from an `iovec` array rather than from a `char` array as in `snappy::Compress`.
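A hedged usage sketch, assuming the new function mirrors `snappy::Compress` and takes the iovec array, its element count, and an output string (the helper below is hypothetical):

  #include <sys/uio.h>  // struct iovec
  #include <string>
  #include "snappy.h"

  // Compress two non-contiguous buffers in one call.
  std::string CompressTwoBuffers(char* a, size_t a_len, char* b, size_t b_len) {
    struct iovec iov[2] = {{a, a_len}, {b, b_len}};
    std::string compressed;
    snappy::CompressFromIOVec(iov, 2, &compressed);
    return compressed;
  }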

PiperOrigin-RevId: 476930623
2022-09-29 09:32:28 -07:00
Victor Costan cbb83a1d64 Migrate feature detection macro checks from #ifdef to #if.
The #if predicate evaluates to false if the macro is undefined, or
defined to 0. #ifdef (and its synonym #if defined) evaluates to false
only if the macro is undefined.

The new setup allows differentiating between setting a macro to 0 (to
express that the capability definitely does not exist / should not be
used) and leaving a macro undefined (to express not knowing whether a
capability exists / not caring if a capability is used).
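A small illustration of the distinction (the macro name here is hypothetical, not one of snappy's feature macros):

  #if SNAPPY_HAVE_SOME_FEATURE     // false when undefined OR defined to 0
    // use the capability
  #endif

  #ifdef SNAPPY_HAVE_SOME_FEATURE  // false only when undefined; 0 still passes
    // use the capability
  #endif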

PiperOrigin-RevId: 391094241
2021-08-16 18:26:33 +00:00
Snappy Team d8f5dd8eca Clarify, in a comment, that offset/256 fits in 3 bits. It has to in this context, because the other 5 bits in the byte are used for len-4 and the tag.
PiperOrigin-RevId: 374926553
2021-05-25 02:20:42 +00:00
Victor Costan 5e7c14bd05 Add stubs for abseil flags.
This CL also removes support for using the gflags library to modify the
flags.

PiperOrigin-RevId: 361583626
2021-03-08 17:26:48 +00:00
Snappy Team 453942b38f Add absl::GetFlag and absl::SetFlag to uses of flags.
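A hedged sketch of the Abseil flag accessors being adopted (the flag shown is hypothetical, not one of snappy's actual flags):

  #include <cstdint>
  #include "absl/flags/flag.h"

  ABSL_FLAG(int32_t, snappy_example_level, 1, "Illustrative flag only.");

  void Example() {
    int32_t level = absl::GetFlag(FLAGS_snappy_example_level);  // read
    absl::SetFlag(&FLAGS_snappy_example_level, level + 1);      // write
  }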
PiperOrigin-RevId: 357807059
2021-02-17 04:41:41 +00:00
Victor Costan 4ebd8b2f23 Split benchmarks and test tools into separate targets.
This lets us remove main() from snappy_bench.cc and snappy_unittest.cc,
which simplifies integrating these tests and benchmarks with other
suites.

PiperOrigin-RevId: 347857427
2020-12-16 19:09:56 +00:00
Victor Costan 6aa79cb471 Wrap snappy_unittest in an anonymous namespace and remove static from functions.
PiperOrigin-RevId: 347541028
2020-12-15 06:18:35 +00:00
Victor Costan 549685a598 Remove custom testing and benchmarking code.
Snappy includes a testing framework, which implements a subset of the
Google Test API, and can be used when Google Test is not available.
Snappy also includes a micro-benchmark framework, which implements an
old version of the Google Benchmark API.

This CL replaces the custom test and micro-benchmark frameworks with
google/googletest and google/benchmark. The code is vendored in
third_party/ via git submodules. The setup is similar to google/crc32c
and google/leveldb.

This CL also updates the benchmarking code to the modern Google
Benchmark API.

Benchmark results are expected to be more precise, as the old framework
ran each benchmark with a fixed number of iterations, whereas Google
Benchmark keeps iterating until the noise is low.
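A minimal sketch of the modern Google Benchmark API referred to here: the framework owns the iteration count, so the benchmark body just loops over `state` (the benchmark shown is hypothetical):

  #include "benchmark/benchmark.h"

  void BM_Example(benchmark::State& state) {
    for (auto _ : state) {                // runs until timing noise is low enough
      benchmark::DoNotOptimize(42 * 42);  // stand-in for the measured work
    }
  }
  BENCHMARK(BM_Example);
  BENCHMARK_MAIN();  // supplies main()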

PiperOrigin-RevId: 347456142
2020-12-14 21:27:31 +00:00
Snappy Team 3b571656fa 1) Improve the lookup table data to require fewer instructions to extract the necessary data. We now store len - offset in a signed int16; this happens to remove the offset masking from the calculations, and the calculations that do need to be exact produce the flags we need for correctness checks.
2) Replace offset extraction with a lookup mask. This takes fewer uops and is needed because type 3 must be special-cased to always return 0 in order to properly trigger the fallback.
3) Unroll the loop twice. This removes some loop-condition checks and improves the generated assembly: the loop variables tend to end up in different registers, requiring movs, and having two consecutive copies of the body allows those movs to be elided.

PiperOrigin-RevId: 346663328
2020-12-14 02:48:03 +00:00
Shahriar Rouf a9730ed505 Optimize zippy decompression by making IncrementalCopy faster.
When SSSE3 is available:
- Use PSHUFB (_mm_shuffle_epi8) to handle pattern size 1 to 15 (previously it handled size 1 to 7).
- This enables us to do 16-byte copies instead of 8-byte copies, because the replicated pattern now spans at least 16 bytes.
- Use a shuffle-reshuffle strategy to generate the next pattern after loading the initial pattern. This enables us to write 4 conditionals (similar to when pattern size >= 16), which allows FDO to lay out the code with respect to the actual probabilities of each length.
- The PSHUFB masks are now generated programmatically at compile-time.

When SSSE3 is unavailable:
- No change.

In both cases:
- assert(op < op_limit) in IncrementalCopy so that we can check 'op_limit <= buf_limit - 15' instead of 'op_limit <= buf_limit - 16'. All existing call sites of IncrementalCopy guarantee this.

The 'bin' case is notably >20% faster because it has many repeated character patterns (i.e. pattern_size = 1).
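A hedged sketch of the PSHUFB idea described above: replicate a short pattern across a full 16-byte register so the copy loop can store 16 bytes at a time. Unlike the real code, the mask here is built at run time, and the helper name is hypothetical.

  #include <tmmintrin.h>  // SSSE3: _mm_shuffle_epi8
  #include <cstdint>

  // Assumes at least 16 readable bytes at `pattern` (snappy relies on slop).
  static inline __m128i ReplicatePattern(const char* pattern, int pattern_size) {
    alignas(16) uint8_t mask[16];
    for (int i = 0; i < 16; ++i) {
      mask[i] = static_cast<uint8_t>(i % pattern_size);  // 0,1,...,size-1,0,1,...
    }
    __m128i bytes = _mm_loadu_si128(reinterpret_cast<const __m128i*>(pattern));
    return _mm_shuffle_epi8(
        bytes, _mm_load_si128(reinterpret_cast<const __m128i*>(mask)));
  }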

PiperOrigin-RevId: 346454471
2020-12-14 02:47:49 +00:00
Snappy Team 01a566f825 Fix opensource version
PiperOrigin-RevId: 343272548
2020-11-19 17:06:26 +00:00
Snappy Team 616b8229b6 Add LZ4 as a benchmark option. Snappy is starting to look really good compared to LZ4, which many on the internet consider the fastest solution. We now see that Snappy is actually becoming very competitive: compression is a little faster and decompression is slower, but certainly not terribly so.
PiperOrigin-RevId: 343140860
2020-11-18 23:22:04 +00:00
Snappy Team e4a6e97b91 Extend validate benchmarks over all types and also add a medley for validation.
I also made the compression happen only once per benchmark. This way we get a cleaner measurement of #branch-misses using "perf stat". Compression naturally suffers from a large number of branch misses, which was polluting the measurements.

This showed that with the new decompression the branch-miss rate is actually much lower than initially reported, only 0.2%, and very stable, i.e. it doesn't really fluctuate with how you execute the benchmarks.

PiperOrigin-RevId: 342628576
2020-11-18 23:21:55 +00:00
Snappy Team 11e5165b98 Add a benchmark that decreases branch-prediction memorization by increasing the number of independent branches executed per benchmark iteration.
PiperOrigin-RevId: 342242843
2020-11-18 23:21:12 +00:00
Victor Costan c98344f626 Fix Clang/GCC compilation warnings.
This makes it easier to adopt snappy in other projects.

PiperOrigin-RevId: 309958249
2020-05-05 16:15:02 +00:00
Victor Costan 113cd97ab3 Tighten types on a few for loops.
* Replace post-increment with pre-increment in for loops.
* Replace unsigned int counters with precise types, like uint8_t.
* Switch to C++11 range-based for loops when possible.
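A small before/after sketch of the loop cleanups listed above (names are illustrative):

  #include <cstdint>
  #include <vector>

  inline void Consume(uint8_t) {}

  void Example(const std::vector<uint8_t>& bytes) {
    // Before: unsigned int counter with post-increment.
    for (unsigned int i = 0; i < bytes.size(); i++) Consume(bytes[i]);
    // After: C++11 range-based loop with a precise element type.
    for (uint8_t b : bytes) Consume(b);
  }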

PiperOrigin-RevId: 309724233
2020-05-04 12:32:00 +00:00
Victor Costan 63620c06d2 Add some std:: qualifiers to types and functions.
PiperOrigin-RevId: 309110343
2020-04-29 22:31:55 +00:00
Victor Costan 5417da69b7 Switch from C headers to C++ headers.
This CL makes the following substitutions.

* assert.h -> cassert
* math.h -> cmath
* stdarg.h -> cstdarg
* stdio.h -> cstdio
* stdlib.h -> cstdlib
* string.h -> cstring

stddef.h and stdint.h are not migrated to C++ headers.

PiperOrigin-RevId: 309074805
2020-04-29 19:38:03 +00:00
Victor Costan 231b8be076 Migrate to standard integral types.
The following changes are done via find/replace.
* int8 -> int8_t
* int16 -> int16_t
* int32 -> int32_t
* int64 -> int64_t

The aliases were removed from snappy-stubs-public.h.

PiperOrigin-RevId: 306141557
2020-04-12 20:10:03 +00:00
Victor Costan 14bef66290 Modernize memcpy() and memmove() usage.
This CL replaces memcpy() with std::memcpy()
and memmove() with std::memmove(), and #includes
<cstring> in files that use either function.
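For illustration, a minimal sketch of the substitution (the function is hypothetical):

  #include <cstddef>
  #include <cstring>  // was <string.h>

  void CopyBlock(char* dst, const char* src, std::size_t n) {
    std::memcpy(dst, src, n);  // was an unqualified memcpy(dst, src, n)
  }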

PiperOrigin-RevId: 306067788
2020-04-12 00:06:15 +00:00
Snappy Team 4dfcad9f4e assertion failure on darwin_x86_64, have to investigate
PiperOrigin-RevId: 303428229
2020-04-11 04:41:07 +00:00
Snappy Team e19178748f assertion failure on darwin_x86_64, have to investigate
PiperOrigin-RevId: 303346402
2020-04-11 04:40:57 +00:00
Snappy Team 0faf56378e This CL does two things:
1) It shaves off a few cycles from the data-dependency chain by using "shrd" instead of a load.
2) The important loop is finding small copies (4-12), which are either "copy 1" or "copy 2" depending on whether the offset fits below 2048. It turns out that this branch is mispredicted often. Due to the long dependency chain the CPU is running at IPC~1 anyway, so we can freely add instructions to instead emit copies branch-free. This reduces the branch mispredicts from 15% to 11% (for BM_ZFlat/6 txt1) and from 5.6% to 4% (for BM_ZFlat/10 pb).

PiperOrigin-RevId: 303328967
2020-04-11 04:40:48 +00:00
Victor Costan f48c38f91a Fix one forgotten instance of StringPrintf -> StrFormat.
PiperOrigin-RevId: 278315159
2019-11-04 00:09:19 -08:00
Victor Costan c9212708b2 Fix build errors.
PiperOrigin-RevId: 278310119
2019-11-03 23:24:02 -08:00
Snappy Team 8f32e3fbc0 Internal changes
PiperOrigin-RevId: 277555451
2019-11-03 21:51:08 -08:00
Victor Costan 62363d9a79 Fully qualify std::string.
This is in preparation for removing the snappy::string alias
of std::string.

PiperOrigin-RevId: 271383199
2019-09-26 10:57:29 -07:00
Victor Costan 44d84addf2 Fix benchmarks.
PiperOrigin-RevId: 264501168
2019-08-20 17:17:53 -07:00
Victor Costan c6bf1170d8 Fix benchmarks.
PiperOrigin-RevId: 264420835
2019-08-20 13:16:53 -07:00
Shahriar Rouf 4c7f2d5dfb Add BM_ZFlatAll, BM_ZFlatIncreasingTableSize benchmarks to see how well zippy performs when processing different data one after another.
PiperOrigin-RevId: 257518137
2019-08-19 14:30:00 -07:00
Chris Mumford c76b053449 Sync TODO and comment processing with external repo.
Copybara transforms code slightly differently than MOE. One
example is TODO username stripping, where Copybara
produces different results than MOE did. This change
moves the Copybara versions of comments to the public
repository.

Note: These changes didn't originate in cl/247950252.

PiperOrigin-RevId: 247950252
2019-05-14 11:02:57 -07:00
costan 9a6fa91217 Remove use of std::uniform_int_distribution<uint8_t>.
A previous CL removed use of Google-specific random number generating
functionality, such as ACMRandom, and used the C++11 standard library
instead. The CL used std::uniform_distribution<uint8_t> to generate
random bytes, which seems to be unsupported by the standard [1, 2].

For better or for worse, our toolchain does not complain. However,
Visual Studio errors out with "invalid template argument for
uniform_int_distribution: N4659 29.6.1.1 [rand.req.genl]/1e requires one
of short, int, long, long long, unsigned short, unsigned int, unsigned
long, or unsigned long long".

This CL replaces std::uniform_int_distribution<uint8_t> with
std::uniform_int_distribution<int>(0, 255) and appropriate static_cast<>s.
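A hedged sketch of the replacement described above (helper name and generator choice are illustrative):

  #include <cstdint>
  #include <random>

  // uint8_t is not an allowed result type, so draw an int in [0, 255] and
  // narrow it explicitly.
  uint8_t RandomByte(std::mt19937* rng) {
    std::uniform_int_distribution<int> dist(0, 255);
    return static_cast<uint8_t>(dist(*rng));
  }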

[1] http://eel.is/c++draft/rand.req.genl#1.6
[2] be83c0b472/source/numerics.tex (L1807-L1817)
2019-01-06 12:48:39 -08:00
costan 3fcbc47f99 Use std random number generators in tests.
An earlier CL introduced absl::Uniform, which is not yet open sourced,
and therefore unavailable in the open source build.

This CL removes absl::Uniform and ACMRandom in favor of equivalent C++11
standard random generators. Abseil promises to be faster than the
standard library, but we can afford a speed hit in tests in return for
an easier open sourcing story.
2019-01-04 19:09:39 -08:00
jueminyang 254966c71e Migrate to use absl::random 2019-01-04 19:08:11 -08:00
alkis 53a38e5e33 Reduce number of allocations when compressing and simplify the code.
Before, we were allocating at least once: twice with a large table and
thrice when we used a scratch buffer. With this approach we always
allocate once.

  name                                          old speed               new speed               delta
  BM_UFlat/0      [html             ]           2.45GB/s ± 0%           2.45GB/s ± 0%   -0.13%        (p=0.000 n=11+11)
  BM_UFlat/1      [urls             ]           1.19GB/s ± 0%           1.22GB/s ± 0%   +2.48%        (p=0.000 n=11+11)
  BM_UFlat/2      [jpg              ]           17.2GB/s ± 2%           17.3GB/s ± 1%     ~           (p=0.193 n=11+11)
  BM_UFlat/3      [jpg_200          ]           1.52GB/s ± 0%           1.51GB/s ± 0%   -0.78%         (p=0.000 n=10+9)
  BM_UFlat/4      [pdf              ]           12.5GB/s ± 1%           12.5GB/s ± 1%     ~             (p=0.881 n=9+9)
  BM_UFlat/5      [html4            ]           1.86GB/s ± 0%           1.86GB/s ± 0%     ~           (p=0.123 n=11+11)
  BM_UFlat/6      [txt1             ]            793MB/s ± 0%            799MB/s ± 0%   +0.78%         (p=0.000 n=11+9)
  BM_UFlat/7      [txt2             ]            739MB/s ± 0%            744MB/s ± 0%   +0.77%        (p=0.000 n=11+11)
  BM_UFlat/8      [txt3             ]            839MB/s ± 0%            845MB/s ± 0%   +0.71%        (p=0.000 n=11+11)
  BM_UFlat/9      [txt4             ]            678MB/s ± 0%            685MB/s ± 0%   +1.01%        (p=0.000 n=11+11)
  BM_UFlat/10     [pb               ]           3.08GB/s ± 0%           3.12GB/s ± 0%   +1.21%        (p=0.000 n=11+11)
  BM_UFlat/11     [gaviota          ]            975MB/s ± 0%            976MB/s ± 0%   +0.11%        (p=0.000 n=11+11)
  BM_UFlat/12     [cp               ]           1.73GB/s ± 1%           1.74GB/s ± 1%   +0.46%        (p=0.010 n=11+11)
  BM_UFlat/13     [c                ]           1.53GB/s ± 0%           1.53GB/s ± 0%     ~           (p=0.987 n=11+10)
  BM_UFlat/14     [lsp              ]           1.65GB/s ± 0%           1.63GB/s ± 1%   -1.04%        (p=0.000 n=11+11)
  BM_UFlat/15     [xls              ]           1.08GB/s ± 0%           1.15GB/s ± 0%   +6.12%        (p=0.000 n=10+11)
  BM_UFlat/16     [xls_200          ]            944MB/s ± 0%            920MB/s ± 3%   -2.51%         (p=0.000 n=9+11)
  BM_UFlat/17     [bin              ]           1.86GB/s ± 0%           1.87GB/s ± 0%   +0.68%        (p=0.000 n=10+11)
  BM_UFlat/18     [bin_200          ]           1.91GB/s ± 3%           1.92GB/s ± 5%     ~           (p=0.356 n=11+11)
  BM_UFlat/19     [sum              ]           1.31GB/s ± 0%           1.40GB/s ± 0%   +6.53%        (p=0.000 n=11+11)
  BM_UFlat/20     [man              ]           1.42GB/s ± 0%           1.42GB/s ± 0%   +0.33%        (p=0.000 n=10+10)
2019-01-04 19:07:49 -08:00
jefflim 27ff0af12a Improve performance of zippy decompression to IOVecs by up to almost 50%
1) Simplify loop condition for small pattern IncrementalCopy
2) Use pointers rather than indices to track current iovec.
3) Use fast IncrementalCopy
4) Bypass Append check from within AppendFromSelf

While this code greatly improves the performance of ZippyIOVecWriter, a
bigger question is whether IOVec writing should be improved, or removed.

Perf tests:

name                                 old speed      new speed      delta
BM_UFlat/0      [html             ]  2.13GB/s ± 0%  2.14GB/s ± 1%     ~
BM_UFlat/1      [urls             ]  1.22GB/s ± 0%  1.24GB/s ± 0%   +1.87%
BM_UFlat/2      [jpg              ]  17.2GB/s ± 1%  17.1GB/s ± 0%     ~
BM_UFlat/3      [jpg_200          ]  1.55GB/s ± 0%  1.53GB/s ± 2%     ~
BM_UFlat/4      [pdf              ]  12.8GB/s ± 1%  12.7GB/s ± 2%   -0.36%
BM_UFlat/5      [html4            ]  1.89GB/s ± 0%  1.90GB/s ± 1%     ~
BM_UFlat/6      [txt1             ]   811MB/s ± 0%   829MB/s ± 1%   +2.24%
BM_UFlat/7      [txt2             ]   756MB/s ± 0%   774MB/s ± 1%   +2.41%
BM_UFlat/8      [txt3             ]   860MB/s ± 0%   879MB/s ± 1%   +2.16%
BM_UFlat/9      [txt4             ]   699MB/s ± 0%   715MB/s ± 1%   +2.31%
BM_UFlat/10     [pb               ]  2.64GB/s ± 0%  2.65GB/s ± 1%     ~
BM_UFlat/11     [gaviota          ]  1.00GB/s ± 0%  0.99GB/s ± 2%     ~
BM_UFlat/12     [cp               ]  1.66GB/s ± 1%  1.66GB/s ± 2%     ~
BM_UFlat/13     [c                ]  1.53GB/s ± 0%  1.47GB/s ± 5%   -3.97%
BM_UFlat/14     [lsp              ]  1.60GB/s ± 1%  1.55GB/s ± 5%   -3.41%
BM_UFlat/15     [xls              ]  1.12GB/s ± 0%  1.15GB/s ± 0%   +1.93%
BM_UFlat/16     [xls_200          ]   918MB/s ± 2%   929MB/s ± 1%   +1.15%
BM_UFlat/17     [bin              ]  1.86GB/s ± 0%  1.89GB/s ± 1%   +1.61%
BM_UFlat/18     [bin_200          ]  1.90GB/s ± 1%  1.97GB/s ± 1%   +3.67%
BM_UFlat/19     [sum              ]  1.32GB/s ± 0%  1.33GB/s ± 1%     ~
BM_UFlat/20     [man              ]  1.39GB/s ± 0%  1.36GB/s ± 3%     ~
BM_UValidate/0  [html             ]  2.85GB/s ± 3%  2.90GB/s ± 0%     ~
BM_UValidate/1  [urls             ]  1.57GB/s ± 0%  1.56GB/s ± 0%   -0.20%
BM_UValidate/2  [jpg              ]   824GB/s ± 0%   825GB/s ± 0%   +0.11%
BM_UValidate/3  [jpg_200          ]  2.01GB/s ± 0%  2.02GB/s ± 0%   +0.10%
BM_UValidate/4  [pdf              ]  30.4GB/s ±11%  33.5GB/s ± 0%     ~
BM_UIOVec/0     [html             ]   604MB/s ± 0%   856MB/s ± 0%  +41.70%
BM_UIOVec/1     [urls             ]   440MB/s ± 0%   660MB/s ± 0%  +49.91%
BM_UIOVec/2     [jpg              ]  15.1GB/s ± 1%  15.3GB/s ± 1%   +1.22%
BM_UIOVec/3     [jpg_200          ]   567MB/s ± 1%   629MB/s ± 0%  +10.89%
BM_UIOVec/4     [pdf              ]  7.16GB/s ± 2%  8.56GB/s ± 1%  +19.64%
BM_UFlatSink/0  [html             ]  2.13GB/s ± 0%  2.16GB/s ± 0%   +1.47%
BM_UFlatSink/1  [urls             ]  1.22GB/s ± 0%  1.25GB/s ± 0%   +2.18%
BM_UFlatSink/2  [jpg              ]  17.1GB/s ± 2%  17.1GB/s ± 2%     ~
BM_UFlatSink/3  [jpg_200          ]  1.51GB/s ± 1%  1.53GB/s ± 2%   +1.11%
BM_UFlatSink/4  [pdf              ]  12.7GB/s ± 2%  12.8GB/s ± 1%   +0.67%
BM_UFlatSink/5  [html4            ]  1.90GB/s ± 0%  1.92GB/s ± 0%   +1.31%
BM_UFlatSink/6  [txt1             ]   810MB/s ± 0%   835MB/s ± 0%   +3.04%
BM_UFlatSink/7  [txt2             ]   755MB/s ± 0%   779MB/s ± 0%   +3.19%
BM_UFlatSink/8  [txt3             ]   859MB/s ± 0%   884MB/s ± 0%   +2.86%
BM_UFlatSink/9  [txt4             ]   698MB/s ± 0%   718MB/s ± 0%   +2.96%
BM_UFlatSink/10 [pb               ]  2.64GB/s ± 0%  2.67GB/s ± 0%   +1.16%
BM_UFlatSink/11 [gaviota          ]  1.00GB/s ± 0%  1.01GB/s ± 0%   +1.04%
BM_UFlatSink/12 [cp               ]  1.66GB/s ± 1%  1.68GB/s ± 1%   +0.83%
BM_UFlatSink/13 [c                ]  1.52GB/s ± 1%  1.53GB/s ± 0%   +0.38%
BM_UFlatSink/14 [lsp              ]  1.60GB/s ± 1%  1.61GB/s ± 0%   +0.91%
BM_UFlatSink/15 [xls              ]  1.12GB/s ± 0%  1.15GB/s ± 0%   +1.96%
BM_UFlatSink/16 [xls_200          ]   906MB/s ± 3%   920MB/s ± 1%   +1.55%
BM_UFlatSink/17 [bin              ]  1.86GB/s ± 0%  1.90GB/s ± 0%   +2.15%
BM_UFlatSink/18 [bin_200          ]  1.85GB/s ± 2%  1.92GB/s ± 2%   +4.01%
BM_UFlatSink/19 [sum              ]  1.32GB/s ± 1%  1.35GB/s ± 0%   +2.23%
BM_UFlatSink/20 [man              ]  1.39GB/s ± 1%  1.40GB/s ± 0%   +1.12%
BM_ZFlat/0      [html (22.31 %)   ]   800MB/s ± 0%   793MB/s ± 0%   -0.95%
BM_ZFlat/1      [urls (47.78 %)   ]   423MB/s ± 0%   424MB/s ± 0%   +0.11%
BM_ZFlat/2      [jpg (99.95 %)    ]  12.0GB/s ± 2%  12.0GB/s ± 4%     ~
BM_ZFlat/3      [jpg_200 (73.00 %)]   592MB/s ± 3%   594MB/s ± 2%     ~
BM_ZFlat/4      [pdf (83.30 %)    ]  7.26GB/s ± 1%  7.23GB/s ± 2%   -0.49%
BM_ZFlat/5      [html4 (22.52 %)  ]   738MB/s ± 0%   739MB/s ± 0%   +0.17%
BM_ZFlat/6      [txt1 (57.88 %)   ]   286MB/s ± 0%   285MB/s ± 0%   -0.09%
BM_ZFlat/7      [txt2 (61.91 %)   ]   264MB/s ± 0%   264MB/s ± 0%   +0.08%
BM_ZFlat/8      [txt3 (54.99 %)   ]   300MB/s ± 0%   300MB/s ± 0%     ~
BM_ZFlat/9      [txt4 (66.26 %)   ]   248MB/s ± 0%   247MB/s ± 0%   -0.20%
BM_ZFlat/10     [pb (19.68 %)     ]  1.04GB/s ± 0%  1.03GB/s ± 0%   -1.17%
BM_ZFlat/11     [gaviota (37.72 %)]   451MB/s ± 0%   450MB/s ± 0%   -0.35%
BM_ZFlat/12     [cp (48.12 %)     ]   543MB/s ± 0%   538MB/s ± 0%   -1.04%
BM_ZFlat/13     [c (42.47 %)      ]   638MB/s ± 1%   643MB/s ± 0%   +0.68%
BM_ZFlat/14     [lsp (48.37 %)    ]   686MB/s ± 0%   691MB/s ± 1%   +0.76%
BM_ZFlat/15     [xls (41.23 %)    ]   636MB/s ± 0%   633MB/s ± 0%   -0.52%
BM_ZFlat/16     [xls_200 (78.00 %)]   523MB/s ± 2%   520MB/s ± 2%   -0.56%
BM_ZFlat/17     [bin (18.11 %)    ]  1.01GB/s ± 0%  1.01GB/s ± 0%   +0.50%
BM_ZFlat/18     [bin_200 (7.50 %) ]  2.45GB/s ± 1%  2.44GB/s ± 1%   -0.54%
BM_ZFlat/19     [sum (48.96 %)    ]   487MB/s ± 0%   478MB/s ± 0%   -1.89%
BM_ZFlat/20     [man (59.21 %)    ]   567MB/s ± 1%   566MB/s ± 1%     ~

The BM_UFlat/13 and BM_UFlat/14 results showed high variance, so I reran them:

name               old speed      new speed      delta
BM_UFlat/13 [c  ]  1.53GB/s ± 0%  1.53GB/s ± 1%    ~
BM_UFlat/14 [lsp]  1.61GB/s ± 1%  1.61GB/s ± 1%  +0.25%
2018-08-07 23:41:17 -07:00
costan c8049c5827 Replace getpagesize() with sysconf(_SC_PAGESIZE).
getpagesize() has been removed from POSIX.1-2001. Its recommended
replacement is sysconf(_SC_PAGESIZE).
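A minimal sketch of the substitution (the wrapper name is hypothetical):

  #include <unistd.h>

  long GetPageSize() {
    return sysconf(_SC_PAGESIZE);  // portable successor to getpagesize()
  }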
2017-08-01 14:38:57 -07:00
ysaed 82deffcde7 Remove benchmarking support for fastlz. 2017-06-28 18:33:55 -07:00
jyrki 83179dd8be Remove quicklz and lzf support in benchmarks. 2017-06-05 13:54:10 -07:00
costan ed3b7b242b Clean up unused function warnings in snappy. 2017-03-17 13:59:03 -07:00
costan 8b60aac4fd Remove "using namespace std;" from zippy-stubs-internal.h.
This makes it easier to build zippy, as some compilers require a warning
suppression to accept "using namespace std".
2017-03-13 13:03:01 -07:00
scrubbed 039b3a7ace Add std:: prefix to STL non-type names.
In order to disable global using declarations, this CL qualifies
STL names with the std namespace.
2017-03-08 11:42:30 -08:00
Behzad Nouri 818b583387 adds std:: to stl types (#061) 2017-01-26 21:43:13 +01:00
Geoff Pike 38a5ec5fca Re-work fast path that emits copies in zippy compression.
The primary motivation for the change is that FindMatchLength is
likely to discover a difference in the first 8 bytes it compares.
If that occurs then we know the length of the match is less than 12,
because FindMatchLength is invoked after a 4-byte match is found.
When emitting a copy, it is useful to know that the length is less
than 12 because the two-byte variant of an emitted copy requires that.

This is a performance-tuning change that should not affect the
library's behavior.

With FDO on perflab/Haswell the geometric mean for ZFlat/* went from
47,290ns to 45,741ns, an improvement of 3.4%.
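A rough sketch of the two copy encodings involved (copies of at most 64 bytes; the function is illustrative, not the library's actual emitter). The two-byte "copy with 1-byte offset" form only holds lengths 4..11 and offsets below 2048, which is why knowing len < 12 up front lets the fast path choose it without further length checks.

  #include <cstddef>

  // Assumes 4 <= len <= 64 and offset < 65536.
  char* EmitCopySketch(char* op, std::size_t offset, std::size_t len) {
    if (len < 12 && offset < 2048) {
      // Tag 0b01: 3 bits of (len - 4) plus the top 3 bits of the offset.
      *op++ = static_cast<char>(1 | ((len - 4) << 2) | ((offset >> 8) << 5));
      *op++ = static_cast<char>(offset & 0xff);
    } else {
      // Tag 0b10: 6 bits of (len - 1), then a 2-byte little-endian offset.
      *op++ = static_cast<char>(2 | ((len - 1) << 2));
      *op++ = static_cast<char>(offset & 0xff);
      *op++ = static_cast<char>(offset >> 8);
    }
    return op;
  }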

SAMPLE (before)

BM_ZFlat/0      102824     102650      40691 951.4MB/s  html (22.31 %)
BM_ZFlat/1     1293512    1290442       3225 518.9MB/s  urls (47.78 %)
BM_ZFlat/2       10373      10353     417959 11.1GB/s  jpg (99.95 %)
BM_ZFlat/3         268        268   15745324 712.4MB/s  jpg_200 (73.00 %)
BM_ZFlat/4       12137      12113     342462 7.9GB/s  pdf (83.30 %)
BM_ZFlat/5      430672     429720       9724 909.0MB/s  html4 (22.52 %)
BM_ZFlat/6      420541     419636       9833 345.6MB/s  txt1 (57.88 %)
BM_ZFlat/7      373829     373158      10000 319.9MB/s  txt2 (61.91 %)
BM_ZFlat/8     1119014    1116604       3755 364.5MB/s  txt3 (54.99 %)
BM_ZFlat/9     1544203    1540657       2748 298.3MB/s  txt4 (66.26 %)
BM_ZFlat/10      91041      90866      46002 1.2GB/s  pb (19.68 %)
BM_ZFlat/11     332766     331990      10000 529.5MB/s  gaviota (37.72 %)
BM_ZFlat/12      39960      39886     100000 588.3MB/s  cp (48.12 %)
BM_ZFlat/13      14493      14465     287181 735.1MB/s  c (42.47 %)
BM_ZFlat/14       4447       4440     947927 799.3MB/s  lsp (48.37 %)
BM_ZFlat/15    1316362    1313350       3196 747.7MB/s  xls (41.23 %)
BM_ZFlat/16        312        311   10000000 613.0MB/s  xls_200 (78.00 %)
BM_ZFlat/17     388471     387502      10000 1.2GB/s  bin (18.11 %)
BM_ZFlat/18         65         64   64838208 2.9GB/s  bin_200 (7.50 %)
BM_ZFlat/19      65900      65787      63099 554.3MB/s  sum (48.96 %)
BM_ZFlat/20       6188       6177     681951 652.6MB/s  man (59.21 %)

SAMPLE (after)

Benchmark     Time(ns)    CPU(ns) Iterations
--------------------------------------------
BM_ZFlat/0       99259      99044      42428 986.0MB/s  html (22.31 %)
BM_ZFlat/1     1257039    1255276       3341 533.4MB/s  urls (47.78 %)
BM_ZFlat/2       10044      10030     405781 11.4GB/s  jpg (99.95 %)
BM_ZFlat/3         268        267   15732282 713.3MB/s  jpg_200 (73.00 %)
BM_ZFlat/4       11675      11657     358629 8.2GB/s  pdf (83.30 %)
BM_ZFlat/5      420951     419818       9739 930.5MB/s  html4 (22.52 %)
BM_ZFlat/6      415460     414632      10000 349.8MB/s  txt1 (57.88 %)
BM_ZFlat/7      367191     366436      10000 325.8MB/s  txt2 (61.91 %)
BM_ZFlat/8     1098345    1096036       3819 371.3MB/s  txt3 (54.99 %)
BM_ZFlat/9     1508701    1505306       2758 305.3MB/s  txt4 (66.26 %)
BM_ZFlat/10      87195      87031      47289 1.3GB/s  pb (19.68 %)
BM_ZFlat/11     322338     321637      10000 546.5MB/s  gaviota (37.72 %)
BM_ZFlat/12      36739      36668     100000 639.9MB/s  cp (48.12 %)
BM_ZFlat/13      13646      13618     304009 780.9MB/s  c (42.47 %)
BM_ZFlat/14       4249       4240     992456 837.0MB/s  lsp (48.37 %)
BM_ZFlat/15    1262925    1260012       3314 779.4MB/s  xls (41.23 %)
BM_ZFlat/16        308        308   10000000 619.8MB/s  xls_200 (78.00 %)
BM_ZFlat/17     379750     378944      10000 1.3GB/s  bin (18.11 %)
BM_ZFlat/18         62         62   67443280 3.0GB/s  bin_200 (7.50 %)
BM_ZFlat/19      61706      61587      67645 592.1MB/s  sum (48.96 %)
BM_ZFlat/20       5968       5958     698974 676.6MB/s  man (59.21 %)
2017-01-26 21:39:39 +01:00
Steinar H. Gunderson 7525a1600d Fix an issue where the ByteSource path (used for parsing std::string)
would incorrectly accept some invalid varints that the other path would not,
causing potential CHECK-failures if the unit test were run with
--write_uncompressed and a corrupted input file.

Found by the afl fuzzer.
2016-01-04 12:52:15 +01:00
Steinar H. Gunderson 0852af7606 Move the logic from ComputeTable into the unit test, which means it's run
automatically together with the other tests, and also removes the stray
function ComputeTable() (which was never referenced by anything else
in the open-source version, causing compiler warnings for some)
from the core library.

Fixes public issue 96.

A=sesse
R=sanjay
2015-08-19 11:37:51 +02:00
Steinar H. Gunderson b2312c4c25 Add support for Uncompress(source, sink). Various changes to allow
Uncompress(source, sink) to get the same performance as the different
variants of Uncompress to Cord/DataBuffer/String/FlatBuffer.

Changes to efficiently support Uncompress(source, sink)
--------

a) For strings - we add support to StringByteSink to do GetAppendBuffer so we
   can write to it without copying.
b) For flat array buffers, we do GetAppendBuffer and see if we can get a full buffer.

With the above changes we get performance with ByteSource/ByteSink
that is very close to directly using flat arrays and strings.

We add various benchmark cases to demonstrate that.
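A hedged usage sketch of the (source, sink) path (assumes the caller has sized `out` via snappy::GetUncompressedLength; the wrapper is illustrative):

  #include <string>
  #include "snappy.h"
  #include "snappy-sinksource.h"

  bool UncompressToBuffer(const std::string& compressed, char* out) {
    snappy::ByteArraySource source(compressed.data(), compressed.size());
    snappy::UncheckedByteArraySink sink(out);
    return snappy::Uncompress(&source, &sink);
  }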

Orthogonal change
------------------

Add support for TryFastAppend() for SnappyScatteredWriter.

Benchmark results are below

CPU: Intel Core2 dL1:32KB dL2:4096KB
Benchmark              Time(ns)    CPU(ns) Iterations
-----------------------------------------------------
BM_UFlat/0               109065     108996       6410 896.0MB/s  html
BM_UFlat/1              1012175    1012343        691 661.4MB/s  urls
BM_UFlat/2                26775      26771      26149 4.4GB/s  jpg
BM_UFlat/3                48947      48940      14363 1.8GB/s  pdf
BM_UFlat/4               441029     440835       1589 886.1MB/s  html4
BM_UFlat/5                39861      39880      17823 588.3MB/s  cp
BM_UFlat/6                18315      18300      38126 581.1MB/s  c
BM_UFlat/7                 5254       5254     100000 675.4MB/s  lsp
BM_UFlat/8              1568060    1567376        447 626.6MB/s  xls
BM_UFlat/9               337512     337734       2073 429.5MB/s  txt1
BM_UFlat/10              287269     287054       2434 415.9MB/s  txt2
BM_UFlat/11              890098     890219        787 457.2MB/s  txt3
BM_UFlat/12             1186593    1186863        590 387.2MB/s  txt4
BM_UFlat/13              573927     573318       1000 853.7MB/s  bin
BM_UFlat/14               64250      64294      10000 567.2MB/s  sum
BM_UFlat/15                7301       7300      96153 552.2MB/s  man
BM_UFlat/16              109617     109636       6375 1031.5MB/s  pb
BM_UFlat/17              364438     364497       1921 482.3MB/s  gaviota
BM_UFlatSink/0           108518     108465       6450 900.4MB/s  html
BM_UFlatSink/1           991952     991997        705 675.0MB/s  urls
BM_UFlatSink/2            26815      26798      26065 4.4GB/s  jpg
BM_UFlatSink/3            49127      49122      14255 1.8GB/s  pdf
BM_UFlatSink/4           436674     436731       1604 894.4MB/s  html4
BM_UFlatSink/5            39738      39733      17345 590.5MB/s  cp
BM_UFlatSink/6            18413      18416      37962 577.4MB/s  c
BM_UFlatSink/7             5677       5676     100000 625.2MB/s  lsp
BM_UFlatSink/8          1552175    1551026        451 633.2MB/s  xls
BM_UFlatSink/9           338526     338489       2065 428.5MB/s  txt1
BM_UFlatSink/10          289387     289307       2420 412.6MB/s  txt2
BM_UFlatSink/11          893803     893706        783 455.4MB/s  txt3
BM_UFlatSink/12         1195919    1195459        586 384.4MB/s  txt4
BM_UFlatSink/13          559637     559779       1000 874.3MB/s  bin
BM_UFlatSink/14           65073      65094      10000 560.2MB/s  sum
BM_UFlatSink/15            7618       7614      92823 529.5MB/s  man
BM_UFlatSink/16          110085     110121       6352 1027.0MB/s  pb
BM_UFlatSink/17          369196     368915       1896 476.5MB/s  gaviota
BM_UValidate/0            46954      46957      14899 2.0GB/s  html
BM_UValidate/1           500621     500868       1000 1.3GB/s  urls
BM_UValidate/2              283        283    2481447 417.2GB/s  jpg
BM_UValidate/3            16230      16228      43137 5.4GB/s  pdf
BM_UValidate/4           189129     189193       3701 2.0GB/s  html4

A=uday
R=sanjay
2015-07-06 14:21:00 +02:00
Steinar H. Gunderson b2ad960067 Changes to eliminate compiler warnings on MSVC
This code was not compiling under Visual Studio 2013 with warnings being treated
as errors. Specifically:

1. Changed int -> size_t to eliminate signed/unsigned mismatch warning.
2. Added some missing return values to functions.
3. Inserted character literals instead of integer literals into strings to
   avoid type conversions.

A=cmumford
R=jeff
2015-06-22 16:09:56 +02:00