snappy

Commit Graph

Author	SHA1	Message	Date
Victor Costan	5417da69b7	Switch from C headers to C++ headers. This CL makes the following substitutions. * assert.h -> cassert * math.h -> cmath * stdarg.h -> cstdarg * stdio.h -> cstdio * stdlib.h -> cstdlib * string.h -> cstring stddef.h and stdint.h are not migrated to C++ headers. PiperOrigin-RevId: 309074805	2020-04-29 19:38:03 +00:00
Victor Costan	251d935d50	Remove #include <string> from snappy-stubs-public.h. The header hasn't been needed since the removal of the snappy::string alias to std::string. PiperOrigin-RevId: 306446542	2020-04-14 16:50:30 +00:00
Victor Costan	4f195aee43	Remove mismatched #endif. PiperOrigin-RevId: 306345559	2020-04-14 00:38:04 +00:00
Victor Costan	041c608086	Remove platform-dependent code for unaligned loads/stores. Snappy issues multi-byte (16/32/64-bit) loads and stores that are not aligned, meaning the addresses are not necessarily 16/32/64-bit multiples. This is accomplished using two methods: 1) The portable method allocates a uint{16,32,64}_t on the stack, and std::memcpy()s the bytes into/from the integer. This method relies on well-defined behavior (std::memcpy() works on all valid pointers, fixed-width unsigned integer types use a pure binary representation and therefore have no invalid values), and should compile to valid code on all platforms. 2) The fast method reinterpret_casts the address to a pointer to a uint{16,32,64}_t and dereferences the pointer. This is expected to compile to one hardware instruction (mov on x86, ldr/str on arm). The caveat is that the reinterpret_cast is undefined behavior (UB) unless the address happened to be a valid uint{16,32,64}_t pointer. The UB shows up as follows. * On architectures that don't have hardware instructions for unaligned loads / stores, the pointer access can trigger a hardware exception. This is mitigated by #ifdef blocks that attempt to restrict the fast method to platforms that support it. * On architectures that have separate instructions for aligned and unaligned access, the compiler may need an explicit hint to emit the hardware instruction for unaligned access. This is accomplished on Clang and GCC by wrapping the pointers into structs tagged with __attribute__((__packed__)). This CL removes the fast method. Fortunately, compilers have advanced enough that the portable method gets compiled down to the same instructions as the fast method, without the need for the caveats explained above. Specifically, modern Clang, GCC and MSVC optimize std::memcpy() to a single instruction (mov / ldr / str). A test case proving this can be seen at https://godbolt.org/z/gZg2Fk PiperOrigin-RevId: 306342728	2020-04-14 00:22:20 +00:00
Victor Costan	27ff130ff9	Remove platform-dependent code for little-endian loads and stores. The platform-independent code that breaks down the loads and stores into byte-level operations is optimized into single instructions (mov or ldr/str) and instruction pairs (mov+bswap or ldr/str+rev) by recent versions of Clang and GCC. Tested at https://godbolt.org/z/2BQP-o PiperOrigin-RevId: 306321608	2020-04-13 22:30:59 +00:00
Victor Costan	a4cdb5d133	Introduce SNAPPY_ATTRIBUTE_ALWAYS_INLINE. An internal CL started using ABSL_ATTRIBUTE_ALWAYS_INLINE from Abseil. This CL introduces equivalent functionality as SNAPPY_ATTRIBUTE_ALWAYS_INLINE. PiperOrigin-RevId: 306289650	2020-04-13 19:51:05 +00:00
Victor Costan	231b8be076	Migrate to standard integral types. The following changes are done via find/replace. * int8 -> int8_t * int16 -> int16_t * int32 -> int32_t * int64 -> int64_t The aliases were removed from snappy-stubs-public.h. PiperOrigin-RevId: 306141557	2020-04-12 20:10:03 +00:00
Victor Costan	14bef66290	Modernize memcpy() and memmove() usage. This CL replaces memcpy() with std::memcpy() and memmove() with std::memmove(), and #includes <cstring> in files that use either function. PiperOrigin-RevId: 306067788	2020-04-12 00:06:15 +00:00
Snappy Team	d674348a0c	Improve zippy by 5-10%. BM_ZCord/0 [html ] 1.26GB/s ± 0% 1.35GB/s ± 0% +7.90% (p=0.008 n=5+5) BM_ZCord/1 [urls ] 535MB/s ± 0% 562MB/s ± 0% +5.05% (p=0.008 n=5+5) BM_ZCord/2 [jpg ] 10.2GB/s ± 1% 10.2GB/s ± 0% ~ (p=0.310 n=5+5) BM_ZCord/3 [jpg_200] 841MB/s ± 1% 846MB/s ± 1% ~ (p=0.421 n=5+5) BM_ZCord/4 [pdf ] 6.77GB/s ± 1% 7.06GB/s ± 1% +4.28% (p=0.008 n=5+5) BM_ZCord/5 [html4 ] 1.00GB/s ± 0% 1.08GB/s ± 0% +7.94% (p=0.008 n=5+5) BM_ZCord/6 [txt1 ] 391MB/s ± 0% 417MB/s ± 0% +6.71% (p=0.008 n=5+5) BM_ZCord/7 [txt2 ] 363MB/s ± 0% 388MB/s ± 0% +6.73% (p=0.016 n=5+4) BM_ZCord/8 [txt3 ] 400MB/s ± 0% 426MB/s ± 0% +6.55% (p=0.008 n=5+5) BM_ZCord/9 [txt4 ] 328MB/s ± 0% 350MB/s ± 0% +6.66% (p=0.008 n=5+5) BM_ZCord/10 [pb ] 1.67GB/s ± 1% 1.80GB/s ± 0% +7.52% (p=0.008 n=5+5) 1) A key bottleneck in the data dependency chain is figuring out how many bytes are matched and loading the data for the next hash value. The load-to-use latency is 5 cycles; in the previous cl/303353110 we replaced the load with "shrd" to align previous loads. Unfortunately, "shrd" itself has a latency of 4 cycles, so we'd prefer "shrx", which takes 1 cycle for variable shifts. 2) Maximally use data already computed. The above trick calculates 5 bytes of useful data, so if we need to search for a new match we can use this for the first search (which is one byte further). PiperOrigin-RevId: 303875535	2020-04-11 04:41:15 +00:00
Snappy Team	4dfcad9f4e	Assertion failure on darwin_x86_64, have to investigate. PiperOrigin-RevId: 303428229	2020-04-11 04:41:07 +00:00
Snappy Team	e19178748f	Assertion failure on darwin_x86_64, have to investigate. PiperOrigin-RevId: 303346402	2020-04-11 04:40:57 +00:00
Snappy Team	0faf56378e	This CL does two things: 1) It shaves off a few cycles from the data dependency chain by using "shrd" instead of a load. 2) The important loop is finding small copies (4-12), which are either "copy 1" or "copy 2" depending on whether the offset fits <2048. It turns out that this is a branch that is mispredicted often. Due to the long dependency chain the CPU is running with IPC~1 anyway, so we can freely add instructions to instead emit copies branch-free. This reduces the branch mispredicts from 15% to 11% (for BM_ZFlat/6 txt1) and from 5.6% to 4% (for BM_ZFlat/10 or pb). PiperOrigin-RevId: 303328967	2020-04-11 04:40:48 +00:00
Snappy Team	0c7ed08a25	The result on the protobuf benchmark is around 19%. Results vary by their propensity for compression, as the frequency of finding matches influences the number of branch mispredicts and the amount of hashing. Two ideas: 1) The code uses "heuristic match skipping", which has a quadratic interpolation. However, for the first 32 bytes it's just every byte. Special case 16 bytes. This removes a lot of code. 2) Load 64 bit integers and shift instead of reload. The hashing loop has a very long chain data = Load32(ip) -> hash = Hash(data) -> offset = table[hash] -> copy_data = Load32(base_ip + offset) followed by a compare between data and copy_data. This chain is around 20 cycles. It's unreasonable for the branch predictor to be able to predict when it's a match (that is completely driven by the content of the data). So when it's a miss this chain is on the critical path. By loading 64 bits and shifting we can effectively remove the first load. PiperOrigin-RevId: 302893821	2020-04-11 04:40:39 +00:00
Snappy Team	3c77e01459	1) Make the output pointer a local variable so that it doesn't need a load, add, store on its loop-carried dependency chain. 2) Reduce the input pointer loop-carried dependency chain from 7 cycles to 4 cycles by using pre-loading. This is a very subtle point. 3) Just brutally copy 64 bytes, which removes a difficult-to-predict branch from the innermost loop. There is enough bandwidth to do so in the intrinsic cycles of the loop. 4) Implement limit pointers that include the slop region. This removes unnecessary instructions from the hot path. 5) It seems the removal of the difficult-to-predict branch has removed the code's sensitivity to alignment, so remove the asm nops. PiperOrigin-RevId: 294692928	2020-04-11 04:40:29 +00:00
Snappy Team	9eabb7baba	Cut a load from the critical dependency chain of the input pointer by speculating that the uncommon case of COPY_4 does not happen. PiperOrigin-RevId: 293803653	2020-04-11 04:40:20 +00:00
Snappy Team	cddd9c0875	Improve comments in IncrementalCopy, add an assert. PiperOrigin-RevId: 292506754	2020-04-11 04:40:09 +00:00
Victor Costan	537f4ad624	Tag open source release 1.1.8. PiperOrigin-RevId: 289675084	2020-01-14 10:58:53 -08:00
Snappy Team	b5477a8457	Optimize IncrementalCopy: There are between 1 and 4 copy iterations. Allow FDO to work with full knowledge of the probabilities for each branch. On skylake, this improves protobuf and html decompression speed by 15% and 9% respectively, and the rest by ~2%. On haswell, this improves protobuf and html decompression speed by 23% and 16% respectively, and the rest by ~3%. PiperOrigin-RevId: 289090401	2020-01-14 10:58:42 -08:00
Victor Costan	f5acee902c	Move CI to Visual Studio 2019. PiperOrigin-RevId: 279785698	2019-11-11 12:05:59 -08:00
Victor Costan	26410cc4f8	Merge pull request #85 from bitomaxsp:patch-1 PiperOrigin-RevId: 279633518	2019-11-10 14:10:50 -08:00
Victor Costan	0eec45ed16	Align CMake configuration with related projects. PiperOrigin-RevId: 279237837	2019-11-07 22:39:04 -08:00
Victor Costan	6617df53fa	Remove redundant PROJECT_SOURCE_DIR usage from CMake config. Inspired by https://github.com/google/crc32c/pull/32 PiperOrigin-RevId: 278718367	2019-11-05 16:35:29 -08:00
Victor Costan	f48c38f91a	Fix one forgotten instance of StringPrintf -> StrFormat. PiperOrigin-RevId: 278315159	2019-11-04 00:09:19 -08:00
Victor Costan	c9212708b2	Fix build errors. PiperOrigin-RevId: 278310119	2019-11-03 23:24:02 -08:00
Victor Costan	eb2eb73e6b	Test CMake installation on Travis. PiperOrigin-RevId: 278300416	2019-11-03 21:51:20 -08:00
Snappy Team	8f32e3fbc0	Internal changes PiperOrigin-RevId: 277555451	2019-11-03 21:51:08 -08:00
Dmitry	38945971d6	Allow build with different standard if lib used as a subproject	2019-10-17 14:17:49 +02:00
Victor Costan	e9e11b84e6	Fix Travis CI build. * Fix bash conditionals: [ a == b ] should be [ a = b ]. * Upgrade to LLVM 9 on Travis. * Upgrade fuzzer build arguments for LLVM 9. PiperOrigin-RevId: 271898655	2019-09-29 20:39:28 -07:00
Victor Costan	9dabbca006	Remove snappy::string alias to std::string. PiperOrigin-RevId: 271678325	2019-09-28 09:04:06 -07:00
Victor Costan	62363d9a79	Fully qualify std::string. This is in preparation for removing the snappy::string alias of std::string. PiperOrigin-RevId: 271383199	2019-09-26 10:57:29 -07:00
Victor Costan	d837d5cfe1	Merge pull request #80 from tmm1:patch-2 PiperOrigin-RevId: 264514195	2019-08-21 09:11:04 -07:00
Victor Costan	44d84addf2	Fix benchmarks. PiperOrigin-RevId: 264501168	2019-08-20 17:17:53 -07:00
Victor Costan	c6bf1170d8	Fix benchmarks. PiperOrigin-RevId: 264420835	2019-08-20 13:16:53 -07:00
Victor Costan	6219c7787b	Fix unused variable warnings in fuzzers. PiperOrigin-RevId: 264377331	2019-08-20 13:16:41 -07:00
Victor Costan	5a57d32566	Rename zippy__fuzzer.cc -> snappy__fuzzer.cc. PiperOrigin-RevId: 264321311	2019-08-19 23:43:34 -07:00
Victor Costan	fd79e6f9b2	Merge pull request #78 from bshastry:libfuzzer-harness PiperOrigin-RevId: 264241380	2019-08-19 14:30:13 -07:00
Shahriar Rouf	4c7f2d5dfb	Add BM_ZFlatAll, BM_ZFlatIncreasingTableSize benchmarks to see how well zippy performs when it is processing different data one after the other. PiperOrigin-RevId: 257518137	2019-08-19 14:30:00 -07:00
Bhargava Shastry	a58d4b03c5	Update travis config for fuzzer builds	2019-07-27 10:57:49 +02:00
Aman Gupta	d926a6bcb5	Updated to match .gitignore from google/leveldb	2019-07-20 12:49:48 -07:00
Aman Gupta	6662dfb5d4	Create .gitignore	2019-07-13 13:08:35 -07:00
Bhargava Shastry	d71375bf8a	Add libFuzzer harnesses, a cmake option to build them	2019-07-12 14:42:48 +02:00
Chris Mumford	156cd8939c	Removed reference to deprecated autotools. PiperOrigin-RevId: 253128048	2019-06-14 15:40:42 -07:00
Victor Costan	fe702ad2a3	Use GCC 9 on Travis CI PiperOrigin-RevId: 249995900	2019-05-25 14:37:17 -07:00
Chris Mumford	a3e012d762	The snappy landing page at http://google.github.io/snappy/ is served by [GitHub Pages](https://pages.github.com/) and lives in the gh-pages branch. This change moves the page contents to a more easily accessed Markdown file. PiperOrigin-RevId: 248561542	2019-05-16 11:11:34 -07:00
Chris Mumford	4312f49315	Merge pull request #75 from Maikuolan:patch-1 PiperOrigin-RevId: 248558516	2019-05-16 11:11:21 -07:00
Chris Mumford	407712f4c9	Merge pull request #76 from abyss7:patch-1 PiperOrigin-RevId: 248211389	2019-05-14 14:27:56 -07:00
Chris Mumford	8c188a6c78	Minor typo fix in README. PiperOrigin-RevId: 248170160	2019-05-14 11:05:38 -07:00
Chris Mumford	c76b053449	Sync TODO and comment processing with external repo. Copybara transforms code slightly differently than MOE. One example is the TODO username stripping, where Copybara produces different results than MOE did. This change moves the Copybara versions of comments to the public repository. Note: These changes didn't originate in cl/247950252. PiperOrigin-RevId: 247950252	2019-05-14 11:02:57 -07:00
Chris Mumford	54b6379e9f	Changed CMake version from 3.4 to that in CMakeLists.txt in README. PiperOrigin-RevId: 247484946	2019-05-13 10:11:19 -07:00
Victor Costan	0af4349bf0	Update Travis CI configuration. The Travis configuration: 1) Installs recent versions of clang and GCC. 2) Sets up the environment so that CMake picks up the installed compilers. Previously, the pre-installed clang compiler was used instead. 3) Requests a modern macOS image that has all the headers needed by GCC. The CL also removes now-unnecessary old workarounds from the Travis configuration. PiperOrigin-RevId: 245832795	2019-05-13 10:11:19 -07:00
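
Note on 041c608086 and 27ff130ff9: the portable methods those two commit messages describe can be sketched roughly as follows. This is a minimal illustration, not the code from the commits. The function names are hypothetical and chosen only for this example, and whether each function compiles down to a single instruction depends on the compiler and optimization level.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical names for illustration; not taken from the snappy sources.

// Portable unaligned load: copy the bytes into a stack integer.
// std::memcpy() is well-defined for any valid pointer, aligned or not.
inline std::uint32_t LoadUnaligned32(const void* p) {
  std::uint32_t v;
  std::memcpy(&v, p, sizeof(v));
  return v;
}

// Portable unaligned store: copy the integer's bytes to the destination.
inline void StoreUnaligned32(void* p, std::uint32_t v) {
  std::memcpy(p, &v, sizeof(v));
}

// Platform-independent little-endian load built from byte-level operations.
// The result is the same on little-endian and big-endian hosts.
inline std::uint32_t LoadLittleEndian32(const void* p) {
  const unsigned char* b = static_cast<const unsigned char*>(p);
  return static_cast<std::uint32_t>(b[0]) |
         (static_cast<std::uint32_t>(b[1]) << 8) |
         (static_cast<std::uint32_t>(b[2]) << 16) |
         (static_cast<std::uint32_t>(b[3]) << 24);
}

// Small self-check: store at an odd offset and read the value back.
inline bool RoundTripAtOddOffset() {
  unsigned char buf[8] = {};
  StoreUnaligned32(buf + 1, 0xAABBCCDDu);
  return LoadUnaligned32(buf + 1) == 0xAABBCCDDu;
}
```

Neither helper dereferences a reinterpret_cast pointer, so the alignment caveats listed in 041c608086 do not apply to them.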

239 Commits

All Branches