snappy

Commit Graph

Author	SHA1	Message	Date
Danila Kutenin	58d911b284	Add missing functional include for std::less_equal	2024-08-17 19:00:51 -07:00
danilak-G	b49982cc1e	Merge pull request #179 from mmorel-35/bzlmod chore(bazel): add MODULE.bazel files for bzlmod	2024-08-17 18:49:28 -07:00
danilak-G	07406b9380	Merge pull request #181 from AtomicRobotMan0101/main Update README.md	2024-08-17 18:47:30 -07:00
Danila Kutenin	2c94e11145	Release version 1.2.1	2024-05-21 19:36:39 +00:00
Danila Kutenin	465b5b60ca	Restore old compression functions to preserve ABI Fixes #183	2024-05-21 19:25:53 +00:00
Evan McBeth	dc3577f5b4	Update README.md typo on list	2024-04-18 18:13:32 +10:00
Matthieu MOREL	09d30d36f4	chore(bazel): add MODULE.bazel files for bzlmod Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>	2024-04-13 11:52:06 +02:00
danilak-G	52820ea9c6	Merge pull request #178 from jjerphan/build/update-version-to-1.2.0 Update version number to 1.2.0	2024-04-10 15:15:35 +01:00
Julien Jerphanion	ac6b63f042	Update version number to 1.2.0 Signed-off-by: Julien Jerphanion <git@jjerphan.xyz>	2024-04-08 14:14:45 +02:00
danilak-G	23b3286820	Merge pull request #175 from Quuxplusone/suggest-override Fix -Wsuggest-override warnings from Clang	2024-04-04 20:04:38 +01:00
Danila Kutenin	6b2eb7028b	Fix all compilation errors to be C++11 compliant	2024-04-04 19:00:14 +00:00
Danila Kutenin	ab38064abe	Fix compilation in the benchmark	2024-04-04 18:44:31 +00:00
Danila Kutenin	4e693db158	Use C++11 style instead of C++20	2024-04-04 18:42:29 +00:00
Danila Kutenin	a60fd602ce	Fix sync	2024-04-04 18:36:37 +00:00
Snappy Team	766d24c95e	Zippy level 2 for denser compression and faster decompression We also increased the hashtable size by 1 bit as it significantly degraded the ratio. Thus even level 1 might slightly improve. PiperOrigin-RevId: 621456036	2024-04-04 18:27:00 +00:00
Snappy Team	4f5cf9a8d6	Internal changes PiperOrigin-RevId: 599838882	2024-04-04 18:26:53 +00:00
Snappy Team	8bf2640823	Internal changes PiperOrigin-RevId: 599151099	2024-04-04 18:26:42 +00:00
Snappy Team	f0b0c9b8ce	Internal changes PiperOrigin-RevId: 597991348	2024-04-04 18:24:48 +00:00
Snappy Team	54d07d53a2	Restructure compression sampling for comparative analysis PiperOrigin-RevId: 597989810	2024-04-04 18:21:10 +00:00
Howard Hinnant	41a3ade229	Silence -Wdeprecated warning on clang * definition of implicit copy constructor for 'SnappySinkAllocator' is deprecated because it has a user-declared destructor.	2023-12-31 23:23:37 -05:00
Arthur O'Dwyer	8774875e55	Fix -Wsuggest-override warnings from Clang	2023-12-31 23:21:34 -05:00
Richard O'Grady	27f34a580b	Fix -Wsign-compare warning PiperOrigin-RevId: 547529709	2023-07-12 11:12:48 -07:00
Richard O'Grady	c9f9edf6d7	Fixes for Windows bazel build. Don't pass -Wno-sign-compare on Windows. Add a #define HAVE_WINDOWS_H if _WIN32 is defined. Don't assume sys/uio.h is available on Windows. PiperOrigin-RevId: 524416809	2023-04-14 18:02:20 -07:00
Richard O'Grady	66a30b803f	Add initial bazel build support for snappy. PiperOrigin-RevId: 524135175	2023-04-13 17:10:32 -07:00
Richard O'Grady	f725f6766b	Upgrade googletest to v1.13.0 release.	2023-04-13 10:31:13 -07:00
Richard O'Grady	8325392950	Disable Wimplicit-int-float-conversion warning in googletest PiperOrigin-RevId: 524031046	2023-04-13 10:04:53 -07:00
Richard O'Grady	108139d275	Upgrade benchmark library to v1.7.1 release.	2023-04-11 13:16:42 -07:00
Richard O'Grady	00aa9ac61d	Disable -Wsign-compare warning. PiperOrigin-RevId: 523460180	2023-04-11 11:55:49 -07:00
Richard O'Grady	cfc573e08f	Define missing SNAPPY_PREFETCH macros. PiperOrigin-RevId: 523287305	2023-04-11 10:38:23 -07:00
Ilya Tokar	92f18e66fd	Add prefetch to zippy compress PiperOrigin-RevId: 518358512	2023-03-29 17:31:17 -07:00
Snappy Team	f603a02008	Explicitly #include <utility> in snappy-internal.h snappy-internal.h uses std::pair, which is defined in the <utility> header. Typically, this works because existing C++ standard library implementations provide <utility> via other transitive includes; however, these transitive includes are not guaranteed to exist, and don't exist in certain contexts (e.g. compiling against LLVM's libc++ with Clang modules.) PiperOrigin-RevId: 517213822	2023-03-29 17:31:10 -07:00
Snappy Team	9c42b71b19	Optimize check for uncommon decompression for ARM, saving two instructions and three cycles. PiperOrigin-RevId: 517141646	2023-03-29 17:30:58 -07:00
Victor Costan	dc05e02648	Tag open source release 1.1.10. PiperOrigin-RevId: 515161676	2023-03-08 15:44:00 -08:00
Snappy Team	7b82423c59	The output buffer in DecompressBranchless is never read from and the source buffers are never written. This allows us to defer any writes to the output buffer for an arbitrary amount of time as long as the writes all occur in the proper order. When a MemCopy64 would have normally occurred we save away the source address and length. Once we reach the location of the next write to the output buffer first perform the deferred copy. This gives time for the source address calculation and length to finish before the deferred copy. This change gives 1.84% on CLX and 0.97% Milan. PiperOrigin-RevId: 504012310	2023-03-07 06:35:00 -08:00
Victor Costan	30326e5b8c	Merge pull request #150 from davemgreen:betterunalignedloads PiperOrigin-RevId: 501489679	2023-01-12 13:33:26 +00:00
Snappy Team	74960e8bd6	Allow some buffer overwrite on literal emitting Calls to memcpy seem to be quite expensive ``` BM_ZFlat/0 [html (22.24 %) ] 114µs ± 6% 110µs ± 6% -3.97% (p=0.000 n=118+115) BM_ZFlat/1 [urls (47.84 %) ] 1.63ms ± 5% 1.58ms ± 5% -3.39% (p=0.000 n=117+115) BM_ZFlat/2 [jpg (99.95 %) ] 7.84µs ± 6% 7.70µs ± 6% -1.66% (p=0.000 n=119+117) BM_ZFlat/3 [jpg_200 (73.00 %)] 265ns ± 6% 255ns ± 6% -3.48% (p=0.000 n=101+98) BM_ZFlat/4 [pdf (83.31 %) ] 11.8µs ± 6% 11.6µs ± 6% -2.14% (p=0.000 n=118+116) BM_ZFlat/5 [html4 (22.52 %) ] 525µs ± 6% 513µs ± 6% -2.36% (p=0.000 n=117+116) BM_ZFlat/6 [txt1 (57.87 %) ] 494µs ± 5% 480µs ± 6% -2.84% (p=0.000 n=118+116) BM_ZFlat/7 [txt2 (62.02 %) ] 444µs ± 4% 428µs ± 7% -3.51% (p=0.000 n=119+117) BM_ZFlat/8 [txt3 (55.17 %) ] 1.34ms ± 5% 1.30ms ± 5% -2.40% (p=0.000 n=120+116) BM_ZFlat/9 [txt4 (66.41 %) ] 1.84ms ± 5% 1.78ms ± 5% -3.55% (p=0.000 n=110+111) BM_ZFlat/10 [pb (19.61 %) ] 101µs ± 5% 97µs ± 5% -4.67% (p=0.000 n=118+118) BM_ZFlat/11 [gaviota (37.73 %)] 368µs ± 5% 360µs ± 6% -2.13% (p=0.000 n=91+90) BM_ZFlat/12 [cp (48.25 %) ] 38.9µs ± 6% 36.8µs ± 6% -5.36% (p=0.000 n=88+87) BM_ZFlat/13 [c (42.52 %) ] 13.4µs ± 6% 13.1µs ± 8% -2.38% (p=0.000 n=115+116) BM_ZFlat/14 [lsp (48.94 %) ] 4.05µs ± 4% 3.94µs ± 4% -2.58% (p=0.000 n=91+85) BM_ZFlat/15 [xls (41.10 %) ] 1.42ms ± 5% 1.39ms ± 7% -2.49% (p=0.000 n=116+117) BM_ZFlat/16 [xls_200 (78.00 %)] 313ns ± 6% 307ns ± 5% -1.89% (p=0.000 n=89+84) BM_ZFlat/17 [bin (18.12 %) ] 518µs ± 5% 506µs ± 5% -2.42% (p=0.000 n=118+116) BM_ZFlat/18 [bin_200 (7.50 %) ] 86.8ns ± 6% 85.3ns ± 6% -1.76% (p=0.000 n=118+114) BM_ZFlat/19 [sum (48.99 %) ] 67.9µs ± 4% 61.1µs ± 6% -9.96% (p=0.000 n=114+117) BM_ZFlat/20 [man (59.45 %) ] 5.64µs ± 6% 5.47µs ± 7% -3.06% (p=0.000 n=117+115) BM_ZFlatAll [21 kTestDataFiles] 9.23ms ± 4% 9.01ms ± 5% -2.44% (p=0.000 n=80+83) BM_ZFlatIncreasingTableSize [7 tables ] 30.4µs ± 5% 29.3µs ± 7% -3.45% (p=0.000 n=96+96) ``` PiperOrigin-RevId: 490184133	2023-01-12 13:33:17 +00:00
Ilya Tokar	37f375ddeb	Add prefetch to zippy decompess, PiperOrigin-RevId: 489554313	2023-01-12 13:33:10 +00:00
Snappy Team	15e2a0e13d	Add "cc" clobbers to inline asm that modifies flags. As far as we know, the lack of "cc" in the clobbers hasn't caused problems yet, but it could. This change is to improve correctness, and is also almost certainly performance neutral. PiperOrigin-RevId: 487133620	2023-01-12 13:33:01 +00:00
Snappy Team	8881ba172a	Improve the speed of hashing in zippy compression. This change replaces the hashing function used during compression with one that is roughly as good but faster. This speeds up compression by two to a few percent on the Intel-, AMD-, and Arm-based machines we tested. The amount of compression is roughly unchanged. PiperOrigin-RevId: 485960303	2023-01-12 13:32:54 +00:00
Snappy Team	a2d219a8a8	Modify MemCopy64 to use AVX 32 byte copies instead of SSE2 16 byte copies on capable x86 platforms. This gives an average speedup of 6.87% on Milan and 1.90% on Skylake. PiperOrigin-RevId: 480370725	2023-01-12 13:32:43 +00:00
Marcin Kowalczyk	984b191f0f	Fix the remaining occurrence of non-const `std::string::data()`. PiperOrigin-RevId: 479818960	2022-10-08 21:59:12 +02:00
Matt Callanan	974fcc49e8	Fix compilation errors under C++11. `std::string::data()` is const-only until C++17. PiperOrigin-RevId: 479708109	2022-10-08 08:41:35 +02:00
Marcin Kowalczyk	d644ca8770	Fix warnings due to use of `__attribute__(always_inline)` without `inline`. PiperOrigin-RevId: 478984028	2022-10-05 10:38:16 +02:00
Matt Callanan	9758c9dfd7	Add `snappy::CompressFromIOVec`. This reads from an `iovec` array rather than from a `char` array as in `snappy::Compress`. PiperOrigin-RevId: 476930623	2022-09-29 09:32:28 -07:00
Victor Costan	af720f9a3b	Merge pull request #148 from pitrou:ubsan-ptr-add-overflow PiperOrigin-RevId: 463090354	2022-07-27 15:28:16 +00:00
Marcin Kowalczyk	44caf79086	Move the comment about non-overlap requirement from the implementation to the contract of `MemCopy64()`, and clarify that it applies to `size`, not to 64. PiperOrigin-RevId: 453920284	2022-07-27 15:28:08 +00:00
Snappy Team	d261d2766f	Optimize zippy MemCpy / MemMove during decompression By default MemCpy() / MemMove() always copies 64 bytes in DecompressBranchless(). Profiling shows that the vast majority of the time we need to copy many fewer bytes (typically <= 16 bytes). It is safe to copy fewer bytes as long as we exceed len. This change improves throughput by ~12% on ARM, ~35% on AMD Milan, and ~7% on Intel Cascade Lake. PiperOrigin-RevId: 453917840	2022-07-27 15:27:58 +00:00
Snappy Team	6a2b78a379	Optimize Zippy compression for ARM by 5-10% by choosing csel instructions PiperOrigin-RevId: 444863689	2022-05-09 16:19:11 +00:00
Snappy Team	8dd58a519f	Fix compilation for older GCC and Clang versions. Not everything defining __GNUC__ supports flag outputs from asm statements; in particular, some Clang versions on macOS does not. The correct test per the GCC documentation is __GCC_ASM_FLAG_OUTPUTS__, so use that instead. PiperOrigin-RevId: 423749308	2022-02-20 18:19:45 +00:00
David Green	6c6e890ef9	Change LittleEndian loads/stores to use memcpy The existing code uses a series of 8bit loads with shifts and ors to emulate an (unaligned) load of a larger type. These are then expected to become single loads in the compiler, producing optimal assembly. Whilst this is true it happens very late in the compiler, meaning that throughout most of the pipeline it is treated (and cost-modelled) as multiple loads, shifts and ors. This can make the compiler make poor decisions (such as not unrolling loops that should be), or to break up the pattern before it is turned into a single load. For example the loops in CompressFragment do not get unrolled as expected due to a higher cost than the unroll threshold in clang. Instead this patch uses a more conventional methods of loading unaligned data, using a memcpy directly which the compiler will be able to deal with much more straight forwardly, modelling it as a single unaligned load. The old code is left as-is for big-endian systems. This helps improve the performance of the BM_ZFlat benchmarks by up to 10-15% on an Arm Neoverse N1. Change-Id: I986f845ebd0a0806d052d2be3e4dbcbee91713d7	2022-01-19 07:14:46 +00:00

366 Commits