Danila Kutenin
58d911b284
Add missing functional include for std::less_equal
2024-08-17 19:00:51 -07:00
danilak-G
b49982cc1e
Merge pull request #179 from mmorel-35/bzlmod
...
chore(bazel): add MODULE.bazel files for bzlmod
2024-08-17 18:49:28 -07:00
danilak-G
07406b9380
Merge pull request #181 from AtomicRobotMan0101/main
...
Update README.md
2024-08-17 18:47:30 -07:00
Danila Kutenin
2c94e11145
Release version 1.2.1
2024-05-21 19:36:39 +00:00
Danila Kutenin
465b5b60ca
Restore old compression functions to preserve ABI
...
Fixes #183
2024-05-21 19:25:53 +00:00
Evan McBeth
dc3577f5b4
Update README.md
...
typo on list
2024-04-18 18:13:32 +10:00
Matthieu MOREL
09d30d36f4
chore(bazel): add MODULE.bazel files for bzlmod
...
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2024-04-13 11:52:06 +02:00
danilak-G
52820ea9c6
Merge pull request #178 from jjerphan/build/update-version-to-1.2.0
...
Update version number to 1.2.0
2024-04-10 15:15:35 +01:00
Julien Jerphanion
ac6b63f042
Update version number to 1.2.0
...
Signed-off-by: Julien Jerphanion <git@jjerphan.xyz>
2024-04-08 14:14:45 +02:00
danilak-G
23b3286820
Merge pull request #175 from Quuxplusone/suggest-override
...
Fix -Wsuggest-override warnings from Clang
2024-04-04 20:04:38 +01:00
Danila Kutenin
6b2eb7028b
Fix all compilation errors to be C++11 compliant
2024-04-04 19:00:14 +00:00
Danila Kutenin
ab38064abe
Fix compilation in the benchmark
2024-04-04 18:44:31 +00:00
Danila Kutenin
4e693db158
Use C++11 style instead of C++20
2024-04-04 18:42:29 +00:00
Danila Kutenin
a60fd602ce
Fix sync
2024-04-04 18:36:37 +00:00
Snappy Team
766d24c95e
Zippy level 2 for denser compression and faster decompression
...
We also increased the hashtable size by 1 bit as it significantly degraded the ratio. Thus even level 1 might slightly improve.
PiperOrigin-RevId: 621456036
2024-04-04 18:27:00 +00:00
Snappy Team
4f5cf9a8d6
Internal changes
...
PiperOrigin-RevId: 599838882
2024-04-04 18:26:53 +00:00
Snappy Team
8bf2640823
Internal changes
...
PiperOrigin-RevId: 599151099
2024-04-04 18:26:42 +00:00
Snappy Team
f0b0c9b8ce
Internal changes
...
PiperOrigin-RevId: 597991348
2024-04-04 18:24:48 +00:00
Snappy Team
54d07d53a2
Restructure compression sampling for comparative analysis
...
PiperOrigin-RevId: 597989810
2024-04-04 18:21:10 +00:00
Howard Hinnant
41a3ade229
Silence -Wdeprecated warning on clang
...
* definition of implicit copy constructor for 'SnappySinkAllocator'
is deprecated because it has a user-declared destructor.
2023-12-31 23:23:37 -05:00
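Clang raises the warning in the commit above when a class declares a destructor but leaves its copy operations implicit. The sketch below shows one common remedy, which is to default the copy operations explicitly. `TrackingAllocator` is a hypothetical stand-in rather than the real `SnappySinkAllocator`, and the actual fix in the commit may differ.

```cpp
#include <cstddef>

// Hypothetical stand-in for a class with a user-declared destructor.
class TrackingAllocator {
 public:
  explicit TrackingAllocator(std::size_t block_size)
      : block_size_(block_size) {}

  // A user-declared destructor makes the implicitly generated copy
  // operations deprecated, which is what -Wdeprecated reports.
  ~TrackingAllocator() {}

  // Declaring the copy operations explicitly silences the warning.
  TrackingAllocator(const TrackingAllocator&) = default;
  TrackingAllocator& operator=(const TrackingAllocator&) = default;

  std::size_t block_size() const { return block_size_; }

 private:
  std::size_t block_size_;
};
```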
Arthur O'Dwyer
8774875e55
Fix -Wsuggest-override warnings from Clang
2023-12-31 23:21:34 -05:00
Richard O'Grady
27f34a580b
Fix -Wsign-compare warning
...
PiperOrigin-RevId: 547529709
2023-07-12 11:12:48 -07:00
Richard O'Grady
c9f9edf6d7
Fixes for Windows bazel build.
...
Don't pass -Wno-sign-compare on Windows.
Add a #define HAVE_WINDOWS_H if _WIN32 is defined.
Don't assume sys/uio.h is available on Windows.
PiperOrigin-RevId: 524416809
2023-04-14 18:02:20 -07:00
Richard O'Grady
66a30b803f
Add initial bazel build support for snappy.
...
PiperOrigin-RevId: 524135175
2023-04-13 17:10:32 -07:00
Richard O'Grady
f725f6766b
Upgrade googletest to v1.13.0 release.
2023-04-13 10:31:13 -07:00
Richard O'Grady
8325392950
Disable -Wimplicit-int-float-conversion warning in googletest
...
PiperOrigin-RevId: 524031046
2023-04-13 10:04:53 -07:00
Richard O'Grady
108139d275
Upgrade benchmark library to v1.7.1 release.
2023-04-11 13:16:42 -07:00
Richard O'Grady
00aa9ac61d
Disable -Wsign-compare warning.
...
PiperOrigin-RevId: 523460180
2023-04-11 11:55:49 -07:00
Richard O'Grady
cfc573e08f
Define missing SNAPPY_PREFETCH macros.
...
PiperOrigin-RevId: 523287305
2023-04-11 10:38:23 -07:00
Ilya Tokar
92f18e66fd
Add prefetch to zippy compress
...
PiperOrigin-RevId: 518358512
2023-03-29 17:31:17 -07:00
Snappy Team
f603a02008
Explicitly #include <utility> in snappy-internal.h
...
snappy-internal.h uses std::pair, which is defined in the <utility>
header. Typically, this works because existing C++ standard library
implementations provide <utility> via other transitive includes;
however, these transitive includes are not guaranteed to exist, and
don't exist in certain contexts (e.g. compiling against LLVM's libc++
with Clang modules.)
PiperOrigin-RevId: 517213822
2023-03-29 17:31:10 -07:00
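The sketch below illustrates the rule behind the commit above: a file that names `std::pair` includes `<utility>` itself instead of relying on another header to pull it in. `FindByte` is a hypothetical helper and is not part of snappy.

```cpp
#include <cstddef>
#include <utility>  // std::pair, std::make_pair: included directly

// Hypothetical helper: returns (offset, found) for the first occurrence
// of c in data[0, n). When c is absent, the offset is n.
inline std::pair<std::size_t, bool> FindByte(const char* data, std::size_t n,
                                             char c) {
  for (std::size_t i = 0; i < n; ++i) {
    if (data[i] == c) return std::make_pair(i, true);
  }
  return std::make_pair(n, false);
}
```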
Snappy Team
9c42b71b19
Optimize check for uncommon decompression for ARM, saving two instructions and three cycles.
...
PiperOrigin-RevId: 517141646
2023-03-29 17:30:58 -07:00
Victor Costan
dc05e02648
Tag open source release 1.1.10.
...
PiperOrigin-RevId: 515161676
2023-03-08 15:44:00 -08:00
Snappy Team
7b82423c59
The output buffer in DecompressBranchless is never read from and the source buffers are never written. This allows us to defer any write to the output buffer for an arbitrary amount of time, as long as all writes occur in the proper order. When a MemCopy64 would normally have occurred, we save the source address and length instead. Once we reach the location of the next write to the output buffer, we first perform the deferred copy. This gives the source address and length calculations time to finish before the deferred copy.
...
This change gives 1.84% on CLX and 0.97% on Milan.
PiperOrigin-RevId: 504012310
2023-03-07 06:35:00 -08:00
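The deferred-copy idea in the commit above can be sketched roughly as follows. `DeferredCopier` is a hypothetical class written for illustration, not the code in `DecompressBranchless`. It assumes, as the commit message states, that the output is not read and the sources are not written while a copy is pending.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical illustration: holds at most one pending copy and performs
// it just before the next one is recorded, so writes stay in order.
class DeferredCopier {
 public:
  DeferredCopier() : dst_(nullptr), src_(nullptr), len_(0) {}

  // Performs the previously saved copy (if any), then saves this one.
  void Copy(char* dst, const char* src, std::size_t len) {
    Flush();
    dst_ = dst;
    src_ = src;
    len_ = len;
  }

  // Performs the pending copy. Must be called before the output is read.
  void Flush() {
    if (len_ != 0) std::memcpy(dst_, src_, len_);
    len_ = 0;
  }

 private:
  char* dst_;
  const char* src_;
  std::size_t len_;
};
```

A caller would invoke `Flush()` once after the last `Copy()` so that the final pending write reaches the output.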
Victor Costan
30326e5b8c
Merge pull request #150 from davemgreen:betterunalignedloads
...
PiperOrigin-RevId: 501489679
2023-01-12 13:33:26 +00:00
Snappy Team
74960e8bd6
Allow some buffer overwrite on literal emitting
...
Calls to memcpy seem to be quite expensive
```
BM_ZFlat/0 [html (22.24 %) ] 114µs ± 6% 110µs ± 6% -3.97% (p=0.000 n=118+115)
BM_ZFlat/1 [urls (47.84 %) ] 1.63ms ± 5% 1.58ms ± 5% -3.39% (p=0.000 n=117+115)
BM_ZFlat/2 [jpg (99.95 %) ] 7.84µs ± 6% 7.70µs ± 6% -1.66% (p=0.000 n=119+117)
BM_ZFlat/3 [jpg_200 (73.00 %)] 265ns ± 6% 255ns ± 6% -3.48% (p=0.000 n=101+98)
BM_ZFlat/4 [pdf (83.31 %) ] 11.8µs ± 6% 11.6µs ± 6% -2.14% (p=0.000 n=118+116)
BM_ZFlat/5 [html4 (22.52 %) ] 525µs ± 6% 513µs ± 6% -2.36% (p=0.000 n=117+116)
BM_ZFlat/6 [txt1 (57.87 %) ] 494µs ± 5% 480µs ± 6% -2.84% (p=0.000 n=118+116)
BM_ZFlat/7 [txt2 (62.02 %) ] 444µs ± 4% 428µs ± 7% -3.51% (p=0.000 n=119+117)
BM_ZFlat/8 [txt3 (55.17 %) ] 1.34ms ± 5% 1.30ms ± 5% -2.40% (p=0.000 n=120+116)
BM_ZFlat/9 [txt4 (66.41 %) ] 1.84ms ± 5% 1.78ms ± 5% -3.55% (p=0.000 n=110+111)
BM_ZFlat/10 [pb (19.61 %) ] 101µs ± 5% 97µs ± 5% -4.67% (p=0.000 n=118+118)
BM_ZFlat/11 [gaviota (37.73 %)] 368µs ± 5% 360µs ± 6% -2.13% (p=0.000 n=91+90)
BM_ZFlat/12 [cp (48.25 %) ] 38.9µs ± 6% 36.8µs ± 6% -5.36% (p=0.000 n=88+87)
BM_ZFlat/13 [c (42.52 %) ] 13.4µs ± 6% 13.1µs ± 8% -2.38% (p=0.000 n=115+116)
BM_ZFlat/14 [lsp (48.94 %) ] 4.05µs ± 4% 3.94µs ± 4% -2.58% (p=0.000 n=91+85)
BM_ZFlat/15 [xls (41.10 %) ] 1.42ms ± 5% 1.39ms ± 7% -2.49% (p=0.000 n=116+117)
BM_ZFlat/16 [xls_200 (78.00 %)] 313ns ± 6% 307ns ± 5% -1.89% (p=0.000 n=89+84)
BM_ZFlat/17 [bin (18.12 %) ] 518µs ± 5% 506µs ± 5% -2.42% (p=0.000 n=118+116)
BM_ZFlat/18 [bin_200 (7.50 %) ] 86.8ns ± 6% 85.3ns ± 6% -1.76% (p=0.000 n=118+114)
BM_ZFlat/19 [sum (48.99 %) ] 67.9µs ± 4% 61.1µs ± 6% -9.96% (p=0.000 n=114+117)
BM_ZFlat/20 [man (59.45 %) ] 5.64µs ± 6% 5.47µs ± 7% -3.06% (p=0.000 n=117+115)
BM_ZFlatAll [21 kTestDataFiles] 9.23ms ± 4% 9.01ms ± 5% -2.44% (p=0.000 n=80+83)
BM_ZFlatIncreasingTableSize [7 tables ] 30.4µs ± 5% 29.3µs ± 7% -3.45% (p=0.000 n=96+96)
```
PiperOrigin-RevId: 490184133
2023-01-12 13:33:17 +00:00
Ilya Tokar
37f375ddeb
Add prefetch to zippy decompress
...
PiperOrigin-RevId: 489554313
2023-01-12 13:33:10 +00:00
Snappy Team
15e2a0e13d
Add "cc" clobbers to inline asm that modifies flags.
...
As far as we know, the lack of "cc" in the clobbers hasn't caused
problems yet, but it could. This change is to improve correctness,
and is also almost certainly performance neutral.
PiperOrigin-RevId: 487133620
2023-01-12 13:33:01 +00:00
Snappy Team
8881ba172a
Improve the speed of hashing in zippy compression.
...
This change replaces the hashing function used during compression with
one that is roughly as good but faster. This speeds up compression by
two to a few percent on the Intel-, AMD-, and Arm-based machines we
tested. The amount of compression is roughly unchanged.
PiperOrigin-RevId: 485960303
2023-01-12 13:32:54 +00:00
Snappy Team
a2d219a8a8
Modify MemCopy64 to use AVX 32 byte copies instead of SSE2 16 byte copies on capable x86 platforms. This gives an average speedup of 6.87% on Milan and 1.90% on Skylake.
...
PiperOrigin-RevId: 480370725
2023-01-12 13:32:43 +00:00
Marcin Kowalczyk
984b191f0f
Fix the remaining occurrence of non-const `std::string::data()`.
...
PiperOrigin-RevId: 479818960
2022-10-08 21:59:12 +02:00
Matt Callanan
974fcc49e8
Fix compilation errors under C++11.
...
`std::string::data()` is const-only until C++17.
PiperOrigin-RevId: 479708109
2022-10-08 08:41:35 +02:00
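Because `std::string::data()` returns `const char*` before C++17, C++11 code that writes into a string's buffer usually takes the address of the first element instead. The sketch below shows that pattern. `MutableData` and `CopyInto` are hypothetical names chosen for illustration, not snappy functions.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Returns a writable pointer to the string's buffer under C++11.
// s->data() would yield const char* here, so &(*s)[0] is used instead.
inline char* MutableData(std::string* s) {
  return s->empty() ? nullptr : &(*s)[0];
}

// Builds a string by writing directly into its storage.
inline std::string CopyInto(const char* src, std::size_t n) {
  std::string out;
  out.resize(n);
  if (n != 0) std::memcpy(MutableData(&out), src, n);
  return out;
}
```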
Marcin Kowalczyk
d644ca8770
Fix warnings due to use of `__attribute__(always_inline)` without `inline`.
...
PiperOrigin-RevId: 478984028
2022-10-05 10:38:16 +02:00
Matt Callanan
9758c9dfd7
Add `snappy::CompressFromIOVec`.
...
This reads from an `iovec` array rather than from a `char` array as in `snappy::Compress`.
PiperOrigin-RevId: 476930623
2022-09-29 09:32:28 -07:00
Victor Costan
af720f9a3b
Merge pull request #148 from pitrou:ubsan-ptr-add-overflow
...
PiperOrigin-RevId: 463090354
2022-07-27 15:28:16 +00:00
Marcin Kowalczyk
44caf79086
Move the comment about non-overlap requirement from the implementation to the
...
contract of `MemCopy64()`, and clarify that it applies to `size`, not to 64.
PiperOrigin-RevId: 453920284
2022-07-27 15:28:08 +00:00
Snappy Team
d261d2766f
Optimize zippy MemCpy / MemMove during decompression
...
By default MemCpy() / MemMove() always copies 64 bytes in DecompressBranchless(). Profiling shows that the vast majority of the time we need to copy many fewer bytes (typically <= 16 bytes). It is safe to copy fewer bytes as long as we exceed len.
This change improves throughput by ~12% on ARM, ~35% on AMD Milan, and ~7% on Intel Cascade Lake.
PiperOrigin-RevId: 453917840
2022-07-27 15:27:58 +00:00
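The commit above can be pictured as always copying a small fixed block and copying the rest only when the requested length needs it. The sketch below is a simplified illustration under stated assumptions: the 16-byte threshold comes from the commit message's "typically <= 16 bytes", `CopyAtLeast` is a hypothetical name, and both buffers are assumed to have 64 accessible, non-overlapping bytes. The real `MemCopy64` may use different block sizes.

```cpp
#include <cstddef>
#include <cstring>

// Copies at least `size` bytes (size <= 64) from src to dst.
// Assumed preconditions: src has 64 readable bytes, dst has 64 writable
// bytes, and the two ranges do not overlap.
inline void CopyAtLeast(char* dst, const char* src, std::size_t size) {
  // Common case: a short copy, done with one fixed-size block.
  std::memcpy(dst, src, 16);
  // Rare case: longer copies finish the remaining 48 bytes.
  if (size > 16) std::memcpy(dst + 16, src + 16, 48);
}
```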
Snappy Team
6a2b78a379
Optimize Zippy compression for ARM by 5-10% by choosing csel instructions
...
PiperOrigin-RevId: 444863689
2022-05-09 16:19:11 +00:00
Snappy Team
8dd58a519f
Fix compilation for older GCC and Clang versions.
...
Not everything defining __GNUC__ supports flag outputs
from asm statements; in particular, some Clang versions
on macOS do not. The correct test per the GCC documentation
is __GCC_ASM_FLAG_OUTPUTS__, so use that instead.
PiperOrigin-RevId: 423749308
2022-02-20 18:19:45 +00:00
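A guard of the kind the commit above describes might look like the sketch below, which uses a flag output only when `__GCC_ASM_FLAG_OUTPUTS__` is defined and otherwise falls back to plain C++. `UnsignedLess` is a hypothetical function for illustration, and the extra `__x86_64__` check is an assumption added here because the instruction shown is x86-64 specific.

```cpp
#include <cstdint>

// Returns true if a < b, comparing as unsigned 64-bit values.
inline bool UnsignedLess(std::uint64_t a, std::uint64_t b) {
#if defined(__GCC_ASM_FLAG_OUTPUTS__) && defined(__x86_64__)
  bool below;
  // "=@ccb" binds `below` to the "below" condition (carry flag) that
  // cmpq sets when a < b as unsigned values.
  asm("cmpq %2, %1" : "=@ccb"(below) : "r"(a), "r"(b));
  return below;
#else
  // Compilers without flag-output support take the portable path.
  return a < b;
#endif
}
```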
David Green
6c6e890ef9
Change LittleEndian loads/stores to use memcpy
...
The existing code uses a series of 8bit loads with shifts and ors to
emulate an (unaligned) load of a larger type. These are then expected to
become single loads in the compiler, producing optimal assembly. Whilst
this is true it happens very late in the compiler, meaning that
throughout most of the pipeline it is treated (and cost-modelled) as
multiple loads, shifts and ors. This can lead the compiler to make poor
decisions (such as not unrolling loops that should be unrolled), or to
break up the pattern before it is turned into a single load.
For example the loops in CompressFragment do not get unrolled as
expected due to a higher cost than the unroll threshold in clang.
Instead this patch uses a more conventional method of loading unaligned
data, using a memcpy directly, which the compiler will be able to deal
with much more straightforwardly, modelling it as a single unaligned
load. The old code is left as-is for big-endian systems.
This helps improve the performance of the BM_ZFlat benchmarks by up to
10-15% on an Arm Neoverse N1.
Change-Id: I986f845ebd0a0806d052d2be3e4dbcbee91713d7
2022-01-19 07:14:46 +00:00
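The two load styles contrasted in the commit above can be sketched side by side. The function names below are hypothetical and are not snappy's `LittleEndian` helpers. As the commit notes, the memcpy form matches the shift form only on little-endian hosts.

```cpp
#include <cstdint>
#include <cstring>

// Old style: emulate an unaligned little-endian 32-bit load with four
// byte loads combined by shifts and ors.
inline std::uint32_t Load32Shifts(const void* p) {
  const unsigned char* b = static_cast<const unsigned char*>(p);
  return static_cast<std::uint32_t>(b[0]) |
         (static_cast<std::uint32_t>(b[1]) << 8) |
         (static_cast<std::uint32_t>(b[2]) << 16) |
         (static_cast<std::uint32_t>(b[3]) << 24);
}

// New style: a fixed-size memcpy, which compilers typically model as a
// single unaligned load. The result is in host byte order.
inline std::uint32_t Load32Memcpy(const void* p) {
  std::uint32_t v;
  std::memcpy(&v, p, sizeof(v));
  return v;
}

// Reports whether the host stores the low-order byte first.
inline bool HostIsLittleEndian() {
  const std::uint16_t one = 1;
  unsigned char first;
  std::memcpy(&first, &one, 1);
  return first == 1;
}
```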