A fast compressor/decompressor

Overview

Snappy, a fast compressor/decompressor.

Introduction

Snappy is a compression/decompression library. It does not aim for maximum compression, or compatibility with any other compression library; instead, it aims for very high speeds and reasonable compression. For instance, compared to the fastest mode of zlib, Snappy is an order of magnitude faster for most inputs, but the resulting compressed files are anywhere from 20% to 100% bigger. (For more information, see "Performance", below.)

Snappy has the following properties:

  • Fast: Compression speeds at 250 MB/sec and beyond, with no assembler code. See "Performance" below.
  • Stable: Over the last few years, Snappy has compressed and decompressed petabytes of data in Google's production environment. The Snappy bitstream format is stable and will not change between versions.
  • Robust: The Snappy decompressor is designed not to crash in the face of corrupted or malicious input.
  • Free and open source software: Snappy is licensed under a BSD-type license. For more information, see the included COPYING file.

Snappy has previously been called "Zippy" in some Google presentations and the like.

Performance

Snappy is intended to be fast. On a single core of a Core i7 processor in 64-bit mode, it compresses at about 250 MB/sec or more and decompresses at about 500 MB/sec or more. (These numbers are for the slowest inputs in our benchmark suite; others are much faster.) In our tests, Snappy is usually faster than algorithms in the same class (e.g. LZO, LZF, QuickLZ, etc.) while achieving comparable compression ratios.

Typical compression ratios (based on the benchmark suite) are about 1.5-1.7x for plain text, about 2-4x for HTML, and of course 1.0x for JPEGs, PNGs and other already-compressed data. Similar numbers for zlib in its fastest mode are 2.6-2.8x, 3-7x and 1.0x, respectively. More sophisticated algorithms are capable of achieving yet higher compression ratios, although usually at the expense of speed. Of course, compression ratio will vary significantly with the input.
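
To relate these ratios to the "20% to 100% bigger" figure in the introduction, the size penalty can be derived from two compression ratios. The following is a minimal sketch; the function name RelativeSizeIncrease is hypothetical and not part of Snappy. For the same input, output size is proportional to 1/ratio, so pairing the low ends of the plain-text ranges above (1.5x versus 2.6x) gives 2.6 / 1.5 - 1, or roughly 73% bigger.

```cpp
#include <cassert>

// Hypothetical helper, not part of the Snappy API.
// Given the compression ratios two compressors achieve on the same input,
// returns how much larger compressor A's output is than compressor B's,
// as a fraction (1.0 means "100% bigger").
inline double RelativeSizeIncrease(double ratio_a, double ratio_b) {
  assert(ratio_a > 0.0 && ratio_b > 0.0);
  // Output size is input_size / ratio, so size_a / size_b == ratio_b / ratio_a.
  return ratio_b / ratio_a - 1.0;
}
```

The result depends entirely on which inputs are compared, so it should be read as an illustration of the arithmetic rather than as a benchmark.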

Although Snappy should be fairly portable, it is primarily optimized for 64-bit x86-compatible processors, and may run slower in other environments. In particular:

  • Snappy uses 64-bit operations in several places to process more data at once than would otherwise be possible.
  • Snappy assumes unaligned 32 and 64-bit loads and stores are cheap. On some platforms, these must be emulated with single-byte loads and stores, which is much slower.
  • Snappy assumes little-endian throughout, and needs to byte-swap data in several places if running on a big-endian platform.
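
The last two points can be illustrated with a small, portable sketch. The helper names below are hypothetical and are not Snappy's internal functions; the sketch only shows the general technique of doing an unaligned load through memcpy and byte-swapping on big-endian hosts.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helpers for illustration; not Snappy's actual internals.

// Loads 32 bits from a possibly unaligned address. Going through memcpy keeps
// the access well defined in C++; on platforms where unaligned loads are
// cheap, compilers commonly turn this into a single load instruction.
inline std::uint32_t UnalignedLoad32(const void* p) {
  std::uint32_t v;
  std::memcpy(&v, p, sizeof(v));
  return v;
}

// Returns true when the host stores the least significant byte first.
inline bool HostIsLittleEndian() {
  const std::uint16_t probe = 1;
  unsigned char first;
  std::memcpy(&first, &probe, 1);
  return first == 1;
}

// Reverses the byte order of a 32-bit value.
inline std::uint32_t ByteSwap32(std::uint32_t v) {
  return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
         ((v << 8) & 0x00FF0000u) | (v << 24);
}

// Reads a little-endian 32-bit value; the swap is only needed on
// big-endian hosts, which is the extra work mentioned above.
inline std::uint32_t LoadLittleEndian32(const void* p) {
  assert(p != nullptr);
  const std::uint32_t v = UnalignedLoad32(p);
  return HostIsLittleEndian() ? v : ByteSwap32(v);
}
```

On a little-endian host the swap branch is never taken, which is one reason such code tends to be cheapest on x86-compatible processors.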

Experience has shown that even heavily tuned code can be improved. Performance optimizations, whether for 64-bit x86 or other platforms, are of course most welcome; see "Contact", below.

Building

You need the CMake version specified in CMakeLists.txt or later to build:

git submodule update --init
mkdir build
cd build && cmake ../ && make

Usage

Note that Snappy, both the implementation and the main interface, is written in C++. However, several third-party bindings to other languages are available; see the home page for more information. Also, if you want to use Snappy from C code, you can use the included C bindings in snappy-c.h.

To use Snappy from your own C++ program, include the file "snappy.h" from your calling file, and link against the compiled library.

There are many ways to call Snappy, but the simplest possible is

snappy::Compress(input.data(), input.size(), &output);

and similarly

snappy::Uncompress(input.data(), input.size(), &output);

where "input" and "output" are both instances of std::string.

There are other interfaces that are more flexible in various ways, including support for custom (non-array) input sources. See the header file for more information.

Tests and benchmarks

When you compile Snappy, the following binaries are compiled in addition to the library itself. You do not need them to use the compressor from your own library, but they are useful for Snappy development.

  • snappy_benchmark contains microbenchmarks used to tune compression and decompression performance.
  • snappy_unittests contains unit tests, verifying correctness on your machine in various scenarios.
  • snappy_test_tool can benchmark Snappy against a few other compression libraries (zlib, LZO, LZF, and QuickLZ), if they were detected at configure time. To benchmark using a given file, give the compression algorithm you want to test Snappy against (e.g. --zlib) and then a list of one or more file names on the command line.

If you want to change or optimize Snappy, please run the tests and benchmarks to verify you have not broken anything.

The testdata/ directory contains the files used by the microbenchmarks, which should provide a reasonably balanced starting point for benchmarking. (Note that baddata[1-3].snappy are not intended as benchmarks; they are used to verify correctness in the presence of corrupted data in the unit test.)

The gflags library for handling of command-line flags is used if it's installed. You can find it at

https://gflags.github.io/gflags/

Contact

Snappy is distributed through GitHub. For the latest version and other information, see https://github.com/google/snappy.

Comments
  • Make CMake sanitizer agnostic

    Fixes a minor issue in #78 (see https://github.com/google/snappy/pull/78#issuecomment-522886351) and another one (see https://github.com/google/oss-fuzz/issues/2834#issuecomment-531785580)

    cla: yes 
    opened by bshastry 26
  • Set both VERSION and SOVERSION for target snappy

    Set proper values for properties VERSION and SOVERSION of target snappy, so that CMake produces the link libsnappy.so.1 to the shared library libsnappy.so.1.1.5 that is built. This link was available in 1.1.4 (with autotools) and it is still necessary for the run time linker to work. Its absence leads to run time errors like "error while loading shared libraries: libsnappy.so.1: cannot open shared object file: No such file or directory".

    opened by gdsotirov 26
  • Add libFuzzer harness and cmake option to build it

    Hello,

    I noticed that snappy is a security critical dependency of chromium and decided to create a fuzzer for it.

    This PR

    • adds a libFuzzer style harness for fuzzing snappy's compressor-decompressor
    • adds a CMake option (off by default) for building the said harness

    Once this PR is reviewed and approved, I plan to create a PR to oss-fuzz to upstream this fuzzer.

    CC @dor1s @inferno-chromium @kcc

    cla: yes 
    opened by bshastry 20
  • Error for missing README

    Shouldn't this rule produce README instead of README.tmp? Otherwise, I receive the following error:

    cat README.md > README.tmp
     /usr/bin/mkdir -p '/usr/src/tmp/package-snappy/usr/doc/snappy-1.1.5'
     /usr/bin/ginstall -c -m 644 ChangeLog COPYING INSTALL NEWS ./README format_description.txt framing_format.txt '/usr/src/tmp/package-snappy/usr/doc/snappy-1.1.5'
    /usr/bin/ginstall: cannot stat './README': No such file or directory
    Makefile:767: recipe for target 'install-dist_docDATA' failed
    make[1]: *** [install-dist_docDATA] Error 1
    make[1]: Leaving directory '/usr/src/tmp/snappy-1.1.5'
    Makefile:1230: recipe for target 'install-am' failed
    make: *** [install-am] Error 2
    
    opened by gdsotirov 20
  • CMake improvements

    I tried to get my CMake wishes into the previous PRs; some of them landed, and this is the rest.

    Since the changed CMake config is pretty important for downstream use, I hope this can be included before the initial CMake release.

    https://github.com/google/snappy/pull/38/commits/ad817785de16d59c6bc6a0a3ca379ca145453d2c is my take on one of CMake’s weak points; autotools allowed this before.

    opened by Optiligence 15
  • Revert

    Revert "Improve zippy decompression speed."

    This reverts commit 8bfb028b618747a6ac8af159c87e9196c729566f.

    The commit 8bfb028 (Improve zippy decompression speed) introduces a crash when snappy is compiled with _FORTIFY_SOURCE against musl libc. The backtrace reveals that it comes from calling memcpy on overlapping regions. Since this may have security implications, we had better revert it for now.

    Backtrace from the core dump created with make check:

    (gdb) bt
     #0  memcpy (__n=8, __os=0xb38c8367eea, __od=0xb38c8367eeb)
         at /usr/include/fortify/string.h:48
     #1  snappy::(anonymous namespace)::UnalignedCopy64 (src=0xb38c8367eea,
         dst=0xb38c8367eeb) at snappy.cc:92
     #2  0x00006fb4c7c31717 in snappy::(anonymous namespace)::IncrementalCopy (
         buf_limit=0xb38c8380ee0 "", op_limit=<optimized out>, op=<optimized out>,
         src=0xb38c8367eea " .\001") at snappy.cc:178
     #3  snappy::SnappyArrayWriter::AppendFromSelf (len=<optimized out>,
         offset=<optimized out>, this=<synthetic pointer>) at snappy.cc:1131
     #4  snappy::SnappyDecompressor::DecompressAllTags<snappy::SnappyArrayWriter> (
         writer=<synthetic pointer>, this=0x7f1d26737050) at snappy.cc:715
     #5  snappy::InternalUncompressAllTags<snappy::SnappyArrayWriter> (
         uncompressed_len=<optimized out>, writer=<synthetic pointer>,
         decompressor=0x7f1d26737050) at snappy.cc:799
     #6  snappy::InternalUncompress<snappy::SnappyArrayWriter> (
         writer=<synthetic pointer>, r=0x7f1d26737000) at snappy.cc:789
     #7  snappy::RawUncompress (compressed=compressed@entry=0x7f1d267370c0,
         uncompressed=0xb38c8367ee0 "  content: .\001") at snappy.cc:1149
     #8  0x00006fb4c7c3194d in snappy::RawUncompress (compressed=<optimized out>,
         n=<optimized out>, uncompressed=<optimized out>) at snappy.cc:1144
     #9  0x00000b38c5c6261a in snappy::BM_UFlat (iters=99, arg=<optimized out>)
         at snappy_unittest.cc:1371
     #10 0x00000b38c5c69846 in snappy::Benchmark::Run (this=0xb38c8323c40)
         at snappy-test.cc:192
     #11 0x00000b38c5c5f3fd in RunSpecifiedBenchmarks () at snappy-test.h:485
     #12 main (argc=1, argv=0x7f1d267374b8) at snappy_unittest.cc:1515
    
    opened by ncopa 14
  • In gcc __powerpc64__ not __ppc64__ defines the PPC64 architecture

    Corrects 18488d6212331fee647ecfded85353ab3ad91de8 and still maintains clang compatibility (as #27 originally did).

    Compiler tests:

    $ uname -m
    ppc64le
    $ cat /tmp/x.c

    $ gcc -c /tmp/x.c
    /tmp/x.c:3:2: warning: #warning powerpc64 exists [-Wcpp]
     #warning powerpc64 exists
      ^~~~~~~
    /tmp/x.c:11:2: error: #error ppc64 not defined
     #error ppc64 not defined
      ^~~~~
    $ clang /tmp/x.c
    /tmp/x.c:3:2: warning: powerpc64 exists [-W#warnings]
     ^
    /tmp/x.c:9:2: warning: ppc64 exists [-W#warnings]
     ^
    2 warnings generated.

    I should be on the CLA list already.

    cla: yes 
    opened by grooverdan 12
  • Added pkg-config file and .gitignore

    This is a redo of https://github.com/google/snappy/pull/55, which had merge conflicts and seemed abandoned.

    I implemented the suggested fix of using the variables provided by GNUInstallDirs.

    A notable difference is that the snappy.pc is now installed in <prefix>/lib/pkgconfig instead of <prefix>/share/pkgconfig, which seems to be more in line with the other packages installing libraries.

    I grabbed the absolute/relative logic from https://github.com/rtrlib/rtrlib/pull/150. When building on NixOS, GNUInstallDirs is overridden with absolute paths, which may not be under the prefix.

    Verification using a relative CMAKE_INSTALL_LIBDIR:

    $ cat libsnappy.pc
    prefix=/usr/local
    exec_prefix=/usr/local
    libdir=${prefix}/lib
    includedir=${prefix}/include
    
    Name: snappy
    Description: Fast compressor/decompressor library.
    Version: 1.1.7
    Libs: -L${prefix}/lib -lsnappy
    Cflags: -I${prefix}/include
    
    $ pkg-config libsnappy --libs
    -L/usr/local/lib -lsnappy
    $ pkg-config libsnappy --cflags
    -I/usr/local/include
    

    With an absolute CMAKE_INSTALL_LIBDIR, building with nix:

    prefix=/nix/store/hknzbw87dnmz7zq0lf99vjf7bf2l2hps-snappy-1.1.7
    exec_prefix=/nix/store/hknzbw87dnmz7zq0lf99vjf7bf2l2hps-snappy-1.1.7
    libdir=/nix/store/hknzbw87dnmz7zq0lf99vjf7bf2l2hps-snappy-1.1.7/lib
    includedir=/nix/store/z67qhiawswr2jq1p71r4f2987czx3slh-snappy-1.1.7-dev/include
    
    Name: snappy
    Description: Fast compressor/decompressor library.
    Version: 1.1.7
    Libs: -L/nix/store/hknzbw87dnmz7zq0lf99vjf7bf2l2hps-snappy-1.1.7/lib -lsnappy
    Cflags: -I/nix/store/z67qhiawswr2jq1p71r4f2987czx3slh-snappy-1.1.7-dev/include
    
    $ pkg-config libsnappy --libs
    -L/nix/store/hknzbw87dnmz7zq0lf99vjf7bf2l2hps-snappy-1.1.7/lib -lsnappy
    $ pkg-config libsnappy --cflags
    -I/nix/store/z67qhiawswr2jq1p71r4f2987czx3slh-snappy-1.1.7-dev/include
    

    Installing the snappy gem (i.e. it properly detected it and didn't recompile snappy):

    $ time gem install snappy
    Fetching snappy-0.0.17.gem
    Building native extensions. This could take a while...
    Successfully installed snappy-0.0.17
    Parsing documentation for snappy-0.0.17
    Installing ri documentation for snappy-0.0.17
    Done installing documentation for snappy after 0 seconds
    1 gem installed
    gem install snappy  1.25s user 0.57s system 80% cpu 2.270 total
    
    wontfix cla: yes 
    opened by lavoiesl 11
  • Allow building with cmake (to ease building on windows)

    The patch series allows for building on windows with the following steps:

    git clone git://github.com/trondn/snappy
    mkdir build
    cd build
    cmake -G "NMake Makefiles" ..\snappy
    nmake all test
    

    (It should also build on unix systems by dropping the "-G NMake Makefiles" part)

    opened by trondn 10
  • fix cmake build error

    I want to install libsnappy on Linux Mint 19. I have already installed GTest from GitHub, but make complained about an undefined reference:

    [100%] Linking CXX executable snappy_unittest
    CMakeFiles/snappy_unittest.dir/snappy_unittest.cc.o: In function `snappy::Snappy_ZeroOffsetCopy_Test::TestBody()':
    snappy_unittest.cc:(.text+0x680b): undefined reference to `testing::Message::Message()'
    

    It seems that CMakeLists.txt does not link to libgtest. Fix it by adding target_link_libraries(snappy_unittest gtest) in CMakeLists.txt.

    cla: yes 
    opened by LYKZZzz 9
  • Fix UBSan error (ptr + offset overflow)

    As i + offset is promoted to a "negative" size_t, UBSan would complain when adding the resulting offset to dst:

    /tmp/RtmptDX1SS/file584e37df4e/snappy_ep-prefix/src/snappy_ep/snappy.cc:343:43: runtime error: addition of unsigned offset to 0x6120003c5ec1 overflowed to 0x6120003c5ec0
        #0 0x7f9ebd21769c in snappy::(anonymous namespace)::Copy64BytesWithPatternExtension(char*, unsigned long) /tmp/RtmptDX1SS/file584e37df4e/snappy_ep-prefix/src/snappy_ep/snappy.cc:343:43
        #1 0x7f9ebd21769c in std::__1::pair<unsigned char const*, long> snappy::DecompressBranchless<char*>(unsigned char const*, unsigned char const*, long, char*, long) /tmp/RtmptDX1SS/file584e37df4e/snappy_ep-prefix/src/snappy_ep/snappy.cc:1160:15
    
    cla: yes 
    opened by pitrou 8
  • Add CIFuzz GitHub Action

    Add CIFuzz workflow action to have fuzzers build and run on each PR.

    This is a service offered by OSS-Fuzz, where Snappy already runs (https://github.com/google/oss-fuzz/tree/master/projects/snappy). CIFuzz can help detect regressions and catch fuzzing build issues early, and has a variety of features (see the URL above). In the current PR, the fuzzers get built on a pull request and will run for 300 seconds.

    Signed-off-by: David Korczynski

    opened by DavidKorczynski 0
  • Add new framing chunk types without checksums

    Adds two new chunk types to the Snappy framing format: compressed data without a checksum, and uncompressed data without a checksum. These types are identical to their existing counterparts except they do not contain a CRC-32C checksum. Essentially, this makes including checksums for each data chunk optional rather than required.

    In some use cases, computing the CRC-32C checksums for the data chunks in the Snappy framing format ends up dominating execution time. Eliminating the checksums provides a 2.5x performance improvement in our use of Snappy for compressing address trace data prior to storing it to disk.

    Existing readers of the Snappy framing format would be expected to fail up front on an unknown chunk type when encountering the new types, until updated to handle them, which should be a simple coding change.

    opened by derekbruening 1
  • Allow compiling with MSVC-compatible compilers like clang-cl

    In order to be able to compile with clang-cl, test for MSVC rather than CMAKE_CXX_COMPILER_ID STREQUAL "MSVC". The MSVC variable is also true for other compilers that are command-line compatible with MSVC, such as clang-cl.

    opened by goedderz 0
Releases (1.1.9)
  • 1.1.9 (May 4, 2021)

  • 1.1.8 (Jan 14, 2020)

  • 1.1.7 (Aug 25, 2017)

    • Improved CMake build support for 64-bit Linux distributions.
    • MSVC builds now use MSVC-specific intrinsics that map to clzll.
    • ARM64 (AArch64) builds use the code paths optimized for 64-bit processors.
  • 1.1.6 (Jul 13, 2017)

  • 1.1.5 (Jun 29, 2017)

    This release has broken SONAME / SOVERSION values. Users of snappy as a shared library should avoid 1.1.5 and use 1.1.6 instead. SONAME / SOVERSION errors will manifest as the dynamic library loader complaining that it cannot find snappy's shared library file (libsnappy.so / libsnappy.dylib), or that the library it found does not have the required version. 1.1.6 has the same code as 1.1.5, but carries build configuration fixes for the issues above.

    • Add CMake build support. The autoconf build support is now deprecated, and will be removed in the next release.
    • Add AppVeyor configuration, for Windows CI coverage.
    • Small performance improvement on little-endian PowerPC.
    • Small performance improvement on LLVM with position-independent executables.
    • Fix a few issues with various build environments.
  • 1.1.4 (Jan 27, 2017)

  • 1.1.3 (Jul 7, 2015)

    This is the first release to be done from GitHub, which means that some minor things, such as the ChangeLog format, have changed (git log format instead of svn log).

    • Add support for Uncompress() from a Source to a Sink.
    • Various minor changes to improve MSVC support; in particular, the unit tests now compile and run under MSVC.
Owner
Google
Google ❤️ Open Source
Experimental data compressor for 8-bit computers and low-end platforms

ZX5 (experimental) ZX5 is an experimental data compressor derived from ZX0, similarly targeted for low-end platforms, including 8-bit computers like t

Einar Saukas 9 Apr 14, 2022
Extremely Fast Compression algorithm

LZ4 - Extremely fast compression LZ4 is lossless compression algorithm, providing compression speed > 500 MB/s per core, scalable with multi-cores CPU

lz4 7.9k Dec 31, 2022
Zstandard - Fast real-time compression algorithm

Zstandard, or zstd as short version, is a fast lossless compression algorithm, targeting real-time compression scenarios at zlib-level and better comp

Facebook 19.2k Jan 1, 2023
Przemyslaw Skibinski 579 Jan 8, 2023
Parallel, indexed xz compressor

pixz Pixz (pronounced pixie) is a parallel, indexing version of xz. Repository: https://github.com/vasi/pixz Downloads: https://github.com/vasi/pixz/r

Dave Vasilevsky 620 Dec 30, 2022
A free, open-source compressor for the ZX0 format

salvador -- a fast, near-optimal compressor for the ZX0 format salvador is a command-line tool and a library that compresses bitstreams in the ZX0 for

Emmanuel Marty 35 Dec 26, 2022
Fast Binary Encoding is ultra fast and universal serialization solution for C++, C#, Go, Java, JavaScript, Kotlin, Python, Ruby, Swift

Fast Binary Encoding (FBE) Fast Binary Encoding allows to describe any domain models, business objects, complex data structures, client/server request

Ivan Shynkarenka 654 Jan 2, 2023
Peregrine - A blazing fast language for the blazing fast world(WIP)

A Blazing-Fast Language for the Blazing-Fast world. The Peregrine Programming Language Peregrine is a Compiled, Systems Programming Language, currentl

Peregrine 1.5k Jan 2, 2023
Fast, orthogonal, open multi-methods. Supersedes yomm11.

YOMM2 This is a complete rewrite of YOMM11, which is now deprecated. This library is much better, see here to find out why. TL;DR If you are familiar

Jean-Louis Leroy 246 Dec 25, 2022
A fast multi-producer, multi-consumer lock-free concurrent queue for C++11

moodycamel::ConcurrentQueue An industrial-strength lock-free queue for C++. Note: If all you need is a single-producer, single-consumer queue, I have

Cameron 7.4k Jan 3, 2023
A fast single-producer, single-consumer lock-free queue for C++

A single-producer, single-consumer lock-free queue for C++ This mini-repository has my very own implementation of a lock-free queue (that I designed f

Cameron 2.9k Jan 5, 2023
C++ implementation of a fast hash map and hash set using hopscotch hashing

A C++ implementation of a fast hash map and hash set using hopscotch hashing The hopscotch-map library is a C++ implementation of a fast hash map and

Thibaut Goetghebuer-Planchon 578 Dec 23, 2022
🏅State-of-the-art learned data structure that enables fast lookup, predecessor, range searches and updates in arrays of billions of items using orders of magnitude less space than traditional indexes

The Piecewise Geometric Model index (PGM-index) is a data structure that enables fast lookup, predecessor, range searches and updates in arrays of bil

Giorgio Vinciguerra 651 Dec 29, 2022
Fast & memory efficient hashtable based on robin hood hashing for C++11/14/17/20

➵ robin_hood unordered map & set robin_hood::unordered_map and robin_hood::unordered_set is a platform independent replacement for std::unordered_map

Martin Ankerl 1.3k Jan 5, 2023
C++ implementation of a fast hash map and hash set using robin hood hashing

A C++ implementation of a fast hash map and hash set using robin hood hashing The robin-map library is a C++ implementation of a fast hash map and has

Thibaut Goetghebuer-Planchon 872 Dec 26, 2022