Zstandard - Fast real-time compression algorithm

Related tags

Compression zstd
Overview

Zstandard

Zstandard, or zstd for short, is a fast lossless compression algorithm, targeting real-time compression scenarios at zlib-level and better compression ratios. It's backed by a very fast entropy stage, provided by the Huff0 and FSE libraries.

The project is provided as an open-source, dual BSD- and GPLv2-licensed C library, and a command-line utility producing and decoding .zst, .gz, .xz and .lz4 files. Should your project require another programming language, a list of known ports and bindings is provided on the Zstandard homepage.


Benchmarks

For reference, several fast compression algorithms were tested and compared on a server running Arch Linux (Linux version 5.5.11-arch1-1), with a Core i9-9900K CPU @ 5.0GHz, using lzbench, an open-source in-memory benchmark by @inikep compiled with gcc 9.3.0, on the Silesia compression corpus.

Compressor name Ratio Compression Decompress.
zstd 1.4.5 -1 2.884 500 MB/s 1660 MB/s
zlib 1.2.11 -1 2.743 90 MB/s 400 MB/s
brotli 1.0.7 -0 2.703 400 MB/s 450 MB/s
zstd 1.4.5 --fast=1 2.434 570 MB/s 2200 MB/s
zstd 1.4.5 --fast=3 2.312 640 MB/s 2300 MB/s
quicklz 1.5.0 -1 2.238 560 MB/s 710 MB/s
zstd 1.4.5 --fast=5 2.178 700 MB/s 2420 MB/s
lzo1x 2.10 -1 2.106 690 MB/s 820 MB/s
lz4 1.9.2 2.101 740 MB/s 4530 MB/s
zstd 1.4.5 --fast=7 2.096 750 MB/s 2480 MB/s
lzf 3.6 -1 2.077 410 MB/s 860 MB/s
snappy 1.1.8 2.073 560 MB/s 1790 MB/s

The negative compression levels, specified with --fast=#, offer faster compression and decompression speed in exchange for some loss in compression ratio compared to level 1, as seen in the table above.

Zstd can also offer stronger compression ratios at the cost of compression speed. The speed vs. compression trade-off is configurable in small increments. Decompression speed is preserved and remains roughly the same at all settings, a property shared by most LZ compression algorithms, such as zlib or lzma.

The following tests were run on a server running Linux Debian (Linux version 4.14.0-3-amd64) with a Core i7-6700K CPU @ 4.0GHz, using lzbench, an open-source in-memory benchmark by @inikep compiled with gcc 7.3.0, on the Silesia compression corpus.

[Figures: Compression Speed vs Ratio, and Decompression Speed]

A few other algorithms can produce higher compression ratios at slower speeds, falling outside the graph. A larger picture including slow modes is also available.

The case for Small Data compression

The previous charts provide results applicable to typical file and stream scenarios (several MB). Small data is a different story.

The smaller the amount of data to compress, the more difficult it is to compress. This problem is common to all compression algorithms, and the reason is that compression algorithms learn from past data how to compress future data. But at the beginning of a new data set, there is no "past" to build upon.

To solve this situation, Zstd offers a training mode, which can be used to tune the algorithm for a selected type of data. Training Zstandard is achieved by providing it with a few samples (one file per sample). The result of this training is stored in a file called "dictionary", which must be loaded before compression and decompression. Using this dictionary, the compression ratio achievable on small data improves dramatically.

The following example uses the github-users sample set, created from the GitHub public API. It consists of roughly 10K records weighing about 1KB each.

[Figures: Compression Ratio, Compression Speed, and Decompression Speed]

These compression gains are achieved while simultaneously providing faster compression and decompression speeds.

Training works if there is some correlation in a family of small data samples. The more data-specific a dictionary is, the more efficient it is (there is no universal dictionary). Hence, deploying one dictionary per type of data will provide the greatest benefits. Dictionary gains are mostly effective in the first few KB. Then, the compression algorithm will gradually use previously decoded content to better compress the rest of the file.

Dictionary compression How To:

  1. Create the dictionary

    zstd --train FullPathToTrainingSet/* -o dictionaryName

  2. Compress with dictionary

    zstd -D dictionaryName FILE

  3. Decompress with dictionary

    zstd -D dictionaryName --decompress FILE.zst

Build instructions

Makefile

If your system is compatible with standard make (or gmake), invoking make in the root directory will generate the zstd CLI in the root directory.

Other available options include:

  • make install : create and install zstd cli, library and man pages
  • make check : build and run zstd, testing its behavior on the local platform

cmake

A cmake project generator is provided within build/cmake. It can generate Makefiles or other build scripts to create the zstd binary, and the libzstd dynamic and static libraries.

By default, CMAKE_BUILD_TYPE is set to Release.

Meson

A Meson project is provided within build/meson. Follow build instructions in that directory.

You can also take a look at the .travis.yml file for an example of how Meson is used to build this project.

Note that default build type is release.

VCPKG

You can build and install zstd using the vcpkg dependency manager:

git clone https://github.com/Microsoft/vcpkg.git
cd vcpkg
./bootstrap-vcpkg.sh
./vcpkg integrate install
./vcpkg install zstd

The zstd port in vcpkg is kept up to date by Microsoft team members and community contributors. If the version is out of date, please create an issue or pull request on the vcpkg repository.

Visual Studio (Windows)

Going into the build directory, you will find additional possibilities:

  • Projects for Visual Studio 2005, 2008 and 2010.
    • VS2010 project is compatible with VS2012, VS2013, VS2015 and VS2017.
  • Automated build scripts for the Visual compiler by @KrzysFR, in build/VS_scripts, which will build the zstd CLI and libzstd library without any need to open a Visual Studio solution.

Buck

You can build the zstd binary via buck by executing: buck build programs:zstd from the root of the repo. The output binary will be in buck-out/gen/programs/.

Testing

You can run quick local smoke tests by executing the playTest.sh script from the src/tests directory. Two environment variables, $ZSTD_BIN and $DATAGEN_BIN, are needed for the test script to locate the zstd and datagen binaries. For information on CI testing, please refer to TESTING.md.

Status

Zstandard is currently deployed within Facebook. It is used continuously to compress large amounts of data in multiple formats and use cases. Zstandard is considered safe for production environments.

License

Zstandard is dual-licensed under BSD and GPLv2.

Contributing

The dev branch is the one where all contributions are merged before reaching release. If you plan to propose a patch, please commit to the dev branch, or to its own feature branch. Direct commits to release are not permitted. For more information, please read CONTRIBUTING.

Comments
  • Compressing individual documents with a dictionary produced from a sampled batch to get better compression ratio

    I'm very excited to see work being done on dictionary support in the API, because this is something that could greatly help me solve a pressing problem.

    Context

    In the context of a Document Store, we are storing a set of JSON-like documents that share the same schema. Each document can be created, read or updated individually, in a random fashion. We would like to compress the documents on disk, but there is very little redundancy within each document, which yields a very poor compression ratio (maybe 10-20%). When compressing batches of tens or hundreds of documents, the compression ratio gets really good (10x, 50x or sometimes even more), because there is a lot of redundancy between documents, coming from:

    • the structure of the JSON itself which has a lot of ": ", or ": true, or {[[...]],[[...]]} symbols.
    • the names of the JSON fields: "Id", "Name", "Label", "SomeVeryLongFieldNameThatIsPresentOnlyOncePerDocument", etc..
    • frequent values like constants (true, "Red", "Administrator", ...), keywords, dates that start with 2015-12-14T.... for the next 24h, and even well-known or frequently used GUIDs that are shared by documents (Product Category, Tag Id, hugely popular nodes in graph databases, ...)

    In the past, I used femtozip (https://github.com/gtoubassi/femtozip), which is intended precisely for this use case. It includes a dictionary training step (building a sample batch of documents), whose output is then used to compress and decompress single documents with the same compression ratio as if they were a batch. Using real-life data, compressing 1000 documents individually would give the same compression ratio as compressing all 1000 documents in a batch with gzip -5.

    The dictionary training part of femtozip can be very long: the more samples, the better the compression ratio will be in the end, but you need tons of RAM to train it.

    Also, I realized that femtozip would sometimes offset the differences in size between different formats like JSON/BSON/JSONB/ProtoBuf and other binary formats, because it would pick up the "grammar" of the format (text or binary) in the dictionary, and only deal with the "meat" of the documents (guids, integers, doubles, natural text) when compressing. This means I can use a format like JSONB (used by Postgres) which is less compact, but is faster to decode at runtime than JSON text.

    Goal

    I would like to be able to do something similar with Zstandard. I don't really care about building the most efficient dictionary (though it could be nice), but at least being able to exploit the fact that FSE builds a list of tokens sorted by frequency. Extracting this list of tokens may help in building a dictionary that will have the most common tokens in the training batch.

    The goal would be:

    • For each new or modified document D, compress it AS IF we were compressing SAMPLES[cur_gen] + D.json, and only storing the bits produced by the D.json part.
    • When reading document D, decompress it AS IF we had the complete compressed version of SAMPLES[D.gen] + D.compressed, and only keeping the last decoded bits that make up D.

    Since it would be impractical to change the compression code to be able to know which compressed bits come from D, and which from the batch, we could approximate this by computing a DICTIONARY[gen] that would be used to initialize the compressor and decompressor.

    Idea
    • Start by serializing an empty object into JSON (we would get the json structure and all the field names, but no values)
    • Use this as the initial "gen 0" dictionary for the first batch of documents (when starting with an empty database)
    • After N documents, sample k random documents and compress them to produce a "generation 1" dictionary.
    • Compress each new or updated document (individually) with this new dictionary
    • After another N documents, or if some heuristic shows that compression ratio starts declining, then start a new generation of dictionary.

    The Document Store would durably store each generation of dictionaries, and use them to decompress older entries. Periodically, it could recycle the entire store by recompressing everything with the most recent dictionary.

    Concrete example:

    Training set:

    • { "id": 123, "label": "Hello", "enabled": true, "uuid": "9ad51b87-d627-4e04-85c2-d6cb77415981" }
    • { "id": 126, "label": "Hell", "enabled": false, "uuid": "0c8e13a5-cdc8-4e1f-8e80-4fee025ee59c" }
    • { "id": 129, "label": "Help", "enabled": true, "uuid": "fe6db321-cddd-4e7f-b3d6-6b38365b3e2a" }

    Looking at it, we can extract the following repeating segments: { "id": 12.., "label": "Hel... ", "enabled": ... e, "uuid": " ... " }, which could be condensed into:

    • { "id": 12, "label": "Hel", "enabled": e, "uuid":"" } (53 bytes shared by all docs)

    The unique part of each document would be:

    • ...3...lo...tru...9ad51b87-d627-4e04-85c2-d6cb77415981 (42 bytes)
    • ...6...l...fals...0c8e13a5-cdc8-4e1f-8e80-4fee025ee59c (42 bytes)
    • ...9......tru...fe6db321-cddd-4e7f-b3d6-6b38365b3e2a (40 bytes)

    Zstd would only have to work on 42 bytes per doc, instead of 85 bytes. More realistic documents will have a lot more stuff in common than this example.

    What I've tested so far
    • create "gen0" dictionary with "hollow" JSON: { "id": , "foo": "", "bar": "", ....} produced by removing all values from the JSON document.
    • using ZSTD_compress_insertDictionary, compressing { "id": 123, "foo": "Hello", "bar": "World", ...} is indeed smaller than without dictionary.
    • looking at cctx->litStart, I can see a buffer with 123HelloWorld which is exactly the content specific to the document itself that got removed when producing the gen0 dict.

    Maybe one way to construct a better dictionary would be:

    • compress the batch of random and complete document (with values)
    • take K first symbols ordered by frequency descending
    • create the dictionary by outputting symbol K-1, then K-2, up to 0 (I guess that if the most frequent symbol is at the end of the dictionary, offsets to it would be smaller?)
    • maybe one could ask for a target dictionary size, and K would be the number of symbols needed to fill the dictionary?
    What I'm not sure about
    • I don't know how having a dictionary would help with larger documents above 128KB or 256KB. Currently I'm only inserting the dictionary for the first block. Would I need to reuse the same dictionary for each 128KB block?
    • What is the best size for this dictionary? 16KB? 64KB? 128KB?
    • ZSTD_decompress_insertDictionary branches off into different implementations for lazy, greedy and so on. I'm not sure if all compression strategies can be used to produce such a dictionary?

    Again, I don't care about producing the ideal dictionary with the smallest possible result, only something that would give me a better compression ratio, while still being able to handle documents in isolation.

    opened by KrzysFR 86
  • plans for packaging on different platforms (package manager integration)

    Are there any plans on getting zstd into the main package manager repositories of various (or all if possible) platforms? Is there a list for this already?

    A list of platforms includes (but is not limited to):

    • GNU/Linux
      • debian derived
        • [x] *buntu (aptitude)
          • package (xenial : 0.5.1-1 (outdated) ; yakkety : 0.8.0-1 (compatible) ; zesty : 1.1.2-1 (current) )
        • [x] Debian (aptitude)
      • Red Hat
        • [x] Fedora&Red Hat Enterprise Linux
      • SUSE
      • other
    • Unix&BSD
    • Windows
      • [x] Windows (some MSI thing and plain exes)
      • [x] MSYS2
      • [ ] Cygwin
    opened by benaryorg 51
  • Struggling with ZSTD_decompressBlock

    Hi,

    I'm having a problem when using the block-based methods.

    If I use 'ZSTD_compressContinue' with 'ZSTD_decompressContinue' then my codes work fine:

    static Bool ZSTDCompress(File &src, File &dest, Int compression_level)
    {
       Bool ok=false;
       if(ZSTD_CCtx *ctx=ZSTD_createCCtx_advanced(ZSTDMem))
       {
          ZSTD_parameters params; Zero(params);
          params.cParams=ZSTD_getCParams(Mid(compression_level, 1, ZSTD_maxCLevel()), src.left(), 0);
          if(!ZSTD_isError(ZSTD_compressBegin_advanced(ctx, null, 0, params, src.left())))
          {
             // sizes for 'window_size', 'block_size', 's', 'd' were taken from "zstd" tutorial, "zbuff_compress.c" file, "ZBUFF_compressInit_advanced" function
           C Int window_size=1<<params.cParams.windowLog, block_size=Min(window_size, ZSTD_BLOCKSIZE_MAX);
             Memt<Byte> s, d; s.setNum(window_size+block_size); d.setNum(ZSTDSize(block_size)+1); Int s_pos=0;
             for(; !src.end(); )
             {
                Int read=Min(ZSTD_BLOCKSIZE_MAX, Min(s.elms(), src.left())); // ZSTD_BLOCKSIZE_MAX taken from 'ZBUFF_recommendedCInSize' (without this, 'ZSTD_compressContinue' may fail with 'dest' too small error)
                if(s_pos>s.elms()-read)s_pos=0; // if reading will exceed buffer size
                read=src.getReturnSize(&s[s_pos], read); if(read<=0)goto error;
                auto size=ZSTD_compressContinue(ctx, d.data(), d.elms(), &s[s_pos], read); if(ZSTD_isError(size))goto error;
                if(!dest.put(d.data(), size))goto error;
                s_pos+=read;
             }
             auto size=ZSTD_compressEnd(ctx, d.data(), d.elms()); if(ZSTD_isError(size))goto error;
             if(dest.put(d.data(), size))ok=true;
          }
       error:
          ZSTD_freeCCtx(ctx);
       }
       return ok;
    }
    static Bool ZSTDDecompress(File &src, File &dest, Long compressed_size, Long decompressed_size)
    {
       Bool ok=false;
       if(ZSTD_DCtx *ctx=ZSTD_createDCtx_advanced(ZSTDMem))
       {
          ZSTD_decompressBegin(ctx);
          Byte header[ZSTD_frameHeaderSize_max];
          Long pos=src.pos();
          Int read=src.getReturnSize(header, SIZE(header));
          src.pos(pos);
          ZSTD_frameParams frame; if(!ZSTD_getFrameParams(&frame, header, read))
          {
             Long start=dest.pos();
             // sizes for 'block_size', 's', 'd' were taken from "zstd" tutorial, "zbuff_decompress.c" file, "ZBUFF_decompressContinue" function
           C auto block_size=Min(frame.windowSize, ZSTD_BLOCKSIZE_MAX);
             Memt<Byte> s; s.setNum(block_size);
             for(;;)
             {
                auto size=ZSTD_nextSrcSizeToDecompress(ctx); if(!size){if(dest.pos()-start==decompressed_size)ok=true; break;} if(ZSTD_isError(size) || size>s.elms())break;
                if(!src.getFast(s.data(), size))break; // need exactly 'size' amount
                size=ZSTD_decompressContinue(ctx, dest.mem(), dest.left(), s.data(), size); if(ZSTD_isError(size))break;
                if(!MemWrote(dest, size))break;
             }
          }
          ZSTD_freeDCtx(ctx);
       }
       return ok;
    }
    

    But if I replace them with 'ZSTD_compressBlock' and 'ZSTD_decompressBlock', (including writing/reading the compressed buffer size before each buffer), then decompression fails:

    static Bool ZSTDCompressRaw(File &src, File &dest, Int compression_level)
    {
       Bool ok=false;
       if(ZSTD_CCtx *ctx=ZSTD_createCCtx_advanced(ZSTDMem))
       {
          ZSTD_parameters params; Zero(params);
          params.cParams=ZSTD_getCParams(Mid(compression_level, 1, ZSTD_maxCLevel()), src.left(), 0);
          if(!ZSTD_isError(ZSTD_compressBegin_advanced(ctx, null, 0, params, src.left())))
          {
             // sizes for 'window_size', 'block_size', 's', 'd' were taken from "zstd" tutorial, "zbuff_compress.c" file, "ZBUFF_compressInit_advanced" function
           C Int window_size=1<<params.cParams.windowLog, block_size=Min(window_size, ZSTD_BLOCKSIZE_MAX);
             Memt<Byte> s, d; s.setNum(window_size+block_size); d.setNum(ZSTDSize(block_size)+1); Int s_pos=0;
             dest.cmpUIntV(params.cParams.windowLog);
             for(; !src.end(); )
             {
                Int read=Min(ZSTD_BLOCKSIZE_MAX, Min(s.elms(), src.left())); // ZSTD_BLOCKSIZE_MAX taken from 'ZBUFF_recommendedCInSize' (without this, 'ZSTD_compressContinue' may fail with 'dest' too small error)
                if(s_pos>s.elms()-read)s_pos=0; // if reading will exceed buffer size
                read=src.getReturnSize(&s[s_pos], read); if(read<=0)goto error;
                auto size=ZSTD_compressBlock(ctx, d.data(), d.elms(), &s[s_pos], read); if(ZSTD_isError(size))goto error;
                if(  size>0) // compressed OK
                {
                   dest.cmpIntV(size-1);
                   if(!dest.put(d.data(), size))goto error;
                }else // failed to compress
                {
                   dest.cmpIntV(-read);
                   if(!dest.put(&s[s_pos], read))goto error;
                }
                s_pos+=read;
             }
             ok=true;
          }
       error:
          ZSTD_freeCCtx(ctx);
       }
       return ok;
    }
    static Bool ZSTDDecompressRaw(File &src, File &dest, Long compressed_size, Long decompressed_size)
    {
       Bool ok=false;
       if(ZSTD_DCtx *ctx=ZSTD_createDCtx_advanced(ZSTDMem))
       {
          ZSTD_decompressBegin(ctx);
          // sizes for 'block_size', 's', 'd' were taken from "zstd" tutorial, "zbuff_decompress.c" file, "ZBUFF_decompressContinue" function
        C auto window_size=1<<src.decUIntV(), block_size=Min(window_size, ZSTD_BLOCKSIZE_MAX);
          Memt<Byte> s; s.setNum(block_size);
          for(; !src.end(); )
          {
             Int chunk; src.decIntV(chunk);
             if( chunk<0) // un-compressed
             {
                if(!src.copy(dest, -chunk))goto error;
             }else
             {
                chunk++; if(chunk>s.elms())goto error;
                if(!src.getFast(s.data(), chunk))goto error; // need exactly 'chunk' amount
                auto size=ZSTD_decompressBlock(ctx, dest.mem(), dest.left(), s.data(), chunk); if(ZSTD_isError(size))Exit(ZSTD_getErrorName(size)); // here the error occurs
                if(!MemWrote(dest, size))goto error; // this does: dest.mem+=size; and dest.left-=size;
             }
          }
          ok=true;
       error:
          ZSTD_freeDCtx(ctx);
       }
       return ok;
    }
    

    The error occurs at the second call to ZSTD_decompressBlock.

    First call succeeds: chunk=96050; size=ZSTD_decompressBlock(ctx, dest.mem(), dest.left(), s.data(), chunk) returns size=131072.

    Second call fails: chunk=94707; size=ZSTD_decompressBlock(ctx, dest.mem(), dest.left(), s.data(), chunk) returns size=18446744073709551605 ("Corrupted block detected").

    Am I missing something obvious here?

    When decompressing, the 'dest' File, in this test is a continuous memory capable of storing the entire decompressed data. And with each decompression call, I am advancing 'dest.mem' to the next decompressed chunk position.

    Thanks for any help

    opened by GregSlazinski 51
  • Weird issues with using the streaming API vs `ZSTD_compress()`

    I'm trying out the streaming API to write the equivalent of .NET's GZipStream (source), and I'm seeing some strange things.

    I'm using 0.6.1 for the tests, though the API seems to be unchanged in 0.7 at the moment.

    A stream works by having an internal buffer of 128 KB (131,072 bytes exactly). Each call to Write(..) appends any number of bytes to the buffer (it could be called with 1 byte, or with 1 GB). Every time the buffer is full, its content is compressed via ZSTD_compressContinue() into an empty destination buffer, and the result is copied into another stream down the line. When the producer is finished writing, it will Close the stream, which compresses any pending data in the internal buffer (so anywhere between 1 and 131,071 bytes), calls ZSTD_compressEnd(), and flushes the final bytes to the stream.

    Seen from zstd, the pattern looks like:

    • ZSTD_compressBegin()
    • ZSTD_compressContinue() 131,072 bytes
    • ZSTD_compressContinue() 131,072 bytes
    • ...
    • ZSTD_compressContinue() 123 bytes (last chunk will always be < 128KB)
    • ZSTD_compressEnd()

    I'm comparing the final result, with calling ZSTD_compress() on the complete content of the input stream (ie: storing everything written into a memory buffer, and compress that in one step).

    Issue 1: ZSTD_compress() adds an extra empty frame at the start

    Looking at the compressed result, I see that usually a single call to ZSTD_compress() adds 6 bytes to the input.

    The left side is the compressed output of ZSTD_compress() on the whole file. The right side is the result of streaming with chunks of 128 KB on the same data:

    Left size: 23,350 bytes Right size: 23,344 bytes

    image

    The green part is identical between both files, only 7 bytes differ right after the header, and before the first compressed frame.

    Both results, when passed to ZSTD_decompress() return the same input text with no issues.

    Issue 2: N calls to ZSTD_compressContinue() produce N time the size of a single call to ZSTD_compress() on highly compressible data

    While testing with some text document, duplicated a bunch of times to get to about 300KB (i.e. the same 2 or 3 KB of text repeated about 100 times), I'm getting something strange:

    • The result of calling zstd_compress on the whole 300KB returns a single 2.5 KB output.
    • The result of streaming using 3 calls to ZSTD_compressContinue() produces 7.5 KB output (3 times larger).

    Looking more closely: each call to ZSTD_compressContinue() returns 2.5 KB (first two calls with 128KB worth of text, third call with only 50 KB), which is too exact to be a coincidence.

    Since the dataset is the equivalent of "ABCABCABC..." a hundred times, I'm guessing that compressing 25%, 50% or 100% of it would produce the same output, which would look something like "repeat 'ABC' 100 times" vs "repeat 'ABC' 200 times".

    Only, when compressing 25% at a time, you get 4 times as many calls to ZSTD_compressContinue(), which will give you 4 times the output. Compressing 12.5% at a time would probably yield 8 times the output.

    image

    When changing the internal buffer size from 128KB down to 16KB, I get a result of 45 KiB, which is about 6 times more than before.

    Fudging the input data to get lower compression ratio makes this effect disappear progressively, until a point where the result of the streaming API is about the same as a single compression call (except the weird extra 6 bytes in the previous issue).

    opened by KrzysFR 50
  • Reduce size of dctx by reutilizing dst buffer

    WIP: this round of optimizations has brought performance much closer to parity, though it has introduced a checksum error in the 270MB file test that I'm still tracking down. This hasn't affected the smaller-size tests; benchmarks indicate that in some cases we now see performance improvements on top of the memory reduction, thanks to improved cache behavior. However, there are other cases, at low file sizes and high compressibility, where we are still about 1% behind parity.

    Benchmark

    old performance

    ./tests/fullbench -b2 *** Zstandard speed analyzer 1.5.0 64-bits, by Yann Collet (Aug 19 2021) ***
    2#decompress speed in MB/s, by sample size (-B, bytes) and compressibility (-P, %):

    Sample size      -P0     -P10    -P50    -P90    -P100
    1000          4987.5    612.0   585.8  2597.8   2635.7
    10000        36167.5   1292.4  1671.9  3205.2   6179.7
    100000       51880.0   1237.1  2151.4  3193.0   7095.5
    1000000      34106.0   1309.3  1973.5  2637.8  14852.5

    new performance

    *** Zstandard speed analyzer 1.5.0 64-bits, by Yann Collet (Aug 19 2021) ***
    ./tests/fullbench -b2 -B1000 -P0
    2#decompress : 4999.4 MB/s ( 1000)
    ./tests/fullbench -b2 -B1000 -P10
    2#decompress : 609.1 MB/s ( 1000)
    ./tests/fullbench -b2 -B1000 -P50
    2#decompress : 583.5 MB/s ( 1000)
    ./tests/fullbench -b2 -B1000 -P90
    2#decompress : 2402.1 MB/s ( 1000)
    ./tests/fullbench -b2 -B1000 -P100
    2#decompress : 2587.4 MB/s ( 1000)
    ./tests/fullbench -b2 -B10000 -P0
    2#decompress : 37441.8 MB/s ( 10000)
    ./tests/fullbench -b2 -B10000 -P10
    2#decompress : 1297.5 MB/s ( 10000)
    ./tests/fullbench -b2 -B10000 -P50
    2#decompress : 1656.7 MB/s ( 10000)
    ./tests/fullbench -b2 -B10000 -P90
    2#decompress : 3081.0 MB/s ( 10000)
    ./tests/fullbench -b2 -B10000 -P100
    2#decompress : 6127.2 MB/s ( 10000)
    ./tests/fullbench -b2 -B100000 -P0
    2#decompress : 52215.9 MB/s ( 100000)
    ./tests/fullbench -b2 -B100000 -P10
    2#decompress : 1252.2 MB/s ( 100000)
    ./tests/fullbench -b2 -B100000 -P50
    2#decompress : 2146.6 MB/s ( 100000)
    ./tests/fullbench -b2 -B100000 -P90
    2#decompress : 3614.6 MB/s ( 100000)
    ./tests/fullbench -b2 -B100000 -P100
    2#decompress : 7084.7 MB/s ( 100000)
    ./tests/fullbench -b2 -B1000000 -P0
    2#decompress : 33857.1 MB/s ( 1000000)
    ./tests/fullbench -b2 -B1000000 -P10
    2#decompress : 1288.9 MB/s ( 1000000)
    ./tests/fullbench -b2 -B1000000 -P50
    2#decompress : 2095.4 MB/s ( 1000000)
    ./tests/fullbench -b2 -B1000000 -P90
    2#decompress : 2786.2 MB/s ( 1000000)
    ./tests/fullbench -b2 -B1000000 -P100
    2#decompress : 15258.4 MB/s ( 1000000)
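    The sweep above can be reproduced with a short loop over sample sizes and compressibility levels (assuming fullbench has been built, e.g. via `make -C tests fullbench` in a zstd checkout):

```shell
# Run the fullbench decompression scenario (-b2) across the same
# sample sizes (-B) and compressibility percentages (-P) as above.
for B in 1000 10000 100000 1000000; do
  for P in 0 10 50 90 100; do
    ./tests/fullbench -b2 -B"$B" -P"$P"
  done
done
```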

    CLA Signed optimization 
    opened by binhdvo 45
  • Patents?

    Patents?

    Does PATENTS from 4ded9e5 refer to specific patents that are actually used by zstd or is it a generic file added to github.com/facebook projects and zstd doesn't use any patented tech?

    I.e., is the "recipient of the software" the developer of software using zstd (presumably), and/or also the user of software using zstd?

    In the latter case, what does it imply in layman's terms? Would the patent license self-terminate if a company using software that uses zstd did anything listed in (i)–(iii) in the second paragraph (line 14)?

    Clearing this up feels pretty important before this can be used in other FOSS projects.

    Additional Grant of Patent Rights Version 2

    "Software" means the Zstandard software distributed by Facebook, Inc.

    Facebook, Inc. ("Facebook") hereby grants to each recipient of the Software ("you") a perpetual, worldwide, royalty-free, non-exclusive, irrevocable (subject to the termination provision below) license under any Necessary Claims, to make, have made, use, sell, offer to sell, import, and otherwise transfer the Software. For avoidance of doubt, no license is granted under Facebook’s rights in any patent claims that are infringed by (i) modifications to the Software made by you or any third party or (ii) the Software in combination with any software or other technology.

    The license granted hereunder will terminate, automatically and without notice, if you (or any of your subsidiaries, corporate affiliates or agents) initiate directly or indirectly, or take a direct financial interest in, any Patent Assertion: (i) against Facebook or any of its subsidiaries or corporate affiliates, (ii) against any party if such Patent Assertion arises in whole or in part from any software, technology, product or service of Facebook or any of its subsidiaries or corporate affiliates, or (iii) against any party relating to the Software. Notwithstanding the foregoing, if Facebook or any of its subsidiaries or corporate affiliates files a lawsuit alleging patent infringement against you in the first instance, and you respond by filing a patent infringement counterclaim in that lawsuit against that party that is unrelated to the Software, the license granted hereunder will not terminate under section (i) of this paragraph due to such counterclaim.

    A "Necessary Claim" is a claim of a patent owned by Facebook that is necessarily infringed by the Software standing alone.

    A "Patent Assertion" is any lawsuit or other action alleging direct, indirect, or contributory infringement or inducement to infringe any patent, including a cross-claim or counterclaim.

    question 
    opened by enkore 42
  • Adding --long support for --patch-from

    Adding --long support for --patch-from

    Patch From: Zstandard is introducing a new command line option --patch-from=, which leverages our existing compressors, dictionaries, and the long range match finder to deliver a high speed engine for producing and applying patches to files.

    Patch from increases the previous maximum limit for dictionaries from 32 MB to 2 GB. Additionally, it maintains fast speeds on lower compression levels without compromising patch size by using the long range match finder (now extended to find dictionary matches). By default, Zstandard uses a heuristic based on file size and internal compression parameters to determine when to activate long mode, but it can also be manually specified as before.

    Patch from also works in multi-threading mode, at a minimal compression ratio loss vs. single-threaded mode.

    Example usage:

    # create the patch
    zstd --patch-from=<oldfile> <newfile> -o <patchfile>
    
    # apply the patch
    zstd -d --patch-from=<oldfile> <patchfile> -o <newfile>
    
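    As a quick sanity check, the two commands above can be exercised end-to-end (file names are hypothetical placeholders; requires a zstd build with --patch-from support, i.e. v1.4.5 or later):

```shell
# Create two slightly different versions of a file.
printf 'line 1\nline 2\nline 3\n'         > old.txt
printf 'line 1\nline 2 changed\nline 3\n' > new.txt

# Produce a patch from old.txt to new.txt...
zstd --patch-from=old.txt new.txt -o new.patch.zst

# ...then apply it and verify the round trip.
zstd -d --patch-from=old.txt new.patch.zst -o restored.txt
cmp new.txt restored.txt
```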

    Benchmarks: We compared zstd to bsdiff, a popular industry grade diff engine. Our testing data were tarballs of different versions of source code from popular GitHub repositories. Specifically:

    repos = {
        # ~31mb (small file)
        "zstd": {"url": "https://github.com/facebook/zstd", "dict-branch": "refs/tags/v1.4.2", "src-branch": "refs/tags/v1.4.3"},
        # ~273mb (medium file)
        "wordpress": {"url": "https://github.com/WordPress/WordPress", "dict-branch": "refs/tags/5.3.1", "src-branch": "refs/tags/5.3.2"},
        # ~1.66gb (large file)
        "llvm": {"url": "https://github.com/llvm/llvm-project", "dict-branch": "refs/tags/llvmorg-9.0.0", "src-branch": "refs/tags/llvmorg-9.0.1"}
    }
    

    Patch from on level 19 (with chainLog=30 and targetLength=4kb) remains competitive with bsdiff when comparing patch sizes.
    And patch from greatly outperforms bsdiff in speed, even on its slowest setting of level 19, boasting an average speedup of ~7X. Patch from is >200X faster on level 1 and >100X faster (shown below) on level 3 vs bsdiff, while still delivering patch sizes less than 0.5% of the original file size.

    And of course, there is no change to the fast zstd decompression speed.

    CLA Signed 
    opened by bimbashrestha 40
  • pzstd compression ratios vs zstd

    pzstd compression ratios vs zstd

    We've run some benchmarks on a number of our internal backups and noticed that while pzstd is beautifully fast, it seems to produce worse results:

    time zstd -11 X -o X.11
    X : 13.37%   (123965807088 => 16572036784 bytes, X.11) 
    
    real	49m32.875s
    user	42m6.636s
    sys	0m49.172s
    
    
    time pzstd -11 -p 1 X -o X.11.1thread
    X : 13.76%   (123965807088 => 17056707732 bytes, X.11.1thread)
    
    real	42m50.245s
    user	40m33.648s
    sys	0m44.436s
    
    
    time pzstd -11 -p 3 X -o X.11.3threads
    X : 13.76%   (123965807088 => 17056707732 bytes, X.11.3threads)
    
    real	21m53.584s  <- bottlenecked by the slow hdd
    user	58m14.732s
    sys	1m0.036s
    

    Is this part of the design or a bug? We also noticed that pzstd -p 1 ran faster than zstd.
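    For reference, the ratio gap can be checked on any sizeable file with a side-by-side run (hypothetical file name; requires both zstd and pzstd to be installed). pzstd splits its input into independently compressed frames to enable parallelism, which is consistent with the slightly larger output observed above:

```shell
# Compare single-threaded pzstd against plain zstd at the same level.
# "data.bin" is a placeholder for any large input file.
zstd  -11 data.bin -o data.zstd.11
pzstd -11 -p 1 data.bin -o data.pzstd.11
ls -l data.zstd.11 data.pzstd.11
```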

    question 
    opened by alubbe 34
  • Consider re-licensing to Apache License v2

    Consider re-licensing to Apache License v2

    Hello,

    The Apache Software Foundation recently changed its policy regarding the "Facebook BSD+patents" license that applies to zstd and many other FB open-source projects, and now considers it unsuitable for inclusion in ASF projects. There is a discussion of this in the context of RocksDB on LEGAL-303, which was resolved when RocksDB was relicensed as dual ALv2 and GPLv2.

    Is the zstd community also open to relicensing with ALv2? This change would be helpful for Apache Hadoop (of which I'm a PMC member) since it would let us bundle zstd as part of our release artifacts. @omalley also expressed interest in this relicensing as an Apache ORC PMC member.

    Thanks in advance!

    opened by umbrant 33
  • small files compression / dictionary issues

    small files compression / dictionary issues

    Hi,

    I am struggling a bit with the compression ratio that remains very low, while I think it should be higher.

    I am compressing a huge number of small data chunks (1323 bytes), representing terrain elevation data for a terrain rendering system.

    The chunk is made of 441x u16 (elevation) + 441x u8 (alpha).

    Now, I understand small data are not great, therefore I tried the 'dictionary' approach. But whatever I do, I can't even reach a 2x compression ratio.

    Some weird things I have noticed:

    • using a 100 KB target dictionary along with a 10 MB samples buffer (each sample being 1323 bytes), I get a dictionary of 73 KB (not a big deal actually) and everything works as expected (but I am still left with my low compression ratio)
    • so I thought I would create a larger dictionary, and tried to train a 1MB dictionary using a 100MB sample buffer. However, in that case, the dictionary training phase lasts forever (i.e. I interrupted it after 1 hour or so) => fail.
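    For reference, the dictionary workflow described above looks like this with the zstd CLI (file names are hypothetical placeholders; --train, --maxdict, and -D are standard zstd options):

```shell
# Train a dictionary from a directory of small tile samples,
# capping the dictionary size at ~100 KB (--maxdict is in bytes).
zstd --train samples/*.bin -o tiles.dict --maxdict=102400

# Compress one tile with the shared dictionary...
zstd -D tiles.dict tile_0001.bin -o tile_0001.bin.zst

# ...and decompress it again (the same dictionary is required).
zstd -d -D tiles.dict tile_0001.bin.zst -o tile_0001.out
```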

    Would you have some hints for me? Any idea how I could achieve a higher compression ratio? I need to be able to read/uncompress each tile randomly (so streaming is not an option).

    Basically, the source data are the SRTM data (roughly 15 GB zip compressed) - when put into small tiles and compressed using ZSTD, it compresses down to about 100 GB (dictionary doesn't even bring 5%).

    Thanks a lot ! Greg

    question 
    opened by gjaegy 32
  • [0.7.0] No more VS2013 project makes it difficult to target a MS VC Runtime above 10.0

    [0.7.0] No more VS2013 project makes it difficult to target a MS VC Runtime above 10.0

    I need to build the Windows library targeting the MS VC Runtime 12.0 (the one that came out with VS2013, which maps to MSVCR120.dll), but in the latest dev 0.7.0 branch, there are only two projects left: one for Visual Studio 2008 and one for Visual Studio 2010 (targeting MSVCR100.dll). The project for VS2013 is gone. I'm not sure how to use the CMake stuff, but after a quick look, it does not seem to have code that specifically deals with the version of the VC runtime.

    Usually, each developer will use version X of Visual Studio (2010, 2013, 2015, 15/vNext, ...), while targeting a version Y of the Microsoft VC Runtime (10.0, 11.0, 12.0, 14.0, ..., with Y <= X). I guess most devs use a more recent version of VS than the version of the runtime they target, because retargeting all your dependencies to the latest VC runtime can be a lot of work, or even not possible at all. This means that the version of the VC runtime may not be the same as the version of the Visual Studio project, and will be different for each person.

    Currently, if you have VS 2015 Update 2, and open the VS2010 project in the dev branch, VS attempts to convert the project to the latest version possible (14.0, which will probably break because of some includes that are different). And if you don't perform the conversion, then the build will probably fail because you don't have the 10.0 SDK installed on your dev machine or CI build server (unless you have also installed VS2010 before). Plus, if you want a different version (12.0 for me), you then need to update each project (for all the projects, times two for Release/Debug, times two again for Win32/x64). And then you are left with a change in the .vcproj that will cause trouble when you update from git again (forcing you to maintain a custom branch).

    I can see that having to maintain one set of projects for each version of Visual Studio can be too much work, but what do you think would be the best way to specify which version of the VC runtime to target when building? Maybe a setting for CMake that would then create a set of VS projects specific to you, not checked into the repo? (This would be ideal.)

    opened by KrzysFR 31
  • Update linux kernel to latest zstd (from 1.4.10)

    Update linux kernel to latest zstd (from 1.4.10)

    Hi, people on reddit asked (https://old.reddit.com/r/kernel/comments/xp2o53/why_is_the_version_of_zstd_in_the_kernel_outdated/) about updating the linux port of zstd to a newer version. This was promised back when 1.4.10 was synced a year ago, and I'd be glad to see an update (or even a regular update in each kernel release), as btrfs is using zstd and any performance improvement is most welcome.

    There's automation support (contrib/linux-kernel), so generating the patch itself should not take too much time. As there are more things to do, like review and benchmarking, it's not the last step, but at least the preliminary version can be added to the linux-next tree. The final pull request sent to Linus may or may not happen, depending on the testing results.

    IMHO neglecting the regular updates is more "expensive", like it was with the 1.4.10 update that took about a year and a lot of convincing. The linux-next tree is really convenient as it does not pose a huge risk for users and helps to catch bugs early.

    packaging issue 
    opened by kdave 2
  • Make CMake official? (Makefile build does not provide CMake config file)

    Make CMake official? (Makefile build does not provide CMake config file)

    README.md says

    make is the officially maintained build system of this project.

    When using the Makefile, CMake config files like zstdConfig.cmake are not installed. This makes it awkward for projects using CMake to use zstd. E.g. llvm-project has

    // https://github.com/llvm/llvm-project/blob/main/llvm/cmake/config-ix.cmake
    if(LLVM_ENABLE_ZSTD)
      if(LLVM_ENABLE_ZSTD STREQUAL FORCE_ON)
        find_package(zstd REQUIRED)
        if(NOT zstd_FOUND)
          message(FATAL_ERROR "Failed to configure zstd, but LLVM_ENABLE_ZSTD is FORCE_ON")
        endif()
      elseif(NOT LLVM_USE_SANITIZER MATCHES "Memory.*")
        find_package(zstd QUIET)
      endif()
    endif()
    set(LLVM_ENABLE_ZSTD ${zstd_FOUND})
    
    // https://github.com/llvm/llvm-project/blob/main/llvm/lib/Support/CMakeLists.txt#L28
    if(LLVM_ENABLE_ZSTD)
      if(TARGET zstd::libzstd_shared AND NOT LLVM_USE_STATIC_ZSTD)
        set(zstd_target zstd::libzstd_shared)
      else()
        set(zstd_target zstd::libzstd_static)
      endif()
    endif()
    

    It could add a pkg-config fallback, but that is inconvenient, and logic like zstd::libzstd_shared does not have a good replacement.
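    For context, the pkg-config fallback mentioned above relies on the libzstd.pc file, which the Makefile install does provide; usage would look roughly like:

```shell
# Query compile and link flags for libzstd via pkg-config
# (works when libzstd.pc is on the pkg-config search path).
pkg-config --cflags --libs libzstd

# In CMake, the corresponding fallback goes through the PkgConfig module:
#   find_package(PkgConfig)
#   pkg_check_modules(zstd IMPORTED_TARGET libzstd)
```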

    Related:

    • https://bugs.gentoo.org/872254
    • https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1020403

    The simplest solution is to make CMake official so downstream is motivated to switch to CMake.

    opened by MaskRay 5
  • Enable the OpenSSF Scorecard Github Action

    Enable the OpenSSF Scorecard Github Action

    Hello, I'm working on behalf of Google and the Open Source Security Foundation to help essential open-source projects improve their supply-chain security. Given the relevance that Zstandard has for countless projects, the OpenSSF has identified it as one of the 100 most critical open source projects.

    Is your feature request related to a problem? Please describe. According to the Open Source Security and Risk Analysis Report, 84% of all codebases have at least one vulnerability, with an average of 158 per codebase. The majority have been in the code for more than 2 years and have documented solutions available.

    Even in large tech companies, the tedious process of reviewing code for vulnerabilities falls down the priority list, and there is little insight into known vulnerabilities and solutions that companies can draw on.

    That’s where the OpenSSF tool called Scorecards is helping. Its focus is to understand the security posture of a project and assess the risks that the dependencies could introduce.

    Describe the solution you'd like Scorecards runs dozens of automated security checks to help maintainers better understand their project's supply-chain security posture. It is developed by the OpenSSF, in partnership with GitHub.

    To simplify maintainers' lives, the OpenSSF has also developed the Scorecard GitHub Action. It is very lightweight and runs on every change to the repository's main branch. The results of its checks are available on the project's security dashboard, and include suggestions on how to solve any issues (see examples in additional context). The Action does not run or interact with any workflows, but merely parses them to identify possible vulnerabilities. This Action has been adopted by 1800+ projects already.

    Zstandard already follows many of the Scorecard recommended best practices and criteria for greater security, such as not having any binary artifacts, CI tests, code review, fuzzing, etc. However, there are still some criteria that would need to be improved to achieve a good level of security. In these cases, the Scorecard GitHub Action can help diagnose and propose solutions.

    Would you be interested in a PR which adds this Action? Optionally, it can also publish your results to the OpenSSF REST API, which allows a badge with the project's score to be added to its README.

    Additional context

    Code scanning dashboard with multiple alerts, including Code-Review and Token-Permissions

    Detail of a Token-Permissions alert, indicating the specific file and remediation steps

    opened by joycebrum 0
  • [Help Wanted] should `ZSTD_flushStream` behave differently from `ZSTD_endStream` when used for flushing purposes?

    [Help Wanted] should `ZSTD_flushStream` behave differently from `ZSTD_endStream` when used for flushing purposes?

    Hi, I'm using 1.5.2, through JNI (https://github.com/luben/zstd-jni 1.5.2-3); both are the latest versions as of writing.

    I use streaming compression to stream some lines of JSON strings, delimited by line feeds. After some critical point is reached, I would like to flush the zstd buffer so that the receiver can decode them immediately. I added a switch to call either ZSTD_flushStream or ZSTD_endStream for flushing purposes. Am I correct that they should be equivalent if the receiver is looping on ZSTD_decompressStream (ignoring its return value, thus also ignoring frame boundaries)?

    More specifically, I found that when using ZSTD_endStream the receiver never hangs on receive, but when using ZSTD_flushStream the receiver almost always hangs due to insufficient bytes available (so.read(...) inside read0, code at bottom); that doesn't seem correct. This can be reproduced on my production system (I'm trying to minimize it into a minimal reproducer in the meantime), but it can't be reproduced on a smaller dataset (yet).

    Pointers and their offsets should already be updated and advanced properly according to the doc; both pointer arithmetic and error checking code are omitted from the diagram. I'll post more code if the information provided here is not enough (and when I've successfully produced a minimal reproducer), but the same code works for smaller datasets, and the only change is the eof flag: from using ZSTD_endStream to ZSTD_flushStream.

    reference diagram image

    code snippets: https://github.com/luben/zstd-jni/blob/905202f10e5355cf7ed7a558e909558bc6ebf184/src/main/native/jni_directbuffercompress_zstd.c https://github.com/luben/zstd-jni/blob/905202f10e5355cf7ed7a558e909558bc6ebf184/src/main/native/jni_directbufferdecompress_zstd.c

        private fun castZstd(n:Long)=if(isError(n))throw ZstdException(n)else n.toIntExact()
        fun decompress(dst:ByteBuffer,src:ByteBuffer):Int{
            if(stream<0)throw ClosedChannelException()
            val n=castZstd(decompressStream(stream,dst,dst.position(),dst.remaining(),src,src.position(),src.remaining()))
            src.position(src.position()+consumed)
            dst.position(dst.position()+produced)
            return n
        }
        suspend fun compress(m:ByteBuffer){ // input buffer
            if(stream<0)throw ClosedChannelException();while(m.hasRemaining()){
                if(!buffer.hasRemaining())flushBuffer()
                castZstd(compressDirectByteBuffer(stream,buffer,buffer.position(),buffer.remaining(),m,m.position(),m.remaining()))
                buffer.position(buffer.position()+produced)
                m.position(m.position()+consumed)
            }
        }
        suspend fun flush(eof:Boolean){ // the previously mentioned `flag`
            if(stream<0)throw ClosedChannelException();val fn=if(eof)::endStream else::flushStream;do{
                val n=castZstd(fn(stream,buffer,buffer.position(),buffer.remaining()))
                buffer.position(buffer.position()+produced)
                flushBuffer()
            }while(n!=0)
        }
        // socket outer loop, if `buffer` doesn't contain '\n', call `read0` one more time
        override suspend fun read0():Int{ // `produced` is how much data is produced by calling `decompress`
            do v=decompress(buffer,so.read(max(1,v)))while(produced==0)
            return produced
        }
    
    question 
    opened by revintec 2
  • FATAL ERROR: zstd uncompress failed with error code 10

    FATAL ERROR: zstd uncompress failed with error code 10

    Describe the bug I am trying to compress a directory containing files of size ~= 2.5GiB via mksquashfs /input/path/* /path/to/output.img -comp zstd -b 256K -noappend -Xcompression-level 22

    /path/to/output.img is inside a mounted s3fs directory (that's a possible cause of the issue, but the current architecture of the tool is the reason for doing that). However, the same command works for data less than ~1 GiB.

    To Reproduce Steps to reproduce the behavior:

    1. Directory with more than 2.5 GiB of data
    2. A mounted s3fs bucket
    3. The above mentioned command
    4. FATAL ERROR: zstd uncompress failed with error code 10

    Expected behavior This should work the same way it works with data smaller than 1 GiB.

    Error Code FATAL ERROR: zstd uncompress failed with error code 10

    Desktop (please complete the following information):

    • OS: Ubuntu
    • Version 22.04

    Additional context The error is on my side for sure. I need help debugging the error code, as I am not very familiar with the ZSTD code base.

    • I tried to find information about the exit code in the manual, but it is missing.

    I also tried to use ZSTD_getErrorName(10) to get some meaningful information, but it gave "No error detected".

    opened by itsManjeet 0
Releases(v1.5.2)
  • v1.5.2(Jan 20, 2022)

    Zstandard v1.5.2 is a bug-fix release, addressing issues that were raised with the v1.5.1 release.

    In particular, as a side-effect of the inclusion of assembly code in our source tree, binary artifacts were being marked as needing an executable stack on non-amd64 architectures. This release corrects that issue. More context is available in #2963.

    This release also corrects a performance regression that was introduced in v1.5.0 that slows down compression of very small data when using the streaming API. Issue #2966 tracks that topic.

    In addition there are a number of smaller improvements and fixes.

    Full Changelist

    • Fix zstd-static output name with MINGW/Clang by @MehdiChinoune in https://github.com/facebook/zstd/pull/2947
    • storeSeq & mlBase : clarity refactoring by @Cyan4973 in https://github.com/facebook/zstd/pull/2954
    • Fix mini typo by @fwessels in https://github.com/facebook/zstd/pull/2960
    • Refactor offset+repcode sumtype by @Cyan4973 in https://github.com/facebook/zstd/pull/2962
    • meson: fix MSVC support by @eli-schwartz in https://github.com/facebook/zstd/pull/2951
    • fix performance issue in scenario #2966 (part 1) by @Cyan4973 in https://github.com/facebook/zstd/pull/2969
    • [meson] Explicitly disable assembly for non clang/gcc compilers by @terrelln in https://github.com/facebook/zstd/pull/2972
    • Mark Huffman Decoder Assembly noexecstack on All Architectures by @felixhandte in https://github.com/facebook/zstd/pull/2964
    • Improve Module Map File by @felixhandte in https://github.com/facebook/zstd/pull/2953
    • Remove Dependencies to Allow the Zstd Binary to Dynamically Link to the Library by @felixhandte in https://github.com/facebook/zstd/pull/2977
    • [opt] Fix oss-fuzz bug in optimal parser by @terrelln in https://github.com/facebook/zstd/pull/2980
    • [license] Fix license header of huf_decompress_amd64.S by @terrelln in https://github.com/facebook/zstd/pull/2981
    • Fix stderr progress logging for decompression by @terrelln in https://github.com/facebook/zstd/pull/2982
    • Fix tar test cases by @sunwire in https://github.com/facebook/zstd/pull/2956
    • Fixup MSVC source file inclusion for cmake builds by @hmaarrfk in https://github.com/facebook/zstd/pull/2957
    • x86-64: Hide internal assembly functions by @hjl-tools in https://github.com/facebook/zstd/pull/2993
    • Prepare v1.5.2 by @felixhandte in https://github.com/facebook/zstd/pull/2987
    • Documentation and minor refactor to clarify MT memory management by @embg in https://github.com/facebook/zstd/pull/3000
    • Avoid updating timestamps when the destination is stdout by @floppym in https://github.com/facebook/zstd/pull/2998
    • [build][asm] Pass ASFLAGS to the assembler instead of CFLAGS by @terrelln in https://github.com/facebook/zstd/pull/3009
    • Update CI documentation by @embg in https://github.com/facebook/zstd/pull/2999
    • Zstandard v1.5.2 by @felixhandte in https://github.com/facebook/zstd/pull/2995

    New Contributors

    • @MehdiChinoune made their first contribution in https://github.com/facebook/zstd/pull/2947
    • @fwessels made their first contribution in https://github.com/facebook/zstd/pull/2960
    • @sunwire made their first contribution in https://github.com/facebook/zstd/pull/2956
    • @hmaarrfk made their first contribution in https://github.com/facebook/zstd/pull/2957
    • @floppym made their first contribution in https://github.com/facebook/zstd/pull/2998

    Full Changelog: https://github.com/facebook/zstd/compare/v1.5.1...v1.5.2

    Source code(tar.gz)
    Source code(zip)
    zstd-1.5.2.tar.gz(1.84 MB)
    zstd-1.5.2.tar.gz.sha256(84 bytes)
    zstd-1.5.2.tar.gz.sig(858 bytes)
    zstd-1.5.2.tar.zst(1.41 MB)
    zstd-1.5.2.tar.zst.sha256(85 bytes)
    zstd-1.5.2.tar.zst.sig(858 bytes)
    zstd-v1.5.2-win32.zip(1.39 MB)
    zstd-v1.5.2-win64.zip(1.55 MB)
  • v1.5.1(Dec 21, 2021)

    Notice: it has been brought to our attention that the v1.5.1 library might be built with an executable stack on non-x64 architectures, which could end up being flagged as problematic by some systems with strict security settings that disallow executable stacks. We are currently reviewing the issue. Be aware of it if you build libzstd for non-x64 architectures.

    Zstandard v1.5.1 is a maintenance release, bringing a good number of small refinements to the project. It also offers a welcome crop of performance improvements, as detailed below.

    Performance Improvements

    Speed improvements for fast compression (levels 1–4)

    PRs #2749, #2774, and #2921 refactor single-segment compression for ZSTD_fast and ZSTD_dfast, which back compression levels 1 through 4 (as well as the negative compression levels). Speedups in the ~3-5% range are observed. In addition, the compression ratio of ZSTD_dfast (levels 3 and 4) is slightly improved.

    Rebalanced middle compression levels

    v1.5.0 introduced major speed improvements for mid-level compression (from 5 to 12), while preserving roughly similar compression ratios. As a consequence, the speed scale became tilted towards faster speeds. Unfortunately, the difference between successive levels was no longer regular, and there was a large performance gap just after the impacted range, between levels 12 and 13.

    v1.5.1 tries to rebalance parameters so that compression levels can be roughly associated with their former speed budget. Consequently, v1.5.1 mid compression levels feature speeds closer to former v1.4.9 (though still noticeably faster) and receive in exchange an improved compression ratio, as shown in the graphs below.

    comparing v1.4.9 vs v1.5.0 vs v1.5.1 on x64 (i7-9700k)

    comparing v1.4.9 vs v1.5.0 vs v1.5.1 on arm64 (snapdragon 855)

    Note that, since middle levels only experience a rebalancing, save some special cases, no significant performance differences between versions v1.5.0 and v1.5.1 should be expected: levels merely occupy different positions on the same curve. The situation is a bit different for fast levels (1-4), for which v1.5.1 delivers a small but consistent performance benefit on all platforms, as described in previous paragraph.

    Huffman Improvements

    Our Huffman code was significantly revamped in this release. Both encoding and decoding speed were improved. Additionally, encoding speed for small inputs was improved even further. Speed is measured on the Silesia corpus by compressing with level 1 and extracting the literals left over after compression, then compressing and decompressing the literals from each block. Measurements are done on an Intel i9-9900K @ 3.6 GHz.

    | Compiler | Scenario                            | v1.5.0 Speed | v1.5.1 Speed | Delta  |
    |----------|-------------------------------------|--------------|--------------|--------|
    | gcc-11   | Literal compression - 128KB block   | 748 MB/s     | 927 MB/s     | +23.9% |
    | clang-13 | Literal compression - 128KB block   | 810 MB/s     | 927 MB/s     | +14.4% |
    | gcc-11   | Literal compression - 4KB block     | 223 MB/s     | 321 MB/s     | +44.0% |
    | clang-13 | Literal compression - 4KB block     | 224 MB/s     | 310 MB/s     | +38.2% |
    | gcc-11   | Literal decompression - 128KB block | 1164 MB/s    | 1500 MB/s    | +28.8% |
    | clang-13 | Literal decompression - 128KB block | 1006 MB/s    | 1504 MB/s    | +49.5% |

    Overall impact on (de)compression speed depends on the compressibility of the data. Compression speed improves from 1-4%, and decompression speed improves from 5-15%.

    PR #2722 implements the Huffman decoder in assembly for x86-64 with BMI2 enabled. We detect BMI2 support at runtime, so this speedup applies to all x86-64 builds running on CPUs that support BMI2. This improves Huffman decoding speed by about 40%, depending on the scenario. PR #2733 improves Huffman encoding speed by 10% for clang and 20% for gcc. PR #2732 drastically speeds up the HUF_sort() function, which speeds up Huffman tree building for compression. This is a significant speed boost for small inputs, measuring in at a 40% improvement for 4K inputs.

    Binary Size and Build Speed

    zstd binary size grew significantly in v1.5.0 due to the new code added for middle compression level speed optimizations. In this release we recover the binary size, and in the process also significantly speed up builds, especially with sanitizers enabled.

    Measured on x86-64, compiled with -O3, we report the libzstd.a size. We regained 161 KB of binary size on gcc, and 293 KB on clang. Note that these binary sizes are listed for the whole library, optimized for speed over size. The decoder alone, with size-saving options enabled and compiled with -Os or -Oz, can be much smaller.

    | Version | gcc-11 size | clang-13 size |
    |---------|-------------|---------------|
    | v1.5.1  | 1177 KB     | 1167 KB       |
    | v1.5.0  | 1338 KB     | 1460 KB       |
    | v1.4.9  | 1137 KB     | 1151 KB       |

    Change log

    Featured user-visible changes

    • perf: rebalanced compression levels, to better match the intended speed/level curve, by @senhuang42 and @Cyan4973
    • perf: faster Huffman decoder, using x64 assembly, by @terrelln
    • perf: slightly faster high speed modes (strategies fast & dfast), by @felixhandte
    • perf: smaller binary size and faster compilation times, by @terrelln and @nolange
    • perf: new row64 mode, used notably at the highest lazy2 levels 11-12, by @senhuang42
    • perf: faster mid-level compression speed in presence of highly repetitive patterns, by @senhuang42
    • perf: minor compression ratio improvements for small data at high levels, by @Cyan4973
    • perf: reduced stack usage (mostly useful for the Linux kernel), by @terrelln
    • perf: faster compression speed on incompressible data, by @binhdvo
    • perf: on-demand reduction of ZSTD_DCtx state size, using the build macro ZSTD_DECODER_INTERNAL_BUFFER, at a small performance cost, by @binhdvo
    • build: allows hiding static symbols in the dynamic library, using build macro, by @skitt
    • build: support for m68k (Motorola 68000's), by @cyan4973
    • build: improved AIX support, by @Helflym
    • build: improved meson unofficial build, by @eli-schwartz
    • cli : fix : forward mtime to output file, by @felixhandte
    • cli : custom memory limit when training dictionary (#2925), by @embg
    • cli : report advanced parameters information when compressing in very verbose mode (-vv), by @Svetlitski-FB
    • cli : advanced commands in the form --long-param= can accept negative value arguments, by @binhdvo

    PR full list

    • Add determinism fuzzers and fix rare determinism bugs by @terrelln in https://github.com/facebook/zstd/pull/2648
    • ZSTD_VecMask_next: fix incorrect variable name in fallback code path by @dnelson-1901 in https://github.com/facebook/zstd/pull/2657
    • improve tar compatibility by @Cyan4973 in https://github.com/facebook/zstd/pull/2660
    • Enable SSE2 compression path to work on MSVC by @TrianglesPCT in https://github.com/facebook/zstd/pull/2653
    • Fix CircleCI Config to Fully Remove publish-github-release Job by @felixhandte in https://github.com/facebook/zstd/pull/2649
    • [CI] Fix zlib-wrapper test by @senhuang42 in https://github.com/facebook/zstd/pull/2668
    • [CI] Add ARM tests back into CI by @senhuang42 in https://github.com/facebook/zstd/pull/2667
    • [trace] Refine the ZSTD_HAVE_WEAK_SYMBOLS detection by @terrelln in https://github.com/facebook/zstd/pull/2674
    • [CI][1/2] Re-do the github actions workflows, migrate various travis and appveyor tests. by @senhuang42 in https://github.com/facebook/zstd/pull/2675
    • Make GH Actions CI tests run apt-get update before apt-get install by @senhuang42 in https://github.com/facebook/zstd/pull/2682
    • Add arm64 fuzz test to travis by @senhuang42 in https://github.com/facebook/zstd/pull/2686
    • Add ldm and block splitter auto-enable to old api by @senhuang42 in https://github.com/facebook/zstd/pull/2684
    • Add documentation for --patch-from by @binhdvo in https://github.com/facebook/zstd/pull/2693
    • Make regression test run on every PR by @senhuang42 in https://github.com/facebook/zstd/pull/2691
    • Initialize "potentially uninitialized" pointers. by @wolfpld in https://github.com/facebook/zstd/pull/2654
    • Flatten ZSTD_row_getMatchMask by @aqrit in https://github.com/facebook/zstd/pull/2681
    • Update README for Travis CI Badge by @gauthamkrishna9991 in https://github.com/facebook/zstd/pull/2700
    • Fuzzer test with no intrinsics on S390x (big endian) by @senhuang42 in https://github.com/facebook/zstd/pull/2678
    • Fix --progress flag to properly control progress display and default … by @binhdvo in https://github.com/facebook/zstd/pull/2698
    • [bug] Fix entropy repeat mode bug by @senhuang42 in https://github.com/facebook/zstd/pull/2697
    • Format File Sizes Human-Readable in the cli by @felixhandte in https://github.com/facebook/zstd/pull/2702
    • Add support for negative values in advanced flags by @binhdvo in https://github.com/facebook/zstd/pull/2705
    • [fix] Add missing bounds checks during compression by @terrelln in https://github.com/facebook/zstd/pull/2709
    • Add API for fetching skippable frame content by @binhdvo in https://github.com/facebook/zstd/pull/2708
    • Add option to use logical cores for default threads by @binhdvo in https://github.com/facebook/zstd/pull/2710
    • lib/Makefile: Fix small typo in ZSTD_FORCE_DECOMPRESS_* build macros by @luisdallos in https://github.com/facebook/zstd/pull/2714
    • [RFC] Add internal API for converting ZSTD_Sequence into seqStore by @senhuang42 in https://github.com/facebook/zstd/pull/2715
    • Optimize zstd decompression by another x% by @danlark1 in https://github.com/facebook/zstd/pull/2689
    • Include what you use in zstd_ldm_geartab by @danlark1 in https://github.com/facebook/zstd/pull/2719
    • [trace] remove zstd_trace.c reference from freestanding by @heitbaum in https://github.com/facebook/zstd/pull/2655
    • Remove folder when done with test by @senhuang42 in https://github.com/facebook/zstd/pull/2720
    • Proactively skip huffman compression based on sampling where non-comp… by @binhdvo in https://github.com/facebook/zstd/pull/2717
    • Add support for MCST LCC compiler by @makise-homura in https://github.com/facebook/zstd/pull/2725
    • [bug-fix] Fix a determinism bug with the DUBT by @terrelln in https://github.com/facebook/zstd/pull/2726
    • Fix DDSS Load by @felixhandte in https://github.com/facebook/zstd/pull/2729
    • Z_PREFIX zError function by @koalabearguo in https://github.com/facebook/zstd/pull/2707
    • pzstd: fix linking for static builds by @jonringer in https://github.com/facebook/zstd/pull/2724
    • [HUF] Improve Huffman encoding speed by @terrelln in https://github.com/facebook/zstd/pull/2733
    • [HUF] Improve Huffman sorting algorithm by @senhuang42 in https://github.com/facebook/zstd/pull/2732
    • Set mtime on Output Files by @felixhandte in https://github.com/facebook/zstd/pull/2742
    • [RFC] Rebalance compression levels by @senhuang42 in https://github.com/facebook/zstd/pull/2692
    • Improve branch misses on FSE symbol spreading by @senhuang42 in https://github.com/facebook/zstd/pull/2750
    • make ZSTD_HASHLOG3_MAX private by @Cyan4973 in https://github.com/facebook/zstd/pull/2752
    • meson fixups by @eli-schwartz in https://github.com/facebook/zstd/pull/2746
    • [easy] Fix zstd bench error message by @senhuang42 in https://github.com/facebook/zstd/pull/2753
    • Reduce test time on TravisCI by @Cyan4973 in https://github.com/facebook/zstd/pull/2757
    • added qemu tests by @Cyan4973 in https://github.com/facebook/zstd/pull/2758
    • Add 8 bytes to FSE_buildCTable wksp by @senhuang42 in https://github.com/facebook/zstd/pull/2761
    • minor rebalancing of level 13 by @Cyan4973 in https://github.com/facebook/zstd/pull/2762
    • Improve compile speed and binary size in opt by @senhuang42 in https://github.com/facebook/zstd/pull/2763
    • [easy] Fix patch-from help msg typo by @senhuang42 in https://github.com/facebook/zstd/pull/2769
    • Pipelined Implementation of ZSTD_fast (~+5% Speed) by @felixhandte in https://github.com/facebook/zstd/pull/2749
    • meson: fix type error for integer option by @eli-schwartz in https://github.com/facebook/zstd/pull/2775
    • Fix dictionary training huffman segfault and small speed improvement by @senhuang42 in https://github.com/facebook/zstd/pull/2773
    • [rsyncable] Ensure ZSTD_compressBound() is respected by @terrelln in https://github.com/facebook/zstd/pull/2776
    • Improve optimal parser performance on small data by @Cyan4973 in https://github.com/facebook/zstd/pull/2771
    • [rsyncable] Fix test failures by @terrelln in https://github.com/facebook/zstd/pull/2777
    • Revert opt outlining change by @senhuang42 in https://github.com/facebook/zstd/pull/2778
    • [build] Add support for ASM files in Make + CMake by @terrelln in https://github.com/facebook/zstd/pull/2783
    • add msvc2019 to build.generic.cmd by @animalize in https://github.com/facebook/zstd/pull/2787
    • [fuzzer] Add Huffman decompression fuzzer by @terrelln in https://github.com/facebook/zstd/pull/2784
    • Assembly implementation of 4X1 & 4X2 Huffman by @terrelln in https://github.com/facebook/zstd/pull/2722
    • [huf] Fix compilation when DYNAMIC_BMI2=0 && BMI2 is supported by @terrelln in https://github.com/facebook/zstd/pull/2791
    • Use new paramSwitch enum for row matchfinder and block splitter by @senhuang42 in https://github.com/facebook/zstd/pull/2788
    • Fix NCountWriteBound by @senhuang42 in https://github.com/facebook/zstd/pull/2779
    • [contrib][linux] Fix up SPDX license identifiers by @terrelln in https://github.com/facebook/zstd/pull/2794
    • [contrib][linux] Reduce stack usage by 80 bytes by @terrelln in https://github.com/facebook/zstd/pull/2795
    • Reduce stack usage of block splitter by @senhuang42 in https://github.com/facebook/zstd/pull/2780
    • minor: constify MatchState* parameter when possible by @Cyan4973 in https://github.com/facebook/zstd/pull/2797
    • [build] Fix oss-fuzz build with the dataflow sanitizer by @terrelln in https://github.com/facebook/zstd/pull/2799
    • [lib] Make lib compatible with -Wfall-through excepting legacy by @terrelln in https://github.com/facebook/zstd/pull/2796
    • [contrib][linux] Fix build after introducing ASM HUF implementation by @solbjorn in https://github.com/facebook/zstd/pull/2790
    • Smaller code with disabled features by @nolange in https://github.com/facebook/zstd/pull/2805
    • [huf] Fix OSS-Fuzz assert by @terrelln in https://github.com/facebook/zstd/pull/2808
    • Skip most long matches in lazy hash table update by @senhuang42 in https://github.com/facebook/zstd/pull/2755
    • add missing BUNDLE DESTINATION by @3nids in https://github.com/facebook/zstd/pull/2810
    • [contrib][linux] Fix -Wundef inside Linux kernel tree by @solbjorn in https://github.com/facebook/zstd/pull/2802
    • [contrib][linux-kernel] Add standard warnings and -Werror to CI by @terrelln in https://github.com/facebook/zstd/pull/2803
    • Add AIX support in Makefile by @Helflym in https://github.com/facebook/zstd/pull/2747
    • Limit train samples by @stanjo74 in https://github.com/facebook/zstd/pull/2809
    • [multiple-ddicts] Fix NULL checks by @terrelln in https://github.com/facebook/zstd/pull/2817
    • [ldm] Fix ZSTD_c_ldmHashRateLog bounds check by @terrelln in https://github.com/facebook/zstd/pull/2819
    • [binary-tree] Fix underflow of nbCompares by @terrelln in https://github.com/facebook/zstd/pull/2820
    • Enhance streaming_compression examples. by @marxin in https://github.com/facebook/zstd/pull/2813
    • Pipelined Implementation of ZSTD_dfast by @felixhandte in https://github.com/facebook/zstd/pull/2774
    • Fix a C89 error in msvc by @animalize in https://github.com/facebook/zstd/pull/2800
    • [asm] Switch to C style comments by @terrelln in https://github.com/facebook/zstd/pull/2825
    • Support thread pool section in HTML documentation. by @marxin in https://github.com/facebook/zstd/pull/2822
    • Reduce size of dctx by reutilizing dst buffer by @binhdvo in https://github.com/facebook/zstd/pull/2751
    • [lazy] Speed up compilation times by @terrelln in https://github.com/facebook/zstd/pull/2828
    • separate compression level tables into their own file by @Cyan4973 in https://github.com/facebook/zstd/pull/2830
    • minor : change build macro to ZSTD_DECODER_INTERNAL_BUFFER by @Cyan4973 in https://github.com/facebook/zstd/pull/2829
    • Fix oss fuzz test error by @binhdvo in https://github.com/facebook/zstd/pull/2837
    • Move mingw tests from appveyor to github actions by @binhdvo in https://github.com/facebook/zstd/pull/2838
    • Improvements to verbose mode output by @Svetlitski-FB in https://github.com/facebook/zstd/pull/2839
    • Use unused functions to appease Visual Studio by @senhuang42 in https://github.com/facebook/zstd/pull/2846
    • Backport zstd patch from LKML by @terrelln in https://github.com/facebook/zstd/pull/2849
    • Fix fullbench CI failure by @binhdvo in https://github.com/facebook/zstd/pull/2851
    • Fix Determinism Bug: Avoid Reducing Indices to Reserved Values by @felixhandte in https://github.com/facebook/zstd/pull/2850
    • ZSTD_copy16() uses ZSTD_memcpy() by @animalize in https://github.com/facebook/zstd/pull/2836
    • Display command line parameters with concrete values in verbose mode by @Svetlitski-FB in https://github.com/facebook/zstd/pull/2847
    • Reduce function size in fast & dfast by @terrelln in https://github.com/facebook/zstd/pull/2863
    • [linux-kernel] Don't inline function in zstd_opt.c by @terrelln in https://github.com/facebook/zstd/pull/2864
    • Remove executable flag from GNU_STACK segment by @ko-zu in https://github.com/facebook/zstd/pull/2857
    • [linux-kernel] Don't add -O3 to CFLAGS by @terrelln in https://github.com/facebook/zstd/pull/2866
    • Support Swift Package Manager by @cntrump in https://github.com/facebook/zstd/pull/2858
    • Determinism: Avoid Mapping Window into Reserved Indices during Reduction by @felixhandte in https://github.com/facebook/zstd/pull/2869
    • Clarify documentation for -c by @binhdvo in https://github.com/facebook/zstd/pull/2883
    • Fix build for cygwin/bsd by @binhdvo in https://github.com/facebook/zstd/pull/2882
    • Move visual studio tests from per-release to per-PR by @senhuang42 in https://github.com/facebook/zstd/pull/2845
    • Fix SPM warning: umbrella header for module 'libzstd' does not include header 'xxx.h' by @cntrump in https://github.com/facebook/zstd/pull/2872
    • Add detection when compiling with Clang and Ninja under Windows by @jannkoeker in https://github.com/facebook/zstd/pull/2877
    • [contrib][pzstd] Fix build issue with gcc-5 by @terrelln in https://github.com/facebook/zstd/pull/2889
    • [bmi2] Add lzcnt and bmi target attributes by @terrelln in https://github.com/facebook/zstd/pull/2888
    • [test] Test that the exec-stack bit isn't set on libzstd.so by @terrelln in https://github.com/facebook/zstd/pull/2886
    • Solve the bug of extra output newline character by @15596858998 in https://github.com/facebook/zstd/pull/2876
    • [zdict] Remove ZDICT_CONTENTSIZE_MIN restriction for ZDICT_finalizeDictionary by @terrelln in https://github.com/facebook/zstd/pull/2887
    • Explicitly hide static symbols by @skitt in https://github.com/facebook/zstd/pull/2501
    • Makefile: sort all wildcard file list expansions by @kanavin in https://github.com/facebook/zstd/pull/2895
    • merge #2501 by @Cyan4973 in https://github.com/facebook/zstd/pull/2894
    • Makefile: fix build for mingw by @sapiippo in https://github.com/facebook/zstd/pull/2687
    • [CircleCI] Fix short-tests-0 by @terrelln in https://github.com/facebook/zstd/pull/2892
    • Zstandard compiles and run on m68k cpus by @Cyan4973 in https://github.com/facebook/zstd/pull/2896
    • Improve zstd_opt build speed and size by @terrelln in https://github.com/facebook/zstd/pull/2898
    • [CI] Add cmake windows build by @terrelln in https://github.com/facebook/zstd/pull/2900
    • Disable Multithreading in CMake Builds for Android by @felixhandte in https://github.com/facebook/zstd/pull/2899
    • Avoid Using Deprecated Functions in Deprecated Code by @felixhandte in https://github.com/facebook/zstd/pull/2897
    • [asm] Share portability macros and restrict ASM further by @terrelln in https://github.com/facebook/zstd/pull/2893
    • fixbug CLI's -D fails when the argument is not a regular file by @15596858998 in https://github.com/facebook/zstd/pull/2890
    • Apply FORCE_MEMORY_ACCESS=1 to legacy by @Hello71 in https://github.com/facebook/zstd/pull/2907
    • [lib] Fix libzstd.pc for lib-mt builds by @ericonr in https://github.com/facebook/zstd/pull/2659
    • Imply -q when stderr is not a tty by @binhdvo in https://github.com/facebook/zstd/pull/2884
    • Fix Up #2659; Build libzstd.pc Whenever Building the Lib on Unix by @felixhandte in https://github.com/facebook/zstd/pull/2912
    • Remove possible NULL pointer addition by @terrelln in https://github.com/facebook/zstd/pull/2916
    • updated xxHash to latest v0.8.1 by @Cyan4973 in https://github.com/facebook/zstd/pull/2914
    • Reject Irregular Dictionary Files by @felixhandte in https://github.com/facebook/zstd/pull/2910
    • x32 compatibility by @Cyan4973 in https://github.com/facebook/zstd/pull/2922
    • typo: Small spelling mistake in example by @IAL32 in https://github.com/facebook/zstd/pull/2923
    • add test case by @15596858998 in https://github.com/facebook/zstd/pull/2905
    • Stagger Stepping in Negative Levels by @felixhandte in https://github.com/facebook/zstd/pull/2921
    • Fix performance degradation with -m32 by @binhdvo in https://github.com/facebook/zstd/pull/2926
    • Reduce tables to 8bit by @nolange in https://github.com/facebook/zstd/pull/2930
    • simplify SSE implementation of row_lazy match finder by @Cyan4973 in https://github.com/facebook/zstd/pull/2929
    • Allow user to specify memory limit for dictionary training by @embg in https://github.com/facebook/zstd/pull/2925
    • fixed incorrect rowlog initialization by @Cyan4973 in https://github.com/facebook/zstd/pull/2931
    • rebalance lazy compression levels by @Cyan4973 in https://github.com/facebook/zstd/pull/2934

    New Contributors

    • @dnelson-1901 made their first contribution in https://github.com/facebook/zstd/pull/2657
    • @TrianglesPCT made their first contribution in https://github.com/facebook/zstd/pull/2653
    • @binhdvo made their first contribution in https://github.com/facebook/zstd/pull/2693
    • @wolfpld made their first contribution in https://github.com/facebook/zstd/pull/2654
    • @aqrit made their first contribution in https://github.com/facebook/zstd/pull/2681
    • @gauthamkrishna9991 made their first contribution in https://github.com/facebook/zstd/pull/2700
    • @luisdallos made their first contribution in https://github.com/facebook/zstd/pull/2714
    • @danlark1 made their first contribution in https://github.com/facebook/zstd/pull/2689
    • @heitbaum made their first contribution in https://github.com/facebook/zstd/pull/2655
    • @makise-homura made their first contribution in https://github.com/facebook/zstd/pull/2725
    • @koalabearguo made their first contribution in https://github.com/facebook/zstd/pull/2707
    • @jonringer made their first contribution in https://github.com/facebook/zstd/pull/2724
    • @eli-schwartz made their first contribution in https://github.com/facebook/zstd/pull/2746
    • @abxhr made their first contribution in https://github.com/facebook/zstd/pull/2798
    • @solbjorn made their first contribution in https://github.com/facebook/zstd/pull/2790
    • @nolange made their first contribution in https://github.com/facebook/zstd/pull/2805
    • @3nids made their first contribution in https://github.com/facebook/zstd/pull/2810
    • @Helflym made their first contribution in https://github.com/facebook/zstd/pull/2747
    • @stanjo74 made their first contribution in https://github.com/facebook/zstd/pull/2809
    • @Svetlitski-FB made their first contribution in https://github.com/facebook/zstd/pull/2839
    • @cntrump made their first contribution in https://github.com/facebook/zstd/pull/2858
    • @rex4539 made their first contribution in https://github.com/facebook/zstd/pull/2856
    • @jannkoeker made their first contribution in https://github.com/facebook/zstd/pull/2877
    • @yoniko made their first contribution in https://github.com/facebook/zstd/pull/2885
    • @15596858998 made their first contribution in https://github.com/facebook/zstd/pull/2876
    • @kanavin made their first contribution in https://github.com/facebook/zstd/pull/2895
    • @sapiippo made their first contribution in https://github.com/facebook/zstd/pull/2687
    • @supperPants made their first contribution in https://github.com/facebook/zstd/pull/2891
    • @Hello71 made their first contribution in https://github.com/facebook/zstd/pull/2907
    • @ericonr made their first contribution in https://github.com/facebook/zstd/pull/2659
    • @IAL32 made their first contribution in https://github.com/facebook/zstd/pull/2923
    • @embg made their first contribution in https://github.com/facebook/zstd/pull/2925

    Full Changelog: https://github.com/facebook/zstd/compare/v1.5.0...v1.5.1

    Source code(tar.gz)
    Source code(zip)
    zstd-1.5.1.tar.gz(1.84 MB)
    zstd-1.5.1.tar.gz.sha256(84 bytes)
    zstd-1.5.1.tar.gz.sig(858 bytes)
    zstd-1.5.1.tar.zst(1.40 MB)
    zstd-1.5.1.tar.zst.sha256(85 bytes)
    zstd-1.5.1.tar.zst.sig(858 bytes)
    zstd-v1.5.1-win32.zip(485.95 KB)
    zstd-v1.5.1-win64.zip(544.02 KB)
    zstd151_silesia_9700k.png(243.46 KB)
    zstd151_silesia_snap855.png(407.90 KB)
  • v1.5.0(May 14, 2021)

    v1.5.0 is a major release featuring large performance improvements as well as API changes.

    Performance

    Improved Middle-Level Compression Speed

    1.5.0 introduces a new default match finder for the compression strategies greedy, lazy, and lazy2 (which map to levels 5-12 for inputs larger than 256 KB). The optimization brings a massive improvement in compression speed, with slight perturbations in compression ratio (< 0.5%) and equal or decreased memory usage.

    Benchmarked with gcc, on an i9-9900K:

    | level | silesia.tar speed delta | enwik7 speed delta |
    |-------|-------------------------|--------------------|
    | 5     | +25%                    | +25%               |
    | 6     | +50%                    | +50%               |
    | 7     | +40%                    | +40%               |
    | 8     | +40%                    | +50%               |
    | 9     | +50%                    | +65%               |
    | 10    | +65%                    | +80%               |
    | 11    | +85%                    | +105%              |
    | 12    | +110%                   | +140%              |

    On heavily loaded machines with significant cache contention, we have internally measured even larger gains: 2-3x+ speed at levels 5-7. 🚀

    The biggest gains are achieved on files typically larger than 128 KB. On files smaller than 16 KB, we revert by default to the legacy match finder, which is the faster one in that scenario. This default policy can be overridden manually: the new match finder can be forcibly enabled with the advanced parameter ZSTD_c_useRowMatchFinder, or through the CLI option --[no-]row-match-finder.

    Note: only CPUs that support SSE2 realize the full extent of this improvement.

    Improved High-Level Compression Ratio

    Improving compression ratio via block splitting is now enabled by default for high compression levels (16+). The amount of benefit varies depending on the workload. Compressing archives comprised of heavily differing files will see more improvement than compression of single files that don’t vary much entropically (like text files/enwik). At levels 16+, we observe no measurable regression to compression speed.

    level 22 compression:

    | file        | ratio 1.4.9 | ratio 1.5.0 | ratio % delta |
    |-------------|-------------|-------------|---------------|
    | silesia.tar | 4.021       | 4.041       | +0.49%        |
    | calgary.tar | 3.646       | 3.672       | +0.71%        |
    | enwik7      | 3.579       | 3.579       | +0.0%         |

    The block splitter can be forcibly enabled on lower compression levels as well with the advanced parameter ZSTD_c_splitBlocks. When forcibly enabled at lower levels, speed regressions can become more notable. Additionally, since more compressed blocks may be produced, decompression speed on these blobs may also see small regressions.

    Faster Decompression Speed

    The decompression speed of data compressed with large window settings (such as --long or --ultra) has been significantly improved in this version. The gains vary depending on compiler brand and version, with clang generally benefiting the most.

    The following benchmark was measured by compressing enwik9 at level --ultra -22 (with a 128 MB window size) on a core i7-9700K.

    | Compiler version | D. Speed improvement |
    |------------------|----------------------|
    | gcc-7            | +15%                 |
    | gcc-8            | +10%                 |
    | gcc-9            | +5%                  |
    | gcc-10           | +1%                  |
    | clang-6          | +21%                 |
    | clang-7          | +16%                 |
    | clang-8          | +16%                 |
    | clang-9          | +18%                 |
    | clang-10         | +16%                 |
    | clang-11         | +15%                 |

    Average decompression speed for “normal” payload is slightly improved too, though the impact is less impressive. Once again, mileage varies depending on exact compiler version, payload, and even compression level. In general, a majority of scenarios see benefits ranging from +1 to +9%. There are also a few outliers here and there, from -4% to +13%. The average gain across all these scenarios stands at ~+4%.

    Library Updates

    Dynamic Library Supports Multithreading by Default

    It was already possible to compile libzstd with multithreading support, but it required an explicit build action: by default, the make build script would build libzstd as a single-thread-only library.

    This changes in v1.5.0. Now the dynamic library (typically libzstd.so.1 on Linux) supports multi-threaded compression by default. Note that this property is not extended to the static library (typically libzstd.a on Linux), because doing so would have impacted the build scripts of existing client applications (requiring them to add -pthread to their recipe), thus potentially breaking their build. In order to avoid this disruption, the static library remains single-threaded by default. The dynamic library has no such constraint: existing applications linking to libzstd.so and expecting only single-thread capabilities remain completely unaffected.

    The idea is that, starting from v1.5.0, applications can expect the dynamic library to support multi-threading should they need it, which will progressively lead to increased adoption of this capability over time. That being said, since the locally deployed dynamic library may or may not support multi-threaded compression, depending on local build configuration, it’s always better to check this capability at runtime. To do so, it’s enough to check the return value when setting the parameter ZSTD_c_nbWorkers: if it results in an error, then multi-threading is not supported.

    Q: What if I prefer to keep the libraries in single-thread mode only? The target make lib-nomt will ensure this outcome.

    Q: Actually, I want both the static and dynamic library versions to support multi-threading! The target make lib-mt will generate this outcome.
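    The runtime capability check described above can be sketched in a few lines. This is a minimal example, not part of the zstd API itself; zstd_supports_mt is a hypothetical helper name, and it relies only on stable libzstd calls:

    ```c
    #include <zstd.h>

    /* Returns 1 if the linked libzstd supports multi-threaded compression,
     * 0 otherwise (or on allocation failure). */
    static int zstd_supports_mt(void)
    {
        ZSTD_CCtx* const cctx = ZSTD_createCCtx();
        size_t ret;
        if (cctx == NULL) return 0;
        /* Requesting any nbWorkers > 0 fails on a single-thread-only build. */
        ret = ZSTD_CCtx_setParameter(cctx, ZSTD_c_nbWorkers, 2);
        ZSTD_freeCCtx(cctx);
        return !ZSTD_isError(ret);
    }
    ```

    An application can call such a helper once at startup and fall back to single-threaded compression when it returns 0.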

    Promotions to Stable

    Moving up to the higher digit 1.5 signals an opportunity to extend the stable portion of zstd public API. This update is relatively minor, featuring only a few non-controversial newcomers.

    ZSTD_defaultCLevel() indicates which level is default (applied when selecting level 0). It completes existing ZSTD_minCLevel() and ZSTD_maxCLevel(). Similarly, ZSTD_getDictID_fromCDict() is a straightforward equivalent to already promoted ZSTD_getDictID_fromDDict().
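    For instance, the newly stable level queries make it easy to sanitize a user-supplied compression level; clamp_compression_level below is a hypothetical helper, shown only as an illustration:

    ```c
    #include <zstd.h>

    /* Map a user-supplied level onto the range zstd actually accepts.
     * ZSTD_defaultCLevel() is the level applied when 0 is requested. */
    static int clamp_compression_level(int requested)
    {
        if (requested == 0) return ZSTD_defaultCLevel();
        if (requested < ZSTD_minCLevel()) return ZSTD_minCLevel();
        if (requested > ZSTD_maxCLevel()) return ZSTD_maxCLevel();
        return requested;
    }
    ```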

    Deprecations

    Zstd-1.4.0 stabilized a new advanced API which allows users to pass advanced parameters to zstd. We’re now deprecating all the old experimental APIs that are subsumed by the new advanced API. They will be considered for removal in the next Zstd major release zstd-1.6.0. Note that only experimental symbols are impacted. Stable functions, like ZSTD_initCStream(), remain fully supported.

    The deprecated functions are listed below, together with the migration. All the suggested migrations are stable APIs, meaning that once you migrate, the API will be supported forever. See the documentation for the deprecated functions for more details on how to migrate.
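    As one illustration of such a migration, a call like ZSTD_initCStream_usingDict(zcs, dict, dictSize, level) can be expressed with only stable advanced-API calls. This is a sketch with error checks elided; init_with_dict is a hypothetical wrapper, not a zstd function:

    ```c
    #include <zstd.h>

    /* Equivalent of the deprecated ZSTD_initCStream_usingDict(),
     * using the stable advanced API. */
    static size_t init_with_dict(ZSTD_CCtx* cctx,
                                 const void* dict, size_t dictSize, int level)
    {
        ZSTD_CCtx_reset(cctx, ZSTD_reset_session_and_parameters);
        ZSTD_CCtx_setParameter(cctx, ZSTD_c_compressionLevel, level);
        /* The dictionary is retained and applied to all subsequent frames
         * until the next full reset. */
        return ZSTD_CCtx_loadDictionary(cctx, dict, dictSize);
    }
    ```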

    Header File Locations

    Zstd has slightly re-organized the library layout to move all public headers to the top level lib/ directory. This is for consistency, so all public headers are in lib/ and all private headers are in a sub-directory. If you build zstd from source, this may affect your build system.

    • lib/common/zstd_errors.h has moved to lib/zstd_errors.h.
    • lib/dictBuilder/zdict.h has moved to lib/zdict.h.

    Single-File Library

    We have moved the scripts in contrib/single_file_libs to build/single_file_libs. These scripts, originally contributed by @cwoffenden, produce a single compilation-unit amalgamation of the zstd library, which can be convenient for integrating Zstandard into other source trees. This move reflects a commitment on our part to support this tool and this pattern of using zstd going forward.

    Windows Release Artifact Format

    We are slightly changing the format of the Windows release .zip files, to match our other release artifacts. The .zip files now bundle everything in a single folder whose name matches the archive name. The contents of that folder exactly match what was previously included in the root of the archive.

    Signed Releases

    We have created a signing key for the Zstandard project. This release and all future releases will be signed by this key. See #2520 for discussion.

    Changelog

    • api: Various functions promoted from experimental to stable API: (#2579-#2581, @senhuang42)
      • ZSTD_defaultCLevel()
      • ZSTD_getDictID_fromCDict()
    • api: Several experimental functions have been deprecated and will emit a compiler warning (#2582, @senhuang42)
      • ZSTD_compress_advanced()
      • ZSTD_compress_usingCDict_advanced()
      • ZSTD_compressBegin_advanced()
      • ZSTD_compressBegin_usingCDict_advanced()
      • ZSTD_initCStream_srcSize()
      • ZSTD_initCStream_usingDict()
      • ZSTD_initCStream_usingCDict()
      • ZSTD_initCStream_advanced()
      • ZSTD_initCStream_usingCDict_advanced()
      • ZSTD_resetCStream()
    • api: ZSTDMT_NBWORKERS_MAX reduced to 64 for 32-bit environments (#2643, @Cyan4973)
    • perf: Significant speed improvements for middle compression levels (#2494, @senhuang42 & @terrelln)
    • perf: Block splitter to improve compression ratio, enabled by default for high compression levels (#2447, @senhuang42)
    • perf: Decompression loop refactor, speed improvements on clang and for --long modes (#2614 #2630, @Cyan4973)
    • perf: Reduced stack usage during compression and decompression entropy stage (#2522 #2524, @terrelln)
    • bug: Make the number of physical CPU cores detection more robust (#2517, @PaulBone)
    • bug: Improve setting permissions of created files (#2525, @felixhandte)
    • bug: Fix large dictionary non-determinism (#2607, @terrelln)
    • bug: Fix various dedicated dictionary search bugs (#2540 #2586, @senhuang42 @felixhandte)
    • bug: Fix non-determinism test failures on Linux i686 (#2606, @terrelln)
    • bug: Fix UBSAN error in decompression (#2625, @terrelln)
    • bug: Fix superblock compression divide by zero bug (#2592, @senhuang42)
    • bug: Ensure ZSTD_estimateCCtxSize*() monotonically increases with compression level (#2538, @senhuang42)
    • doc: Improve zdict.h dictionary training API documentation (#2622, @terrelln)
    • doc: Note that public ZSTD_free*() functions accept NULL pointers (#2521, @animalize)
    • doc: Add style guide docs for open source contributors (#2626, @Cyan4973)
    • tests: Better regression test coverage for different dictionary modes (#2559, @senhuang42)
    • tests: Better test coverage of index reduction (#2603, @terrelln)
    • tests: OSS-Fuzz coverage for seekable format (#2617, @senhuang42)
    • tests: Test coverage for ZSTD threadpool API (#2604, @senhuang42)
    • build: Dynamic library built multithreaded by default (#2584, @senhuang42)
    • build: Move zstd_errors.h and zdict.h to lib/ root (#2597, @terrelln)
    • build: Single file library build script moved to build/ directory (#2618, @felixhandte)
    • build: Allow ZSTDMT_JOBSIZE_MIN to be configured at compile-time, reduce default to 512KB (#2611, @Cyan4973)
    • build: Fixed Meson build (#2548, @SupervisedThinking & @kloczek)
    • build: ZBUFF_*() is no longer built by default (#2583, @senhuang42)
    • build: Fix excessive compiler warnings with clang-cl and CMake (#2600, @nickhutchinson)
    • build: Detect presence of md5 on Darwin (#2609, @felixhandte)
    • build: Avoid SIGBUS on armv6 (#2633, @bmwiedmann)
    • cli: --progress flag added to always display progress bar (#2595, @senhuang42)
    • cli: Allow reading from block devices with --force (#2613, @felixhandte)
    • cli: Fix CLI filesize display bug (#2550, @Cyan4973)
    • cli: Fix windows CLI --filelist end-of-line bug (#2620, @Cyan4973)
    • contrib: Various fixes for linux kernel patch (#2539, @terrelln)
    • contrib: Seekable format - Decompression hanging edge case fix (#2516, @senhuang42)
    • contrib: Seekable format - New seek table-only API (#2113 #2518, @mdittmer @Cyan4973)
    • contrib: Seekable format - Fix seek table descriptor check when loading (#2534, @foxeng)
    • contrib: Seekable format - Decompression fix for large offsets, (#2594, @azat)
    • misc: Automatically published release tarballs available on Github (#2535, @felixhandte)
    Source code(tar.gz)
    Source code(zip)
    zstd-1.5.0.tar.gz(1.76 MB)
    zstd-1.5.0.tar.gz.sha256(84 bytes)
    zstd-1.5.0.tar.gz.sig(858 bytes)
    zstd-1.5.0.tar.zst(1.35 MB)
    zstd-1.5.0.tar.zst.sha256(85 bytes)
    zstd-1.5.0.tar.zst.sig(858 bytes)
    zstd-v1.5.0-win32.zip(1.49 MB)
    zstd-v1.5.0-win64.zip(1.65 MB)
  • v1.4.9(Mar 3, 2021)

    This is an incremental release which includes various improvements and bug-fixes.

    >2x Faster Long Distance Mode

    Long Distance Mode (LDM) --long just got a whole lot faster thanks to optimizations by @mpu in #2483! These optimizations preserve the compression ratio but drastically speed up compression. The gain is especially noticeable in multithreaded mode, because the long distance match finder is not parallelized. Benchmarking with zstd -T0 -1 --long=31 on an Intel i9-9900K at 3.2 GHz we see:

    | File | v1.4.8 MB/s | v1.4.9 MB/s | Improvement |
    | --- | --- | --- | --- |
    | silesia.tar | 308 | 692 | 125% |
    | linux-versions* | 312 | 667 | 114% |
    | enwik9 | 294 | 747 | 154% |

    * linux-versions is a concatenation of the linux 4.0, 5.0, and 5.10 git archives.

    New Experimental Decompression Feature: ZSTD_d_refMultipleDDicts

    If the advanced parameter ZSTD_d_refMultipleDDicts is enabled, then multiple calls to ZSTD_DCtx_refDDict() will be honored in the corresponding DCtx. Example usage:

    ZSTD_DCtx* dctx = ZSTD_createDCtx();
    ZSTD_DCtx_setParameter(dctx, ZSTD_d_refMultipleDDicts, ZSTD_rmd_refMultipleDDicts);
    ZSTD_DCtx_refDDict(dctx, ddict1);
    ZSTD_DCtx_refDDict(dctx, ddict2);
    ZSTD_DCtx_refDDict(dctx, ddict3);
    ...
    ZSTD_decompress...
    

    Decompression of multiple frames, each with its own dictID, is now possible with a single ZSTD_decompress call. As long as the dictID from each frame header references one of the dictIDs within the DCtx, the corresponding dictionary will be used to decompress that particular frame. Note that this feature is disabled with a statically-allocated DCtx.
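    The per-frame matching rule can be sketched outside of zstd in plain C. Note this is an illustration with hypothetical names, not zstd's internal code (the real DCtx stores ZSTD_DDict* objects, each carrying its own dictID):

```c
#include <stddef.h>

/* Hypothetical stand-in for a referenced dictionary and its ID. */
typedef struct { unsigned dictID; const void* ddict; } DDictEntry;

/* Selection rule described above: the dictID read from each frame header
 * picks the matching referenced dictionary, or NULL when the frame
 * references a dictID the DCtx does not know about. */
static const void* select_ddict(const DDictEntry* table, size_t n,
                                unsigned frameDictID)
{
    for (size_t i = 0; i < n; i++)
        if (table[i].dictID == frameDictID)
            return table[i].ddict;
    return NULL;  /* no match: that frame cannot be decompressed with a dict */
}
```

    When no registered dictID matches, there is no dictionary to apply, and decompression of that frame fails.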

    Changelog

    • bug: Use umask() to Constrain Created File Permissions (#2495, @felixhandte)
    • bug: Make Simple Single-Pass Functions Ignore Advanced Parameters (#2498, @terrelln)
    • api: Add (De)Compression Tracing Functionality (#2482, @terrelln)
    • api: Support References to Multiple DDicts (#2446, @senhuang42)
    • api: Add Function to Generate Skippable Frame (#2439, @senhuang42)
    • perf: New Algorithms for the Long Distance Matcher (#2483, @mpu)
    • perf: Performance Improvements for Long Distance Matcher (#2464, @mpu)
    • perf: Don't Shrink Window Log when Streaming with a Dictionary (#2451, @terrelln)
    • cli: Fix --output-dir-mirror's Rejection of ..-Containing Paths (#2512, @felixhandte)
    • cli: Allow Input From Console When -f/--force is Passed (#2466, @felixhandte)
    • cli: Improve Help Message (#2500, @senhuang42)
    • tests: Avoid Using stat -c on NetBSD (#2513, @felixhandte)
    • tests: Correctly Invoke md5 Utility on NetBSD (#2492, @niacat)
    • tests: Remove Flaky Tests (#2455, #2486, #2445, @Cyan4973)
    • build: Zstd CLI Can Now be Linked to Dynamic libzstd (#2457, #2454 @Cyan4973)
    • build: Avoid Using Static-Only Symbols (#2504, @skitt)
    • build: Fix Fuzzer Compiler Detection & Update UBSAN Flags (#2503, @terrelln)
    • build: Explicitly Hide Static Symbols (#2501, @skitt)
    • build: CMake: Enable Only C for lib/ and programs/ Projects (#2498, @concatime)
    • build: CMake: Use configure_file() to Create the .pc File (#2462, @lazka)
    • build: Add Guards for _LARGEFILE_SOURCE and _LARGEFILE64_SOURCE (#2444, @indygreg)
    • build: Improve zlibwrapper Makefile (#2437, @Cyan4973)
    • contrib: Add recover_directory Program (#2473, @terrelln)
    • doc: Change License Year to 2021 (#2452 & #2465, @terrelln & @senhuang42)
    • doc: Fix Typos (#2459, @ThomasWaldmann)
    Source code(tar.gz)
    Source code(zip)
    zstd-1.4.9.tar.gz(1.73 MB)
    zstd-1.4.9.tar.gz.sha256(84 bytes)
    zstd-1.4.9.tar.zst(1.33 MB)
    zstd-1.4.9.tar.zst.sha256(85 bytes)
    zstd-v1.4.9-win32.zip(1.33 MB)
    zstd-v1.4.9-win64.zip(1.49 MB)
  • v1.4.8(Dec 19, 2020)

    This is a minor hotfix for v1.4.7, where an internal buffer unalignment bug was detected by @bmwiedemann . The issue is of no consequence for x64 and arm64 targets, but could become a problem for cpus relying on strict alignment, such as mips or older arm designs. Additionally, some targets, like 32-bit x86 cpus, do not care much about alignment, but the code does, and will detect the misalignment and return an error code. Some other less common platforms, such as s390x, also seem to trigger the same issue.

    While it's a minor fix, this update is nonetheless recommended.

    Source code(tar.gz)
    Source code(zip)
    zstd-1.4.8.tar.gz(1.71 MB)
    zstd-1.4.8.tar.gz.sha256(84 bytes)
    zstd-1.4.8.tar.zst(1.32 MB)
    zstd-1.4.8.tar.zst.sha256(85 bytes)
    zstd-v1.4.8-win32.zip(1.27 MB)
    zstd-v1.4.8-win64.zip(1.42 MB)
  • v1.4.7(Dec 17, 2020)

    Note: this version features a minor bug, which can be present on systems other than x64 and arm64. Updating to v1.4.8 is recommended for all other platforms.

    v1.4.7 unleashes several months of improvements across many axes, from performance to various fixes to new capabilities, a few of which are highlighted below. It’s a recommended upgrade.

    (Note: if you ever wondered what happened to v1.4.6, it’s an internal release number reserved for synchronization with Linux Kernel)

    Improved --long mode

    --long mode makes it possible to analyze vast quantities of data within a reasonable time and memory budget. The --long mode algorithm runs on top of the regular match finder, and both contribute to the final compressed outcome. However, the fact that these two stages worked independently resulted in minor discrepancies at the highest compression levels, where the cost of each decision must be carefully monitored. For this reason, in situations where the input is not a good fit for --long mode (no large repetition at long distance), enabling it could slightly reduce compression ratio at high compression levels, compared to not enabling it. This situation made it more difficult to "just always enable" --long mode by default. This is fixed in this version: for compression levels 16 and up, usage of --long will now never regress compared to compression without --long. This property made it possible to ramp up --long mode’s contribution to the compression mix, improving its effectiveness.

    The compression ratio improvements are most notable when --long mode is actually useful. In particular, --patch-from (which implicitly relies on --long) shows excellent gains from the improvements. We present some brief results here (tested on a MacBook Pro 16", i9).

    [Chart: --long mode compression ratio, v1.4.5 vs v1.4.7]

    Since --long mode is now always beneficial at high compression levels, it’s now automatically enabled for any window size of 128 MB or larger.

    Faster decompression of small blocks

    This release includes optimizations that significantly speed up decompression of small blocks and small data. The decompression speed gains will vary based on the block size according to the table below:

    | Block Size | Decompression Speed Improvement |
    | --- | --- |
    | 1 KB | ~+30% |
    | 2 KB | ~+30% |
    | 4 KB | ~+25% |
    | 8 KB | ~+15% |
    | 16 KB | ~+10% |
    | 32 KB | ~+5% |

    These optimizations come from improving the process of reading the block header, and building the Huffman and FSE decoding tables. zstd’s default block size is 128 KB, and at this block size the time spent decompressing the data dominates the time spent reading the block header and building the decoding tables. But, as blocks become smaller, the cost of reading the block header and building decoding tables becomes more prominent.
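    The effect can be sketched with a back-of-the-envelope cost model (illustrative numbers only, not measurements): treat each block as paying a fixed setup cost before payload bytes start to flow, and look at what fraction of total time that setup represents as blocks shrink.

```c
/* Toy cost model: per-block time = fixed setup (header parsing, Huffman/FSE
 * table construction) + bytes * per-byte decode time. Returns the fraction
 * of total time spent on setup, which grows as blocks get smaller. */
static double setup_share(double blockBytes, double setupNs, double nsPerByte)
{
    return setupNs / (setupNs + blockBytes * nsPerByte);
}
```

    At a 128 KB block, the setup fraction is tiny and shaving it barely registers; at 1 KB it is a large share of total time, which is why reducing setup cost yields the biggest gains on the smallest blocks.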

    CLI improvements

    The CLI received several noticeable upgrades with this version. To begin with, zstd can now accept a parameter through an environment variable, ZSTD_NBTHREADS. It’s useful when zstd is invoked from another application (tar, or a Python script, for example). Users who prefer multithreaded compression by default can also set a desired number of threads in their environment; this setting can still be overridden on demand via the command line. A new command, --output-dir-mirror, makes it possible to compress a directory containing subdirectories (typically with the -r command), producing one compressed file per source file, while reproducing the directory structure into a selected destination directory. There are various other improvements, such as more accurate warning and error messages, full equivalence between the conventions --long-command=FILE and --long-command FILE, fixed confusion risks between stdin and user prompt, or between console output and status messages, as well as a new short execution summary when processing multiple files, all cumulatively contributing to a nicer command line experience.

    New experimental features

    Shared Thread Pool

    By default, each compression context can be set to use a maximum number of threads. In complex scenarios, there might be multiple compression contexts working in parallel, each using some number of threads. In such cases, it might be desirable to control the total number of threads used by all these compression contexts altogether.

    This is now possible, by making all these compression contexts share the same threadpool. This capability is exposed through a new advanced function, ZSTD_CCtx_refThreadPool(), contributed by @marxin. See its documentation for more details.

    Faster Dictionary Compression

    This release introduces a new experimental dictionary compression algorithm, applicable to mid-range compression levels, employing strategies such as ZSTD_greedy, ZSTD_lazy, and ZSTD_lazy2. This new algorithm can be triggered by selecting the compression parameter ZSTD_c_enableDedicatedDictSearch during ZSTD_CDict creation (experimental section).

    Benchmarks show the new algorithm providing significant compression speed gains:

    | Level | Hot Dict | Cold Dict |
    | --- | --- | --- |
    | 5 | ~+17% | ~+30% |
    | 6 | ~+12% | ~+45% |
    | 7 | ~+13% | ~+40% |
    | 8 | ~+16% | ~+50% |
    | 9 | ~+19% | ~+65% |
    | 10 | ~+24% | ~+70% |

    We hope it will help make mid-level compression more attractive for dictionary scenarios. See the documentation for more details. Feedback is welcome!

    New Sequence Ingestion API

    We introduce a new entry point, ZSTD_compressSequences(), which makes it possible for users to define their own sequences, by whatever mechanism they prefer, and present them to this entry point, which will generate a single zstd-compressed frame based on the provided sequences.

    So for example, users can now feed to the function an array of externally generated ZSTD_Sequence: [(offset: 5, matchLength: 4, litLength: 10), (offset: 7, matchLength: 6, litLength: 3), ...] and the function will output a zstd compressed frame based on these sequences.
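    To make the sequence semantics concrete, here is a stand-alone sketch (a hypothetical helper, not zstd's internal code) of how a decoder consumes such triples: each sequence copies litLength bytes from the literal stream, then matchLength bytes starting offset bytes back in the output produced so far.

```c
#include <stddef.h>
#include <string.h>

/* Illustration-only mirror of the (offset, litLength, matchLength) triple. */
typedef struct { unsigned offset, litLength, matchLength; } Seq;

/* Reconstruct output from a literal stream plus sequence triples:
 * copy litLength literals, then matchLength bytes from `offset` bytes
 * back in the output. Returns the number of bytes produced. */
static size_t apply_sequences(char* dst, const char* lits,
                              const Seq* seqs, size_t nbSeqs)
{
    size_t dPos = 0, lPos = 0;
    for (size_t s = 0; s < nbSeqs; s++) {
        memcpy(dst + dPos, lits + lPos, seqs[s].litLength);
        dPos += seqs[s].litLength;
        lPos += seqs[s].litLength;
        for (unsigned i = 0; i < seqs[s].matchLength; i++) {
            dst[dPos] = dst[dPos - seqs[s].offset];  /* byte-wise: matches may overlap */
            dPos++;
        }
    }
    return dPos;
}
```

    In the real API, triples are provided as ZSTD_Sequence entries to ZSTD_compressSequences(), and zstd performs entropy coding on top; this sketch only shows the copy semantics a decoder applies.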

    This experimental API currently has several limitations (and its relevant params exist in the “experimental” section). Notably, this API currently ignores any repeat offsets provided, instead always recalculating them on the fly. Additionally, there is no way to force the use of certain zstd features, such as RLE or raw blocks. If you are interested in this new entry point, please refer to zstd.h for more detailed usage instructions.

    Changelog

    There are many other features and improvements in this release, and since we can’t highlight them all, they are listed below:

    • perf: stronger --long mode at high compression levels, by @senhuang42
    • perf: stronger --patch-from at high compression levels, thanks to --long improvements
    • perf: faster decompression speed for small blocks, by @terrelln
    • perf: faster dictionary compression at medium compression levels, by @felixhandte
    • perf: small speed & memory usage improvements for ZSTD_compress2(), by @terrelln
    • perf: minor generic decompression speed improvements, by @helloguo
    • perf: improved fast compression speeds with Visual Studio, by @animalize
    • cli : Set nb of threads with environment variable ZSTD_NBTHREADS, by @senhuang42
    • cli : new --output-dir-mirror DIR command, by @xxie24 (#2219)
    • cli : accept decompressing files with *.zstd suffix
    • cli : --patch-from can compress stdin when used with --stream-size, by @bimbashrestha (#2206)
    • cli : provide a condensed summary by default when processing multiple files
    • cli : fix : stdin input can no longer be confused with user prompt
    • cli : fix : console output no longer mixes stdout and status messages
    • cli : improve accuracy of several error messages
    • api : new sequence ingestion API, by @senhuang42
    • api : shared thread pool: control total nb of threads used by multiple compression jobs, by @marxin
    • api : new ZSTD_getDictID_fromCDict(), by @LuAPi
    • api : zlibWrapper only uses public API, and is compatible with dynamic library, by @terrelln
    • api : fix : multithreaded compression has predictable output even in special cases (see #2327) (issue not present on cli)
    • api : fix : dictionary compression correctly respects dictionary compression level (see #2303) (issue not present on cli)
    • api : fix : return dstSize_tooSmall error whenever appropriate
    • api : fix : ZSTD_initCStream_advanced() with static allocation and no dictionary
    • build: fix cmake script when employing path including spaces, by @terrelln
    • build: new ZSTD_NO_INTRINSICS macro to avoid explicit intrinsics
    • build: new STATIC_BMI2 macro for compile time detection of BMI2 on MSVC, by @Niadb (#2258)
    • build: improved compile-time detection of aarch64/neon platforms, by @bsdimp
    • build: Fix building on AIX 5.1, by @likema
    • build: compile paramgrill with cmake on Windows, requested by @mirh
    • build: install pkg-config file with CMake and MinGW, by @tonytheodore (#2183)
    • build: Install DLL with CMake on Windows, by @BioDataAnalysis (#2221)
    • build: fix : cli compilation with uclibc
    • misc: Improve single file library and include dictBuilder, by @cwoffenden
    • misc: Fix single file library compilation with Emscripten, by @yoshihitoh (#2227)
    • misc: Add freestanding translation script in contrib/freestanding_lib, by @terrelln
    • doc : clarify repcode updates in format specification, by @felixhandte
    Source code(tar.gz)
    Source code(zip)
    zstd-1.4.7.tar.gz(1.72 MB)
    zstd-1.4.7.tar.gz.sha256(84 bytes)
    zstd-1.4.7.tar.zst(1.32 MB)
    zstd-1.4.7.tar.zst.sha256(85 bytes)
    zstd-v1.4.7-win32.zip(1.27 MB)
    zstd-v1.4.7-win64.zip(1.41 MB)
  • v1.4.5(May 22, 2020)

    Zstd v1.4.5 Release Notes

    This is a fairly important release which includes performance improvements and new major CLI features. It also fixes a few corner cases, making it a recommended upgrade.

    Faster Decompression Speed

    Decompression speed has been improved again, thanks to great contributions from @terrelln. As usual, exact mileage varies depending on files and compilers. For x64 CPUs, expect a speed bump of at least +5%, and up to +10% in favorable cases. ARM CPUs benefit more, with speed improvements ranging from around +15% up to +50% for certain SoCs and scenarios (ARM's situation is more complex due to larger differences in SoC designs).

    For illustration, some benchmarks run on a modern x64 platform using zstd -b compiled with gcc v9.3.0:

    | | v1.4.4 | v1.4.5 |
    | --- | --- | --- |
    | silesia.tar | 1568 MB/s | 1653 MB/s |
    | enwik8 | 1374 MB/s | 1469 MB/s |
    | calgary.tar | 1511 MB/s | 1610 MB/s |

    Same platform, using the clang v10.0.0 compiler:

    | | v1.4.4 | v1.4.5 |
    | --- | --- | --- |
    | silesia.tar | 1439 MB/s | 1496 MB/s |
    | enwik8 | 1232 MB/s | 1335 MB/s |
    | calgary.tar | 1361 MB/s | 1457 MB/s |

    Simplified integration

    Presuming a project needs to integrate libzstd's source code (as opposed to linking a pre-compiled library), the /lib source directory can be copy/pasted into the target project. Then the local build system must set up a few include directories. Some setups are automatically provided in prepared build scripts, such as Makefile, but any other third-party build system must do it on its own. This integration is now simplified, thanks to @felixhandte, by making all dependencies within /lib relative, meaning it’s only necessary to set up include directories for the *.h header files that are directly included into the target project (typically zstd.h). Even that task can be circumvented by copy/pasting the *.h files into already established include directories.

    Alternatively, if you are a fan of the one-file integration strategy, @cwoffenden has extended his one-file decoder script into a full-featured one-file compression library. The script create_single_file_library.sh will generate a file zstd.c, which contains all selected elements from the library (by default, compression and decompression). It’s then enough to import just zstd.h and the generated zstd.c into the target project to access all included capabilities.

    --patch-from

    The Zstandard CLI introduces a new command line option, --patch-from, which leverages the existing compressors, dictionaries and long range match finder to deliver a high-speed engine for producing and applying patches to files.

    --patch-from is based on dictionary compression. It treats a previous version of a file as a dictionary, to better compress a new version of the same file. This operation preserves fast zstd speeds at lower compression levels. To this end, it also increases the previous maximum dictionary size from 32 MB to 2 GB, and automatically uses the long range match finder when needed (though it can also be manually overruled). --patch-from can also be combined with multi-threading mode at a very minimal compression ratio loss.

    Example usage:

    # create the patch
    zstd --patch-from=<oldfile> <newfile> -o <patchfile>
    
    # apply the patch
    zstd -d --patch-from=<oldfile> <patchfile> -o <newfile>
    

    Benchmarks: We compared zstd to bsdiff, a popular industry grade diff engine. Our test corpus were tarballs of different versions of source code from popular GitHub repositories. Specifically:

    repos = {
        # ~31mb (small file)
        "zstd": {"url": "https://github.com/facebook/zstd", "dict-branch": "refs/tags/v1.4.2", "src-branch": "refs/tags/v1.4.3"},
        # ~273mb (medium file)
        "wordpress": {"url": "https://github.com/WordPress/WordPress", "dict-branch": "refs/tags/5.3.1", "src-branch": "refs/tags/5.3.2"},
        # ~1.66gb (large file)
        "llvm": {"url": "https://github.com/llvm/llvm-project", "dict-branch": "refs/tags/llvmorg-9.0.0", "src-branch": "refs/tags/llvmorg-9.0.1"}
    }
    

    --patch-from on level 19 (with chainLog=30 and targetLength=4kb) is comparable with bsdiff when comparing patch sizes.

    [Chart: patch size, bsdiff vs zstd -19]

    --patch-from greatly outperforms bsdiff in speed even on its slowest setting of level 19 boasting an average speedup of ~7X. --patch-from is >200X faster on level 1 and >100X faster (shown below) on level 3 vs bsdiff while still delivering patch sizes less than 0.5% of the original file size.

    [Charts: patch creation speed, bsdiff vs zstd -19 and zstd -3]

    And of course, there is no change to the fast zstd decompression speed.

    Addendum :

    After releasing --patch-from, we were made aware of two other popular diff engines by the community: SmartVersion and Xdelta. We ran some additional benchmarks for them, and here are our primary takeaways. All three tools are excellent diff engines with clear advantages (especially in speed) over the popular bsdiff. Patch sizes for both binary and text data produced by all three are fairly comparable, with Xdelta underperforming Zstd and SmartVersion only slightly [1]. For patch creation speed, Xdelta is the clear winner for text data and Zstd is the clear winner for binary data [2]. And for patch extraction speed (i.e. decompression), Zstd is the fastest in all scenarios [3]. See the wiki for details.

    --filelist=

    Finally, --filelist= is a new CLI capability which makes it possible to read the list of files to operate upon from a file, as opposed to listing all target files on the command line. This makes it possible to prepare a list offline, save it into a file, and then provide the prepared list to zstd. Another advantage is that this method circumvents command line size limitations, which can become a problem when operating on very large directories (a situation that can typically arise from shell expansion), since a list of filenames read from a file is free of any such size limit.

    Full List

    • perf: Improved decompression speed (x64 >+5%, ARM >+15%), by @terrelln
    • perf: Automatically downsizes ZSTD_DCtx when too large for too long (#2069, by @bimbashrestha)
    • perf: Improved fast compression speed on aarch64 (#2040, ~+3%, by @caoyzh)
    • perf: Small level 1 compression speed gains (depending on compiler)
    • fix: Compression ratio regression on huge files (> 3 GB) using high levels (--ultra) and multithreading, by @terrelln
    • api: ZDICT_finalizeDictionary() is promoted to stable (#2111)
    • api: new experimental parameter ZSTD_d_stableOutBuffer (#2094)
    • build: Generate a single-file libzstd library (#2065, by @cwoffenden)
    • build: Relative includes, no longer require -I flags for zstd lib subdirs (#2103, by @felixhandte)
    • build: zstd now compiles cleanly under -pedantic (#2099)
    • build: zstd now compiles with make-4.3
    • build: Support mingw cross-compilation from Linux, by @Ericson2314
    • build: Meson multi-thread build fix on windows
    • build: Some misc icc fixes backed by new ci test on travis
    • cli: New --patch-from command, create and apply patches from files, by @bimbashrestha
    • cli: --filelist= : Provide a list of files to operate upon from a file
    • cli: -b can now benchmark multiple files in decompression mode
    • cli: New --no-content-size command
    • cli: New --show-default-cparams command
    • misc: new diagnosis tool, checked_flipped_bits, in contrib/, by @felixhandte
    • misc: Extend largeNbDicts benchmark to compression
    • misc: experimental edit-distance match finder in contrib/
    • doc: Improved beginner CONTRIBUTING.md docs
    • doc: New issue templates for zstd
    Source code(tar.gz)
    Source code(zip)
    zstd-1.4.5.tar.gz(1.88 MB)
    zstd-1.4.5.tar.gz.sha256(84 bytes)
    zstd-1.4.5.tar.zst(1.37 MB)
    zstd-1.4.5.tar.zst.sha256(85 bytes)
    zstd-v1.4.5-win32.zip(1.27 MB)
    zstd-v1.4.5-win64.zip(1.37 MB)
  • v1.4.4(Nov 5, 2019)

    This release includes some major performance improvements and new CLI features, which make it a recommended upgrade.

    Faster Decompression Speed

    Decompression speed has been substantially improved, thanks to @terrelln. Exact mileage obviously varies depending on files and scenarios, but the general expectation is a bump of about +10%. The benefit is considered applicable to all scenarios, and will be perceptible for most usages.

    Some benchmark figures for illustration:

    | | v1.4.3 | v1.4.4 |
    | --- | --- | --- |
    | silesia.tar | 1440 MB/s | 1600 MB/s |
    | enwik8 | 1225 MB/s | 1390 MB/s |
    | calgary.tar | 1360 MB/s | 1530 MB/s |

    Faster Compression Speed when Re-Using Contexts

    In server workloads (characterized by a very high compression volume of relatively small inputs), the allocation and initialization of zstd's internal data structures can become a significant part of the cost of compression. For this reason, zstd has long had an optimization, which we recommend to large-scale users: when you provide an already-used ZSTD_CCtx to a compression operation, zstd tries to re-use the existing data structures, if possible, rather than re-allocate and re-initialize them.

    Historically, this optimization could avoid re-allocation most of the time, but required an exact match of internal parameters to avoid re-initialization. In this release, @felixhandte removed the dependency on matching parameters, allowing the full context re-use optimization to be applied to effectively all compressions. Practical workloads on small data should expect a ~3% speed-up.

    In addition to improving average performance, this change also has some nice side-effects on the extremes of performance.

    • On the fast end, it is now easier to get optimal performance from zstd. In particular, it is no longer necessary to do careful tracking and matching of contexts to compressions based on detailed parameters (as discussed for example in #1796). Instead, straightforwardly reusing contexts is now optimal.
    • Second, this change ameliorates some rare, degenerate scenarios (e.g., high volume streaming compression of small inputs with varying, high compression levels), in which it was possible for the allocation and initialization work to vastly overshadow the actual compression work. These cases are up to 40x faster, and now perform in-line with similar happy cases.

    Dictionaries and Large Inputs

    In theory, using a dictionary should always be beneficial. However, due to some long-standing implementation limitations, it can actually be detrimental. Case in point: by default, dictionaries are prepared to compress small data (where they are most useful). When this prepared dictionary is used to compress large data, there is a mismatch between the prepared parameters (targeting small data) and the ideal parameters (that would target large data). This can cause dictionaries to counter-intuitively result in a lower compression ratio when compressing large inputs.

    Starting with v1.4.4, using a dictionary with a very large input will no longer be detrimental. Thanks to a patch from @senhuang42, whenever the library notices that input is sufficiently large (relative to dictionary size), the dictionary is re-processed, using the optimal parameters for large data, resulting in improved compression ratio.

    The capability is also exposed, and can be manually triggered using ZSTD_dictForceLoad.

    New commands

    The zstd CLI extends its capabilities, providing new advanced commands, thanks to great contributions:

    • zstd generated files (compressed or decompressed) can now be automatically stored into a different directory than the source one, using the --output-dir-flat=DIR command, provided by @senhuang42.
    • It’s possible to inform zstd about the size of data coming from stdin. @nmagerko proposed two new commands, allowing users to provide the exact stream size (--stream-size=#) or an approximate one (--size-hint=#). Both only make sense when compressing a data stream from a pipe (such as stdin), since for a real file, zstd obtains the exact source size from the file system. Providing a source size allows zstd to better adapt internal compression parameters to the input, resulting in better performance and compression ratio. Additionally, providing the precise size makes it possible to embed this information in the compressed frame header, which also enables decoder optimizations.
    • In situations where the same directory content gets regularly compressed, with the intention of only compressing new files not yet compressed, it’s necessary to filter the file list to exclude already-compressed files. This process is simplified with the command --exclude-compressed, provided by @shashank0791. As the name implies, it simply excludes all compressed files from the list to process.

    Single-File Decoder with Web Assembly

    Let’s complete the picture with an impressive contribution from @cwoffenden. libzstd has long offered the capability to build only the decoder, in order to generate smaller binaries that can be more easily embedded into memory-constrained devices and applications.

    @cwoffenden built on this capability and offers a script creating a single-file decoder, as an amalgamated variant of the reference Zstandard decoder. The package is completed with a nice build script, which compiles the one-file decoder into WASM code for embedding into web applications, and even tests it.

    As a capability example, check out the awesome WebGL demo provided by @cwoffenden in /contrib/single_file_decoder/examples directory!

    Full List

    • perf: Improved decompression speed, by > 10%, by @terrelln
    • perf: Better compression speed when re-using a context, by @felixhandte
    • perf: Fix compression ratio when compressing large files with small dictionary, by @senhuang42
    • perf: zstd reference encoder can generate RLE blocks, by @bimbashrestha
    • perf: minor generic speed optimization, by @davidbolvansky
    • api: new ability to extract sequences from the parser for analysis, by @bimbashrestha
    • api: fixed decoding of magic-less frames, by @terrelln
    • api: fixed ZSTD_initCStream_advanced() performance with fast modes, reported by @QrczakMK
    • cli: Named pipes support, by @bimbashrestha
    • cli: short tar's extension support, by @stokito
    • cli: command --output-dir-flat=DIR, generates target files into the requested directory, by @senhuang42
    • cli: commands --stream-size=# and --size-hint=#, by @nmagerko
    • cli: command --exclude-compressed, by @shashank0791
    • cli: faster -t test mode
    • cli: improved some error messages, by @vangyzen
    • cli: fix command -D dictionary on Windows
    • cli: fix rare deadlock condition within dictionary builder, by @terrelln
    • build: single-file decoder with emscripten compilation script, by @cwoffenden
    • build: fixed zlibWrapper compilation on Visual Studio, reported by @bluenlive
    • build: fixed deprecation warning for certain gcc version, reported by @jasonma163
    • build: fix compilation on old gcc versions, by @cemeyer
    • build: improved installation directories for cmake script, by Dmitri Shubin
    • pack: modified pkgconfig, for better integration into openwrt, requested by @neheb
    • misc: Improved documentation : ZSTD_CLEVEL, DYNAMIC_BMI2, ZSTD_CDict, function deprecation, zstd format
    • misc: fixed educational decoder : accept larger literals section, and removed UNALIGNED() macro
    Source code(tar.gz)
    Source code(zip)
    zstd-1.4.4.tar.gz(1.85 MB)
    zstd-1.4.4.tar.gz.sha256(84 bytes)
    zstd-1.4.4.tar.zst(1.33 MB)
    zstd-1.4.4.tar.zst.sha256(85 bytes)
    zstd-v1.4.4-win32.zip(1.25 MB)
    zstd-v1.4.4-win64.zip(1.35 MB)
  • v1.4.3(Aug 19, 2019)

    Dictionary Compression Regression

    We discovered an issue in the v1.4.2 release, which can degrade the effectiveness of dictionary compression. This release fixes that issue.

    Detailed Changes

    • bug: Fix Dictionary Compression Ratio Regression by @cyan4973 (#1709)
    • bug: Fix Buffer Overflow in v0.3 Decompression by @felixhandte (#1722)
    • build: Add support for IAR C/C++ Compiler for Arm by @joseph0918 (#1705)
    • misc: Add NULL pointer check in util.c by @leeyoung624 (#1706)
    Source code(tar.gz)
    Source code(zip)
    zstd-1.4.3.tar.gz(1.81 MB)
    zstd-1.4.3.tar.gz.sha256(84 bytes)
    zstd-1.4.3.tar.zst(1.30 MB)
    zstd-1.4.3.tar.zst.sha256(85 bytes)
    zstd-v1.4.3-win32.zip(1.28 MB)
    zstd-v1.4.3-win64.zip(1.35 MB)
  • v1.4.2(Jul 25, 2019)

    Legacy Decompression Fix

    This release is a small one, that corrects an issue discovered in the previous release. Zstandard v1.4.1 included a bug in decompressing v0.5 legacy frames, which is fixed in v1.4.2.

    Detailed Changes

    • bug: Fix bug in zstd-0.5 decoder by @terrelln (#1696)
    • bug: Fix seekable decompression in-memory API by @iburinoc (#1695)
    • bug: Close minor memory leak in CLI by @LeeYoung624 (#1701)
    • misc: Validate blocks are smaller than size limit by @vivekmig (#1685)
    • misc: Restructure source files by @ephiepark (#1679)
    Source code(tar.gz)
    Source code(zip)
    zstd-1.4.2.tar.gz(1.80 MB)
    zstd-1.4.2.tar.gz.sha256(84 bytes)
    zstd-1.4.2.tar.zst(1.30 MB)
    zstd-1.4.2.tar.zst.sha256(85 bytes)
    zstd-v1.4.2-win32.zip(1.27 MB)
    zstd-v1.4.2-win64.zip(1.34 MB)
  • v1.4.1(Jul 19, 2019)

    Maintenance

    This release is primarily a maintenance release.

    It includes a few bug fixes, including a fix for a rare data corruption bug, which could only be triggered in a niche use case, when doing all of the following: using multithreading mode, with an overlap size >= 512 MB, using a strategy >= ZSTD_btlazy, and compressing more than 4 GB. None of the default compression levels meet these requirements (not even --ultra ones).

    Performance

    This release also includes some performance improvements, among which the primary improvement is that Zstd decompression is ~7% faster, thanks to @mgrice.

    See this comparison of decompression speeds at different compression levels, measured on the Silesia Corpus, on an Intel i9-9900K with GCC 9.1.0.

    | Level | v1.4.0 | v1.4.1 | Delta |
    | ---: | :---: | :---: | ---: |
    | 1 | 1390 MB/s | 1453 MB/s | +4.5% |
    | 3 | 1208 MB/s | 1301 MB/s | +7.6% |
    | 5 | 1129 MB/s | 1233 MB/s | +9.2% |
    | 7 | 1224 MB/s | 1347 MB/s | +10.0% |
    | 16 | 1278 MB/s | 1430 MB/s | +11.8% |

    Detailed list of changes

    • bug: Fix data corruption in niche use cases (huge inputs + multithreading + large custom window sizes + other conditions) by @terrelln (#1659)
    • bug: Fuzz legacy modes, fix uncovered bugs by @terrelln (#1593, #1594, #1595)
    • bug: Fix out of bounds read by @terrelln (#1590)
    • perf: Improved decoding speed by ~7%, by @mgrice (#1668)
    • perf: Large compression ratio improvement for small windowLog by @cyan4973 (#1624)
    • perf: Faster compression speed in high compression mode for repetitive data by @terrelln (#1635)
    • perf: Slightly improved compression ratio of level 3 and 4 (ZSTD_dfast) by @cyan4973 (#1681)
    • perf: Slightly faster compression speed when re-using a context by @cyan4973 (#1658)
    • api: Add parameter to generate smaller dictionaries by @tyler-tran (#1656)
    • cli: Recognize symlinks when built in C99 mode by @felixhandte (#1640)
    • cli: Expose cpu load indicator for each file on -vv mode by @ephiepark (#1631)
    • cli: Restrict read permissions on destination files by @chungy (#1644)
    • cli: zstdgrep: handle -f flag by @felixhandte (#1618)
    • cli: zstdcat: follow symlinks by @vejnar (#1604)
    • doc: Remove extra size limit on compressed blocks by @felixhandte (#1689)
    • doc: Improve documentation on streaming buffer sizes by @cyan4973 (#1629)
    • build: CMake: support building with LZ4 @leeyoung624 (#1626)
    • build: CMake: install zstdless and zstdgrep by @leeyoung624 (#1647)
    • build: CMake: respect existing uninstall target by @j301scott (#1619)
    • build: Make: skip multithread tests when built without support by @michaelforney (#1620)
    • build: Make: Fix examples/ test target by @sjnam (#1603)
    • build: Meson: rename options out of deprecated namespace by @lzutao (#1665)
    • build: Meson: fix build by @lzutao (#1602)
    • build: Visual Studio: don't export symbols in static lib by @scharan (#1650)
    • build: Visual Studio: fix linking by @absotively (#1639)
    • build: Fix MinGW-W64 build by @myzhang1029 (#1600)
    • misc: Expand decodecorpus coverage by @ephiepark (#1664)
    Source code(tar.gz)
    Source code(zip)
    zstd-1.4.1.tar.gz(1.80 MB)
    zstd-1.4.1.tar.gz.sha256(84 bytes)
    zstd-1.4.1.tar.zst(1.30 MB)
    zstd-1.4.1.tar.zst.sha256(85 bytes)
    zstd-v1.4.1-win32.zip(1.26 MB)
    zstd-v1.4.1-win64.zip(1.33 MB)
  • v1.4.0(Apr 16, 2019)

    Advanced API

    The main focus of the v1.4.0 release is the stabilization of the advanced API.

    The advanced API provides a way to set specific parameters during compression and decompression in an API and ABI compatible way. For example, it allows you to compress with multiple threads, enable --long mode, set frame parameters, and load dictionaries. It is compatible with ZSTD_compressStream*() and ZSTD_compress2(). There is also an advanced decompression API that allows you to set parameters like maximum memory usage, and load dictionaries. It is compatible with the existing decompression functions ZSTD_decompressStream() and ZSTD_decompressDCtx().

    The old streaming functions are all compatible with the new API, and the documentation provides the equivalent function calls in the new API. For example, see ZSTD_initCStream(). The stable functions will remain supported, but the functions in the experimental sections, like ZSTD_initCStream_usingDict(), will eventually be marked as deprecated and removed in favor of the new advanced API.

    The examples have all been updated to use the new advanced API. If you have questions about how to use the new API, please refer to the examples, and if they are unanswered, please open an issue.

    Performance

    Zstd's fastest compression level just got faster! Thanks to ideas from Intel's igzip and @gbtucker, we've made level 1, zstd's fastest strategy, 6-8% faster in most scenarios. For example, on the Silesia Corpus with level 1, we see 0.2% better compression compared to zstd-1.3.8, along with these performance figures, measured on an Intel i9-9900K:

    | Version | C. Speed | D. Speed |
    | -- | -- | -- |
    | 1.3.8 gcc-8 | 489 MB/s | 1343 MB/s |
    | 1.4.0 gcc-8 | 532 MB/s (+8%) | 1346 MB/s |
    | 1.3.8 clang-8 | 488 MB/s | 1188 MB/s |
    | 1.4.0 clang-8 | 528 MB/s (+8%) | 1216 MB/s |

    New Features

    A new experimental function ZSTD_decompressBound() has been added by @shakeelrao. It is useful when decompressing zstd data in a single shot that may or may not have the decompressed size written into the frame. It is exact when the decompressed size is written into the frame, and a tight upper bound within 128 KB otherwise, as long as ZSTD_e_flush and ZSTD_flushStream() aren't used. When ZSTD_e_flush is used, in the worst case the bound can be very large, but this isn't a common scenario.

    The parameter ZSTD_c_literalCompressionMode and the CLI flag --[no-]compress-literals allow users to explicitly enable and disable literal compression. By default literals are compressed with positive compression levels, and left uncompressed for negative compression levels. Disabling literal compression boosts compression and decompression speed, at the cost of compression ratio.

    Detailed list of changes

    • perf: Improve level 1 compression speed in most scenarios by 6% by @gbtucker and @terrelln
    • api: Move the advanced API, including all functions in the staging section, to the stable section
    • api: Make ZSTD_e_flush and ZSTD_e_end block for maximum forward progress
    • api: Rename ZSTD_CCtxParam_getParameter to ZSTD_CCtxParams_getParameter
    • api: Rename ZSTD_CCtxParam_setParameter to ZSTD_CCtxParams_setParameter
    • api: Don't export ZSTDMT functions from the shared library by default
    • api: Require ZSTD_MULTITHREAD to be defined to use ZSTDMT
    • api: Add ZSTD_decompressBound() to provide an upper bound on decompressed size by @shakeelrao
    • api: Fix ZSTD_decompressDCtx() corner cases with a dictionary
    • api: Move ZSTD_getDictID_*() functions to the stable section
    • api: Add ZSTD_c_literalCompressionMode flag to enable or disable literal compression by @terrelln
    • api: Allow compression parameters to be set when a dictionary is used
    • api: Allow setting parameters before or after ZSTD_CCtx_loadDictionary() is called
    • api: Fix ZSTD_estimateCStreamSize_usingCCtxParams()
    • api: Setting ZSTD_d_maxWindowLog to 0 means use the default
    • cli: Ensure that a dictionary is not used to compress itself by @shakeelrao
    • cli: Add --[no-]compress-literals flag to enable or disable literal compression
    • doc: Update the examples to use the advanced API
    • doc: Explain how to transition from old streaming functions to the advanced API in the header
    • build: Improve the Windows release packages
    • build: Improve CMake build by @hjmjohnson
    • build: Build fixes for FreeBSD by @lwhsu
    • build: Remove redundant warnings by @thatsafunnyname
    • build: Fix tests on OpenBSD by @bket
    • build: Extend fuzzer build system to work with the new clang engine
    • build: CMake now creates the libzstd.so.1 symlink
    • build: Improve Meson build by @lzutao
    • misc: Fix symbolic link detection on FreeBSD
    • misc: Use physical core count for -T0 on FreeBSD by @cemeyer
    • misc: Fix zstd --list on truncated files by @kostmo
    • misc: Improve logging in debug mode by @felixhandte
    • misc: Add CirrusCI tests by @lwhsu
    • misc: Optimize dictionary memory usage in corner cases
    • misc: Improve the dictionary builder on small or homogeneous data
    • misc: Fix spelling across the repo by @jsoref
    Source code(tar.gz)
    Source code(zip)
    zstd-1.4.0.tar.gz(1.79 MB)
    zstd-1.4.0.tar.gz.sha256(84 bytes)
    zstd-1.4.0.tar.zst(1.29 MB)
    zstd-1.4.0.tar.zst.sha256(85 bytes)
    zstd-v1.4.0-win32.zip(1.19 MB)
    zstd-v1.4.0-win64.zip(1.28 MB)
  • v1.3.8(Dec 27, 2018)

    Advanced API

    The main focus of v1.3.8 is the stabilization of the advanced API.

    This API has been in the making for more than a year, and makes it possible to trigger advanced features, such as multithreading, --long mode, or detailed frame parameters, in a straightforward and extensible manner. Some examples are provided in this blog entry. To make this vision possible, the advanced API relies on sticky parameters, which can be stacked on top of each other in any order. This makes it possible to introduce new features in the future without breaking the API or the ABI.

    This API has provided a good experience in our infrastructure, and we hope it will prove easy to use and efficient in your applications. Nonetheless, before being branded "stable", this proposal must spend a last round in the "staging area", in order to generate comments and feedback from new users. It's planned to be labelled "stable" in v1.4.0, which is expected to be the next release, depending on received feedback.

    The experimental section still contains a lot of prototypes which are largely redundant with the new advanced API. Expect them to become deprecated, and then dropped in a future release. Transitioning to the newer advanced API is therefore highly recommended.

    Performance

    Decoding speed has been improved again, primarily in some specific scenarios: frames using large window sizes (--ultra or --long), and cold dictionaries. Cold dictionaries are expected to become more important in the near future, as solutions relying on thousands of dictionaries simultaneously get deployed.

    The higher compression levels get a slight compression ratio boost, mostly visible for small (<256 KB) and large (>32 MB) data streams. This change benefits asymmetric scenarios (compress once, decompress many times), typically targeting level 19.

    New features

    A noticeable addition: @terrelln introduces the --rsyncable mode to zstd. Similar to gzip --rsyncable, it generates a compressed frame which is friendly to rsync in case of limited changes: a difference in the input data will only impact a small, localized amount of compressed data, instead of everything from that position onward due to cascading effects. This is useful for very large archives regularly updated and synchronized over long-distance connections (compressed mailboxes come to mind, for example).

    The method used by zstd preserves the compression ratio very well, introducing only tiny losses due to synchronization points, meaning it's no longer a sacrifice to use --rsyncable. Here is an example on silesia.tar, at the default compression level:

    | compressor | normal | --rsyncable | Ratio diff. | time |
    | --- | --- | --- | --- | --- |
    | gzip | 68235456 | 68778265 | -0.795% | 7.92s |
    | zstd | 66829650 | 66846769 | -0.026% | 1.17s |

    Speaking of compression levels: it's now possible to use the environment variable ZSTD_CLEVEL to influence the default compression level. This can prove useful in situations where it's not possible to provide command line parameters, typically when zstd is invoked "under the hood" by some calling process.
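    For example (a sketch assuming zstd >= 1.3.8 on the PATH):

```shell
# Set a default level for every zstd invocation in this environment.
export ZSTD_CLEVEL=19
printf 'hello zstd\n' > sample.txt
zstd -f sample.txt          # compresses at level 19, no flag needed
zstd -f -3 sample.txt       # an explicit command-line level still wins
zstd -t sample.txt.zst      # the frame is valid either way
```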

    Lastly, anyone interested in embedding a small zstd decoder into a space-constrained application will be interested in a new set of build macros introduced by @felixhandte, which make it possible to selectively turn off decoder features to reduce binary size even further. Final binary size will of course vary depending on target and compiler, but in preliminary testing on x64, the macros helped reduce the decoder size by a factor of 3 (from ~64 KB to ~20 KB).

    Detailed list of changes

    • perf: better decompression speed on large files (+7%) and cold dictionaries (+15%)
    • perf: slightly better compression ratio at high compression modes
    • api : finalized advanced API, last stage before "stable" status
    • api : new --rsyncable mode, by @terrelln
    • api : support decompression of empty frames into NULL (used to be an error) (#1385)
    • build: new set of build macros to generate a minimal size decoder, by @felixhandte
    • build: fix compilation on MIPS32, reported by @clbr (#1441)
    • build: fix compilation with multiple -arch flags, by @ryandesign
    • build: highly upgraded meson build, by @lzutao
    • build: improved buck support, by @obelisk
    • build: fix cmake script : can create debug build, by @pitrou
    • build: Makefile : grep works on both colored consoles and systems without color support
    • build: fixed zstd-pgo target, by @bmwiedemann
    • cli : support ZSTD_CLEVEL environment variable, by @yijinfb (#1423)
    • cli : --no-progress flag, preserving final summary (#1371), by @terrelln
    • cli : ensure destination file is not source file (#1422)
    • cli : clearer error messages, notably when input file not present
    • doc : clarified zstd_compression_format.md, by @ulikunitz
    • misc: fixed zstdgrep, returns 1 on failure, by @lzutao
    • misc: NEWS renamed as CHANGELOG, in accordance with fb.oss policy
    Source code(tar.gz)
    Source code(zip)
    zstd-1.3.8.tar.gz(1.77 MB)
    zstd-1.3.8.tar.gz.sha256(84 bytes)
    zstd-1.3.8.tar.sha256(81 bytes)
    zstd-1.3.8.tar.zst(1.28 MB)
    zstd-1.3.8.tar.zst.sha256(85 bytes)
    zstd-v1.3.8-win32.zip(1.19 MB)
    zstd-v1.3.8-win64.zip(1.27 MB)
  • v1.3.7(Oct 19, 2018)

    This is a minor fix release building upon v1.3.6.

    The main reason we publish this new version is that @indygreg detected an important compression ratio regression in a specific scenario (compressing with a dictionary at level 9 or 10 for small data, or 11-12 for large data). We don't anticipate this scenario to be common: dictionary compression is still rare, most users prefer fast modes (levels <= 3), and a few rare ones use strong modes (levels 15-19), so this "middle compression" range is an extreme rarity. But just in case some users rely on it, we publish this release.

    A few other minor things were ongoing and are therefore bundled.

    Decompression speed might be slightly better with clang, depending on exact target and version. We could observe as much as 7% speed gains in some cases, though in other cases it's rather in the ~2% range.

    The integrated backtrace functionality in the CLI has been updated: its presence can be more easily controlled via the BACKTRACE build macro. The automatic detector is more restrictive, and release mode now builds without it by default. We want to be sure the default make compiles without any issue on most platforms.

    Finally, the list of man pages has been completed with documentation for zstdless and zstdgrep, by @samrussell.

    Detailed list of changes

    • perf: slightly better decompression speed on clang (depending on hardware target)
    • fix : ratio for dictionary compression at levels 9 and 10, reported by @indygreg
    • build: no longer build backtrace by default in release mode; restrict further automatic mode
    • build: control backtrace support through build macro BACKTRACE
    • misc: added man pages for zstdless and zstdgrep, by @samrussell
    Source code(tar.gz)
    Source code(zip)
    zstd-1.3.7.tar.gz(1.72 MB)
    zstd-1.3.7.tar.gz.sha256(84 bytes)
    zstd-1.3.7.tar.sha256(81 bytes)
    zstd-1.3.7.tar.zst(1.23 MB)
    zstd-1.3.7.tar.zst.sha256(85 bytes)
    zstd-v1.3.7-win32.zip(1.17 MB)
    zstd-v1.3.7-win64.zip(1.25 MB)
  • v1.3.6(Oct 5, 2018)

    The Zstandard v1.3.6 release is focused on intensive dictionary compression for database scenarios.

    This is a new environment we are experimenting with. The success of dictionary compression on small data, which databases tend to store in abundance, has led to increased adoption, and we now see scenarios where literally thousands of dictionaries are used simultaneously, with permanent generation or update of new dictionaries.

    To face these new conditions, v1.3.6 brings a few improvements to the table :

    • A brand new, faster dictionary builder, by @jenniferliu, under guidance from @terrelln. The new builder, named fastcover, is about 10x faster than our previous default generator, cover, while suffering only negligible accuracy losses (<1%). It's effectively an approximate version of cover, which trades some accuracy for speed and memory. The new dictionary builder is so effective that it has become our new default dictionary builder (--train). The slower but higher quality generator remains accessible using the --train-cover command.

    Here is an example, using the "github user records" public dataset (about 10K records of about 1K each) :

    | builder algorithm | generation time | compression ratio |
    | --- | --- | --- |
    | fast cover (v1.3.6 --train) | 0.9 s | x10.29 |
    | cover (v1.3.5 --train) | 10.1 s | x10.31 |
    | high accuracy fast cover (--train-fastcover) | 6.6 s | x10.65 |
    | high accuracy cover (--train-cover) | 50.5 s | x10.66 |

    • Faster dictionary decompression under memory pressure, when using thousands of dictionaries simultaneously. The new decoder is able to detect cold vs hot dictionary scenarios, and adds clever prefetching decisions to minimize memory latency. It typically improves decoding speed by ~+30% (vs v1.3.5).

    • Faster dictionary compression under memory pressure, when using a lot of contexts simultaneously. The new design, by @felixhandte, considerably reduces memory usage when compressing small data with dictionaries, which is the main scenario found in databases. The sharp memory usage reduction makes it easier for CPU caches to manage multiple contexts in parallel. Speed gains scale with the number of active contexts, as shown in the graph below:
      Dictionary compression : Speed vs Nb Active Contexts

      Note that, in real-life environments, the benefits appear even sooner, since CPU caches tend to be shared by multiple other processes/threads at the same time, instead of being monopolized by a single synthetic benchmark.

    Other noticeable improvements

    A new command, --adapt, makes it possible to pipe gigantic amounts of data between servers (typically for backup scenarios) and lets the compressor automatically adjust the compression level based on perceived network conditions. When the network becomes slower, zstd uses the available time to compress more, and accelerates again when bandwidth permits. This reduces the need to "pre-calibrate" speed and compression level, and is a welcome simplification for system administrators. It also yields gains in both dimensions (better compression ratio and better speed) compared to the traditional "fixed" compression level strategy. These are still early days for this feature, and we are eager to get feedback on its usage. We know it works better in fast bandwidth environments, for example, since adaptation itself becomes slow when bandwidth is slow; this is something that will need to be improved. Nonetheless, in its current incarnation, --adapt already proves useful for several datacenter scenarios, which is why we are releasing it.
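    A local sketch of the idea (assuming zstd >= 1.3.6 built with multithreading; a real deployment would pipe into ssh or nc towards a remote host instead of a local decompressor):

```shell
# Adaptive compression while piping between processes.
head -c 1000000 /dev/zero > payload.bin
zstd --adapt -T0 -c payload.bin | zstd -d -c > restored.bin
cmp payload.bin restored.bin && echo "roundtrip ok"
```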

    Advanced users will be pleased by the expansion of an existing tool, tests/paramgrill, which has been refined by @georgelu. This tool explores the space of advanced compression parameters to find the best possible set for a given scenario. It takes as input a set of samples and a set of constraints, and works its way towards better and better compression parameters respecting the constraints.

    Example :

    ./paramgrill --optimize=cSpeed=50M dirToSamples/*   # requires minimum compression speed of 50 MB/s
    optimizing for dirToSamples/* - limit compression speed 50 MB/s
    
    (...)
    
    /*   Level  5   */       { 20, 18, 18,  2,  5,  2,ZSTD_greedy  ,  0 },     /* R:3.147 at  75.7 MB/s - 567.5 MB/s */   # best level satisfying constraint
    --zstd=windowLog=20,chainLog=18,hashLog=18,searchLog=2,searchLength=5,targetLength=2,strategy=3,forceAttachDict=0
    
    (...)
    
    /* Custom Level */       { 21, 16, 18,  2,  6,  0,ZSTD_lazy2   ,  0 },     /* R:3.240 at  53.1 MB/s - 661.1 MB/s */  # best custom parameters found
    --zstd=windowLog=21,chainLog=16,hashLog=18,searchLog=2,searchLength=6,targetLength=0,strategy=5,forceAttachDict=0   # associated command arguments, can be copy/pasted for `zstd`
    

    Finally, documentation has been updated, to reflect wording adopted by IETF RFC 8478 (Zstandard Compression and the application/zstd Media Type).

    Detailed changes list

    • perf: much faster dictionary builder, by @jenniferliu
    • perf: faster dictionary compression on small data when using multiple contexts, by @felixhandte
    • perf: faster dictionary decompression when using a very large number of dictionaries simultaneously
    • cli : fix : no longer overwrites destination when source does not exist (#1082)
    • cli : new command --adapt, for automatic compression level adaptation
    • api : fix : block api can be streamed with > 4 GB, reported by @catid
    • api : reduced ZSTD_DDict size by 2 KB
    • api : minimum negative compression level is defined, and can be queried using ZSTD_minCLevel() (#1312).
    • build: support Haiku target, by @korli
    • build: Read Legacy support is now limited to v0.5+ by default. Can be changed at compile time with macro ZSTD_LEGACY_SUPPORT.
    • doc : zstd_compression_format.md updated to match wording in IETF RFC 8478
    • misc: tests/paramgrill, a parameter optimizer, by @GeorgeLu97
    Source code(tar.gz)
    Source code(zip)
    zstd-1.3.6.tar.gz(1.72 MB)
    zstd-1.3.6.tar.gz.sha256(84 bytes)
    zstd-v1.3.6-win32.zip(1.17 MB)
    zstd-v1.3.6-win64.zip(1.25 MB)
  • v1.3.5(Jun 28, 2018)

    Zstandard v1.3.5 is a maintenance release focused on dictionary compression performance.

    Compression is generally associated with the act of willingly requesting the compression of some large source. However, within datacenters, compression brings its best benefits when applied transparently. In such scenarios, it's actually very common to compress a large number of very small blobs (individual messages in a stream or log, or records in a cache or datastore, etc.). Dictionary compression is a great tool for these use cases.

    This release makes dictionary compression significantly faster for these situations, when compressing small to very small data (inputs up to ~16 KB).

    Dictionary compression : speed vs input size

    The above image plots the compression speeds at different input sizes for zstd v1.3.4 (red) and v1.3.5 (green), at levels 1, 3, 9, and 18. The benchmark data was gathered on an Intel Xeon CPU E5-2680 v4 @ 2.40GHz. The benchmark was compiled with clang-7.0, with the flags -O3 -march=native -mtune=native -DNDEBUG. The file used in the results shown here is the osdb file from the Silesia corpus, cut into small blocks. It was selected because it performed roughly in the middle of the pack among the Silesia files.

    The new version saves substantial initialization time, which becomes increasingly important as the average size to compress gets smaller. The impact is even more perceptible at higher levels, where initialization costs are higher. For larger inputs, performance remains similar.

    Users can expect to measure substantial speed improvements for inputs smaller than 8 KB, and up to 32 KB depending on the context. The expected speed-up ranges from none (large, incompressible blobs) to many times faster (small, highly compressible inputs). Real world examples up to 15x have been observed.

    Other noticeable improvements

    The compression levels have been slightly adjusted, taking into consideration the higher top speed of level 1 since v1.3.4, and making level 19 a substantially stronger compression level while preserving the 8 MB window size limit, hence keeping an acceptable memory budget for decompression.

    It's also possible to select the content of libzstd by modifying macro values at compilation time. By default, libzstd contains everything, but its size can be made substantially smaller by removing support for the dictionary builder, or legacy formats, or deprecated functions. It's even possible to build a compression-only or a decompression-only library.

    Detailed changes list

    • perf: much faster dictionary compression, by @felixhandte
    • perf: small quality improvement for dictionary generation, by @terrelln
    • perf: improved high compression levels (notably level 19)
    • mem : automatic memory release for long duration contexts
    • cli : fix : overlapLog can be manually set
    • cli : fix : decoding invalid lz4 frames
    • api : fix : performance degradation for dictionary compression when using advanced API, by @terrelln
    • api : change : clarify ZSTD_CCtx_reset() vs ZSTD_CCtx_resetParameters(), by @terrelln
    • build: select custom libzstd scope through control macros, by @GeorgeLu97
    • build: OpenBSD support, by @bket
    • build: make and make all are compatible with -j
    • doc : clarify zstd_compression_format.md, updated for IETF RFC process
    • misc: pzstd compatible with reproducible compilation, by @lamby

    Known bug

    zstd --list does not work with a non-interactive tty. This issue is fixed in the dev branch.

    Source code(tar.gz)
    Source code(zip)
    zstd-src.tar.zst(1.15 MB)
    zstd-src.tar.zst.sha256.sig(189 bytes)
    zstd-v1.3.5-win32.zip(1.18 MB)
    zstd-v1.3.5-win64.zip(1.25 MB)
  • v1.3.4(Mar 26, 2018)

    The v1.3.4 release of Zstandard is focused on performance, and offers a nice speed boost in most scenarios.

    Asynchronous compression by default for zstd CLI

    The zstd CLI now performs compression in parallel with I/O operations by default. This requires multi-threading capability (which is also enabled by default). It may not sound like much, but it effectively improves throughput by 20-30%, depending on compression level and underlying I/O performance.

    For example, on a Mac OS-X laptop with an Intel Core i7-5557U CPU @ 3.10GHz, running time zstd enwik9 at the default compression level (2) on an SSD gives the following:

    | Version | real time |
    | --- | --- |
    | 1.3.3 | 9.2s |
    | 1.3.4 --single-thread | 8.8s |
    | 1.3.4 (asynchronous) | 7.5s |

    This is a nice boost for all scripts using the zstd CLI, typically in network or storage tasks. The effect is even more pronounced at faster compression settings, since the CLI overlaps a proportionally higher share of compression with I/O.

    The previous default behavior (blocking single thread) is still available, accessible through the --single-thread long command. It's also the only mode available when no multi-threading capability is detected.

    General speed improvements

    Some core routines have been refined to provide more speed on newer CPUs, making better use of their out-of-order execution units. This is more noticeable on the decompression side, and even more so with the gcc compiler.

    Example on the same platform, running in-memory benchmark zstd -b1 silesia.tar :

    | Version | C.Speed | D.Speed |
    | --- | ---- | --- |
    | 1.3.3 llvm9 | 290 MB/s | 660 MB/s |
    | 1.3.4 llvm9 | 304 MB/s | 700 MB/s (+6%) |
    | 1.3.3 gcc7 | 280 MB/s | 710 MB/s |
    | 1.3.4 gcc7 | 300 MB/s | 890 MB/s (+25%) |

    Faster compression levels

    So far, compression level 1 has been the fastest one available. Starting with v1.3.4, there are additional choices: faster compression levels can be invoked using negative values. On the command line, the equivalent modes can be triggered using the --fast[=#] command.

    Negative compression levels sample data more sparsely, and disable Huffman compression of literals, translating into faster decoding speed.

    It's possible to create one's own custom fast compression level by using strategy ZSTD_fast, increasing ZSTD_p_targetLength to the desired value, and turning literals compression on or off using ZSTD_p_compressLiterals.

    Performance is generally on par with or better than other high speed algorithms. In the benchmark below (compressing silesia.tar on an Intel Core i7-6700K CPU @ 4.00GHz), zstd at --fast=2 ends up faster and stronger on all metrics than quicklz and snappy. It also compares favorably to lzo at --fast=3. lz4 still offers a better speed/compression combo, with zstd --fast=4 coming close.

    | name | ratio | compression | decompression |
    | -- | -- | -- | -- |
    | zstd 1.3.4 --fast=5 | 1.996 | 770 MB/s | 2060 MB/s |
    | lz4 1.8.1 | 2.101 | 750 MB/s | 3700 MB/s |
    | zstd 1.3.4 --fast=4 | 2.068 | 720 MB/s | 2000 MB/s |
    | zstd 1.3.4 --fast=3 | 2.153 | 675 MB/s | 1930 MB/s |
    | lzo1x 2.09 -1 | 2.108 | 640 MB/s | 810 MB/s |
    | zstd 1.3.4 --fast=2 | 2.265 | 610 MB/s | 1830 MB/s |
    | quicklz 1.5.0 -1 | 2.238 | 540 MB/s | 720 MB/s |
    | snappy 1.1.4 | 2.091 | 530 MB/s | 1820 MB/s |
    | zstd 1.3.4 --fast=1 | 2.431 | 530 MB/s | 1770 MB/s |
    | zstd 1.3.4 -1 | 2.877 | 470 MB/s | 1380 MB/s |
    | brotli 1.0.2 -0 | 2.701 | 410 MB/s | 430 MB/s |
    | lzf 3.6 -1 | 2.077 | 400 MB/s | 860 MB/s |
    | zlib 1.2.11 -1 | 2.743 | 110 MB/s | 400 MB/s |

    Applications which were considering Zstandard but were worried about being CPU-bound are now able to shift the load from CPU to bandwidth on a larger scale, and may even temporarily vary their choice depending on local conditions (to deal with a sudden workload surge, for example).

    Long Range Mode with Multi-threading

    zstd-1.3.2 introduced the long range mode, capable of deduplicating long-distance redundancies in a large data stream, a situation typical of backup scenarios, for example. But its usage in association with multi-threading was discouraged, due to inefficient use of memory. zstd-1.3.4 solves this issue by making the long range match finder run in serial mode, like a pre-processor, before passing its result to backend compressors (regular zstd). Memory usage is now bounded to the maximum of the long range window size and the memory that zstdmt would require without long range matching. As the long range mode runs at about 200 MB/s, depending on the number of cores available, it's possible to tune the compression level to match the LRM speed, which becomes the upper limit.

    zstd -T0  -5  --long    file # autodetect threads, level 5, 128 MB window
    zstd -T16 -10 --long=31 file # 16 threads, level 10, 2 GB window
    

    As illustration, benchmarks of the two files "Linux 4.7 - 4.12" and "Linux git" from the 1.3.2 release are shown below. All compressors are run with 16 threads, except "zstd single 2 GB". zstd compressors are run with either a 128 MB or 2 GB window size, and lrzip compressor is run with lzo, gzip, and xz backends. The benchmarks were run on a 16 core Sandy Bridge @ 2.2 GHz.

    (Figures: compression ratio vs speed, for "Linux 4.7 - 4.12" and "Linux git")

    The association of Long Range Mode with multi-threading is pretty compelling for large stream scenarios.

    Miscellaneous

    This release also brings its usual list of small improvements and bug fixes, as detailed below :

    • perf: faster speed (especially decoding speed) on recent cpus (haswell+)
    • perf: much better performance associating --long with multi-threading, by @terrelln
    • perf: better compression at levels 13-15
    • cli : asynchronous compression by default, for faster experience (use --single-thread for former behavior)
    • cli : smoother status report in multi-threading mode
    • cli : added command --fast=#, for faster compression modes
    • cli : fix crash when not overwriting existing files, by Pádraig Brady (@pixelb)
    • api : nbThreads becomes nbWorkers : 1 triggers asynchronous mode
    • api : compression levels can be negative, for even more speed
    • api : ZSTD_getFrameProgression() : get precise progress status of ZSTDMT anytime
    • api : ZSTDMT can accept new compression parameters during compression
    • api : implemented all advanced dictionary decompression prototypes
    • build: improved meson recipe, by Shawn Landden (@shawnl)
    • build: VS2017 scripts, by @HaydnTrigg
    • misc: all /contrib projects fixed
    • misc: added /contrib/docker script by @gyscos
    Source code(tar.gz)
    Source code(zip)
    zstd-src.tar.zst(1.48 MB)
    zstd-src.tar.zst.sha256.sig(189 bytes)
    zstd-v1.3.4-win32.zip(1.07 MB)
    zstd-v1.3.4-win64.zip(1.16 MB)
  • v1.3.3(Dec 21, 2017)

    This is a bugfix release, mostly focused on cleaning up several detrimental corner-case scenarios. It is nonetheless a recommended upgrade.

    Changes Summary

    • perf: improved zstd_opt strategy (levels 16-19)
    • fix : bug #944 : multithreading with shared dictionary and large data, reported by @gsliepen
    • cli : change : -o can be combined with multiple inputs, by @terrelln
    • cli : fix : content size written in header by default
    • cli : fix : improved LZ4 format support, by @felixhandte
    • cli : new : hidden command -b -S, to benchmark multiple files and generate one result per file
    • api : change : when setting pledgedSrcSize, use ZSTD_CONTENTSIZE_UNKNOWN macro value to mean "unknown"
    • api : fix : support large skippable frames, by @terrelln
    • api : fix : re-using context could result in suboptimal block size in some corner case scenarios
    • api : fix : streaming interface was adding a useless 3-bytes null block to small frames
    • build: fix : compilation under rhel6 and centos6, reported by @pixelb
    • build: added check target
    • build: improved meson support, by @shawnl
    Source code(tar.gz)
    Source code(zip)
    zstd-v1.3.3-win32.zip(1.04 MB)
    zstd-v1.3.3-win64.zip(1.11 MB)
  • v1.3.2(Oct 9, 2017)

    Zstandard Long Range Match Finder

    Zstandard has a new long range match finder written by Facebook's intern Stella Lau (@stellamplau), which specializes in finding long matches in the distant past. It integrates seamlessly with the regular compressor, and the output can be decompressed just like any other Zstandard compressed data.

    The long range match finder adds minimal overhead to the compressor, works with any compression level, and maintains Zstandard's blazingly fast decompression speed. However, since the window size is larger, it requires more memory for compression and decompression.

    To go along with the long range match finder, we've increased the maximum window size to 2 GB. The decompressor only accepts window sizes up to 128 MB by default, but zstd -d --memory=2GB will decompress window sizes up to 2 GB.

    Example usage

    # 128 MB window size
    zstd -1 --long file
    zstd -d file.zst
    
    # 2 GB window size (window log = 31)
    zstd -6 --long=31 file
    zstd -d --long=31 file.zst
    # OR
    zstd -d --memory=2GB file.zst
    
    #define ZSTD_STATIC_LINKING_ONLY   /* the ZSTD_p_* parameters are experimental API */
    #include <zstd.h>
    
    /* Compression : enable long distance matching, optionally widen the window */
    ZSTD_CCtx *cctx = ZSTD_createCCtx();
    ZSTD_CCtx_setParameter(cctx, ZSTD_p_compressionLevel, 19);
    ZSTD_CCtx_setParameter(cctx, ZSTD_p_enableLongDistanceMatching, 1); // Sets windowLog=27
    ZSTD_CCtx_setParameter(cctx, ZSTD_p_windowLog, 30); // Optionally increase the window log
    ZSTD_compress_generic(cctx, &out, &in, ZSTD_e_end);
    
    /* Decompression : raise the default 128 MB window limit to accept the larger window */
    ZSTD_DCtx *dctx = ZSTD_createDCtx();
    ZSTD_DCtx_setMaxWindowSize(dctx, 1 << 30);
    ZSTD_decompress_generic(dctx, &out, &in);
    

    Benchmarks

    We compared the zstd long range matcher to zstd and lrzip. The benchmarks were run on an AMD Ryzen 1800X (8 cores with 16 threads at 3.6 GHz).

    Compressors

    • zstd — The regular Zstandard compressor.
    • zstd 128 MB — The Zstandard compressor with a 128 MB window size.
    • zstd 2 GB — The Zstandard compressor with a 2 GB window size.
    • lrzip xz — The lrzip compressor with default options, which uses the xz backend at level 7 with 16 threads.
    • lrzip xz single — The lrzip compressor with a single-threaded xz backend at level 7.
    • lrzip zstd — The lrzip pre-processor with no backend compressor; its output is then compressed by zstd (not multithreaded).

    Files

    • Linux 4.7 - 4.12 — This file consists of the uncompressed tarballs of the six Linux kernel releases from 4.7 to 4.12, concatenated together in order. This file is extremely compressible if the compressor can match against the previous versions well.
    • Linux git — This file is a tarball of the linux repo, created by git clone https://github.com/torvalds/linux && tar -cf linux-git.tar linux/. This file gets a small benefit from long range matching, and shows how the long range matcher performs when there aren't many matches to find.

    Results

    Neither zstd nor zstd 128 MB has a large enough window size to compress Linux 4.7 - 4.12 well. zstd 2 GB compresses the fastest, and slightly better than lrzip-zstd. lrzip-xz compresses the best, and at a reasonable speed with multithreading enabled. Where zstd shines is decompression ease and speed: since its output is just regular Zstandard compressed data, it is decompressed by the highly optimized decompressor.

    The Linux git file shows that the long range matcher maintains good compression and decompression speed, even when there are far fewer long range matches. The decompression speed takes a small hit because it has to look further back to reconstruct the matches.

    (Figures: compression ratio vs speed and decompression speed, for both "Linux 4.7 - 4.12" and "Linux git")

    Implementation details

    The long distance match finder was inspired by great work from Con Kolivas' lrzip, which in turn was inspired by Andrew Tridgell's rzip. Also, let's mention Bulat Ziganshin's srep, which we have not been able to test unfortunately (site down), but the discussions on encode.ru proved great sources of inspiration.

    Therefore, many similar mechanisms are adopted, such as using a rolling hash, and filling a hash table divided into buckets of entries.

    That being said, we also made different choices, with the goal of favoring speed, as can be observed in the benchmarks. The rolling hash formula is selected for computing efficiency. There is a restrictive insertion policy, which only inserts candidates that respect a mask condition. The insertion policy allows us to skip the hash table in the common case that a match isn't present. Confirmation bits are saved, to only check for matches when there is a strong presumption of success. These and a few more details add up to make zstd's long range matcher a speed-oriented implementation.
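    As a rough illustration of the rolling hash and mask-based insertion policy, here is a toy sketch in C. It is not zstd's actual code (that lives in lib/compress/zstd_ldm.c), and the constants and function names are hypothetical; it only shows how a deterministic mask condition lets most positions skip the hash table entirely.

    ```c
    #include <stddef.h>
    #include <stdint.h>

    #define WINDOW       8            /* bytes hashed per position (hypothetical) */
    #define PRIME        0x9E3779B1u  /* hypothetical multiplier */
    #define INSERT_MASK  0x3Fu        /* keep roughly 1 in 64 positions */

    /* Hash the WINDOW bytes at p. A real rolling hash would update this
     * incrementally as the window slides, rather than recomputing it. */
    static uint32_t hash_window(const uint8_t *p)
    {
        uint32_t h = 0;
        for (size_t i = 0; i < WINDOW; i++)
            h = h * PRIME + p[i];
        return h;
    }

    /* Restrictive insertion policy: only positions whose hash satisfies
     * the mask condition become candidates, so the hash table stays small
     * and is skipped in the common case where no match is present. */
    static int should_insert(uint32_t h)
    {
        return (h & INSERT_MASK) == INSERT_MASK;
    }
    ```

    Because both occurrences of identical content produce the same hash, they pass or fail the same mask condition together, so long-distance duplicates still land on the same selected positions.
    
    
    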

    The biggest difference though is that the long range matcher is blended into the regular compressor, producing a single valid zstd frame, indistinguishable from normal operation (except, obviously, for the larger window size). This makes decompression a single-pass process, preserving its speed property.

    More details are available directly in source code, at lib/compress/zstd_ldm.c.

    Future work

    This is a first implementation, and it still has a few limitations that we plan to lift in the future.

    The long range matcher doesn't interact well with multithreading. Due to the way zstd multithreading is currently implemented, memory usage will scale with the window size times the number of threads, which is a problem for large window sizes. We plan on supporting multithreaded long range matching with reasonable memory usage in a future version.

    Secondly, Zstandard is currently limited to a 2 GB window size because of the indexer's design. While this is a significant update compared to the previous 128 MB limit, we believe this limitation can be lifted altogether, with some structural changes in the indexer. However, it also means that window sizes could become really big, with knock-on consequences on memory usage. So, to reduce this load, we will have to consider memory mapping as a complementary way to reference past content in the uncompressed file.

    Detailed list of changes

    • new : long range mode, using --long command, by Stella Lau (@stellamplau)
    • new : ability to generate and decode magicless frames (#591)
    • changed : maximum nb of threads reduced to 200, to avoid address space exhaustion in 32-bits mode
    • fix : multi-threading compression works with custom allocators, by @terrelln
    • fix : a rare compression bug when compression generates very large distances and a bunch of other conditions (only possible at --ultra -22)
    • fix : 32-bits build can now decode large offsets (levels 21+)
    • cli : added LZ4 frame support by default, by Felix Handte (@felixhandte)
    • cli : improved --list output
    • cli : new : can split input file for dictionary training, using command -B#
    • cli : new : clean operation artefact on Ctrl-C interruption (#854)
    • cli : fix : do not change /dev/null permissions when using command -t with root access, reported by @mike155 (#851)
    • cli : fix : write file size in header in multiple-files mode
    • api : added macro ZSTD_COMPRESSBOUND() for static allocation
    • api : experimental : new advanced decompression API
    • api : fix : sizeof_CCtx() used to over-estimate
    • build: fix : compilation works with -mbmi (#868)
    • build: fix : no-multithread variant compiles without pool.c dependency, reported by Mitchell Blank Jr (@mitchblank) (#819)
    • build: better compatibility with reproducible builds, by Bernhard M. Wiedemann (@bmwiedemann) (#818)
    • example : added streaming_memory_usage
    • license : changed /examples license to BSD + GPLv2
    • license : fix a few header files to reflect new license (#825)
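    The new ZSTD_COMPRESSBOUND() macro is a compile-time constant expression, which is what makes static allocation possible. A minimal sketch, with the bound formula reproduced from zstd.h of this era as an assumption so the example stands alone (MY_COMPRESSBOUND is a hypothetical stand-in for the real macro):

    ```c
    #include <stddef.h>

    /* Worst-case compressed size for a given source size : the source
     * size plus a small margin (srcSize/256, plus a little extra for
     * inputs under 128 KB). Reproduced here as an assumption; the real
     * macro is ZSTD_COMPRESSBOUND() in zstd.h. */
    #define MY_COMPRESSBOUND(srcSize) \
        ((srcSize) + ((srcSize) >> 8) + \
         (((srcSize) < (128 << 10)) ? (((128 << 10) - (srcSize)) >> 11) : 0))

    enum { SRC_SIZE = 4096 };

    /* Being a constant expression, the bound can size a static buffer,
     * avoiding any runtime allocation on the compression path. */
    static char dst_buffer[MY_COMPRESSBOUND(SRC_SIZE)];
    ```

    The point of a macro (rather than the existing ZSTD_compressBound() function) is precisely that the result is usable in array declarations and other constant contexts.
    
    
    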

    Warning

    bug #944 : v1.3.2 is known to produce corrupted data in the following scenario, requiring all these conditions simultaneously :

    • compression using multi-threading
    • with a dictionary
    • on "large enough" files (several MB, exact threshold depends on compression level)

    Note that dictionaries are meant to help compression of small files (a few KB), while multi-threading is only useful for large files, so it's pretty rare to need both at the same time. Nonetheless, if your application happens to trigger this situation, it's recommended to skip v1.3.2 in favor of a newer version. At the time of this warning, the dev branch is known to work properly for the same scenario.

    Source code(tar.gz)
    Source code(zip)
    zstd-v1.3.2-win32.zip(1007.93 KB)
    zstd-v1.3.2-win64.zip(1.04 MB)
  • v1.3.1(Aug 20, 2017)

    • New license : BSD + GPLv2
    • perf: substantially decreased memory usage in Multi-threading mode, thanks to reports by Tino Reichardt (@mcmilk)
    • perf: Multi-threading supports up to 256 threads. Cap at 256 when more are requested (#760)
    • cli : improved and fixed --list command, by @ib (#772)
    • cli : command -vV lists supported formats, by @ib (#771)
    • build : fixed binary variants, reported by @svenha (#788)
    • build : fix Visual compilation for non x86/x64 targets, reported by @GregSlazinski (#718)
    • API exp : breaking change : ZSTD_getFrameHeader() provides more information
    • API exp : breaking change : pinned down values of error codes
    • doc : fixed huffman example, by Ulrich Kunitz (@ulikunitz)
    • new : contrib/adaptive-compression, I/O driven compression level, by Paul Cruz (@paulcruz74)
    • new : contrib/long_distance_matching, statistics tool by Stella Lau (@stellamplau)
    • updated : contrib/linux-kernel, by Nick Terrell (@terrelln)
    Source code(tar.gz)
    Source code(zip)
    zstd-v1.3.1-win32.zip(325.24 KB)
    zstd-v1.3.1-win64.zip(345.15 KB)
  • v1.3.0(Jul 5, 2017)

    • cli : new : --list command, by @paulcruz74
    • cli : changed : xz/lzma support enabled by default
    • cli : changed : -t * continue processing list after a decompression error
    • API : added : ZSTD_versionString()
    • API : promoted to stable status : ZSTD_getFrameContentSize(), by @iburinoc
    • API exp : new advanced API : ZSTD_compress_generic(), ZSTD_CCtx_setParameter()
    • API exp : new : API for static or external allocation : ZSTD_initStatic?Ctx()
    • API exp : added : ZSTD_decompressBegin_usingDDict(), requested by @Crazee (#700)
    • API exp : clarified memory estimation / measurement functions
    • API exp : changed : strongest strategy renamed ZSTD_btultra, fastest strategy ZSTD_fast set to 1
    • Improved : reduced stack memory usage, by @terrelln and @stellamplau
    • tools : decodecorpus can generate random dictionary-compressed samples, by @paulcruz74
    • new : contrib/seekable_format, demo and API, by @iburinoc
    • changed : contrib/linux-kernel, updated version and license, by @terrelln

    Source code(tar.gz)
    Source code(zip)
    zstd-v1.3.0-win32.zip(876.96 KB)
    zstd-v1.3.0-win64.zip(937.75 KB)
  • v1.2.0(May 4, 2017)

    Major features :

    • Multithreading is enabled by default in the cli. Use -T# to select the number of threads. To disable multithreading, build target zstd-nomt or compile with HAVE_THREAD=0.
    • New dictionary builder named "cover" with improved quality (produces better compression ratio), by @terrelln. Legacy dictionary builder remains available, using --train-legacy command.

    Other changes :

    • cli : new : command -T0 means "detect and use nb of cores", by @iburinoc
    • cli : new : zstdmt symlink hardwired to zstd -T0
    • cli : new : command --threads=# (#671)
    • cli : new : commands --train-cover and --train-legacy, to select dictionary algorithm and parameters
    • cli : experimental targets zstd4 and xzstd4, supporting lz4 format, by @iburinoc
    • cli : fix : does not output compressed data on console
    • cli : fix : ignore symbolic links unless --force specified
    • API : breaking change : ZSTD_createCDict_advanced() uses compressionParameters as argument
    • API : added : prototypes ZSTD_*_usingCDict_advanced(), for direct control over frameParameters
    • API : improved : ZSTDMT_compressCCtx() reduced memory usage
    • API : fix : ZSTDMT_compressCCtx() now provides srcSize in header (#634)
    • API : fix : src size stored in frame header is controlled at end of frame
    • API : fix : enforced consistent rules for pledgedSrcSize==0 (#641)
    • API : fix : error code GENERIC replaced by dstSizeTooSmall when appropriate
    • build : improved cmake script, by @Majlen
    • build : enabled Multi-threading support for *BSD, by @bapt
    • tools : updated paramgrill. Command -O# provides best parameters for sample and speed target.
    • new : contrib/linux-kernel version, by @terrelln

    Source code(tar.gz)
    Source code(zip)
    zstd-v1.2.0-win32.zip(839.70 KB)
    zstd-v1.2.0-win64.zip(901.50 KB)
  • v1.1.4(Mar 17, 2017)

    • cli : new : can compress in *.gz format, using --format=gzip command, by @inikep
    • cli : new : advanced benchmark command --priority=rt
    • cli : fix : write on sparse-enabled file systems in 32-bits mode, by @ds77
    • cli : fix : --rm remains silent when input is stdin
    • cli : experimental xzstd target, with support for xz/lzma decoding, by @inikep
    • speed : improved decompression speed in streaming mode for single pass scenarios (+5%)
    • memory : DDict (decompression dictionary) memory usage down from 150 KB to 20 KB
    • arch : 32-bits variant able to generate and decode very long matches (>32 MB), by @iburinoc
    • API : new : ZSTD_findFrameCompressedSize(), ZSTD_getFrameContentSize(), ZSTD_findDecompressedSize()
    • API : changed : dropped support of legacy versions <= v0.3 (can be selected by modifying ZSTD_LEGACY_SUPPORT value)
    • build : new : meson build system in contrib/meson, by @dimkr
    • build : improved cmake script, by @Majlen
    • build : added -Wformat-security flag, as recommended by @pixelb
    • doc : new : doc/educational_decoder, by @iburinoc

    Warning : the experimental target zstdmt contained in this release has an issue when using multiple threads on large enough files, which makes it generate a buggy header. While fixing the header after the fact is possible, it's much better to avoid the issue. This can be done by using zstdmt in pipe mode : cat file | zstdmt -T2 -o file.zst. This issue is fixed in the current dev branch, so alternatively, build zstdmt from the dev branch.

    Note : pre-compiled Windows binaries attached below contain the fix for zstdmt

    Source code(tar.gz)
    Source code(zip)
    zstd-v1.1.4-win32-fix.zip(2.57 MB)
    zstd-v1.1.4-win64-fix.zip(2.88 MB)
  • v1.1.3(Feb 6, 2017)

    • cli : zstd can decompress .gz files (can be disabled with make zstd-nogz or make HAVE_ZLIB=0)
    • cli : new : experimental target make zstdmt, with multi-threading support
    • cli : new : improved dictionary builder "cover" (experimental), by @terrelln, based on previous work by @ot
    • cli : new : advanced commands for detailed parameters, by @inikep
    • cli : fix zstdless on Mac OS-X, by @apjanke
    • cli : fix #232 "compress non-files"
    • API : new : lib/compress/ZSTDMT_compress.h multithreading API (experimental)
    • API : new : ZSTD_create?Dict_byReference(), requested by Bartosz Taudul
    • API : new : ZDICT_finalizeDictionary()
    • API : fix : ZSTD_initCStream_usingCDict() properly writes dictID into frame header, by @indygreg (#511)
    • API : fix : all symbols properly exposed in libzstd, by @terrelln
    • build : support for Solaris target, by @inikep
    • doc : clarified specification, by @iburinoc

    Sample set for reference dictionary compression benchmark

    # Download and expand sample set 
    wget https://github.com/facebook/zstd/releases/download/v1.1.3/github_users_sample_set.tar.zst
    zstd -d github_users_sample_set.tar.zst
    tar xf github_users_sample_set.tar
    
    # benchmark sample set with and without dictionary compression
    zstd -b1 -r github
    zstd --train -r github
    zstd -b1 -r github -D dictionary
    
    # rebuild sample set archive
    tar cf github_users_sample_set.tar github
    zstd -f --ultra -22 github_users_sample_set.tar
    
    Source code(tar.gz)
    Source code(zip)
    github_users_sample_set.tar.gz(1.02 MB)
    github_users_sample_set.tar.zst(571.50 KB)
  • v1.1.2(Dec 15, 2016)

    • new : programs/gzstd, combined *.gz and *.zst decoder, by @inikep
    • new : zstdless, less on compressed *.zst files
    • new : zstdgrep, grep on compressed *.zst files
    • fixed : zstdcat

    • cli : new : preserve file attributes
    • cli : fixed : status displays total amount decoded, even for file consisting of multiple frames (like pzstd)
    • lib : improved : faster decompression speed at ultra compression settings and 32-bits mode
    • lib : changed : only public ZSTD_ symbols are now exposed in dynamic library
    • lib : changed : reduced usage of stack memory
    • lib : fixed : several corner case bugs, by @terrelln
    • API : streaming : decompression : changed : automatic implicit reset when chain-decoding new frames without init
    • API : experimental : added : dictID retrieval functions, and ZSTD_initCStream_srcSize()
    • API : zbuff : changed : prototypes now generate deprecation warnings
    • zlib_wrapper : added support for gz* functions, by @inikep
    • install : better compatibility with FreeBSD, by @DimitryAndric
    • source tree : changed : zbuff source files moved to lib/deprecated

    Source code(tar.gz)
    Source code(zip)
    zstd-v1.1.2-win32.zip(832.91 KB)
    zstd-v1.1.2-win64.zip(880.17 KB)
  • v1.1.1(Nov 2, 2016)

    • New : cli commands -M#, --memory=, --memlimit=, --memlimit-decompress= to limit allowed memory consumption during decompression
    • New : doc/zstd_manual.html, by @inikep
    • Improved : slightly better compression ratio at --ultra levels (>= 20)
    • Improved : better memory usage when using streaming compression API, thanks to @Rogier-5 report
    • Added : API : ZSTD_initCStream_usingCDict(), ZSTD_initDStream_usingDDict() (experimental section)
    • Added : examples/multiple_streaming_compression.c
    • Changed : zstd_errors.h is now installed within /include (and replaces errors_public.h)
    • Updated man page
    • Fixed : several sanitizer warnings, by @terrelln
    • Fixed : zstd-small, zstd-compress and zstd-decompress compilation targets

    Source code(tar.gz)
    Source code(zip)
    zstd-windows-v1.1.1.zip(1.11 MB)
  • v1.1.0(Sep 28, 2016)

    New : pzstd , parallel version of zstd, by @terrelln

    • added : NetBSD install target (#338)
    • Improved : speed for batches of small files
    • Improved : speed of zlib wrapper, by @inikep
    • Changed : libzstd on Windows supports legacy formats, by @KrzysFR
    • Fixed : CLI -d output to stdout by default when input is stdin (#322)
    • Fixed : CLI correctly detects console on Mac OS-X
    • Fixed : CLI supports recursive mode -r on Mac OS-X
    • Fixed : Legacy decoders use unified error codes, reported by benrg (#341), fixed by @inikep
    • Fixed : compatibility with OpenBSD, reported [email protected] (#319)
    • Fixed : compatibility with Hurd, by @inikep (#365)
    • Fixed : zstd-pgo, reported by @octoploid (#329)

    Source code(tar.gz)
    Source code(zip)
    pzstd-windows-v1.1.0.zip(1.33 MB)
    zstd-windows-v1.1.0.zip(1.07 MB)
  • v1.0.0(Aug 31, 2016)

    • Changed licensing : the whole project is now BSD, copyright Facebook
    • Added Patent Grant
    • Small decompression speed improvement
    • API : Streaming API supports legacy format
    • API : New : ZDICT_getDictID(), ZSTD_sizeof_{CCtx, DCtx, CStream, DStream}(), ZSTD_setDStreamParameter()
    • CLI supports legacy formats v0.4+
    • Fixed : compression fails on certain huge files, reported by Jesse McGrew
    • Enhanced documentation, by @inikep

    Source code(tar.gz)
    Source code(zip)
    zstd-windows-v1.0.0.zip(883.06 KB)