New Generation Entropy coders
This library proposes two high speed entropy coders :
Huff0, a Huffman codec designed for modern CPU, featuring OoO (Out of Order) operations on multiple ALU (Arithmetic Logic Unit), achieving extremely fast compression and decompression speeds.
FSE is a new kind of Entropy encoder, based on ANS theory, from Jarek Duda, achieving precise compression accuracy (like Arithmetic coding) at much higher speeds.
Benchmarks are run on an Intel Core i7-5600U, with Linux Mint 64-bits. Source code is compiled using GCC 4.8.4, 64-bits mode. Test files are generated using the provided
probagen program. Benchmark breaks sample files into blocks of 32 KB.
FSE are compared to
zlibh, the huffman encoder within zlib, provided by Frederic Kayser.
|Huff0||6.38||600 MB/s||1350 MB/s|
|FSE||8.84||325 MB/s||440 MB/s|
|zlibh||6.38||265 MB/s||300 MB/s|
|Huff0||1.90||595 MB/s||860 MB/s|
|FSE||1.91||330 MB/s||460 MB/s|
|zlibh||1.90||255 MB/s||250 MB/s|
|Huff0||1.13||525 MB/s||555 MB/s|
|FSE||1.13||325 MB/s||445 MB/s|
|zlibh||1.13||180 MB/s||210 MB/s|
By design, Huffman can't break the "1 bit per symbol" limit, hence loses efficiency on squeezed distributions, such as
Proba80. FSE is free of such limit, and its compression efficiency remains close to Shannon limit in all circumstances. However, this accuracy is not always necessary, and less compressible distributions show little difference with Huffman. On its side, Huff0 delivers in the form of a massive speed advantage.
External contributions are welcomed and encouraged. The "master" branch is only meant to host stable releases. The "dev" branch is the one where all contributions are merged. If you want to propose a patch, please commit into "dev" branch or dedicated feature branch. Direct commit to "master" are not permitted.