New generation entropy codecs : Finite State Entropy and Huff0

Overview

New Generation Entropy coders

This library proposes two high speed entropy coders :

Huff0, a Huffman codec designed for modern CPU, featuring OoO (Out of Order) operations on multiple ALU (Arithmetic Logic Unit), achieving extremely fast compression and decompression speeds.

FSE is a new kind of Entropy encoder, based on ANS theory, from Jarek Duda, achieving precise compression accuracy (like Arithmetic coding) at much higher speeds.

Branch Status
master Build Status
dev Build Status

Benchmarks

Benchmarks are run on an Intel Core i7-5600U, with Linux Mint 64-bits. Source code is compiled using GCC 4.8.4, 64-bits mode. Test files are generated using the provided probagen program. Benchmark breaks sample files into blocks of 32 KB. Huff0 and FSE are compared to zlibh, the huffman encoder within zlib, provided by Frederic Kayser.

File Codec Ratio Compression Decompression
Proba80
Huff0 6.38 600 MB/s 1350 MB/s
FSE 8.84 325 MB/s 440 MB/s
zlibh 6.38 265 MB/s 300 MB/s
Proba14
Huff0 1.90 595 MB/s 860 MB/s
FSE 1.91 330 MB/s 460 MB/s
zlibh 1.90 255 MB/s 250 MB/s
Proba02
Huff0 1.13 525 MB/s 555 MB/s
FSE 1.13 325 MB/s 445 MB/s
zlibh 1.13 180 MB/s 210 MB/s

By design, Huffman can't break the "1 bit per symbol" limit, hence loses efficiency on squeezed distributions, such as Proba80. FSE is free of such limit, and its compression efficiency remains close to Shannon limit in all circumstances. However, this accuracy is not always necessary, and less compressible distributions show little difference with Huffman. On its side, Huff0 delivers in the form of a massive speed advantage.

Branch Policy

External contributions are welcomed and encouraged. The "master" branch is only meant to host stable releases. The "dev" branch is the one where all contributions are merged. If you want to propose a patch, please commit into "dev" branch or dedicated feature branch. Direct commit to "master" are not permitted.

Comments
  • Public methods are not exported to DLL/LIB

    Public methods are not exported to DLL/LIB

    As stated in facebook/zstd#472, I would like to use FSE and HUFF0 methods from .NET which requires having a Windows DLL with public method marked as exported.

    As was done for zstd, we just need to add a macro that does the same thing as here: https://github.com/facebook/zstd/blob/426a9d4b7128ef54d79627cff346173e833f733a/lib/zstd.h#L21

    I can try putting a PR together for this, but I have a few questions:

    1. Should there be a single PUBLIC_API macro used in all the fse.h, huf.h error_public.h? Or should there be multiple FSE_API, HUF_API macros to opt-in/out of each module?

    2. Can we consider all methods declared in fse.h, huf.h and error_public.h as public, and everything else as private? What about bitstream.h and mem.h ?

    3. the VC2012 project does not seem to provide a name for the generated dll (so it's probably . Would there be objections to updating the project to generate libfse.dll instead? Though if it can contains huff0 as well, many libfse.dll is too specific, and there should be another name for this library? (the repo is named FiniteStateEntropy but the title says "New Generation Entropy coders"...)

    opened by KrzysFR 9
  • Integer division by zero in FSE_normalizeM2

    Integer division by zero in FSE_normalizeM2

    Hi, I get sometimes integer division by zero error from FSE_normalizeM2, when having large max symbol values. Specifically, I am using following compiler defines to compile FSE:

    FSE_DEFAULT_MEMORY_USAGE=14 FSE_MAX_MEMORY_USAGE=16 FSEU16_DEFAULT_MEMORY_USAGE=14 FSEU16_MAX_MEMORY_USAGE=16 FSEU16_MAX_SYMBOL_VALUE=4095

    Please find below example code that will reproduce the error.

    #include "fse.h"
    
    void test()
    {
    	uint16_t sourceBuffer[64] =
    	{
    		4048, 1072, 1532, 1936, 2252, 2460, 2536, 2480, 2292, 1988, 1596, 1140, 668, 200, 
    		207, 535, 747, 839, 795, 615, 327, 64, 512, 988, 1456, 1868, 2208, 2432, 2532,
    		2500, 2332, 2048, 1672, 1224, 748, 280, 139, 483, 723, 831, 811, 659, 383, 11,
    		432, 904, 1376, 1800, 2152, 2404, 2524, 2516, 2372, 2104, 1744, 1304, 832, 364,
    		71, 431, 687, 823, 827, 263
    	};
    	uint16_t destinationBuffer[64];
    	static_assert(FSEU16_MAX_SYMBOL_VALUE == 4095, "Custom fseu16 max symbol value");
    	FSE_compressU16(destinationBuffer, 64, sourceBuffer, 64, 4048, 13);
    	/* -> Integer division by zero at fse_compress.c line 540 (FSE_normalizeM2) */
    }
    
    

    Thanks, Markus

    opened by sh4k3n 8
  • huff0 with > 256 symbols

    huff0 with > 256 symbols

    I am fascinated by the speed of the huff0 encoder however unfortunately I wish to encode/decode data with 512 bits. I have already tried the fseU16 and it works great, however is there a quick hack/modification I can do to make the huff0 also work with 9-bit data?

    Any pointers in this direction are appreciated. Thanks

    UPDATE: My apologies, I didn't realized that HUF_compress2() is doing exactly what I desire. However will have to use it before saying it for sure.

    question 
    opened by rajputasif 7
  • Crash when FSE_MAX_MEMORY_USAGE 13

    Crash when FSE_MAX_MEMORY_USAGE 13

    1. Clone current master
    2. Edit FSE_MAX_MEMORY_USAGE to 13
    3. fse blows up its stack when compressing

    Valgrind and gdb traces are useless. Changing DEBUGLEVEL to 1 in debug.h does not help, no assert fires. Compiling with -fstack-protector says "*** stack smashing detected ***" and gdb says it came when exiting FSE_compress2 at ../lib/fse_compress.c:706.

    opened by clbr 6
  • FSE_emergencyDistrib does not exit under certain conditions

    FSE_emergencyDistrib does not exit under certain conditions

    when all values in the first maxSymbolValue positions of the normalizedCounter are <= 1 the function will not exit and an infinite loop will result

    opened by tjizep 5
  • Feature request: safe decompression for U16 coder

    Feature request: safe decompression for U16 coder

    The 8-bit decompressor has a safe function which guarantees it will not read outside of the compressed buffer. It would be great if the 16-bit decompressor also had that functionality.

    opened by chadaustin 5
  • GCC and LLVM CLANG Stats

    GCC and LLVM CLANG Stats

    Hi,

    I ran these on a mediocre Core2Duo with 1.3GHz and 4GB DDR on Gentoo Linux with the latest kernel and thought that sharing these stats might be useful. I've not integrated Yepp! into FSE, but would be curious, if it could bring any advantage. (The -lyeppp flag was used in the hope that it would have a positive effect.) I have tried various combinations of gcc/clang flags and none have had a positive effect, except -02 in combination with -lyeppp, but most visibly the usage of -funroll-loops in combination with clang version 3.3.

    FSE : Finite State Entropy, capability demo by Yann Collet (Jan 12 2014)

    File already compressed

    GCC ../data/win98-lz : 4671615 -> 4671758 (100.0%), 73.0 MB/s , 1420.2 MB/s GCC -funroll-loops ../data/win98-lz : 4671615 -> 4671758 (100.0%), 76.5 MB/s , 1405.2 MB/s GCC -funroll-loops -lyeppp ../data/win98-lz : 4671615 -> 4671758 (100.0%), 75.9 MB/s , 1420.9 MB/s

    CLANG ../data/win98-lz : 4671615 -> 4671758 (100.0%), 78.4 MB/s , 1409.0 MB/s CLANG -funroll-loops ../data/win98-lz : 4671615 -> 4671758 (100.0%), 78.4 MB/s , 1418.3 MB/s CLANG -funroll-loops -lyeppp ../data/win98-lz : 4671615 -> 4671758 (100.0%), 78.3 MB/s , 1431.4 MB/s

    File is uncompressed

    GCC ../data/win98-lz : 12536244 -> 4671591 (37.26%), 73.0 MB/s , 96.9 MB/s GCC -funroll-loops ../data/win98-lz : 12536244 -> 4671591 (37.26%), 76.5 MB/s , 112.9 MB/s GCC -funroll-loops -lyeppp `../data/win98-lz : 12536244 -> 4671591 (37.26%), 76.5 MB/s , 112.9 MB/s

    CLANG ../data/win98-lz : 12536244 -> 4671591 (37.26%), 78.2 MB/s , 107.9 MB/s CLANG -funroll-loops ../data/win98-lz : 12536244 -> 4671591 (37.26%), 78.2 MB/s , 108.0 MB/s CLANG -funroll-loops -lyeppp ../data/win98-lz : 12536244 -> 4671591 (37.26%), 78.6 MB/s , 108.0 MB/s

    EDIT: I have been thinking for several months about entropy, the universe and the use and state of compression in computer science. I've also used entropy as a main theme in my thesis. What strikes me is that the similarity between this and a neural networks is diminishing, if you chained multiple FSE's into a multi-layer network. Thought leader in this region is currently the work of Prof. Dr. Jürgen Schmidthuber's work and those of his Students which you can study here: http://www.idsia.ch/~juergen/onlinepub.html The main problem being the topology of data in the study of entropy raises the question why topological data analysis is so rarely used in the field as a method of exploiting the nature of the dataset to achieve higher compression ratios. It would be a pleasure to exchange ideas on entropy with you. Thanks for this great contribution! I've just recently enjoyed a growing awe on groundbreaking algorithms that have come up with linear, near optimal and rarely even near perfect solutions.

    opened by X4 5
  • Build fails because 32 bit absolute addressing isn't supported

    Build fails because 32 bit absolute addressing isn't supported

    I'm on OS X, just did a fresh clone of the repo, and clang outputs the following when calling make.

    I'm currently using the beta of Xcode 8, but this issue has been ongoing for a while.

    make CFLAGS="-march=native -ofast" LDFLAGS="-flto"

    /Applications/Xcode-beta.app/Contents/Developer/usr/bin/make -C programs test cc -I../lib -march=native -ofast -flto probaGenerator.c -o probagen cc -I../lib -march=native -ofast -flto ../lib/huf_decompress.c ../lib/entropy_common.c bench.c commandline.c fileio.c xxhash.c zlibh.c ../lib/fse_decompress.c ../lib/fse_compress.c ../lib/fseU16.c ../lib/huf_compress.c -o fse ./probagen 20% Binary file generator Generating 1023 KB with P=20.00% File proba.bin generated **** compress using FSE **** ./fse -f proba.bin tmp FSE : Finite State Entropy, 64-bits demo by Yann Collet (Sep 4 2016) Compressed 1048575 bytes into 474414 bytes ==> 45.24%
    ./fse -df tmp result FSE : Finite State Entropy, 64-bits demo by Yann Collet (Sep 4 2016) Decoded 1048575 bytes
    diff proba.bin result **** compress using HUF **** ./fse -fh proba.bin tmp FSE : Finite State Entropy, 64-bits demo by Yann Collet (Sep 4 2016) Compressed 1048575 bytes into 478412 bytes ==> 45.62%
    ./fse -df tmp result FSE : Finite State Entropy, 64-bits demo by Yann Collet (Sep 4 2016) Decoded 1048575 bytes
    diff proba.bin result **** compress using zlibh **** ./fse -fz proba.bin tmp FSE : Finite State Entropy, 64-bits demo by Yann Collet (Sep 4 2016) Compressed 1048575 bytes into 478213 bytes ==> 45.61%
    ./fse -df tmp result FSE : Finite State Entropy, 64-bits demo by Yann Collet (Sep 4 2016) Decoded 1048575 bytes
    diff proba.bin result rm result rm proba.bin rm tmp cc -I../lib -march=native -ofast -flto fullbench.c xxhash.c ../lib/fse_decompress.c ../lib/fse_compress.c ../lib/fseU16.c ../lib/huf_compress.c ../lib/huf_decompress.c ../lib/entropy_common.c -o fullbench LLVM ERROR: 32-bit absolute addressing is not supported in 64-bit mode clang: error: linker command failed with exit code 1 (use -v to see invocation) make[1]: *** [fullbench] Error 1 make: *** [test] Error 2

    opened by MarcusJohnson91 4
  • Feature: high-level pseudocode in README

    Feature: high-level pseudocode in README

    Would be great if you could add pseudocode to the README. I'd love to understand how the algorithm works, but there's too much C for me to understand, and the whitepaper is not good for a layman. :/

    opened by dbkaplun 3
  • Can't compile on cygwin64.

    Can't compile on cygwin64.

    In fileio.c and in commandline.c , both of the defined (CYGWIN) lines must be removed to get it to compile as IS_CONSOLE and SET_BINARY_MODE are not supported.

    opened by neheb 3
  • resync with changes from zstd

    resync with changes from zstd

    zstd has accumulated a lot of little fixes and changes to its embedded fse/huff0. Would be great to see those backported to standalone FiniteStateEntropy if plausible.

    opened by adamdmoss 2
  • Why Huffman code in reverse order?

    Why Huffman code in reverse order?

    I'm very interested in compression.Huffman coding can be performed sequentially,I can't understand why huffman code is executed in reverse order in the code.

    question 
    opened by huhu0823 2
  • How should we understand this function ‘FSE_normalizeM2’?

    How should we understand this function ‘FSE_normalizeM2’?

    opened by huhu0823 2
  • a step in normalization of count

    a step in normalization of count

    I am new here and still reading the source code. In this line, why normalizedCounter[s] = -1? In my opinion, -1 should be 1. Could you please explain it to me? Thanks a lot. https://github.com/Cyan4973/FiniteStateEntropy/blob/12a533a9bf4d7bdcc507bf9d11302a7a1be454f5/lib/fse_compress.c#L459

    question 
    opened by BeeBreeze 2
  • question: FSE doesn't always compress better than huf - expected?

    question: FSE doesn't always compress better than huf - expected?

    I can't share the data, but in short, it's 9108363519 bytes (~9GB) of almost-uncompressible data (IIRC it's the 9GB tail of a larger already-compressed stream).

    % ./fse -e ./almost-uncompressable Compressed 9108363519 bytes into 8992064047 bytes ==> 98.72% % ./fse -h ./almost-uncompressable Compressed 9108363519 bytes into 8943423537 bytes ==> 98.19% % ./fse -z ./almost-uncompressable Compressed 9108363519 bytes into 8944678105 bytes ==> 98.20%

    Granted that I don't know the intimate details of FSE and this is a near-pathological case, but I'd have expected the two huffman implementations to fare rather worse than FSE on these almost-but-not-quite-uniform distributions of data.

    Am I wrong?

    Cheers.

    question 
    opened by adamdmoss 2
  • benchmark mode claims a decompression error on this data w/huf

    benchmark mode claims a decompression error on this data w/huf

     % ./fse -h -b ./xxx    
    FSE : Finite State Entropy, 64-bits demo by Yann Collet (Jul 17 2020)
    !! Error decompressing block 4 of cSize 18041 !! => (Corrupted block detected)
    

    gunzip the below file and run the above. xxx.gz The 'xxx' file appears to survive a huf compress and then a huf decompress intact when doing them individually, so perhaps this is an issue specific to the benchmark mode.

    Tested with 3865a704e8079fb45fc921480a6c351bb6bfbd72

    opened by adamdmoss 5
  • FSE_compressU16() computes clipped data if dstCapacity different

    FSE_compressU16() computes clipped data if dstCapacity different

    Repro:

    constexpr size_t inSize = 24;
    uint16_t in[inSize] =
      {0, 0, 3, 2, 0, 0, 0, 0, 314, 0, 0, 0, 0, 51, 50, 0, 0, 59, 22, 36, 0, 55, 32, 22};
    unsigned int maxSymbolValue = 314;
    unsigned int outSize = 48;
    
    uint8_t out1[256];
    uint8_t out2[256];
    size_t        numBytes1 = FSE_compressU16(out1,
                                       outSize,
                                       in,
                                       inSize,
                                       maxSymbolValue,
                                       0);
    
    size_t numBytes2 = FSE_compressU16(out2,
                                       256,
                                       in,
                                       inSize,
                                       maxSymbolValue,
                                       0);
    // numBytes1 is now 34 bytes
    // numBytes2 is now 43 bytes
    
    FSE_decompressU16(in, inSize, out2, numBytes2);
    FSE_decompressU16(in, inSize, out1, numBytes1); // <- Crashes 
    

    When output bytes is 48 rather than 256 as an argument for FSE_compressU16(), all generated bytes are identical up until 34 bytes. The rest of the bytes are clipped. The difference in code paths is that in the 34 bytes case, the non-fast path is chosen.

    opened by JohanKnutzen 1
Owner
Yann Collet
data compression
Yann Collet
zlib replacement with optimizations for "next generation" systems.

zlib-ng zlib data compression library for the next generation systems Maintained by Hans Kristian Rosbach aka Dead2 (zlib-ng àt circlestorm dót org) C

zlib-ng 1.2k Dec 31, 2022
Runtime Archiver plugin for Unreal Engine. Cross-platform archiving and unarchiving directories and files. Currently supports ZIP format.

Runtime Archiver Archiving and dearchiving directories and files Explore the docs » Marketplace . Releases . Support Chat Features Fast speed Easy arc

Georgy Treshchev 26 Dec 15, 2022
A C++ static library offering a clean and simple interface to the 7-zip DLLs.

bit7z A C++ static library offering a clean and simple interface to the 7-zip DLLs Supported Features • Getting Started • Download • Requirements • Bu

Riccardo 326 Jan 1, 2023
Multi-format archive and compression library

Welcome to libarchive! The libarchive project develops a portable, efficient C library that can read and write streaming archives in a variety of form

null 1.9k Jan 8, 2023
LZFSE compression library and command line tool

LZFSE This is a reference C implementation of the LZFSE compressor introduced in the Compression library with OS X 10.11 and iOS 9. LZFSE is a Lempel-

null 1.7k Jan 4, 2023
Fuzzing harnesses, corpora, scripts, and target-specific notes for fuzzing IrfanView

FuzzIrfanView Here is the accompany repository for my blog post, Fuzzing IrfanView with WinAFL. It contains the following: The scripts used to downloa

Moshe Kaplan 15 Oct 29, 2022
Slow5tools is a toolkit for converting (FAST5 <-> SLOW5), compressing, viewing, indexing and manipulating data in SLOW5 format.

slow5tools Slow5tools is a simple toolkit for converting (FAST5 <-> SLOW5), compressing, viewing, indexing and manipulating data in SLOW5 format. Abou

Hasindu Gamaarachchi 57 Dec 2, 2022
PNGFuse is a cross-platform application that allows you to embed and extract full zlib-compressed files within PNG metadata.

PNGFuse PNGFuse is a portable, lightweight, and cross-platform application written in C++ that allows you to embed and extract full zlib-compressed fi

Eta 3 Dec 29, 2021
Experimental data compressor for 8-bit computers and low-end platforms

ZX5 (experimental) ZX5 is an experimental data compressor derived from ZX0, similarly targeted for low-end platforms, including 8-bit computers like t

Einar Saukas 9 Apr 14, 2022
Advanced DXTc texture compression and transcoding library

crunch/crnlib v1.04 - Advanced DXTn texture compression library Public Domain - Please see license.txt. Portions of this software make use of public d

null 775 Dec 26, 2022
New generation entropy codecs : Finite State Entropy and Huff0

New Generation Entropy coders This library proposes two high speed entropy coders : Huff0, a Huffman codec designed for modern CPU, featuring OoO (Out

Yann Collet 1.1k Dec 26, 2022
This project Orchid-Fst implements a fast text string dictionary search data structure: Finite state transducer (short for FST) in c++ language.This FST C++ open source project has much significant advantages.

Orchid-Fst 1. Project Overview This project Orchid-Fst implements a fast text string dictionary search data structure: Finite state transducer , which

Bin Ding 10 Oct 18, 2022
Automatic differentiation with weighted finite-state transducers.

GTN: Automatic Differentiation with WFSTs Quickstart | Installation | Documentation What is GTN? GTN is a framework for automatic differentiation with

null 100 Dec 29, 2022
Finite State Machine implementation using std::variant

mp::fsm Implementation of Finite State Machine presented by me on CppCon 2018 in a talk Effective replacement of dynamic polymorphism with std::varian

Mateusz Pusz 65 Jan 3, 2023
My version of psxfunkin with new changes like new story mode, new options,etc

PSXFunkin Friday Night Funkin' on the PSX LOL Compilation Refer to COMPILE.md here Characters Igor Ver added new characters Like XmasGF,Monster and mu

IgorSou3000 4 Jun 8, 2022
yangwebrtc is a self-developed rtc architecture supporting Webrtc/Srt/Rtmp, including a variety of video and audio codecs and processing, etc.

YangWebrtc Overview yangwebrtc是一个自主研发的支持Webrtc/Srt/Rtmp的rtc架构,包含多种视音频编解码和处理等。 支持视频会议、高清录播直播、直播互动等多种视音频应用。 可用于远程教育、远程医疗、指挥调度、安防监控、影视录播、协同办公、直播互动等多种行业应用

null 331 Dec 27, 2022
John Walker 24 Dec 15, 2022
New generation message broker service

Push1st Push1st is open source multiple protocol PUB/SUB message broker server (Pusher, MQTT, RAW WebSocket). Key features Suitable for distributed on

Naveksoft 16 Dec 14, 2022
The ESP-BOX is a new generation AIoT development platform released by Espressif Systems.

中文版本 ESP-BOX AIoT Development Framework Important Note: We recommend updating the ESP32-S3-BOX firmware when you first receive the product to have the

Espressif Systems 160 Dec 29, 2022
SuanPan - 🧮 An Open Source, Parallel and Heterogeneous Finite Element Analysis Framework

suanPan Introduction ?? suanPan is a finite element method (FEM) simulation platform for applications in fields such as solid mechanics and civil/stru

Theodore 22 Dec 27, 2022