Superfast compression library

DENSITY

DENSITY is a free, open-source, BSD-licensed compression library written in C99.

It is focused on high-speed compression, at the best ratio possible. All three of DENSITY's algorithms are currently at the Pareto frontier of compression speed vs ratio (cf. here for an independent benchmark).

DENSITY features a simple API to enable quick integration in any project.

Why is it so fast?

One of the biggest assets of DENSITY is that its work unit is not a byte like other libraries, but a group of 4 bytes.

Where other libraries consume one byte of data and then process it, DENSITY consumes 4 bytes and then processes them.

That's why DENSITY's algorithms were designed from scratch. They have to compensate for the coarser 4-byte work units and still provide interesting compression ratios.
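As a rough illustration of what a 4-byte work unit means in practice, the sketch below walks a buffer in 32-bit groups instead of single bytes. The names `load_unit` and `count_units` are made up for this example and are not part of DENSITY. The load goes through `memcpy` so that it stays well-defined on addresses that are not 4-byte aligned.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not a DENSITY function): read one 4-byte work unit
   from any address. memcpy keeps the load well-defined even when p is not
   4-byte aligned; compilers typically lower it to a single 32-bit load. */
static inline uint32_t load_unit(const uint8_t *p) {
    uint32_t unit;
    memcpy(&unit, p, sizeof unit);
    return unit;
}

/* Hypothetical example: count how many 4-byte units are equal to `needle`.
   The loop advances 4 bytes per iteration. Trailing bytes (size % 4) are
   ignored here; a real codec would have to handle them separately. */
static size_t count_units(const uint8_t *data, size_t size, uint32_t needle) {
    size_t hits = 0;
    for (size_t i = 0; i + 4 <= size; i += 4) {
        if (load_unit(data + i) == needle)
            hits++;
    }
    return hits;
}
```

Each iteration performs one comparison per 4 input bytes, which is the basic reason a 4-byte unit needs fewer loop iterations than a byte-oriented scan of the same buffer.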

Speed pedigree traits

  • 4-byte work units
  • heavy use of registers as opposed to memory for processing
  • avoidance of branching, or minimal branching, where possible
  • low-memory data structures that favor processor cache (Lx) accesses
  • library-wide inlining
  • targeted loop unrolling
  • prefetching and branching hints
  • restricted pointers to maximize compiler optimizations
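Several of the traits listed above can be shown in a few lines of C. The following is only a sketch under assumptions: `add_units`, `LIKELY` and `UNLIKELY` are hypothetical names chosen for this example, not DENSITY code, and `__builtin_expect` is a GCC/Clang extension, so a plain fallback is provided for other compilers.

```c
#include <stddef.h>
#include <stdint.h>

/* Branch hint macros. __builtin_expect is a GCC/Clang extension, so other
   compilers fall back to the plain expression. */
#if defined(__GNUC__) || defined(__clang__)
#define LIKELY(x)   __builtin_expect(!!(x), 1)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define LIKELY(x)   (x)
#define UNLIKELY(x) (x)
#endif

/* Hypothetical kernel: add `delta` to every 32-bit unit of `in` and store
   the result in `out`. `restrict` promises the compiler that the two
   buffers do not overlap, which lets it reorder loads and stores freely. */
static inline void add_units(const uint32_t *restrict in,
                             uint32_t *restrict out,
                             size_t count, uint32_t delta) {
    if (UNLIKELY(count == 0))
        return;

    size_t i = 0;
    /* Main loop, manually unrolled by 4 units per iteration. */
    for (; LIKELY(i + 4 <= count); i += 4) {
        out[i]     = in[i]     + delta;
        out[i + 1] = in[i + 1] + delta;
        out[i + 2] = in[i + 2] + delta;
        out[i + 3] = in[i + 3] + delta;
    }
    /* Remainder loop for the last 0 to 3 units. */
    for (; i < count; i++)
        out[i] = in[i] + delta;
}
```

Whether hints and manual unrolling actually help depends on the compiler and target, so measuring before and after is advisable.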

A "blowup protection" is provided, which dramatically increases processing speed on incompressible input data. In addition, for incompressible, reasonably sized inputs, the compressed output will never exceed the original uncompressed size by more than 1%.

Benchmarks

Quick benchmark

DENSITY features an integrated in-memory benchmark. After building the project (see build), a benchmark executable will be present in the build directory. If run without arguments, usage help will be displayed.

File used: enwik8 (100 MB)

Platform: MacBook Pro, MacOS 10.13.3, 2.3 GHz Intel Core i7, 8 GB 1600 MHz DDR, SSD, compiling with Clang/LLVM 9.0.0

Timing: using the time function, and taking the best user output after multiple runs. In the case of density, the in-memory integrated benchmark's best value (which uses the same user-mode CPU timing) is used.
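A "best user time over several runs" measurement can be sketched as follows, assuming a POSIX system where `getrusage` is available. The helpers `user_cpu_seconds`, `best_user_time` and `spin` are hypothetical names for this example; this is not the code of DENSITY's integrated benchmark.

```c
#include <stddef.h>
#include <sys/resource.h> /* POSIX getrusage */
#include <sys/time.h>     /* struct timeval */

/* User-mode CPU seconds consumed by this process so far, or -1.0 on error. */
static double user_cpu_seconds(void) {
    struct rusage usage;
    if (getrusage(RUSAGE_SELF, &usage) != 0)
        return -1.0;
    return (double)usage.ru_utime.tv_sec
         + (double)usage.ru_utime.tv_usec / 1e6;
}

/* Hypothetical harness: call work(arg) `runs` times and return the smallest
   user-mode CPU time of a single run. Returns -1.0 when runs is 0. */
static double best_user_time(void (*work)(void *), void *arg, int runs) {
    double best = -1.0;
    for (int r = 0; r < runs; r++) {
        double start = user_cpu_seconds();
        work(arg);
        double elapsed = user_cpu_seconds() - start;
        if (best < 0.0 || elapsed < best)
            best = elapsed;
    }
    return best;
}

/* Demo workload for the usage example: burns CPU in a simple loop. */
static void spin(void *arg) {
    volatile unsigned long sink = 0;
    unsigned long n = *(unsigned long *)arg;
    for (unsigned long i = 0; i < n; i++)
        sink += i;
}
```

A call such as `best_user_time(spin, &n, 5)` returns the fastest of five runs. Taking the minimum rather than the mean filters out runs disturbed by other activity on the machine.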

| Library        | Algorithm | Compress           | Decompress         | Size (bytes) | Ratio  | Round trip |
|----------------|-----------|--------------------|--------------------|--------------|--------|------------|
| density 0.14.2 | Chameleon | 0.092s (1085 MB/s) | 0.059s (1684 MB/s) | 61,524,084   | 61.52% | 0.151s     |
| lz4 r129       | -1        | 0.468s (214 MB/s)  | 0.115s (870 MB/s)  | 57,285,990   | 57.29% | 0.583s     |
| lzo 2.08       | -1        | 0.367s (272 MB/s)  | 0.309s (324 MB/s)  | 56,709,096   | 56.71% | 0.676s     |
| density 0.14.2 | Cheetah   | 0.170s (587 MB/s)  | 0.126s (796 MB/s)  | 53,156,668   | 53.16% | 0.296s     |
| density 0.14.2 | Lion      | 0.303s (330 MB/s)  | 0.288s (347 MB/s)  | 47,817,692   | 47.82% | 0.591s     |
| lz4 r129       | -3        | 1.685s (59 MB/s)   | 0.118s (847 MB/s)  | 44,539,940   | 44.54% | 1.803s     |
| lzo 2.08       | -7        | 9.562s (10 MB/s)   | 0.319s (313 MB/s)  | 41,720,721   | 41.72% | 9.881s     |
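For readers who want to check or extend the figures above, the ratio and speed columns follow from simple arithmetic. The sketch below uses hypothetical helper names and assumes that 1 MB means 1,000,000 bytes and that enwik8 is 100,000,000 bytes long.

```c
#include <stdint.h>

/* Compressed size as a percentage of the original size (lower is better). */
static double ratio_percent(uint64_t compressed_bytes, uint64_t original_bytes) {
    return 100.0 * (double)compressed_bytes / (double)original_bytes;
}

/* Throughput in MB/s, taking 1 MB = 1,000,000 bytes (an assumption). */
static double throughput_mb_per_s(uint64_t bytes, double seconds) {
    return (double)bytes / 1e6 / seconds;
}
```

For the Chameleon row, `ratio_percent(61524084, 100000000)` gives about 61.52, and the round trip is simply 0.092 s + 0.059 s = 0.151 s. Speeds recomputed from the rounded timings can differ from the printed ones by a few MB/s, presumably because the printed speeds were derived from unrounded timings.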

Other benchmarks

Here are a few other benchmarks featuring DENSITY (non-exhaustive list):

  • squash is an abstraction layer for compression algorithms, and has an extremely exhaustive set of benchmark results, including density's, available here.

  • lzbench is an in-memory benchmark of open-source LZ77/LZSS/LZMA compressors.

  • fsbench is a command line utility that enables real-time testing of compression algorithms, but also hashes and much more. A fork with density releases is available here for easy access. The original author's repository can be found here.

Build

DENSITY can be built on a number of platforms, via the provided makefiles.

It was developed and optimized against Clang/LLVM which makes it the preferred compiler, but GCC and MSVC are also supported. Please use the latest compiler versions for best performance.

MacOS

On MacOS, Clang/LLVM is the default compiler, which makes things simpler.

  1. Get the source code:
    git clone https://github.com/centaurean/density.git
    cd density
  2. Build and test:
    make
    build/benchmark -f

Alternatively, thanks to the Homebrew project, DENSITY can also be installed with a single command on MacOS:

    brew install density

Linux

On Linux, Clang/LLVM is not always available by default, but it can easily be added through the distribution's package manager. The following example assumes a Debian or Ubuntu distribution with apt-get.

  1. From the command line, install Clang/LLVM (optional; GCC is also supported if Clang/LLVM can't be used) and other prerequisites:
    sudo apt-get install clang git
  2. Get the source code:
    git clone https://github.com/centaurean/density.git
    cd density
  3. Build and test:
    make

or

    make CC=gcc-... AR=gcc-ar-...

or

    make CC=clang-... AR=llvm-ar-...

to choose alternative compilers. For a quick test of resulting binaries, run

    build/benchmark -f

Windows

Please install git for Windows to begin with.

On Windows, density can be built in different ways. The first method is to use mingw's gcc compiler; for that it is necessary to download and install mingw-w64.

  1. Once mingw-w64 is installed, get the source:
    git clone https://github.com/centaurean/density.git
    cd density
  2. Build and test:
    mingw32-make.exe
    build/benchmark.exe -f

As an alternative, MSYS2 also offers a linux-like environment for Windows.

The second method is to download and install Microsoft's Visual Studio IDE community edition. It comes with Microsoft's own compilers and is free.

  1. Once Visual Studio is installed, open a developer command prompt and type:
    git clone https://github.com/centaurean/density.git
    cd density\msvc
  2. Build and test:
    msbuild Density.sln
    bin\Release\benchmark.exe -f

An extra recommended step would be to install Clang/LLVM for Windows. It is downloadable from this link. Once installed, open the Visual Studio IDE by double-clicking on Density.sln, then right-click on project names and change the platform toolsets to LLVM. Rebuild the solution to generate binaries with Clang/LLVM.

Output format

DENSITY outputs compressed data in a simple format, which enables file storage and optional parallelization for both compression and decompression.

A very short header holding vital information (such as the DENSITY version and the algorithm used) precedes the binary compressed data.

APIs

DENSITY features a straightforward API, simple yet powerful enough to keep users' creativity unleashed.

For advanced developers, it allows the use of custom dictionaries and the export of generated dictionaries after a compression session. Although the default, blank dictionary is perfectly fine in most cases, setting up your own tailored dictionaries could somewhat improve the compression ratio, especially for small inputs.

Please see the quick start at the bottom of this page.

About the algorithms

Chameleon ( DENSITY_ALGORITHM_CHAMELEON )

Chameleon is a dictionary-lookup-based compression algorithm. It is designed for absolute speed and usually reaches a compression ratio of about 60% on compressible data. Decompression is just as fast. This algorithm is a great choice when the main concern is speed.
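To make the idea of dictionary-lookup compression on 4-byte units concrete, here is a toy coder. It is a simplified illustration only: the hash constant, the one-flag-byte-per-8-units layout and all `toy_*` names are assumptions made for this sketch and do not describe Chameleon's actual format or implementation. Each unit is hashed to a 16-bit dictionary slot; on a hit only the 2-byte hash is written, otherwise the 4-byte unit is written and stored in the dictionary.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_DICT_BITS 16
#define TOY_DICT_SIZE (1u << TOY_DICT_BITS)

typedef struct { uint32_t entries[TOY_DICT_SIZE]; } toy_dict; /* 256 KiB */

/* Multiplicative hash of a 4-byte unit down to a 16-bit dictionary slot. */
static uint16_t toy_hash(uint32_t unit) {
    return (uint16_t)((unit * 2654435761u) >> (32 - TOY_DICT_BITS));
}

/* Encode `units` 4-byte units. `out` must hold units * 4 + (units + 7) / 8
   bytes. Returns the number of bytes written. */
static size_t toy_encode(const uint8_t *in, size_t units, uint8_t *out, toy_dict *dict) {
    size_t o = 0;
    memset(dict, 0, sizeof *dict);
    for (size_t u = 0; u < units; ) {
        size_t flag_pos = o++; /* reserve the flag byte of this group */
        uint8_t flags = 0;
        for (int bit = 0; bit < 8 && u < units; bit++, u++) {
            uint32_t unit;
            memcpy(&unit, in + 4 * u, 4);
            uint16_t h = toy_hash(unit);
            if (dict->entries[h] == unit) { /* hit: emit the 2-byte hash */
                flags |= (uint8_t)(1u << bit);
                memcpy(out + o, &h, 2);
                o += 2;
            } else {                        /* miss: emit the unit, learn it */
                dict->entries[h] = unit;
                memcpy(out + o, &unit, 4);
                o += 4;
            }
        }
        out[flag_pos] = flags;
    }
    return o;
}

/* Decode `units` units by replaying the same dictionary updates.
   Trusts its input (no bounds checks). Returns the bytes consumed. */
static size_t toy_decode(const uint8_t *in, size_t units, uint8_t *out, toy_dict *dict) {
    size_t i = 0;
    memset(dict, 0, sizeof *dict);
    for (size_t u = 0; u < units; ) {
        uint8_t flags = in[i++];
        for (int bit = 0; bit < 8 && u < units; bit++, u++) {
            uint32_t unit;
            if (flags & (1u << bit)) {
                uint16_t h;
                memcpy(&h, in + i, 2);
                i += 2;
                unit = dict->entries[h];
            } else {
                memcpy(&unit, in + i, 4);
                i += 4;
                dict->entries[toy_hash(unit)] = unit;
            }
            memcpy(out + 4 * u, &unit, 4);
        }
    }
    return i;
}
```

Because the decoder rebuilds the dictionary from the literals it reads, no dictionary has to be transmitted. The best case for this toy format is a little over 2 bytes out per 4 bytes in, which is in the same region as the roughly 60% figure quoted above, although the real algorithm differs in its details.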

Cheetah ( DENSITY_ALGORITHM_CHEETAH )

Cheetah was developed with input from Piotr Tarsa. It is derived from Chameleon and uses swapped double dictionary lookups and predictions. It can be extremely good with highly compressible data (ratio reaching 10% or less). On typical compressible data, the compression ratio is about 50% or less. It is still extremely fast for both compression and decompression and is a great, efficient all-rounder algorithm.

Lion ( DENSITY_ALGORITHM_LION )

Lion is a multiform compression algorithm derived from cheetah. It goes further in the areas of dynamic adaptation and fine-grained analysis. It uses multiple swapped dictionary lookups and predictions, and forms rank entropy coding. Lion provides the best compression ratio of all three algorithms under any circumstance, and is still very fast.

Quick start (a simple example using the API)

Using DENSITY in your application couldn't be any simpler.

First, include this file in your project:

  • density_api.h

Once this is done, you can start using the DENSITY API:

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include "density_api.h"

    int main(void) {
        const char *text = "This is a simple example on how to use the simple Density API.  This is a simple example on how to use the simple Density API.";
        uint64_t text_length = (uint64_t)strlen(text);

        // Determine safe buffer sizes
        uint_fast64_t compress_safe_size = density_compress_safe_size(text_length);
        uint_fast64_t decompress_safe_size = density_decompress_safe_size(text_length);

        // Allocate required memory and bail out if an allocation failed
        uint8_t *outCompressed   = malloc(compress_safe_size);
        uint8_t *outDecompressed = malloc(decompress_safe_size);
        if (!outCompressed || !outDecompressed) {
            free(outCompressed);
            free(outDecompressed);
            return EXIT_FAILURE;
        }
        density_processing_result result;

        // Compress with the Chameleon algorithm (a zero state means success)
        result = density_compress((const uint8_t *)text, text_length, outCompressed, compress_safe_size, DENSITY_ALGORITHM_CHAMELEON);
        if (!result.state)
            printf("Compressed %llu bytes to %llu bytes\n", (unsigned long long)result.bytesRead, (unsigned long long)result.bytesWritten);

        // Decompress
        result = density_decompress(outCompressed, result.bytesWritten, outDecompressed, decompress_safe_size);
        if (!result.state)
            printf("Decompressed %llu bytes to %llu bytes\n", (unsigned long long)result.bytesRead, (unsigned long long)result.bytesWritten);

        // Free allocated memory
        free(outCompressed);
        free(outDecompressed);
        return EXIT_SUCCESS;
    }

And that's it! We've done a compression/decompression round trip in a few lines!

Comments
  • Crash in density_memory_teleport_read_reserved

    My density plugin is crashing on ARM. I do

    density_stream_create (NULL, NULL); // returns 0xb5200640
    density_stream_prepare (0xb5200640, 0xb35ff800, 102171, 0xb34fd800, 1048576); // returns 0
    density_stream_decompress_continue (0xb5200640);
    

    Density allocates 1024 at 0xb480b980 (memory_teleport.c:37), then tries to memcpy(0xb480b980, 0xb35ff810, 102155).

    And I end up with a crash in density_memory_teleport_read_reserved:

    ==3054==ERROR: AddressSanitizer: unknown-crash on address 0xb480b980 at pc 0xb3ce9445 bp 0xbef2fd88 sp 0xbef2fd8c
    WRITE of size 102155 at 0xb480b980 thread T0
        #0 0xb3ce9443 in memcpy /usr/include/arm-linux-gnueabihf/bits/string3.h:51
        #1 0xb3ce9443 in density_memory_teleport_read_reserved /home/nemequ/squash/plugins/density/density/src/memory_teleport.c:132
        #2 0xb3cd155b in density_chameleon_decode_continue /home/nemequ/squash/plugins/density/density/src/kernel_chameleon_decode.c:173
        #3 0xb3cc9ef5 in density_block_decode_continue /home/nemequ/squash/plugins/density/density/src/block_decode.c:218
        #4 0xb3ce6a3b in density_decode_continue /home/nemequ/squash/plugins/density/density/src/main_decode.c:110
        #5 0xb3cf548f in density_stream_decompress_continue /home/nemequ/squash/plugins/density/density/src/stream.c:223
        #6 0xb3cc7781 in squash_density_process_stream /home/nemequ/squash/plugins/density/squash-density.c:369
        #7 0xb6a6a7cb in squash_stream_process_internal /home/nemequ/squash/squash/stream.c:515
        #8 0xb6a6add7 in squash_stream_process /home/nemequ/squash/squash/stream.c:619
        #9 0xb6a5eff9 in squash_codec_process_file_with_options /home/nemequ/squash/squash/codec.c:1087
        #10 0xb6a5f22d in squash_codec_decompress_file_with_options /home/nemequ/squash/squash/codec.c:1162
        #11 0xad71 in main /home/nemequ/squash/utils/squash.c:292
        #12 0xb6966631 in __libc_start_main (/lib/arm-linux-gnueabihf/libc.so.6+0x17631)
    
    0xb480bd80 is located 0 bytes to the right of 1024-byte region [0xb480b980,0xb480bd80)
    allocated by thread T0 here:
        #0 0xb6abd343 in __interceptor_malloc (/usr/lib/arm-linux-gnueabihf/libasan.so.1+0x43343)
        #1 0xb3ce7c69 in density_memory_teleport_allocate /home/nemequ/squash/plugins/density/density/src/memory_teleport.c:37
        #2 0xb3cf4809 in density_stream_create /home/nemequ/squash/plugins/density/density/src/stream.c:40
        #3 0xb3cc8d81 in squash_density_stream_init /home/nemequ/squash/plugins/density/squash-density.c:193
        #4 0xb3cc8d81 in squash_density_stream_new /home/nemequ/squash/plugins/density/squash-density.c:179
        #5 0xb3cc8d81 in squash_density_create_stream /home/nemequ/squash/plugins/density/squash-density.c:222
        #6 0xb6a5d20d in squash_codec_create_stream_with_options /home/nemequ/squash/squash/codec.c:507
        #7 0xb6a5ee83 in squash_codec_process_file_with_options /home/nemequ/squash/squash/codec.c:1064
        #8 0xb6a5f22d in squash_codec_decompress_file_with_options /home/nemequ/squash/squash/codec.c:1162
        #9 0xad71 in main /home/nemequ/squash/utils/squash.c:292
        #10 0xb6966631 in __libc_start_main (/lib/arm-linux-gnueabihf/libc.so.6+0x17631)
    
    SUMMARY: AddressSanitizer: unknown-crash /usr/include/arm-linux-gnueabihf/bits/string3.h:51 memcpy
    Shadow bytes around the buggy address:
      0x369016e0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x369016f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x36901700: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x36901710: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x36901720: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
    =>0x36901730:[00]00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      0x36901740: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      0x36901750: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      0x36901760: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      0x36901770: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      0x36901780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    Shadow byte legend (one shadow byte represents 8 application bytes):
      Addressable:           00
      Partially addressable: 01 02 03 04 05 06 07 
      Heap left redzone:       fa
      Heap right redzone:      fb
      Freed heap region:       fd
      Stack left redzone:      f1
      Stack mid redzone:       f2
      Stack right redzone:     f3
      Stack partial redzone:   f4
      Stack after return:      f5
      Stack use after scope:   f8
      Global redzone:          f9
      Global init order:       f6
      Poisoned by user:        f7
      Contiguous container OOB:fc
      ASan internal:           fe
    ==3054==ABORTING
    
    bug 
    opened by nemequ 28
  • Performance drop on low end / mobile architectures

    Issue by gpnuma from Tuesday Aug 13, 2013 at 22:33 GMT Originally opened as https://github.com/centaurean/sharc/issues/10


    According to these benchmarks from @nemequ (https://github.com/quixdb/squash) : http://quixdb.github.io/squash/benchmarks/core-i5-2400.html#enwik8 http://quixdb.github.io/squash/benchmarks/atom-d525.html#enwik8 SHARC seems to have a significant performance drop in comparison with other algorithms on the Atom platform, while being way ahead in compression speed on an Intel Core i5.

    enhancement question 
    opened by k0dai 28
  • Issues when using density inside the Blosc meta-compressor

    Hi, I am trying to add support for DENSITY into the Blosc meta-compressor. So, right now, my attempt lives here: https://github.com/FrancescAlted/c-blosc/tree/density, and in particular, you can see how DENSITY is called here: https://github.com/FrancescAlted/c-blosc/blob/density/blosc/blosc.c#L504

    However, I am running into issues when selecting the DENSITY codec:

    $ bench/bench density
    Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
    List of supported compressors in this build: blosclz,lz4,lz4hc,snappy,zlib,density
    Supported compression libraries:
      BloscLZ: 1.0.5
      LZ4: 1.7.0
      Snappy: 1.1.1
      Zlib: 1.2.8
      DENSITY: 0.12.5
    Using compressor: density
    Running suite: single
    --> 4, 2097152, 8, 19, density
    ********************** Run info ******************************
    Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
    Using synthetic data with 19 significant bits (out of 32)
    Dataset size: 2097152 bytes     Type size: 8 bytes
    Working set: 256.0 MB           Number of threads: 4
    ********************** Running benchmarks *********************
    memcpy(write):            515.7 us, 3878.2 MB/s
    memcpy(read):             247.1 us, 8095.3 MB/s
    Compression level: 0
    comp(write):      339.5 us, 5890.4 MB/s   Final bytes: 2097168  Ratio: 1.00
    decomp(read):     252.4 us, 7925.2 MB/s   OK
    Compression level: 1
    comp(write):     13871.5 us, 144.2 MB/s   Final bytes: 1204240  Ratio: 1.74
    decomp(read):     143.1 us, -0.0 MB/s     FAILED.  Error code: -1
    OK
    Compression level: 2
    comp(write):     10653.2 us, 187.7 MB/s   Final bytes: 1204240  Ratio: 1.74
    decomp(read):     230.5 us, -0.0 MB/s     FAILED.  Error code: -1
    OK
    Compression level: 3
    comp(write):     10549.5 us, 189.6 MB/s   Final bytes: 1204240  Ratio: 1.74
    decomp(read):     149.4 us, -0.0 MB/s     FAILED.  Error code: -1
    OK
    Compression level: 4
    comp(write):     7510.8 us, 266.3 MB/s    Final bytes: 1159184  Ratio: 1.81
    decomp(read):     143.1 us, -0.0 MB/s     FAILED.  Error code: -1
    OK
    Compression level: 5
    comp(write):     5459.7 us, 366.3 MB/s    Final bytes: 1159184  Ratio: 1.81
    decomp(read):     149.7 us, -0.0 MB/s     FAILED.  Error code: -1
    OK
    Compression level: 6
    comp(write):     3324.9 us, 601.5 MB/s    Final bytes: 1136656  Ratio: 1.85
    decomp(read):     148.9 us, -0.0 MB/s     FAILED.  Error code: -1
    OK
    Compression level: 7
    comp(write):     2294.0 us, 871.8 MB/s    Final bytes: 1125520  Ratio: 1.86
    decomp(read):     152.4 us, -0.0 MB/s     FAILED.  Error code: -1
    OK
    Compression level: 8
    comp(write):     2570.4 us, 778.1 MB/s    Final bytes: 1125520  Ratio: 1.86
    decomp(read):     174.4 us, -0.0 MB/s     FAILED.  Error code: -1
    OK
    Compression level: 9
    comp(write):     1798.7 us, 1111.9 MB/s   Final bytes: 1119824  Ratio: 1.87
    decomp(read):     252.0 us, -0.0 MB/s     FAILED.  Error code: -1
    OK
    
    Round-trip compr/decompr on 7.5 GB
    Elapsed time:      23.4 s, 721.2 MB/s
    

    The above is with 'master' branch (refreshed some minutes ago). With the 'dev' of DENSITY, I get somewhat better results:

    $ bench/bench density
    Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
    List of supported compressors in this build: blosclz,lz4,lz4hc,snappy,zlib,density
    Supported compression libraries:
      BloscLZ: 1.0.5
      LZ4: 1.7.0
      Snappy: 1.1.1
      Zlib: 1.2.8
      DENSITY: 0.12.5
    Using compressor: density
    Running suite: single
    --> 4, 2097152, 8, 19, density
    ********************** Run info ******************************
    Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
    Using synthetic data with 19 significant bits (out of 32)
    Dataset size: 2097152 bytes     Type size: 8 bytes
    Working set: 256.0 MB           Number of threads: 4
    ********************** Running benchmarks *********************
    memcpy(write):            511.3 us, 3911.4 MB/s
    memcpy(read):             246.3 us, 8121.1 MB/s
    Compression level: 0
    comp(write):      282.7 us, 7074.0 MB/s   Final bytes: 2097168  Ratio: 1.00
    decomp(read):     213.6 us, 9365.2 MB/s   OK
    Compression level: 1
    comp(write):     10286.4 us, 194.4 MB/s   Final bytes: 1206288  Ratio: 1.74
    decomp(read):    10408.2 us, 192.2 MB/s   OK
    Compression level: 2
    comp(write):     10418.7 us, 192.0 MB/s   Final bytes: 1206288  Ratio: 1.74
    decomp(read):    11595.9 us, 172.5 MB/s   OK
    Compression level: 3
    comp(write):     10521.3 us, 190.1 MB/s   Final bytes: 1206288  Ratio: 1.74
    decomp(read):    10770.5 us, 185.7 MB/s   OK
    Compression level: 4
    comp(write):     5685.5 us, 351.8 MB/s    Final bytes: 1160208  Ratio: 1.81
    decomp(read):    6010.8 us, 332.7 MB/s    OK
    Compression level: 5
    comp(write):     5879.1 us, 340.2 MB/s    Final bytes: 1160208  Ratio: 1.81
    decomp(read):    6056.7 us, 330.2 MB/s    OK
    Compression level: 6
    comp(write):     3476.0 us, 575.4 MB/s    Final bytes: 1137168  Ratio: 1.84
    decomp(read):    3381.0 us, 591.5 MB/s    OK
    Compression level: 7
    comp(write):     2396.0 us, 834.7 MB/s    Final bytes: 1125776  Ratio: 1.86
    decomp(read):     194.8 us, -0.0 MB/s     FAILED.  Error code: -1
    OK
    Compression level: 8
    comp(write):     2266.4 us, 882.5 MB/s    Final bytes: 1125776  Ratio: 1.86
    decomp(read):     164.7 us, -0.0 MB/s     FAILED.  Error code: -1
    OK
    Compression level: 9
    comp(write):     2056.3 us, 972.6 MB/s    Final bytes: 1119952  Ratio: 1.87
    decomp(read):    1559.9 us, 1282.1 MB/s   OK
    
    Round-trip compr/decompr on 7.5 GB
    Elapsed time:      40.1 s, 421.2 MB/s
    

    So, I suppose DENSITY is still in beta, but please consider c-blosc as another test bench. Second, I wonder why the speed is so low. For example, by using the LZ4 codec I am getting this:

    $ bench/bench lz4
    Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
    List of supported compressors in this build: blosclz,lz4,lz4hc,snappy,zlib,density
    Supported compression libraries:
      BloscLZ: 1.0.5
      LZ4: 1.7.0
      Snappy: 1.1.1
      Zlib: 1.2.8
      DENSITY: 0.12.5
    Using compressor: lz4
    Running suite: single
    --> 4, 2097152, 8, 19, lz4
    ********************** Run info ******************************
    Blosc version: 1.7.0.dev ($Date:: 2015-05-27 #$)
    Using synthetic data with 19 significant bits (out of 32)
    Dataset size: 2097152 bytes     Type size: 8 bytes
    Working set: 256.0 MB           Number of threads: 4
    ********************** Running benchmarks *********************
    memcpy(write):            536.8 us, 3725.7 MB/s
    memcpy(read):             250.7 us, 7978.5 MB/s
    Compression level: 0
    comp(write):      328.2 us, 6093.7 MB/s   Final bytes: 2097168  Ratio: 1.00
    decomp(read):     244.8 us, 8170.1 MB/s   OK
    Compression level: 1
    comp(write):      499.4 us, 4005.2 MB/s   Final bytes: 554512  Ratio: 3.78
    decomp(read):     268.6 us, 7445.3 MB/s   OK
    Compression level: 2
    comp(write):      472.6 us, 4231.5 MB/s   Final bytes: 498960  Ratio: 4.20
    decomp(read):     278.4 us, 7184.0 MB/s   OK
    Compression level: 3
    comp(write):      449.4 us, 4450.4 MB/s   Final bytes: 520824  Ratio: 4.03
    decomp(read):     323.2 us, 6188.7 MB/s   OK
    Compression level: 4
    comp(write):      440.6 us, 4539.6 MB/s   Final bytes: 332112  Ratio: 6.31
    decomp(read):     321.1 us, 6227.8 MB/s   OK
    Compression level: 5
    comp(write):      421.8 us, 4741.8 MB/s   Final bytes: 327112  Ratio: 6.41
    decomp(read):     309.7 us, 6458.5 MB/s   OK
    Compression level: 6
    comp(write):      465.4 us, 4297.5 MB/s   Final bytes: 226308  Ratio: 9.27
    decomp(read):     395.4 us, 5058.2 MB/s   OK
    Compression level: 7
    comp(write):      631.9 us, 3165.3 MB/s   Final bytes: 211880  Ratio: 9.90
    decomp(read):     564.4 us, 3543.5 MB/s   OK
    Compression level: 8
    comp(write):      602.6 us, 3318.9 MB/s   Final bytes: 220464  Ratio: 9.51
    decomp(read):     568.9 us, 3515.8 MB/s   OK
    Compression level: 9
    comp(write):      645.2 us, 3099.9 MB/s   Final bytes: 132154  Ratio: 15.87
    decomp(read):     694.6 us, 2879.3 MB/s   OK
    
    Round-trip compr/decompr on 7.5 GB
    Elapsed time:       3.8 s, 4497.0 MB/s
    

    which is roughly 10x faster.

    In case you want to experiment yourself, the support for DENSITY in c-blosc is via a shared library for now (requiring C99 is not supported right now in c-blosc because it has to support other codecs whose code is not C99 compliant). So, in case the shared libraries for DENSITY are installed in the system (say /usr/local/lib, and /usr/local/include for headers), here is how to compile c-blosc:

    $ mkdir build
    $ cd build
    $ CC="clang-3.5" CXX="clang++-3.5" CFLAGS="-O3" cmake ..
    $ CC="clang-3.5" CXX="clang++-3.5" CFLAGS="-O3" make
    $ bench/bench density    # bench executable ready to be used
    

    Thanks!

    enhancement 
    opened by FrancescAlted 14
  • Unaligned stores/loads

    ubsan detects a lot of misaligned stores/loads, which are undefined behavior:

    /home/nemequ/local/src/squash/plugins/density/density/src/kernel_chameleon_encode.c:126:38: runtime error: store to misaligned address 0x00000206ff52 for type 'density_byte', which requires 4 byte alignment
    0x00000206ff52: note: pointer points here
     69 70  0e 3f 39 90 98 7f 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  01 00 00 00 06 00
                  ^ 
    /home/nemequ/local/src/squash/plugins/density/density/src/kernel_chameleon_encode.c:126:38: runtime error: store to misaligned address 0x00000206ff56 for type 'density_byte', which requires 4 byte alignment
    0x00000206ff56: note: pointer points here
     70 72 69 6d 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  01 00 00 00 06 00 00 00  04 00
                 ^ 
    /home/nemequ/local/src/squash/plugins/density/density/src/kernel_chameleon_encode.c:126:38: runtime error: store to misaligned address 0x00000206ff5a for type 'density_byte', which requires 4 byte alignment
    0x00000206ff5a: note: pointer points here
     69 73  20 69 00 00 00 00 00 00  00 00 00 00 00 00 00 00  01 00 00 00 06 00 00 00  04 00 00 00 04 00
                  ^ 
    /home/nemequ/local/src/squash/plugins/density/density/src/kernel_chameleon_encode.c:126:38: runtime error: store to misaligned address 0x00000206ff5e for type 'density_byte', which requires 4 byte alignment
    0x00000206ff5e: note: pointer points here
     6e 20 66 61 00 00  00 00 00 00 00 00 00 00  01 00 00 00 06 00 00 00  04 00 00 00 04 00 00 00  00 00
                 ^ 
    /home/nemequ/local/src/squash/plugins/density/density/src/kernel_chameleon_encode.c:126:38: runtime error: store to misaligned address 0x00000206ff62 for type 'density_byte', which requires 4 byte alignment
    0x00000206ff62: note: pointer points here
     75 63  69 62 00 00 00 00 00 00  01 00 00 00 06 00 00 00  04 00 00 00 04 00 00 00  00 00 00 00 b7 73
                  ^ 
    /home/nemequ/local/src/squash/plugins/density/density/src/kernel_chameleon_encode.c:126:38: runtime error: store to misaligned address 0x00000206ff66 for type 'density_byte', which requires 4 byte alignment
    0x00000206ff66: note: pointer points here
     75 73 20 6f 00 00  01 00 00 00 06 00 00 00  04 00 00 00 04 00 00 00  00 00 00 00 b7 73 e4 b2  00 00
                 ^ 
    /home/nemequ/local/src/squash/plugins/density/density/src/kernel_chameleon_encode.c:126:38: runtime error: store to misaligned address 0x00000206ff6a for type 'density_byte', which requires 4 byte alignment
    0x00000206ff6a: note: pointer points here
     72 63  69 20 00 00 06 00 00 00  04 00 00 00 04 00 00 00  00 00 00 00 b7 73 e4 b2  00 00 00 00 00 00
                  ^ 
    /home/nemequ/local/src/squash/plugins/density/density/src/kernel_chameleon_encode.c:126:38: runtime error: store to misaligned address 0x00000206ff6e for type 'density_byte', which requires 4 byte alignment
    0x00000206ff6e: note: pointer points here
     6c 75 63 74 00 00  04 00 00 00 04 00 00 00  00 00 00 00 b7 73 e4 b2  00 00 00 00 00 00 00 00  00 00
                 ^ 
    /home/nemequ/local/src/squash/plugins/density/density/src/kernel_chameleon_encode.c:126:38: runtime error: store to misaligned address 0x00000206ff72 for type 'density_byte', which requires 4 byte alignment
    0x00000206ff72: note: pointer points here
     75 73  20 65 00 00 04 00 00 00  00 00 00 00 b7 73 e4 b2  00 00 00 00 00 00 00 00  00 00 00 00 00 00
                  ^ 
    /home/nemequ/local/src/squash/plugins/density/density/src/kernel_chameleon_encode.c:126:38: runtime error: store to misaligned address 0x00000207007a for type 'density_byte', which requires 4 byte alignment
    0x00000207007a: note: pointer points here
     76 65  72 72 00 00 00 00 00 00  c8 41 73 90 98 7f 00 00  c8 41 73 90 98 7f 00 00  70 00 07 02 00 00
                  ^ 
    /home/nemequ/local/src/squash/plugins/density/density/src/kernel_chameleon_encode.c:126:38: runtime error: store to misaligned address 0x00000207007e for type 'density_byte', which requires 4 byte alignment
    0x00000207007e: note: pointer points here
     61 2e 20 43 00 00  c8 41 73 90 98 7f 00 00  c8 41 73 90 98 7f 00 00  70 00 07 02 00 00 00 00  70 00
                 ^ 
    /home/nemequ/local/src/squash/plugins/density/density/src/kernel_chameleon_encode.c:126:38: runtime error: store to misaligned address 0x000002070082 for type 'density_byte', which requires 4 byte alignment
    0x000002070082: note: pointer points here
     72 61  73 20 73 90 98 7f 00 00  c8 41 73 90 98 7f 00 00  70 00 07 02 00 00 00 00  70 00 07 02 00 00
                  ^ 
    /home/nemequ/local/src/squash/plugins/density/density/src/kernel_chameleon_encode.c:126:38: runtime error: store to misaligned address 0x000002070086 for type 'density_byte', which requires 4 byte alignment
    0x000002070086: note: pointer points here
     69 6e 74 65 00 00  c8 41 73 90 98 7f 00 00  70 00 07 02 00 00 00 00  70 00 07 02 00 00 00 00  79 e9
                 ^ 
    /home/nemequ/local/src/squash/plugins/density/density/src/kernel_chameleon_encode.c:126:38: runtime error: store to misaligned address 0x00000207008a for type 'density_byte', which requires 4 byte alignment
    0x00000207008a: note: pointer points here
     72 64  75 6d 73 90 98 7f 00 00  70 00 07 02 00 00 00 00  70 00 07 02 00 00 00 00  79 e9 0d a3 75 c0
                  ^ 
    /home/nemequ/local/src/squash/plugins/density/density/src/kernel_chameleon_encode.c:126:38: runtime error: store to misaligned address 0x00000207008e for type 'density_byte', which requires 4 byte alignment
    0x00000207008e: note: pointer points here
     20 76 65 6c 00 00  70 00 07 02 00 00 00 00  70 00 07 02 00 00 00 00  79 e9 0d a3 75 c0 8d d4  84 f2
                 ^ 
    /home/nemequ/local/src/squash/plugins/density/density/src/kernel_chameleon_encode.c:126:38: runtime error: store to misaligned address 0x000002070092 for type 'density_byte', which requires 4 byte alignment
    0x000002070092: note: pointer points here
     20 6e  69 73 07 02 00 00 00 00  70 00 07 02 00 00 00 00  79 e9 0d a3 75 c0 8d d4  84 f2 35 bf 79 df
                  ^ 
    /home/nemequ/local/src/squash/plugins/density/density/src/kernel_chameleon_encode.c:126:38: runtime error: store to misaligned address 0x000002070096 for type 'density_byte', which requires 4 byte alignment
    0x000002070096: note: pointer points here
     6c 20 69 6e 00 00  70 00 07 02 00 00 00 00  79 e9 0d a3 75 c0 8d d4  84 f2 35 bf 79 df a2 f6  3d 5e
                 ^ 
    /home/nemequ/local/src/squash/plugins/density/density/src/kernel_chameleon_encode.c:126:38: runtime error: store to misaligned address 0x00000207009a for type 'density_byte', which requires 4 byte alignment
    0x00000207009a: note: pointer points here
     20 66  61 63 07 02 00 00 00 00  79 e9 0d a3 75 c0 8d d4  84 f2 35 bf 79 df a2 f6  3d 5e 6a cc 51 da
                  ^ 
    /home/nemequ/local/src/squash/plugins/density/density/src/kernel_chameleon_encode.c:126:38: runtime error: store to misaligned address 0x00000207009e for type 'density_byte', which requires 4 byte alignment
    0x00000207009e: note: pointer points here
     69 6c 69 73 00 00  79 e9 0d a3 75 c0 8d d4  84 f2 35 bf 79 df a2 f6  3d 5e 6a cc 51 da 51 c3  9f 56
                 ^ 
    /home/nemequ/local/src/squash/plugins/density/density/src/kernel_chameleon_encode.c:126:38: runtime error: store to misaligned address 0x0000020700a2 for type 'density_byte', which requires 4 byte alignment
    0x0000020700a2: note: pointer points here
     69 73  2e 20 0d a3 75 c0 8d d4  84 f2 35 bf 79 df a2 f6  3d 5e 6a cc 51 da 51 c3  9f 56 07 7d 4b 00
                  ^ 
    /home/nemequ/local/src/squash/plugins/density/density/src/kernel_chameleon_encode.c:126:38: runtime error: store to misaligned address 0x0000020700a6 for type 'density_byte', which requires 4 byte alignment
    0x0000020700a6: note: pointer points here
     43 75 72 61 8d d4  84 f2 35 bf 79 df a2 f6  3d 5e 6a cc 51 da 51 c3  9f 56 07 7d 4b 00 2d 06  b6 78
                 ^ 
    /home/nemequ/local/src/squash/plugins/density/density/src/kernel_chameleon_encode.c:126:38: runtime error: store to misaligned address 0x0000020700aa for type 'density_byte', which requires 4 byte alignment
    0x0000020700aa: note: pointer points here
     62 69  74 75 35 bf 79 df a2 f6  3d 5e 6a cc 51 da 51 c3  9f 56 07 7d 4b 00 2d 06  b6 78 64 8b d2 e2
                  ^ 
    /home/nemequ/local/src/squash/plugins/density/density/src/kernel_chameleon_encode.c:126:38: runtime error: store to misaligned address 0x0000020700ae for type 'density_byte', which requires 4 byte alignment
    0x0000020700ae: note: pointer points here
     72 20 73 6f a2 f6  3d 5e 6a cc 51 da 51 c3  9f 56 07 7d 4b 00 2d 06  b6 78 64 8b d2 e2 c2 d9  a4 9b
                 ^ 
    /home/nemequ/local/src/squash/plugins/density/density/src/kernel_chameleon_encode.c:126:38: runtime error: store to misaligned address 0x0000020700b2 for type 'density_byte', which requires 4 byte alignment
    0x0000020700b2: note: pointer points here
     6c 6c  69 63 6a cc 51 da 51 c3  9f 56 07 7d 4b 00 2d 06  b6 78 64 8b d2 e2 c2 d9  a4 9b e8 ee 63 94
                  ^ 
    /home/nemequ/local/src/squash/plugins/density/density/src/kernel_chameleon_encode.c:126:38: runtime error: store to misaligned address 0x0000020700b6 for type 'density_byte', which requires 4 byte alignment
    0x0000020700b6: note: pointer points here
     69 74 75 64 51 c3  9f 56 07 7d 4b 00 2d 06  b6 78 64 8b d2 e2 c2 d9  a4 9b e8 ee 63 94 2a a5  33 a6
                 ^ 
    /home/nemequ/local/src/squash/plugins/density/density/src/kernel_chameleon_encode.c:126:38: runtime error: store to misaligned address 0x0000020700ba for type 'density_byte', which requires 4 byte alignment
    0x0000020700ba: note: pointer points here
     69 6e  20 74 07 7d 4b 00 2d 06  b6 78 64 8b d2 e2 c2 d9  a4 9b e8 ee 63 94 2a a5  33 a6 63 ad d1 2f
                  ^ 
    /home/nemequ/local/src/squash/plugins/density/density/src/kernel_chameleon_encode.c:126:38: runtime error: store to misaligned address 0x0000020700be for type 'density_byte', which requires 4 byte alignment
    0x0000020700be: note: pointer points here
     6f 72 74 6f 2d 06  b6 78 64 8b d2 e2 c2 d9  a4 9b e8 ee 63 94 2a a5  33 a6 63 ad d1 2f 8b 2e  e5 f0
                 ^ 
    /home/nemequ/local/src/squash/plugins/density/density/src/kernel_chameleon_encode.c:126:38: runtime error: store to misaligned address 0x0000020700c2 for type 'density_byte', which requires 4 byte alignment
    0x0000020700c2: note: pointer points here
     72 20  76 65 64 8b d2 e2 c2 d9  a4 9b e8 ee 63 94 2a a5  33 a6 63 ad d1 2f 8b 2e  e5 f0 1c ef d4 ac
                  ^ 
    /home/nemequ/local/src/squash/plugins/density/density/src/kernel_chameleon_encode.c:126:38: runtime error: store to misaligned address 0x0000020700c6 for type 'density_byte', which requires 4 byte alignment
    0x0000020700c6: note: pointer points here
     6c 20 63 6f c2 d9  a4 9b e8 ee 63 94 2a a5  33 a6 63 ad d1 2f 8b 2e  e5 f0 1c ef d4 ac 41 87  54 c2
                 ^ 
    /home/nemequ/local/src/squash/plugins/density/density/src/kernel_chameleon_encode.c:109:33: runtime error: store to misaligned address 0x00000206ffc4 for type 'density_chameleon_signature', which requires 8 byte alignment
    0x00000206ffc4: note: pointer points here
      20 73 65 64 00 00 00 00  00 00 00 00 20 74 65 6d  70 6f 72 20 70 75 72 75  73 20 63 75 72 73 75 73
                  ^ 
    /home/nemequ/local/src/squash/plugins/density/density/src/kernel_chameleon_encode.c:126:38: runtime error: store to misaligned address 0x0000020700d2 for type 'density_byte', which requires 4 byte alignment
    0x0000020700d2: note: pointer points here
     2a a5  33 a6 63 ad d1 2f 8b 2e  e5 f0 1c ef d4 ac 41 87  54 c2 46 e2 4f 8c db 44  84 3c 1a 13 f1 4f
                  ^ 
    /home/nemequ/local/src/squash/plugins/density/density/src/kernel_chameleon_encode.c:126:38: runtime error: store to misaligned address 0x0000020700d6 for type 'density_byte', which requires 4 byte alignment
    0x0000020700d6: note: pointer points here
     20 61 75 63 8b 2e  e5 f0 1c ef d4 ac 41 87  54 c2 46 e2 4f 8c db 44  84 3c 1a 13 f1 4f 5f 8e  eb b8
                 ^ 
    /home/nemequ/local/src/squash/plugins/density/density/src/kernel_chameleon_encode.c:126:38: runtime error: store to misaligned address 0x0000020700da for type 'density_byte', which requires 4 byte alignment
    0x0000020700da: note: pointer points here
     74 6f  72 2e 1c ef d4 ac 41 87  54 c2 46 e2 4f 8c db 44  84 3c 1a 13 f1 4f 5f 8e  eb b8 07 d2 07 b6
                  ^ 
    /home/nemequ/local/src/squash/plugins/density/density/src/kernel_chameleon_encode.c:126:38: runtime error: store to misaligned address 0x00000207089a for type 'density_byte', which requires 4 byte alignment
    0x00000207089a: note: pointer points here
     8b 39  f0 21 20 65 75 69 73 6d  6f 64 2c 20 6e 6f 6e 20  76 61 72 69 75 73 20 66  65 6c 69 73 20 64
                  ^ 
    /home/nemequ/local/src/squash/plugins/density/density/src/kernel_chameleon_encode_template.h:117:25: runtime error: store to misaligned address 0x00000207082a for type 'density_chameleon_signature', which requires 8 byte alignment
    0x00000207082a: note: pointer points here
     75 73  20 66 75 65 74 20 65 73  74 20 e3 f9 20 64 69 63  74 75 6d 2e 65 f6 29 f5  6b 9a 70 75 73 20
                  ^ 
    /home/nemequ/local/src/squash/plugins/density/density/src/kernel_chameleon_decode.c:97:14: runtime error: load of misaligned address 0x00000206ff52 for type 'density_byte', which requires 4 byte alignment
    0x00000206ff52: note: pointer points here
     69 70  0e 3f 70 72 69 6d 69 73  20 69 6e 20 66 61 75 63  69 62 75 73 20 6f 72 63  69 20 6c 75 63 74
                  ^ 
    /home/nemequ/local/src/squash/plugins/density/density/src/kernel_chameleon_decode.c:97:14: runtime error: load of misaligned address 0x00000206ff56 for type 'density_byte', which requires 4 byte alignment
    0x00000206ff56: note: pointer points here
     70 72 69 6d 69 73  20 69 6e 20 66 61 75 63  69 62 75 73 20 6f 72 63  69 20 6c 75 63 74 75 73  20 65
                 ^ 
    /home/nemequ/local/src/squash/plugins/density/density/src/kernel_chameleon_decode.c:97:14: runtime error: load of misaligned address 0x00000206ff5a for type 'density_byte', which requires 4 byte alignment
    0x00000206ff5a: note: pointer points here
     69 73  20 69 6e 20 66 61 75 63  69 62 75 73 20 6f 72 63  69 20 6c 75 63 74 75 73  20 65 74 20 75 6c
                  ^ 
    /home/nemequ/local/src/squash/plugins/density/density/src/kernel_chameleon_decode.c:97:14: runtime error: load of misaligned address 0x00000206ff5e for type 'density_byte', which requires 4 byte alignment
    0x00000206ff5e: note: pointer points here
     6e 20 66 61 75 63  69 62 75 73 20 6f 72 63  69 20 6c 75 63 74 75 73  20 65 74 20 75 6c 3a 6d  65 73
                 ^ 
    /home/nemequ/local/src/squash/plugins/density/density/src/kernel_chameleon_decode.c:97:14: runtime error: load of misaligned address 0x00000206ff62 for type 'density_byte', which requires 4 byte alignment
    0x00000206ff62: note: pointer points here
     75 63  69 62 75 73 20 6f 72 63  69 20 6c 75 63 74 75 73  20 65 74 20 75 6c 3a 6d  65 73 20 70 6f 73
                  ^ 
    /home/nemequ/local/src/squash/plugins/density/density/src/kernel_chameleon_decode.c:97:14: runtime error: load of misaligned address 0x00000206ff66 for type 'density_byte', which requires 4 byte alignment
    0x00000206ff66: note: pointer points here
     75 73 20 6f 72 63  69 20 6c 75 63 74 75 73  20 65 74 20 75 6c 3a 6d  65 73 20 70 6f 73 75 65  72 65
                 ^ 
    /home/nemequ/local/src/squash/plugins/density/density/src/kernel_chameleon_decode.c:97:14: runtime error: load of misaligned address 0x00000206ff6a for type 'density_byte', which requires 4 byte alignment
    0x00000206ff6a: note: pointer points here
     72 63  69 20 6c 75 63 74 75 73  20 65 74 20 75 6c 3a 6d  65 73 20 70 6f 73 75 65  72 65 20 63 75 62
                  ^ 
    /home/nemequ/local/src/squash/plugins/density/density/src/kernel_chameleon_decode.c:97:14: runtime error: load of misaligned address 0x00000206ff6e for type 'density_byte', which requires 4 byte alignment
    0x00000206ff6e: note: pointer points here
     6c 75 63 74 75 73  20 65 74 20 75 6c 3a 6d  65 73 20 70 6f 73 75 65  72 65 20 63 75 62 69 6c  69 61
                 ^ 
    /home/nemequ/local/src/squash/plugins/density/density/src/kernel_chameleon_decode.c:85:24: runtime error: load of misaligned address 0x00000206ffc4 for type 'density_byte', which requires 8 byte alignment
    0x00000206ffc4: note: pointer points here
      20 73 65 64 00 00 00 00  80 00 00 00 20 74 65 6d  70 6f 72 20 70 75 72 75  73 20 63 75 72 73 75 73
                  ^ 
    /home/nemequ/local/src/squash/plugins/density/density/src/kernel_chameleon_decode.c:97:14: runtime error: load of misaligned address 0x000002071cda for type 'density_byte', which requires 4 byte alignment
    0x000002071cda: note: pointer points here
     00 00  e3 f9 20 64 69 63 74 75  6d 2e 65 f6 29 f5 6b 9a  70 75 73 20 13 23 72 a7  b4 b9 65 20 61 20
                  ^
    

    I only tested chameleon there, but it's probably a good bet that cheetah and lion have similar issues.
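
    The reports above are what UndefinedBehaviorSanitizer prints when a byte pointer is cast to a wider type and dereferenced at an address that is not a multiple of that type's alignment. One common way to keep a 4-byte work unit without that undefined behavior is to go through memcpy, which compilers typically lower to a single load or store on targets that tolerate unaligned access. The sketch below is only an illustration of that technique; the helper names are hypothetical and are not taken from DENSITY's source.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers, not part of DENSITY: read or write a 32-bit work
 * unit at any byte address without dereferencing a misaligned pointer. */
static inline uint32_t load_u32_unaligned(const uint8_t *src) {
    uint32_t value;
    assert(src != NULL);
    memcpy(&value, src, sizeof value);
    return value;
}

static inline void store_u32_unaligned(uint8_t *dst, uint32_t value) {
    assert(dst != NULL);
    memcpy(dst, &value, sizeof value);
}

/* Writes `value` at 4-byte steps starting from a deliberately odd offset,
 * reads it back each time, and returns 1 when every round trip matches. */
int unaligned_roundtrip_ok(uint32_t value) {
    uint8_t buffer[32] = {0};
    for (size_t offset = 1; offset + sizeof value <= sizeof buffer; offset += 4) {
        store_u32_unaligned(buffer + offset, value);
        if (load_u32_unaligned(buffer + offset) != value) return 0;
    }
    return 1;
}
```

    Building this with -fsanitize=undefined should produce no alignment report, because no misaligned pointer of type uint32_t is ever formed.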

    bug 
    opened by nemequ 10
  • MSFT build ?

    Does density build with Microsoft VS ? I see:

    #if !defined(__clang__) && !defined(__GNUC__)
    #error Unsupported compiler.
    #endif

    but then many checks for _WIN ...
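
    The guard quoted above rejects every compiler except Clang and GCC. As a rough sketch of how a third branch for Microsoft's compiler could be added, the example below tests the predefined macros __clang__, __GNUC__ and _MSC_VER. The EXAMPLE_* names and the functions are hypothetical and only show the shape of such a guard; they are not DENSITY's actual macros.

```c
#include <assert.h>

/* EXAMPLE_* names are hypothetical; only the tested macros are real
 * compiler-defined symbols. Clang is checked first because it also
 * defines __GNUC__ (and _MSC_VER when running as clang-cl). */
#if defined(__clang__)
#define EXAMPLE_COMPILER_NAME "clang"
#define EXAMPLE_FORCE_INLINE inline __attribute__((always_inline))
#elif defined(__GNUC__)
#define EXAMPLE_COMPILER_NAME "gcc"
#define EXAMPLE_FORCE_INLINE inline __attribute__((always_inline))
#elif defined(_MSC_VER)
#define EXAMPLE_COMPILER_NAME "msvc"
#define EXAMPLE_FORCE_INLINE __forceinline
#else
#error Unsupported compiler.
#endif

/* A function using the per-compiler inlining attribute chosen above. */
static EXAMPLE_FORCE_INLINE int example_add(int a, int b) {
    return a + b;
}

const char *example_compiler_name(void) {
    return EXAMPLE_COMPILER_NAME;
}

/* Returns 1 when the selected branch produced a usable configuration. */
int example_compiler_selfcheck(void) {
    assert(sizeof(EXAMPLE_COMPILER_NAME) > 1); /* name is not empty */
    return example_add(2, 3) == 5;
}
```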

    enhancement 
    opened by mckellyln 7
  • Allow use of external dictionary

    Right now, density adds (I believe) 16 bytes of header data to each block of compressed data. For larger pieces of data this obviously isn't much of an issue and, while for most cases it's nice to have this information with each piece of data, there are applications where 16 bytes is a lot of data.

    For example, one use case I'm interested in is compressing small pieces of data, such as rows (or cells) in a database, a blob of JSON/protobuf/thrift data, etc. In such cases, it would be preferable to communicate everything you can out of band in order to reduce duplication. In the case of a database, this information could be stored once per database instead of once per row/cell. In other cases it could be hard-coded per application.

    I don't think the stream API would be very useful for such use cases, but if there is a good way to expose a simple buffer API for this sort of thing, I think there could be a market for it. I think this would just require a simple function which just uses DENSITY_ENCODE_OUTPUT_TYPE_WITHOUT_HEADER_NOR_FOOTER, in which case I'll add codecs to Squash for this, but given how simple (I think) it is, I thought you might be interested in including it in density, too.

    enhancement 
    opened by nemequ 7
  • provide test(s)

    Hi!

    I am trying to compile density cross-platform using CMake, and it does compile with minor modifications. However, I am missing tests that verify the validity of your library. It would help if you could commit at least one: a simple cross-platform program that returns 0 or 1 depending on the test outcome.
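
    A test of the kind requested here can be as small as a round trip: compress, decompress, compare. The sketch below assumes a hypothetical codec_fn signature and uses a plain copy as a stand-in codec so that it runs on its own; a real test would wrap the library's compress and decompress calls behind that signature. None of these names come from DENSITY.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Codec signature used by this sketch only. Returns the number of bytes
 * written to `out`, or 0 on failure. */
typedef size_t (*codec_fn)(const uint8_t *in, size_t in_size,
                           uint8_t *out, size_t out_capacity);

/* Stand-in codec (plain copy) so that the harness is self-contained. */
static size_t copy_codec(const uint8_t *in, size_t in_size,
                         uint8_t *out, size_t out_capacity) {
    if (in_size == 0 || in_size > out_capacity) return 0;
    memcpy(out, in, in_size);
    return in_size;
}

/* Returns 0 when decompress(compress(data)) reproduces data, 1 otherwise. */
int roundtrip_test(codec_fn compress, codec_fn decompress,
                   const uint8_t *data, size_t size) {
    uint8_t packed[512];
    uint8_t unpacked[512];
    assert(size > 0 && size <= sizeof unpacked);

    size_t packed_size = compress(data, size, packed, sizeof packed);
    if (packed_size == 0) return 1;

    size_t unpacked_size = decompress(packed, packed_size,
                                      unpacked, sizeof unpacked);
    if (unpacked_size != size) return 1;

    return memcmp(data, unpacked, size) == 0 ? 0 : 1;
}

/* Runs the harness over a small repetitive input with the stand-in codec. */
int run_roundtrip_tests(void) {
    uint8_t sample[256];
    for (size_t i = 0; i < sizeof sample; i++) sample[i] = (uint8_t)(i % 7);
    return roundtrip_test(copy_codec, copy_codec, sample, sizeof sample);
}
```

    A main function that simply returns run_roundtrip_tests() would give the 0 or 1 exit status described above, which most build systems can consume as a pass/fail result.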

    Questions to my changes:

    Could you tell me whether the includes utime.h and unistd.h are really required? There does not seem to be any detrimental effect if I comment them out, and MSVC does not support them.

    Another question: the structs density_block_header and density_main_footer are empty, which causes compile error C2016 on MSVC. A quick fix for me was to insert a short dummy variable; however, I do not know the consequences of that action.
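
    On the empty structs: ISO C requires a struct to have at least one member, and GCC and Clang accept empty ones only as an extension (with size 0 in C), which is why MSVC rejects them. Adding a dummy member is a portable fix, but it changes sizeof from 0 to at least 1, so it matters only if some code derives a serialized size from sizeof. Whether DENSITY does so is not established here. The sketch below uses hypothetical names and shows one way to keep the on-disk size independent of the placeholder.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a header struct with no real fields. The
 * placeholder member exists only to satisfy ISO C and MSVC. */
typedef struct {
    unsigned char unused_placeholder; /* never serialized */
} example_block_header;

/* The serialized size is tracked separately from sizeof, because the
 * placeholder makes sizeof(example_block_header) at least 1. */
#define EXAMPLE_BLOCK_HEADER_WIRE_SIZE 0u

/* Writes the (empty) header and returns the number of bytes emitted. */
size_t example_block_header_write(unsigned char *out, size_t capacity,
                                  const example_block_header *header) {
    assert(header != NULL);
    (void)out;      /* nothing to emit for an empty header */
    (void)capacity;
    return EXAMPLE_BLOCK_HEADER_WIRE_SIZE;
}
```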

    I am not in any hurry here, as I am just trying to build your lib with CMake (as an experiment) and I currently do not plan to use it. However, if everything works I can create a pull request for the CMake build (you can see it at https://github.com/toeb/density).

    Cheers

    Tobi

    opened by toeb 6
  • Add missing Make dependencies

    Hello

    This pull request fixes the build script of this project. Specifically, it adds missing Make dependencies so that the targets of the project are re-generated correctly whenever there are updates to any of the dependent source files.

    In this way, the project is incrementally built and we no longer sacrifice time in clean builds (i.e., builds after a make clean).

    Note that this fix follows the best practices for tracking dependencies automatically (through gcc -MD)

    For more details, see here. https://www.gnu.org/software/make/manual/html_node/Automatic-Prerequisites.html

    opened by StefanosChaliasos 5
  • Memory leaks in dev branch

    During compression:

    ==32661== 4,096 bytes in 1 blocks are definitely lost in loss record 21 of 22
    ==32661==    at 0x482E364: malloc (vg_replace_malloc.c:292)
    ==32661==    by 0x4D8F63F: density_memory_teleport_allocate (memory_teleport.c:40)
    ==32661==    by 0x4D8FD93: density_stream_create (stream.c:40)
    ==32661==    by 0x4D8095F: squash_density_create_stream (squash-density.c:213)
    ==32661==    by 0x486264F: squash_codec_create_stream_with_options (codec.c:507)
    ==32661==    by 0x4862F4B: squash_codec_process_file_with_options (codec.c:1064)
    ==32661==    by 0x486308D: squash_codec_compress_file_with_options (codec.c:1137)
    ==32661==    by 0x9879: main (squash.c:290)
    ==32661== 
    ==32661== 262,172 bytes in 1 blocks are possibly lost in loss record 22 of 22
    ==32661==    at 0x482E364: malloc (vg_replace_malloc.c:292)
    ==32661==    by 0x4D8F35D: density_encode_init (main_encode.c:87)
    ==32661==    by 0x4D8FE81: density_stream_compress_init (stream.c:98)
    ==32661==    by 0x4D805F1: squash_density_process_stream (squash-density.c:406)
    ==32661==    by 0x4866125: squash_stream_process_internal (stream.c:515)
    ==32661==    by 0x48662CF: squash_stream_process (stream.c:621)
    ==32661==    by 0x4862FC5: squash_codec_process_file_with_options (codec.c:1087)
    ==32661==    by 0x486308D: squash_codec_compress_file_with_options (codec.c:1137)
    ==32661==    by 0x9879: main (squash.c:290)
    

    And decompression:

    ==629== 4,096 bytes in 1 blocks are definitely lost in loss record 21 of 21
    ==629==    at 0x482E364: malloc (vg_replace_malloc.c:292)
    ==629==    by 0x4D8F63F: density_memory_teleport_allocate (memory_teleport.c:40)
    ==629==    by 0x4D8FD93: density_stream_create (stream.c:40)
    ==629==    by 0x4D8095F: squash_density_create_stream (squash-density.c:213)
    ==629==    by 0x486264F: squash_codec_create_stream_with_options (codec.c:507)
    ==629==    by 0x4862F4B: squash_codec_process_file_with_options (codec.c:1064)
    ==629==    by 0x48630B9: squash_codec_decompress_file_with_options (codec.c:1162)
    ==629==    by 0x988D: main (squash.c:292)
    
    bug 
    opened by nemequ 5
  • Difficult to use as submodule for build systems other than make and autotools

    It's currently a pain to use libssc as a submodule with build systems other than simple makefiles since there is magic for building 1p/2p versions of some files. If you want to push people to use the shared library I think this issue can be ignored, but I don't think you do.

    FWIW, this is causing problems for squash (which uses CMake), but similar issues should pop up for virtually every build system other than simple makefiles (or autotools). Basically, people will have to re-implement that logic in whatever build system they're using (scons, waf, nmake, etc., as well as IDE-specific stuff like Visual Studio project files, xcode, etc. The CMake documentation lists a subset of the possibilities).

    opened by nemequ 5
  • GCC warnings from public header when using -Wstrict-prototypes

    It's currently not possible to use libssc with -Wstrict-prototypes in GCC without warnings, since the public header contains some errors.

    Note that this occurs when trying to use ssc_api.h, not when compiling libssc itself. Even if you're using the shared library compiled with libssc's Makefile, gcc will emit these when you try to use ssc_api.h in your software.

    Trivial patch at http://paste.fedoraproject.org/52208/83772366/
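
    The linked patch is not reproduced here. As general background, the usual trigger for -Wstrict-prototypes is a function declared with empty parentheses, and the usual fix is to write (void). The declarations below are hypothetical and are not copied from ssc_api.h.

```c
#include <assert.h>

/* Hypothetical declarations, not copied from ssc_api.h.
 *
 * In C (before C23), empty parentheses declare a function with an
 * unspecified parameter list, which -Wstrict-prototypes reports:
 *
 *     int example_version_major();      // warning: not a prototype
 *
 * Writing (void) makes it a real prototype taking no arguments. */
int example_version_major(void);

/* A prototype that lists its parameters is also accepted as-is. */
int example_is_at_least(int major);

int example_version_major(void) {
    return 0;
}

/* Returns 1 when the reported major version is >= `major`, else 0. */
int example_is_at_least(int major) {
    assert(major >= 0);
    return example_version_major() >= major;
}
```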

    opened by nemequ 4
  • Bug and Security

    Cool project, looks promising. Has the master or dev version fixed the shortened decompressed output bug mentioned on https://github.com/inikep/lzbench and/or the security vulnerability mentioned on https://github.com/nemequ/compfuzz/wiki/Results ? If so, any plans to release a new version?

    opened by mrbluecoat 0
  • API issue: density_decompress () requires a larger buffer than needed to store the decompressed data

    Currently, density_decompress() accesses bytes past the end of the unpacked buffer, so a buffer of density_decompress_safe_size() bytes must be allocated. This makes density less fast and easy to use than some alternatives in some applications. For example, when several chunks of a large buffer are unpacked in multiple threads into one large destination buffer, it cannot be done in place without additional copying. I understand that accessing memory past the end of the unpacked buffer avoids additional checks for speed reasons, but the decompression loop could stop at some distance before the end and continue with a safe version of the algorithm that does not go beyond the unpacked buffer. This would simplify use of the API without losing speed.
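
    The suggestion in this issue (a fast loop that stops short of the end, followed by a safe tail) can be illustrated with a toy expander. The sketch below is not DENSITY's decoder; expand_exact and WIDE_STEP are hypothetical names. The fast loop always stores a full 8-byte block, so it runs only while such a store still fits; the tail then writes exact byte counts and never goes past the output buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { WIDE_STEP = 8 };  /* bytes stored per fast-loop iteration */

/* Illustrative only: expands each input byte into `repeat` output bytes
 * (1..WIDE_STEP). Returns the number of bytes produced. */
size_t expand_exact(const uint8_t *in, size_t in_size, size_t repeat,
                    uint8_t *out, size_t out_size) {
    assert(repeat >= 1 && repeat <= WIDE_STEP);
    assert(out_size >= in_size * repeat);

    size_t i = 0, written = 0;

    /* Fast path: unconditional wide store, then advance by `repeat`.
     * The surplus bytes are overwritten by the following iterations. */
    while (i < in_size && out_size - written >= WIDE_STEP) {
        uint8_t block[WIDE_STEP];
        memset(block, in[i], sizeof block);
        memcpy(out + written, block, sizeof block);
        written += repeat;
        i++;
    }

    /* Safe tail: exact-sized writes only, never past out + out_size. */
    for (; i < in_size; i++) {
        memset(out + written, in[i], repeat);
        written += repeat;
    }
    return written;
}
```

    With this split, the caller can pass a destination of exactly the decompressed size, for instance a slice of a larger shared buffer, and only the last few work units pay for the slower path.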

    opened by Luke546 3
  • Rust port

    Hi!

    I'm interested in porting this project over to ~~Golang~~ Rust. I have a project I'm working on, and I think it would be pretty cool to use this library. I know this is covered by the BSD-3 license (which I believe allows for derived works), but I wanted to ask @k0dai for permission and hear his thoughts on it.

    opened by Crypto-Spartan 23
  • Creating a C++/CLI wrapper for C#

    Could anyone make an API wrapper for this library that is compatible with C#? I tried a lot but couldn't get it to work. Or is there a simpler alternative?

    opened by DarkKnight17 0
  • make install target

    Hello, is it possible to have a make install target added to the Makefile? This would be beneficial for package manager contributors to work with. Originally requested from here.

    enhancement 
    opened by alebcay 1
  • Flattening of directory hierarchy ?

    Quoting @191919

    @gpnuma Would you please consider flattening the directory hierarchy of density code, i.e., put all source files in a single directory? The file names already do the organization job well.

    enhancement 
    opened by k0dai 1
Releases(density-0.14.2)
  • density-0.14.2(Feb 12, 2018)

    0.14.2

    February 12, 2018

    • Improved chameleon decode speed
    • Added data hash checks and display option in benchmark
    • Now using makefiles as build system
    • Big endian support correctly implemented and tested
    • Improved continuous integration tests
  • density-0.14.1(Jan 20, 2018)

    0.14.1

    January 20, 2018

    • Added MSVC support
    • Added continuous integration on travis and appveyor
    • Premake script improvement
    • Various codebase improvements
  • density-0.14.0(Jan 16, 2018)

    0.14.0

    January 16, 2018

    • First stable version of DENSITY
    • Complete project reorganization and API rewrite
    • Many stability fixes and improvements
    • Fast revert to conditional copy for incompressible input
    • Custom dictionaries in API
    • Improvements in compression ratio and speed
  • density-0.12.5-beta(Jun 20, 2015)

    0.12.5 beta

    June 20, 2015

    • Added conditional main footer read/write
    • Improved teleport staging buffer management
    • Regression - a minimum buffer output size has to be ensured to avoid signature loss
    • Modified the minimum lookahead and the resulting minimum buffer size in the API
    • Lion : corrected a signature interception problem due to an increase in process unit size
    • Lion : corrected chunk count conditions for new block / mode marker detection
    • Lion : modified end of stream marker conditions
    • Stability fixes and improvements
  • density-0.12.4-beta(May 25, 2015)

    0.12.4 beta

    May 25, 2015

    • Removed remaining potential occurrences of undefined behavior
    • Implemented parallelizable decompressible output block header reads/writes (disabled by default)
  • density-0.12.3-beta(May 20, 2015)

    0.12.3 beta

    May 20, 2015

    • New lion algorithm, faster and more efficient
    • Compiler specific optimizations
    • Switched to premake 5 to benefit from link time optimizations
    • Various fixes and improvements
  • density-0.12.2-beta(May 4, 2015)

    0.12.2 beta

    May 4, 2015

    • Added an integrated in-memory benchmark
    • Better Windows compatibility
    • Fixed misaligned load/stores
    • Switched to the premake build system
    • Performance optimizations (pointers, branches, loops ...)
    • Various fixes and improvements
  • density-0.12.1-beta(Apr 3, 2015)

    0.12.1 beta

    April 3, 2015

    • Better unrolling readability and efficiency
    • Improved read speed of dictionary/predictions entries
    • Implemented case generators in cheetah to speed up decoding by using less branches
    • Added signatures interception in lion to cancel the need for large output buffers
    • Improved lion decode speed with specific form data access and use of ctz in form read
    • Enabled decompression to exact-sized buffer for all algorithms
    • Various fixes and improvements
  • density-0.12.0-beta(Mar 24, 2015)

    0.12.0 beta

    March 24, 2015

    • Added new lion kernel
    • Renamed kernel mandala to cheetah
    • Kernel chameleon and cheetah improvements in encoding/decoding speeds
    • Generic function macros to avoid code rewrite
    • Improved memory teleport IO flexibility and speed, bytes issued by memory teleport can now be partially read
    • Various fixes and improvements
  • density-0.11.4-beta(Feb 10, 2015)

    0.11.4 beta

    February 10, 2015

    • Removed unnecessary makefile, now using a single makefile
    • Mandala kernel : reset prediction entries as required
    • Mandala kernel : convert hash to 16-bit unsigned int before storing
    • Updated SpookyHash to 1.0.2
  • density-0.11.3-beta(Feb 5, 2015)

    0.11.3 beta

    February 5, 2015

    • Added integrity check system
    • Corrected pointer usage and update on footer read/writes
    • Now freeing kernel state memory only when compression mode is not copy
    • Updated Makefiles
    • Improved memory teleport
    • Fixed sequencing problem after kernels request a new block
  • density-0.11.2-beta(Feb 3, 2015)

    0.11.2 beta

    February 3, 2015

    • Added an algorithms overview in README
    • Removed ssc references
    • Now initializing last hash to zero on mandala kernel inits
    • Reimplemented the buffer API
    • Various corrections and improvements
  • density-0.11.1-beta(Jan 19, 2015)

    0.11.1 beta

    January 19, 2015

    • Added a sharc benchmark in README
    • Stateless memory teleport
    • Improved event management and dispatching
    • Improved compression/decompression finishes
    • Improved streams API
    • Various bug fixes, robustness improvements
  • density-0.10.2-beta(Jan 7, 2015)

    0.10.2 beta

    January 7, 2015

    • Improved organization of compile-time switches and run-time options in the API
    • Removed method density_stream_decompress_utilities_get_header from the API, header info is now returned in the density_stream_decompress_init function
    • Corrected readme to reflect API changes
  • density-0.10.1-beta(Jan 5, 2015)

    0.10.1 beta

    January 5, 2015

    • Re-added mandala kernel
    • Corrected available bytes adjustment problem
    • Added missing restrict keywords
    • Cleaned unnecessary defines
  • density-0.10.0-beta(Jan 1, 2015)

  • density-0.9.12-beta(Dec 12, 2013)

    0.9.12 beta

    December 2, 2013

    • Mandala kernel addition, replacing dual pass chameleon
    • Simplified, faster hash function
    • Fixed memory freeing issue during main encoding/decoding finish
    • Implemented no footer encode output type
    • Namespace migration, kernel structure reorganization
    • Corrected copy mode problem
    • Implemented efficiency checks and mode reversions
    • Corrected lack of main header parameters retrieval
    • Fixed stream not being properly ended when mode reversion occurred
    • Updated metadata computations
  • libssc-0.9.11-beta(Dec 11, 2013)

    0.9.11 beta

    November 2, 2013

    • First beta release of DENSITY, including all the compression code from SHARC in a standalone, BSD licensed library
    • Added copy mode (useful for enhancing data security via the density block checksums for example)
    • Makefile produces static and dynamic libraries
Owner
Centaurean
data compression library for embedded/real-time systems

heatshrink A data compression/decompression library for embedded/real-time systems. Key Features: Low memory usage (as low as 50 bytes) It is useful f

Atomic Object 1.1k Jan 7, 2023
Small strings compression library

SMAZ - compression for very small strings ----------------------------------------- Smaz is a simple compression library suitable for compressing ver

Salvatore Sanfilippo 1k Dec 28, 2022
Compression abstraction library and utilities

Squash - Compression Abstraction Library

null 375 Dec 22, 2022
Multi-format archive and compression library

Welcome to libarchive! The libarchive project develops a portable, efficient C library that can read and write streaming archives in a variety of form

null 1.9k Dec 26, 2022
Brotli compression format

SECURITY NOTE Please consider updating brotli to version 1.0.9 (latest). Version 1.0.9 contains a fix to "integer overflow" problem. This happens when

Google 11.8k Jan 5, 2023
Heavily optimized zlib compression algorithm

Optimized version of longest_match for zlib Summary Fast zlib longest_match function. Produces slightly smaller compressed files for significantly fas

Konstantin Nosov 124 Dec 12, 2022
Fastest Integer Compression

TurboPFor: Fastest Integer Compression TurboPFor: The new synonym for "integer compression" ?? (2019.11) ALL functions now available for 64 bits ARMv8

powturbo 647 Dec 26, 2022
A simple C library for compressing lists of integers using binary packing

The SIMDComp library A simple C library for compressing lists of integers using binary packing and SIMD instructions. The assumption is either that yo

Daniel Lemire 409 Dec 22, 2022
A portable, simple zip library written in C

A portable (OSX/Linux/Windows), simple zip library written in C This is done by hacking awesome miniz library and layering functions on top of the min

Kuba Podgórski 1.1k Dec 29, 2022
is a c++20 compile and runtime Struct Reflections header only library.

is a c++20 compile and runtime Struct Reflections header only library. It allows you to iterate over aggregate type's member variables.

RedSkittleFox 4 Apr 18, 2022
Analysing and implementation of lossless data compression techniques like Huffman encoding and LZW was conducted along with JPEG lossy compression technique based on discrete cosine transform (DCT) for Image compression.

PROJECT FILE COMPRESSION ALGORITHMS - Huffman compression LZW compression DCT Aim of the project - Implement above mentioned compression algorithms an

null 1 Dec 14, 2021
Przemyslaw Skibinski 579 Jan 8, 2023
LZFSE compression library and command line tool

LZFSE This is a reference C implementation of the LZFSE compressor introduced in the Compression library with OS X 10.11 and iOS 9. LZFSE is a Lempel-

null 1.7k Jan 4, 2023
A massively spiffy yet delicately unobtrusive compression library.

ZLIB DATA COMPRESSION LIBRARY zlib 1.2.11 is a general purpose data compression library. All the code is thread safe. The data format used by the z

Mark Adler 4.1k Dec 30, 2022