Heavily optimized zlib compression algorithm

Overview

Optimized version of longest_match for zlib

Summary

Fast zlib longest_match function. Produces slightly smaller compressed files in significantly less time. For more details and updates please visit this page. A brief description of the algorithm is available on the project's home page.

The source code repository is located on GitHub. The wiki page contains some additional notes and test results.

Precompiled binaries for the Windows platform are available at this location.

Build instructions

This library comes in two versions: C and assembly.

Building C version

Replace the longest_match() function in deflate.c with the one located in Sources/match.h. One possible way is to create a file with contents like the following:

#define ASMV          /* make deflate.c call an external longest_match() */
#include "deflate.c"

#undef local          /* give the replacement external linkage */
#define local

#include "match.h"

void match_init()     /* called by deflate.c when ASMV is defined; nothing to do here */
{
}

and then link zlib with this file instead of deflate.c.
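As a rough sketch of the whole build (assuming a standard zlib source tree, gcc, and a wrapper file named fast_match.c — the file name and flags are illustrative, not part of the project):

```shell
# Hypothetical build sketch: compile zlib with the wrapper file
# (here called fast_match.c) in place of deflate.c.
# fast_match.c already #includes deflate.c, so deflate.c is omitted below.
gcc -c -O2 -I. -ISources adler32.c compress.c crc32.c gzclose.c gzlib.c \
    gzread.c gzwrite.c infback.c inffast.c inflate.c inftrees.c \
    trees.c uncompr.c zutil.c fast_match.c
ar rcs libz.a *.o
```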

Building a 32-bit assembly version

This version is compatible only with the x86 platform. To build the matcher, start by obtaining a copy of the Netwide Assembler (NASM).

For use with Microsoft Visual Studio, compile it with the following command line:

nasm match32.asm -f win32 -o match.obj

For Watcom C, use this:

nasm match32.asm -f obj -o match.obj -D OMF_FORMAT

For Linux, use this command line:

nasm match32.asm -f elf32 -o match.o
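To actually use the assembled matcher, zlib itself must be compiled with ASMV defined so that deflate.c calls the external longest_match() and match_init(). A hedged sketch for a Unix-like build (the -m32 and -O2 flags are illustrative assumptions, not project requirements):

```shell
# Hypothetical sketch: build 32-bit zlib linked against the assembly matcher.
# -DASMV makes deflate.c use the external longest_match()/match_init().
nasm match32.asm -f elf32 -o match.o
gcc -c -m32 -O2 -DASMV *.c
ar rcs libz.a *.o match.o
```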

Running tests

The library ships with a built-in test application. Some information about its features can be found on the wiki page.

On Unix platforms (those with bash and perl available out of the box) you can simply run the build.sh script located in the root directory of the project. This will build several versions of the test application and put them in obj/bin. There is also a script which runs the tests in automated mode, test.sh. Run test.sh --help to see the available options. By default it runs several predefined tests with data locations from my own PC. If you need to change the locations, just modify the last few lines in the script (see DoTests <directory>).

For the Windows platform, you'll need bash and perl to be installed and available via the PATH environment variable. This can be achieved by installing the Cygwin or MSYS projects on your computer. You may also get a set of the required binaries here (you'll need BuildTools.zip).

License

This library is licensed under the BSD license.

Comments
  • Unexpected Z_STREAM_ERROR

    Hi, I don't know where to dig next... I wrote a managed wrapper over zlib (I badly need speed in one place). With version 1.2.3 from the WinImage site it works. With your DLLs (both cdecl and WinAPI) it works, but "unstably".

    I wrote a loop of 100 iterations to measure the speed, and somewhere around step 12 everything breaks... That is, at first the full cycle deflateInit2_ -> deflate(Z_NO_FLUSH) -> deflate(Z_FINISH) -> deflateEnd() completes several times... but sometimes (on iteration 12, for example) it throws Z_STREAM_ERROR (-2), most often on deflate(Z_FINISH), though sometimes on deflateEnd or deflate(Z_NO_FLUSH) as well.

    Could you make a build that writes diagnostics to the console, so we can see what this beast doesn't like? Or I can send you my test program...

    opened by dimzon 11
  • Recommended compiling flags

    Hey, @gildor2 thanks for the awesome code. I'm using zlib (boost iostreams) in a small game server to compress (deflate) the game packets, and I want to know if these flags (CFLAGS="-march=native -O2" ./configure) are good enough to optimize zlib together with your C code.

    question 
    opened by yamaken93 7
  • Add support for other architectures

    Hi, I tried to use the test scripts on a Power processor, but I found out that some things were hardcoded (e.g. the use of the -msse2 flag). This PR updates the scripts to lift that limitation and enable using build.sh and test.sh for other archs besides x86.

    Tested on Linux on x86_64 and powerpc64le.

    enhancement 
    opened by mscastanho 5
  • [question] Description of the algorithm

    @gildor2

    Hi Konstantin,

    I have tested your fast-longest-match with the data set I need to compress, and fast-longest-match does outperform the original zlib in terms of speed. I want to understand how it works, so I read your brief description of the algorithm and checked the source code. I still have a few questions about your algorithm; could you help me understand it better?

    When algorithm finds a match (intermediate one), it computes hashes for all matching bytes, and selects a different hash chain - that one which points farther from current position. ...

    Q1: Do you mean that when finding a match, the algorithm computes hashes for every 3 bytes in the matching string? Then I think that when these hash values are inserted into the hash table head[] (or into prev[] if head[x] already exists), they will fall into different positions. Is my understanding correct?

    Q2: Regarding "select a different hash chain - that one which points farther from current position. If we have 2 hash chains, one points at distance 10, and another one at distance 1000, it is not worth checking distance 10 because other bytes will not match ..."

    • What do you mean when you refer to "hash chain"? Do you mean the elements in prev[], e.g. that checking prev[1000] is better than checking prev[100]?
    • When you say "it is not worth checking distance 10 because other bytes will not match", my understanding is like the following case:
    current position 1100, string: abcdabcd......
    position 1000, string: abcdabcd......
    position 100, string: abcdabcd.......
    

    Here we can check the string starting at position 100, because the possible match length is 1000+. If checking position 1000, the possible match length is 100+. Is this what you mean?

    Q3: To this piece of code:

    cur_match -= offset;
    offset = 0;
    next_pos = cur_match;
    for (i = 0; i <= len - MIN_MATCH; i++) {
        pos = prev[(cur_match + i) & wmask];
        if (pos < next_pos) {
            /* this hash chain is more distant, use it */
            next_pos = pos;
            offset = i;
        } 
    }
    /* Switch cur_match to next_pos chain */
    cur_match = next_pos;
    

    I think this piece of code implements the magic "jumping to the hash chain which points to farther distance". I don't quite understand what it tries to do, though. It looks to me like it does something like this:

    when a matching string "abcdefg" at position 1000 is found, try to find a longest match, such as "abcdefg123456"
    - if prev[1000] exists, record prev[1000] as p1 (a farther position with the same match, using hash value of "abc")
    - if prev[1001] ("bcd") exists, record prev[1001] as p2
    - if p2 < p1, record next_pos = p2
    - do similar thing for prev[1002] ("cde"), prev[1003] ("def"), prev[1004] ("efg"), find the *farthest* next_pos 
    

    Is this process the one you mentioned as "jumping to the hash chain which points to farther distance"? What's the fundamental difference between this and the zlib behavior, which keeps checking p = prev[1000], p1 = prev[p], p2 = prev[p1]?

    My questions might be too long, but I really want to understand this algorithm better, and look forward to getting some feedback from you.

    Thank you.

    Euccas

    question 
    opened by euccas 1
Releases(zlib-1.2.13)
Owner
Konstantin Nosov
Game engine programmer