A stable adaptive partitioning comparison sort.

Related tags

Debug gridsort
Overview

Intro

This document describes a partitioning stable adaptive comparison-based sort named gridsort.

Binary Cube

Gridsort sorts data by storing data in a simplified binary cube, a multidimentional sorted array. The binary cube offers excellent cache utilization. It's easiest to view a binary cube as a hash table, but instead of a hash function to find a bucket it uses a binary search on a lookup table.

Boundless Binary Search

The first step when sorting an element is a boundless binary search to pin point the bucket where the element should be stored. A boundless binary search is up to two times faster than the legacy binary search used by most applications. Once a bucket is found the element is added to the end of the bucket.

Gridsort switches to an adaptive binary search when it detects data that is already sorted.

Quadsort

Once a bucket overflows it is sorted using quadsort and a new bucket is created. The sorted data is split between the two buckets so each bucket is half full. The lowest element in each bucket is added to the lookup table.

Finish

Once all elements have been inserted into the binary cube every bucket receives a final sort and is copied back to the original array.

Data Types

The C implementation of gridsort supports long doubles and 8, 16, 32, and 64 bit data types. By using pointers it's possible to sort any other data type.

Interface

Gridsort uses the same interface as qsort, which is described in man qsort.

Big O

                 ┌───────────────────────┐┌───────────────────────┐
                 │comparisons            ││swap memory           
┌───────────────┐├───────┬───────┬───────┤├───────┬───────┬───────┤┌──────┐┌─────────┐┌─────────┐
│name           ││min    │avg    │max    ││min    │avg    │max    ││stable││partition││adaptive │
├───────────────┤├───────┼───────┼───────┤├───────┼───────┼───────┤├──────┤├─────────┤├─────────┤
│gridsort       ││n      │n log n│n log n││n      │n      │n      ││yes   ││yes      ││yes     
├───────────────┤├───────┼───────┼───────┤├───────┼───────┼───────┤├──────┤├─────────┤├─────────┤
│mergesort      ││n log n│n log n│n log n││n      │n      │n      ││yes   ││no       ││no      
├───────────────┤├───────┼───────┼───────┤├───────┼───────┼───────┤├──────┤├─────────┤├─────────┤
│quadsort       ││n      │n log n│n log n││1      │n      │n      ││yes   ││no       ││yes     
├───────────────┤├───────┼───────┼───────┤├───────┼───────┼───────┤├──────┤├─────────┤├─────────┤
│quicksort      ││n      │n log n│n²     ││1      │1      │1      ││no    ││yes      ││no      
├───────────────┤├───────┼───────┼───────┤├───────┼───────┼───────┤├──────┤├─────────┤├─────────┤
│introsort      ││n log n│n log n│n log n││1      │1      │1      ││no    ││yes      ││no      
└───────────────┘└───────┴───────┴───────┘└───────┴───────┴───────┘└──────┘└─────────┘└─────────┘

Gridsort makes n comparisons when the data is fully in order or in reverse order.

Porting

People wanting to port gridsort might want to have a look at twinsort, which is a simplified implementation of quadsort. Gridsort itself is a simplified implementation of cubesort. Another sort worth looking at is fluxsort which uses simpler top-down partitioning instead of gridsort's bottom-up partitioning.

Visualization

In the visualization below eight tests are performed. Random, Ascending, Ascending Saw, Generic, Descending, Descending Saw, Random Tail, and Wave order.

cubesort visualization

In the visualization below one test is performed on a random distribution. This visualization more accurately shows the use of pointer operations to partition memory.

Cyan numbers are unsorted, green numbers are sorted, white numbers are sorted and ready to be merged, yellow numbers are the index upon which a binary search is performed to find out where to insert the next number, magenta numbers are ready to be merged back to the main array.

gridsort visualization

Benchmarks

The following benchmark was on WSL gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04) using the wolfsort benchmark. The source code was compiled using g++ -O3 -w -fpermissive bench.c. The bar graph shows the best run out of 100 on 32 bit integers. Comparisons for gridsort and std::sort are inlined. The std::sort() in the benchmark should be an in-place IntroSort.

gridsort vs stdsort

data table
Name Items Type Best Average Loops Samples Distribution
std::sort 1000000 128 0.110579 0.110943 1 100 random order
gridsort 1000000 128 0.105021 0.105474 1 100 random order
Name Items Type Best Average Loops Samples Distribution
std::sort 1000000 64 0.065856 0.066048 1 100 random order
gridsort 1000000 64 0.054824 0.055161 1 100 random order
Name Items Type Best Average Loops Samples Distribution
std::sort 1000000 32 0.065065 0.065391 1 100 random order
gridsort 1000000 32 0.053922 0.054189 1 100 random order
std::sort 1000000 32 0.011443 0.011800 1 100 ascending order
gridsort 1000000 32 0.003463 0.003526 1 100 ascending order
std::sort 1000000 32 0.033698 0.033884 1 100 ascending saw
gridsort 1000000 32 0.013591 0.013691 1 100 ascending saw
std::sort 1000000 32 0.030675 0.030970 1 100 generic order
gridsort 1000000 32 0.015841 0.016119 1 100 generic order
std::sort 1000000 32 0.008789 0.009153 1 100 descending order
gridsort 1000000 32 0.003662 0.003750 1 100 descending order
std::sort 1000000 32 0.026253 0.026449 1 100 descending saw
gridsort 1000000 32 0.012359 0.012535 1 100 descending saw
std::sort 1000000 32 0.044365 0.044593 1 100 random tail
gridsort 1000000 32 0.015910 0.016008 1 100 random tail
std::sort 1000000 32 0.055819 0.056025 1 100 random half
gridsort 1000000 32 0.029525 0.029700 1 100 random half
std::sort 1000000 32 0.027889 0.028305 1 100 ascending tiles
gridsort 1000000 32 0.012352 0.012592 1 100 ascending tiles

The following benchmark was on WSL gcc version 7.4.0 (Ubuntu 7.4.0-1ubuntu1~18.04.1). The source code was compiled using gcc -O3 bench.c. The bar graph shows the best run out of 100 on 32 bit integers. Comparisons for gridsort and qsort are not inlined. The stdlib qsort() in the benchmark is a mergesort variant.

gridsort vs stdsort

data table
Name Items Type Best Average Compares Samples Distribution
qsort 100000 128 0.019332 0.020187 1536181 100 random order
gridsort 100000 128 0.013077 0.013145 1645784 100 random order
Name Items Type Best Average Compares Samples Distribution
qsort 100000 64 0.009379 0.009614 1536491 100 random order
gridsort 100000 64 0.007207 0.007287 1654963 100 random order
Name Items Type Best Average Compares Samples Distribution
qsort 100000 32 0.008563 0.008838 1536634 100 random order
gridsort 100000 32 0.006496 0.006611 1648950 100 random order
qsort 100000 32 0.002268 0.002402 815024 100 ascending order
gridsort 100000 32 0.000685 0.000695 202485 100 ascending order
qsort 100000 32 0.003044 0.003232 915019 100 ascending saw
gridsort 100000 32 0.002210 0.002243 639757 100 ascending saw
qsort 100000 32 0.006369 0.006623 1532339 100 generic order
gridsort 100000 32 0.002942 0.003046 1151338 100 generic order
qsort 100000 32 0.002314 0.002538 853904 100 descending order
gridsort 100000 32 0.000661 0.000677 200036 100 descending order
qsort 100000 32 0.002754 0.002995 1063907 100 descending saw
gridsort 100000 32 0.001940 0.002044 841084 100 descending saw
qsort 100000 32 0.003875 0.004092 1012028 100 random tail
gridsort 100000 32 0.002176 0.002206 627704 100 random tail
qsort 100000 32 0.005599 0.005844 1200835 100 random half
gridsort 100000 32 0.003740 0.003799 1001582 100 random half
qsort 100000 32 0.003878 0.004274 1209200 100 ascending tiles
gridsort 100000 32 0.003160 0.003267 867858 100 ascending tiles
You might also like...
Adaptive Runtime AUTOSAR Linux Simulator

Adaptive-AUTOSAR Adaptive AUTOSAR is a simulated Adaptive Platform enviroment over Linux defined by AUTOSAR. The goal of this project is to implement

A composable container for Adaptive ROS 2 Node computations. Select between FPGA, CPU or GPU at run-time.

adaptive_component A composable stateless container for Adaptive ROS 2 Node computations. Select between FPGA, CPU or GPU at run-time. Nodes using har

BAAF-Net - Semantic Segmentation for Real Point Cloud Scenes via Bilateral Augmentation and Adaptive Fusion (CVPR 2021)
BAAF-Net - Semantic Segmentation for Real Point Cloud Scenes via Bilateral Augmentation and Adaptive Fusion (CVPR 2021)

Semantic Segmentation for Real Point Cloud Scenes via Bilateral Augmentation and Adaptive Fusion (CVPR 2021) This repository is for BAAF-Net introduce

A software serial driver package by using the hardware timer capture / comparison functionality.

Soft serial 1.简介 Soft serial 是利用硬件定时器捕获/比较功能实现软件模拟串口的软件包。 1.1目录结构 Soft serial 软件包目录结构如下所示: soft_serial ├───inc // 头文件目录 │

C implementation of C++ Utility functions Integer Comparison Macros

C implementation of C++ Utility functions Integer Comparison Macros

Harbour DB speed tests comparison

hbDBSpeedTests Harbour DB speed tests comparison - Registers Count: 821051 MySql configuration( 1 or 2 ) /data/mysql/dbstru.zip - Import structure of

Web to Desktop framework comparison

Web to Desktop framework comparison This repository was made to create an objective comparison of multiple framework that grant us to "transform" our

Fft-benchmark - A benchmark for comparison of FFT algorithms performance

FFT benchmark A benchmark for comparison of FFT algorithms performance. Currently supports Intel IPP, KFR, FFTW and KissFFT. Requires: Clang 6.0+ (GCC

High dynamic range (HDR) image comparison tool for graphics people. With an emphasis on OpenEXR images.
High dynamic range (HDR) image comparison tool for graphics people. With an emphasis on OpenEXR images.

tev — The EXR Viewer A high dynamic range (HDR) image comparison tool for graphics people. tev allows viewing images through various tonemapping opera

This is example for coding with Kotlin/Native, linking C libraries and comparison java.net.* with libcurl

KotlinNativeExample This is example for coding with Kotlin/Native, linking C libraries and comparison java.net.* with libcurl Running Running mingw64

Bolt is a C++ template library optimized for GPUs. Bolt provides high-performance library implementations for common algorithms such as scan, reduce, transform, and sort.

Bolt is a C++ template library optimized for heterogeneous computing. Bolt is designed to provide high-performance library implementations for common

Pipy is a tiny, high performance, highly stable, programmable proxy.

Pipy Pipy is a tiny, high performance, highly stable, programmable proxy. Written in C++, built on top of Asio asynchronous I/O library, Pipy is extre

Cetus is a high performance, stable, protocol aware proxy for MySQL Group Replication.

Introduction Cetus is a high performance, stable, protocol aware proxy for MySQL Group Replication. Getting started 1. Prerequisites cmake gcc glib2-d

Pipy is a tiny, high performance, highly stable, programmable proxy written in C++

Pipy is a tiny, high performance, highly stable, programmable proxy. Written in C++, built on top of Asio asynchronous I/O library, Pipy is extremely lightweight and fast, making it one of the best choices for service mesh sidecars.

SortNode is a JS binding for SORT: Simple, online, and real-time tracking of multiple objects in a video sequence.

SortNode is a JS binding for SORT: Simple, online, and real-time tracking of multiple objects in a video sequence.

alie, simplified Discord bot, that's it. As fast and stable as possible.

alie alie, simplified Discord bot, that's it. As fast and stable as possible. Requirements Linux-compatible OS (aka Linux distribution) A C compiler w

Kernel source for j7y17lte - the goal is to make it as closest to linux-stable sources as possible without breaking OneUI compatibility.

Linux kernel release 3.x http://kernel.org/ These are the release notes for Linux version 3. Read them carefully, as they tell you what this is al

Hand tuned sort routines for different C types

c 2008,2009,2021 Andrew I. Schein See LICENSE file. Update for 2021 This repo disappeared from the internet when Bitbucket turned off its mercurial se

This is a Program, to sort Arrays with the QuickSort Algorithm.

QuickSort This is a program, to sort arrays with the QuickSort Algorithm. The Algorithm is optimized to be quick, but it isn't the fastest. I have wri

Owner
null
A stable adaptive branchless partitioning comparison sort.

Intro This document describes a partitioning stable adaptive comparison-based sort named fluxsort. Benchmarks and a visualization are available at the

null 198 Dec 4, 2022
Blitsort is an in-place stable adaptive rotate merge sort

Origin Blitsort is a rotate merge sort based on quadsort. Visualization In the visualization below nine tests are performed on 256 elements. Random or

null 667 Jan 1, 2023
Parallel algorithms (quick-sort, merge-sort , enumeration-sort) implemented by p-threads and CUDA

程序运行方式 一、编译程序,进入sort-project(cuda-sort-project),输入命令行 make 程序即可自动编译为可以执行文件sort(cudaSort)。 二、运行可执行程序,输入命令行 ./sort 或 ./cudaSort 三、删除程序 make clean 四、指定线程

Fu-Yun Wang 3 May 30, 2022
Quick sort is a highly efficient sorting algorithm and is based on partitioning of array of data into smaller arrays.

Quick sort is a highly efficient sorting algorithm and is based on partitioning of array of data into smaller arrays. A large array is partitioned into two arrays one of which holds values smaller than the specified value, say pivot, based on which the partition is made and another array holds values greater than the pivot value.

Lakshan Sandanayaka 4 Jan 2, 2023
Fast comparison-based sort algorithm

nanosort Algorithm nanosort aims to be a fast comparison-based sorting algorithm, tuned for POD types of reasonably small sizes. nanosort implements a

Arseny Kapoulkine 40 Dec 29, 2022
Bitset Sort, a faster std::sort replacement.

Bitset Sort Bitset Sort is a variant of quick sort, specifically BlockQuickSort. Bitset Sort uses a carefully written partition function to let the co

null 66 Dec 1, 2022
Space partitioning structures visualization

vptree-draw SVG export of several 2D space partitioning structures. Code structure geo.h minimalist 2D geometry structures (Point, Vector, Box, Sphere

Basile Fraboni 35 Dec 7, 2022
Shared-Memory Parallel Graph Partitioning for Large K

KaMinPar The graph partitioning software KaMinPar -- Karlsruhe Minimal Graph Partitioning. KaMinPar is a shared-memory parallel tool to heuristically

Karlsruhe High Quality Graph Partitioning 17 Nov 10, 2022
Simple command line tool that processes image files using the FidelityFX Super Resolution (FSR) or Contrast Adaptive Sharpening (CAS) shader systems.

Simple command line tool that processes image files using the FidelityFX Super Resolution (FSR) or Contrast Adaptive Sharpening (CAS) shader systems.

GPUOpen Effects 190 Dec 12, 2022
around — adaptive rounding operation

around — adaptive rounding operation Attempts to perform nice rounding of a floating point number, like a human would do. Usage: around.h #include "ar

Jan Ringoš 9 Oct 21, 2022