KVDK (Key-Value Development Kit) is a key-value store library implemented in C++

KVDK

KVDK (Key-Value Development Kit) is a key-value store library implemented in C++. It is designed for persistent memory and provides unified APIs for both volatile and persistent scenarios. It also demonstrates several optimization methods for achieving high performance on persistent memory. Besides the basic key-value store APIs, it offers several advanced features, such as transactions and snapshots.

Features

  • The basic get/set/update/delete operations on unsorted keys.
  • The basic get/set/update/delete/iterate operations on sorted keys.
  • Multiple changes on unsorted keys can be made in one atomic batch.
  • Users can create multiple collections of sorted keys.
  • Support for read-committed transactions. (TBD)
  • Support for snapshots to get a consistent view of data. (TBD)

Limitations

  • The maximum supported key size is 64KB and the maximum supported value size is 64MB.
  • The maximum number of write threads can't be changed dynamically after start-up.
  • Key-value compression is not supported.
  • Persistent memory space can't be expanded on the fly.

Getting the Source

git clone --recurse-submodules https://github.com/pmem/kvdk.git

Building

mkdir -p build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release && make -j

Before each commit, please check the coding style with the following commands:

cmake .. -DCHECK_CPP_STYLE=ON && make -j

Benchmarks

Here are examples of how to benchmark the performance of KVDK on your own systems.

Documentation

User Guide

Please refer to the User Guide for an introduction to the KVDK APIs.

Architecture

Comments
  • Unexpected InvalidDataSize

    Bug Report

    KVDK version

    14f8778dea099225cd2bb1d9c8b406c7b53b8133

    System configuration

    Reproduce steps

        // `engine`, `status`, and `kvpmem_data_file_path` are assumed to be
        // declared elsewhere in the reporter's code.
        kvdk::Configs engine_configs;
        {
          engine_configs.pmem_file_size = 40 * 1024UL * 1024UL * 1024UL;
          engine_configs.pmem_segment_blocks = (1ull << 8);
          engine_configs.hash_bucket_num = (1ull << 10);
          engine_configs.log_level = kvdk::LogLevel::Debug;
        }

        std::string engine_path{kvpmem_data_file_path};

        // Purge old KVDK instance
        system(std::string{"rm -rf " + engine_path + "\n"}.c_str());

        status = kvdk::Engine::Open(engine_path, &engine, engine_configs, stdout);
        assert(status == kvdk::Status::Ok);
        DBUG_PRINT("KVDK",
                   ("Successfully created KVDK engine %s", kvpmem_data_file_path));

        // This is the call that fails with InvalidDataSize.
        status = engine->SSet("hi", "yo", "peeep");
        ASSERT(status == kvdk::Status::Ok, (int)status);

    Expected behavior

    The call should put the value in the ordered set.

    Current behavior

    The assert fails with InvalidDataSize. If I print the sizes of the string views inside the engine, I get the following output: [ERROR] time 40009 ms: SSET size collection: 1147430597 userkey: 1147430594 value: 1147430588.

    I think there is an issue with the string view, but I don't understand why, since the usage is very similar to the user guide. I also tried passing a static string. Any ideas?

    opened by tthebst 13
  • Why not use libpmemobj ? [NEW]

    Bug Report

    Why not use libpmemobj? I do not think KVDK can support atomic operations using only the APIs in libpmem. batchWrite can write multiple KV pairs one by one, but cannot roll back if it fails. DLinked_list cannot update the prev/next pointers in one transaction. Yet the project states "Provide APIs to write multiple key-value pairs in an atomic batch." Is there something wrong?

    opened by jinhao2 6
  • Why is the reading and writing speed of kvdk not as fast as leveldb on ssd?

    Why is the reading and writing speed of KVDK not as fast as LevelDB? I find this very strange. Is the problem in how I use it, or in KVDK itself? If the problem is in KVDK itself, can you recommend a good KV database, ideally one that uses a B+ tree as its engine? Thank you.

    opened by yunxiao3 6
  • Reorg PMEM FreeList: global free entry pool + thread local entry cache

    In this patch:

    1. Move the PMem allocator code to the directory engine/pmem_allocator.
    2. Reorganize the free list:
    • Use a global free space entry pool to store free space.
    • Each write thread has a cache structure consisting of an active free space list and a backup free space list. Each thread stores newly freed space in, and takes free space from, its own active list. To balance free space entries among threads, once a thread has cached too many entries, newly freed entries are stored in the backup list, and the whole backup list is then moved to the entry pool, which is shared by all threads.
    • If the active list is empty, swap in the backup list as the active list, or fetch a free list from the entry pool.
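The active/backup scheme described in this patch can be sketched roughly as follows. This is a minimal illustration, not KVDK's actual code: the names SpaceEntry, EntryPool, and ThreadCache, the max_cached threshold, and the last-in-first-out allocation that ignores entry sizes are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <utility>
#include <vector>

// A free region of PMem (hypothetical layout: offset and size in bytes).
struct SpaceEntry {
  std::uint64_t offset;
  std::uint64_t size;
};

using EntryList = std::vector<SpaceEntry>;

// Global pool shared by all write threads. It stores whole lists so that
// a thread pays for the lock once per list rather than once per entry.
class EntryPool {
 public:
  void PushList(EntryList&& list) {
    if (list.empty()) return;
    std::lock_guard<std::mutex> guard(mu_);
    lists_.push_back(std::move(list));
  }

  // Moves one list into *out; returns false if the pool is empty.
  bool FetchList(EntryList* out) {
    std::lock_guard<std::mutex> guard(mu_);
    if (lists_.empty()) return false;
    *out = std::move(lists_.back());
    lists_.pop_back();
    return true;
  }

  std::size_t NumLists() {
    std::lock_guard<std::mutex> guard(mu_);
    return lists_.size();
  }

 private:
  std::mutex mu_;
  std::vector<EntryList> lists_;
};

// Per-thread cache. Each instance is used by one write thread only,
// so the active and backup lists need no locking.
class ThreadCache {
 public:
  ThreadCache(EntryPool* pool, std::size_t max_cached)
      : pool_(pool), max_cached_(max_cached) {}

  // Store a newly freed entry. Once the active list is full, entries go
  // to the backup list, which is handed to the pool when it fills up.
  void Free(SpaceEntry entry) {
    if (active_.size() < max_cached_) {
      active_.push_back(entry);
      return;
    }
    backup_.push_back(entry);
    if (backup_.size() >= max_cached_) {
      pool_->PushList(std::move(backup_));
      backup_.clear();  // make the moved-from list explicitly empty
    }
  }

  // Take a free entry: active list first, then the backup list,
  // then a list fetched from the global pool.
  bool Allocate(SpaceEntry* out) {
    if (active_.empty()) {
      if (!backup_.empty()) {
        active_.swap(backup_);
      } else if (!pool_->FetchList(&active_)) {
        return false;
      }
    }
    *out = active_.back();
    active_.pop_back();
    return true;
  }

 private:
  EntryPool* pool_;
  std::size_t max_cached_;
  EntryList active_;
  EntryList backup_;
};
```

With a threshold of 2, a thread that frees four entries keeps two in its active list and hands the other two to the pool as one list, where a second thread with an empty cache can pick them up.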
    opened by JiayuZzz 5
  • Save space merge cpu overhead in spare time

    What is changed and how it works?

    Merge free space in the background only if enough space entries have been freed since the last merge.
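One way to picture this change: count entries as they are freed, and let the background thread coalesce adjacent free ranges only once the count passes a threshold. The sketch below uses hypothetical names (SpaceEntry, MergeScheduler, MergeAdjacent) and a hypothetical threshold parameter; it is not taken from the KVDK source.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstdint>
#include <vector>

// A free region (hypothetical layout: offset and size in bytes).
struct SpaceEntry {
  std::uint64_t offset;
  std::uint64_t size;
};

// Coalesce entries that touch each other in the address space.
std::vector<SpaceEntry> MergeAdjacent(std::vector<SpaceEntry> entries) {
  std::sort(entries.begin(), entries.end(),
            [](const SpaceEntry& a, const SpaceEntry& b) {
              return a.offset < b.offset;
            });
  std::vector<SpaceEntry> merged;
  for (const SpaceEntry& e : entries) {
    if (!merged.empty() &&
        merged.back().offset + merged.back().size == e.offset) {
      merged.back().size += e.size;  // extend the previous range
    } else {
      merged.push_back(e);
    }
  }
  return merged;
}

// Decides whether the background thread should spend CPU on a merge.
class MergeScheduler {
 public:
  explicit MergeScheduler(std::uint64_t min_freed) : min_freed_(min_freed) {}

  // Called by write threads whenever they free an entry.
  void OnEntryFreed() {
    freed_since_merge_.fetch_add(1, std::memory_order_relaxed);
  }

  // Called periodically by the background thread. Returns true, and
  // consumes the observed count, only if enough entries were freed.
  bool ShouldMerge() {
    std::uint64_t n = freed_since_merge_.load(std::memory_order_relaxed);
    if (n < min_freed_) return false;
    // Subtract instead of storing 0 so that concurrent increments
    // arriving after the load are not lost.
    freed_since_merge_.fetch_sub(n, std::memory_order_relaxed);
    return true;
  }

 private:
  const std::uint64_t min_freed_;
  std::atomic<std::uint64_t> freed_since_merge_{0};
};
```

The background loop would call ShouldMerge() on each wake-up and skip the merge pass when it returns false, which is where the CPU saving comes from.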

    Check List

    Tests

    • Unit test
    • Integration test
    new-feature 
    opened by JiayuZzz 3
  • Cannot compile kvrocks

    Bug Report

    I cannot compile kvrocks correctly by following the steps in example/kvrocks/README. I noticed that the README was last updated 9 months ago, whereas the patch was updated three days ago.

    Current behavior

    When I run 'make', I get this error:

    incubator-kvrocks/external/kvdk/include/types.hpp:7:10: fatal error: libpmemobj++/string_view.hpp: No such file or directory
    

    This error can be fixed by copying the libpmemobj++/ directory from kvdk to kvrocks.

    But then there is another error:

    In file included from redis_bitmap_string.cc:5:
    redis_string.h:9:10: fatal error: namespace.hpp: No such file or directory
    #include "namespace.hpp"
    

    I see that kvdk is still under development and that there is no namespace.hpp file in the latest version. I would be grateful if you could fix the issue and update the README.

    opened by Chris-NaN 3
  • Allow empty string as key

    An empty string works normally as a key in KVDK, so there is no need to check for zero-sized keys. Added a file debug.cpp under the tests directory, which is compiled to dbdebug. This file is almost a copy of tutorial.cpp but is meant for debugging, because it is very simple and easy to change.

    opened by ZiyanShi 3
  • bench tool:  Set error happened when using sorted-type and latency enable

    "Set error" occurs when using the sorted type with latency enabled. The affected operation types include Fill, Update, and Insert. For example:

        Insert new sorted-type kv
        Write latency overflow: 12191390 us
        Set error
        [the "Set error" line repeats many more times]
        sh: line 1: 71030 Segmentation fault (core dumped) numactl --cpunodebind=0 --membind=0 ./bench -latency=1 -populate=1 -value_size=128 -threads=96 -time=10 -path=/mnt/pmem0/kvdk -num=789516047 -space=536870912000 -max_write_threads=96 -fill=0 -type=sorted -read_ratio =0 -existing_keys_ratio=0 > ./results/sorted_vs128_insert_thread96

    opened by Sean58238 3
  • Add submodule jemalloc and numactl

    Signed-off-by: Yu, Peng

    What problem does this PR solve?

    Problem Summary: Add submodule jemalloc and numactl for volatile memory management

    What is changed and how it works?

    What's Changed:

    • Add the jemalloc and numactl submodules under the extern directory
    • Update the cmake configuration to build the above submodules
    • Small improvement to the Java module

    Check List

    Tests

    • Unit test

    Side effects

    • None
    opened by iyupeng 2
  • engine/utils: refactor the function compare_string_view()

    What problem does this PR solve?

    Problem Summary: The compare_string_view() function is inefficient because it compares characters one by one.

    What is changed and how it works?

    When the key is longer than 8 bytes, compare 8 bytes at a time as uint64_t values instead of byte by byte.

    What's Changed:

        static inline int compare_string_view(const StringView& src,
                                              const StringView& target) {
          auto size = std::min(src.size(), target.size());
          auto batchNum = size & ~(sizeof(uint64_t) - 1);
          uint32_t i = 0;
          for (i = 0; i < batchNum;) {
            // if size is more than 8, compared by type of uint64_t.
            auto nEq = *(uint64_t*)(&src[i]) - *(uint64_t*)(&target[i]);
            if (nEq) return nEq;
            i += sizeof(uint64_t);
          }
          for (; i < size; i++) {
            if (src[i] != target[i]) {
              return src[i] - target[i];
            }
          }
          return src.size() - target.size();
        }
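A word-at-a-time comparison needs some care. Subtracting native uint64_t values does not give lexicographic order on little-endian machines, and a 64-bit difference is truncated when returned as int. The sketch below is one hedged alternative, not the code merged into KVDK: it uses 8-byte words only to skip over equal prefixes and then locates the first differing byte. std::string stands in for KVDK's StringView, the name CompareStringView is hypothetical, and bytes are compared as unsigned values (as memcmp does), which differs from the plain char subtraction in the snippet above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Returns a negative value, zero, or a positive value if src sorts
// before, equal to, or after target in byte-wise lexicographic order.
inline int CompareStringView(const std::string& src,
                             const std::string& target) {
  const std::size_t size = std::min(src.size(), target.size());
  // Largest multiple of 8 that fits in the common prefix length.
  const std::size_t batch_end = size & ~(sizeof(std::uint64_t) - 1);

  std::size_t i = 0;
  for (; i < batch_end; i += sizeof(std::uint64_t)) {
    std::uint64_t a;
    std::uint64_t b;
    // memcpy avoids unaligned and aliasing-unsafe pointer casts.
    std::memcpy(&a, src.data() + i, sizeof(a));
    std::memcpy(&b, target.data() + i, sizeof(b));
    if (a != b) break;  // the mismatch is inside this word; find it below
  }

  // Byte loop: handles the tail and the word that contained a mismatch.
  for (; i < size; ++i) {
    const unsigned char a = static_cast<unsigned char>(src[i]);
    const unsigned char b = static_cast<unsigned char>(target[i]);
    if (a != b) return a < b ? -1 : 1;
  }

  // Common prefix is equal: the shorter string sorts first.
  if (src.size() == target.size()) return 0;
  return src.size() < target.size() ? -1 : 1;
}
```

Because the words are only tested for equality, the result does not depend on the machine's byte order.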

    Tests

    • No code

    Side effects none

    opened by jinhao2 2
  • Add compile option WITH_PMEM and runtime config enable_pmem

    What problem does this PR solve?

    Problem Summary: Introducing optional volatile KV storage on DRAM

    What is changed and how it works?

    What's Changed:

    • Add the compile option WITH_PMEM, which is ON by default.
    • Add enable_pmem to KVDK::Configs, which is true by default.
    • Run cmake with -DWITH_PMEM=OFF to remove the dependency on libpmem.
    • Even when built with -DWITH_PMEM=ON, setting Configs.enable_pmem = false makes KVDK store data in DRAM.

    Check List

    Tests

    • Unit test
    • Integration test

    Side effects

    • None
    opened by iyupeng 2
  • Adapt benchmarks for volatile KV storage

    Signed-off-by: Yu, Peng

    What problem does this PR solve?

    Problem Summary: Adapt benchmarks for volatile KV storage

    What is changed and how it works?

    What's Changed: volatile/benchmark/bench.cpp

    Check List

    Tests

    • Unit test
    • Integration test

    Side effects

    • None
    opened by iyupeng 0
  • Vhash

    What is changed and how it works?

    Implement a new hashmap with a rehash function. Reduce memory usage. Add the VHash (Volatile Hash) data type.

    TODO list

    Add automatic rehashing logic.

    Check List

    Tests

    • Unit test
    opened by ZiyanShi 0
  • [NEW] How can KVDK use multiple PM disks?

    The problem/use-case that the feature addresses

    If the space on one PM disk is not enough, how can multiple PM disks be mapped into one KVDK instance?

    Description of the feature

    KVDK can make use of multiple PM disks.

    opened by jinhao2 1
Releases (v1.0)
  • v1.0 (Sep 30, 2022)

    KVDK version 1.0.0 is a high-performance persistent key-value storage solution based on Intel Persistent Memory.

    Features

    KVDK supports multiple data types along with several advanced features.

    • Multiple data types: raw string KV, sorted KV collection, hash KV collection, and list.
    • Basic KV operations, such as Get/Put/Update/Delete on key-value pairs, and snapshot-based scan.
    • Data expiration: set a TTL (time-to-live) for a string KV or a collection in KVDK.
    • Atomic Read-Modify-Write of key-value pairs.
    • Atomic batch write across data types.
    • Read-committed transactions across data types.
    • Consistent dump and restore of data to/from storage.
    • Consistent checkpoint: make a checkpoint at run time and restore data to the checkpoint at the next startup.
    • Multiple language APIs: C, C++, and Java.

    Public API

    Please refer to doc/user_doc.md and examples/tutorial for public API and examples.

    Performance

    Please refer to doc/benchmark.md for benchmarking.

Owner
Persistent Memory Programming
Libraries and Examples for Persistent Memory Programming
Kvrocks is a key-value NoSQL database based on RocksDB and compatible with Redis protocol.

Bit Leak 1.9k Jan 8, 2023
Kvrocks is a distributed key value NoSQL database based on RocksDB and compatible with Redis protocol.

Kvrocks Labs 1.9k Jan 9, 2023
🥑 ArangoDB is a native multi-model database with flexible data models for documents, graphs, and key-values. Build high performance applications using a convenient SQL-like query language or JavaScript extensions.

ArangoDB 12.8k Jan 9, 2023
Kreon is a key-value store library optimized for flash-based storage

Kreon is a key-value store library optimized for flash-based storage, where CPU overhead and I/O amplification are more significant bottlenecks compared to I/O randomness.

Computer Architecture and VLSI Systems (CARV) Laboratory 24 Jul 14, 2022
RocksDB: A Persistent Key-Value Store for Flash and RAM Storage

RocksDB is developed and maintained by Facebook Database Engineering Team. It is built on earlier work on LevelDB

Facebook 24.3k Jan 5, 2023
BerylDB is a data structure data manager that can be used to store data as key-value entries.

BerylDB is a data structure data manager that can be used to store data as key-value entries. The server allows channel subscription and is optimized to be used as a cache repository. Supported structures include lists, sets, and keys.

BerylDB 203 Dec 16, 2022
FoundationDB - the open source, distributed, transactional key-value store

FoundationDB is a distributed database designed to handle large volumes of structured data across clusters of commodity servers. It organizes data as

Apple 12k Dec 31, 2022
A high performance, shared memory, lock free, cross platform, single file, no dependencies, C++11 key-value store

SimDB A high performance, shared memory, lock free, cross platform, single file, no dependencies, C++11 key-value store. SimDB is part of LAVA (Live A

null 454 Dec 29, 2022
Kit: a magical, high performance programming language, designed for game development

Kit Programming Language 988 Dec 10, 2022
cdk is a minimal cross-platform c language development kit.

Overview cdk is a minimal cross-platform c language development kit. Requirement Based on c11 standard. Compile create a build directory under the cdk

Red 22 Dec 15, 2022
Data Plane Development Kit

DPDK is a set of libraries and drivers for fast packet processing. It supports many processor architectures and both FreeBSD and Linux. The DPDK uses

DPDK 2.2k Dec 29, 2022
Internal Software Development Kit for Battlefield 2042

battlefield-2042-internal-sdk Internal Software Development Kit for Battlefield 2042 SDK Includes the following: Entity Classes Player Classes Vehicle

Skengdo 11 Nov 29, 2022
🎮 Cross platform development kit for Z80 and SM83 based consoles.

cdk 🎮 Cross platform development kit for Z80 and SM83 based consoles. Platform We planned to support the following consoles: Nintendo Game Boy Ninten

Micro Console 4 Jan 10, 2022
bl_mcu_sdk is MCU software development kit provided by Bouffalo Lab Team for BL602/BL604, BL702/BL704/BL706 and other series of RISC-V based chips in the future.

bl mcu sdk is an MCU software development kit provided by the Bouffalo Lab Team for BL602/BL604, BL702/BL704/BL706 and other series of chips in the future

Bouffalo Lab 165 Dec 23, 2022
John Walker 24 Dec 15, 2022
Simple constant key/value storage library, for read-heavy systems with infrequent large bulk inserts.

Sparkey is a simple constant key/value storage library. It is mostly suited for read heavy systems with infrequent large bulk inserts. It includes bot

Spotify 989 Dec 14, 2022
LevelDB is a fast key-value storage library written at Google that provides an ordered mapping from string keys to string values.

LevelDB is a fast key-value storage library written at Google that provides an ordered mapping from string keys to string values. Authors: Sanjay Ghem

Google 31.6k Jan 7, 2023
Modern transactional key-value/row storage library.

Sophia is advanced transactional MVCC key-value/row storage library. How does it differ from other storages? Sophia is RAM-Disk hybrid storage. It is

Dmitry Simonenko 1.8k Dec 15, 2022
This project implemented the Mean Value Coordinates in 3D algorithm in c++

Mean Value Coordinates in 3D [c++] | Paper link on Sciencedirect | Pdf version link | This project implemented the Mean Value Coordinates in 3D algori

null 3 Nov 22, 2022