KVDK (Key-Value Development Kit) is a key-value store library implemented in C++ language

Related tags

kvdk
Overview

KVDK

KVDK (Key-Value Development Kit) is a key-value store library implemented in C++ language. It is designed for persistent memory and provides unified APIs for both volatile and persistent scenarios. It also demonstrates several optimization methods for high performance with persistent memory. Besides providing the basic APIs of key-value store, it offers several advanced features, like transaction, snapshot as well.

Features

  • The basic get/set/update/delete opertions on unsorted keys.
  • The basic get/set/update/delete/iterate operations on sorted keys.
  • Multiple changes on unsorted keys can be made in one atomic batch.
  • User can create multiple collections of sorted keys.
  • Support read-committed transaction. (TBD)
  • Support snapshot to get a consistent view of data. (TBD)

Limitations

  • Maximum supported key-value size are 64KB-64MB.
  • The maximum write thread number can't be dynamicly changed after start-up.
  • No support of key-value compression.
  • Persistent memory space can't be expanded on the fly.

Getting the Source

git clone --recurse-submodules https://github.com/pmem/kvdk.git

Building

mkdir -p build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release && make -j

Before each commit, please check its coding style with below instructions

cmake .. -DCHECK_CPP_STYLE=ON && make -j

Benchmarks

Here are the examples on how to benchmark the performance of KVDK on your own systems.

Documentations

User Guide

Please reference to User guide for API introductions of KVDK.

Architecture

Issues
  • Reorg PMEM FreeList: global free entry pool + thread local entry cache

    Reorg PMEM FreeList: global free entry pool + thread local entry cache

    In this patch:

    1. Move pmem allocator related codes to directory engine/pmem_allocator
    2. Reorg free list:
    • Use a global free space entry pool to store free space
    • Each write thread has a cache structure, which consists of a active free space list and a backup free space list. Each threads store just freed space to and use free space from own active list. To balance free space entries among threads, if too many entries cached by a thread, newly freed entries will be stored in backup list, then move the whole backup list to entry pool which shared by all threads
    • If active list is empty, then swap backup list as active list, or fetch a free list from entry pool
    opened by JiayuZzz 5
  • what is the difference between kvdk and pmemkv?

    what is the difference between kvdk and pmemkv?

    what is the difference between kvdk and pmemkv?

    opened by GOGOYAO 5
  • bench tool:  Set error happened when using sorted-type and latency enable

    bench tool: Set error happened when using sorted-type and latency enable

    "Set error" happened when using sorted-type and latency enable, option type inclues "Fill, update, Insert",for example:

    Insert new sorted-type kv Write latency overflow: 12191390 us Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error Set error sh: line 1: 71030 Segmentation fault (core dumped) numactl --cpunodebind=0 --membind=0 ./bench -latency=1 -populate=1 -value_size=128 -threads=96 -time=10 -path=/mnt/pmem0/kvdk -num=789516047 -space=536870912000 -max_write_threads=96 -fill=0 -type=sorted -read_ratio =0 -existing_keys_ratio=0 > ./results/sorted_vs128_insert_thread96

    opened by Sean58238 3
  • Add tutorial and user doc.

    Add tutorial and user doc.

    null

    opened by ZiyanShi 3
  • Allow empty string as key

    Allow empty string as key

    The empty string as key acts normally in KVDK. No need to check for zero-sized keys. Added a file debug.cpp under tests directory and is compiled to dbdebug. This file is almost a copy of tutorial.cpp but for debugging purpose, because it's very simple and easy to change.

    opened by ZiyanShi 3
  • segment fault happen when benchmark thread set over 1 thread

    segment fault happen when benchmark thread set over 1 thread

    [[email protected] build]# ./bench -fill=1 -value_size=4096 -threads=2 -path=/mnt/pmem0/kvdk -space=2748779069 -num=4000000 -max_write_threads=1 -type=string -populate=1 to fill 4000000 uniform keys [LOG] time 0 ms: Initializing PMEM size 2748779069 in file /mnt/pmem0/kvdk/data [LOG] time 380 ms: Map pmem space done [LOG] time 1163 ms: In restoring: iterated 0 records [LOG] time 1163 ms: Populating PMEM space ... [LOG] time 2697 ms: Populating done init 2 write threads init 0 read threads ------- ops in seconds ----------- time (ms), read ops, not found, write ops, total read, total write Segmentation fault (core dumped)

    compile the debug version and use the gdb to see the offset of the thread cache entry is overflow: Thread 20 "bench" received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x7ffafd7f8700 (LWP 4346)] 0x00007ffff7e613a5 in memmove_mov_sse2_clwb () from /lib64/libpmem.so.1 Missing separate debuginfos, use: dnf debuginfo-install gflags-2.2.2-5.fc32.x86_64 libgcc-10.3.1-1.fc32.x86_64 libpmem-1.8-2.fc32.x86_64 libstdc++-10.3.1-1.fc32.x86_64 (gdb) bt #0 0x00007ffff7e613a5 in memmove_mov_sse2_clwb () from /lib64/libpmem.so.1 #1 0x00007ffff7e49da6 in memmove_nodrain_sse2_clwb () from /lib64/libpmem.so.1 #2 0x00007ffff7e496cd in pmem_memcpy_persist () from /lib64/libpmem.so.1 #3 0x00007ffff7f8560a in kvdk::PMEMAllocator::Allocate (this=0x43bca0, size=4128) at /home/dennis/kvdk/engine/pmem_allocator.cpp:334 #4 0x00007ffff7f4e472 in kvdk::KVEngine::HashSetImpl (this=0x43ba90, key=..., value=..., dt=1, batch_hint=0x0) at /home/dennis/kvdk/engine/kv_engine.cpp:806 #5 0x00007ffff7f4e97c in kvdk::KVEngine::Set (this=0x43ba90, key="I\023\000\000\000\000\000", value="0yVu77DE49c8Rj9wD4D7pCWGv9DBdsi96M6QHZD3M64pnOX9FB7c8bLqj3si5DiyjMtX497gkFeLpiePIP8L5Xg5bw7cALT9gGYqjRSKz4SIsCYOjg5VLp94IdMZ3saQB1X475fOxuF3Vas5u04aIxKff25DzR05401cg2QClimC6FsL53o182S5plURj25TrkJBAZ0Y"...) at /home/dennis/kvdk/engine/kv_engine.cpp:871 #6 0x000000000041282d in DBWrite (id=1) at /home/dennis/kvdk/benchmark/bench.cpp:200 #7 0x000000000041bfda in std::__invoke_impl<void, void ()(int), int> ([email protected]: 0x4126ab <DBWrite(int)>) at /usr/include/c++/10/bits/invoke.h:60 #8 0x000000000041bf35 in std::__invoke<void ()(int), int> ([email protected]: 0x4126ab <DBWrite(int)>) at /usr/include/c++/10/bits/invoke.h:95 #9 0x000000000041bea5 in std::thread::_Invoker<std::tuple<void ()(int), int> >::_M_invoke<0ul, 1ul> (this=0x43bf68) at /usr/include/c++/10/thread:264 #10 0x000000000041be44 in std::thread::_Invoker<std::tuple<void ()(int), int> >::operator() (this=0x43bf68) at /usr/include/c++/10/thread:271 #11 0x000000000041bd80 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (*)(int), int> > >::M_run (this=0x43bf60) at /usr/include/c++/10/thread:215 #12 0x00007ffff7cfd994 in execute_native_thread_routine () from /lib64/libstdc++.so.6 #13 0x00007ffff7e8e432 in start_thread () from /lib64/libpthread.so.0 #14 0x00007ffff79fa6d3 in clone () from /lib64/libc.so.6 (gdb) f 3 #3 0x00007ffff7f8560a in kvdk::PMEMAllocator::Allocate (this=0x43bca0, size=4128) at /home/dennis/kvdk/engine/pmem_allocator.cpp:334 334 pmem_memcpy_persist( (gdb) list 329 // Padding remaining space 330 auto extra_space = thread_cache.free_entry.size - b_size; 331 // TODO optimize, do not write PMEM 332 if (extra_space >= FREE_SPACE_PADDING_BLOCK) { 333 DataHeader header(0, extra_space); 334 pmem_memcpy_persist( 335 offset2addr(thread_cache.free_entry.space_entry.offset + b_size), 336 &header, sizeof(DataHeader)); 337 } else { 338 b_size = thread_cache.free_entry.size; (gdb) p header $1 = {checksum = 0, b_size = 4152360919} (gdb) p thread_cache $2 = (kvdk::PMEMAllocator::ThreadCache &) @0x43bc60: {segment_offset = 4294968816, segment_usable_blocks = 49, free_entry = {space_entry = { offset = 140737352581152, hash_entry_reference = 0x43bc40, hash_entry_mutex = 0x7ffff7800000}, size = 4152360984}, freelist = { offsets = std::vector of length -178956965, capacity 5863877454623 = {} (gdb) p thread_cache.free_entry.space_entry.offset $3 = 140737352581152 (gdb) p /x thread_cache.free_entry.space_entry.offset $4 = 0x7ffff7e84020

    The offset looks like a address, please take a look to see if there is something wrong.

    opened by guoanwu 2
  • SeekToFirst() on non-exsiting collection cause segmentation fault

    SeekToFirst() on non-exsiting collection cause segmentation fault

    In the code below I create iterator to the collection before adding any element to it by SSet() method.

    TEST_F(EngineBasicTest, TestSeekToFirst) {
    
      const std::string empty_collection = "Empty";
      ASSERT_EQ(Engine::Open(db_path.c_str(), &engine, configs, stdout),
                Status::Ok);
    
      auto iter = engine->NewSortedIterator(empty_collection);
      iter->SeekToFirst();
      ASSERT_FALSE(iter->Valid());
    
    }
    

    As Collection would be implicitly created on first SSet() it do not exists during SeekToFirst() call. In that case iter->Valid() should gracefully return false, as such iterator obviously is invalid. Instead of that it cause segmentation fault.

    Test output:

    localhost/kvdk:build# PMEM_IS_PMEM_FORCE=1 ./dbtest --gtest_filter="*SeekToFirst*"
    Note: Google Test filter = *SeekToFirst*
    [==========] Running 1 test from 1 test suite.
    [----------] Global test environment set-up.
    [----------] 1 test from EngineBasicTest
    [ RUN      ] EngineBasicTest.TestSeekToFirst
    [LOG] time 0 ms: Initializing PMEM size 17179869184 in file /mnt/pmem0/data/data
    [LOG] time 2087 ms: Map pmem space done
    [LOG] time 2091 ms: In restoring: iterated 0 records
    Segmentation fault (core dumped)
    
    opened by karczex 2
  • segment fault happen at insert and update of sort type

    segment fault happen at insert and update of sort type

    Segmentation fault (core dumped) numactl --cpunodebind=0 --membind=0 ./bench -latency=1 -populate=1 -value_size=4096 -threads=64 -time=10 -path=/mnt/pmem0/kvdk -num=26163299 -space=536870912000 -max_write_threads=64 -fill=0 -type=sorted -read_ratio=0 -existing_keys_ratio=0 > ./results/sorted_vs4096_insert_thread64

    Segmentation fault (core dumped) numactl --cpunodebind=0 --membind=0 ./bench -latency=1 -populate=1 -value_size=4096 -threads=64 -time=10 -path=/mnt/pmem0/kvdk -num=26163299 -space=536870912000 -max_write_threads=64 -fill=0 -type=sorted -read_ratio=0 > ./results/sorted_vs4096_update_thread64

    please take a look.

    opened by Sean58238 2
  • api: enable kvdk c api and add c style examples

    api: enable kvdk c api and add c style examples

    @JiayuZzz @KVKIT seems we need to change CI to adapt changed compilation: build/dbtutorial -> build/examples/tutorial/db_xxx.

    please review.

    opened by sherry-1001 2
  • Security/Normal Issue: Other threads modifying Hashes or SortedSets may pass the linkage check after acquiring the locks

    Security/Normal Issue: Other threads modifying Hashes or SortedSets may pass the linkage check after acquiring the locks

    Bug Report

    KVDK version

    ddeb70d1d65cb6918f1cb8c7e578ac7f3c5e835e

    System configuration

    2 * Intel(R) Xeon(R) Platinum 8269C CPU @ 2.50GHz CPU: 2 * Intel(R) Xeon(R) Platinum 8269C CPU @ 2.50GHz DDR: 12 * DDR4 16384 MB 2666 MT/s PMEM: 4 * 129408 MB 2666 MT/s

    Reproduce steps

    Inject this code snippet between line 250 and line 251 in skiplist.cpp to simulate a thread pausing

      {
        using namespace std::chrono_literals;
        std::this_thread::sleep_for(1000ms);
      }
    

    Put this code snippet in tests.cpp

    TEST_F(EngineBasicTest, SortedCollectionIssue2) {
      const std::string collection01 = "C1";
      const std::string collection02 = "C2";
      int num_threads = 2;
      configs.max_write_threads = num_threads;
      configs.pmem_segment_blocks = 16;
      ASSERT_EQ(Engine::Open(db_path.c_str(), &engine, configs, stdout),
                Status::Ok);
    
      auto Operations = [&](size_t tid)
      {
        if (tid == 0)
        {    
          for (size_t i = 0; i < 11; i++)
            engine->Set(std::to_string(i), "Dummy");
    
          engine->SSet(collection01, "a", "aa");
          engine->SSet(collection01, "b", "bb");
          engine->SSet(collection01, "c", "cc");
          engine->SSet(collection01, "d", "dd");
          // This SSet will sleep 1000ms right before acquiring locks
          engine->SSet(collection01, "b1", "b1b1");
        }
        else
        {
          for (size_t i = 0; i < 15; i++)
            engine->Set(std::to_string(i), "Dummy");
    
          using namespace std::chrono_literals;
          std::this_thread::sleep_for(500ms);
          engine->SDelete(collection01, "b");
          engine->SDelete(collection01, "c");
          engine->SSet(collection02, "c", "cc");
          engine->SSet(collection02, "b", "bb");
        }
      };
    
      auto IterateThrough = [&](std::string collection)
      {
        size_t cnt = 0;
        std::cout << "Contents in " << collection << std::endl;
        auto iter = engine->NewSortedIterator(collection);
        for (iter->SeekToFirst(); iter->Valid(); iter->Next())
          ++cnt;
    
        if (collection == collection01)
        {
          EXPECT_EQ(cnt, 3) 
            << R"(There should be three keys in C1, "a", "b1" and "d")";
        }
        else if (collection == collection02)
        {
          EXPECT_EQ(cnt, 2) 
            << R"(There should be two keys in C2, "b" and "c")";
        }
      };
    
      LaunchNThreads(num_threads, Operations);
      IterateThrough(collection01);
      IterateThrough(collection02);
    
      delete engine;
    }
    
    

    Expect behavior

    Test case success

    Current behavior

    Test case fail

    Note

    Malicious user may use this defect to intercept data from other users, or may even break data on PMem during recovery if the attack is carefully designed.

    opened by ZiyanShi 0
  • Support WriteBatch for collections

    Support WriteBatch for collections

    I'm not sure if this is currently possible but it seems to me that WriteBatch is only supported for the anonymous collection. Would it be possible to support WriteBatch for sorted collections?

    opened by tthebst 1
Owner
Persistent Memory Programming
Libraries and Examples for Persistent Memory Programming
Persistent Memory Programming
🥑 ArangoDB is a native multi-model database with flexible data models for documents, graphs, and key-values. Build high performance applications using a convenient SQL-like query language or JavaScript extensions.

?? ArangoDB is a native multi-model database with flexible data models for documents, graphs, and key-values. Build high performance applications using a convenient SQL-like query language or JavaScript extensions.

ArangoDB 11.7k Oct 21, 2021
RediSearch is a Redis module that provides querying, secondary indexing, and full-text search for Redis.

A query and indexing engine for Redis, providing secondary indexing, full-text search, and aggregations.

null 3.2k Oct 12, 2021
Scylla is the real-time big data database that is API-compatible with Apache Cassandra and Amazon DynamoDB

Scylla is the real-time big data database that is API-compatible with Apache Cassandra and Amazon DynamoDB. Scylla embraces a shared-nothing approach that increases throughput and storage capacity to realize order-of-magnitude performance improvements and reduce hardware costs.

ScyllaDB 7.2k Oct 18, 2021
Kvrocks is a key-value NoSQL database based on RocksDB and compatible with Redis protocol.

Kvrocks is a key-value NoSQL database based on RocksDB and compatible with Redis protocol.

Bit Leak 905 Oct 16, 2021
Kvrocks is a distributed key value NoSQL database based on RocksDB and compatible with Redis protocol.

Kvrocks is a distributed key value NoSQL database based on RocksDB and compatible with Redis protocol.

Kvrocks Labs 909 Oct 19, 2021
FEDB is a NewSQL database optimised for realtime inference and decisioning application

FEDB is a NewSQL database optimised for realtime inference and decisioning applications. These applications put real-time features extracted from multiple time windows through a pre-trained model to evaluate new data to support decision making. Existing in-memory databases cost hundreds or even thousands of milliseconds so they cannot meet the requirements of inference and decisioning applications.

4Paradigm 2.4k Oct 16, 2021
Nebula Graph is a distributed, fast open-source graph database featuring horizontal scalability and high availability

Nebula Graph is an open-source graph database capable of hosting super large-scale graphs with billions of vertices (nodes) and trillions of edges, with milliseconds of latency. It delivers enterprise-grade high performance to simplify the most complex data sets imaginable into meaningful and useful information.

vesoft inc. 6.6k Oct 16, 2021