Fast C++ IPC using shared memory

Overview

Shadesmar

An IPC library that uses the system's shared memory to pass messages. Supports publish-subscribe and RPC.

Requires: Linux and x86. Caution: Alpha software.

Features

  • Multiple subscribers and publishers.

  • Uses a circular buffer to pass messages between processes.

  • Faster than using the network stack. High throughput, low latency for large messages.

  • Decentralized, without resource starvation.

  • Minimize or optimize data movement using custom copiers.

Usage

There's a single header file generated from the source code; it can be found here.

If you want to generate the single header file yourself, clone the repo and run:

$ cd shadesmar
$ python3 simul/simul.py

This will generate the file in include/.
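
A typical build against the generated header just adds that directory to the include path. A minimal sketch; `my_app.cpp` is a placeholder for your own source file, and the `-lrt`/`-pthread` flags are assumptions (commonly needed for POSIX shared memory and threads) that may vary with your toolchain:

$ g++ -std=c++17 -I include my_app.cpp -o my_app -lrt -pthread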


Publish-Subscribe

Publisher:

#include <cstdint>
#include <cstdlib>

#include <shadesmar/pubsub/publisher.h>

int main() {
    shm::pubsub::Publisher pub("topic_name");
    const uint32_t data_size = 1024;
    void *data = malloc(data_size);

    for (int i = 0; i < 1000; ++i) {
        pub.publish(data, data_size);
    }

    free(data);
}
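
Since publish() takes a raw pointer and a byte count, any contiguous buffer can be published, including string data. A minimal sketch, assuming the publisher copies the buffer during the publish() call so the caller keeps ownership:

#include <cstdint>
#include <string>

#include <shadesmar/pubsub/publisher.h>

int main() {
    shm::pubsub::Publisher pub("topic_name");
    std::string message = "hello, shadesmar";

    // std::string::data() is non-const in C++17; the buffer only needs
    // to stay valid for the duration of the publish() call.
    pub.publish(message.data(), static_cast<uint32_t>(message.size()));
}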

Subscriber:

#include <shadesmar/pubsub/subscriber.h>

void callback(shm::memory::Memblock *msg) {
  // `msg->ptr` to access `data`
  // `msg->size` to access `data_size`

  // The memory will be free'd at the end of this callback.
  // Copy to another memory location if you want to persist the data.
  // Alternatively, if you want to avoid the copy, you can call
  // `msg->no_delete()` which prevents the memory from being deleted
  // at the end of the callback.
}

int main() {
    shm::pubsub::Subscriber sub("topic_name", callback);

    // Using `spin_once` with a manual loop
    while(true) {
        sub.spin_once();
    }
    // OR
    // Using `spin`
    sub.spin();
}
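
The custom copiers listed under Features can be passed to the publisher at construction time (as in the first issue below). A minimal sketch using the stock `shm::memory::DefaultCopier`; the header path is an assumption, and a user-defined copier would take its place to minimize or optimize data movement:

#include <shadesmar/memory/copier.h>  // assumed location of DefaultCopier
#include <shadesmar/pubsub/publisher.h>

int main() {
    shm::memory::DefaultCopier cpy;
    shm::pubsub::Publisher pub("topic_name", &cpy);

    char payload[] = "copied via an explicit copier";
    pub.publish(payload, sizeof(payload));
}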

RPC

Client:

#include <shadesmar/rpc/client.h>

int main() {
  shm::rpc::Client client("channel_name");
  shm::memory::Memblock req, resp;
  // Populate req.
  client.call(req, &resp);
  // Use resp here.

  // resp needs to be explicitly free'd.
  client.free_resp(&resp);
}
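
How `req` is populated depends on `Memblock`; a minimal sketch, assuming `ptr` and `size` are directly assignable members (as the subscriber callback above suggests):

#include <shadesmar/rpc/client.h>

int main() {
  shm::rpc::Client client("channel_name");

  char payload[] = "ping";
  shm::memory::Memblock req, resp;
  req.ptr = payload;           // assumption: ptr/size are public members
  req.size = sizeof(payload);

  client.call(req, &resp);
  // Use resp.ptr / resp.size here.

  client.free_resp(&resp);
}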

Server:

#include <shadesmar/rpc/server.h>

bool callback(const shm::memory::Memblock &req,
              shm::memory::Memblock *resp) {
  // resp->ptr is a void ptr, resp->size is the size of the buffer.
  // You can allocate memory here, which can be free'd in the clean-up lambda.
  return true;
}

void clean_up(shm::memory::Memblock *resp) {
  // This function is called *after* the callback is finished. Any memory
  // allocated for the response can be free'd here. A separate copy of the
  // buffer is sent to the client, so this one can be safely cleaned up.
}

int main() {
  shm::rpc::Server server("channel_name", callback, clean_up);

  // Using `serve_once` with a manual loop
  while(true) {
    server.serve_once();
  }
  // OR
  // Using `serve`
  server.serve();
}
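
Putting the callback/clean-up contract together, here is a sketch of an echo server. It assumes `Memblock`'s `ptr` and `size` are directly assignable and that a non-empty request is sent; the buffer allocated in the callback is released in the clean-up function once the reply has been copied out:

#include <cstdlib>
#include <cstring>

#include <shadesmar/rpc/server.h>

bool echo_callback(const shm::memory::Memblock &req,
                   shm::memory::Memblock *resp) {
  // Allocate a response buffer and copy the request into it.
  resp->ptr = malloc(req.size);
  std::memcpy(resp->ptr, req.ptr, req.size);
  resp->size = req.size;
  return true;
}

void echo_clean_up(shm::memory::Memblock *resp) {
  // Called after the callback; release the buffer allocated above.
  free(resp->ptr);
  resp->ptr = nullptr;
  resp->size = 0;
}

int main() {
  shm::rpc::Server server("channel_name", echo_callback, echo_clean_up);
  server.serve();
}
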
Issues
  • Any example publishing a char *?

    First, I'm new to C++, but I'm trying to port this library to Node.js. This is part of my code:

         Napi::Buffer<char> buff = info[1].As<Napi::Buffer<char>>();          //nodejs buffer
         const uint32_t data_size = buff.Length();
         char * word = buff.Data();        
         shm::memory::DefaultCopier cpy;
         shm::pubsub::Publisher pub = shm::pubsub::Publisher("topic_example", &cpy);
         pub.publish(reinterpret_cast<void *>(word), data_size);
    

    But I'm getting this error on publish method: free(): invalid pointer

    bug 
    opened by amunhoz 9
  • Trying to compile

    I'm kind of new to C++ and trying to create a Node.js addon, but I'm failing to compile your project. Here are my steps:

    $ sudo apt-get install libboost-all-dev libmsgpack-dev
    $ git clone --recursive https://github.com/Squadrick/shadesmar.git
    $ cd ./vendors/shadesmar/
    $ ./install_deps.sh
    $ ./configure
    $ ninja
    

    But I'm getting:

    FAILED: CMakeFiles/dragons_test.dir/test/dragons_test.cpp.o 
    /usr/bin/c++  -DDEBUG_BUILD -Iinclude -O3 -DNDEBUG   -march=native -O2 -std=gnu++1z -MD -MT CMakeFiles/dragons_test.dir/test/dragons_test.cpp.o -MF CMakeFiles/dragons_test.dir/test/dragons_test.cpp.o.d -o CMakeFiles/dragons_test.dir/test/dragons_test.cpp.o -c test/dragons_test.cpp
    In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
                     from include/shadesmar/memory/dragons.h:31,
                     from test/dragons_test.cpp:29:
    /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h: In function ‘void shm::memory::dragons::_avx_async_cpy(void*, const void*, size_t)’:
    /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:920:1: error: inlining failed in call to always_inline ‘__m256i _mm256_stream_load_si256(const __m256i*)’: target specific option mismatch
     _mm256_stream_load_si256 (__m256i const *__X)
     ^~~~~~~~~~~~~~~~~~~~~~~~
    In file included from test/dragons_test.cpp:29:0:
    include/shadesmar/memory/dragons.h:111:55: note: called from here
         const __m256i temp = _mm256_stream_load_si256(sVec);
                                                           ^
    [6/8] Building CXX object CMakeFiles/dragons_bench.dir/benchmark/dragons.cpp.o
    FAILED: CMakeFiles/dragons_bench.dir/benchmark/dragons.cpp.o 
    /usr/bin/c++  -DDEBUG_BUILD -Iinclude -isystem /usr/local/include -O3 -DNDEBUG   -march=native -O2 -std=gnu++1z -MD -MT CMakeFiles/dragons_bench.dir/benchmark/dragons.cpp.o -MF CMakeFiles/dragons_bench.dir/benchmark/dragons.cpp.o.d -o CMakeFiles/dragons_bench.dir/benchmark/dragons.cpp.o -c benchmark/dragons.cpp
    In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
                     from include/shadesmar/memory/dragons.h:31,
                     from benchmark/dragons.cpp:23:
    /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h: In function ‘void shm::memory::dragons::_avx_async_cpy(void*, const void*, size_t)’:
    /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:920:1: error: inlining failed in call to always_inline ‘__m256i _mm256_stream_load_si256(const __m256i*)’: target specific option mismatch
     _mm256_stream_load_si256 (__m256i const *__X)
     ^~~~~~~~~~~~~~~~~~~~~~~~
    In file included from benchmark/dragons.cpp:23:0:
    include/shadesmar/memory/dragons.h:111:55: note: called from here
         const __m256i temp = _mm256_stream_load_si256(sVec);
                                                           ^
    ninja: build stopped: subcommand failed.
    

    Any help? Thanks in advance.

    bug 
    opened by amunhoz 8
  • Question to relation with ros2

    Hi there, I came across this code through reddit.

    It is not really an issue, but I also couldn't find a forum. Please feel free to close.

    I am wondering what advantages this library offers over ros2 for example.

    Thanks and kind regards

    question 
    opened by mhubii 4
  • just some questions

    Every instance running shadesmar will share the same events base, correct? There is no way to open a different channel, like with Unix sockets? What about instances running inside containers (Docker): will they share the same events base too?

    Thanks

    question 
    opened by amunhoz 3
  • [RPC] Replace poll with condvar

    We can keep running stats of time spent waiting under the condvar, so that each successive sleep time is in line with the actual waiting time. This will prevent sleeping for too long, or waking up too early and wasting CPU cycles doing polls. (A rough sketch of the idea follows this item.)

    enhancement 
    opened by Squadrick 3
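
    A rough illustration of the idea (not code from the repo), using std::condition_variable and an exponential moving average of observed wait times; the class and member names here are made up for the sketch:

    #include <chrono>
    #include <condition_variable>
    #include <mutex>

    class AdaptiveWaiter {
     public:
      void notify() {
        {
          std::lock_guard<std::mutex> lk(m_);
          ready_ = true;
        }
        cv_.notify_one();
      }

      void wait() {
        const auto start = std::chrono::steady_clock::now();
        std::unique_lock<std::mutex> lk(m_);
        while (!ready_) {
          // Sleep roughly as long as waits have historically taken,
          // instead of polling at a fixed short interval.
          cv_.wait_for(lk, avg_wait_);
        }
        ready_ = false;
        const auto elapsed =
            std::chrono::duration_cast<std::chrono::microseconds>(
                std::chrono::steady_clock::now() - start);
        // Running estimate: 7/8 old average + 1/8 new sample.
        avg_wait_ = (avg_wait_ * 7 + elapsed) / 8;
      }

     private:
      std::mutex m_;
      std::condition_variable cv_;
      bool ready_ = false;
      std::chrono::microseconds avg_wait_{100};
    };
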
  • "Increase buffer_size" and not sending message

    I'm trying to create a server/client system with Shadesmar, using topics to communicate with specific clients. After the server receives the message, it responds, and then I get this problem. Both server and client are running on the same instance. Only two messages were exchanged, so I don't think it could be a memory problem. Other specs:

    • I'm using multithreading.
    • The copier and publisher are created in a global map.
    • I tried creating a new instance of the copier and publisher to send a message back; same problem.

    Any ideas?

    Other questions:

    • Can I use the same 'DefaultCopier' with multiple publishers/topics?
    • Can I use the same 'DefaultCopier' to publish and subscribe?
    bug 
    opened by amunhoz 2
  • Fix shm namespace issue in macOS

    Fixes the failing macOS builds

    Also, why is shm::memory::dragons disabled for macOS? I was able to run the dragons benchmark without any issues after enabling it on macOS.

    opened by shrijitsingh99 2
  • std::filesystem namespace conflict

    When trying to build the shadesmar tests, this error pops up:

    ../include/shadesmar/memory/tmp.h:41:48: error: ‘namespace std::filesystem = std::experimental::std::experimental::filesystem;’ conflicts with a previous declaration
    

    The previous declaration is in chrono header.

    opened by sudo-panda 1
  • [test] Rewrite benchmarks and tests

    Currently, the different tests and benchmarks do not use any framework, and resort to using asserts and manually timing functions. Ideally, we would want to use Catch for testing and Google benchmark for the different benchmarks.

    cleanup 
    opened by Squadrick 1
  • [macos] msgpack.hpp not found

    1. install_deps.sh successfully installs msgpack from brew.
    2. cmake can successfully find the package.

    During the compilation step it fails with (failed CI):

    [1/18] Building CXX object CMakeFiles/micro_benchmark.dir/test/micro_benchmark.cpp.o
    FAILED: CMakeFiles/micro_benchmark.dir/test/micro_benchmark.cpp.o 
    /Applications/Xcode_11.5.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++   -I../include -O2 -DNDEBUG -isysroot /Applications/Xcode_11.5.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.15.sdk   -march=native -std=gnu++17 -MD -MT CMakeFiles/micro_benchmark.dir/test/micro_benchmark.cpp.o -MF CMakeFiles/micro_benchmark.dir/test/micro_benchmark.cpp.o.d -o CMakeFiles/micro_benchmark.dir/test/micro_benchmark.cpp.o -c ../test/micro_benchmark.cpp
    ../test/micro_benchmark.cpp:25:10: fatal error: 'msgpack.hpp' file not found
    #include <msgpack.hpp>
             ^~~~~~~~~~~~~
    1 error generated.
    [2/18] Building CXX object CMakeFiles/pubsub_bin_test.dir/test/pubsub_bin_test.cpp.o
    FAILED: CMakeFiles/pubsub_bin_test.dir/test/pubsub_bin_test.cpp.o 
    /Applications/Xcode_11.5.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++   -I../include -O2 -DNDEBUG -isysroot /Applications/Xcode_11.5.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.15.sdk   -march=native -std=gnu++17 -MD -MT CMakeFiles/pubsub_bin_test.dir/test/pubsub_bin_test.cpp.o -MF CMakeFiles/pubsub_bin_test.dir/test/pubsub_bin_test.cpp.o.d -o CMakeFiles/pubsub_bin_test.dir/test/pubsub_bin_test.cpp.o -c ../test/pubsub_bin_test.cpp
    In file included from ../test/pubsub_bin_test.cpp:32:
    In file included from ../include/shadesmar/pubsub/publisher.h:35:
    ../include/shadesmar/message.h:32:10: fatal error: 'msgpack.hpp' file not found
    #include <msgpack.hpp>
             ^~~~~~~~~~~~~
    1 error generated.
    

    Maybe it's been renamed from msgpack.hpp to msgpack.h?

    bug 
    opened by Squadrick 1
  • Replace boost's managed shared memory with custom allocator

    In Memory, we currently use boost::interprocess::managed_shared_memory for allocating/deallocating the shared memory needed for each message at runtime. It uses a red-black tree for best-fit allocation. This is overkill for shadesmar, since the number of allocations is fixed to the buffer size and each allocation (message) will be roughly the same size. Profiling raw_benchmark shows that most function calls go to boost's red-black tree implementation. (A sketch of a simpler fixed-slot scheme follows this item.)

    enhancement 
    opened by Squadrick 1
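
    A rough sketch of the fixed-slot scheme this suggests (illustration only, not shadesmar code); a real implementation would keep the bookkeeping inside the shared memory segment rather than in a process-local vector:

    #include <cstddef>
    #include <cstdint>
    #include <vector>

    // One contiguous region divided into equally sized slots. alloc() scans
    // for a free slot starting from the last position; dealloc() is O(1).
    // No tree required.
    class FixedSlotAllocator {
     public:
      FixedSlotAllocator(void *base, size_t slot_size, size_t num_slots)
          : base_(static_cast<uint8_t *>(base)),
            slot_size_(slot_size),
            in_use_(num_slots, false) {}

      // Returns a free slot, or nullptr if every slot is occupied.
      void *alloc() {
        for (size_t i = 0; i < in_use_.size(); ++i) {
          const size_t idx = (next_ + i) % in_use_.size();
          if (!in_use_[idx]) {
            in_use_[idx] = true;
            next_ = (idx + 1) % in_use_.size();
            return base_ + idx * slot_size_;
          }
        }
        return nullptr;
      }

      void dealloc(void *ptr) {
        const size_t idx =
            (static_cast<uint8_t *>(ptr) - base_) / slot_size_;
        in_use_[idx] = false;
      }

     private:
      uint8_t *base_;
      size_t slot_size_;
      std::vector<bool> in_use_;
      size_t next_ = 0;
    };
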
  • Performance (at least compared to nanomsg)

    I wonder how this performs compared to nanomsg (zero-copy, very minimal, fully scalable, ...).

    Could you do some measurements (at least preliminary ones, with trade-offs, to get a general sense) and publish them, ideally right in the README?

    Rationale

    I consider nanomsg a baseline, which is why I'm not asking about comparisons to other messaging/IPC products (there are many good ones, but nanomsg is easy to measure against due to its stability, easy setup, nice perf package, and leading performance).

    documentation 
    opened by dumblob 1
  • Clarification about RobustLock::lock

    void RobustLock::lock() {
      while (!mutex_.try_lock()) {
        if (exclusive_owner.load() != 0) {
          auto ex_proc = exclusive_owner.load();
          if (proc_dead(ex_proc)) {
            // Here, I thought, ex_proc is always equal to exclusive_owner.
            // Why not remove the compare_exchange_strong check?
            if (exclusive_owner.compare_exchange_strong(ex_proc, 0)) {
              mutex_.unlock();
              continue;
            }
          }
        } else {
          prune_readers();
        }
    
        std::this_thread::sleep_for(std::chrono::microseconds(1));
      }
      exclusive_owner = getpid();
    }
    
    documentation 
    opened by liuxw7 3
  • Probable memory leak

    Hello @Squadrick, thanks for your helpful IPC project, it really saved my life. I tested this project for half an hour, and the memory it consumed grew from 0.2% to 0.5%, but all my program does is receive and publish, nothing else. I suspect there are some memory-leak-related bugs in the code. Thank you for your reply :)

    bug 
    opened by jian-li 2
  • Support zero-copy communication

    Here's one way to achieve this:

    Publisher p("topic");
    void *ptr = p.get_msg(size);
    

    ptr is allocated in the shared memory (using Allocator) and given to the user. We also assign an Element in the shared queue to ptr. We hold a writer lock on this element until ptr is published. We may need to update the base Element to add an extra field: is_zero_copied, so that the consumer can react accordingly.

    auto obj = new (ptr) SomeClass( /* params */);
    // update obj
    p.zero_copy_publish(ptr); // releases the shared queue element lock
    

    On the consumer side, we'll return a subclass of Memblock: ZeroCopyMemblock which will not have a no_delete() and will be deallocated at the end of the callback. We'll need to check the logic for locks as well.

    Code path for copied-communication:

    // element is the currently accessed shared queue position
    Memblock memblock;
    element.lock.acquire();
    memcpy(element.ptr, memblock.ptr, element.size);
    memblock.size = element.size;
    element.lock.release();
    
    callback(memblock);
    
    if (memblock.should_free) {
       delete memblock;
    }
    

    New code path for zero-copy communication:

    element.lock.acquire();
    callback(ZeroCopyMemblock{element.ptr, element.size});
    element.lock.release();
    
    allocator.dealloc(element);
    

    The above has been shown for pub-sub, but they can be extended to RPC too.


    Here's a problem: we can't free each message pointer independently. A message pointer can only be freed after all preceding message allocations are released, due to how Allocator works; it uses a strictly FIFO allocation strategy. For performance, we may want to consider moving to a more complex general-purpose allocator.

    NOTE: Writing a general-purpose allocator to work on a single chunk of shared memory is very error-prone.

    enhancement help wanted 
    opened by Squadrick 0
Owner
Dheeraj R Reddy
Whatever works.