Libmill is a library that introduces Go-style concurrency to C.
Documentation
For the documentation check the project website:
License
Libmill is licensed under the MIT/X11 license.
I have a prototype project at https://github.com/benjolitz/blaster
In the most recent commit, I've rewritten my approach to be cleaner.
After ~100 hits from Apache Bench on OS X, I get a panic: multiple coroutines waiting for a single file descriptor
Here is a console session:
(cpython27) benjolitz-laptop:~/software/blaster [master]$ make clean debug ; lldb ./blaster -- 8051
Deleting blaster symlink
Deleting directories
Creating directories
Beginning debug build
Compiling: src//blaster.c -> build/debug//blaster.o
Compile time: 0
Compiling: src//contrib/http_parser.c -> build/debug//contrib/http_parser.o
Compile time: 0
Linking: bin/debug/blaster
Link time: 0
Making symlink: blaster -> bin/debug/blaster
Total build time: 0
(lldb) target create "./blaster"
Current executable set to './blaster' (x86_64).
(lldb) settings set -- target.run-args "8051"
(lldb) run
Process 81146 launched: './blaster' (x86_64)
Starting 1 process(es)
Listening on port 8051
panic: multiple coroutines waiting for a single file descriptor
Process 81146 stopped
* thread #1: tid = 0x99b7c4, 0x00007fff8bd0b002 libsystem_kernel.dylib`__pthread_kill + 10, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
frame #0: 0x00007fff8bd0b002 libsystem_kernel.dylib`__pthread_kill + 10
libsystem_kernel.dylib`__pthread_kill:
-> 0x7fff8bd0b002 <+10>: jae 0x7fff8bd0b00c ; <+20>
0x7fff8bd0b004 <+12>: movq %rax, %rdi
0x7fff8bd0b007 <+15>: jmp 0x7fff8bd05bdd ; cerror_nocancel
0x7fff8bd0b00c <+20>: retq
(lldb) bt
* thread #1: tid = 0x99b7c4, 0x00007fff8bd0b002 libsystem_kernel.dylib`__pthread_kill + 10, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
* frame #0: 0x00007fff8bd0b002 libsystem_kernel.dylib`__pthread_kill + 10
frame #1: 0x00007fff93eb35c5 libsystem_pthread.dylib`pthread_kill + 90
frame #2: 0x00007fff8a2316e7 libsystem_c.dylib`abort + 129
frame #3: 0x0000000100010737 libmill.15.dylib`mill_panic + 39
frame #4: 0x0000000100011cbc libmill.15.dylib`mill_fdwait + 137
frame #5: 0x0000000100012d20 libmill.15.dylib`tcpaccept + 144
frame #6: 0x000000010000181f blaster`main(arg_count=2, args=0x00007fff5fbff830) + 511 at blaster.c:182
frame #7: 0x00007fff91bc65ad libdyld.dylib`start + 1
(lldb)
After inspecting tcp.c, I cannot find where calling tcpclose() deregisters the connection explicitly.
Also, I cannot figure out how to pass a tcpsock into a new coroutine for handling HTTP/1.1 connections cleanly.
The following error happens:
chan: chan.c:87: int main(): Assertion `val == 888' failed.
make: *** [test_chan] Aborted
The value is actually 999. Seems like there's some aggressive and possibly incorrect reordering going on.
Hi, love the library. I've been playing around with porting it to Windows under MSVC. The go function was fairly straightforward, although I had to resort to assembly in the end. MSVC doesn't support VLAs, and things crashed when I tried alloca. I think it tries to write to the intervening memory. It ends up looking something like the following:
do {\
    void *mill_sp = mill_go_prologue();\
    if(mill_sp) {\
        __asm {mov mill_rax, mill_rsp}\
        __asm {mov mill_rsp, mill_sp}\
        __asm {push mill_rax}\
        fn;\
        __asm {pop mill_rsp}\
        mill_go_epilogue();\
    }\
} while(0)
For the choose statement I've taken a different approach. The choose implementation at the moment uses goto labels which is very specific to GCC. The approach I've been playing with looks like the following:
switch (choose(IN, ch1, IN, ch2, END)) {
case 0:
    printf("coroutine 'a(%d)' has finished first!\n", ch1->value);
    break;
case 1:
    printf("coroutine 'b(%d)' has finished first!\n", ch2->value);
    break;
}
choose is then a variadic function that takes (enum, channel) pairs, where the enum is one of IN, OUT, END, DEFAULT. The value can be accessed as channel->value, which works because the channel pointers are typed (see next). This approach also seems to drastically simplify the choose logic, as it doesn't have to track the labels (e.g. I've removed the clause structure and the channel just tracks the list of continuations).
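To make the shape of such a variadic choose concrete, here is a minimal sketch. It is not the code from the port: the names sketch_choose, struct sketch_chan and the SK_* tags are hypothetical, the channel is reduced to a one-slot mailbox, and the function only polls. A real implementation would suspend the coroutine when no clause can proceed.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical clause tags, mirroring the IN/OUT/END/DEFAULT enum. */
enum sketch_op { SK_IN, SK_OUT, SK_DEFAULT, SK_END };

/* Hypothetical stand-in for a channel: a one-slot mailbox. */
struct sketch_chan {
    int has_value; /* non-zero when a value is waiting to be received */
    int value;
};

/* Returns the zero-based index of the first clause that can proceed,
   the index of the DEFAULT clause if none can, or -1 otherwise.
   This sketch only polls; it never blocks or switches context. */
int sketch_choose(int op, ...) {
    va_list ap;
    int idx = 0, dflt = -1, result = -1;

    va_start(ap, op);
    while (op != SK_END) {
        if (op == SK_DEFAULT) {
            dflt = idx; /* DEFAULT carries no channel argument */
        } else {
            struct sketch_chan *ch = va_arg(ap, struct sketch_chan *);
            assert(ch != NULL);
            if (result < 0) {
                if (op == SK_IN && ch->has_value)
                    result = idx; /* a value can be received */
                else if (op == SK_OUT && !ch->has_value)
                    result = idx; /* the slot is free for a send */
            }
        }
        ++idx;
        op = va_arg(ap, int); /* next tag; enum constants are passed as int */
    }
    va_end(ap);
    return result >= 0 ? result : dflt;
}
```

The returned index can be used directly in a switch statement, which is what removes the need for GCC's label addresses.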
I've also been playing with having the channels be typed. I've played with this concept before with a containers library. To use a channel you would first have a call to DECL_CHAN(name, type) somewhere in the header. You can then use chan(name) as the channel type. For example:
DECL_CHAN(int, int);
void worker(int count, const char *text, chan(int) ch) {
    int i;
    for(i = 0; i != count; ++i) {
        printf("%s\n", text);
        musleep(10000);
    }
    chs(ch, count);
    chclose(ch);
}
int main() {
    chan(int) ch1 = chmake(int, 0);
    go(worker(4, "a", chdup(ch1)));
    chan(int) ch2 = chmake(int, 0);
    go(worker(2, "b", chdup(ch2)));
    switch (choose(IN, ch1, IN, ch2, END)) {
    case 0:
        printf("coroutine 'a(%d)' has finished first!\n", ch1->value);
        break;
    case 1:
        printf("coroutine 'b(%d)' has finished first!\n", ch2->value);
        break;
    }
    chclose(ch2);
    chclose(ch1);
    return 0;
}
Thoughts?
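For readers wondering how a DECL_CHAN-style macro can give typed channels in plain C, the following is one possible sketch. It is an assumption about the general technique, not the code from the port: TCH_DECL, tch, tch_make and tch_send are hypothetical names, and the channel holds a single slot with no coroutine machinery behind it.

```c
#include <stdlib.h>

/* Hypothetical: declare a channel struct whose value slot has the given type. */
#define TCH_DECL(name, type) \
    struct tch_##name { type value; int has_value; }

/* Hypothetical: spell the pointer type of a previously declared channel. */
#define tch(name) struct tch_##name *

/* Hypothetical: allocate a zeroed channel of the named kind (NULL on failure). */
#define tch_make(name) ((tch(name))calloc(1, sizeof(struct tch_##name)))

/* Store a value; the assignment is checked against the declared type. */
#define tch_send(ch, v) ((ch)->value = (v), (ch)->has_value = 1)

TCH_DECL(int, int);

/* Round trip through a typed channel; returns -1 if allocation fails. */
int tch_demo(void) {
    tch(int) ch = tch_make(int);
    int received;

    if (ch == NULL)
        return -1;
    tch_send(ch, 7);
    received = ch->value; /* ch->value has type int, no cast needed */
    free(ch);
    return received;
}
```

Because each declared channel is a distinct struct type, passing a channel of one element type where another is expected is diagnosed at compile time. One caveat of this spelling is that `tch(int) a, b;` declares only `a` as a pointer.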
I have made a preliminary test to see whether an assembly implementation of the context switching code brings a significant improvement. On x86-64 there's a 17-18% improvement on the go perf test and a 28% improvement in context switching alone.
I'll set up a GitHub branch when I get home or find a wifi access point without the ssh ports disabled...
With inline assembly:
$ cat /proc/cpuinfo | grep CPU | head -n1
model name : Intel(R) Core(TM) i3-5010U CPU @ 2.10GHz
$ ./perf/go 10
executed 10M coroutines in 0.780000 seconds
duration of one coroutine creation+termination: 78 ns
coroutine creations+terminatios per second: 12.820512M
$ ./perf/ctxswitch 10
performed 10M context switches in 0.316000 seconds
duration of one context switch: 31 ns
context switches per second: 32.258064M
With the old sigsetjmp/siglongjmp:
$ cat /proc/cpuinfo | grep CPU | head -n1
model name : Intel(R) Core(TM) i3-5010U CPU @ 2.10GHz
$ ./perf/go 10
executed 10M coroutines in 0.948000 seconds
duration of one coroutine creation+termination: 94 ns
coroutine creations+terminatios per second: 10.638297M
$ ./ctxswitch 10
performed 10M context switches in 0.437000 seconds
duration of one context switch: 43 ns
context switches per second: 23.255812M
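As a cross-check of the percentages quoted above, the arithmetic can be written out as below. The function names are made up for illustration. The quoted 17-18% and 28% match the reduction in elapsed time (0.948 s to 0.780 s, and 0.437 s to 0.316 s); expressed as a throughput gain for the same 10M operations, the same measurements come out at roughly 21.5% and 38.3%.

```c
#include <assert.h>

/* Percentage by which the elapsed time dropped, relative to the old time. */
double time_reduction_pct(double old_seconds, double new_seconds) {
    assert(old_seconds > 0.0);
    return (old_seconds - new_seconds) / old_seconds * 100.0;
}

/* Percentage by which throughput rose for the same amount of work. */
double throughput_gain_pct(double old_seconds, double new_seconds) {
    assert(new_seconds > 0.0);
    return (old_seconds / new_seconds - 1.0) * 100.0;
}
```

Calling time_reduction_pct(0.948, 0.780) gives about 17.7 and time_reduction_pct(0.437, 0.316) gives about 27.7.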
I would like to develop an HTTP server using libmill. As a first step, I am planning to port an existing Go-based HTTP server library (https://github.com/valyala/fasthttp) to libmill. Is it a good idea? Will I get more performance than Go? (I am expecting more performance since this web server will be in C.)
I hope the port will be easy, as there is a 1:1 mapping of the concurrency model. I would also like to port this new web server code to libdill once it has matured.
I recommend never calling 'abort' or 'exit' from within library code, unless it is built in a Debug mode for testing.
https://github.com/sustrik/libmill/blob/master/utils.h#L74 is called from: https://github.com/sustrik/libmill/blob/master/valbuf.c#L47
You undermine the behavior of programs using your library. Some programs may be running in critical or sensitive contexts and need to do some level of cleanup before exiting.
You should return an error instead and let the calling code handle that.
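A sketch of what "return an error instead" could look like for an allocation path follows. This is not the code at the linked lines: struct sketch_buf and sketch_buf_reserve are hypothetical names, and the convention of returning -1 with errno set is an assumption chosen because it is common in C libraries.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical growable buffer, standing in for a library-internal one. */
struct sketch_buf {
    void *data;
    size_t capacity;
};

/* Ensure the buffer can hold at least len bytes.
   Returns 0 on success. On failure returns -1 and sets errno,
   leaving the buffer untouched so the caller can clean up and decide
   what to do next, instead of the process being aborted. */
int sketch_buf_reserve(struct sketch_buf *buf, size_t len) {
    void *grown;

    if (buf == NULL) {
        errno = EINVAL;
        return -1;
    }
    if (len <= buf->capacity)
        return 0; /* already large enough */
    grown = realloc(buf->data, len);
    if (grown == NULL) {
        errno = ENOMEM;
        return -1;
    }
    buf->data = grown;
    buf->capacity = len;
    return 0;
}
```

The caller then checks the return value and can release its own resources before giving up, which is the cleanup opportunity the post asks for.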
Build step6 of the tutorial on Fedora 20:
gcc -o server step6.c -lmill
Run server
Connect two clients
Wait until the server times out and closes the connection
As soon as the first connection times out, the server segfaults
$ gdb ./server
GNU gdb (GDB) Fedora 7.7.1-21.fc20
...
Reading symbols from ./server...(no debugging symbols found)...done.
(gdb) run
Starting program: /home/nsoffer/src/libmill/tutorial/server
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Total number of connections: 1
Active connections: 1
Failed connections: 0
Total number of connections: 2
Active connections: 2
Failed connections: 0
Total number of connections: 2
Active connections: 1
Failed connections: 1
Program received signal SIGSEGV, Segmentation fault.
0x000000305fa10c78 in ?? () from /lib64/libc.so.6
(gdb) bt full
#0 0x000000305fa10c78 in ?? () from /lib64/libc.so.6
No symbol table info available.
#1 0x000000305f6092b9 in check_match (sym=0x7ffff7d84b04) at dl-lookup.c:149
stt = <optimized out>
verstab = <optimized out>
type_class = 32767
ref = 0x700000001
strtab = 0x0
undef_name = 0x0
map = 0xffffffff00000001
version = 0x600000000
symidx = 4158409904
flags = -1
num_versions = 1
versioned_sym = 0x4
#2 0x000000305f609acb in do_lookup_x (new_hash=new_hash@entry=294296705, old_hash=old_hash@entry=0x7ffff7d84c00, result=result@entry=0x7ffff7d84c10, scope=<optimized out>,
i=<optimized out>, i@entry=0, flags=1, flags@entry=-8096, skip=skip@entry=0x0, undef_map=undef_map@entry=0x305f821168) at dl-lookup.c:249
hasharr = 0x7ffff7dc92b4
bucket = <optimized out>
bitmask_word = <optimized out>
hashbit1 = 1
hashbit2 = <optimized out>
symtab = 0x7ffff7dc9350
sym = <optimized out>
bitmask = <optimized out>
n = 6
list = 0x0
num_versions = 0
versioned_sym = 0x0
map = 0x7ffff7ffa660
type_class = 1
undef_name = 0x40060f "mill_go_epilogue"
strtab = 0x7ffff7dc9b00 ""
symidx = 44
ref = 0x4004e8
version = 0x0
#3 0x000000305f609daf in _dl_lookup_symbol_x (undef_name=0x40060f "mill_go_epilogue", undef_map=0x305f821168, ref=0x7ffff7d84cc8, symbol_scope=0x305f8214c0, version=0x0,
type_class=1, flags=-8096, skip_map=0x0) at dl-lookup.c:737
res = <optimized out>
start = 0
new_hash = 294296705
---Type <return> to continue, or q <return> to quit---
old_hash = 4294967295
current_value = {s = 0x7ffff7dc9770, m = 0x7ffff7ffa660}
scope = 0x305f8214c0
i = 0
protected = <optimized out>
#4 0x0000000000000001 in ?? ()
No symbol table info available.
#5 0x00007fffffffe060 in ?? ()
No symbol table info available.
#6 0x0000000000000000 in ?? ()
No symbol table info available.
Hello, Martin.
As you might remember, I maintain https://github.com/Zewo/Venice, a Swift library that wraps libmill.
Unfortunately we're having a bit of a leakage problem on macOS. Swift uses Foundation under the covers on Darwin which in turn uses ARC. The problem is that because of libmill context switches the calls to push/pop on the autorelease pool get unbalanced.
To deal with this, we basically need a hook right before any resume of a coroutine and a hook right after it’s suspended, plus a guarantee that those hooks are always paired 1:1.
We tried inspecting the code ourselves, but I'm afraid we couldn't find all the places. Could you help us pinpoint the actual places the context switches occur, or the best place to put those hooks, given we need 1:1?
When binding to libmill from Node.js (basically a C++ environment), I got:
error: typedef redefinition with different types ('struct tcpsock *' vs 'tcpsock')
The proposed change avoids the problem, allowing me to bind to libmill.
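For context on this class of error: in C, struct tags live in their own namespace, so a declaration of the form `typedef struct tcpsock *tcpsock;` is fine, but in C++ the tag and the typedef name share one namespace and the same line is rejected. The proposed change itself is not shown here, so the following is only a sketch of one common workaround, with hypothetical names: give the struct tag a name that differs from the typedef.

```c
#include <stddef.h>

#ifdef __cplusplus
extern "C" {
#endif

/* The tag (sketch_tcpsock_s) differs from the typedef name (sketch_tcpsock),
   so this declaration is accepted by both C and C++ compilers. */
typedef struct sketch_tcpsock_s *sketch_tcpsock;

/* Hypothetical body; a real library would keep this private. */
struct sketch_tcpsock_s {
    int fd;
};

/* Returns the descriptor behind the handle, or -1 for a null handle. */
int sketch_tcpsock_fd(sketch_tcpsock s) {
    return s != NULL ? s->fd : -1;
}

#ifdef __cplusplus
}
#endif
```

User code keeps using the typedef name only, so renaming the tag does not change the public API.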
We should also take a look at the use of assert throughout the code base, since configuring with cmake -DCMAKE_BUILD_TYPE=Release .. for optimized builds will use the -DNDEBUG flag, which removes the expressions within assert statements entirely. That, or possibly force cmake to not build with that flag, for the tests at least.
The problem with assert from assert.h is that assertions are meant to be removed with -DNDEBUG. It would be preferable to have an assert macro that always executes the expression and only decides, based on a symbol defined at compile time, whether to check the result.
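A minimal sketch of such a macro is below. The names sketch_verify and SKETCH_NO_CHECKS are hypothetical, not something libmill defines. The point is that the expression is evaluated in both configurations; only the check of its result is compiled out.

```c
#include <stdio.h>
#include <stdlib.h>

#ifdef SKETCH_NO_CHECKS
/* Checks disabled: still evaluate the expression for its side effects. */
#define sketch_verify(expr) ((void)(expr))
#else
/* Checks enabled: evaluate the expression and abort the test if it is false. */
#define sketch_verify(expr) \
    do { \
        if (!(expr)) { \
            fprintf(stderr, "%s:%d: check failed: %s\n", \
                    __FILE__, __LINE__, #expr); \
            abort(); \
        } \
    } while (0)
#endif

/* Demo: the call inside the macro has a side effect that must not vanish. */
static int sketch_calls = 0;
static int sketch_step(void) { return ++sketch_calls; }

int sketch_demo(void) {
    sketch_verify(sketch_step() == 1);
    sketch_verify(sketch_step() == 2);
    return sketch_calls; /* 2 in both configurations */
}
```

With plain assert and NDEBUG defined, the two sketch_step() calls would disappear and the function would return 0, which is the failure mode described above.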
I'm integrating libpq (PostgreSQL client driver) with libmill and one of the tests I have is around concurrent SQL transactions on separate connections in the same process.
To help simulate concurrency I sleep a transaction in the middle to allow a second transaction to begin. While doing so I keep encountering an issue where msleep never returns.
At first I figured this was an issue with libpq and threading or something, but after a lot of debugging @raedwulf suggested I try libdill, just to see whether the bug is in libpq or libmill. Sure enough, with libdill msleep never causes an issue.
Attached is the libmill code and the libdill code. I can't really figure out where the failure scenario is, goredump() displays all the coroutines just fine. Maybe it's worthwhile looking into the timers list and see if it's gone missing?
When running the sample, Transaction A starts and then, halfway through, sleeps at msleep(now() + 2000). Transaction B starts, then sleeps at msleep(now() + 500) and wakes. Transaction B then halts in its tracks waiting on data from the server (because on the server it's blocked by Transaction A), but Transaction A never wakes up, so it can never continue.
@sustrik
To someone who has read a decent amount of your work, it probably feels obvious that libdill is basically the successor of libmill: libdill was written later, libdill looks more like C, libdill has a more recently active GitHub commit history, libdill seems to be favored by you, and libdill's website is still alive.
However, people who come here have a high chance of not being aware of the above points, and might even spend time adopting libmill when libdill is probably the better choice.
Proposal: add a note to the start of the libmill README on this repo that says libdill is recommended over libmill, and then ideally archive this repo entirely if you are not looking to actively maintain it anymore.
It seems libmill.org expired and got parked some time ago, and libmill's homepage is inaccessible even through the github.io domain. I think if you deleted the CNAME file in the gh-pages branch then https://sustrik.github.io/libmill would work?
/tmp/tmp.F4CPONhvAn/dns/dns.c:303:5: warning: macro expansion producing 'defined' has undefined behavior [-Wexpansion-to-defined]
#if HAVE___ATOMIC_FETCH_ADD && __GCC_ATOMIC_LONG_LOCK_FREE == 2
^
/tmp/tmp.F4CPONhvAn/dns/dns.c:295:34: note: expanded from macro 'HAVE___ATOMIC_FETCH_ADD'
#define HAVE___ATOMIC_FETCH_ADD (defined __ATOMIC_RELAXED)
^
/tmp/tmp.F4CPONhvAn/dns/dns.c:312:5: warning: macro expansion producing 'defined' has undefined behavior [-Wexpansion-to-defined]
#if HAVE___ATOMIC_FETCH_SUB && __GCC_ATOMIC_LONG_LOCK_FREE == 2
^
/tmp/tmp.F4CPONhvAn/dns/dns.c:299:33: note: expanded from macro 'HAVE___ATOMIC_FETCH_SUB'
#define HAVE___ATOMIC_FETCH_SUB HAVE___ATOMIC_FETCH_ADD
^
/tmp/tmp.F4CPONhvAn/dns/dns.c:295:34: note: expanded from macro 'HAVE___ATOMIC_FETCH_ADD'
#define HAVE___ATOMIC_FETCH_ADD (defined __ATOMIC_RELAXED)
^
/tmp/tmp.F4CPONhvAn/dns/dns.c:668:5: warning: macro expansion producing 'defined' has undefined behavior [-Wexpansion-to-defined]
#if DNS_HAVE_SOCKADDR_UN
^
/tmp/tmp.F4CPONhvAn/dns/dns.c:662:31: note: expanded from macro 'DNS_HAVE_SOCKADDR_UN'
#define DNS_HAVE_SOCKADDR_UN (defined AF_UNIX && !defined _WIN32)
^
/tmp/tmp.F4CPONhvAn/dns/dns.c:668:5: warning: macro expansion producing 'defined' has undefined behavior [-Wexpansion-to-defined]
/tmp/tmp.F4CPONhvAn/dns/dns.c:662:51: note: expanded from macro 'DNS_HAVE_SOCKADDR_UN'
#define DNS_HAVE_SOCKADDR_UN (defined AF_UNIX && !defined _WIN32)
^
/tmp/tmp.F4CPONhvAn/dns/dns.c:723:5: warning: macro expansion producing 'defined' has undefined behavior [-Wexpansion-to-defined]
#if DNS_HAVE_SOCKADDR_UN
^
/tmp/tmp.F4CPONhvAn/dns/dns.c:662:31: note: expanded from macro 'DNS_HAVE_SOCKADDR_UN'
#define DNS_HAVE_SOCKADDR_UN (defined AF_UNIX && !defined _WIN32)
^
/tmp/tmp.F4CPONhvAn/dns/dns.c:723:5: warning: macro expansion producing 'defined' has undefined behavior [-Wexpansion-to-defined]
/tmp/tmp.F4CPONhvAn/dns/dns.c:662:51: note: expanded from macro 'DNS_HAVE_SOCKADDR_UN'
#define DNS_HAVE_SOCKADDR_UN (defined AF_UNIX && !defined _WIN32)
^
/tmp/tmp.F4CPONhvAn/dns/dns.c:729:5: warning: macro expansion producing 'defined' has undefined behavior [-Wexpansion-to-defined]
#if DNS_HAVE_SOCKADDR_UN
^
/tmp/tmp.F4CPONhvAn/dns/dns.c:662:31: note: expanded from macro 'DNS_HAVE_SOCKADDR_UN'
#define DNS_HAVE_SOCKADDR_UN (defined AF_UNIX && !defined _WIN32)
^
/tmp/tmp.F4CPONhvAn/dns/dns.c:729:5: warning: macro expansion producing 'defined' has undefined behavior [-Wexpansion-to-defined]
/tmp/tmp.F4CPONhvAn/dns/dns.c:662:51: note: expanded from macro 'DNS_HAVE_SOCKADDR_UN'
#define DNS_HAVE_SOCKADDR_UN (defined AF_UNIX && !defined _WIN32)
^
/tmp/tmp.F4CPONhvAn/dns/dns.c:789:5: warning: macro expansion producing 'defined' has undefined behavior [-Wexpansion-to-defined]
#if DNS_HAVE_SOCKADDR_UN
^
/tmp/tmp.F4CPONhvAn/dns/dns.c:662:31: note: expanded from macro 'DNS_HAVE_SOCKADDR_UN'
#define DNS_HAVE_SOCKADDR_UN (defined AF_UNIX && !defined _WIN32)
^
/tmp/tmp.F4CPONhvAn/dns/dns.c:789:5: warning: macro expansion producing 'defined' has undefined behavior [-Wexpansion-to-defined]
/tmp/tmp.F4CPONhvAn/dns/dns.c:662:51: note: expanded from macro 'DNS_HAVE_SOCKADDR_UN'
#define DNS_HAVE_SOCKADDR_UN (defined AF_UNIX && !defined _WIN32)
^
10 warnings generated.
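The warnings arise because `defined` appears in the replacement text of a macro that is later expanded inside `#if`. A usual way to avoid that, sketched here with hypothetical SKETCH_ names rather than the identifiers from dns.c, is to evaluate `defined` directly in an `#if` and publish a plain 0/1 constant:

```c
/* Evaluate `defined` directly in #if, then define an ordinary constant. */
#if defined(__ATOMIC_RELAXED)
#define SKETCH_HAVE_ATOMIC_FETCH_ADD 1
#else
#define SKETCH_HAVE_ATOMIC_FETCH_ADD 0
#endif

/* A derived macro now expands to 0 or 1, never to `defined ...`. */
#define SKETCH_HAVE_ATOMIC_FETCH_SUB SKETCH_HAVE_ATOMIC_FETCH_ADD

/* Adds delta to *counter and returns the new value. */
int sketch_counter_add(int *counter, int delta) {
#if SKETCH_HAVE_ATOMIC_FETCH_ADD
    /* GCC/Clang builtin; returns the value held before the addition. */
    return __atomic_fetch_add(counter, delta, __ATOMIC_RELAXED) + delta;
#else
    *counter += delta; /* non-atomic fallback for other compilers */
    return *counter;
#endif
}
```

The same pattern would apply to a macro like DNS_HAVE_SOCKADDR_UN: test `defined(AF_UNIX) && !defined(_WIN32)` in an `#if` and define the macro as 1 or 0.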
Yesterday I fixed one compile bug, but it seems there are a lot more.
Here is an example: ARM, 32-bit, static, GCC 8.3
./configure --enable-debug --enable-shared=false --enable-static
make check
============================================================================
Testsuite summary for libmill 1.18-3-g2dd13ae
============================================================================
# TOTAL: 18
# PASS: 18
# SKIP: 0
# XFAIL: 0
# FAIL: 0
# XPASS: 0
# ERROR: 0
============================================================================
ARM, 32-bit, dynamic, GCC 8.3
./configure --enable-debug
make check
============================================================================
Testsuite summary for libmill 1.18-3-g2dd13ae
============================================================================
# TOTAL: 18
# PASS: 18
# SKIP: 0
# XFAIL: 0
# FAIL: 0
# XPASS: 0
# ERROR: 0
============================================================================
Now the same without debug:
./configure
============================================================================
Testsuite summary for libmill 1.18-3-g2dd13ae
============================================================================
# TOTAL: 18
# PASS: 16
# SKIP: 0
# XFAIL: 0
# FAIL: 2
# XPASS: 0
# ERROR: 0
============================================================================
(the failures were udp and tcp)
Same test on Ubuntu 20.04 LTS, 64-bit, GCC 9.3.0:
============================================================================
Testsuite summary for libmill 1.18-3-g2dd13ae
============================================================================
# TOTAL: 18
# PASS: 4
# SKIP: 0
# XFAIL: 0
# FAIL: 14
# XPASS: 0
# ERROR: 0
============================================================================
(the only ones that passed were UDP, IP, FILE and SSL)
I've started digging into why this happens and will publish patches as soon as I am able to fix something.
Encountered the following FTBFS on Debian "unstable" x86_64:
In file included from ip.c:44:
ip.c: In function 'mill_ipremote_':
dns/dns.h:1009:24: error: lvalue required as unary '&' operand
1009 | #define dns_opts(...) (&dns_quietinit((struct dns_options)DNS_OPTS_INIT(__VA_ARGS__)))
| ^
ip.c:268:31: note: in expansion of macro 'dns_opts'
268 | mill_dns_hints, NULL, dns_opts(), &rc);
| ^~~~~~~~
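The compiler is complaining that `&` is applied to something that is not an lvalue. How dns_quietinit is defined is not shown in the log, so the following is only a sketch of the general technique with hypothetical names: a compound literal is itself an lvalue, so its address can be taken directly as long as no wrapper expression sits between the `&` and the literal.

```c
#include <stddef.h>

/* Hypothetical options struct, standing in for a real one. */
struct sketch_options {
    int timeout_ms;
    int retries;
};

/* `&` is applied straight to the compound literal, which is an lvalue.
   The first initializer is a default; members not mentioned are zeroed. */
#define sketch_opts(...) \
    (&(struct sketch_options){ .timeout_ms = 5000, __VA_ARGS__ })

/* Reads one field through the pointer; -1 for a null pointer. */
int sketch_retries(const struct sketch_options *opts) {
    return opts != NULL ? opts->retries : -1;
}

int sketch_demo_default(void) {
    return sketch_retries(sketch_opts()); /* retries left at 0 */
}

int sketch_demo_custom(void) {
    return sketch_retries(sketch_opts(.retries = 3));
}
```

The literal lives until the end of the enclosing block, so the pointer is valid for the duration of the call it is passed to. Overriding the default (for example `sketch_opts(.timeout_ms = 1)`) is legal C but may trigger an override-init warning on some compilers.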