An implementation of the MessagePack serialization format in C

CMP is a C implementation of the MessagePack serialization format. It currently implements version 5 of the MessagePack Spec.

CMP's goal is to be lightweight and straightforward, forcing nothing on the programmer.


License

While I'm a big believer in the GPL, I license CMP under the MIT license.

Example Usage

The following examples use a file as the backend, and are modeled after the examples included with the msgpack-c project.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#include "cmp.h"

/* Reads exactly sz bytes from fh; returns false on a short read. */
static bool read_bytes(void *data, size_t sz, FILE *fh) {
    return fread(data, sizeof(uint8_t), sz, fh) == (sz * sizeof(uint8_t));
}

static bool file_reader(cmp_ctx_t *ctx, void *data, size_t limit) {
    return read_bytes(data, limit, (FILE *)ctx->buf);
}

static bool file_skipper(cmp_ctx_t *ctx, size_t count) {
    /* fseek returns 0 on success, so compare explicitly. */
    return fseek((FILE *)ctx->buf, (long)count, SEEK_CUR) == 0;
}

static size_t file_writer(cmp_ctx_t *ctx, const void *data, size_t count) {
    return fwrite(data, sizeof(uint8_t), count, (FILE *)ctx->buf);
}

static void error_and_exit(const char *msg) {
    fprintf(stderr, "%s\n\n", msg);
    exit(EXIT_FAILURE);
}

int main(void) {
    FILE *fh = NULL;
    cmp_ctx_t cmp = {0};
    uint32_t array_size = 0;
    uint32_t str_size = 0;
    char hello[6] = {0};
    char message_pack[12] = {0};

    fh = fopen("cmp_data.dat", "w+b");

    if (fh == NULL) {
        error_and_exit("Error opening cmp_data.dat");
    }

    cmp_init(&cmp, fh, file_reader, file_skipper, file_writer);

    if (!cmp_write_array(&cmp, 2)) {
        error_and_exit(cmp_strerror(&cmp));
    }

    if (!cmp_write_str(&cmp, "Hello", 5)) {
        error_and_exit(cmp_strerror(&cmp));
    }

    if (!cmp_write_str(&cmp, "MessagePack", 11)) {
        error_and_exit(cmp_strerror(&cmp));
    }

    /* Go back to the start of the file before reading the data back. */
    rewind(fh);

    if (!cmp_read_array(&cmp, &array_size)) {
        error_and_exit(cmp_strerror(&cmp));
    }

    /* You can read the str byte size and then read str bytes... */

    if (!cmp_read_str_size(&cmp, &str_size)) {
        error_and_exit(cmp_strerror(&cmp));
    }

    if (str_size > (sizeof(hello) - 1)) {
        error_and_exit("Packed 'hello' length too long\n");
    }

    if (!read_bytes(hello, str_size, fh)) {
        error_and_exit(cmp_strerror(&cmp));
    }

    /*
     * ...or you can set the maximum number of bytes to read and do it all in
     * one call
     */

    str_size = sizeof(message_pack);
    if (!cmp_read_str(&cmp, message_pack, &str_size)) {
        error_and_exit(cmp_strerror(&cmp));
    }

    printf("Array Length: %u.\n", array_size);
    printf("[\"%s\", \"%s\"]\n", hello, message_pack);

    fclose(fh);

    return EXIT_SUCCESS;
}

Advanced Usage

See the examples folder.

Fast, Lightweight, Flexible, and Robust

CMP uses no internal buffers; conversions, encoding and decoding are done on the fly.

CMP's source and header file together are ~4k LOC.

CMP makes no heap allocations.

CMP uses standardized types rather than declaring its own, and it depends only on stdbool.h, stdint.h and string.h.

CMP is written using C89 (ANSI C), aside, of course, from its use of fixed-width integer types and bool.

On the other hand, CMP's test suite requires C99.

CMP only requires the programmer supply a read function, a write function, and an optional skip function. In this way, the programmer can use CMP on memory, files, sockets, etc.

CMP is portable. It uses fixed-width integer types, and checks the endianness of the machine at runtime before swapping bytes (MessagePack is big-endian).

CMP provides a fairly comprehensive error reporting mechanism modeled after errno and strerror.

CMP is thread aware; while contexts cannot be shared between threads, each thread may use its own context freely.

CMP is tested using the MessagePack test suite as well as a large set of custom test cases. Its small test program is compiled with clang using -Wall -Werror -Wextra ... along with several other flags, and generates no compilation errors in either clang or GCC.

CMP's source is written as readably as possible, using explicit, descriptive variable names and a consistent, clear style.

CMP's source is written to be as secure as possible. Its testing suite checks for invalid values, and data is always treated as suspect before it passes validation.

CMP's API is designed to be clear, convenient and unsurprising. Strings are null-terminated, binary data is not, error codes are clear, and so on.

CMP provides optional backwards compatibility for use with other MessagePack implementations that only implement version 4 of the spec.
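Because CMP only needs a read function, a write function, and an optional skip function, a memory backend can be quite small. The following is a minimal sketch rather than code shipped with CMP: `mem_buf_t`, `mem_read`, `mem_skip` and `mem_write` are hypothetical names, and the buffer is fixed-size and caller-owned, so nothing is heap-allocated.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory backend state (not part of CMP). */
typedef struct {
    uint8_t *data;   /* caller-owned storage */
    size_t capacity; /* total bytes available in data */
    size_t size;     /* bytes written so far */
    size_t pos;      /* read cursor */
} mem_buf_t;

/* Copies count bytes out of the buffer; fails if fewer remain. */
static bool mem_read(mem_buf_t *mb, void *out, size_t count) {
    if (count > mb->size - mb->pos) {
        return false;
    }
    memcpy(out, mb->data + mb->pos, count);
    mb->pos += count;
    return true;
}

/* Advances the read cursor without copying. */
static bool mem_skip(mem_buf_t *mb, size_t count) {
    if (count > mb->size - mb->pos) {
        return false;
    }
    mb->pos += count;
    return true;
}

/* Appends count bytes; returns 0 if they do not fit. */
static size_t mem_write(mem_buf_t *mb, const void *in, size_t count) {
    if (count > mb->capacity - mb->size) {
        return 0;
    }
    memcpy(mb->data + mb->size, in, count);
    mb->size += count;
    return count;
}

/*
 * Adapters with the same callback shapes as the file example would be thin
 * wrappers (assumed names, shown here only as an outline):
 *
 *   static bool mem_reader(cmp_ctx_t *ctx, void *data, size_t limit) {
 *       return mem_read((mem_buf_t *)ctx->buf, data, limit);
 *   }
 *
 *   cmp_init(&cmp, &mb, mem_reader, mem_skipper, mem_writer);
 */
```

Keeping the buffer logic free of CMP types makes it easy to test on its own; the adapters only cast `ctx->buf` back to the backend state, exactly as the file example casts it to `FILE *`.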


Building

There is no build system for CMP. The programmer can drop cmp.c and cmp.h in their source tree and modify as necessary. No special compiler settings are required to build it, and it generates no compilation errors in either clang or GCC.


Versioning

CMP's versions are single integers. I don't use semantic versioning because I don't guarantee that any version is completely compatible with any other. In general, semantic versioning provides a false sense of security. You should be evaluating compatibility yourself, not relying on some stranger's versioning convention.


Stability

I only guarantee stability for versions released on the releases page. While rare, both master and develop branches may have errors or mismatched versions.

Backwards Compatibility

Version 4 of the MessagePack spec has no BIN type, and provides no STR8 marker. In order to remain backwards compatible with version 4 of MessagePack, do the following:

Avoid these functions:

  • cmp_write_bin
  • cmp_write_bin_marker
  • cmp_write_str8_marker
  • cmp_write_str8
  • cmp_write_bin8_marker
  • cmp_write_bin8
  • cmp_write_bin16_marker
  • cmp_write_bin16
  • cmp_write_bin32_marker
  • cmp_write_bin32

Use these functions in lieu of their v5 counterparts:

  • cmp_write_str_marker_v4 instead of cmp_write_str_marker
  • cmp_write_str_v4 instead of cmp_write_str
  • cmp_write_object_v4 instead of cmp_write_object
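To show why STR8 is the sticking point, here is a minimal standalone sketch of string-header selection. It is not CMP's code: `encode_str_header` and its `v4_compat` flag are hypothetical, and the marker bytes are taken from the MessagePack spec. In v4-compatible mode the sketch simply never emits the STR8 marker (0xd9) and falls through to the 16-bit form.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Writes the header for a string of len bytes into out, which must have room
 * for at least 5 bytes. Returns the header size in bytes.
 */
static size_t encode_str_header(uint8_t *out, uint32_t len, bool v4_compat) {
    if (len <= 31) {
        out[0] = (uint8_t)(0xa0 | len); /* fixstr: length lives in the marker */
        return 1;
    }
    if (len <= 0xff && !v4_compat) {
        out[0] = 0xd9; /* str 8: only exists in v5 of the spec */
        out[1] = (uint8_t)len;
        return 2;
    }
    if (len <= 0xffff) {
        out[0] = 0xda; /* str 16, big-endian length */
        out[1] = (uint8_t)(len >> 8);
        out[2] = (uint8_t)(len & 0xff);
        return 3;
    }
    out[0] = 0xdb; /* str 32, big-endian length */
    out[1] = (uint8_t)(len >> 24);
    out[2] = (uint8_t)((len >> 16) & 0xff);
    out[3] = (uint8_t)((len >> 8) & 0xff);
    out[4] = (uint8_t)(len & 0xff);
    return 5;
}
```

A 40-byte string therefore gets a 2-byte header in v5 mode and a 3-byte header in v4-compatible mode; both decode fine with a v5 reader.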

Disabling Floating Point Operations

Thanks to tdragon it's possible to disable floating point operations in CMP by defining CMP_NO_FLOAT. No floating point functionality will be included. Fair warning: this changes the ABI.

Setting Endianness at Compile Time

CMP will honor WORDS_BIGENDIAN. If defined to 0 it will convert data to/from little-endian format when writing/reading. If defined to 1 it won't. If not defined, CMP will check at runtime.
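The behaviour described above can be sketched roughly as follows. This is an illustration under assumptions, not CMP's actual source: `host_is_big_endian` and `to_wire_u32` are hypothetical names, and only the `WORDS_BIGENDIAN` macro comes from the text.

```c
#include <stdbool.h>
#include <stdint.h>

/* Uses WORDS_BIGENDIAN when it is defined to 0 or 1, else probes at runtime. */
static bool host_is_big_endian(void) {
#ifdef WORDS_BIGENDIAN
#if WORDS_BIGENDIAN
    return true;
#else
    return false;
#endif
#else
    const uint16_t probe = 1;
    /* On a big-endian host the first byte of the value 1 is zero. */
    return *(const uint8_t *)&probe == 0;
#endif
}

/* Returns v laid out in memory as big-endian (MessagePack wire order). */
static uint32_t to_wire_u32(uint32_t v) {
    if (host_is_big_endian()) {
        return v; /* already in wire order */
    }
    return ((v & 0x000000ffu) << 24) |
           ((v & 0x0000ff00u) << 8) |
           ((v & 0x00ff0000u) >> 8) |
           ((v & 0xff000000u) >> 24);
}
```

The same swap converts back when reading, so applying `to_wire_u32` twice yields the original value on any host.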

  • Enable to catch compiler issues (C89, future gcc warnings etc)

    Hello again,

    thanks for all the recent patches and improvements.

    RE: c89 support

    How can we prevent this (and other compiler gotchas) in the future? I don't know your workflow but I can guess you might be a busy guy.

    I propose setting up continuous integration for this project that will compile camgunz/cmp in a few ways (-std=c89 -Wall -Wextra -Werror), maybe also with Clang and its static analyzer, and beep if warnings occur.

    I can set this up as a fork+PR, but in the end you'll have to be comfortable with it having access to your repo.



    opened by client9 15
  • Add Skipping

    So #3 proposed adding skipping to CMP, and there's some discussion there.

    After trying to implement a version, it looks like the only way to fully implement this is by creating a SAX-style state machine.

    "WHY!?", you might ask. Well I'll tell you, hopefully I'm wrong.

    It's not a backend-support problem. We can add an optional skip callback, and CMP can just set an error whenever cmp_skip_object is called on a context where that callback is NULL.

    The problem is nested arrays and maps. The naive approach is to just have cmp_skip_object recursively call itself, but that leaves CMP open to stack overflow attacks via specifically-crafted data. I absolutely will not do that.

    The alternative is to have a bunch of state in cmp_ctx_t itself and use the heap. There are a few downsides to this:

    • It makes heap allocations, and it's doing this for every nested array/map. This is, therefore, a vector for memory exhaustion given specifically-crafted data.
    • It's a complicated little state machine, in contrast with the rest of CMP, which is extremely clear.
    • It would cause CMP to depend on the C Standard Library, and thus would require a build system. Not that that's a big deal, but it's still a definite paradigm shift.

    I vote against adding skipping to CMP. To use the example in #3 of an RPC server, let's say you're getting MessagePack data as a stream and that's your CMP backend (error handling omitted):

    uint32_t map_size;
    cmp_read_map(&cmp, &map_size);
    for (uint32_t i = 0; i < map_size; i++) {
        char *arg_name;
        uint32_t str_size;
        cmp_read_str_size(&cmp, &str_size);
        read_netstream_string(&arg_name, str_size);
        if (!rpc_arg_is_valid(arg_name)) {
            cmp_read_str_size(&cmp, &str_size);
            skip_netstream_bytes(str_size);
        }
    }

    This is pretty simple, and the only thing that would be different if CMP added cmp_skip_next_object is that skip_netstream_bytes(str_size) would be replaced with cmp_skip_next_object(&cmp).

    The problem is that cmp_skip_next_object might have to skip a map containing 5000 other arrays, each containing 5000 arrays, each containing 5000 arrays that each contain 5000 entries of the Gettysburg Address. Skipping an unwanted string is much simpler than skipping the next object, whatever it might be. Furthermore, skipping can most easily be handled using backend APIs designed specifically for that; CMP can add no value there. Therefore, I think adding skipping to CMP is out of scope.

    That said, I'm always open to arguments! :) If this is a feature you're really needing and you've got a cool idea on how to do it, I'm absolutely happy to work on it (or, even better, merge a PR ;) ). I just think it's not feasible.

    opened by camgunz 14
  • The library is too perfect ;)

    Yes... that's the issue. OK, let me explain: the library fully implements the new spec, while most other implementations (Perl and JavaScript, to be specific) only support the old types. You don't notice this AS LONG AS you never have a STR8, because that's literally the only case where things crash. (There is also no BIN support in the old spec, but those implementations don't use it at all.)

    There is actually a very simple trick to fix that: the library just needs to skip STR8 when packing and emit STR16 directly instead (2 comments, a define?). I think many people will run into this problem if they mix implementations (it's REALLY not an obvious problem; it took me 1-2 days to find), so it would be awesome if that were part of the library by default. I can make the pull request; it's trivial. In my eyes the only question is how (with a define or something different), but I think it's crucial that the default behaviour fits "both ways". People can still enforce STR8 and they can still unpack STR8, so there is no damage, just the elimination of problems caused by the bad state of the msgpack implementations.

    What do you think?

    (P.S.: I am working on fixing the situation more generally, but it's hard to find the responsible people ;). This is a first approach that would at least "fix" the situation.)

    opened by Getty 13
  • Static Analysis warnings

    First thank you for the excellent library :)

    In cmp.c there are a lot of warnings reported by our analysis tool. See screenshot. The line numbers may not match, but the function names are mentioned.

    Can you please evaluate if there is any problem with these implicit conversions? It would also be nice if you can cast accordingly to show your intention.

    opened by DaelDe 10
  • Use cmp with char buffer instead of file


    I'm new to MsgPack and C, but I'm loving your implementation so far. I already have some great results. I'm using it to unpack / pack data from a network socket, which stores it into an unsigned char buffer. I couldn't get cmp to read / write directly on the char buffer instead of a file (your example). Is there a simple way to do that and if yes, could you provide me with a simple example how to do it? Thanks a lot in advance!

    opened by timlehr 9
  • Add peeking

    In my own code it would be very convenient if I could "peek" at an object before committing to reading it. What I am imagining is that each cmp_read_*() function from the main API would have a corresponding peek method that returns true/false if the corresponding cmp_read_*() function would, in principle, be successful.

    if (cmp_peek_str(cmp)) {
        // It's a string, read it with cmp_read_str
    } else if (cmp_peek_int(cmp)) {
        // It's an int, read it with cmp_read_int
    } else if (cmp_peek_double(cmp)) {
        // etc..
    } else {
        // error - not supported
    }

    While you can accomplish something similar with cmp_read_object() by looking at the type of the cmp_object, you can't then take advantage of the logic already embedded in the cmp_read_*() functions to know how the CMP_TYPE_* enumerations relate to each other and should be handled.

    The duty of error checking would still fall mostly on the read functions but you could also add a cmp_peek_error() method that returns true/false if nothing can be peeked because of an error.

    opened by jdpalmer 9
  • Support compiling without floating point operations

    We would like to use the CMP library for kernel->user mode communication. But it is not allowed to use floating point instructions in the kernel. Therefore we have to exclude everything FPU related from code during compilation.

    opened by tdragon 7
  • Added skipping support

    This is another attempt to add skipping support (see also #5). This implementation is based on code from CWPack and does not use recursion.

    Before this is merged, two things should be considered:

    1. obj_skip_count is currently a uint32_t and could overflow (e.g. while reading malicious data). Extending the type to uint64_t would help with the problem (at least a bit), but I didn't really want to change it, since I use cmp on an embedded device. Maybe instead, we could check for overflow and exit with an error.
    2. Skipping of strings/bin/ext currently reads one byte at a time, since there is no buffer available. This could be replaced with a skip/seek function pointer. Alternatively, a buffer of configurable size could be added (see #3).
    opened by m-wichmann 6
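The counter-based idea discussed in this thread can be sketched as follows. This is a simplified illustration, not the code from the pull request: `skip_object` is a hypothetical name, it works on an in-memory buffer, and it only understands a handful of MessagePack types (fixints, nil, booleans, fixstr, fixarray, fixmap and array 32). A single counter of pending objects replaces recursion, and the counter is checked for overflow before it grows.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Returns the number of bytes occupied by the object starting at buf[0],
 * or 0 if the data is truncated, unsupported, or would overflow the counter.
 */
static size_t skip_object(const uint8_t *buf, size_t len) {
    size_t pos = 0;
    uint32_t pending = 1; /* objects that still have to be skipped */

    while (pending > 0) {
        uint8_t marker;
        uint32_t children = 0; /* objects nested directly in this one */
        size_t payload = 0;    /* raw bytes that follow the marker */

        if (pos >= len) {
            return 0; /* truncated input */
        }
        marker = buf[pos++];
        pending--;

        if (marker <= 0x7f || marker >= 0xe0 ||
            marker == 0xc0 || marker == 0xc2 || marker == 0xc3) {
            /* fixint, nil, false, true: the marker is the whole object */
        } else if (marker >= 0xa0 && marker <= 0xbf) {
            payload = marker & 0x1f; /* fixstr */
        } else if (marker >= 0x90 && marker <= 0x9f) {
            children = marker & 0x0f; /* fixarray */
        } else if (marker >= 0x80 && marker <= 0x8f) {
            children = (uint32_t)(marker & 0x0f) * 2; /* fixmap: keys + values */
        } else if (marker == 0xdd) { /* array 32: big-endian element count */
            if (len - pos < 4) {
                return 0;
            }
            children = ((uint32_t)buf[pos] << 24) |
                       ((uint32_t)buf[pos + 1] << 16) |
                       ((uint32_t)buf[pos + 2] << 8) |
                       (uint32_t)buf[pos + 3];
            pos += 4;
        } else {
            return 0; /* type not handled by this sketch */
        }

        if (payload > len - pos) {
            return 0;
        }
        pos += payload;

        if (children > UINT32_MAX - pending) {
            return 0; /* counter would overflow: treat as malicious data */
        }
        pending += children;
    }

    return pos;
}
```

Because nothing is pushed on the call stack or the heap, hostile nesting costs only loop iterations. The trade-off is that a flat counter cannot report how deep the nesting was.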
  • Implicitly convert between float and doubles

    When developing an application communicating between Python and C, I ran into a problem with floating point numbers: Python doesn't distinguish between single and double precision. I figured it made sense to allow cmp to cast them transparently on read.

    opened by antoinealb 6
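A transparent read of this kind might look like the sketch below. It is a standalone illustration rather than CMP's implementation: `decode_decimal` is a hypothetical name, it reads from a plain byte buffer, and it assumes the host uses IEEE 754 `float` and `double` with the same byte order as its integers. The markers 0xca (float 32) and 0xcb (float 64) come from the MessagePack spec.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decodes either a float 32 or a float 64 object into a double. */
static bool decode_decimal(const uint8_t *buf, size_t len, double *out) {
    size_t i;

    if (len >= 5 && buf[0] == 0xca) { /* float 32 */
        uint32_t bits = 0;
        float single;

        for (i = 1; i <= 4; i++) {
            bits = (bits << 8) | buf[i]; /* assemble big-endian payload */
        }
        memcpy(&single, &bits, sizeof(single));
        *out = (double)single; /* widening is exact */
        return true;
    }

    if (len >= 9 && buf[0] == 0xcb) { /* float 64 */
        uint64_t bits = 0;

        for (i = 1; i <= 8; i++) {
            bits = (bits << 8) | buf[i];
        }
        memcpy(out, &bits, sizeof(*out));
        return true;
    }

    return false; /* not a floating point object, or truncated */
}
```

Widening a `float` to a `double` loses nothing, which is why converting on read is the safe direction; narrowing a `double` to a `float` would need an explicit decision by the caller.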
  • Ext type and data size are inverted

    Hello, cmp_write_ext8_marker (and possibly others, as it is the only one I checked) has a slight error. It writes the extension type before the data size, whereas the MessagePack 5 specification says:

    ext 8 stores an integer and a byte array whose length is upto (2^8)-1 bytes:
    |  0xc7  |XXXXXXXX|  type  |  data  |
    * XXXXXXXX is a 8-bit unsigned integer which represents N

    Can you fix it, or would you prefer that I do it myself? Also, I think using some kind of unit testing for CMP would be awesome.

    Thank you for this very useful library, by the way.

    opened by antoinealb 6
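The byte order quoted from the spec can be made concrete with a tiny sketch. `encode_ext8_header` is a hypothetical helper, not a CMP function; it only shows that the size byte precedes the type byte.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Writes an ext 8 header into out (at least 3 bytes) and returns its size.
 * Layout per the MessagePack spec: marker, data size, type, then the data.
 */
static size_t encode_ext8_header(uint8_t *out, int8_t type, uint8_t size) {
    out[0] = 0xc7;          /* ext 8 marker */
    out[1] = size;          /* length of the data that follows the header */
    out[2] = (uint8_t)type; /* application-defined extension type */
    return 3;
}
```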
  • cmp_skip_object_limit() doesn't work as documented for nested arrays

    The function cmp_skip_object_limit() is documented to have its limit be a limit of depth, but it doesn't work properly when you have nested arrays or maps.

    For example, if one had an array of size 10 that contained 10 zero-length arrays within, that should be fine with a limit of depth 2 according to the description. However, because of the way the code is written, each time an array is encountered, the depth is increased, regardless of whether it was nested or not. As a concrete example, a diff for test/test.c is attached. If you apply that, build, and run ./cmptest, the second bit of tests added fails, displaying this behavior.

    Given that cmp.c is written without any heap allocations, I'm not sure what the solution is for tracking the depth here. One could add an arbitrarily-sized array to keep track of some data, but perhaps you can think of a more appropriate solution. Let me know if you need any more info on the specific issue!

    opened by dfego 5
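One heap-free way to track real nesting depth is a small fixed-size stack of "elements remaining" counters, along the lines of the arbitrarily-sized array mentioned above. The sketch below is an assumption-laden illustration, not CMP code: `skip_with_depth_limit` and `MAX_SKIP_DEPTH` are hypothetical names, and only positive fixints and fixarrays are understood.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_SKIP_DEPTH 8 /* hypothetical compile-time ceiling */

/*
 * Skips one object made of positive fixints and fixarrays. Returns the bytes
 * consumed, or 0 if the input is truncated, uses another type, or nests
 * deeper than max_depth.
 */
static size_t skip_with_depth_limit(const uint8_t *buf, size_t len,
                                    size_t max_depth) {
    uint8_t remaining[MAX_SKIP_DEPTH]; /* elements left in each open array */
    size_t depth = 0;                  /* number of currently open arrays */
    size_t pos = 0;

    if (max_depth > MAX_SKIP_DEPTH) {
        max_depth = MAX_SKIP_DEPTH;
    }

    do {
        uint8_t marker;

        if (pos >= len) {
            return 0; /* truncated input */
        }
        marker = buf[pos++];

        if (depth > 0) {
            remaining[depth - 1]--; /* this object fills one parent slot */
        }

        if (marker >= 0x90 && marker <= 0x9f) { /* fixarray */
            if (depth == max_depth) {
                return 0; /* too deeply nested */
            }
            remaining[depth++] = marker & 0x0f;
        } else if (marker > 0x7f) {
            return 0; /* type not handled by this sketch */
        }

        /* Close every array whose elements have all been consumed. */
        while (depth > 0 && remaining[depth - 1] == 0) {
            depth--;
        }
    } while (depth > 0);

    return pos;
}
```

With this bookkeeping, sibling arrays do not accumulate depth: an array of ten empty arrays never has more than two arrays open at once, so it passes with a limit of 2 and fails with a limit of 1.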
  • Feature request -- nested object skipping

    After reading through the code (and using it in a pet project -- great work! Thanks!!), I have come across a situation where I need some way to skip over nested objects to find specific keys in a map. I have been unpacking into a nested map and then walking the map, but that adds a frustrating amount of complexity to my case, when I'd rather just use the msgpack binary blob as the memory backing the data structure.

    You have in your comments that there is no way to do nested skips without allocations, but couldn't you just add a counter into cmp_ctx_t that tracks some limited amount of nesting (configurable at compile-time)?


    Then each recursion can use CMP_RECURSION_DEPTH_BYTES to count up to the number of elements per recursion, and CMP_RECURSION_DEPTH can count up to the number of recursions supported. Obviously, this would be easier with allocations, but I completely understand your reluctance to add that.

    Alternatively, you could handle it at runtime by adding

      uint8_t recursion_max; // set by cmp_ctx_init_with_recursion (new function)
      uint8_t recursion_stack[0]; // allocated along with the cmp_ctx_t struct

    to the cmp_ctx_t struct. I think I'm over-complicating your very nice library, though.

    If this is something you think is worthwhile, I would be happy to work on it and put in a PR for review. If not, I will likely still work on it and just keep the code in my own project. Thoughts?

    opened by jrmolin 2
  • integer API cleanup

    The read/write functions are asymmetric and confusing:

    bool cmp_write_integer(cmp_ctx_t *ctx, int64_t d);
    bool cmp_read_int(cmp_ctx_t *ctx, int32_t *i);
    bool cmp_read_long(cmp_ctx_t *ctx, int64_t *u);
    bool cmp_read_integer(cmp_ctx_t *ctx, int64_t *u);

    I would suggest the following:

    • remove the duplicate int/integer functions
    • use distinct names for 32-bit and 64-bit types: int/long or maybe int32/int64
    • add a dedicated 32-bit write function (the read function is already there); this is especially advantageous for 32-bit systems, e.g. microcontrollers, where 64-bit ints are emulated in software

    bool cmp_write_int(cmp_ctx_t *ctx, int32_t d);
    bool cmp_write_long(cmp_ctx_t *ctx, int64_t d);
    bool cmp_read_int(cmp_ctx_t *ctx, int32_t *i);
    bool cmp_read_long(cmp_ctx_t *ctx, int64_t *d);
    opened by jrahlf 1
  • Add support for zero-copy reading

    As far as I can see, reading a large binary object currently requires the reader function to copy it into a caller-supplied buffer. What I would like instead is to get a pointer into the memory-mapped binary data, rather than providing a buffer that the reader function then fills.

    opened by mateuszj 4
  • v20(Mar 29, 2022)

    What's Changed

    • Fix issue#55 bigendian host decode float and double by @dcurrie in
    • Avoid overflow for saturated uint32_t by @horta in
    • cmp.h self sufficient by @fperrad in
    • some cleanup in test/ by @fperrad in
    • Eliminate signed/unsigned conversion warnings. by @horta in
    • Various minor cleanups + adding GitHub Actions CI (no coverage yet) by @iphydf in

    New Contributors

    • @dcurrie made their first contribution in
    • @horta made their first contribution in
    • @iphydf made their first contribution in

  • v19(Sep 8, 2020)

    • Deprecate cmp_skip_object_limit and add cmp_skip_object_flat
    • Support compiling without floating point operations (thanks @tdragon)
    • Constify many read-only parameters and other sundry linting (thanks @fperrad)
    • Support CMake (thanks @sc07sap)
    • Add some initial support for profiling
  • v18(Nov 29, 2017)

  • v17(Jun 19, 2017)

  • v16(Jan 31, 2017)

    Several updates:

    • Quiets various type warnings
    • Adds some helpful object API calls
    • Fixes cmp_read_ufix expecting negative, not positive fixints
    • Adds MessagePack v4 compatibility functions
    • Puts cmp_read/_write_float/_double functions beneath cmp_write_decimal
    • Adds a code of conduct
    • Updates copyrights
    • Fixes issue where floats were occasionally corrupted on little-endian machines
  • v10(Dec 3, 2014)

  • v9(Dec 3, 2014)

  • v8(Nov 19, 2014)

  • v7(Nov 11, 2014)

  • v6(Nov 4, 2014)

  • v5(Oct 27, 2014)

    Adds new object API (cmp_object_is_*, cmp_object_as_*) which, among other things, can be used for peeking, and fixes a bug where EXT type and size were reversed while reading and writing binary output.

  • v4(Sep 5, 2014)

  • v3(May 6, 2014)

  • v2(May 5, 2014)

  • v1(May 5, 2014)

Charlie Gunyon