🗄️ single header json parser for C and C++

Overview

🗄️ json.h

A simple single header solution to parsing JSON in C and C++.

JSON is parsed into a read-only, single allocation buffer.

The current supported compilers are gcc, clang and msvc.

The current supported platforms are Windows, macOS and Linux.

Usage

Just #include "json.h" in your code!

json_parse

Parse a json string into a DOM.

struct json_value_s *json_parse(
    const void *src,
    size_t src_size);
  • src - a utf-8 json string to parse.
  • src_size - the size of src in bytes.

Returns a struct json_value_s* pointing to the root of the json DOM, or NULL if parsing failed.

struct json_value_s

The main struct for interacting with a parsed JSON Document Object Model (DOM) is the struct json_value_s.

struct json_value_s {
  void *payload;
  size_t type;
};
  • payload - a pointer to the contents of the value.
  • type - the type of struct payload points to, one of json_type_e. Note: if type is json_type_true, json_type_false, or json_type_null, payload will be NULL.

json_parse_ex

Parse a json string into a DOM, with extended options.

struct json_value_s *json_parse_ex(
    const void *src,
    size_t src_size,
    size_t flags_bitset,
    void*(*alloc_func_ptr)(void *, size_t),
    void *user_data,
    struct json_parse_result_s *result);
  • src - a utf-8 json string to parse.
  • src_size - the size of src in bytes.
  • flags_bitset - extra parsing flags, a bitset of flags specified in enum json_parse_flags_e.
  • alloc_func_ptr - a callback function to use for doing the single allocation. If NULL, malloc() is used.
  • user_data - user data to be passed as the first argument to alloc_func_ptr.
  • result - the result of the parsing. If a parsing error occurred this will contain what type of error, and where in the source it occurred. Can be NULL.

Returns a struct json_value_s* pointing to the root of the json DOM, or NULL if parsing failed.
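The alloc_func_ptr parameter takes user_data as its first argument and the requested size as its second. As a minimal sketch of a callback with that shape, the block below hands out memory from a caller-owned buffer. The names demo_arena and demo_arena_alloc are hypothetical and not part of json.h, and the sketch assumes the buffer is suitably aligned for the DOM structs.

```c
#include <stddef.h>

/* Hypothetical fixed-size arena; not part of json.h. */
struct demo_arena {
  unsigned char *base; /* start of the caller-owned buffer */
  size_t capacity;     /* total bytes available in the buffer */
  size_t used;         /* bytes handed out so far */
};

/* Has the same shape as alloc_func_ptr: void *(*)(void *, size_t).
   The first argument is the user_data pointer, the second the size.
   Returns NULL when the arena cannot satisfy the request. */
static void *demo_arena_alloc(void *user_data, size_t size) {
  struct demo_arena *arena = (struct demo_arena *)user_data;
  void *result;

  if (arena == NULL || size > arena->capacity - arena->used) {
    return NULL;
  }

  result = arena->base + arena->used;
  arena->used += size;
  return result;
}
```

Such a callback would be passed as alloc_func_ptr, with a pointer to the arena as user_data. Because the DOM would then live inside the caller's buffer rather than in a malloc'd block, calling free() on the returned root would presumably not be appropriate in that setup.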

enum json_parse_flags_e

The extra parsing flags that can be specified to json_parse_ex() are as follows:

enum json_parse_flags_e {
  json_parse_flags_default = 0,
  json_parse_flags_allow_trailing_comma = 0x1,
  json_parse_flags_allow_unquoted_keys = 0x2,
  json_parse_flags_allow_global_object = 0x4,
  json_parse_flags_allow_equals_in_object = 0x8,
  json_parse_flags_allow_no_commas = 0x10,
  json_parse_flags_allow_c_style_comments = 0x20,
  json_parse_flags_deprecated = 0x40,
  json_parse_flags_allow_location_information = 0x80,
  json_parse_flags_allow_single_quoted_strings = 0x100,
  json_parse_flags_allow_hexadecimal_numbers = 0x200,
  json_parse_flags_allow_leading_plus_sign = 0x400,
  json_parse_flags_allow_leading_or_trailing_decimal_point = 0x800,
  json_parse_flags_allow_inf_and_nan = 0x1000,
  json_parse_flags_allow_multi_line_strings = 0x2000,
  json_parse_flags_allow_simplified_json =
      (json_parse_flags_allow_trailing_comma |
       json_parse_flags_allow_unquoted_keys |
       json_parse_flags_allow_global_object |
       json_parse_flags_allow_equals_in_object |
       json_parse_flags_allow_no_commas),
  json_parse_flags_allow_json5 =
      (json_parse_flags_allow_trailing_comma |
       json_parse_flags_allow_unquoted_keys |
       json_parse_flags_allow_c_style_comments |
       json_parse_flags_allow_single_quoted_strings |
       json_parse_flags_allow_hexadecimal_numbers |
       json_parse_flags_allow_leading_plus_sign |
       json_parse_flags_allow_leading_or_trailing_decimal_point |
       json_parse_flags_allow_inf_and_nan |
       json_parse_flags_allow_multi_line_strings)
};
  • json_parse_flags_default - the default, no special behaviour is enabled.
  • json_parse_flags_allow_trailing_comma - allow trailing commas in objects and arrays. For example, both [true,] and {"a" : null,} would be allowed with this option on.
  • json_parse_flags_allow_unquoted_keys - allow unquoted keys for objects. For example, {a : null} would be allowed with this option on.
  • json_parse_flags_allow_global_object - allow a global unbracketed object. For example, a : null, b : true, c : {} would be allowed with this option on.
  • json_parse_flags_allow_equals_in_object - allow objects to use '=' as well as ':' between key/value pairs. For example, {"a" = null, "b" : true} would be allowed with this option on.
  • json_parse_flags_allow_no_commas - allow objects to omit the comma separators between key/value pairs. For example, {"a" : null "b" : true} would be allowed with this option on.
  • json_parse_flags_allow_c_style_comments - allow c-style comments (// or /* */) to be ignored in the input JSON file.
  • json_parse_flags_deprecated - a deprecated option.
  • json_parse_flags_allow_location_information - allow location information to be tracked for where values are in the input JSON. Useful for alerting users to errors with precise location information pertaining to the original source. When this option is enabled, every json_value_s* can be cast to json_value_ex_s*, and the json_string_s* in the name member of a json_object_element_s* can be cast to json_string_ex_s*, to retrieve specific locations for all values and keys. Note that this option increases the memory required for the DOM used to record the JSON.
  • json_parse_flags_allow_single_quoted_strings - allows strings to be in 'single quotes'.
  • json_parse_flags_allow_hexadecimal_numbers - allows hexadecimal numbers to be used, for example 0x42.
  • json_parse_flags_allow_leading_plus_sign - allows a leading '+' sign on numbers, for example +42.
  • json_parse_flags_allow_leading_or_trailing_decimal_point - allows a number to start or end with a decimal point: .42 or 42. are both accepted.
  • json_parse_flags_allow_inf_and_nan - allows the identifiers Infinity and NaN to be used as numbers.
  • json_parse_flags_allow_multi_line_strings - allows strings to span multiple lines.
  • json_parse_flags_allow_simplified_json - allow simplified JSON to be parsed. Simplified JSON enables a set of the other parsing options, as listed in the enum above. It was introduced on the Bitsquid blog.
  • json_parse_flags_allow_json5 - allow JSON5 to be parsed. JSON5 enables a set of the other parsing options, as listed in the enum above. The extension is defined on the JSON5 website.
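Since flags_bitset is a plain bitset, individual options can be combined with bitwise OR and tested with bitwise AND. The sketch below illustrates this with a local mirror of a few values copied from the enum listing, so that it compiles without json.h. The demo_ names and the helper demo_flag_enabled are hypothetical.

```c
#include <stddef.h>

/* Local mirror of some values from enum json_parse_flags_e, copied from the
   listing in this README. The demo_ names are hypothetical. */
enum demo_flags {
  demo_allow_trailing_comma = 0x1,
  demo_allow_unquoted_keys = 0x2,
  demo_allow_global_object = 0x4,
  demo_allow_equals_in_object = 0x8,
  demo_allow_no_commas = 0x10,
  demo_allow_c_style_comments = 0x20,
  /* Same combination as json_parse_flags_allow_simplified_json. */
  demo_allow_simplified_json = 0x1 | 0x2 | 0x4 | 0x8 | 0x10
};

/* Returns non-zero when every bit of `flag` is set in `flags_bitset`. */
static int demo_flag_enabled(size_t flags_bitset, size_t flag) {
  return (flags_bitset & flag) == flag;
}
```

With the real header, the equivalent would be to OR the json_parse_flags_* values together and pass the result as flags_bitset to json_parse_ex(). Note from the listing that the simplified JSON combination does not include C-style comments, so that option has to be added explicitly if wanted.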

Examples

Parsing with json_parse

Let's say we had the JSON string '{"a" : true, "b" : [false, null, "foo"]}'. To get to each part of the parsed JSON we'd do:

const char json[] = "{\"a\" : true, \"b\" : [false, null, \"foo\"]}";
struct json_value_s* root = json_parse(json, strlen(json));
assert(root->type == json_type_object);

struct json_object_s* object = (struct json_object_s*)root->payload;
assert(object->length == 2);

struct json_object_element_s* a = object->start;

struct json_string_s* a_name = a->name;
assert(0 == strcmp(a_name->string, "a"));
assert(a_name->string_size == strlen("a"));

struct json_value_s* a_value = a->value;
assert(a_value->type == json_type_true);
assert(a_value->payload == NULL);

struct json_object_element_s* b = a->next;
assert(b->next == NULL);

struct json_string_s* b_name = b->name;
assert(0 == strcmp(b_name->string, "b"));
assert(b_name->string_size == strlen("b"));

struct json_value_s* b_value = b->value;
assert(b_value->type == json_type_array);

struct json_array_s* array = (struct json_array_s*)b_value->payload;
assert(array->length == 3);

struct json_array_element_s* b_1st = array->start;

struct json_value_s* b_1st_value = b_1st->value;
assert(b_1st_value->type == json_type_false);
assert(b_1st_value->payload == NULL);

struct json_array_element_s* b_2nd = b_1st->next;

struct json_value_s* b_2nd_value = b_2nd->value;
assert(b_2nd_value->type == json_type_null);
assert(b_2nd_value->payload == NULL);

struct json_array_element_s* b_3rd = b_2nd->next;
assert(b_3rd->next == NULL);

struct json_value_s* b_3rd_value = b_3rd->value;
assert(b_3rd_value->type == json_type_string);

struct json_string_s* string = (struct json_string_s*)b_3rd_value->payload;
assert(0 == strcmp(string->string, "foo"));
assert(string->string_size == strlen("foo"));

/* Don't forget to free the one allocation! */
free(root);

Iterator Helpers

There are some functions that serve no purpose other than to make it nicer to iterate through the produced JSON DOM:

  • json_value_as_string - returns a value as a string, or null if it wasn't a string.
  • json_value_as_number - returns a value as a number, or null if it wasn't a number.
  • json_value_as_object - returns a value as an object, or null if it wasn't an object.
  • json_value_as_array - returns a value as an array, or null if it wasn't an array.
  • json_value_is_true - returns non-zero if a value was true, zero otherwise.
  • json_value_is_false - returns non-zero if a value was false, zero otherwise.
  • json_value_is_null - returns non-zero if a value was null, zero otherwise.

Let's look at the same example from above but using these helper iterators instead:

const char json[] = "{\"a\" : true, \"b\" : [false, null, \"foo\"]}";
struct json_value_s* root = json_parse(json, strlen(json));

struct json_object_s* object = json_value_as_object(root);
assert(object != NULL);
assert(object->length == 2);

struct json_object_element_s* a = object->start;

struct json_string_s* a_name = a->name;
assert(0 == strcmp(a_name->string, "a"));
assert(a_name->string_size == strlen("a"));

struct json_value_s* a_value = a->value;
assert(json_value_is_true(a_value));

struct json_object_element_s* b = a->next;
assert(b->next == NULL);

struct json_string_s* b_name = b->name;
assert(0 == strcmp(b_name->string, "b"));
assert(b_name->string_size == strlen("b"));

struct json_array_s* array = json_value_as_array(b->value);
assert(array->length == 3);

struct json_array_element_s* b_1st = array->start;

struct json_value_s* b_1st_value = b_1st->value;
assert(json_value_is_false(b_1st_value));

struct json_array_element_s* b_2nd = b_1st->next;

struct json_value_s* b_2nd_value = b_2nd->value;
assert(json_value_is_null(b_2nd_value));

struct json_array_element_s* b_3rd = b_2nd->next;
assert(b_3rd->next == NULL);

struct json_string_s* string = json_value_as_string(b_3rd->value);
assert(string != NULL);
assert(0 == strcmp(string->string, "foo"));
assert(string->string_size == strlen("foo"));

/* Don't forget to free the one allocation! */
free(root);

As you can see it makes iterating through the DOM a little more pleasant.

Extracting a Value from a DOM

If you want to extract a value from a DOM into a new allocation then json_extract_value and json_extract_value_ex are your friends. These functions let you take any value and its subtree from a DOM and clone it into a new allocation: either a single malloc or a user-provided allocation region.

const char json[] = "{\"foo\" : { \"bar\" : [123, false, null, true], \"haz\" : \"haha\" }}";
struct json_value_s* root = json_parse(json, strlen(json));
assert(root);

struct json_value_s* foo = json_value_as_object(root)->start->value;
assert(foo);

struct json_value_s* extracted = json_extract_value(foo);

/* We can free root now because we've got a new allocation for extracted! */
free(root);

assert(json_value_as_object(extracted));

/* Don't forget to free the one allocation! */
free(extracted);

Design

The json_parse function calls malloc once, and then slices up this single allocation to support all the weird and wonderful JSON structures you can imagine!

The structure of the data is always the JSON structs first (which encode the structure of the original JSON), followed by the data.
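To make the "structs first, data after" idea concrete, here is a minimal sketch of the same technique on a much smaller scale: one malloc holds a struct followed by the character data it points to. The type demo_string and the function demo_string_create are hypothetical, loosely modelled on json_string_s, and are not how json.h itself is implemented.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical record type, loosely modelled on json_string_s. */
struct demo_string {
  const char *string; /* points into the same allocation as the struct */
  size_t string_size; /* length in bytes, excluding the terminator */
};

/* Performs one malloc that holds the struct followed by the string data.
   Returns NULL if the allocation fails. */
static struct demo_string *demo_string_create(const char *text) {
  size_t size = strlen(text);
  unsigned char *block =
      (unsigned char *)malloc(sizeof(struct demo_string) + size + 1);
  struct demo_string *result;
  char *data;

  if (block == NULL) {
    return NULL;
  }

  result = (struct demo_string *)block;                /* struct first */
  data = (char *)(block + sizeof(struct demo_string)); /* data after it */
  memcpy(data, text, size + 1); /* copies the terminator as well */

  result->string = data;
  result->string_size = size;
  return result;
}
```

Because everything lives in one block, a single free() on the returned pointer releases both the struct and its data, which mirrors why the examples above only ever call free(root).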

Todo

License

This is free and unencumbered software released into the public domain.

Anyone is free to copy, modify, publish, use, compile, sell, or distribute this software, either in source code form or as a compiled binary, for any purpose, commercial or non-commercial, and by any means.

In jurisdictions that recognize copyright laws, the author or authors of this software dedicate any and all copyright interest in the software to the public domain. We make this dedication for the benefit of the public at large and to the detriment of our heirs and successors. We intend this dedication to be an overt act of relinquishment in perpetuity of all present and future rights to this software under copyright law.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

For more information, please refer to http://unlicense.org/

Issues
  • Added input buffer offsets to DOM objects and a line offset buffer.

    As discussed in #38 I'm adding line information to DOM objects.

    • json_value_s stores a byte offset of the value
    • json_object_element_s stores a byte offset of the element name
    • line offset table is optionally exposed via json_parse_result_s
    • offsets can be translated to locations using binary search client-side, e.g. by using std::lower_bound
    opened by alexshafranov 12
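The last point in the issue above suggests translating byte offsets to locations with a binary search such as std::lower_bound. As a rough C sketch of that idea, the hand-written search below finds the line containing a given byte offset. The function demo_line_for_offset and the layout of the line_starts table are assumptions made for illustration, not the format json.h exposes.

```c
#include <stddef.h>

/* line_starts[i] is assumed to be the byte offset at which line i (0-based)
   begins. The table must be sorted ascending, with line_starts[0] == 0.
   Returns the index of the last line whose start is <= offset. */
static size_t demo_line_for_offset(const size_t *line_starts,
                                   size_t line_count, size_t offset) {
  size_t lo = 0;
  size_t hi = line_count; /* the answer lies in the half-open range [lo, hi) */

  while (hi - lo > 1) {
    size_t mid = lo + (hi - lo) / 2;
    if (line_starts[mid] <= offset) {
      lo = mid; /* line mid starts at or before offset: search to the right */
    } else {
      hi = mid; /* line mid starts after offset: search to the left */
    }
  }
  return lo;
}
```

The column, counted in bytes, would then be offset - line_starts[line].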
  • JSON standard wants to escape /

    The JSON standard allows for escaping the / character. The software "Tiled" does this when it outputs JSON data, so using json.h to read those files causes errors, because strings (like paths) look like "..\/..\/some\/file". Ignoring the '\' in this case works fine.

    opened by Smilex 12
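JSON permits the forward slash to be written as a backslash followed by a slash. As a small sketch of the workaround the reporter describes, the function below copies a string while collapsing that two-character escape into a plain slash. The name demo_unescape_solidus is hypothetical, and this is not the fix that was applied to json.h.

```c
#include <stddef.h>

/* Copies src to dst, turning the two-character escape backslash + '/' into a
   single '/'. Any other backslash pair is copied through unchanged, so an
   escaped backslash followed by a slash is left alone.
   dst must have room for at least src_size bytes.
   Returns the number of bytes written to dst. */
static size_t demo_unescape_solidus(const char *src, size_t src_size,
                                    char *dst) {
  size_t in = 0;
  size_t out = 0;

  while (in < src_size) {
    if (src[in] == '\\' && in + 1 < src_size) {
      if (src[in + 1] == '/') {
        dst[out++] = '/'; /* drop the backslash, keep the slash */
      } else {
        dst[out++] = src[in]; /* keep other escapes as they are */
        dst[out++] = src[in + 1];
      }
      in += 2;
    } else {
      dst[out++] = src[in++];
    }
  }
  return out;
}
```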
  • Partial support for simplified notation + fix unlikely bug

    Added support for "Quotes around object keys are optional if the keys are valid identifiers", which IMHO is the important part of simplified json. I wonder if it could be under a #define so that the writer can assume that as well? Right now I haven't changed the writer.

    Also that loop on line 435 would fail on a \\" pattern so I fixed that

      while (state->offset < state->size &&
        ('"' != state->src[state->offset] ||
        ('\\' == state->src[state->offset - 1] &&
        '"' == state->src[state->offset]))) {
        state->data[size++] = state->src[state->offset++];
      }
    

    Became

      while (state->offset < state->size && '"' != state->src[state->offset]) {
        if ('\\' == state->src[state->offset]) {
          // copy reverse solidus character
          state->data[size++] = state->src[state->offset++];
        }
        state->data[size++] = state->src[state->offset++];
      }
    

    Don't need to test for size again as we are on second pass. (Actually it became something else after I added non-quoted keys)

    opened by ocornut 10
  • Error reporting

    I started using this and I'll be wanting some more error reporting (being able to report line number/row and possibly type of errors). I think I can add them to the code, but I was wondering what your priorities were in terms of the speed/functionality trade-off and whether these would be good candidates for a merge.

    For now I have just added line_no and line_start_offset to the json_parse_state_s structure which are only altered when encountering a \n. Byte count from beginning of line can be calculated using (offset-line_start_offset), it isn't a character count per se but already useful and the program can possibly output clang-style error reporting in context from that.

    Since this is C my idea was to add an extra entry point, json_parse_ex(), taking something like a json_parse_result_s* to output information such as the type of error and line number. Would you want such a patch?

    opened by ocornut 10
  • error parsing valid JSON with json_parse_flags_allow_simplified_json

    This gives me a parser error when the json_parse_flags_allow_simplified_json flag is given, but works with the json_parse_flags_default flag: {"frames":[]}

    The error type is error: json_parse_error_expected_colon

    opened by TimothyWrightSoftware 6
  • json.h under TrustInSoft CI

    Hi,

    I initiated the configuration of json.h on the new tool TrustInSoft CI. It's a source code analyzer, which analyzes execution paths (usually unit tests in your repo) to detect Undefined Behaviors along the way. Coverage includes all the defects that sanitizers ASAN, UBSan and MSan find, plus a large number of other (usually subtle) defects.

    I've set up TrustInSoft CI on your test suite located in the test directory (using test/main.cpp as the entry point) and I've added tests from the JSON Parsing Test Suite (318 parsing tests + 22 transform tests = 340 tests total). Each test was run 4 times to emulate 4 different architectures: x86 32-bit, x86 64-bit, PPC 32-bit, and PPC 64-bit. You can check the results here.

    The tool has proved the absence of Undefined Behaviors on all the analyzed execution paths. Since it relies on formal methods, it is able to detect the most subtle Undefined Behaviors, so these results are particularly impressive!

    On the other hand I have noticed that 14 test cases from the JSON Parsing Test Suite give unexpected results: invalid JSON is parsed successfully.

    Would you mind taking a look?

    As I've set up TrustInSoft CI on my fork, one next step would be that you try it out in a continuous integration mode.

    May I ask if this is something that you'd be interested in?

    The setup consists in writing a configuration file. I can put my initial setup in a PR if you wish me to.

    Thanks!

    opened by jakub-zwolakowski 5
  • Fix Unicode escapes and empty strings parsing

    The logic was reversed in the code that validates that the Unicode escape sequence ("\uXXXX") is long enough. Also, the wrong offset was pushed forward.

    opened by kwikwag 4
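The issue above concerns checking that a "\uXXXX" escape is long enough before reading it. The sketch below shows one way such a check could look, together with a UTF-8 encoder for the resulting code point. Both function names are hypothetical, this is not the patch from the issue, and surrogate pairs are deliberately left out to keep the example short.

```c
#include <stddef.h>

/* Reads exactly four hex digits from s (the part after "\u").
   Returns 1 and stores the value in *out on success. Returns 0 if fewer than
   four bytes remain or a non-hex character is found. */
static int demo_parse_hex4(const char *s, size_t remaining, unsigned *out) {
  unsigned value = 0;
  size_t i;

  if (remaining < 4) {
    return 0; /* the escape sequence is too short */
  }

  for (i = 0; i < 4; i++) {
    char c = s[i];
    unsigned digit;
    if (c >= '0' && c <= '9') {
      digit = (unsigned)(c - '0');
    } else if (c >= 'a' && c <= 'f') {
      digit = (unsigned)(c - 'a') + 10;
    } else if (c >= 'A' && c <= 'F') {
      digit = (unsigned)(c - 'A') + 10;
    } else {
      return 0;
    }
    value = value * 16 + digit;
  }

  *out = value;
  return 1;
}

/* Encodes a code point in the range 0..0xFFFF as UTF-8 into out, which must
   hold at least 3 bytes. Returns the number of bytes written (1 to 3).
   Surrogate halves are not treated specially in this sketch. */
static size_t demo_utf8_encode(unsigned cp, unsigned char *out) {
  if (cp < 0x80) {
    out[0] = (unsigned char)cp;
    return 1;
  }
  if (cp < 0x800) {
    out[0] = (unsigned char)(0xC0 | (cp >> 6));
    out[1] = (unsigned char)(0x80 | (cp & 0x3F));
    return 2;
  }
  out[0] = (unsigned char)(0xE0 | (cp >> 12));
  out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
  out[2] = (unsigned char)(0x80 | (cp & 0x3F));
  return 3;
}
```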
  • Ideas for some large changes and some inline macro sugar.

    Hi, I'm interested to hear your opinion on these changes. Let's start with the large ones:

    • Right now arrays are O(n), this is unnecessary. How about changing them to O(1) with a block of array elements after array data to make parsing easier. So [ARRAY OBJECT] ............ [ARRAY_ELEMENT_1][ARRAY_ELEMENT_2] etc.
    • Having separate json_value_s and type structures in the DOM increases its size and makes us jump around with pointers and uglify our code with casts. Why not have a union of those structures inside of json_value_s? Useful bonus here is that we can have an immediate value bool instead of one hacked from types.
    • Object and array access functions and some useful inlines for type checking and quick access to the value via relevant c library functions like atoll or atof.

    Let me know what you think

    opened by fireice-uk 4
  • Test Case: json_write_pretty() with json string \u012b writes garbage

    I've been using your library for a while - very nice. Thank you for the effort to build it! I recently ran into an odd JSON string with a unicode value that looks like valid json to me (based on description of strings at www.json.org). 'json.c' parses it without error, but prints garbage due to the escape character (see program below for test case).

    • I ran this string through my vim json parser: PASS, and it pretty-prints correctly.
    • I ran it through the parser at jsonlint.org: PASS, and it pretty-prints correctly.
    • I ran it through json.c: PASS... but it won't print correctly.

    Compiling on Ubuntu 12.04, g++ version 4.6.4

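    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include "json.h"

    int main(void)
    {
       /* The second value holds a \u012b escape sequence. */
       const char payload[] = "{\"key1\":\"value1\", \"key2\":\"\\u012b\"}"; 
    
       printf("\nPrinting Payload:\n  %s\n\n", payload); 
       printf("\nProcessing Payload in JSON parser...\n");
       struct json_value_s *value = json_parse(payload, strlen(payload));
       void* tmp = json_write_pretty(value, "  ", "\n", 0); 
    
       if(tmp) {
           printf("\nPretty Payload:\n%s\n\n", (char*) tmp);
       }
       /* Release the written text and the parsed DOM. */
       free(tmp);
       free(value);
       return 0; 
    }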
    

    I think the solution is how to handle the escape character in 'json_parse_string()'...

    I'm stumped!

    opened by guidotex 3
  • Single file distribution

    Would it be possible to append the C file into the header file and wrap it in #if !defined(SHEREDOM_JSON_HEADER) to streamline unity builds? (or alternatively #if defined(SHEREDOM_JSON_IMPLEMENTATION); I prefer the first style in contrast to stb-style for unity builds, but the second may be less confusing for stb users)

    opened by ghost 3
  • Fix for errors in streams '{\**\\**\"a":"b"}' and '{ ' with comments

    Hi,

    Those are fixes for bugs mostly caused by the design issue I told you about. I added a test case that tests for multiple comment - whitespace sequences.

    You should consider adding some tests that are done at the end of mmap'ed or VirtualAlloc'ed page - this way your code will segfault if it reads past the input buffer like it would on '{ '.

    Fireice

    opened by fireice-uk 3
  • [Feature Request] Lookup value via given key

    Great work so far.

    However, one thing that would help tremendously is a (family of) utility function(s) to look up values by a given key. Right now I'd have to iterate through a linked list of the members of the json object, which makes the implementation curve annoying. This has its own performance implications, of course.

    Parson has something like these:

    JSON_Value* json_object_get_value(const JSON_Object* object, const char* name);
    const char* json_object_get_string(const JSON_Object* object, const char* name);
    JSON_Array* json_object_get_array(const JSON_Object* object, const char* name);
    double        json_object_get_number(const JSON_Object* object, const char* name);
    
    opened by ddengster 2
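Until such helpers exist, a lookup can be written by walking the linked list of object elements, as the request above describes. The sketch below shows the shape of that loop using local stand-in structs that mirror only the fields used in the README examples (start, name, value, next, string, string_size). The demo_ types and demo_object_get are hypothetical; with the real header the same loop would run over json_object_s and json_object_element_s.

```c
#include <stddef.h>
#include <string.h>

/* Local stand-ins mirroring only the fields used in the README examples.
   These are not the json.h definitions. */
struct demo_name {
  const char *string;
  size_t string_size;
};

struct demo_element {
  struct demo_name *name;
  void *value;
  struct demo_element *next;
};

struct demo_object {
  struct demo_element *start;
  size_t length;
};

/* Linear search through the object's elements. Returns the value of the
   first element whose name equals key, or NULL if there is none. */
static void *demo_object_get(const struct demo_object *object,
                             const char *key) {
  size_t key_size = strlen(key);
  const struct demo_element *element;

  for (element = object->start; element != NULL; element = element->next) {
    if (element->name->string_size == key_size &&
        memcmp(element->name->string, key, key_size) == 0) {
      return element->value;
    }
  }
  return NULL;
}
```

Each lookup is O(n) in the number of members, which is the performance implication the request mentions.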
Owner
Neil Henning
Compiler Warlock on Burst @ Unity