A modern C++ library for reading, writing, and analyzing CSV (and similar) files.

Overview

Vince's CSV Parser

Motivation

There are plenty of other CSV parsers in the wild, but I had a hard time finding what I wanted. Inspired by Python's csv module, I wanted a library with simple, intuitive syntax. Furthermore, I wanted support for special use cases such as calculating statistics on very large files. Thus, this library was created with the following goals in mind.

Performance and Memory Requirements

With the deluge of large datasets available, a performant CSV parser is a necessity. By using overlapped threads, memory mapped IO, and efficient data structures, this parser can quickly tackle large CSV files. Furthermore, this parser has a minimal memory footprint and can handle larger-than-RAM files.

Show me the numbers

On my computer (Intel Core i7-8550U @ 1.80GHz/Toshiba XG5 SSD), this parser can read

Robust Yet Flexible

RFC 4180 and Beyond

This CSV parser is much more than a fancy string splitter, and parses all files following RFC 4180.
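To illustrate why a plain string split is not enough, here is a minimal sketch of RFC 4180-style field splitting. The function name `split_record` is hypothetical and this is not the library's implementation; it only shows how quoted delimiters and doubled quotes have to be handled.

```cpp
#include <string>
#include <vector>

// Split one CSV record into fields, honoring RFC 4180 quoting:
// a quoted field may contain the delimiter, and "" inside quotes
// stands for one literal quote character.
// Hypothetical helper for illustration; not part of csv.hpp.
std::vector<std::string> split_record(const std::string& line, char delim = ',') {
    std::vector<std::string> fields;
    std::string current;
    bool in_quotes = false;

    for (std::size_t i = 0; i < line.size(); ++i) {
        const char c = line[i];
        if (in_quotes) {
            if (c == '"') {
                if (i + 1 < line.size() && line[i + 1] == '"') {
                    current += '"';     // escaped quote inside a quoted field
                    ++i;
                } else {
                    in_quotes = false;  // closing quote
                }
            } else {
                current += c;           // delimiters are plain text here
            }
        } else if (c == '"') {
            in_quotes = true;           // opening quote
        } else if (c == delim) {
            fields.push_back(current);
            current.clear();
        } else {
            current += c;
        }
    }
    fields.push_back(current);          // last field has no trailing delimiter
    return fields;
}
```

A naive split on ',' would break the second field of `a,"b,c"` in two, whereas this sketch keeps it as `b,c`.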

However, in reality we know that RFC 4180 is just a suggestion, and there are many "flavors" of CSV such as tab-delimited files. Thus, this library has:

  • Automatic delimiter guessing
  • Ability to ignore comments in leading rows and elsewhere
  • Ability to handle rows of different lengths

By default, rows of variable length are silently ignored, although you may elect to keep them or throw an error.
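As a rough idea of what delimiter guessing can look like, the sketch below picks the candidate that occurs the same non-zero number of times on every sample line. The name `guess_delimiter` and the candidate list are assumptions for illustration; the library's own guesser may use a different heuristic.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Guess the delimiter by checking which candidate appears the same
// (non-zero) number of times on every sample line. Quoting is ignored
// for brevity. Illustrative heuristic only; not csv.hpp's algorithm.
char guess_delimiter(const std::vector<std::string>& sample_lines,
                     const std::vector<char>& candidates = {',', '\t', ';', '|'}) {
    char best = ',';                 // fallback when nothing fits
    std::ptrdiff_t best_count = 0;

    for (char cand : candidates) {
        std::ptrdiff_t first = -1;
        bool consistent = !sample_lines.empty();
        for (const std::string& line : sample_lines) {
            const std::ptrdiff_t n = std::count(line.begin(), line.end(), cand);
            if (first < 0) {
                first = n;           // count on the first line sets the expectation
            } else if (n != first) {
                consistent = false;  // this candidate yields ragged rows
                break;
            }
        }
        if (consistent && first > best_count) {
            best = cand;
            best_count = first;
        }
    }
    return best;
}
```

Feeding it the first handful of lines of a file is usually enough; a real guesser would also have to account for quoted fields and comment rows.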

Encoding

This CSV parser is encoding-agnostic and will handle ANSI and UTF-8 encoded files. It does not try to decode UTF-8, except for detecting and stripping UTF-8 byte order marks.
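Stripping a byte order mark without decoding anything else can be sketched as follows. `strip_utf8_bom` is a hypothetical helper, not the library's internal routine.

```cpp
#include <string>

// Remove a leading UTF-8 byte order mark (bytes EF BB BF), if present.
// All other bytes are passed through untouched; no decoding is attempted.
// Hypothetical helper for illustration; not part of csv.hpp.
std::string strip_utf8_bom(const std::string& text) {
    static const std::string bom = "\xEF\xBB\xBF";
    if (text.compare(0, bom.size(), bom) == 0) {
        return text.substr(bom.size());
    }
    return text;
}
```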

Well Tested

This CSV parser has an extensive test suite and is checked for memory safety with Valgrind. If you still manage to find a bug, do not hesitate to report it.

Documentation

In addition to the Features & Examples below, the full online documentation contains more examples, details, interesting features, and instructions for less common use cases.

Integration

This library was developed with Microsoft Visual Studio and is compatible with g++ (versions above 6.0) and clang. All of the code required to build this library, aside from the C++ standard library, is contained under include/.

C++ Version

While C++17 is recommended, C++11 is the minimum version required. This library makes extensive use of string views, and uses Martin Moene's string view library if std::string_view is not available.

Single Header

This library is available as a single .hpp file under single_include/csv.hpp.

CMake Instructions

If you're including this in another CMake project, you can simply clone this repo into your project directory, and add the following to your CMakeLists.txt:

# Optional: Defaults to C++ 17
# set(CSV_CXX_STANDARD 11)
add_subdirectory(csv-parser)

# ...

add_executable(<your program> ...)
target_link_libraries(<your program> csv)

Features & Examples

Reading an Arbitrarily Large File (with Iterators)

With this library, you can easily stream over a large file without reading its entirety into memory.

C++ Style

#include "csv.hpp"

using namespace csv;

...

CSVReader reader("very_big_file.csv");

for (CSVRow& row: reader) { // Input iterator
    for (CSVField& field: row) {
        // By default, get<>() produces a std::string.
        // A more efficient get<string_view>() is also available, where the resulting
        // string_view is valid as long as the parent CSVRow is alive
        std::cout << field.get<>() << ...
    }
}

...

Old-Fashioned C Style Loop

...

CSVReader reader("very_big_file.csv");
CSVRow row;
 
while (reader.read_row(row)) {
    // Do stuff with row here
}

...

Memory-Mapped Files vs. Streams

By default, passing in a file path string to the constructor of CSVReader causes memory-mapped IO to be used. In general, this option is the most performant.

However, std::ifstream may also be used, as can in-memory sources via std::stringstream.

Note: Currently CSV guessing only works for memory-mapped files. The CSV dialect must be manually defined for other sources.

CSVFormat format;
// custom formatting options go here

CSVReader mmap("some_file.csv", format);

std::ifstream infile("some_file.csv", std::ios::binary);
CSVReader ifstream_reader(infile, format);

std::stringstream my_csv;
CSVReader sstream_reader(my_csv, format);

Indexing by Column Names

Retrieving values using a column name string is a cheap, constant-time operation.

#include "csv.hpp"

using namespace csv;

...

CSVReader reader("very_big_file.csv");
double sum = 0;

for (auto& row: reader) {
    // Note: Can also use index of column with [] operator
    sum += row["Total Salary"].get<double>();
}

...
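One common way to make name lookups constant time is to build a hash map from column name to position once, when the header is read. The class below is a hypothetical sketch of that idea and says nothing about how csv.hpp stores its column names internally.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Map each column name to its position once, when the header is read.
// Later lookups by name are then average-case O(1) hash lookups.
// Hypothetical structure for illustration; csv.hpp's internals may differ.
class ColumnIndex {
public:
    explicit ColumnIndex(const std::vector<std::string>& names) {
        for (std::size_t i = 0; i < names.size(); ++i) {
            positions_.emplace(names[i], i);  // on duplicates, the first occurrence wins
        }
    }

    // Returns the column position, or -1 if the name is unknown.
    int index_of(const std::string& name) const {
        const auto it = positions_.find(name);
        return it == positions_.end() ? -1 : static_cast<int>(it->second);
    }

private:
    std::unordered_map<std::string, std::size_t> positions_;
};
```

The cost of building the map is paid once per file, so per-row lookups stay cheap no matter how many rows are streamed.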

Numeric Conversions

If your CSV has lots of numeric values, you can also have this parser (lazily) convert them to the proper data type.

  • Type checking is performed on conversions to prevent undefined behavior and integer overflow
    • Negative numbers cannot be blindly converted to unsigned integer types
  • get<float>(), get<double>(), and get<long double>() are capable of parsing numbers written in scientific notation.
  • Note: Conversions to floating point types are not currently checked for loss of precision.

#include "csv.hpp"

using namespace csv;

...

CSVReader reader("very_big_file.csv");

for (auto& row: reader) {
    if (row["timestamp"].is_int()) {
        // Can use get<>() with any integer type, but negative
        // numbers cannot be converted to unsigned types
        row["timestamp"].get<int>();
        
        // ..
    }
}
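The kind of checking described above can be sketched with std::from_chars, which reports out-of-range values instead of overflowing and does not accept a minus sign for unsigned targets. `parse_integer` is a hypothetical helper that needs C++17; it is not how the library implements get<>().

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Parse an entire field as an integer of type T, or return std::nullopt.
// std::from_chars flags out-of-range input rather than overflowing, and
// rejects a leading '-' when T is unsigned.
// Hypothetical helper (C++17) for illustration; not part of csv.hpp.
template <typename T>
std::optional<T> parse_integer(std::string_view field) {
    T value{};
    const char* first = field.data();
    const char* last = first + field.size();
    const auto result = std::from_chars(first, last, value);
    if (result.ec != std::errc() || result.ptr != last) {
        return std::nullopt;  // not a number, trailing characters, or out of range
    }
    return value;
}
```

With this sketch, `parse_integer<unsigned>("-5")` and `parse_integer<short>("70000")` both come back empty rather than wrapping around.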

Converting to JSON

You can serialize individual rows as JSON objects, where the keys are column names, or as JSON arrays (which don't contain column names). The outputted JSON contains properly escaped strings with minimal whitespace and no quoting for numeric values. How these JSON fragments are assembled into a larger JSON document is an exercise left for the user.

#include <sstream>
#include "csv.hpp"

using namespace csv;

...

CSVReader reader("very_big_file.csv");
std::stringstream my_json;

for (auto& row: reader) {
    my_json << row.to_json() << std::endl;
    my_json << row.to_json_array() << std::endl;

    // You can pass in a vector of column names to
    // slice or rearrange the outputted JSON
    my_json << row.to_json({ "A", "B", "C" }) << std::endl;
    my_json << row.to_json_array({ "C", "B", "A" }) << std::endl;
}
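One simple way to assemble per-row fragments into a larger document is to join them into a JSON array. The helper below is a hypothetical sketch that works on plain strings (for example, the strings produced by to_json()) and assumes each fragment is already valid JSON.

```cpp
#include <string>
#include <vector>

// Join already-serialized JSON values (e.g. one object per row) into a
// single JSON array. Each fragment is assumed to be valid JSON on its own.
// Hypothetical helper for illustration; not provided by csv.hpp.
std::string join_json_array(const std::vector<std::string>& fragments) {
    std::string out = "[";
    for (std::size_t i = 0; i < fragments.size(); ++i) {
        if (i > 0) out += ',';  // separator between elements, none after the last
        out += fragments[i];
    }
    out += ']';
    return out;
}
```

For very large files it would be preferable to write the brackets and separators straight to an output stream instead of collecting every fragment in memory first.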

Specifying the CSV Format

Although the CSV parser has a decent guessing mechanism, in some cases it is preferable to specify the exact parameters of a file.

#include "csv.hpp"
#include ...

using namespace csv;

CSVFormat format;
format.delimiter('\t')
      .quote('~')
      .header_row(2);   // Header is on 3rd row (zero-indexed)
      // .no_header();  // Parse CSVs without a header row
      // .quote(false); // Turn off quoting 

// Alternatively, we can use format.delimiter({ '\t', ',', ... })
// to tell the CSV guesser which delimiters to try out

CSVReader reader("wierd_csv_dialect.csv", format);

for (auto& row: reader) {
    // Do stuff with rows here
}

Trimming Whitespace

This parser can efficiently trim off leading and trailing whitespace. Of course, make sure you don't include your intended delimiter or newlines in the list of characters to trim.

CSVFormat format;
format.trim({ ' ', '\t'  });
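For reference, trimming a configurable set of characters from both ends of a field can be sketched like this. `trim_field` is a hypothetical standalone helper, not the routine the parser uses; note the explicit handling of fields made up entirely of trimmable characters.

```cpp
#include <string>

// Strip any of the characters in `trim_chars` from both ends of a field.
// Hypothetical helper for illustration; not part of csv.hpp.
std::string trim_field(const std::string& field, const std::string& trim_chars = " \t") {
    const std::size_t first = field.find_first_not_of(trim_chars);
    if (first == std::string::npos) {
        return "";  // the field consists only of trimmable characters
    }
    const std::size_t last = field.find_last_not_of(trim_chars);
    return field.substr(first, last - first + 1);
}
```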

Handling Variable Numbers of Columns

Sometimes, the rows in a CSV are not all of the same length. Whether this was intentional or not, this library is built to handle all use cases.

CSVFormat format;

// Default: Silently ignoring rows with missing or extraneous columns
format.variable_columns(false); // Short-hand
format.variable_columns(VariableColumnPolicy::IGNORE_ROW); // Renamed from IGNORE in 1.3.3

// Case 2: Keeping variable-length rows
format.variable_columns(true); // Short-hand
format.variable_columns(VariableColumnPolicy::KEEP);

// Case 3: Throwing an error if variable-length rows are encountered
format.variable_columns(VariableColumnPolicy::THROW);
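The three behaviours can be pictured with the following standalone sketch. `RowPolicy` and `apply_policy` are hypothetical names that deliberately differ from the library's own enum; the code only models the semantics on rows that have already been parsed.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Stand-in for the three behaviours described above. The names are
// hypothetical and chosen so they do not clash with csv.hpp's own enum.
enum class RowPolicy { Drop, Keep, Throw };

using Row = std::vector<std::string>;

// Apply a policy to parsed rows, given the expected number of columns.
std::vector<Row> apply_policy(const std::vector<Row>& rows,
                              std::size_t expected,
                              RowPolicy policy) {
    std::vector<Row> kept;
    for (const Row& row : rows) {
        if (row.size() == expected || policy == RowPolicy::Keep) {
            kept.push_back(row);
        } else if (policy == RowPolicy::Throw) {
            throw std::runtime_error("row has an unexpected number of columns");
        }
        // RowPolicy::Drop: silently skip the row
    }
    return kept;
}
```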

Setting Column Names

If a CSV file does not have column names, you can specify your own:

std::vector<std::string> col_names = { ... };
CSVFormat format;
format.column_names(col_names);

Parsing an In-Memory String

#include "csv.hpp"

using namespace csv;

...

// Method 1: Using parse()
std::string csv_string = "Actor,Character\r\n"
    "Will Ferrell,Ricky Bobby\r\n"
    "John C. Reilly,Cal Naughton Jr.\r\n"
    "Sacha Baron Cohen,Jean Giard\r\n";

auto rows = parse(csv_string);
for (auto& r: rows) {
    // Do stuff with row here
}
    
// Method 2: Using _csv operator
// (named differently so both methods can live in the same scope)
auto literal_rows = "Actor,Character\r\n"
    "Will Ferrell,Ricky Bobby\r\n"
    "John C. Reilly,Cal Naughton Jr.\r\n"
    "Sacha Baron Cohen,Jean Giard\r\n"_csv;

for (auto& r: literal_rows) {
    // Do stuff with row here
}

Writing CSV Files

#include "csv.hpp"
#include ...

using namespace csv;
using namespace std;

...

stringstream ss; // Can also use ofstream, etc.

auto writer = make_csv_writer(ss);
// auto writer = make_tsv_writer(ss);               // For tab-separated files
// DelimWriter<stringstream, '|', '"'> writer(ss);  // Your own custom format

writer << vector<string>({ "A", "B", "C" })
    << deque<string>({ "I'm", "too", "tired" })
    << list<string>({ "to", "write", "documentation." });

writer << array<string, 3>({ "The quick brown", "fox", "jumps over the lazy dog" });
writer << make_tuple(1, 2.0, "Three");
...

You can pass in arbitrary types into DelimWriter by defining a conversion function for that type to std::string.
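As a sketch of what such a conversion could look like, the hypothetical type below defines a conversion operator to std::string. Whether DelimWriter picks the conversion up through this operator or through some other hook is an assumption here, so check the online documentation before relying on it.

```cpp
#include <string>

// A user-defined type that can be converted to std::string.
// Hypothetical example type; how a writer discovers the conversion
// is an assumption and not confirmed by this README.
struct Point {
    int x;
    int y;

    // Conversion used whenever a std::string is required.
    operator std::string() const {
        return "(" + std::to_string(x) + ";" + std::to_string(y) + ")";
    }
};
```

The separator is a semicolon rather than a comma so that the resulting field would not need quoting in a comma-delimited file.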

Contributing

Bug reports, feature requests, and so on are always welcome. Feel free to leave a note in the Issues section.

Comments
  • Crash on assertion failure ( Assertion failed! Line: 883 )

    on Windows Visual Studio 2017 Library version 1.3.0

    Trying to parse this file https://www.kaggle.com/austinreese/craigslist-carstrucks-data

    Assertion failed!
    
    Program: ...x64\Debug\Fileparse.exe
    .. \csv.hpp
    Line: 883
    
    Expression: pos < size()
    
    bug 
    opened by bangusi 17
  • Misparsing csv

    Hi, first I want to thank you for developing this library. Its usage is really elegant and easy.

    Unfortunately, I ran into an issue where csv-parser parses the CSV file incorrectly: some fields come out wrong. Am I missing something, or is it a bug? (I generated the .csv file from GNU Octave, and I think its format is correct.)

    Scenario to reproduce the issue: I have a CSV file which contains 16384 columns and two rows. The first row is a header; the second row contains floating point values. The file is delimited by ','. (.csv file attached as .zip)

    #include <csv.hpp>
    
    int main(int argc, char *argv[])
    {
        using namespace csv;
        CSVReader reader("time-result.csv");
    
        for (CSVRow& row: reader) { // Input iterator
            auto i = 0;
            for (CSVField& field: row) {
                if ( i == 7003 ) // 7003th is one of the wrongly parsed fields, there are more
                    std::cout << field.get<>() << std::endl;
                ++i;
            }
        }
    }
    

    Output : e-069.99

    But it should be a valid floating point numerical value string.

    Library Version : 1.3.0 Compiler : gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)

    time-result.zip

    opened by OzanCansel 12
  • This csv file crashes the parser

    This file crashes the parser. It is parsed correctly in Excel and OpenOffice/LibreOffice. My code:

    void f0()
    {
    	csv::CSVFormat format;
    	format.delimiter(',').quote('"').header_row(0);
    	csv::CSVReader reader("problem.txt",format);
    	auto column_names = reader.get_col_names();
    	std::cout << column_names.size() << std::endl;
    	for (auto& cv : column_names)
    	{
    		std::cout << cv << "\t";
    	}
    	std::cout << "\n";
    
    	for (csv::CSVRow& row : reader)
    	{
    		std::cout << row.size() << std::endl;
    		for (auto& rv : row)
    		{
    			std::cout << rv << "\t";
    		}
    		std::cout << "\n";
    	}
    }
    

    The data:

    ACCOUNT_TYPE,ACCOUNT_NUMBER,TRANSACTION_DATE,CHEQUE_NUMBER,DESCRIPTION1,DESCRIPTION2,CAD,USD
    Chequing,07451-1007186,1/2/1987,,"Bill Payment","Purchase Order",-4.00,,
    Saving,07451-1007186,1/29/1987,,"Account Payable Pmt","Mac 6000 INCO",210424.25,,
    Chequing,07451-1007186,2/1/1987,,"Misc Payment","Purchase Order",-200.00,,
    Chequing,07451-1007186,2/5/1987,,"Membership fees","VAT-Y 4007633",-917.33,,
    Chequing,07451-1007186,2/5/1987,,"Membership fees","TXINS 4007659",-950.69,,
    Saving,07451-1007186,2/26/1987,,"Account Payable Pmt","Mac 6000 INCO",79034.35,,
    Chequing,07451-1007186,2/28/1987,,"Membership fees","VAT-Y 7453902",-7905.02,,
    Chequing,07451-1007186,2/28/1987,,"Membership fees","TXINS 7454013",-823.93,,
    Chequing,07451-1007186,3/1/1987,,"Bill Payment","Purchase Order",-8.00,,
    Saving,07451-1007186,3/4/1987,,"Online transfer sent - 1872","Great Outdoors",-17000.00,,

    opened by bangusi 10
  • CMake build error from v1.1.2 to v1.3.0

    I'm coming from version 1.1.2 and want to use 1.3.0. When I run cmake to build my project I get these errors.

    /CLionProjects/student_research/dev/hmmenc_client/main.cpp: In function ‘int main(int, char**)’:
    /CLionProjects/student_research/dev/hmmenc_client/main.cpp:536:136: error: ‘type_name’ is not a member of ‘csv::internals’; did you mean ‘type_num’?
      536 |                                         cout << colNames[i] << "  has:  " << item.second << "  entries of type:  "  << csv::internals::type_name(item.first) << endl;
          |                                                                                                                                        ^~~~~~~~~
          |                                                                                                                                        type_num
    /CLionProjects/student_research/dev/hmmenc_client/main.cpp:820:80: error: ‘class csv::CSVStat’ has no member named ‘correct_rows’
      820 |                                             auto idManualLim = (uint64_t)stats.correct_rows; // All Rows
          |                                                                                ^~~~~~~~~~~~
    /CLionProjects/student_research/dev/hmmenc_client/main.cpp:821:79: error: ‘class csv::CSVStat’ has no member named ‘correct_rows’
      821 |                                             if (idManualLim > (uint64_t)stats.correct_rows)
          |                                                                               ^~~~~~~~~~~~
    /CLionProjects/student_research/dev/hmmenc_client/main.cpp:823:79: error: ‘class csv::CSVStat’ has no member named ‘correct_rows’
      823 |                                                 idManualLim = (uint64_t)stats.correct_rows;
          |                                                                               ^~~~~~~~~~~~
    /CLionProjects/student_research/dev/hmmenc_client/main.cpp:848:82: error: ‘type_name’ is not a member of ‘csv::internals’; did you mean ‘type_num’?
      848 |                                                 columnNameType = csv::internals::type_name(item.first);
          |                                                                                  ^~~~~~~~~
          |                                                                                  type_num
    

    I guess that stats.correct_rows from csv::CSVStat stats() was changed to stats.num_rows, correct?

    Next thing I use is:

    // Get the type of the values in the column
    auto columnNameIndex = (uint64_t)readerInfo.index_of(columnName);
    string columnNameType;
    for (auto item : colDataTypes[columnNameIndex])
    {
        columnNameType = csv::internals::type_name(item.first);
        cout << columnName << " has " << item.second << " elements of type: " <<  columnNameType << endl;
    }
    

    Could you please tell me what I should use instead of type_name in csv::internals::type_name(item.first); to get the same effect? v1.3.0 doesn't have a member with that name in csv::internals.

    opened by CleanHit 9
  • Add Clang builds

    Please see https://app.travis-ci.com/github/xgdgsc/csv-parser/builds/233738949 for various build errors on gcc/clang. Why isn't Travis working on your master branch?

    opened by xgdgsc 8
  • Floating point output completely broken?

    The following:

    #include <iostream>
    #include <csv.hpp>
    
    int main(int argc, char* argv[]) {
        auto writer = csv::make_csv_writer(std::cout);
        writer << std::vector<std::string>{"a", "b", "c"};
        writer << std::make_tuple(1, 0.004654, 45);
    }
    

    Outputs:

    a,b,c
    1,0.465,45
    

    Is there a way to fix this quickly? Otherwise, this makes this library completely unusable...

    Is there a reason not to use the standard to_string functions in the library?

    bug 
    opened by Holt59 8
  • single_include compilation error

    First of all, thanks for doing this library.

    When compiling files in single_include_test directory, the following compilation errors occurred:

    $ g++ -pthread --std=c++14 -o file1 file1.cpp
    In file included from my_header.hpp:2:0,
                     from file1.cpp:1:
    csv.hpp:3975:28: error: enclosing class of constexpr non-static member function ‘bool csv::CSVRow::iterator::operator==(const csv::CSVRow::iterator&) const’ is not a literal type
                 constexpr bool operator==(const iterator& other) const {
                                ^~~~~~~~
    csv.hpp:3945:15: note: ‘csv::CSVRow::iterator’ is not literal because:
             class iterator {
                   ^~~~~~~~
    csv.hpp:3945:15: note:   ‘csv::CSVRow::iterator’ has a non-trivial destructor
    csv.hpp:3979:28: error: enclosing class of constexpr non-static member function ‘bool csv::CSVRow::iterator::operator!=(const csv::CSVRow::iterator&) const’ is not a literal type
                 constexpr bool operator!=(const iterator& other) const { return !operator==(other); }
                                ^~~~~~~~
    

    To fix the errors above I've just modified the following lines:

    (the lines commented are the original ones)

    csv.hpp

            class iterator {
       ......
                /** Two iterators are equal if they point to the same field */
    //|            constexpr bool operator==(const iterator& other) const {
                inline bool operator==(const iterator& other) const {
                    return this->i == other.i;
                };
    
    //|            constexpr bool operator!=(const iterator& other) const { return !operator==(other); }
                inline bool operator!=(const iterator& other) const { return !operator==(other); }
    

    file1.hpp

    //|int foobar(int argc, char** argv) {
    int main(int argc, char** argv) {
        using namespace csv;
    

    The file2.hpp is ok.

    opened by viniciusjl 8
  • Why is the first column ignored?

    Windows 10, Visual Studio 2017

    My data

    EMPLOYEEKEY	FIRSTNAME	HIREDATE	LASTNAME	TITLE	
    2	Kevin	2006-08-26	Brown	Marketing Assistant	
    3	Roberto	2007-06-11	Tamburello	Engineering Manager	
    4	Rob	2007-07-05	Walters	Senior Tool Designer	
    5	Rob	2007-07-05	Walters	Senior Tool Designer	
    6	Thierry	2007-07-11	D'Hers	Tool Designer	
    7	David	2007-07-20	Bradley	Marketing Manager	
    8	David	2007-07-20	Bradley	Marketing Manager	
    9	JoLynn	2007-07-26	Dobney	Production Supervisor - WC60	
    10	Ruth	2007-08-06	Ellerbrock	Production Technician - WC10
    

    My Code

    void f0()
    {
    	csv::Reader foo;
    
    	foo.configure_dialect("my_dialect")
    		.delimiter("\t")
    		.quote_character('"')
    		.double_quote(true)
    		.skip_initial_space(false)
    		.trim_characters(' ', '\t')
    		//	.ignore_columns("foo", "bar")
    		.header(true)
    		.skip_empty_rows(true);
    
    	foo.read("sample.csv");
    	auto rows = foo.rows();
    	for (auto& row : rows)
    	{
    		auto key = row["EMPLOYEEKEY"];
    		auto fname = row["FIRSTNAME"];
    		auto hdate = row["HIREDATE"];
    		auto lname = row["LASTNAME"];
    		auto title = row["TITLE"];
    		std::cout << key << " " << fname << " " << hdate << " " << lname << " " << title << "\n";
    	}
    
    }
    

    Output

    Kevin 2006-08-26 Brown Marketing Assistant
    Roberto 2007-06-11 Tamburello Engineering Manager
    Rob 2007-07-05 Walters Senior Tool Designer
    Rob 2007-07-05 Walters Senior Tool Designer
    Thierry 2007-07-11 D'Hers Tool Designer
    David 2007-07-20 Bradley Marketing Manager
    David 2007-07-20 Bradley Marketing Manager
    JoLynn 2007-07-26 Dobney Production Supervisor - WC60
    Ruth 2007-08-06 Ellerbrock Production Technician - WC10

    opened by bangusi 8
  • Data corruption and crash with very wide columns?

    large file with 2000 x ~20char wide double columns, generated like this:

      const int cols_n = 2000;
      std::string big_filename = "MedDataset.txt";
      std::ofstream ofstream(big_filename);
      if (!ofstream.is_open()) {
        std::cerr << "failed to open " << big_filename << '\n';
        exit(1);
      }
    
      std::random_device rd;
      std::mt19937 gen{rd()};
      std::uniform_real_distribution<double> dist{0, 1};
    
      ofstream << std::setprecision(16);
      for (int r = 0; r < 1000; r++) {
        for (int c = 0; c < cols_n; c++) {
          double num = dist(gen);
          ofstream << num;
          if (c != cols_n -1) ofstream << ',';
        }
        ofstream << "\n";
    
      }
      ofstream.close();
    
    

    parsing like this:

      CSVReader reader("MedDataset.txt");
    
      {
        std::vector<double> r;
        for (CSVRow& row: reader) { // Input iterator
          for (CSVField& field: row) {
            r.push_back(field.get<double>());
          }
          // use vector...
          r.clear();
        }
      }
    

    getting this error during parsing..

    terminate called after throwing an instance of 'std::runtime_error' 
      what():  Not a number.
    

    If I reduce the cols_n = 2000 to 1800 it runs just fine.

    I have visually inspected the file and am not seeing any weird characters. All of it is programmatically produced.

    It feels like there is some sort of "buffer overflow" due to the very large row (roughly 32 KB?). It is 100% reproducible for me even though the values of the fields are random.

    clang++ -O2  -std=c++17   ...    -lpthread
    clang++ --version
    clang version 8.0.0-3 (tags/RELEASE_800/final)
    Target: x86_64-pc-linux-gnu
    
    bug enhancement 
    opened by oschonrock 6
  • Getting Column Data Types with get_dtypes()

    I tried your example code for some file statistics here csv::CSVStat Class Reference. Here is a minimal example:

    csv::CSVStat stats(csvFilePath);
    auto colDataTypes = stats.get_dtypes();
    auto colNames = stats.get_col_names();
    
    // Doesn't work for colDataTypes but works for other defined statistics like max, min and so on...
    for (int64_t it = 0; it < colNames.size(); it++){
        std::cout << colDataTypes[it] << std::endl;
    }
    
    // Doesn't work either
    for (auto &type : colDataTypes){
        std::cout << type << std::endl;
    }
    

    How can I get the colDataTypes of each column printed out? If I understood it correctly, that is what the get_dtypes() function is supposed to do.

    opened by CleanHit 6
  • Add option to ignore quote character

    It appears the quote character defaults to '"', which works in many cases, but I have run into situations where the file has no quote character specified. In such cases, when '"' is encountered the parser produces incorrect results. It can even lead to a program crash.

    opened by bangusi 5
  • Bug: Segmentation Fault on `CSVStat` when No Rows

    Input:

    myfile.csv

    a
    
    

    test.cpp:

    #include "csv.hpp"
    
    using namespace csv;
    
    int main(int argc, char *argv[]) {
        
      CSVStat stats("myfile.csv");
      auto min = stats.get_mins();
    
      return 0;
    }
    

    Output:

    $ ./fuzz_csv 
    Segmentation fault
    
    CMake Details

    CMakeLists.txt

    cmake_minimum_required(VERSION 3.16)
    project(fuzz-csv-parser)
    
    set(CMAKE_CXX_STANDARD 17)
    set(CMAKE_CXX_STANDARD_REQUIRED ON)
    
    include(FetchContent)
    
    ##################################################
    # From: https://github.com/vincentlaucsb/csv-parser/issues/135#issuecomment-977528169
    ##################################################
    
    # Install a header-only library from github as dependency.
    FetchContent_Declare(
      csv
      GIT_REPOSITORY https://github.com/vincentlaucsb/csv-parser.git
      GIT_TAG        master
      )
    
    FetchContent_MakeAvailable(csv)
    
    include_directories(
      ${CMAKE_SOURCE_DIR}/${FolderSource} # Find headers from source folder
      ${csv_SOURCE_DIR}/single_include # Issue #139 for this repo
      )
    
    ##################################################
    # End from
    ##################################################
    
    add_executable(fuzz_csv)
    target_sources(fuzz_csv
      PRIVATE
        test.cpp
    )
    target_link_libraries(fuzz_csv csv)
    

    Full Command:

    ~/e4/build-testing$ cmake ../fuzz-csv-parser/
    Building CSV library using C++17
    -- Configuring done
    -- Generating done
    -- Build files have been written to: ~/e4/build-testing
    ~/e4/build-testing$ make clean all
    [  5%] Building CXX object _deps/csv-build/include/internal/CMakeFiles/csv.dir/basic_csv_parser.cpp.o
    [ 11%] Building CXX object _deps/csv-build/include/internal/CMakeFiles/csv.dir/col_names.cpp.o
    [ 16%] Building CXX object _deps/csv-build/include/internal/CMakeFiles/csv.dir/csv_format.cpp.o
    [ 22%] Building CXX object _deps/csv-build/include/internal/CMakeFiles/csv.dir/csv_reader.cpp.o
    [ 27%] Building CXX object _deps/csv-build/include/internal/CMakeFiles/csv.dir/csv_reader_iterator.cpp.o
    [ 33%] Building CXX object _deps/csv-build/include/internal/CMakeFiles/csv.dir/csv_row.cpp.o
    [ 38%] Building CXX object _deps/csv-build/include/internal/CMakeFiles/csv.dir/csv_row_json.cpp.o
    [ 44%] Building CXX object _deps/csv-build/include/internal/CMakeFiles/csv.dir/csv_stat.cpp.o
    [ 50%] Building CXX object _deps/csv-build/include/internal/CMakeFiles/csv.dir/csv_utility.cpp.o
    [ 55%] Linking CXX static library libcsv.a
    [ 55%] Built target csv
    [ 61%] Building CXX object CMakeFiles/fuzz_csv.dir/test.cpp.o
    [ 66%] Linking CXX executable fuzz_csv
    [ 66%] Built target fuzz_csv
    [ 72%] Building CXX object CMakeFiles/fuzz_csv2.dir/test2.cpp.o
    [ 77%] Linking CXX executable fuzz_csv2
    [ 77%] Built target fuzz_csv2
    [ 83%] Building CXX object _deps/csv-build/programs/CMakeFiles/csv_info.dir/csv_info.cpp.o
    [ 88%] Linking CXX executable csv_info
    [ 88%] Built target csv_info
    [ 94%] Building CXX object _deps/csv-build/programs/CMakeFiles/csv_stats.dir/csv_stats.cpp.o
    [100%] Linking CXX executable csv_stats
    [100%] Built target csv_stats
    ~/e4/build-testing$ ./fuzz_csv 
    Segmentation fault
    
    opened by chrishappy 0
  • mingw cross-compilation fails when looking for Windows.h

    https://github.com/vincentlaucsb/csv-parser/blob/9d5f796a32c6cdecd83a2f778ca6db0500948d27/include/internal/common.hpp#L16

    Due to the capitalization in the include directive, MinGW errors out when cross-compiling on Linux targeting Windows. Fortunately this is the only instance of this problem, as all other instances of #include <windows.h> in csv-parser are lowercase.

    opened by nbarrios1337 0
Releases(2.1.3)
  • 2.1.3(Jul 29, 2021)

  • 2.1.2(Jul 27, 2021)

    • Fixed compilation issues with C++11 and 14.
      • CSV Parser should now be C++11 compatible once again with g++ 7.5 or up
    • Allowed users to customize decimal place precision when writing CSVs
    • Fixed floating point output
      • Arbitrarily large integers stored in doubles can now be output w/o limits
    • Fixed newlines not being escaped by CSVWriter
  • 2.1.1(Apr 15, 2021)

    • Fixed CSVStats only processing first 5000 rows thanks to @TobyEalden
    • Fixed parsing """fields like this""" thanks to @rpadrela
    • Fixed CSVReader move semantics thanks to @artpaul
  • 2.1.0.1(Dec 20, 2020)

  • 2.1.0(Oct 18, 2020)

    New Features

    • CSVReader can now parse from memory mapped files, std::stringstream, and std::ifstream
    • DelimWriter now supports writing rows encoded as std::tuple
    • DelimWriter automatically converts numbers and other data types stored in vectors, arrays, and tuples

    Improvements

    • CSVReader is now a no-copy parser when memory-mapped IO is used
      • CSVRow and CSVField now refer to the original memory map
    • Significant performance improvements for some files

    Bug Fixes

    • Fixed potential thread safety issues with internals::CSVFieldList

    API Changes

    • CSVReader::feed() and CSVReader::end_feed() have been removed. In-memory parsing should be performed via the interface for std::stringstream.
  • 2.0.1(Oct 1, 2020)

  • 2.0.0(Sep 27, 2020)

    • Parser now uses memory-mapped IO for reading from disk thanks to mio
      • CSV files are read in smaller chunks to reduce memory footprint (but parsing is significantly faster)
    • CSVReader::read_row() (and CSVReader::iterator) no longer blocks CSVReader::read_csv(), i.e. we can now simultaneously work on CSV data while reading more rows
    • Parser internals completely rewritten to use more efficient and easier to maintain/debug data structures
    • Fixed bug where single column files could not be parsed
    • Fixed errors with parsing empty files
    • CSVWriter::write_row() now works with std::array
  • 2.0.0-beta(Sep 22, 2020)

    • Parser now uses memory-mapped IO for reading from disk
      • On Windows, parser may map entire file into memory or mmap chunks of file iteratively based on available RAM (will extend to all OSes)
    • Parser internals completely rewritten to use more efficient and easier to maintain/debug data structures
      • New algorithm involves minimal copying
    • Fixed bug where single column files could not be parsed
    • Fixed errors with parsing empty files
  • 1.3.3(May 16, 2020)

    • Fixed issue with incorrect usage of string_view that led to memory errors when parsing large files such as the 1.4GB Craigslist vehicles dataset #90
    • Added ability to have no quote character #83
    • Changed VariableColumnPolicy::IGNORE to IGNORE_ROW to avoid clashing with IGNORE macro as defined by WinBase.h #96
  • 1.3.2(May 9, 2020)

    • Fixed bug with parsing very long rows (as reported in #92) when the length of the row was greater than 2^16 (the limit of unsigned short)
      • All instances of unsigned short have been replaced by internals::StrBufferPos (size_t) thus giving this parser the theoretical capability of parsing rows that are 2^64 characters long
    • Fixed bug recognizing numbers in e-notation when the base did not have a decimal, e.g. 1E-06
  • 1.3.1(May 4, 2020)

    Fixes incorrect CSV parsing when whitespace trimming is enabled and a field is composed entirely of whitespace characters, as reported in #85.
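
An all-whitespace field is a classic edge case for trimming code, because the search for the first non-whitespace character finds nothing. The following is a minimal sketch of a trim that handles that case explicitly; `trim` is a hypothetical helper, not the library's implementation, and the default whitespace set (space and tab) is an assumption.

```cpp
#include <string>

// Hypothetical helper: strip leading and trailing whitespace from a field.
inline std::string trim(const std::string& field,
                        const std::string& whitespace = " \t") {
    const std::string::size_type first = field.find_first_not_of(whitespace);
    if (first == std::string::npos) {
        return std::string();  // empty or entirely whitespace
    }
    const std::string::size_type last = field.find_last_not_of(whitespace);
    return field.substr(first, last - first + 1);
}
```

Without the `npos` check, `first` would be used as a start index and `substr` would throw `std::out_of_range`.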
  • 1.3.0(Mar 12, 2020)

    • The behavior for parsing variable-column CSV files can now be simply defined using CSVFormat::variable_columns()
      • Variable-column rows can be kept or silently dropped (default), or result in an error being thrown
      • CSVReader::bad_row_handler() has been removed
    • Many annoying clang/gcc warning messages fixed (thanks rpavlik!)
    • CSV guessing implementation has been simplified (CSVGuesser is also gone now)
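
The three behaviors for variable-column rows (keep, silently drop, or throw) can be sketched independently of the library. Everything below is hypothetical and for illustration only: `RowPolicy`, `Row`, and `apply_policy` are not the library's types, and the real parser applies its policy while reading rather than on a finished vector.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

using Row = std::vector<std::string>;

// Hypothetical stand-in for the three documented behaviors.
enum class RowPolicy { Throw, Drop, Keep };

// Filter rows whose length differs from the expected column count.
inline std::vector<Row> apply_policy(const std::vector<Row>& rows,
                                     std::size_t expected_cols,
                                     RowPolicy policy) {
    std::vector<Row> out;
    for (const Row& row : rows) {
        if (row.size() == expected_cols || policy == RowPolicy::Keep) {
            out.push_back(row);
        } else if (policy == RowPolicy::Throw) {
            throw std::runtime_error("row length differs from column count");
        }
        // RowPolicy::Drop: the mismatched row is skipped silently.
    }
    return out;
}
```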
  • 1.2.5(Mar 10, 2020)

  • 1.2.4(Jan 20, 2020)

  • 1.2.2.1(Sep 7, 2019)

    • Fixed clang++ compilation warnings and UTF-8 BOM detection (with the help of @tamaskenez)
    • Fixed compilation errors that occurred when including single header csv.hpp in multiple files
  • 1.2.2(Sep 3, 2019)

    CSVRow objects now have to_json() and to_json_array() methods with proper string escaping and column slicing/rearranging. The CSVFormat interface is now also more robust.
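
To show what "proper string escaping" involves when a CSV field becomes a JSON string, here is a minimal sketch. `json_escape` is a hypothetical helper written for illustration, not the code behind to_json(); it covers the quote, the backslash, and the control characters below 0x20, and passes other bytes (including UTF-8 sequences) through untouched.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: escape a field so it can sit between JSON quotes.
inline std::string json_escape(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (unsigned char c : in) {
        switch (c) {
        case '"':  out += "\\\""; break;
        case '\\': out += "\\\\"; break;
        case '\b': out += "\\b";  break;
        case '\f': out += "\\f";  break;
        case '\n': out += "\\n";  break;
        case '\r': out += "\\r";  break;
        case '\t': out += "\\t";  break;
        default:
            if (c < 0x20) {
                // Remaining control characters use the \u00XX form.
                char buf[7];
                std::snprintf(buf, sizeof buf, "\\u%04x",
                              static_cast<unsigned>(c));
                out += buf;
            } else {
                out += static_cast<char>(c);
            }
        }
    }
    return out;
}
```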
  • 1.2.1.1(Aug 18, 2019)

  • 1.2.1(Jun 9, 2019)

  • 1.2.0(May 26, 2019)

    • Integrated Hedley library
      • Possible performance increase on older compilers due to use of restrict, pure, etc.
    • Better handling of integer types with get()
      • get<>() is now supported for all signed integer types
      • Removed complications regarding the (mostly) useless long int type
    • CSVWriter now accepts deque<string> and list<string> as inputs
    • Updated Catch to latest version
      • Refactored unit tests
  • 1.1.4.1(May 20, 2019)

  • 1.1.4(May 19, 2019)

    • Improved performance by storing all CSVRow data in contiguous memory regions
      • Numbers based on my computer
        • Disk parsing speed: 220 MB/s
        • In-memory parsing speed: 380 MB/s
    • Improved CSVFormat interface
  • 1.1.3(Mar 31, 2019)

  • 1.1.2(Mar 31, 2019)

    CSV library should now be compatible with C++11 by using a third-party string_view implementation. If C++17 is detected, then std::string_view will be used.
  • v1.1.0(Jul 25, 2018)

Owner
Vincent La
Proud Gaucho (UCSB '18). Occasionally codes.