A compile-time, (almost) PCRE-compatible regular expression matcher.

Overview

Compile time regular expressions v3

Fast compile-time regular expressions with support for matching/searching/capturing during compile-time or runtime.

You can use the single-header version from the single-header directory. This header can be regenerated with make single-header. If you are using cmake, you can add this directory as a subdirectory and link to the target ctre.

More info at compile-time.re

What this library can do

ctre::match<"REGEX">(subject); // C++20
"REGEX"_ctre.match(subject); // C++17 + N3599 extension
  • Matching
  • Searching (search or starts_with)
  • Capturing content (named captures are supported too)
  • Back-Reference (\g{N} syntax, and \1...\9 syntax too)
  • Multiline support (with the multi_ functions)
  • Unicode properties and UTF-8 support
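Matching, searching, and prefix matching differ in how much of the subject the pattern has to cover. The sketch below illustrates the three behaviours with std::regex as a runtime stand-in; the helper names (whole_match, found_anywhere, matches_prefix) are hypothetical and are not part of CTRE.

```cpp
#include <regex>
#include <string>

// Runtime stand-ins built on std::regex; these are NOT CTRE functions.

// Whole-subject match: the pattern must cover the entire input.
inline bool whole_match(const std::string& subject, const std::regex& re) {
    return std::regex_match(subject, re);
}

// Search: the pattern may match anywhere inside the input.
inline bool found_anywhere(const std::string& subject, const std::regex& re) {
    return std::regex_search(subject, re);
}

// Prefix match: the pattern must match starting at the first character.
inline bool matches_prefix(const std::string& subject, const std::regex& re) {
    return std::regex_search(subject, re, std::regex_constants::match_continuous);
}
```

With the pattern [0-9]+, the subject "abc123" is found by a search but is neither a whole match nor a prefix match, while "123abc" is a prefix match only.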

The library implements most of the PCRE syntax, with a few exceptions:

  • atomic groups
  • callouts
  • comments
  • conditional patterns
  • control characters (\cX)
  • match point reset (\K)
  • named characters
  • octal numbers
  • options / modes
  • subroutines
  • unicode grapheme cluster (\X)

More documentation on pcre.org.

Possible subjects (inputs)

  • std::string-like objects (std::string_view or your own string type, if it provides begin/end functions with forward iterators)
  • pairs of forward iterators

Unicode support

To enable Unicode support, you need to include:

  • <ctre-unicode.hpp>
  • or <ctre.hpp> and <unicode-db.hpp>

If you use the Unicode support without including these headers, you will get missing-symbol errors.

Supported compilers

  • clang 6.0+ (template UDL, C++17 syntax)
  • xcode clang 10.0+ (template UDL, C++17 syntax)
  • gcc 8.0+ (template UDL, C++17 syntax)
  • gcc 9.0+ (C++17 & C++20 cNTTP syntax)
  • MSVC 15.8.8+ (C++17 syntax only) (semi-supported, I don't have a Windows machine)

Template UDL syntax

The compiler must support extension N3599, which is available, for example, as a GNU extension in gcc (not in GCC 9.1+) and clang.

constexpr auto match(std::string_view sv) noexcept {
    using namespace ctre::literals;
    return "h.*"_ctre.match(sv);
}

If you need extension N3599 in GCC 9.1+, you can't use -pedantic. Also, you need to define macro CTRE_ENABLE_LITERALS.

C++17 syntax

You can provide a pattern as a constexpr ctll::fixed_string variable.

static constexpr auto pattern = ctll::fixed_string{ "h.*" };

constexpr auto match(std::string_view sv) noexcept {
    return ctre::match<pattern>(sv);
}

(this is tested in MSVC 15.8.8)
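To illustrate why the pattern has to be a constexpr object with linkage, here is a minimal C++17 sketch of passing such an object by reference as a template argument. The type fixed_text and the function pattern_length are hypothetical stand-ins written for this example; they are not ctll::fixed_string or any CTRE API, and the library's real implementation may differ.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical stand-in for a fixed-size compile-time string
// (illustrative only; this is not ctll::fixed_string).
template <std::size_t N>
struct fixed_text {
    char data[N]{};

    constexpr fixed_text(const char (&text)[N]) noexcept {
        for (std::size_t i = 0; i < N; ++i) data[i] = text[i];
    }

    // N counts the terminating '\0', so the visible length is N - 1.
    constexpr std::string_view view() const noexcept {
        return std::string_view{data, N - 1};
    }
};

// The object has static storage duration and linkage, which C++17
// requires for an object bound to a reference template parameter.
static constexpr fixed_text pattern{"h.*"};

// C++17: the pattern object is taken by reference as a template parameter.
template <const auto& Pattern>
constexpr std::size_t pattern_length() noexcept {
    return Pattern.view().size();
}

static_assert(pattern_length<pattern>() == 3);
```

A local variable inside a function would not work as the template argument here, which is the same restriction the C++17 syntax above places on the pattern object.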

C++20 syntax

Currently, the only compiler which supports cNTTP syntax ctre::match<PATTERN>(subject) is GCC 9+.

constexpr auto match(std::string_view sv) noexcept {
    return ctre::match<"h.*">(sv);
}

Examples

Extracting number from input

std::optional<std::string_view> extract_number(std::string_view s) noexcept {
    if (auto m = ctre::match<"[a-z]+([0-9]+)">(s)) {
        return m.get<1>().to_view();
    } else {
        return std::nullopt;
    }
}

Extracting values from date

struct date { std::string_view year; std::string_view month; std::string_view day; };

std::optional<date> extract_date(std::string_view s) noexcept {
    using namespace ctre::literals;
    if (auto [whole, year, month, day] = ctre::match<"(\\d{4})/(\\d{1,2})/(\\d{1,2})">(s); whole) {
        return date{year, month, day};
    } else {
        return std::nullopt;
    }
}

//static_assert(extract_date("2018/08/27"sv).has_value());
//static_assert((*extract_date("2018/08/27"sv)).year == "2018"sv);
//static_assert((*extract_date("2018/08/27"sv)).month == "08"sv);
//static_assert((*extract_date("2018/08/27"sv)).day == "27"sv);

Using captures

auto result = ctre::match<"(?<year>\\d{4})/(?<month>\\d{1,2})/(?<day>\\d{1,2})">(s);
return date{result.get<"year">(), result.get<"month">(), result.get<"day">()};

// or in C++ emulation, but the object must have linkage
static constexpr ctll::fixed_string year = "year";
static constexpr ctll::fixed_string month = "month";
static constexpr ctll::fixed_string day = "day";
return date{result.get<year>(), result.get<month>(), result.get<day>()};

// or use numbered access
// capture 0 is the whole match
return date{result.get<1>(), result.get<2>(), result.get<3>()};

Lexer

enum class type {
    unknown, identifier, number
};

struct lex_item {
    type t;
    std::string_view c;
};

std::optional<lex_item> lexer(std::string_view v) noexcept {
    if (auto [m,id,num] = ctre::match<"([a-z]+)|([0-9]+)">(v); m) {
        if (id) {
            return lex_item{type::identifier, id};
        } else if (num) {
            return lex_item{type::number, num};
        }
    }
    return std::nullopt;
}

Range over input

This support is preliminary; the API will probably change.

auto input = "123,456,768"sv;

for (auto match: ctre::range<"([0-9]+),?">(input)) {
    std::cout << std::string_view{match.get<0>()} << "\n";
}
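For comparison, the same kind of loop can be sketched at runtime with the standard library's regex iterators. This is a stand-in using std::regex, not CTRE, and the function name extract_numbers is hypothetical. It reads capture group 1, which holds only the digits; the whole match (group 0) would also include the optional trailing comma.

```cpp
#include <regex>
#include <string>
#include <vector>

// Collects capture group 1 of every successive match
// (runtime std::regex stand-in, not CTRE).
inline std::vector<std::string> extract_numbers(const std::string& input) {
    static const std::regex number{"([0-9]+),?"};
    std::vector<std::string> out;
    for (std::sregex_iterator it{input.begin(), input.end(), number}, end{};
         it != end; ++it) {
        out.push_back((*it)[1].str());  // group 1: digits without the comma
    }
    return out;
}
```

Calling extract_numbers("123,456,768") yields the three strings 123, 456 and 768.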

Running tests (for developers)

Just run make in the root of this project.

Comments
  • Ctre causes stack overflow when using certain text with certain regex ??

    This Code Fails miserably :

    static constexpr auto line_q = ctll::fixed_string{ "^(\\w+)=\"(.*)\"\\s*$" };
        std::string data = "EXAMPLE=\"TEXT TEXT TEXT TEXT TEXT-TEXT-THAT TEXT.TEXT TEXT TEXT + I I II * I II + I NVM_>U=SOMEMORETEXT#&PhK&2 \"\"TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT\"\"\"";
        std::string_view data_view = data;
        auto matches = ctre::search<line_q>(data_view);  // Line causes stack overflow
    
    opened by beemibrahim 17
  • Concatenating repeats

    Playing around with repeats, a bit like optionals we can append them together, if found back to back. for example: a{1,2}a{2,3} -> a{3,5} Possessive form's a bit tricky.

    // append repeats
    template <typename R, typename Iterator, typename EndIterator, size_t A0, size_t B0, size_t A1, size_t B1, typename... Content, typename... Tail>
    constexpr CTRE_FORCE_INLINE R evaluate(const Iterator begin, Iterator current, const EndIterator end, R captures, ctll::list<repeat<A0, B0, Content...>, repeat<A1, B1, Content...>, Tail...>) {
    	return evaluate(begin, current, end, captures, ctll::list<repeat<A0+A1, (B0==0ULL || B1==0ULL) ? (0ULL) : (B0+B1), Content...>, Tail...>());
    }
    template <typename R, typename Iterator, typename EndIterator, size_t A0, size_t B0, typename... Content, typename... Tail>
    constexpr CTRE_FORCE_INLINE R evaluate(const Iterator begin, Iterator current, const EndIterator end, R captures, ctll::list<repeat<A0, B0, Content...>, optional<Content...>, Tail...>) {
    	constexpr size_t A1 = 0; constexpr size_t B1 = 1;
    	return evaluate(begin, current, end, captures, ctll::list<repeat<A0 + A1, (B0 == 0ULL || B1 == 0ULL) ? (0ULL) : (B0 + B1), Content...>, Tail...>());
    }
    template <typename R, typename Iterator, typename EndIterator, size_t A0, size_t B0, typename... Content, typename... Tail>
    constexpr CTRE_FORCE_INLINE R evaluate(const Iterator begin, Iterator current, const EndIterator end, R captures, ctll::list<repeat<A0, B0, Content...>, star<Content...>, Tail...>) {
    	constexpr size_t A1 = 0; constexpr size_t B1 = 0;
    	return evaluate(begin, current, end, captures, ctll::list<repeat<A0 + A1, (B0 == 0ULL || B1 == 0ULL) ? (0ULL) : (B0 + B1), Content...>, Tail...>());
    }
    template <typename R, typename Iterator, typename EndIterator, size_t A0, size_t B0, typename... Content, typename... Tail>
    constexpr CTRE_FORCE_INLINE R evaluate(const Iterator begin, Iterator current, const EndIterator end, R captures, ctll::list<repeat<A0, B0, Content...>, plus<Content...>, Tail...>) {
    	constexpr size_t A1 = 1; constexpr size_t B1 = 0;
    	return evaluate(begin, current, end, captures, ctll::list<repeat<A0 + A1, (B0 == 0ULL || B1 == 0ULL) ? (0ULL) : (B0 + B1), Content...>, Tail...>());
    }
    
    // append lazy repeats (they act like repeats)
    template <typename R, typename Iterator, typename EndIterator, size_t A0, size_t B0, size_t A1, size_t B1, typename... Content, typename... Tail>
    constexpr CTRE_FORCE_INLINE R evaluate(const Iterator begin, Iterator current, const EndIterator end, R captures, ctll::list<lazy_repeat<A0,B0, Content...>, lazy_repeat<A1,B1, Content...>, Tail...>) {
    	return evaluate(begin, current, end, captures, ctll::list<lazy_repeat<A0 + A1, (B0 == 0ULL || B1 == 0ULL) ? (0ULL) : (B0 + B1), Content...>, Tail...>());
    }
    template <typename R, typename Iterator, typename EndIterator, size_t A0, size_t B0, typename... Content, typename... Tail>
    constexpr CTRE_FORCE_INLINE R evaluate(const Iterator begin, Iterator current, const EndIterator end, R captures, ctll::list<lazy_repeat<A0, B0, Content...>, lazy_optional<Content...>, Tail...>) {
    	constexpr size_t A1 = 0; constexpr size_t B1 = 1;
    	return evaluate(begin, current, end, captures, ctll::list<lazy_repeat<A0 + A1, (B0 == 0ULL || B1 == 0ULL) ? (0ULL) : (B0 + B1), Content...>, Tail...>());
    }
    template <typename R, typename Iterator, typename EndIterator, size_t A0, size_t B0, typename... Content, typename... Tail>
    constexpr CTRE_FORCE_INLINE R evaluate(const Iterator begin, Iterator current, const EndIterator end, R captures, ctll::list<lazy_repeat<A0, B0, Content...>, lazy_star<Content...>, Tail...>) {
    	constexpr size_t A1 = 0; constexpr size_t B1 = 0;
    	return evaluate(begin, current, end, captures, ctll::list<lazy_repeat<A0 + A1, (B0 == 0ULL || B1 == 0ULL) ? (0ULL) : (B0 + B1), Content...>, Tail...>());
    }
    template <typename R, typename Iterator, typename EndIterator, size_t A0, size_t B0, typename... Content, typename... Tail>
    constexpr CTRE_FORCE_INLINE R evaluate(const Iterator begin, Iterator current, const EndIterator end, R captures, ctll::list<lazy_repeat<A0, B0, Content...>, lazy_plus<Content...>, Tail...>) {
    	constexpr size_t A1 = 1; constexpr size_t B1 = 0;
    	return evaluate(begin, current, end, captures, ctll::list<lazy_repeat<A0 + A1, (B0 == 0ULL || B1 == 0ULL) ? (0ULL) : (B0 + B1), Content...>, Tail...>());
    }
    
    // append possessive repeats (slightly different, since the possessive behavior of the leading repeat changes things)
    // EG: a{0,3}+a{2,4}+ -> a{5,7}+ (here a{0,3}+ will consume up to aaa, then try to process a{2,4}+)
    //     a{2,4}+a{0,3}+ -> a{2,7}+ (here a{2,4}+ may consume aa, then try to process a{0,3}+)
    //     a{2,4}+a{1,3}+ -> a{5,7}+ (here a{2,4}+ will consume up to aaaa, then try to process a{1,3}+)
    // simple cases:
    //     a{0,4}+a{0,3}+ -> a{0,7}+
    //     a{0,3}+a{0,4}+ -> a{0,7}+
    //     a{0,}+a{1,}+ -> reject (here a{0,}+ will process any a's it sees, and since a{0,} doesn't backtrack there can be no a to satisfy a{1,}+)
    // If I've not messed this up, this is correct
    template <typename R, typename Iterator, typename EndIterator, size_t A0, size_t B0, size_t A1, size_t B1, typename... Content, typename... Tail>
    constexpr CTRE_FORCE_INLINE R evaluate(const Iterator begin, Iterator current, const EndIterator end, R captures, ctll::list<possessive_repeat<A0, B0, Content...>, lazy_repeat<A1, B1, Content...>, Tail...>) {
    	if constexpr (B0 == 0ULL && A0 > 0ULL) {
    		//consumes all of Content..., then rejects b/c can't backtrack for an additional Content...
    		return not_matched;
    	} else if (B0 == 0ULL && A0 == 0ULL) {
    		return evaluate(begin, current, end, captures, ctll::list<possessive_repeat<A0, (B0 == 0ULL || B1 == 0ULL) ? (0ULL) : (B0 + B1), Content...>, Tail...>());
    	} else {
    		return evaluate(begin, current, end, captures, ctll::list<possessive_repeat<B0 + A1, (B0 == 0ULL || B1 == 0ULL) ? (0ULL) : (B0 + B1), Content...>, Tail...>());
    	}
    }
    template <typename R, typename Iterator, typename EndIterator, size_t A0, size_t B0, typename... Content, typename... Tail>
    constexpr CTRE_FORCE_INLINE R evaluate(const Iterator begin, Iterator current, const EndIterator end, R captures, ctll::list<possessive_repeat<A0, B0, Content...>, lazy_optional<Content...>, Tail...>) {
    	constexpr size_t A1 = 0; constexpr size_t B1 = 1;
    	if constexpr (B0 == 0ULL && A0 > 0ULL) {
    		//consumes all of Content..., then rejects b/c can't backtrack for an additional Content...
    		return not_matched;
    	} else if (B0 == 0ULL && A0 == 0ULL) {
    		return evaluate(begin, current, end, captures, ctll::list<possessive_repeat<A0, (B0 == 0ULL || B1 == 0ULL) ? (0ULL) : (B0 + B1), Content...>, Tail...>());
    	} else {
    		return evaluate(begin, current, end, captures, ctll::list<possessive_repeat<B0 + A1, (B0 == 0ULL || B1 == 0ULL) ? (0ULL) : (B0 + B1), Content...>, Tail...>());
    	}
    }
    template <typename R, typename Iterator, typename EndIterator, size_t A0, size_t B0, typename... Content, typename... Tail>
    constexpr CTRE_FORCE_INLINE R evaluate(const Iterator begin, Iterator current, const EndIterator end, R captures, ctll::list<possessive_repeat<A0, B0, Content...>, lazy_star<Content...>, Tail...>) {
    	constexpr size_t A1 = 0; constexpr size_t B1 = 0;
    	if constexpr (B0 == 0ULL && A0 > 0ULL) {
    		//consumes all of Content..., then rejects b/c can't backtrack for an additional Content...
    		return not_matched;
    	} else if (B0 == 0ULL && A0 == 0ULL) {
    		return evaluate(begin, current, end, captures, ctll::list<possessive_repeat<A0, (B0 == 0ULL || B1 == 0ULL) ? (0ULL) : (B0 + B1), Content...>, Tail...>());
    	} else {
    		return evaluate(begin, current, end, captures, ctll::list<possessive_repeat<B0 + A1, (B0 == 0ULL || B1 == 0ULL) ? (0ULL) : (B0 + B1), Content...>, Tail...>());
    	}
    }
    template <typename R, typename Iterator, typename EndIterator, size_t A0, size_t B0, typename... Content, typename... Tail>
    constexpr CTRE_FORCE_INLINE R evaluate(const Iterator begin, Iterator current, const EndIterator end, R captures, ctll::list<possessive_repeat<A0, B0, Content...>, lazy_plus<Content...>, Tail...>) {
    	constexpr size_t A1 = 1; constexpr size_t B1 = 0;
    	if constexpr (B0 == 0ULL && A0 > 0ULL) {
    		//consumes all of Content..., then rejects b/c can't backtrack for an additional Content...
    		return not_matched;
    	} else if (B0 == 0ULL && A0 == 0ULL) {
    		return evaluate(begin, current, end, captures, ctll::list<possessive_repeat<A0, (B0 == 0ULL || B1 == 0ULL) ? (0ULL) : (B0 + B1), Content...>, Tail...>());
    	} else {
    		return evaluate(begin, current, end, captures, ctll::list<possessive_repeat<B0 + A1, (B0 == 0ULL || B1 == 0ULL) ? (0ULL) : (B0 + B1), Content...>, Tail...>());
    	}
    }
    
    opened by Andersama 14
  • Including ctre.hpp produces errors in MSVC

    Hey,

    I installed CTRE via vcpkg and I can include the file just fine, but when I do I get the following:

    (screenshot of the errors, not preserved)

    Do I need to include something else? Create a special #define? To note that these errors appear as soon as I include the file ctre.hpp, there is no ctre regex code just the file.

    Here is MSVC info:

    Microsoft Visual Studio Community 2019
    Version 16.3.6
    VisualStudio.16.Release/16.3.6+29418.71
    Microsoft .NET Framework
    Version 4.8.03752
    
    Installed Version: Community
    
    Visual C++ 2019   00435-60000-00000-AA806
    Microsoft Visual C++ 2019
    
    opened by AntonioCS 10
  • I didn't find any instructions on how to build this Library on MSVC.

    Hi hanickadot, thanks for this library at first.

    We want to build this lib on MSVC, but I didn't find any instructions. Could you tell me where I can get it? Or can you provide the build steps for me? If you need any other question please let me know. Thank you very much.

    Our environment: VS 2017 + Windows Server 2016

    opened by QuellaZhang 10
  • Wrapped optionals should probably collapse.

    In playing around with some regexs that have optionals (to work out maybe some fixes for the long compile times). I found that (?:a?)? generates ctre::optional<ctre::optional<ctre::character<97>>> when it should probably generate or evaluate as ctre::optional<ctre::character<97>>.

    EG: probably should have

    // wrapped optional
    template <typename R, typename Iterator, typename EndIterator, typename... Content, typename... Tail>
    constexpr CTRE_FORCE_INLINE R evaluate(const Iterator begin, Iterator current, const EndIterator end, R captures, ctll::list<optional<optional<Content...>>, Tail...>) noexcept {
    	if (auto r1 = evaluate(begin, current, end, captures, ctll::list<sequence<Content...>, Tail...>())) {
    		return r1;
    	} else if (auto r2 = evaluate(begin, current, end, captures, ctll::list<Tail...>())) {
    		return r2;
    	} else {
    		return not_matched;
    	}
    }
    
    // wrapped lazy optional
    template <typename R, typename Iterator, typename EndIterator, typename... Content, typename... Tail> 
    constexpr CTRE_FORCE_INLINE R evaluate(const Iterator begin, Iterator current, const EndIterator end, R captures, ctll::list<lazy_optional<lazy_optional<Content...>>, Tail...>) noexcept {
    	if (auto r1 = evaluate(begin, current, end, captures, ctll::list<Tail...>())) {
    		return r1;
    	} else if (auto r2 = evaluate(begin, current, end, captures, ctll::list<sequence<Content...>, Tail...>())) {
    		return r2;
    	} else {
    		return not_matched;
    	}
    }
    
    // wrapped optional in lazy optional (optional processes first, supersedes lazy inside)
    template <typename R, typename Iterator, typename EndIterator, typename... Content, typename... Tail>
    constexpr CTRE_FORCE_INLINE R evaluate(const Iterator begin, Iterator current, const EndIterator end, R captures, ctll::list<lazy_optional<optional<Content...>>, Tail...>) noexcept {
    	if (auto r1 = evaluate(begin, current, end, captures, ctll::list<sequence<Content...>, Tail...>())) {
    		return r1;
    	} else if (auto r2 = evaluate(begin, current, end, captures, ctll::list<Tail...>())) {
    		return r2;
    	} else {
    		return not_matched;
    	}
    }
    
    // wrapped lazy optional in optional (lazy optional processes first, supersedes optional inside)
    template <typename R, typename Iterator, typename EndIterator, typename... Content, typename... Tail> 
    constexpr CTRE_FORCE_INLINE R evaluate(const Iterator begin, Iterator current, const EndIterator end, R captures, ctll::list<optional<lazy_optional<Content...>>, Tail...>) noexcept {
    	if (auto r1 = evaluate(begin, current, end, captures, ctll::list<Tail...>())) {
    		return r1;
    	} else if (auto r2 = evaluate(begin, current, end, captures, ctll::list<sequence<Content...>, Tail...>())) {
    		return r2;
    	} else {
    		return not_matched;
    	}
    }
    
    opened by Andersama 8
  • infinite loop if cycle within cycle

    The following expression enters an infinite loop! [Compiler: clang 8.0 with -std=c++2a]

    R"(^(([^\\{}]*(\\.)?[^\\{}]*)*\{\}([^\\{}]*(\\.)?[^\\{}]*)*){2}$)"_ctre
            .match(R"(sds{}dsds{}ds)"sv);
    

    error message:

    ./utl/ctre.hpp:3671:63: note: constexpr evaluation hit maximum step limit; possible infinite loop?
            constexpr CTRE_FORCE_INLINE void matched() noexcept { _matched = true; }
                                                                  ^
    ./utl/ctre.hpp:3943:40: note: in call to '&captures._captures.head->matched()'
            _captures.template select<0>().matched();
                                           ^
    ./utl/ctre.hpp:4947:43: note: in call to '&captures->matched()'
        return captures.set_end_mark(current).matched();
    
    bug 
    opened by yaoxinliu 8
  • MSVC: first.hpp: error C2664: cannot convert argument 2 from 'ctll::list' to 'ctll::list<>

    One of the latest commits caused this regression with MSVC 16.1 Preview 2:

    29>C:\Projects\Libraries\ctre\compile-time-regular-expressions\include\ctre\first.hpp(244,9): error C2664: 'auto ctre::first<>(ctll::list<>,ctll::list<>) noexcept': cannot convert argument 2 from 'ctll::list' to 'ctll::list<>'
    29>C:\Projects\Libraries\ctre\compile-time-regular-expressions\include\ctre\first.hpp(244,9): error C2664:         with
    29>C:\Projects\Libraries\ctre\compile-time-regular-expressions\include\ctre\first.hpp(244,9): error C2664:         [
    29>C:\Projects\Libraries\ctre\compile-time-regular-expressions\include\ctre\first.hpp(244,9): error C2664:             A=ctre::negative_set<ctre::space_chars>
    29>C:\Projects\Libraries\ctre\compile-time-regular-expressions\include\ctre\first.hpp(244,9): error C2664:         ]

    opened by matbech 8
  • Ctre doesn't work when there is a "\r" at the end of string

    When using the slow "std::regex" it works :

    std::string data = "WORD\r";
    std::smatch matches;
    	std::regex_search(data, matches, std::regex("^(\\w+)"));
    
    	std::string match1 = matches[0];       // returns the match
    
    

    When using the fast ctre regex, the variable "auto matches" is NULL:

    std::string data = "WORD\r";
    std::string_view data_view = data;
        static constexpr auto elem = ctll::fixed_string{ "^(\\w+)" };
    	auto matches = ctre::match<elem>(data_view);              // matches is null here
    
    	std::string match1 = std::string(matches.get<1>());            // generates exception since matches is NULL
    

    Please fix this.

    I'm using Visual Studio with C++20.

    opened by beemibrahim 7
  • [MSVC] When building CTRE emits errors C2440 and C2912 on MSVC for amd_64 and x_86

    System information: Windows 10

    Issue description: Hi all, CTRE fails to build on MSVC due to errors C2440 and C2912. I have reproduced these errors in the latest fe build. Could you help take a look?

    Steps to reproduce

    1. Open the VS2019 x86 (or VS2019 x64) tools command prompt
    2. git clone https://github.com/hanickadot/compile-time-regular-expressions F:\gitP\hanickadot\compile-time-regular-expressions
    3. cd F:\gitP\hanickadot\compile-time-regular-expressions\build_x86 (or mkdir build_amd64 and cd F:\gitP\hanickadot\compile-time-regular-expressions\build_amd64)
    4. set CL=/std:c++latest /utf-8 /permissive-
    5. \vcfs\Builds\msvc\fe\20220316.02\binaries.x86chk\bin\i386\CL.exe /std:c++latest /TP _unicode.i (or \vcfs\Builds\msvc\fe\20220316.02\binaries.amd64chk\bin\amd64\CL.exe /std:c++latest /TP _unicode.i)

    Build.log:

    buildfor64.log buildfor86.log

    Error info:

    F:\gitP\hanickadot\compile-time-regular-expressions\include\unicode-db/unicode-db.hpp(344,31): error C2440: 'initializing': cannot convert from 'std::_String_view_iterator<_Traits>' to 'const char *' [F:\gitP\hanickadot\compile-time-regular-expressions\build_x86\tests\ctre-test-_unicode.vcxproj]
            with
            [
                _Traits=std::char_traits
            ]
    F:\gitP\hanickadot\compile-time-regular-expressions\include\unicode-db/unicode-db.hpp(344,19): message : No user-defined-conversion operator available that can perform this conversion, or the operator cannot be called [F:\gitP\hanickadot\compile-time-regular-expressions\build_x86\tests\ctre-test-_unicode.vcxproj]

    F:\gitP\hanickadot\compile-time-regular-expressions\include\unicode-db/unicode-db.hpp(2994,1): error C2912: explicit specialization 'bool uni::cp_isuni::category::co(char32_t)' is not a specialization of a function template [F:\gitP\hanickadot\compile-time-regular-expressions\build_x86\tests\ctre-test-_unicode.vcxproj]

    other attachments _unicodefor64.i.txt _unicodefor86.i.txt

    opened by Apriltanq 6
  • fix several compiler errors when using msvc with unicode headers

    Of particular note is an MSVC bug where it does not include the enum type in a mangled type's name when you use only the value. I managed to work around this by including the enum type in the type name, separately from the value. See this godbolt link for a simple demonstration of the problem.

    opened by Ryan-rsm-McKenzie 6
  • Warning C4702: unreachable code

    Using:

    • ctre/3.3.4
    • MSVC/16.7.1

    Pretty much any regex using a quantifier (e.g. .{2}) results in a handful of unreachable code warnings. Matching with that regex works as expected, however. When working with /WX this is still problematic.

    The warning comes from right here: ctre\evaluation.hpp(343): warning C4702: unreachable code. The subsequent warnings have the same root cause.

    #ifndef CTRE_DISABLE_GREEDY_OPT
    	if constexpr (!collides(calculate_first(Content{}...), calculate_first(Tail{}...))) {
    		return evaluate(begin, current, end, f, captures, ctll::list<possessive_repeat<A,B,Content...>, Tail...>());
    	}
    #endif
    

    I believe right here is the problem, that if constexpr has no else clause.

    wontfix 
    opened by Malacath-92 6
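The shape described in the last report above, an if constexpr that returns and is followed by further statements, can be sketched in isolation. The function names below are hypothetical and the example does not use CTRE; whether a particular compiler warns about the first form depends on its version and flags.

```cpp
#include <type_traits>

// Shape from the report: when the condition is true, the statements after
// the if-block are still instantiated even though they can never run.
template <typename T>
constexpr int classify_fallthrough(T) {
    if constexpr (std::is_integral_v<T>) {
        return 1;
    }
    return 0;
}

// With an else branch, the discarded branch is not instantiated,
// so no statement is left behind after the return.
template <typename T>
constexpr int classify_with_else(T) {
    if constexpr (std::is_integral_v<T>) {
        return 1;
    } else {
        return 0;
    }
}

static_assert(classify_fallthrough(42) == 1);
static_assert(classify_fallthrough(4.2) == 0);
static_assert(classify_with_else(42) == 1);
static_assert(classify_with_else(4.2) == 0);
```

Both forms return the same values; they differ only in whether code remains after the return in a given instantiation.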
Releases(v3.7.1)
  • v3.7.1(Oct 1, 2022)

    • switching modes inside pattern (pattern: (?i)AB(?c)AB matches string: "abAB")
    • better detection of char8_t support
    • fix build failure in libc++ debug mode
    • remove some shadowing warnings
  • v3.7(Jun 1, 2022)

    Support for:

    • positive lookbehind (?<=abc) and negative lookbehind (?<!abc)
    • case-insensitive matching (ASCII only): ctre::OPERATION<"regex", ctre::case_insensitive>(subject_or_range) (example: ctre::match<"[a-z]+", ctre::case_insensitive>(input))

    Notes: don't use mixed-case ranges when matching with case_insensitive:

    • [a-z] will match a-zA-Z
    • [a-Z] won't match anything (as ascii(a) > ascii(Z))
    • [A-z] will match more than you expect: A-Z, a-z, and the characters [ \ ] ^ _ ` that lie between them in ASCII
  • v3.6(Mar 20, 2022)

    • [[:punct:]] behaves as ispunct in C
    • fix for several MSVC warnings
    • cmake library is using C++20 by default (you can change it back with -DCTRE_CXX_STANDARD=17)
    • CI/CD for all supported compilers (+ some fixes along the way)
  • v3.5(Jan 23, 2022)

  • v3.4(Apr 17, 2021)

  • v3.3.4(Nov 8, 2020)

  • v3.3.2(Nov 7, 2020)

  • v3.3.1(Nov 7, 2020)

    Fixes regression #131.

    • prepare infrastructure for flags (case insensitiveness, multi-line support)
    • new string comparison without recursion
  • v3.3(Nov 7, 2020)

  • v3.2(Oct 18, 2020)

    • update unicode database for missing scripts
    • fix in ctre::utf8_iterator::operator* for multibyte characters for characters which have highest bit in first UTF8 unit used
    • remove LIKELY/UNLIKELY for older GCC to remove warning
    • fix missing rotated comparison of utf8_iterator and sentinel
  • v3.1(Oct 17, 2020)

    Now you can use std::u8string and std::u8string_view with CTRE! 🎉

    std::optional<std::u8string> match(std::u8string_view subject) {
    	if (auto res = ctre::match<"(\\p{emoji}+)">(subject)) {
    		return res.get<1>().to_string();
    	} else {
    		return std::nullopt;
    	}
    }
    

    Also you can use ctre::utf8_range to convert between unicode encodings:

    std::u32string convert(std::u8string_view input) {
    	auto r = ctre::utf8_range(input);
    	return std::u32string(r.begin(), r.end());
    }
    
  • v3.0.1(Oct 16, 2020)

    If you include <unicode-db.hpp> CTRE will support unicode properties. If you don't want to include two headers, you can just include <ctre-unicode.hpp>.

    Have fun! 🤯

    Thanks to @cor3ntin for making it possible. ❤️

  • v2.10(Sep 9, 2020)

    • support for starts_with which is equivalent to search<"^...">
    • enable optimising search<"^..."> to generate better assembly (https://compiler-explorer.com/z/nvjG4E)
    • added vertical and horizontal white space (\v, \V, \h, \H)
  • v2.8.3(May 22, 2020)

    • new code for extracting capture content with result.get<...>() functions, which should instantiate O(1) on C++20 compatible compilers with CNTTP

    simple benchmark on GCC shows 50% better compile time in capture heavy code (tests/gets.cpp)

  • v2.6.4(Apr 30, 2019)

  • v2.6.3(Apr 30, 2019)

    If your regex contains a non-colliding greedy cycle, e.g. [a-z]+[^a-z]+, the cycle(s) will be optimized into possessive one(s).

    You can disable the optimization with macro CTRE_DISABLE_GREEDY_OPT.

  • v2.2(Oct 22, 2018)

    Support for range and iterators.

    auto input = "123,456,768"sv;
    
    using namespace ctre::literals;
    for (auto match: ctre::range(input,"[0-9]++"_ctre)) {
    	std::cout << std::string_view{match} << "\n";
    }
    
  • cppcon2018(Oct 2, 2018)

Owner
Hana Dusíková
Scientist in Avast. Author of Compile Time Regular Expressions. Chair of SG7 "Compile-Time Programming Study Group" in WG21 "ISO C++ Committee". Czech NB in WG2
High-performance regular expression matching library

Hyperscan Hyperscan is a high-performance multiple regex matching library. It follows the regular expression syntax of the commonly-used libpcre libra

Intel Corporation 4k Jan 1, 2023
regular expression library

Oniguruma https://github.com/kkos/oniguruma Oniguruma is a modern and flexible regular expressions library. It encompasses features from different reg

K.Kosako 1.9k Jan 3, 2023
A small implementation of regular expression matching engine in C

cregex cregex is a compact implementation of regular expression (regex) matching engine in C. Its design was inspired by Rob Pike's regex-code for the

Jim Huang 72 Dec 6, 2022
Easier CPP interface to PCRE regex engine with global match and replace

RegExp Easier CPP interface to PCRE regex engine with global match and replace. I was looking around for better regex engine than regcomp for my C/C++

Yasser Asmi 5 May 21, 2022
C++ regular expressions made easy

CppVerbalExpressions C++ Regular Expressions made easy VerbalExpressions is a C++11 Header library that helps to construct difficult regular expressio

null 362 Nov 29, 2022
Perl Incompatible Regular Expressions library

This is PIRE, Perl Incompatible Regular Expressions library. This library is aimed at checking a huge amount of text against relatively many regular

Yandex 320 Oct 9, 2022
Onigmo is a regular expressions library forked from Oniguruma.

Onigmo (Oniguruma-mod) https://github.com/k-takata/Onigmo Onigmo is a regular expressions library forked from Oniguruma. It focuses to support new exp

K.Takata 585 Dec 6, 2022
A non-backtracking NFA/DFA-based Perl-compatible regex engine matching on large data streams

Name libsregex - A non-backtracking NFA/DFA-based Perl-compatible regex engine library for matching on large data streams Table of Contents Name Statu

OpenResty 600 Dec 22, 2022
RE2 is a fast, safe, thread-friendly alternative to backtracking regular expression engines like those used in PCRE, Perl, and Python. It is a C++ library.

This is the source code repository for RE2, a regular expression library. For documentation about how to install and use RE2, visit https://github.co

Google 7.5k Jan 4, 2023
Header-only ECMAScript (JavaScript) compatible regular expression engine

SRELL (std::regex-like library) is a regular expression template library for C++ and has native support for UTF-8, UTF-16, and UTF-32. This is up-to-d

Dmitry Atamanov 4 Mar 11, 2022
A compile-time enabled Modern C++ library that provides compile-time dimensional analysis and unit/quantity manipulation.

mp-units - A Units Library for C++ The mp-units library is the subject of ISO standardization for C++23/26. More on this can be found in ISO C++ paper

Mateusz Pusz 679 Dec 29, 2022
Love 6's Regular Expression Engine. Support Concat/Select/Closure Basic function. Hope u can enjoy this tiny engine :)

Regex_Engine Love 6's Blog Website: https://love6.blog.csdn.net/ Love 6's Regular Expression Engine Hope u can love my tiny regex engine :) maybe a fe

Love6 2 May 24, 2022
Fast regular expression grep for source code with incremental index updates

Fast regular expression grep for source code with incremental index updates

Arseny Kapoulkine 261 Dec 28, 2022
A portable fork of the high-performance regular expression matching library

Vectorscan? A fork of Intel's Hyperscan, modified to run on more platforms. Currently ARM NEON/ASIMD is 100% functional, and Power VSX are in developm

VectorCamp 275 Dec 26, 2022
Jinja2 C++ (and for C++) almost full-conformance template engine implementation

Jinja2С++ C++ implementation of the Jinja2 Python template engine. This library brings support of powerful Jinja2 template features into the C++ world

Jinja2C++ project 385 Dec 17, 2022
Metin2 Resource Dumper/Extractor Tool. Dump 100% of the resources from almost any Metin2 Client

PackDumper Metin2 Resource Dumper/Extractor Tool. Dump 100% of the resources from almost any Metin2 Client How to Compile ✔️ Clone the project and com

null 12 Aug 11, 2022