This is the source code repository for RE2, a regular expression library.

For documentation about how to install and use RE2, visit https://github.com/google/re2/.

The short version is:

    make
    make test
    make install
    make testinstall

There is a fair amount of documentation (including code snippets) in the re2.h header file.

More information can be found on the wiki: https://github.com/google/re2/wiki

Issue tracker: https://github.com/google/re2/issues
Mailing list: https://groups.google.com/group/re2-dev

Unless otherwise noted, the RE2 source files are distributed under the BSD-style license found in the LICENSE file.

RE2's native language is C++.

The Python wrapper is at https://github.com/google/re2/tree/abseil/python and on PyPI (https://pypi.org/project/google-re2/).
A C wrapper is at https://github.com/marcomaggi/cre2/.
An Erlang wrapper is at https://github.com/dukesoferl/re2/ and on Hex (hex.pm).
An Inferno wrapper is at https://github.com/powerman/inferno-re2/.
A Node.js wrapper is at https://github.com/uhop/node-re2/ and on NPM (npmjs.com).
An OCaml wrapper is at https://github.com/janestreet/re2/ and on OPAM (opam.ocaml.org).
A Perl wrapper is at https://github.com/dgl/re-engine-RE2/ and on CPAN (cpan.org).
An R wrapper is at https://github.com/qinwf/re2r/ and on CRAN (cran.r-project.org).
A Ruby wrapper is at https://github.com/mudge/re2/ and on RubyGems (rubygems.org).
A WebAssembly wrapper is at https://github.com/google/re2-wasm/ and on NPM (npmjs.com).

RE2 is a fast, safe, thread-friendly alternative to backtracking regular expression engines like those used in PCRE, Perl, and Python. It is a C++ library.
Overview
Comments
-
Tweaks to cmake-based build needed for packaging
I've been asked to use CMake to build re2 on Arch Linux so it can be consumed by other CMake-based projects like gRPC. There are a couple of issues with this:
- We still want to ship re2.pc as part of a cmake-based build (reverting https://github.com/google/re2/commit/5bd613749fd530b576b890283bfb6bc6ea6246cb would solve this, I think).
- The cmake-based build installs the re2.so library without a soname version. Something like the following seems to work:
set_target_properties(re2 PROPERTIES VERSION 8.0.0 SOVERSION 8)
Of course, there would need to be some way to keep the soname consistent between Makefile and CMakeLists.txt. A single file containing just the major soname version would be easy to read from both build systems. I'm not sure if there's another solution that would be better here.
-
pip install fails on Macbook Pro M1
I apologize if I'm missing something, but I can't seem to get re2 installed on my MacBook running macOS 12.0.1 with an M1 Pro chip. I'm running Python 3.9 in a virtual environment (a PyCharm VM), and I also successfully installed pybind11 v2.8.1. The error seems to be:
fatal error: too many errors emitted, stopping now [-ferror-limit=]
29 warnings and 20 errors generated.
error: command '/usr/bin/gcc' failed with exit code 1
ERROR: Failed building wheel for google-re2
ERROR: Command errored out with exit status 1:
The full output is much longer than this. Any help would be much appreciated.
-
Building with USEPCRE in MSVC 2013 gives link errors: entry point pcre_free multiply defined.
Compiling the "test" project with USEPCRE defined, in MSVC 2013, I get warning messages that entry point pcre_free is defined in different programs:
1>tester.obj : warning LNK4006: pcre_free already defined in pcre.obj; second definition ignored
1>exhaustive_tester.obj : warning LNK4006: pcre_free already defined in pcre.obj; second definition ignored
1> test.vcxproj -> E:\src\re2\bin\Debug\test.lib
If I try to build any program that uses test.lib with USEPCRE defined, I get similar messages but they are errors, not warnings, and the link fails:
1>------ Build started: Project: exhaustive_test, Configuration: Debug x64 ------
1>test.lib(tester.obj) : error LNK2005: pcre_free already defined in test.lib(exhaustive_tester.obj)
1>test.lib(pcre.obj) : error LNK2005: pcre_free already defined in test.lib(exhaustive_tester.obj)
1>pcre.lib(pcre_globals.obj) : error LNK2005: pcre_free already defined in test.lib(exhaustive_tester.obj)
1>LINK : warning LNK4098: defaultlib 'MSVCRT' conflicts with use of other libs; use /NODEFAULTLIB:library
1>E:\src\re2\bin\Debug\exhaustive_test.exe : fatal error LNK1169: one or more multiply defined symbols found
(Since PCRE is not native to Windows, I downloaded PCRE version 8.37, built it, and put pcre.lib and pcre.h in the default lib and include directories).
I have no idea how this could happen. It looks as if tester.obj and exhaustive_tester.obj contain definitions of pcre_free, but looking at them, I don't see how they could. I am quite mystified.
Does this even matter? Do we need to be able to run the re2 tests with PCRE when building with MSVC?
-
It fails to compile using g++ 6.1.x
g++ -c -o obj/dbg/re2/dfa.o -std=c++11 -pthread -Wall -Wextra -Wno-unused-parameter -Wno-missing-field-initializers -I. -O3 -g re2/dfa.cc
re2/dfa.cc: In constructor 're2::DFA::State::State()':
re2/dfa.cc:95:10: error: unknown array size in 'delete'
struct State {
^~~~~
re2/dfa.cc: In member function 're2::DFA::State* re2::DFA::CachedState(int*, int, re2::uint)':
re2/dfa.cc:703:9: note: synthesized method 're2::DFA::State::State()' first required here
State state;
^~~~~
make: *** [Makefile:190: obj/dbg/re2/dfa.o] Error 1
When I forced it to use clang++ with "make CXX=clang++ test", it worked.
Versions:
clang version 3.8.0 (tags/RELEASE_380/final)
g++ (GCC) 6.1.1 20160707
The compiler messages above are translated into English from the original German output.
-
testinstall hangs when compiled with --static
On Ubuntu 14.10 64 or 32-bit, which has gcc version 4.9.1, compiling testinstall.cc with --static results in a program that hangs. I don't see this problem on other platforms.
git clone https://github.com/google/re2.git
cd re2
make clean
make
g++ --static testinstall.cc -L obj -lre2 -I . -pthread
./a.out
The a.out program hangs, but if you take away --static, it does not hang.
-
continuous fuzzing for re2
Let's fuzz re2 to make it even more reliable! I've set up a public fuzzing bot for re2. Right now it uses libFuzzer and AddressSanitizer, and tests a very simple target function.
This bug will be tracking the process of improving and extending this fuzzer. Bugs found by this process (if any) will be reported separately.
-
How to build Google RE2 for Mac OS(High Sierra), did anybody try?
When I use make test, I get this error:
Running dynamic binary tests.
obj/so/test/charclass_test PASS
obj/so/test/compile_test PASS
obj/so/test/filtered_re2_test FAIL; output in obj/so/test/filtered_re2_test.log
obj/so/test/mimics_pcre_test PASS
obj/so/test/parse_test PASS
obj/so/test/possible_match_test FAIL; output in obj/so/test/possible_match_test.log
obj/so/test/re2_test FAIL; output in obj/so/test/re2_test.log
obj/so/test/re2_arg_test FAIL; output in obj/so/test/re2_arg_test.log
obj/so/test/regexp_test PASS
obj/so/test/required_prefix_test PASS
obj/so/test/search_test FAIL; output in obj/so/test/search_test.log
obj/so/test/set_test FAIL; output in obj/so/test/set_test.log
obj/so/test/simplify_test PASS
obj/so/test/string_generator_test PASS
TESTS FAILED.
make: *** [shared-test] Error 1
-
Eliminate compiler warnings for MSVC
I have changes, mostly using static_cast, that eliminate all compiler warnings for MSVC. What do I do to upload them? I am only somewhat familiar with Git.
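As a rough illustration of the kind of change described (the actual patch is not shown here, so this is only a sketch): on 64-bit builds MSVC typically warns when a size_t is implicitly narrowed to int ("conversion from 'size_t' to 'int', possible loss of data", C4267). An explicit static_cast states the intent and silences the warning. The helper name ItemCount is hypothetical and not part of RE2.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <string>
#include <vector>

// Hypothetical helper (not part of RE2): returns the element count as an int.
int ItemCount(const std::vector<std::string>& items) {
  // Guard the narrowing: the cast below is only safe if the size fits in int.
  assert(items.size() <=
         static_cast<std::size_t>(std::numeric_limits<int>::max()));
  // Writing `return items.size();` would narrow size_t to int implicitly,
  // which is what MSVC warns about; the explicit cast makes it deliberate.
  return static_cast<int>(items.size());
}
```

A cast only hides the warning, so pairing it with a range check (here a debug-only assert) is one way to keep the narrowing honest.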
-
regexp_benchmark error: Check failed: (prog->SearchDFA
The following error occurs when compiled with MSVC 2013 (debug or release) with PCRE:
Check failed: (prog->SearchDFA Check failed: (prog->SearchDFA
Search_BigFixed_CachedDFA/4K 5000 335033 ns/op 12.23 MB/s
Search_BigFixed_CachedDFA/8K 1000 1331514 ns/op 6.15 MB/s
Search_BigFixed_CachedDFA/16K 200 5568419 ns/op 2.94 MB/s
Search_BigFixed_CachedDFA/32K 1 1945298081 ns/op 0.02 MB/s
Search_BigFixed_CachedDFA/64K 1 6533983296 ns/op 0.01 MB/s
e:\src\re2\re2\testing\regexp_benchmark.cc:899: Check failed: (prog->SearchDFA(text, 0, anchor, Prog::kFirstMatch, 0, &failed, 0)) == (expect_match)
I will try compiling without PCRE and see if I get the same result.
-
iOS got rejected due to legal issue (Guideline 5.0 - Legal) Syria
We use gRPC in our app, and gRPC-Core uses re2. Our iOS app was rejected due to a legal issue.
Guideline 5.0 - Legal In distributing apps on the App Store, Apple must comply with U.S. laws. Under U.S. sanctions regulations, the App Store cannot host, distribute or do business with certain apps or developers connected to U.S. embargoed countries or regions. This also extends to Syria, which is currently subject to comprehensive restrictions. We have identified that your app contains functionality that is connected to a U.S. embargoed territory. Specifically, your app includes the following link/s in the binary: Syria
I checked our app and found the following code:
https://github.com/google/re2/blob/885eb38accf49e2ccdd2fa6786f3590cf40a3e23/re2/unicode_groups.cc#L6086
https://github.com/google/re2/blob/885eb38accf49e2ccdd2fa6786f3590cf40a3e23/re2/unicode_groups.cc#L6402
Do I need to delete the code mentioned above to get our app approved for the App Store?
-
Provide proper shared library versioning and cmake support
re2 is in the critical path of several large projects that rely on cmake builds of grpc, apache-arrow, and others, including several tensorflow-related and other Google projects like magenta. See:
- https://bugs.archlinux.org/task/67739
- https://gist.github.com/mtorromeo/1d48de16534dc8ee29cd872f94b070e5
- https://github.com/macports/macports-ports/pull/9836
Yet a basic build chain for re2 in common environments (macOS, Linux) isn't properly supported, and what support exists is provided "unenthusiastically", forcing hacky backflips like this, this, and this just to provide shared library versioning.
Would you please provide support for shared library versioning or, if not enthusiastically, at least basic cmake support that gets the versioning correct?
Other builds rely on all of these, and for now these issues must be hacked around or ignored.
Releases(2022-12-01)
-
2022-12-01(Nov 30, 2022)
-
2022-06-01(May 31, 2022)
-
2022-04-01(Mar 31, 2022)
-
2022-02-01(Jan 31, 2022)
-
2021-11-01(Nov 1, 2021)
-
2021-09-01(Sep 1, 2021)
-
2021-08-01(Aug 1, 2021)
-
2021-06-01(Jun 1, 2021)
-
2021-04-01(Apr 1, 2021)
-
2021-02-02(Feb 2, 2021)
-
2021-02-01(Jan 31, 2021)
-
2020-11-01(Nov 1, 2020)
-
2020-10-01(Oct 1, 2020)
-
2020-08-01(Aug 1, 2020)
-
2020-07-06(Jul 6, 2020)
-
2020-07-01(Jul 1, 2020)
-
2020-06-01(Jun 1, 2020)
-
2020-05-01(May 1, 2020)
-
2020-04-01(Apr 1, 2020)
-
2020-03-03(Mar 3, 2020)
-
2020-03-02(Mar 2, 2020)
-
2020-03-01(Mar 1, 2020)
-
2020-01-01(Dec 31, 2019)
-
2019-12-01(Dec 1, 2019)
-
2019-11-01(Nov 1, 2019)
-
2019-09-01(Sep 1, 2019)
-
2019-08-01(Aug 1, 2019)
-
2019-07-01(Jul 1, 2019)
-
2019-06-01(Jun 1, 2019)
-
2019-04-01(Apr 2, 2019)
A non-backtracking NFA/DFA-based Perl-compatible regex engine matching on large data streams
Name libsregex - A non-backtracking NFA/DFA-based Perl-compatible regex engine library for matching on large data streams Table of Contents Name Statu
Perl Incompatible Regular Expressions library
This is PIRE, Perl Incompatible Regular Expressions library. This library is aimed at checking a huge amount of text against relatively many regular
High-performance regular expression matching library
Hyperscan Hyperscan is a high-performance multiple regex matching library. It follows the regular expression syntax of the commonly-used libpcre libra
regular expression library
Oniguruma https://github.com/kkos/oniguruma Oniguruma is a modern and flexible regular expressions library. It encompasses features from different reg
A small implementation of regular expression matching engine in C
cregex cregex is a compact implementation of regular expression (regex) matching engine in C. Its design was inspired by Rob Pike's regex-code for the
Easier CPP interface to PCRE regex engine with global match and replace
RegExp Easier CPP interface to PCRE regex engine with global match and replace. I was looking around for better regex engine than regcomp for my C/C++
Onigmo is a regular expressions library forked from Oniguruma.
Onigmo (Oniguruma-mod) https://github.com/k-takata/Onigmo Onigmo is a regular expressions library forked from Oniguruma. It focuses to support new exp
C++ regular expressions made easy
CppVerbalExpressions C++ Regular Expressions made easy VerbalExpressions is a C++11 Header library that helps to construct difficult regular expressio
The approximate regex matching library and agrep command line tool.
Introduction TRE is a lightweight, robust, and efficient POSIX compliant regexp matching library with some exciting features such as approximate (fuzz
SRL-CPP is a Simple Regex Language builder library written in C++11 that provides an easy to use interface for constructing both simple and complex regex expressions.
SRL-CPP SRL-CPP is a Simple Regex Language builder library written in C++11 that provides an easy to use interface for constructing both simple and co
Glob pattern to regex translator in C++11. Optionally, directory traversal with glob pattern in C++17. Header-only library.
A Compile time PCRE (almost) compatible regular expression matcher.
Compile time regular expressions v3 Fast compile-time regular expressions with support for matching/searching/capturing during compile-time or runtime
Fast regular expression grep for source code with incremental index updates
A library of type safe sets over fixed size collections of types or values, including methods for accessing, modifying, visiting and iterating over those.
cpp_enum_set A library of type safe sets over fixed size collections of types or values, including methods for accessing, modifying, visiting and iter
A portable fork of the high-performance regular expression matching library
Vectorscan? A fork of Intel's Hyperscan, modified to run on more platforms. Currently ARM NEON/ASIMD is 100% functional, and Power VSX are in developm