Glob for C++17

Overview

Unix-style pathname pattern expansion

Quick Start

  • This library is available in two flavors:
    1. Two file version: glob.h and glob.cpp
    2. Single header file version in single_include/
  • No external dependencies - just the standard library
  • Requires C++17 std::filesystem
  • MIT License

Build Library and Standalone Sample

cmake -Hall -Bbuild
cmake --build build

# run standalone `glob` sample
./build/standalone/glob --help

Usage

// Match on a single pattern
for (auto& p : glob::glob("~/.b*")) {                // e.g., .bash_history, .bashrc
  // do something with `p`
}

// Match on multiple patterns
for (auto& p : glob::glob({"*.png", "*.jpg"})) {     // e.g., foo.png, bar.jpg
  // do something with `p`
}

// Match recursively with `rglob`
for (auto& p : glob::rglob("**/*.hpp")) {            // e.g., include/foo.hpp, include/foo/bar.hpp
  // do something with `p`
}

API

/// e.g., glob("*.hpp")
/// e.g., glob("**/*.cpp")
/// e.g., glob("test_files_02/[0-9].txt")
/// e.g., glob("/usr/local/include/nc*.h")
/// e.g., glob("test_files_02/?.txt")
vector<filesystem::path> glob(string pathname);

/// Globs recursively
/// e.g., rglob("Documents/Projects/Foo/**/*.hpp")
/// e.g., rglob("test_files_02/*[0-9].txt")
vector<filesystem::path> rglob(string pathname);

There are also two convenience functions to glob on a list of patterns:

/// e.g., glob({"*.png", "*.jpg"})
vector<filesystem::path> glob(vector<string> pathnames);

/// Globs recursively
/// e.g., rglob({"**/*.h", "**/*.hpp", "**/*.cpp"})
vector<filesystem::path> rglob(vector<string> pathnames);
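
As an illustration of how a list-of-patterns overload can be layered on a single-pattern function, here is a small sketch. `glob_many` is a hypothetical name, and whether the library itself concatenates results this way (or de-duplicates them) is an assumption, not something stated above.

```cpp
#include <filesystem>
#include <string>
#include <vector>

// Hypothetical helper: applies `single_glob` to each pattern in turn and
// concatenates the results in pattern order. No de-duplication is done.
template <typename GlobFn>
std::vector<std::filesystem::path> glob_many(
    const std::vector<std::string>& patterns, GlobFn single_glob) {
  std::vector<std::filesystem::path> all;
  for (const auto& pattern : patterns) {
    auto matches = single_glob(pattern);
    all.insert(all.end(), matches.begin(), matches.end());
  }
  return all;
}
```

With the library available, the callable could be a lambda wrapping the single-pattern function, for example `[](const std::string& p) { return glob::glob(p); }`.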

Wildcards

Wildcard  Matches                                         Example
*         any characters                                  *.txt matches all files with the txt extension
?         any one character                               ??? matches file names that are exactly 3 characters long
[]        any character listed in the brackets            [ABC]* matches files starting with A, B or C
[-]       any character in the range listed in brackets   [A-Z]* matches files starting with capital letters
[!]       any character not listed in the brackets        [!ABC]* matches files that do not start with A, B or C
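
To make the wildcard rules concrete, here is a minimal, self-contained matcher for a single file name. It is a sketch under stated assumptions, not the library's code: `wildcard_match` and `match_bracket` are hypothetical names, path separators and hidden files get no special treatment, an unclosed `[` is treated as a literal character, and the naive backtracking on `*` can be slow for patterns with many stars.

```cpp
#include <cstddef>
#include <string_view>

// Tests `c` against the bracket expression starting at pat[pos] == '['.
// On return `end` is the index just past the closing ']', or npos if the
// bracket is never closed. A ']' placed first in the set is a literal.
inline bool match_bracket(std::string_view pat, std::size_t pos, char c,
                          std::size_t& end) {
  std::size_t i = pos + 1;
  bool negate = false;
  if (i < pat.size() && pat[i] == '!') { negate = true; ++i; }
  bool matched = false;
  bool first = true;
  while (i < pat.size() && (first || pat[i] != ']')) {
    first = false;
    if (i + 2 < pat.size() && pat[i + 1] == '-' && pat[i + 2] != ']') {
      if (pat[i] <= c && c <= pat[i + 2]) matched = true;  // range such as A-Z
      i += 3;
    } else {
      if (pat[i] == c) matched = true;  // single listed character
      ++i;
    }
  }
  if (i >= pat.size()) { end = std::string_view::npos; return false; }
  end = i + 1;
  return matched != negate;
}

// Returns true when the whole of `text` matches `pat`.
inline bool wildcard_match(std::string_view pat, std::string_view text) {
  std::size_t p = 0, t = 0;
  while (p < pat.size()) {
    const char pc = pat[p];
    if (pc == '*') {
      // '*' may consume any number of characters, including none.
      const std::string_view rest = pat.substr(p + 1);
      for (std::size_t k = t; k <= text.size(); ++k) {
        if (wildcard_match(rest, text.substr(k))) return true;
      }
      return false;
    }
    if (t >= text.size()) return false;
    if (pc == '?') { ++p; ++t; continue; }
    if (pc == '[') {
      std::size_t end = 0;
      const bool ok = match_bracket(pat, p, text[t], end);
      if (end != std::string_view::npos) {
        if (!ok) return false;
        p = end;
        ++t;
        continue;
      }
      // Unclosed '[': fall through and compare it literally.
    }
    if (pc != text[t]) return false;
    ++p;
    ++t;
  }
  return t == text.size();
}
```

For example, `wildcard_match("[A-Z]*", "Zed")` is true while `wildcard_match("[A-Z]*", "zed")` is false, mirroring the table rows above.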

Examples

The following examples use the standalone sample that is part of this repository to illustrate the library functionality.

foo@bar:~$ ./build/standalone/glob -h
Run glob to find all the pathnames matching a specified pattern
Usage:
  ./build/standalone/glob [OPTION...]

  -h, --help       Show help
  -v, --version    Print the current version number
  -r, --recursive  Run glob recursively
  -i, --input arg  Patterns to match

Match file extensions

foo@bar:~$ tree
.
├── include
│   └── foo
│       ├── bar.hpp
│       ├── baz.hpp
│       └── foo.hpp
└── test
    ├── bar.cpp
    ├── doctest.hpp
    ├── foo.cpp
    └── main.cpp

3 directories, 7 files

foo@bar:~$ ./glob -i "**/*.hpp"
"test/doctest.hpp"

foo@bar:~$ ./glob -i "**/**/*.hpp"
"include/foo/baz.hpp"
"include/foo/foo.hpp"
"include/foo/bar.hpp"

NOTE: If you run glob recursively, i.e., using rglob, the same pattern also matches files in nested directories:

foo@bar:~$ ./glob -r -i "**/*.hpp"
"test/doctest.hpp"
"include/foo/baz.hpp"
"include/foo/foo.hpp"
"include/foo/bar.hpp"

Match files in absolute pathnames

foo@bar:~$ ./glob -i '/usr/local/include/nc*.h'
"/usr/local/include/ncCheck.h"
"/usr/local/include/ncGroupAtt.h"
"/usr/local/include/ncUshort.h"
"/usr/local/include/ncByte.h"
"/usr/local/include/ncString.h"
"/usr/local/include/ncUint64.h"
"/usr/local/include/ncGroup.h"
"/usr/local/include/ncUbyte.h"
"/usr/local/include/ncvalues.h"
"/usr/local/include/ncInt.h"
"/usr/local/include/ncAtt.h"
"/usr/local/include/ncVar.h"
"/usr/local/include/ncUint.h"

Wildcards: Match a range of characters listed in brackets ('[]')

foo@bar:~$ ls test_files_02
1.txt 2.txt 3.txt 4.txt

foo@bar:~$ ./glob -i 'test_files_02/[0-9].txt'
"test_files_02/4.txt"
"test_files_02/3.txt"
"test_files_02/2.txt"
"test_files_02/1.txt"

foo@bar:~$ ./glob -i 'test_files_02/[1-2]*'
"test_files_02/2.txt"
"test_files_02/1.txt"

foo@bar:~$ ls test_files_03
file1.txt file2.txt file3.txt file4.txt

foo@bar:~$ ./glob -i 'test_files_03/file[0-9].*'
"test_files_03/file2.txt"
"test_files_03/file3.txt"
"test_files_03/file1.txt"
"test_files_03/file4.txt"

Exclude files from the matching

foo@bar:~$ ls test_files_01
__init__.py     bar.py      foo.py

foo@bar:~$ ./glob -i 'test_files_01/*[!__init__].py'
"test_files_01/bar.py"
"test_files_01/foo.py"

foo@bar:~$ ./glob -i 'test_files_01/*[!__init__][!bar].py'
"test_files_01/foo.py"

foo@bar:~$ ./glob -i 'test_files_01/[!_]*.py'
"test_files_01/bar.py"
"test_files_01/foo.py"

Wildcards: Match any one character with question mark ('?')

foo@bar:~$ ls test_files_02
1.txt 2.txt 3.txt 4.txt

foo@bar:~$ ./glob -i 'test_files_02/?.txt'
"test_files_02/4.txt"
"test_files_02/3.txt"
"test_files_02/2.txt"
"test_files_02/1.txt"

foo@bar:~$ ls test_files_03
file1.txt file2.txt file3.txt file4.txt

foo@bar:~$ ./glob -i 'test_files_03/????[3-4].txt'
"test_files_03/file3.txt"
"test_files_03/file4.txt"

Case sensitivity

glob matching is case-sensitive:

foo@bar:~$ ls test_files_05
file1.png file2.png file3.PNG file4.PNG

foo@bar:~$ ./glob -i 'test_files_05/*.png'
"test_files_05/file2.png"
"test_files_05/file1.png"

foo@bar:~$ ./glob -i 'test_files_05/*.PNG'
"test_files_05/file3.PNG"
"test_files_05/file4.PNG"

foo@bar:~$ ./glob -i "test_files_05/*.png","test_files_05/*.PNG"
"test_files_05/file2.png"
"test_files_05/file1.png"
"test_files_05/file3.PNG"
"test_files_05/file4.PNG"

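Because matching is case-sensitive, an alternative to listing both spellings is to glob broadly and post-filter the results. The sketch below shows one way to do that with the standard library only. `to_lower_ascii` and `has_extension_icase` are hypothetical helpers, not part of the library, and the comparison is ASCII-only.

```cpp
#include <algorithm>
#include <cctype>
#include <filesystem>
#include <string>

// Hypothetical helper: lowercases ASCII letters, leaving other bytes as-is.
inline std::string to_lower_ascii(std::string s) {
  std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
    return static_cast<char>(std::tolower(c));
  });
  return s;
}

// Hypothetical helper: true when the extension of `p` equals `ext`
// (for example ".png"), ignoring ASCII case.
inline bool has_extension_icase(const std::filesystem::path& p,
                                const std::string& ext) {
  return to_lower_ascii(p.extension().string()) == to_lower_ascii(ext);
}
```

A caller could glob "test_files_05/*" and keep only the paths for which `has_extension_icase(p, ".png")` returns true.
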
Tilde expansion

foo@bar:~$ ./glob -i "~/.b*"
"/Users/pranav/.bashrc"
"/Users/pranav/.bash_sessions"
"/Users/pranav/.bash_profile"
"/Users/pranav/.bash_history"

foo@bar:~$ ./glob -i "~/Documents/Projects/glob/**/glob/*.h"
"/Users/pranav/Documents/Projects/glob/include/glob/glob.h"

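Tilde expansion amounts to replacing a leading "~" with the user's home directory before matching. The following is a minimal sketch of that step, not the library's code. `expand_tilde` and `home_from_env` are hypothetical helpers, and reading HOME (or USERPROFILE on Windows) is an assumption about where the home directory comes from. The "~user" form is deliberately left untouched.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: replaces a leading "~" or "~/" with `home`.
// Patterns such as "~other/x" and patterns without a leading tilde are
// returned unchanged, as is any pattern when `home` is empty.
inline std::string expand_tilde(const std::string& pattern,
                                const std::string& home) {
  if (home.empty() || pattern.empty() || pattern[0] != '~') return pattern;
  if (pattern.size() > 1 && pattern[1] != '/') return pattern;
  return home + pattern.substr(1);
}

// Hypothetical helper: reads the home directory from the environment.
// Returns an empty string when the variable is not set.
inline std::string home_from_env() {
#ifdef _WIN32
  const char* home = std::getenv("USERPROFILE");
#else
  const char* home = std::getenv("HOME");
#endif
  return home ? std::string(home) : std::string();
}
```

Taking the home directory as a parameter keeps `expand_tilde` easy to test; a caller would typically pass `home_from_env()`.
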
Contributing

Contributions are welcome. Have a look at the CONTRIBUTING.md document for more information.

License

The project is available under the MIT license.

Comments
  • How to match [ and ] as individual characters ?

    Hi, I'm evaluating your library for use in a project of mine. I have looked quickly at the wildcards parsing code and it is not clear to me how the user can match the [ and ] characters, the documentation is also lacking on this aspect. From my understanding, the ] character by itself should work (also a [ character if there's no following ] anywhere in the pattern string), [[] should work at matching the [ character, but there's no way of matching [ OR ] together. Am I missing something ?

    opened by pgallo725 4
  • Fix some warnings in MSVC W4

    While using this library with MSVC we encountered some minor warnings. This should get rid of them.

    We are just using the single include variant; these fixes may be relevant to the other variant as well.

    opened by W4RH4WK 2
  • Add support for path beginning with parent directory (..)

    This PR adds support for paths starting with the parent directory (..). It modifies the is_hidden() function so that paths starting with parent directory ( ../ ) or current directory ( ./ ) should not be considered hidden.

    opened by fdinel 2
  • Use platform environment variable instead of platform function

    Using a platform-dependent function is not a clean solution for getting the value of the username environment variable. This PR checks the platform and uses the corresponding username environment variable name, calling std::getenv regardless of the target platform.

    opened by camielverdult 1
  • Export a method to check whether a path matches a glob

    Can you add/export a method where you can pass a path and a glob and it returns (bool) whether the glob matches?

    Looking through the code, I think that would be fnmatch_case() and/or filter().

    opened by Gei0r 1
  • Add CMake option to compile with `gulrak/filesystem`

    The README says:

    If you can't use C++17, you can integrate gulrak/filesystem with minimal effort.

    This adds a CMake option to support that without making any code changes.

    opened by rsmmr 0
  • not working on iOS

    I'm using this on several platforms and all work great, but now I need it on iOS as well and something like glob::glob("*.png") fails with The complexity of an attempted match against a regular expression exceeded a pre-set level

    Any idea how to solve this?

    opened by cdcseacave 0
  • A more distinct name?

    Nice work!

    Suggest to give this project and the package a more distinct name.

    There is a possibility that the include directory "glob" and namespace "glob" conflict with something else.

    Just like the so many json c++ libraries, although they are all json parser, they have distinct names: rapid json, nlohmann json ...

    No one takes the cxxglob or cppglob name yet. Perhaps use one of these?

    opened by zzhang97 0
  • want the parameter to support "filesystem::path"

    hope the parameter itself can also be filesystem::path

    vector<filesystem::path> glob(filesystem::path pathname);
    vector<filesystem::path> rglob(filesystem::path pathname);
    
    vector<filesystem::path> glob(vector<filesystem::path> pathnames);
    vector<filesystem::path> rglob(vector<filesystem::path> pathnames);
    
    opened by nblog 0
  • Add a simple license header that mentions the license and your name and email to the single_header file

    I'm imagining that most people who use the single file header include would just like to copy it in their project (at least on a temporary basis). I'd like to make sure credit is still given in that case.

    Something like this

    /*
     * Copyright (c) YYYY Author <email@address>
     * 
     *  Licensed under the MIT license: https://opensource.org/licenses/MIT
     *  Permission is granted to use, copy, modify, and redistribute the work.
     *  This file is part of: https://github.com/p-ranav
     */
    

    Adjust as you wish.

    opened by jrobeson 0
Releases(v0.0.1)
Owner
Pranav