Glob for C++17

Overview

Unix-style pathname pattern expansion

Quick Start

  • This library is available in two flavors:
    1. Two-file version: glob.h and glob.cpp
    2. Single-header version in single_include/
  • No external dependencies - just the standard library
  • Requires C++17 std::filesystem
  • MIT License

Build Library and Standalone Sample

cmake -Hall -Bbuild
cmake --build build

# run standalone `glob` sample
./build/standalone/glob --help

Usage

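#include <glob/glob.h>  // assumed include path; the single-header variant lives under single_include/

// Match on a single pattern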
for (auto& p : glob::glob("~/.b*")) {                // e.g., .bash_history, .bashrc
  // do something with `p`
}

// Match on multiple patterns
for (auto& p : glob::glob({"*.png", "*.jpg"})) {     // e.g., foo.png, bar.jpg
  // do something with `p`
}

// Match recursively with `rglob`
for (auto& p : glob::rglob("**/*.hpp")) {            // e.g., include/foo.hpp, include/foo/bar.hpp
  // do something with `p`
}

API

/// e.g., glob("*.hpp")
/// e.g., glob("**/*.cpp")
/// e.g., glob("test_files_02/[0-9].txt")
/// e.g., glob("/usr/local/include/nc*.h")
/// e.g., glob("test_files_02/?.txt")
vector<filesystem::path> glob(string pathname);

/// Globs recursively
/// e.g., rglob("Documents/Projects/Foo/**/*.hpp")
/// e.g., rglob("test_files_02/*[0-9].txt")
vector<filesystem::path> rglob(string pathname);

There are also two convenience functions to glob on a list of patterns:

/// e.g., glob({"*.png", "*.jpg"})
vector<filesystem::path> glob(vector<string> pathnames);

/// Globs recursively
/// e.g., rglob({"**/*.h", "**/*.hpp", "**/*.cpp"})
vector<filesystem::path> rglob(vector<string> pathnames);
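
One way to picture the list overloads is as a loop that calls the single-pattern function once per pattern and concatenates the matches in order. The sketch below illustrates that idea with the standard library only. `glob_each` and its `glob_one` callback are hypothetical names introduced for this example, and the README does not confirm that the library is implemented this way.

```cpp
#include <cassert>
#include <filesystem>
#include <functional>
#include <string>
#include <utility>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper (not part of the library): runs `glob_one` on every
// pattern and appends the matches to a single result vector, in pattern order.
std::vector<fs::path> glob_each(
    const std::vector<std::string>& patterns,
    const std::function<std::vector<fs::path>(const std::string&)>& glob_one) {
  std::vector<fs::path> result;
  for (const auto& pattern : patterns) {
    for (auto& match : glob_one(pattern)) {
      result.push_back(std::move(match));
    }
  }
  return result;
}
```

With the library available, the callback could be a lambda such as `[](const std::string& p) { return glob::glob(p); }`. Note that this sketch does not remove duplicates when two patterns match the same file.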

Wildcards

Wildcard  Matches                                         Example
*         any characters                                  *.txt matches all files with the txt extension
?         any one character                               ??? matches file names exactly 3 characters long
[]        any character listed in the brackets            [ABC]* matches files starting with A, B, or C
[-]       any character in the range listed in brackets   [A-Z]* matches files starting with capital letters
[!]       any character not listed in the brackets        [!ABC]* matches files that do not start with A, B, or C
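
To make the wildcard rules concrete, here is a minimal, standard-library-only sketch of a matcher for a single file name. It is not the library's implementation. `wildcard_match` and `parse_bracket` are hypothetical names, and the sketch ignores path separators, hidden-file rules, and `**`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Parses a bracket expression starting at p[i] == '['. If it is well formed,
// sets `next` to the index just past ']' and `matched` to whether `c` is
// accepted, then returns true. Returns false if there is no closing ']',
// so the caller can treat '[' as a literal character.
bool parse_bracket(const std::string& p, std::size_t i, char c,
                   std::size_t& next, bool& matched) {
  std::size_t j = i + 1;
  bool negate = false;
  if (j < p.size() && p[j] == '!') { negate = true; ++j; }
  bool found = false;
  bool first = true;  // a ']' in first position is a literal member
  while (j < p.size() && (first || p[j] != ']')) {
    first = false;
    if (j + 2 < p.size() && p[j + 1] == '-' && p[j + 2] != ']') {
      if (p[j] <= c && c <= p[j + 2]) found = true;  // range such as A-Z
      j += 3;
    } else {
      if (p[j] == c) found = true;                   // single listed character
      ++j;
    }
  }
  if (j >= p.size()) return false;  // unterminated bracket
  next = j + 1;
  matched = (found != negate);
  return true;
}

// Returns true if `name` matches `pattern` using *, ?, [], [-] and [!].
bool wildcard_match(const std::string& pattern, const std::string& name) {
  std::size_t p = 0, n = 0;
  std::size_t star_p = std::string::npos, star_n = 0;
  while (n < name.size()) {
    bool ok = false;
    std::size_t next_p = p + 1;
    if (p < pattern.size()) {
      const char pc = pattern[p];
      if (pc == '*') {
        star_p = p++;  // remember the star so we can backtrack to it
        star_n = n;
        continue;
      } else if (pc == '?') {
        ok = true;
      } else if (pc == '[') {
        bool matched = false;
        if (parse_bracket(pattern, p, name[n], next_p, matched)) {
          ok = matched;
        } else {
          ok = (name[n] == '[');  // literal '['; next_p is still p + 1
        }
      } else {
        ok = (pc == name[n]);
      }
    }
    if (ok) {
      p = next_p;
      ++n;
    } else if (star_p != std::string::npos) {
      p = star_p + 1;  // let the last '*' absorb one more character
      n = ++star_n;
    } else {
      return false;
    }
  }
  while (p < pattern.size() && pattern[p] == '*') ++p;  // trailing stars
  return p == pattern.size();
}
```

For instance, `wildcard_match("[!ABC]*", "Delta")` is true, while `wildcard_match("[!ABC]*", "Apple")` is false.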

Examples

The following examples use the standalone sample that is part of this repository to illustrate the library functionality.

$ ./build/standalone/glob -h
Run glob to find all the pathnames matching a specified pattern
Usage:
  ./build/standalone/glob [OPTION...]

  -h, --help       Show help
  -v, --version    Print the current version number
  -r, --recursive  Run glob recursively
  -i, --input arg  Patterns to match

Match file extensions

$ tree
.
├── include
│   └── foo
│       ├── bar.hpp
│       ├── baz.hpp
│       └── foo.hpp
└── test
    ├── bar.cpp
    ├── doctest.hpp
    ├── foo.cpp
    └── main.cpp

3 directories, 7 files

$ ./glob -i "**/*.hpp"
"test/doctest.hpp"

$ ./glob -i "**/**/*.hpp"
"include/foo/baz.hpp"
"include/foo/foo.hpp"
"include/foo/bar.hpp"

NOTE: If you run glob recursively, i.e., using rglob, the same pattern also matches files in deeper subdirectories:

$ ./glob -r -i "**/*.hpp"
"test/doctest.hpp"
"include/foo/baz.hpp"
"include/foo/foo.hpp"
"include/foo/bar.hpp"
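
For comparison, a recursive search by extension can also be written directly against std::filesystem. The sketch below is independent of this library. `make_sample_tree` recreates the example tree above with empty files, and `files_with_extension` walks it at every depth, including files that sit directly in the starting directory. Both function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <filesystem>
#include <fstream>
#include <initializer_list>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Creates the example directory tree (with empty files) under `root`.
void make_sample_tree(const fs::path& root) {
  fs::create_directories(root / "include" / "foo");
  fs::create_directories(root / "test");
  for (const char* rel : {"include/foo/bar.hpp", "include/foo/baz.hpp",
                          "include/foo/foo.hpp", "test/bar.cpp",
                          "test/doctest.hpp", "test/foo.cpp",
                          "test/main.cpp"}) {
    std::ofstream file(root / rel);  // opening for output creates the file
  }
}

// Returns every regular file below `root`, at any depth, whose extension
// equals `ext` (for example ".hpp"). The result is sorted for stable output.
std::vector<fs::path> files_with_extension(const fs::path& root,
                                           const std::string& ext) {
  std::vector<fs::path> result;
  if (!fs::is_directory(root)) return result;
  for (const auto& entry : fs::recursive_directory_iterator(root)) {
    if (entry.is_regular_file() && entry.path().extension() == fs::path(ext)) {
      result.push_back(entry.path());
    }
  }
  std::sort(result.begin(), result.end());
  return result;
}
```

Called on the sample tree with ".hpp", this returns the same four headers that the recursive run above lists, although the order may differ because the sketch sorts its result.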

Match files in absolute pathnames

$ ./glob -i '/usr/local/include/nc*.h'
"/usr/local/include/ncCheck.h"
"/usr/local/include/ncGroupAtt.h"
"/usr/local/include/ncUshort.h"
"/usr/local/include/ncByte.h"
"/usr/local/include/ncString.h"
"/usr/local/include/ncUint64.h"
"/usr/local/include/ncGroup.h"
"/usr/local/include/ncUbyte.h"
"/usr/local/include/ncvalues.h"
"/usr/local/include/ncInt.h"
"/usr/local/include/ncAtt.h"
"/usr/local/include/ncVar.h"
"/usr/local/include/ncUint.h"

Wildcards: Match a range of characters listed in brackets ('[]')

$ ls test_files_02
1.txt 2.txt 3.txt 4.txt

$ ./glob -i 'test_files_02/[0-9].txt'
"test_files_02/4.txt"
"test_files_02/3.txt"
"test_files_02/2.txt"
"test_files_02/1.txt"

$ ./glob -i 'test_files_02/[1-2]*'
"test_files_02/2.txt"
"test_files_02/1.txt"

$ ls test_files_03
file1.txt file2.txt file3.txt file4.txt

$ ./glob -i 'test_files_03/file[0-9].*'
"test_files_03/file2.txt"
"test_files_03/file3.txt"
"test_files_03/file1.txt"
"test_files_03/file4.txt"

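Exclude files from the matching

Note that [!...] is a negated character class: [!__init__] matches any single character other than `_`, `i`, `n`, or `t`. It does not exclude the literal string `__init__`.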

$ ls test_files_01
__init__.py     bar.py      foo.py

$ ./glob -i 'test_files_01/*[!__init__].py'
"test_files_01/bar.py"
"test_files_01/foo.py"

$ ./glob -i 'test_files_01/*[!__init__][!bar].py'
"test_files_01/foo.py"

$ ./glob -i 'test_files_01/[!_]*.py'
"test_files_01/bar.py"
"test_files_01/foo.py"
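
Because bracket expressions work on single characters, excluding one specific file name is often simpler as a filter applied to the results afterwards. The following sketch uses only the standard library. `without_filename` is a hypothetical helper, not a function provided by this library.

```cpp
#include <algorithm>
#include <cassert>
#include <filesystem>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: returns `paths` without the entries whose final
// component equals `name` (for example "__init__.py").
std::vector<fs::path> without_filename(std::vector<fs::path> paths,
                                       const std::string& name) {
  paths.erase(std::remove_if(paths.begin(), paths.end(),
                             [&](const fs::path& p) {
                               return p.filename() == fs::path(name);
                             }),
              paths.end());
  return paths;
}
```

With the library available, this could be applied as `without_filename(glob::glob("test_files_01/*.py"), "__init__.py")`.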

Wildcards: Match any one character with question mark ('?')

$ ls test_files_02
1.txt 2.txt 3.txt 4.txt

$ ./glob -i 'test_files_02/?.txt'
"test_files_02/4.txt"
"test_files_02/3.txt"
"test_files_02/2.txt"
"test_files_02/1.txt"

$ ls test_files_03
file1.txt file2.txt file3.txt file4.txt

$ ./glob -i 'test_files_03/????[3-4].txt'
"test_files_03/file3.txt"
"test_files_03/file4.txt"

Case sensitivity

glob matching is case-sensitive:

$ ls test_files_05
file1.png file2.png file3.PNG file4.PNG

$ ./glob -i 'test_files_05/*.png'
"test_files_05/file2.png"
"test_files_05/file1.png"

$ ./glob -i 'test_files_05/*.PNG'
"test_files_05/file3.PNG"
"test_files_05/file4.PNG"

$ ./glob -i "test_files_05/*.png","test_files_05/*.PNG"
"test_files_05/file2.png"
"test_files_05/file1.png"
"test_files_05/file3.PNG"
"test_files_05/file4.PNG"
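
If a case-insensitive match on the extension is needed, one option is to glob broadly and filter the results yourself. The sketch below is a standard-library-only illustration. `to_lower_ascii` and `filter_extension_icase` are hypothetical helpers, and the lowercasing handles ASCII letters only.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <filesystem>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Lowercases ASCII letters; other bytes are passed through std::tolower
// under the current C locale.
std::string to_lower_ascii(std::string s) {
  std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
    return static_cast<char>(std::tolower(c));
  });
  return s;
}

// Hypothetical helper: keeps the paths whose extension equals `ext`
// (for example ".png") when compared without regard to case.
std::vector<fs::path> filter_extension_icase(const std::vector<fs::path>& paths,
                                             const std::string& ext) {
  std::vector<fs::path> result;
  const std::string wanted = to_lower_ascii(ext);
  for (const auto& p : paths) {
    if (to_lower_ascii(p.extension().string()) == wanted) {
      result.push_back(p);
    }
  }
  return result;
}
```

With the library available, `filter_extension_icase(glob::glob("test_files_05/*"), ".png")` would be one way to collect both the .png and the .PNG files with a single pattern.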

Tilde expansion

$ ./glob -i "~/.b*"
"/Users/pranav/.bashrc"
"/Users/pranav/.bash_sessions"
"/Users/pranav/.bash_profile"
"/Users/pranav/.bash_history"

$ ./glob -i "~/Documents/Projects/glob/**/glob/*.h"
"/Users/pranav/Documents/Projects/glob/include/glob/glob.h"
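
Tilde expansion amounts to replacing a leading `~` with the user's home directory before matching. The sketch below shows the idea with the standard library only. `expand_tilde` is a hypothetical name, and reading the home directory from the `HOME` environment variable is an assumption that fits POSIX systems. The library's own lookup may differ.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Replaces a leading "~" or "~/" with `home`. Patterns that do not start
// with a bare tilde (including "~user/...") are returned unchanged.
std::string expand_tilde(const std::string& pattern, const std::string& home) {
  if (pattern == "~") return home;
  if (pattern.size() >= 2 && pattern[0] == '~' && pattern[1] == '/') {
    return home + pattern.substr(1);
  }
  return pattern;
}

// Convenience overload: reads the home directory from HOME (an assumption;
// if the variable is not set, the pattern is returned unchanged).
std::string expand_tilde(const std::string& pattern) {
  const char* home = std::getenv("HOME");
  return home ? expand_tilde(pattern, std::string(home)) : pattern;
}
```

For example, with a home directory of /Users/pranav, the pattern "~/.b*" becomes "/Users/pranav/.b*".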

Contributing

Contributions are welcome; have a look at the CONTRIBUTING.md document for more information.

License

The project is available under the MIT license.

Issues
  • Fix some warnings in MSVC W4

    While using this library with MSVC we encountered some minor warnings. This should get rid of them.

    We are just using the single include variant; these fixes may be relevant to the other variant as well.

    opened by W4RH4WK 2
  • Add support for path beginning with parent directory (..)

    This PR adds support for paths starting with the parent directory (..). It modifies the is_hidden() function so that paths starting with parent directory ( ../ ) or current directory ( ./ ) should not be considered hidden.

    opened by fdinel 2
  • A more distinct name?

    Nice work!

    Suggest to give this project and the package a more distinct name.

    There is a possibility that the include directory "glob" and namespace "glob" conflict with something else.

    Consider the many JSON C++ libraries: although they are all JSON parsers, they have distinct names: rapid json, nlohmann json ...

    No one takes the cxxglob or cppglob name yet. Perhaps use one of these?

    opened by zzhang97 0
  • Want the parameter to support "filesystem::path"

    hope the parameter itself can also be filesystem::path

    vector<filesystem::path> glob(filesystem::path pathname);
    vector<filesystem::path> rglob(filesystem::path pathname);
    
    vector<filesystem::path> glob(vector<filesystem::path> pathnames);
    vector<filesystem::path> rglob(vector<filesystem::path> pathnames);
    
    opened by nblog 0
  • Add a simple license header that mentions the license and your name and email to the single_header file

    I'm imagining that most people who use the single file header include would just like to copy it in their project (at least on a temporary basis). I'd like to make sure credit is still given in that case.

    Something like this

    /*
     * Copyright (c) YYYY Author <email>
     * 
     *  Licensed under the MIT license: https://opensource.org/licenses/MIT
     *  Permission is granted to use, copy, modify, and redistribute the work.
     *  This file is part of: https://github.com/p-ranav
     */
    

    Adjust as you wish.

    opened by jrobeson 0
  • ** does not match zero directories

    Hi, thank you for this nice library!

    I think glob::rglob("dir/**/*.ext") should match:

    • dir/file.ext
    • dir/sub/file.ext
    • dir/sub/sub/file.ext

    Currently it only matches:

    • dir/sub/file.ext
    • dir/sub/sub/file.ext

    Or how else one can recursively get all .ext files within dir?

    bug 
    opened by houmain 1
  • Using with gulrak/filesystem

    In single_include/glob/glob.hpp I had to change a few cases of std::filesystem to fs in order to get glob to work with gulrak/filesystem.

    Since there are other places in glob.hpp where you used fs, I thought I'd mention it.

    opened by asalamon-work 0
Releases: v0.0.1
Owner: Pranav