glob is a cross-platform, pure Nim module for matching files against Unix-style patterns. It supports creating patterns, testing file paths, and walking through directories to find matching files or directories. For example, the pattern src/**/*.nim will be expanded to return all files with a .nim extension in the src directory and any of its subdirectories.
It's similar to Python's glob module but supports extended glob syntax like {} groups.
Note that while glob works on all platforms, the patterns it generates can be platform-specific because path separator characters differ between platforms.
Syntax
token | example | description |
---|---|---|
? | ?.nim | acts as a wildcard, matching any single character |
* | *.nim | matches any string of any length until a path separator is found |
** | **/license | same as * but crosses path boundaries to any depth |
[] | [ch] | character class, matches any of the characters or ranges inside |
{} | {nim,js} | string class (group), matches any of the strings inside |
/ | foo/*.js | literal path separator (even on Windows) |
\ | foo\*.js | escape character (not path separator, even on Windows) |
Any other characters are matched literally. Make special note of the difference between / and \. Even when on Windows platforms you should not use \ as a path separator, since it is actually the escape character in glob syntax. Instead, always use / as the path separator. This module will then use the correct separator when the glob is created.
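The separator rules above can be sketched with the matches func documented later on this page. This is a minimal sketch, assuming the glob package is installed and importable as glob, and assuming isDos selects backslash-separated matching (as its platform-dependent default suggests). The file names are made up for illustration.

```nim
import glob  # assumes the glob package is installed

# Pin the separator style so the checks do not depend on the host platform.
doAssert "src/foo.nim".matches("src/*.nim", isDos = false)

# `*` stops at a path separator, so it does not reach into subdirectories.
doAssert not "src/dir/foo.nim".matches("src/*.nim", isDos = false)

# `**` crosses path boundaries.
doAssert "src/dir/foo.nim".matches("src/**/*.nim", isDos = false)

# With DOS semantics the `/` in the pattern corresponds to `\` in the input.
doAssert r"src\dir\foo.nim".matches("src/**/*.nim", isDos = true)
```

The pattern is written with / in every case; only the input path changes with the platform.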
Character classes
Matching special characters
If you need to match some special characters like ] or - inside a bracket expression, you'll need to use them in specific ways to match them literally.
character | special | literal | description |
---|---|---|---|
] | [)}]] | []_.] | must come first or is treated as closing bracket |
- | [_-=] | [-_] | must come first or last or is treated as a range |
! | [!<>] | [<!>] | must not come first or is treated as negation character |
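A short sketch of the placement rules for - and ! in the table above, assuming the glob package is installed. The inputs are hypothetical and chosen only to exercise each rule.

```nim
import glob  # assumes the glob package is installed

# `-` placed first is a literal hyphen rather than a range.
doAssert "a-b".matches("a[-_]b")
doAssert "a_b".matches("a[-_]b")

# `!` placed first negates the class: anything except `<` or `>`.
doAssert "axb".matches("a[!<>]b")
doAssert not "a<b".matches("a[!<>]b")

# `!` placed anywhere else is a literal exclamation mark.
doAssert "a!b".matches("a[<!>]b")
```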
POSIX classes
Within bracket expressions ([]) you can use POSIX character classes, which are basically named groups of characters. These are the available classes and their roughly equivalent regex values:
POSIX class | similar to | meaning |
---|---|---|
[:upper:] | [A-Z] | uppercase letters |
[:lower:] | [a-z] | lowercase letters |
[:alpha:] | [A-Za-z] | upper- and lowercase letters |
[:digit:] | [0-9] | digits |
[:xdigit:] | [0-9A-Fa-f] | hexadecimal digits |
[:alnum:] | [A-Za-z0-9] | digits, upper- and lowercase letters |
[:word:] | [A-Za-z0-9_] | alphanumeric and underscore |
[:blank:] | [ \t] | space and TAB characters only |
[:space:] | [ \t\n\r\f\v] | blank (whitespace) characters |
[:cntrl:] | [\x00-\x1F\x7F] | control characters |
[:ascii:] | [\x00-\x7F] | ASCII characters |
[:graph:] | [^ [:cntrl:]] | graphic characters (all characters which have graphic representation) |
[:punct:] | [!"\#$%&'()*+,-./:;<=>?@\[\]^_`{|}~] | punctuation (all graphic characters except letters and digits) |
[:print:] | [[:graph:] ] | graphic characters and space |
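As an illustration of POSIX classes inside bracket expressions, here is a minimal sketch assuming the glob package is installed. ignoreCase is set explicitly because its default differs between platforms; the file names are hypothetical.

```nim
import glob  # assumes the glob package is installed

# A POSIX class goes inside a bracket expression, hence the double brackets.
doAssert "file7.txt".matches("file[[:digit:]].txt", ignoreCase = false)
doAssert not "fileA.txt".matches("file[[:digit:]].txt", ignoreCase = false)

# Classes can be combined with other tokens in the same pattern.
doAssert "v1.nim".matches("[[:alpha:]][[:digit:]].nim", ignoreCase = false)
doAssert not "11.nim".matches("[[:alpha:]][[:digit:]].nim", ignoreCase = false)
```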
Extended pattern matching
glob supports most of the extended pattern matching syntax found under bash's extglob flag:
pattern | description |
---|---|
?(...patterns) | match zero or one occurrences of the given patterns |
*(...patterns) | match zero or more occurrences of the given patterns |
+(...patterns) | match one or more occurrences of the given patterns |
@(...patterns) | match one of the given patterns |
Note that the !(...patterns) form that allows for matching anything except the given patterns is not currently supported. This is a limitation in the regex backend.
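The supported extglob forms could be exercised roughly as follows. This sketch assumes the glob package is installed and that alternatives inside the parentheses are separated by |, as in bash; the file names are hypothetical.

```nim
import glob  # assumes the glob package is installed

# ?(...) : zero or one occurrence
doAssert "file.txt".matches("?(x)file.txt")
doAssert "xfile.txt".matches("?(x)file.txt")

# +(...) : one or more occurrences
doAssert "abab.txt".matches("+(ab).txt")
doAssert not ".txt".matches("+(ab).txt")

# @(...) : exactly one of the alternatives (assumes `|` as the separator)
doAssert "foo.nim".matches("@(foo|bar).nim")
doAssert not "baz.nim".matches("@(foo|bar).nim")
```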
Examples
For these examples let's imagine we have this file structure:
├─ assets/
│  └─ img/
│     ├─ favicon.ico
│     └─ logo.svg
├─ src/
│  ├─ glob/
│  │  ├─ other.nim
│  │  ├─ regexer.nim
│  │  └─ private/
│  │     └─ util.nim
│  └─ glob.nim
└─ glob.nimble
glob pattern | files returned |
---|---|
* | @["glob.nimble"] |
src/*.nim | @["src/glob.nim"] |
src/**/*.nim | @["src/glob.nim", "src/glob/other.nim", "src/glob/regexer.nim", "src/glob/private/util.nim"] |
**/*.{ico,svg} | @["assets/img/favicon.ico", "assets/img/logo.svg"] |
**/????.??? | @["src/glob.nim", "src/glob/private/util.nim", "assets/img/logo.svg"] |
For more info on glob syntax see this link for a good reference, although it references a few more extended features which aren't yet supported. As a cheatsheet, this wiki might also be useful.
Roadmap
Some features aren't supported yet but may be added in the future, for example:
- Unicode character support
- multiple patterns (something like glob(["*.nim", "!foo.nim"]))
Types
Glob = object
  pattern*: string
  regexStr*: string
  regex*: Regex
  base*: string
  magic*: string
- Represents a compiled glob pattern and its backing regex. Also stores the glob's base & magic components as given by the splitPattern proc.
GlobEntry = tuple[path: string, kind: PathComponent]
- Represents a filesystem entity matched by a glob pattern, containing the item's path and its kind as an os.PathComponent.
PatternStems = tuple[base: string, magic: string]
- The type returned by splitPattern where base contains the leading non-magic path components and magic contains any path segments containing or following special glob characters.
GlobOption {.pure.} = enum
  Absolute, IgnoreCase, NoExpandDirs, FollowLinks, ## iterator behavior
  Hidden, Files, Directories, FileLinks, DirLinks ## to yield or not to yield
- Flags that control the behavior or results of the file system iterators. See defaultGlobOptions for some usage & examples.
flag | meaning |
---|---|
GlobOption.Absolute | yield paths as absolute rather than relative to root |
GlobOption.IgnoreCase | matching will ignore case differences |
GlobOption.NoExpandDirs | if pattern is a directory don't treat it as <dir>/**/* |
GlobOption.Hidden | yield hidden files or directories |
GlobOption.Directories | yield directories |
GlobOption.Files | yield files |
GlobOption.DirLinks | yield links to directories |
GlobOption.FileLinks | yield links to files |
GlobOption.FollowLinks | recurse into directories through links |
GlobOptions = set[GlobOption]
- The set type containing flags for controlling glob behavior.

    var options: GlobOptions = {}
    if someCondition: options.incl GlobOption.Absolute

FilterDescend = (path: string) -> bool
- A predicate controlling whether or not to recurse into a directory when iterating with a recursive glob pattern. Returning true will allow recursion, while returning false will prevent it.
path can either be relative or absolute, which depends on GlobOption.Absolute being present in the iterator's options.
FilterYield = (path: string, kind: PathComponent) -> bool
- A predicate controlling whether or not to yield a filesystem item. Paths for which this predicate returns false will not be yielded.
path can either be relative or absolute, which depends on GlobOption.Absolute being present in the iterator's options. kind is an os.PathComponent.
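Both predicates can be passed to the iterators by name. The following is a minimal sketch, assuming the glob package is installed; skipPrivate and noTests are hypothetical helpers, and the "private" directory name and "t_" file prefix are conventions invented for this example.

```nim
import std/[os, strutils]
import glob  # assumes the glob package is installed

# Hypothetical FilterDescend: do not recurse into directories named "private".
proc skipPrivate(path: string): bool =
  path.extractFilename != "private"

# Hypothetical FilterYield: drop files whose name starts with "t_".
proc noTests(path: string, kind: PathComponent): bool =
  not path.extractFilename.startsWith("t_")

for path in walkGlob("src/**/*.nim",
                     filterDescend = skipPrivate,
                     filterYield = noTests):
  echo path
```

Since the paths handed to the predicates may be relative or absolute depending on GlobOption.Absolute, the helpers above only look at the final path component.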
Consts
defaultGlobOptions = {GlobOption.Files, GlobOption.FileLinks, GlobOption.DirLinks}
- The default options used when none are provided. If a new set is provided it overrides the defaults entirely, so in order to partially modify the default options you can use Nim's set union (+) and difference (-) operators:

    const optsNoFiles = defaultGlobOptions - {Files}
    const optsHiddenNoLinks = defaultGlobOptions + {Hidden} - {FileLinks, DirLinks}

On case-insensitive filesystems (like Windows), this also includes GlobOption.IgnoreCase.
Funcs
func hasMagic(str: string): bool {.raises: [], tags: [].}
- Returns true if the given string is glob-like, i.e. if it contains any of the special characters *, ?, [, { or an extglob, which is one of the characters ?, !, @, +, or * followed by (.
Examples:

    doAssert("*.nim".hasMagic)
    doAssert("profile_picture.{png,jpg}".hasMagic)
    doAssert(not "literal_match.html".hasMagic)

func globToRegex(pattern: string; isDos = isDosDefault; ignoreCase = isDosDefault): Regex {.raises: [RegexError, GlobSyntaxError], tags: [].}
- Converts a string glob pattern to a regex pattern.
func splitPattern(pattern: string): PatternStems {.raises: [], tags: [].}
- Splits the given pattern into two parts: the base, which is the part containing no special glob characters, and the magic, which includes any path segments containing or following special glob characters.
When pattern is not glob-like, i.e. pattern.hasMagic == false, it will be considered a literal matcher and the entire pattern will be returned as magic, while base will be the empty string "".
Examples:

    doAssert "root_dir/inner/**/*.{jpg,gif}".splitPattern == ("root_dir/inner", "**/*.{jpg,gif}")
    doAssert "this/is-a/literal-match.txt".splitPattern == ("", "this/is-a/literal-match.txt")

func glob(pattern: string; isDos = isDosDefault; ignoreCase = isDosDefault): Glob {.raises: [GlobSyntaxError, RegexError], tags: [].}
- Constructs a new Glob object from the given pattern.
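Constructing a Glob once and reusing it avoids rebuilding the pattern for every input. A minimal sketch, assuming the glob package is installed; the candidate paths are taken from the example file structure earlier on this page, and isDos is pinned so the checks are platform independent.

```nim
import glob  # assumes the glob package is installed

# Build the matcher once, with Unix-style separators.
let nimSources = glob("src/**/*.nim", isDos = false)

# Reuse it for several inputs.
for candidate in ["src/glob.nim", "src/glob/private/util.nim"]:
  doAssert candidate.matches(nimSources)

doAssert not "assets/img/logo.svg".matches(nimSources)
```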
func matches(input: string; glob: Glob): bool {.raises: [], tags: [].}
- Returns true if input is a match for the given glob object.
Examples:

    when defined posix:
      const matcher = glob("src/**/*.nim")
      doAssert("src/dir/foo.nim".matches(matcher))
      doAssert(not r"src\dir\foo.nim".matches(matcher))
    elif defined windows:
      const matcher = glob("src/**/*.nim")
      doAssert(r"src\dir\foo.nim".matches(matcher))
      doAssert(not "src/dir/foo.nim".matches(matcher))

func matches(input, pattern: string; isDos = isDosDefault; ignoreCase = isDosDefault): bool {.raises: [RegexError, GlobSyntaxError], tags: [].}
- Check that input matches the given pattern and return true if it does. Shortcut for matches(input, glob(pattern, isDos, ignoreCase)).
Examples:

    when defined posix:
      doAssert "src/dir/foo.nim".matches("src/**/*.nim")
    elif defined windows:
      doAssert r"src\dir\foo.nim".matches("src/**/*.nim")

Iterators
iterator walkGlobKinds(pattern: string | Glob; root = ""; options = defaultGlobOptions; filterDescend: FilterDescend = nil; filterYield: FilterYield = nil): GlobEntry
- Equivalent to walkGlob but yields a GlobEntry which contains the path as well as the kind of the item.
Examples:

    for path, kind in walkGlobKinds("src/*.nim"):
      doAssert(path is string and kind is PathComponent)

    ## include hidden items, exclude links
    const optsHiddenNoLinks = defaultGlobOptions + {Hidden} - {FileLinks, DirLinks}
    for path, kind in walkGlobKinds("src/**/*", options = optsHiddenNoLinks):
      doAssert(kind notin {pcLinkToFile, pcLinkToDir})

iterator walkGlob(pattern: string | Glob; root = ""; options = defaultGlobOptions; filterDescend: FilterDescend = nil; filterYield: FilterYield = nil): string
- Iterates over all the paths within the scope of the given glob pattern, yielding all those that match. root defaults to the current working directory (by using os.getCurrentDir).
See GlobOption for the flags available to alter iteration behavior and output.
Examples:

    for path in walkGlob("src/*.nim"):
      ## `path` is a file only in the `src` directory (not any of its
      ## subdirectories) with the `.nim` file extension
      discard

    for path in walkGlob("docs/**/*.{png,svg}"):
      ## `path` is a file in the `docs` directory or any of its
      ## subdirectories with either a `png` or `svg` file extension
      discard