
5.10 Regular Expression Pattern Matching

These functions are defined in rgx.c using a POSIX or GNU regex library. If your computer does not support regex, a regex package is available via ftp. For a description of regular expressions, see syntax.

— Function: regcomp pattern [flags]

Compile a regular expression. Return a compiled regular expression, or an integer error code suitable as an argument to regerror.

flags in regcomp is a string of option letters used to control the compilation of the regular expression. The letters may be:

  n  Newlines are not matched by . or by hat lists ([^...]).
  i  Ignore case.

Only when compiled with _GNU_SOURCE:

  0  Allow dot to match a null character.
  f  Enable GNU fastmaps.

— Function: regerror errno

Returns a string describing the integer errno returned when regcomp fails.
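Since regcomp reports failure by returning an integer rather than a compiled expression, a caller can test the result before using it. The following is a minimal sketch, assuming an SCM build that provides the regex feature; compile-or-report is a hypothetical helper name, not part of SCM.

```scheme
(require 'regex)   ; assumption: this SCM build provides the regex feature

;; compile-or-report is a hypothetical helper, not part of SCM.
;; It returns the compiled expression, or prints the regerror text
;; and returns #f when regcomp hands back an integer error code.
(define (compile-or-report pattern . flags)
  (let ((re (apply regcomp pattern flags)))
    (cond ((integer? re)
           (display (regerror re))
           (newline)
           #f)
          (else re))))

;; "i" asks for a case-insensitive match.
(define word-re (compile-or-report "[a-z]+" "i"))

;; An unbalanced parenthesis is expected to fail to compile.
(define bad-re (compile-or-report "(unclosed"))
```

Returning #f on failure lets the caller guard later uses with a plain (if word-re ...) test.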

— Function: regexec re string

Returns #f or a vector of integers. The integers come in doublets: the first of each doublet is the index in string of the start of the matching expression or sub-expression (delimited by parentheses in the pattern), and the last is the index in string of the end of that expression. #f is returned if the string does not match.
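The doublets can be turned into substrings by walking the vector two entries at a time. This is a sketch only: match-substrings is a hypothetical helper, and the guard for negative indices assumes that a sub-expression which took no part in the match is reported with a negative start, as the underlying POSIX interface does.

```scheme
(require 'regex)   ; assumption: this SCM build provides the regex feature

;; match-substrings is a hypothetical helper, not part of SCM.
;; It returns #f when there is no match, otherwise a list holding the
;; text of the whole match followed by the text of each sub-expression.
(define (match-substrings re string)
  (let ((v (regexec re string)))
    (and v
         ;; Walk the doublets from the last one back to the first,
         ;; so that consing builds the list in forward order.
         (let loop ((i (- (vector-length v) 2)) (acc '()))
           (if (< i 0)
               acc
               (let ((start (vector-ref v i))
                     (end (vector-ref v (+ i 1))))
                 (loop (- i 2)
                       ;; Assumed: a negative start marks an unused
                       ;; sub-expression; record #f for it.
                       (cons (and (>= start 0)
                                  (substring string start end))
                             acc))))))))

(define parts
  (match-substrings (regcomp "([0-9]+)-([0-9]+)") "tel 555-1234"))
```

If the doublets are laid out as described above, parts should hold the whole match first, then the two digit groups.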

— Function: regmatch? re string

Returns #t if re, a compiled regular expression as returned by (regcomp pattern), matches string as a POSIX extended regular expression. Returns #f otherwise.

— Function: regsearch re string [start [len]]
— Function: regsearchv re string [start [len]]
— Function: regmatch re string [start [len]]
— Function: regmatchv re string [start [len]]

Regsearch searches for the pattern within the string.

Regmatch anchors the pattern and begins matching it against string.

Regsearch returns the character position where re starts, or #f if not found.

Regmatch returns the number of characters matched, #f if not matched.

Regsearchv and regmatchv return the match vector if re is found, #f otherwise.

re may be either:
  1. a compiled regular expression returned by regcomp;
  2. a string representing a regular expression;
  3. a list of a string and a set of option letters.

string is the string to be operated upon.

start is the character position at which to begin the search or match. If absent, the default is zero.

Only when compiled with _GNU_SOURCE and using GNU libregex: when searching, if start is negative, the absolute value of start is used as the start location and the search is performed in reverse.

len limits the search to the first len characters of string. If absent, the entire string may be examined.
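The calls below sketch how the four functions and the three forms of re might be used together. They assume an SCM build that provides the regex feature, and the list form is written on the assumption that the option letters are given as a second string, in the same style as the flags argument to regcomp.

```scheme
(require 'regex)   ; assumption: this SCM build provides the regex feature

(define s "foo bar baz")

;; re given as a plain string: position where the first match starts.
(define first-hit (regsearch "ba[rz]" s))

;; Same search, but beginning at character position 5.
(define later-hit (regsearch "ba[rz]" s 5))

;; regmatch anchors the pattern: number of characters matched, or #f.
(define matched   (regmatch "fo+" s))
(define unmatched (regmatch "ba" s))   ; s does not begin with "ba"

;; re given as a compiled expression, returning the match vector.
(define vec (regsearchv (regcomp "ba(.)") s))

;; re given as a list of a pattern string and option letters (assumed form).
(define vec-ci (regsearchv '("BA(.)" "i") s))
```

Compiling once with regcomp and reusing the result is likely the better choice when the same pattern is applied to many strings.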

— Function: string-split re string
— Function: string-splitv re string

String-split splits a string into substrings that are separated by re, returning a vector of substrings.

String-splitv returns a vector of string positions that indicate where the substrings are located.
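As an illustration, the two functions can be applied to the same input to compare their results. This is a sketch assuming an SCM build that provides the regex feature; the separator pattern and sample text are made up for the example.

```scheme
(require 'regex)   ; assumption: this SCM build provides the regex feature

;; Separator: one or more spaces or commas.
(define sep (regcomp "[ ,]+"))
(define text "alpha, beta gamma")

;; Vector of the substrings found between separators.
(define fields (string-split sep text))

;; Vector of string positions locating those same substrings.
(define spans (string-splitv sep text))

(display fields) (newline)
(display spans) (newline)
```

The position vector from string-splitv can be useful when the original string must be left intact and only the field boundaries are needed.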

— Function: string-edit re edit-spec string [count]

Returns the edited string.

edit-spec is a string used to replace occurrences of re. Backquoted integers in the range of 1-9 may be used to insert subexpressions in re, as in sed.

count is the number of substitutions for string-edit to perform. If #t, all occurrences of re will be replaced. The default is to perform one substitution.
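The effect of count can be seen by editing the same string three ways. This is a sketch assuming an SCM build that provides the regex feature; the sample text is made up, and the example uses only a literal replacement so that it does not depend on the subexpression syntax.

```scheme
(require 'regex)   ; assumption: this SCM build provides the regex feature

(define re (regcomp "colour"))
(define text "colour, colour and more colour")

;; Default: a single substitution.
(define once (string-edit re "color" text))

;; An integer count: at most that many substitutions.
(define twice (string-edit re "color" text 2))

;; #t: every occurrence is replaced.
(define all (string-edit re "color" text #t))

(display once) (newline)
(display twice) (newline)
(display all) (newline)
```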