diff --git a/docs/schema.md b/docs/schema.md
index df5ae19a..cd455f6f 100644
--- a/docs/schema.md
+++ b/docs/schema.md
@@ -1,94 +1,87 @@
# Schema specification
-A schema file specifies the delimiters and variables patterns (regular
-expressions) necessary for `log-surgeon` to parse log events. `log-surgeon` uses
-the delimiters to find tokens† (as in, strings of non-delimiters) in
-the input, and categorizes any token that matches a variable pattern as a
-variable. Any tokens that are not categorized as variables are treated as static
-text. In essence, this allows the user to parse variables out of their
-unstructured log events.
-
-`log-surgeon` also assigns types to each variable based on the variable
-pattern's name in the schema file.
-
-† Internally, `log-surgeon`'s lexer also treats a string of
-delimiters as a token, just not one that matches a variable pattern.
+A schema file defines the **delimiters** and **variable patterns** (regular expressions) that
+`log-surgeon` uses to parse log events. Delimiters conceptually divide the input into *tokens*,
+where each token is either a variable (matched by a pattern) or **static text**. A variable token
+may include delimiters and is still treated as a single token. Static text always begins and ends
+with a delimiter. This structure enables `log-surgeon` to extract variables from otherwise
+unstructured log events.
## Schema syntax
-A schema file essentially contains a list of *rules*, each of which has a *name*
-and a *pattern* (regular expression).
+A schema file consists of a list of *rules*, each defined by a *name* and a *pattern* (regular
+expression). These rules dictate how `log-surgeon` identifies and categorizes parts of a log event.
There are three types of rules in a schema file:
-* [Variables](#variables)
-* [Delimiters](#delimiters)
-* [Timestamps](#timestamps)
+* [Variables](#variables): Define patterns for capturing specific pieces of the log.
+* [Delimiters](#delimiters): Specify the characters that separate tokens in the log.
+* [Timestamps](#timestamps): Identify the boundary between log events. Timestamps are also treated
+  as variables.
+
+Schema files may also contain comments: any text preceded by `//` is ignored.
### Variables
**Syntax:**
-```
+```txt
<variable-name>:<variable-pattern>
```
-* `variable-name` may contain any alphanumeric characters, but may not be
- the reserved names `delimiters` or `timestamp`.
+
+* `variable-name` may contain any alphanumeric characters, but may not be the reserved names
+ `delimiters` or `timestamp`.
* `variable-pattern` is a regular expression using the supported
- [syntax](#regular-expression-syntax), but it **cannot** contain characters
- defined as [delimiters](#delimiters).
+ [syntax](#regular-expression-syntax).
Note that:
+
* A schema file may contain zero or more variable rules.
-* Repeating the same variable name in another rule will `OR` the regular
- expressions (preform an alternation).
-* If a token matches multiple patterns from multiple rules, the token will be
- assigned the name of each rule, in the same order that they appear in the
- schema file.
+* Repeating the same variable name in another rule will `OR` the regular expressions (perform an
+ alternation).
+* If a token matches multiple patterns from multiple rules, the token will be assigned the name of
+ each rule, in the same order that they appear in the schema file.
### Delimiters
**Syntax:**
-```
+```txt
delimiters:<characters>
```
-* `delimiters` is a reserved name for this rule
-* `characters` is a set of characters that should be treated as delimiters
+
+* `delimiters` is a reserved name for this rule.
+* `characters` is a set of characters that should be treated as delimiters. These characters define
+ the boundaries between tokens in the log.
Note that:
-* A schema file must contain a single `delimiters` rule. If multiple
- `delimiters` rules are specified, only the last one will be used.
+
+* A schema file must contain at least one `delimiters` rule. If multiple `delimiters` rules are
+ specified, only the last one will be used.
### Timestamps
**Syntax:**
-```
+```txt
timestamp:<timestamp-pattern>
```
-* `timestamp` is a reserved name for this rule
+
+* `timestamp` is a reserved name for this rule.
* `timestamp-pattern` is a regular expression using the supported
- [syntax](#regular-expression-syntax)
+ [syntax](#regular-expression-syntax).
Note that:
-* Unlike [variable](#variables) patterns, timestamp patterns can contain
- delimiters.
-* The parser uses a timestamp to denote the start of a new log event if:
- * ... it appears as the first token in the input, or
- * ... it appears after a newline character.
-* Until a timestamp is found, the parser will use a newline character to denote
- the start of a new log event.
-* The timestamp pattern is not used to match text inside a log event; since the
- pattern can contain delimiters, no token can match it.
-### Comments
-
-**Syntax:** Comments are any text preceded by `//`.
+* The parser uses a timestamp to denote the start of a new log event if:
+  * It appears as the first token in the input, or
+  * It appears after a newline character.
+* Until a timestamp is found, the parser will use a newline character to denote the start of a new
+ log event.
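
The rule syntax described above can be sketched as a small loader. This is a minimal illustration in Python, not `log-surgeon`'s actual parser: the function name `load_schema` is hypothetical, only whole-line `//` comments are handled, and escape sequences such as `\t` are kept as written rather than decoded.

```python
def load_schema(text):
    """Parse schema text into (delimiters, timestamp_patterns, variables).

    Hypothetical helper for illustration; not part of log-surgeon.
    """
    delimiters = None   # raw characters of the last `delimiters` rule seen
    timestamps = []     # every `timestamp` pattern, in file order
    variables = {}      # variable name -> pattern (repeats are OR'ed)

    for line in text.splitlines():
        # Skip blank lines and whole-line comments (assumption: a sketch
        # this simple cannot tell `//` inside a pattern from a comment).
        if not line.strip() or line.lstrip().startswith("//"):
            continue

        # A rule is `<name>:<pattern>`; split at the first colon only,
        # because the pattern itself may contain colons.
        name, sep, pattern = line.partition(":")
        name = name.strip()
        if not sep or not name.isalnum():
            raise ValueError(f"not a valid rule: {line!r}")

        if name == "delimiters":
            # If several `delimiters` rules exist, only the last one is used.
            delimiters = pattern
        elif name == "timestamp":
            timestamps.append(pattern)
        elif name in variables:
            # Repeating a variable name alternates (ORs) the patterns.
            variables[name] = f"({variables[name]})|({pattern})"
        else:
            variables[name] = pattern

    if delimiters is None:
        raise ValueError("a schema needs at least one delimiters rule")
    return delimiters, timestamps, variables
```

The pattern text is deliberately left unstripped, since a leading space in a `delimiters` rule is itself a delimiter.
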
## Example schema file
-```
+```txt
// Delimiters
delimiters: \t\r\n:,!;%
@@ -101,35 +94,50 @@ float:\-{0,1}[0-9]+\.[0-9]+
// Custom variables
hex:[a-fA-F]+
hasNumber:.*\d.*
-equals:.*=.*[a-zA-Z0-9].*
+equalsCapture:.*=(?<equals>.*[a-zA-Z0-9].*)
```
-* `delimiters: \t\r\n:,!;%` indicates that ` `, `\t`, `\r`, `\n`, `:`, `,`,
- `!`, `;`, `%`, and `'` are delimiters. In a log file, consecutive delimiters,
- e.g., N consecutive spaces, will be tokenized as static text.
+
+* `delimiters: \t\r\n:,!;%` indicates that ` `, `\t`, `\r`, `\n`, `:`, `,`, `!`, `;`, and `%` are
+ delimiters.
* `timestamp` matches two different patterns:
- * 2023-04-19 12:32:08.064
- * [20230419-12:32:08]
-* `int`, `float`, `hex`, `hasNumber`, and `equals` all match different user
- defined variables.
+  * `2023-04-19 12:32:08.064`
+  * `[20230419-12:32:08]`
+* `int`, `float`, `hex`, `hasNumber`, and `equalsCapture` all match different user-defined
+  variables.
+* `equalsCapture` also contains the named capture group `equals`. This allows the user to extract
+ the substring following the equals sign.
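
To illustrate what the `equals` capture group extracts, here is a rough Python analogue of the `equalsCapture` rule. It is only an approximation under two assumptions: Python's `re` module spells named groups `(?P<name>...)`, and its `.` also matches delimiter characters, unlike the schema's wildcard. The helper name `extract_equals` is made up for this example.

```python
import re

# Python analogue of the schema rule for `equalsCapture`; note the
# Python-specific (?P<equals>...) spelling of the named group.
EQUALS_CAPTURE = re.compile(r".*=(?P<equals>.*[a-zA-Z0-9].*)")


def extract_equals(token):
    """Return the text captured by `equals`, or None if the token does not match."""
    match = EQUALS_CAPTURE.fullmatch(token)
    return match.group("equals") if match else None
```

For a token such as `user=alice42`, the whole token is the `equalsCapture` variable, while the `equals` group holds only `alice42`.
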
## Regular Expression Syntax
-The following regular expression rules are supported by the schema. When
-building a regular expression, the rules are applied as they appear in this
-list, from top to bottom.
-```
-REGEX RULE DEFINITION
-ab Match 'a' followed by 'b'
-a|b Match a OR b
-[a-z] Match any character in the brackets (e.g., any lowercase letter)
- - special characters must be escaped, even in brackets (e.g., [\.\(\\])
-[^a-zA-Z] Match any character NOT in the brackets (e.g., non-alphabet character)
-a* Match 'a' 0 or more times
-a+ Match 'a' 1 or more times
-a{N} Match 'a' exactly N times
-a{N,M} Match 'a' between N and M times
-(abc) Subexpression (concatenates abc)
-\d Match any digit 0-9
-\s Match any whitespace character (' ', '\r', '\t', '\v', or '\f')
-. Match any *non-delimiter* character
+The following regular expression rules are supported by the schema. When building a regular
+expression, the rules are applied as they appear in this list, from top to bottom.
+
+```txt
+REGEX RULE         EXAMPLE        DEFINITION
+Concatenation      ab             Match two expressions in sequence (e.g., 'a'
+                                  followed by 'b').
+Alternation        a|b            Match one of two expressions (e.g., 'a' or 'b').
+Range              [a-z]          Match any character within a specified range
+                                  (e.g., any lowercase letter).
+Negated Range      [^a-zA-Z]      Match any character not within the specified
+                                  range (e.g., any non-alphabet character).
+Kleene Star        a*             Match an expression zero or more times.
+Kleene Plus        a+             Match an expression one or more times.
+Repetition         a{N}           Match an expression exactly N times.
+Repetition Range   a{N,M}         Match an expression between N and M times.
+Digit              \d             Match any digit (i.e., 0-9).
+Whitespace         \s             Match any whitespace character (i.e., ' ', \r,
+                                  \t, \v, or \f).
+Wildcard           .              Match any non-delimiter character.
+Subexpression      (ab)           Match the expression in parentheses (e.g., ab).
+Named Capture      (?<var>[01]+)  Match an expression and assign it a name (e.g.,
+                                  capture binary as "var").
+
+* Special characters include: ( ) * + - . [ \ ] ^ { | } < > ?
+  - Escape these with '\' when used literally (e.g., \., \(, \\).
+  - Special characters must be escaped even in ranges.
+
+* For each regex rule, the expression(s) it contains can be formed by applying
+  any sequence of valid regex rules, including the rule itself, thus allowing
+  for recursive composition.
```
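
The wildcard rule is the main place where this syntax differs from common regex engines: `.` excludes delimiters. A rough way to emulate that in Python is to build a negated character class from the delimiter set. The sketch below assumes the delimiters from the example schema, already decoded into real characters; the names `non_delimiter_class` and `is_has_number` are hypothetical.

```python
import re


def non_delimiter_class(delimiters):
    """Build a Python character class matching one non-delimiter character."""
    return "[^" + "".join(re.escape(ch) for ch in delimiters) + "]"


# Delimiters from the example schema, with escapes decoded (assumption).
DELIMITERS = " \t\r\n:,!;%"
ND = non_delimiter_class(DELIMITERS)

# Hand translation of the schema rule `hasNumber:.*\d.*`, with each
# schema wildcard replaced by the non-delimiter class.
HAS_NUMBER = re.compile(f"{ND}*\\d{ND}*")


def is_has_number(token):
    """Check whether the whole token matches the translated hasNumber rule."""
    return HAS_NUMBER.fullmatch(token) is not None
```

With this translation, `abc1` matches but `ab c1` does not, because the space is a delimiter that the wildcard cannot consume.
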