Merged
Changes from 2 commits
4 changes: 3 additions & 1 deletion CHANGELOG.md
@@ -7,7 +7,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## Unreleased

- - Renamed `CustomLexerContext` to `CustomLexerCursor` and separated it from the
+ - Renamed `CustomLexerContext` to `LexCursor` and separated it from the
`context.Context` type.
+ - Renamed `ParserContext` to `ParseCursor` and separated it from the
+   `context.Context` type.

## [0.3.0] - 2026-01-25
49 changes: 25 additions & 24 deletions README.md
@@ -130,9 +130,9 @@ in addition to the underlying reader's position. When the token has been fully
processed it can be emitted to a channel for further processing by the `Parser`.

Developers implement the token processing portion of the lexer by implementing
- `LexState` interface for each relevant lexer state. A `CustomLexerCursor` is
- passed to each `LexState` during processing and includes a number of methods
- that can be used to advance through the input text.
+ `LexState` interface for each relevant lexer state. A `LexCursor` is passed to
+ each `LexState` during processing and includes a number of methods that can be
+ used to advance through the input text.
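The cursor-style API described here can be sketched with a small, self-contained stand-in. This is hypothetical illustration code, not the lexparse implementation: the `cursor` type and its method bodies are assumptions, showing only the general shape of a lexing cursor that tracks a token-start offset and a read offset.

```go
package main

import "fmt"

// cursor is a hypothetical stand-in for a lexing cursor: it tracks the start
// of the current token and the current read position over the input.
type cursor struct {
	input string
	start int // beginning of the current token
	pos   int // current read position
}

// PeekN returns up to n upcoming bytes without advancing either position.
func (c *cursor) PeekN(n int) string {
	end := c.pos + n
	if end > len(c.input) {
		end = len(c.input)
	}
	return c.input[c.pos:end]
}

// Advance moves the read position forward one byte, reporting whether it
// actually advanced.
func (c *cursor) Advance() bool {
	if c.pos >= len(c.input) {
		return false
	}
	c.pos++
	return true
}

// Emit returns the pending token text and resets the token start.
func (c *cursor) Emit() string {
	tok := c.input[c.start:c.pos]
	c.start = c.pos
	return tok
}

func main() {
	// Advance until a "{{" delimiter is seen, then emit the text before it.
	c := &cursor{input: "ab{{"}
	for c.PeekN(2) != "{{" && c.Advance() {
	}
	fmt.Printf("%q\n", c.Emit()) // "ab"
}
```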

For example, consider the following simple template language.

@@ -211,7 +211,7 @@ type LexState interface {
// Run returns the next state to transition to or an error. If the returned
// next state is nil or the returned error is io.EOF then the Lexer
// finishes processing normally.
- Run(context.Context, *CustomLexerCursor) (LexState, error)
+ Run(context.Context, *LexCursor) (LexState, error)
}
```
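The run-to-next-state contract of `LexState` can be sketched in isolation. The names below (`state`, `machine`, `stateWords`) are assumptions for illustration, not the library's API; the point is the pattern the interface encodes: each state does some work and returns the next state to run, with nil ending the machine.

```go
package main

import "fmt"

// state mirrors the LexState pattern: run, then hand back the next state,
// or nil to finish.
type state func(*machine) state

type machine struct {
	input string
	pos   int
	out   []string
}

// stateWords collects runs of non-space bytes as tokens.
func stateWords(m *machine) state {
	start := m.pos
	for m.pos < len(m.input) && m.input[m.pos] != ' ' {
		m.pos++
	}
	if m.pos > start {
		m.out = append(m.out, m.input[start:m.pos])
	}
	if m.pos >= len(m.input) {
		return nil // end of input: stop the machine
	}
	m.pos++ // skip the separating space
	return stateWords
}

// run drives the machine until a state returns nil.
func run(input string) []string {
	m := &machine{input: input}
	for s := state(stateWords); s != nil; s = s(m) {
	}
	return m.out
}

func main() {
	fmt.Println(run("a bb ccc")) // [a bb ccc]
}
```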

@@ -238,23 +238,23 @@ advancing over the text.

```go
// lexText tokenizes normal text.
- func lexText(ctx context.Context, c *lexparse.CustomLexerCursor) (lexparse.LexState, error) {
+ func lexText(ctx context.Context, cur *lexparse.LexCursor) (lexparse.LexState, error) {
for {
- p := c.PeekN(2)
+ p := cur.PeekN(2)
switch string(p) {
case tokenBlockStart, tokenVarStart:
- if c.Width() > 0 {
- c.Emit(lexTypeText)
+ if cur.Width() > 0 {
+ cur.Emit(lexTypeText)
}
return lexparse.LexStateFn(lexCode), nil
default:
}

// Advance the input.
- if !c.Advance() {
+ if !cur.Advance() {
// End of input. Emit the text up to this point.
- if c.Width() > 0 {
- c.Emit(lexTypeText)
+ if cur.Width() > 0 {
+ cur.Emit(lexTypeText)
}
return nil, nil
}
@@ -326,7 +326,7 @@ flowchart-elk TD

Similar to the lexer API, each parser state is represented by an object
implementing the `ParseState` interface. It contains only a single `Run` method
- which handles processing input tokens while in that state. A `ParserContext` is
+ which handles processing input tokens while in that state. A `ParseCursor` is
passed to each `ParseState` during processing and includes a number of methods
that can be used to examine the current token, advance to the next token, and
manipulate the AST.
@@ -335,10 +335,11 @@
// ParseState is the state of the current parsing state machine. It defines the
// logic to process the current state and returns the next state.
type ParseState[V comparable] interface {
- // Run returns the next state to transition to or an error. If the returned
- // next state is nil or the returned error is io.EOF then the Lexer
- // finishes processing normally.
- Run(*ParserContext[V]) (ParseState[V], error)
+ // Run executes the logic at the current state, returning an error if one is
+ // encountered. Implementations are expected to add new [Node] objects to
+ // the AST using [Parser.Push] or [Parser.Node]. New parser states should be
+ // pushed onto the stack as needed using [Parser.PushState].
+ Run(ctx context.Context, cur *ParseCursor[V]) error
}
```
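The stack-driven flow that `PushState` implies can be sketched with a self-contained stand-in. The names here (`parseFn`, `parser`, `pushState`, `run`) are assumptions for illustration, not the lexparse API: a variadic push stores states in reverse so a LIFO stack pops them in the order listed, which matches the README's later note that states "are pushed in reverse order so that they are handled in the order listed."

```go
package main

import "fmt"

// parseFn mirrors the ParseState pattern: each state runs and may push
// further states onto the parser's stack.
type parseFn func(p *parser) error

type parser struct {
	stack []parseFn
	trace []string
}

// pushState pushes states in reverse so they execute in the order listed.
func (p *parser) pushState(states ...parseFn) {
	for i := len(states) - 1; i >= 0; i-- {
		p.stack = append(p.stack, states[i])
	}
}

// run pops and executes states until the stack is empty.
func (p *parser) run() error {
	for len(p.stack) > 0 {
		s := p.stack[len(p.stack)-1]
		p.stack = p.stack[:len(p.stack)-1]
		if err := s(p); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	p := &parser{}
	first := func(p *parser) error { p.trace = append(p.trace, "first"); return nil }
	second := func(p *parser) error { p.trace = append(p.trace, "second"); return nil }
	p.pushState(first, second)
	_ = p.run()
	fmt.Println(p.trace) // [first second]
}
```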

@@ -377,16 +378,16 @@ Here we push the later relevant expected state onto the parser's stack.

```go
// parseSeq delegates to another parse function based on token type.
- func parseSeq(ctx *lexparse.ParserContext[*tmplNode]) error {
- token := ctx.Peek()
+ func parseSeq(ctx context.Context, cur *lexparse.ParseCursor[*tmplNode]) error {
+ token := cur.Peek(ctx)

switch token.Type {
case lexTypeText:
- ctx.PushState(lexparse.ParseStateFn(parseText))
+ cur.PushState(lexparse.ParseStateFn(parseText))
case lexTypeVarStart:
- ctx.PushState(lexparse.ParseStateFn(parseVarStart))
+ cur.PushState(lexparse.ParseStateFn(parseVarStart))
case lexTypeBlockStart:
- ctx.PushState(lexparse.ParseStateFn(parseBlockStart))
+ cur.PushState(lexparse.ParseStateFn(parseBlockStart))
}

return nil
@@ -399,11 +400,11 @@ are pushed in reverse order so that they are handled in the order listed.

```go
// parseVarStart handles var start (e.g. '{{').
- func parseVarStart(ctx *lexparse.ParserContext[*tmplNode]) error {
+ func parseVarStart(ctx context.Context, cur *lexparse.ParseCursor[*tmplNode]) error {
// Consume the var start token.
- _ = ctx.Next()
+ _ = cur.Next(ctx)

- ctx.PushState(
+ cur.PushState(
lexparse.ParseStateFn(parseVar),
lexparse.ParseStateFn(parseVarEnd),
)
54 changes: 27 additions & 27 deletions custom.go
@@ -35,111 +35,111 @@ const EOF rune = -1
type LexState interface {
// Run returns the next state to transition to or an error. If the returned
// error is io.EOF then the Lexer finishes processing normally.
- Run(ctx context.Context, cursor *CustomLexerCursor) (LexState, error)
+ Run(ctx context.Context, cur *LexCursor) (LexState, error)
}

type lexFnState struct {
- f func(context.Context, *CustomLexerCursor) (LexState, error)
+ f func(context.Context, *LexCursor) (LexState, error)
}

// Run implements [LexState.Run].
//
//nolint:ireturn // Returning interface required to satisfy [LexState.Run]
- func (s *lexFnState) Run(ctx context.Context, cursor *CustomLexerCursor) (LexState, error) {
- return s.f(ctx, cursor)
+ func (s *lexFnState) Run(ctx context.Context, cur *LexCursor) (LexState, error) {
+ return s.f(ctx, cur)
}

// LexStateFn creates a State from the given Run function.
//
//nolint:ireturn // Returning interface required to satisfy [LexState.Run]
- func LexStateFn(f func(context.Context, *CustomLexerCursor) (LexState, error)) LexState {
+ func LexStateFn(f func(context.Context, *LexCursor) (LexState, error)) LexState {
return &lexFnState{f}
}

- // CustomLexerCursor is a type that allows for processing the input for the
+ // LexCursor is a type that allows for processing the input for the
// CustomLexer. It provides methods to advance the reader, emit tokens, and
// manage the current token being processed. It is designed to be used within
// the [LexState.Run] method to allow the state implementation to interact with
// the lexer without exposing the full CustomLexer implementation.
- type CustomLexerCursor struct {
+ type LexCursor struct {
l *CustomLexer
}

- // NewCustomLexerCursor creates a new CustomLexerCursor.
- func NewCustomLexerCursor(l *CustomLexer) *CustomLexerCursor {
- return &CustomLexerCursor{
+ // NewLexCursor creates a new LexCursor.
+ func NewLexCursor(l *CustomLexer) *LexCursor {
+ return &LexCursor{
l: l,
}
}

// Advance attempts to advance the underlying reader a single rune and returns
// true if actually advanced. The current token cursor position is not updated.
- func (c *CustomLexerCursor) Advance() bool {
+ func (c *LexCursor) Advance() bool {
return c.l.advance(1, false) == 1
}

// AdvanceN attempts to advance the underlying reader n runes and returns the
// number actually advanced. The current token cursor position is not updated.
- func (c *CustomLexerCursor) AdvanceN(n int) int {
+ func (c *LexCursor) AdvanceN(n int) int {
return c.l.advance(n, false)
}

// Cursor returns the current position of the underlying cursor marking the
// beginning of the current token being processed.
- func (c *CustomLexerCursor) Cursor() Position {
+ func (c *LexCursor) Cursor() Position {
return c.l.cursor
}

// Discard attempts to discard the next rune, advancing the current token
// cursor, and returns true if actually discarded.
- func (c *CustomLexerCursor) Discard() bool {
+ func (c *LexCursor) Discard() bool {
return c.l.advance(1, true) == 1
}

// DiscardN attempts to discard n runes, advancing the current token cursor
// position, and returns the number actually discarded.
- func (c *CustomLexerCursor) DiscardN(n int) int {
+ func (c *LexCursor) DiscardN(n int) int {
return c.l.advance(n, true)
}

// DiscardTo searches the input for one of the given search strings, advancing
// the reader, and stopping when one of the strings is found. The token cursor
// is advanced and data prior to the search string is discarded. The string
// found is returned. If no match is found an empty string is returned.
- func (c *CustomLexerCursor) DiscardTo(query []string) string {
+ func (c *LexCursor) DiscardTo(query []string) string {
return c.l.discardTo(query)
}

// Emit emits the token between the current cursor position and reader
// position and returns the token. If the lexer is not currently active, this
// is a no-op. This advances the current token cursor.
- func (c *CustomLexerCursor) Emit(typ TokenType) *Token {
+ func (c *LexCursor) Emit(typ TokenType) *Token {
return c.l.emit(typ)
}

// Find searches the input for one of the given search strings, advancing the
// reader, and stopping when one of the strings is found. The token cursor is
// not advanced. The string found is returned. If no match is found an empty
// string is returned.
- func (c *CustomLexerCursor) Find(query []string) string {
+ func (c *LexCursor) Find(query []string) string {
return c.l.find(query)
}

// Ignore ignores the previous input and resets the token start position to
// the current reader position.
- func (c *CustomLexerCursor) Ignore() {
+ func (c *LexCursor) Ignore() {
c.l.ignore()
}

// NextRune returns the next rune of input, advancing the reader while not
// advancing the token cursor.
- func (c *CustomLexerCursor) NextRune() rune {
+ func (c *LexCursor) NextRune() rune {
return c.l.nextRune()
}

// Peek returns the next rune from the buffer without advancing the reader or
// current token cursor.
- func (c *CustomLexerCursor) Peek() rune {
+ func (c *LexCursor) Peek() rune {
p := c.PeekN(1)
if len(p) < 1 {
return EOF
@@ -151,23 +151,23 @@ func (c *CustomLexerCursor) Peek() rune {
// PeekN returns the next n runes from the buffer without advancing the reader
// or current token cursor. PeekN may return fewer runes than requested if an
// error occurs or at end of input.
- func (c *CustomLexerCursor) PeekN(n int) []rune {
+ func (c *LexCursor) PeekN(n int) []rune {
return c.l.peekN(n)
}

// Pos returns the current position of the underlying reader.
- func (c *CustomLexerCursor) Pos() Position {
+ func (c *LexCursor) Pos() Position {
return c.l.pos
}

// Token returns the current token value.
- func (c *CustomLexerCursor) Token() string {
+ func (c *LexCursor) Token() string {
return c.l.b.String()
}

// Width returns the current width of the token being processed. It is
// equivalent to l.Pos().Offset - l.Cursor().Offset.
- func (c *CustomLexerCursor) Width() int {
+ func (c *LexCursor) Width() int {
return c.l.pos.Offset - c.l.cursor.Offset
}

@@ -246,7 +246,7 @@ func (l *CustomLexer) NextToken(ctx context.Context) *Token {
return l.newToken(TokenTypeEOF)
}

- cursor := NewCustomLexerCursor(l)
+ cur := NewLexCursor(l)

// If we have no tokens to return, we need to run the current state.
for len(l.buf) == 0 && l.state != nil {
Expand All @@ -261,7 +261,7 @@ func (l *CustomLexer) NextToken(ctx context.Context) *Token {

var err error

- l.state, err = l.state.Run(ctx, cursor)
+ l.state, err = l.state.Run(ctx, cur)
l.setErr(err)

if l.err != nil {