Merged
4 changes: 3 additions & 1 deletion CHANGELOG.md
@@ -7,7 +7,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## Unreleased

-- Renamed `CustomLexerContext` to `CustomLexerCursor` and separated it from the
+- Renamed `CustomLexerContext` to `LexCursor` and separated it from the
  `context.Context` type.
+- Renamed `ParserContext` to `ParseCursor` and separated it from the
+  `context.Context` type.

## [0.3.0] - 2026-01-25
50 changes: 26 additions & 24 deletions README.md
@@ -130,9 +130,9 @@ in addition to the underlying reader's position. When the token has been fully
processed it can be emitted to a channel for further processing by the `Parser`.

Developers implement the token processing portion of the lexer by implementing
-`LexState` interface for each relevant lexer state. A `CustomLexerCursor` is
-passed to each `LexState` during processing and includes a number of methods
-that can be used to advance through the input text.
+`LexState` interface for each relevant lexer state. A `LexCursor` is passed to
+each `LexState` during processing and includes a number of methods that can be
+used to advance through the input text.

For example, consider the following simple template language.

@@ -211,7 +211,7 @@ type LexState interface {
// Run returns the next state to transition to or an error. If the returned
// next state is nil or the returned error is io.EOF then the Lexer
// finishes processing normally.
-Run(context.Context, *CustomLexerCursor) (LexState, error)
+Run(context.Context, *LexCursor) (LexState, error)
}
```

@@ -238,23 +238,23 @@ advancing over the text.

```go
// lexText tokenizes normal text.
-func lexText(ctx context.Context, c *lexparse.CustomLexerCursor) (lexparse.LexState, error) {
+func lexText(ctx context.Context, cur *lexparse.LexCursor) (lexparse.LexState, error) {
for {
-p := c.PeekN(2)
+p := cur.PeekN(2)
switch string(p) {
case tokenBlockStart, tokenVarStart:
-if c.Width() > 0 {
-c.Emit(lexTypeText)
+if cur.Width() > 0 {
+cur.Emit(lexTypeText)
}
return lexparse.LexStateFn(lexCode), nil
default:
}

// Advance the input.
-if !c.Advance() {
+if !cur.Advance() {
// End of input. Emit the text up to this point.
-if c.Width() > 0 {
-c.Emit(lexTypeText)
+if cur.Width() > 0 {
+cur.Emit(lexTypeText)
}
return nil, nil
}
@@ -326,7 +326,7 @@ flowchart-elk TD

Similar to the lexer API, each parser state is represented by an object
implementing the `ParseState` interface. It contains only a single `Run` method
-which handles processing input tokens while in that state. A `ParserContext` is
+which handles processing input tokens while in that state. A `ParseCursor` is
passed to each `ParseState` during processing and includes a number of methods
that can be used to examine the current token, advance to the next token, and
manipulate the AST.
@@ -335,10 +335,12 @@ manipulate the AST.
// ParseState is the state of the current parsing state machine. It defines the
// logic to process the current state and returns the next state.
type ParseState[V comparable] interface {
-// Run returns the next state to transition to or an error. If the returned
-// next state is nil or the returned error is io.EOF then the Lexer
-// finishes processing normally.
-Run(*ParserContext[V]) (ParseState[V], error)
+// Run executes the logic at the current state, returning an error if one is
+// encountered. Implementations are expected to add new Node objects to
+// the AST using ParseCursor.Push or ParseCursor.Node. As necessary, new
+// parser state should be pushed onto the stack as needed using
+// Parser.PushState.
+Run(ctx context.Context, cur *ParseCursor[V]) error
}
```
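The stack discipline implied by this interface can also be sketched standalone. The `parser` type, `pushState`, and state functions below are hypothetical stand-ins for `ParseCursor` and `Parser.PushState`, illustrating only the pattern of popping a state, running it, and letting it push its successors:

```go
package main

import "fmt"

// parseState mirrors the ParseState.Run shape: handle the current token(s)
// and push any follow-up states onto the parser's stack.
type parseState func(p *parser) error

// parser is a minimal stand-in holding a token stream, a state stack, and
// the output built so far (a flat list instead of an AST, for brevity).
type parser struct {
	tokens []string
	pos    int
	stack  []parseState
	out    []string
}

// pushState pushes states in reverse so they run in the order listed.
func (p *parser) pushState(states ...parseState) {
	for i := len(states) - 1; i >= 0; i-- {
		p.stack = append(p.stack, states[i])
	}
}

// run pops and executes states until the stack is empty.
func (p *parser) run() error {
	for len(p.stack) > 0 {
		st := p.stack[len(p.stack)-1]
		p.stack = p.stack[:len(p.stack)-1]
		if err := st(p); err != nil {
			return err
		}
	}
	return nil
}

// parseSeq dispatches on the next token and re-queues itself after the
// delegate, so the whole token stream is consumed.
func parseSeq(p *parser) error {
	if p.pos >= len(p.tokens) {
		return nil
	}
	if p.tokens[p.pos] == "{{" {
		p.pushState(parseVarStart, parseSeq)
	} else {
		p.pushState(parseText, parseSeq)
	}
	return nil
}

// parseText records a literal text token.
func parseText(p *parser) error {
	p.out = append(p.out, "text:"+p.tokens[p.pos])
	p.pos++
	return nil
}

// parseVarStart consumes "{{" and schedules the variable name and "}}".
func parseVarStart(p *parser) error {
	p.pos++
	p.pushState(parseVar, parseVarEnd)
	return nil
}

// parseVar records the variable name token.
func parseVar(p *parser) error {
	p.out = append(p.out, "var:"+p.tokens[p.pos])
	p.pos++
	return nil
}

// parseVarEnd consumes the closing "}}".
func parseVarEnd(p *parser) error {
	p.pos++
	return nil
}

func main() {
	p := &parser{tokens: []string{"Hello", "{{", "name", "}}"}}
	p.pushState(parseSeq)
	if err := p.run(); err != nil {
		panic(err)
	}
	fmt.Println(p.out) // [text:Hello var:name]
}
```

Because each state pushes its successors before returning, the stack acts as a to-do list of expected constructs, which is what lets `parseVarStart` schedule both the variable body and its closing delimiter in one step.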

@@ -377,16 +379,16 @@ Here we push the relevant expected state onto the parser's stack.

```go
// parseSeq delegates to another parse function based on token type.
-func parseSeq(ctx *lexparse.ParserContext[*tmplNode]) error {
-token := ctx.Peek()
+func parseSeq(ctx context.Context, cur *lexparse.ParseCursor[*tmplNode]) error {
+token := cur.Peek(ctx)

switch token.Type {
case lexTypeText:
-ctx.PushState(lexparse.ParseStateFn(parseText))
+cur.PushState(lexparse.ParseStateFn(parseText))
case lexTypeVarStart:
-ctx.PushState(lexparse.ParseStateFn(parseVarStart))
+cur.PushState(lexparse.ParseStateFn(parseVarStart))
case lexTypeBlockStart:
-ctx.PushState(lexparse.ParseStateFn(parseBlockStart))
+cur.PushState(lexparse.ParseStateFn(parseBlockStart))
}

return nil
@@ -399,11 +401,11 @@ are pushed in reverse order so that they are handled in the order listed.

```go
// parseVarStart handles var start (e.g. '{{').
-func parseVarStart(ctx *lexparse.ParserContext[*tmplNode]) error {
+func parseVarStart(ctx context.Context, cur *lexparse.ParseCursor[*tmplNode]) error {
// Consume the var start token.
-_ = ctx.Next()
+_ = cur.Next(ctx)

-ctx.PushState(
+cur.PushState(
lexparse.ParseStateFn(parseVar),
lexparse.ParseStateFn(parseVarEnd),
)
56 changes: 28 additions & 28 deletions custom.go
@@ -35,111 +35,111 @@ const EOF rune = -1
type LexState interface {
// Run returns the next state to transition to or an error. If the returned
// error is io.EOF then the Lexer finishes processing normally.
-Run(ctx context.Context, cursor *CustomLexerCursor) (LexState, error)
+Run(ctx context.Context, cur *LexCursor) (LexState, error)
}

type lexFnState struct {
-f func(context.Context, *CustomLexerCursor) (LexState, error)
+f func(context.Context, *LexCursor) (LexState, error)
}

// Run implements [LexState.Run].
//
//nolint:ireturn // Returning interface required to satisfy [LexState.Run]
-func (s *lexFnState) Run(ctx context.Context, cursor *CustomLexerCursor) (LexState, error) {
-return s.f(ctx, cursor)
+func (s *lexFnState) Run(ctx context.Context, cur *LexCursor) (LexState, error) {
+return s.f(ctx, cur)
}

// LexStateFn creates a State from the given Run function.
//
//nolint:ireturn // Returning interface required to satisfy [LexState.Run]
-func LexStateFn(f func(context.Context, *CustomLexerCursor) (LexState, error)) LexState {
+func LexStateFn(f func(context.Context, *LexCursor) (LexState, error)) LexState {
return &lexFnState{f}
}

-// CustomLexerCursor is a type that allows for processing the input for the
-// CustomLexer. It provides methods to advance the reader, emit tokens, and
+// LexCursor is a type that allows for processing the input for a
+// [CustomLexer]. It provides methods to advance the reader, emit tokens, and
// manage the current token being processed. It is designed to be used within
// the [LexState.Run] method to allow the state implementation to interact with
// the lexer without exposing the full CustomLexer implementation.
-type CustomLexerCursor struct {
+type LexCursor struct {
l *CustomLexer
}

-// NewCustomLexerCursor creates a new CustomLexerCursor.
-func NewCustomLexerCursor(l *CustomLexer) *CustomLexerCursor {
-return &CustomLexerCursor{
+// NewLexCursor creates a new [LexCursor].
+func NewLexCursor(l *CustomLexer) *LexCursor {
+return &LexCursor{
l: l,
}
}

// Advance attempts to advance the underlying reader a single rune and returns
// true if actually advanced. The current token cursor position is not updated.
-func (c *CustomLexerCursor) Advance() bool {
+func (c *LexCursor) Advance() bool {
return c.l.advance(1, false) == 1
}

// AdvanceN attempts to advance the underlying reader n runes and returns the
// number actually advanced. The current token cursor position is not updated.
-func (c *CustomLexerCursor) AdvanceN(n int) int {
+func (c *LexCursor) AdvanceN(n int) int {
return c.l.advance(n, false)
}

// Cursor returns the current position of the underlying cursor marking the
// beginning of the current token being processed.
-func (c *CustomLexerCursor) Cursor() Position {
+func (c *LexCursor) Cursor() Position {
return c.l.cursor
}

// Discard attempts to discard the next rune, advancing the current token
// cursor, and returns true if actually discarded.
-func (c *CustomLexerCursor) Discard() bool {
+func (c *LexCursor) Discard() bool {
return c.l.advance(1, true) == 1
}

// DiscardN attempts to discard n runes, advancing the current token cursor
// position, and returns the number actually discarded.
-func (c *CustomLexerCursor) DiscardN(n int) int {
+func (c *LexCursor) DiscardN(n int) int {
return c.l.advance(n, true)
}

// DiscardTo searches the input for one of the given search strings, advancing
// the reader, and stopping when one of the strings is found. The token cursor
// is advanced and data prior to the search string is discarded. The string
// found is returned. If no match is found an empty string is returned.
-func (c *CustomLexerCursor) DiscardTo(query []string) string {
+func (c *LexCursor) DiscardTo(query []string) string {
return c.l.discardTo(query)
}

// Emit emits the token between the current cursor position and reader
// position and returns the token. If the lexer is not currently active, this
// is a no-op. This advances the current token cursor.
-func (c *CustomLexerCursor) Emit(typ TokenType) *Token {
+func (c *LexCursor) Emit(typ TokenType) *Token {
return c.l.emit(typ)
}

// Find searches the input for one of the given search strings, advancing the
// reader, and stopping when one of the strings is found. The token cursor is
// not advanced. The string found is returned. If no match is found an empty
// string is returned.
-func (c *CustomLexerCursor) Find(query []string) string {
+func (c *LexCursor) Find(query []string) string {
return c.l.find(query)
}

// Ignore ignores the previous input and resets the token start position to
// the current reader position.
-func (c *CustomLexerCursor) Ignore() {
+func (c *LexCursor) Ignore() {
c.l.ignore()
}

// NextRune returns the next rune of input, advancing the reader while not
// advancing the token cursor.
-func (c *CustomLexerCursor) NextRune() rune {
+func (c *LexCursor) NextRune() rune {
return c.l.nextRune()
}

// Peek returns the next rune from the buffer without advancing the reader or
// current token cursor.
-func (c *CustomLexerCursor) Peek() rune {
+func (c *LexCursor) Peek() rune {
p := c.PeekN(1)
if len(p) < 1 {
return EOF
@@ -151,23 +151,23 @@ func (c *CustomLexerCursor) Peek() rune {
// PeekN returns the next n runes from the buffer without advancing the reader
// or current token cursor. PeekN may return fewer runes than requested if an
// error occurs or at end of input.
-func (c *CustomLexerCursor) PeekN(n int) []rune {
+func (c *LexCursor) PeekN(n int) []rune {
return c.l.peekN(n)
}

// Pos returns the current position of the underlying reader.
-func (c *CustomLexerCursor) Pos() Position {
+func (c *LexCursor) Pos() Position {
return c.l.pos
}

// Token returns the current token value.
-func (c *CustomLexerCursor) Token() string {
+func (c *LexCursor) Token() string {
return c.l.b.String()
}

// Width returns the current width of the token being processed. It is
// equivalent to l.Pos().Offset - l.Cursor().Offset.
-func (c *CustomLexerCursor) Width() int {
+func (c *LexCursor) Width() int {
return c.l.pos.Offset - c.l.cursor.Offset
}

@@ -246,7 +246,7 @@ func (l *CustomLexer) NextToken(ctx context.Context) *Token {
return l.newToken(TokenTypeEOF)
}

-cursor := NewCustomLexerCursor(l)
+cur := NewLexCursor(l)

// If we have no tokens to return, we need to run the current state.
for len(l.buf) == 0 && l.state != nil {
@@ -261,7 +261,7 @@

var err error

-l.state, err = l.state.Run(ctx, cursor)
+l.state, err = l.state.Run(ctx, cur)
l.setErr(err)

if l.err != nil {