This document outlines the architectural issues in the Algow compiler and provides a detailed plan to bring the codebase to production quality. The goal is to transform this from a working prototype into a canonical, textbook implementation of a compiler pipeline.
After this refactor, the codebase must look like it was developed by the combined efforts of:
- The MIT CSAIL programming languages group
- The Berkeley PLSE lab
- Rob Pike (simplicity, clarity, no cleverness)
- Linus Torvalds (uncompromising quality, zero tolerance for hacks)
- Simon Peyton Jones (principled compiler architecture)
This is not hyperbole. This is the bar. Anything less is failure.
1. No Shortcuts. Ever.
If the right solution takes 3 days and a hack takes 30 minutes, we take 3 days. Shortcuts compound. One hack justifies another. Before you know it, you have a codebase held together by duct tape and prayers.
2. No Hacks. Ever.
A hack is any code that:
- Works around a problem instead of fixing it
- Has a comment like "TODO: fix this properly later"
- Makes you feel slightly ashamed when you write it
- Requires explaining "why it's done this weird way"
If you find yourself writing a hack, STOP. Go back. Fix the underlying problem.
3. No Workarounds. Ever.
A workaround is a hack's cousin. It's code that exists because something else is broken. Instead of writing a workaround, fix the thing that's broken.
Example of a workaround we're removing:
// 300 lines of offsetSpan, offsetExpr, offsetPattern...
// These exist because spans don't know their file.
// Workaround: manually adjust every span for multi-file compilation.
// Fix: spans know their file. No adjustment needed.4. Follow Canonical Patterns
We invent nothing. Every problem we face has been solved before by smarter people. Our job is to:
- Identify the established solution
- Implement it correctly
- Document why we chose it
Examples:
- Type inference → Algorithm W (Damas-Milner). Not our invention.
- ANF transformation → Standard technique from literature. Not our invention.
- Pattern exhaustiveness → Standard algorithm. Not our invention.
- Tail call optimization → Standard transformation. Not our invention.
If you're tempted to invent a "clever" solution, resist. Find the canonical approach.
5. The Type System Is The Specification
If the type says span: Span, then span is always present. Period.
If the type says span?: Span, then span might be undefined, and every consumer must handle that case.
We do not:
- Use
!to assert something is non-null when the type says it could be - Use
asto cast away type errors - Ignore TypeScript errors with
@ts-ignore
If the type is wrong, fix the type. If the implementation doesn't match the type, fix the implementation.
6. One Way To Do Things
There should be one obvious way to:
- Look up a type for a node
- Get the span of any AST element
- Find the definition of a name
- Report an error with location
Not two ways. Not "it depends." One way.
Before considering any phase complete, verify:
-
No optional types where values are always present. If something is always there, the type reflects that.
-
No type assertions (
as,!). If you need to assert, the types are wrong. Fix the types. -
No
anytypes. Everything has a precise type. -
No comments explaining workarounds. If code needs a comment explaining why it's weird, it shouldn't be weird.
-
No TODO comments. Either do it now or create an issue. TODOs in code are lies we tell ourselves.
-
No dead code. If it's not used, delete it. Version control remembers.
-
No duplicated logic. If two places do the same thing, extract it. DRY without being clever.
-
Single responsibility. Each function does one thing. Each file has one purpose. If you can't name it clearly, it's doing too much.
-
Consistent naming. Same concepts use same names everywhere. A "span" is always a "span", not sometimes "location" or "range" or "position".
-
Tests pass. Not "mostly pass" or "pass with known failures." Pass.
A phase is done when:
- All checklist items for that phase are complete
- All quality checklist items pass
- The test suite passes with no skipped tests
- A cold read of the changed code reveals no surprises
- You would be proud to show this code in a job interview
- You would be comfortable if Simon Peyton Jones reviewed it
- "It works but I'll clean it up later" → Not done
- "The types are a bit loose but it's fine" → Not done
- "There's a small hack but it's just temporary" → Not done
- "I had to add a workaround for edge case X" → Not done
- "The tests pass if you ignore these two" → Not done
It is better to:
- Leave a feature unimplemented than implement it with hacks
- Have a smaller codebase that's correct than a larger one with workarounds
- Take longer to do it right than ship something broken quickly
- Say "I don't know how to do this properly yet" than pretend a hack is acceptable
If the right way isn't clear, STOP. Research. Ask. Think. Do not proceed with a hack while you "figure out the right way later." Later never comes.
- Core Problems
- Detailed Issue Analysis
- Refactoring Phases
- Implementation Checklist
- File-by-File Changes
Throughout the codebase, spans are defined as optional:
// Current (bad)
type SExpr = {
kind: "SVar";
name: string;
span?: Span; // Why optional?
};There is no valid reason for spans to be optional. Every token in the source has a location. Every AST node is constructed from tokens. Therefore, every node has a span. Period.
The only scenario where a span might be "missing" is for compiler-generated nodes (e.g., during desugaring). But even then, these nodes should inherit or synthesize spans from their source nodes—not have undefined spans.
Making spans optional has cascading consequences:
- Every consumer must handle
undefinedwith?.or! - LSP features become unreliable ("sometimes we have position info, sometimes not")
- Debugging becomes harder ("where did this node come from?")
- The type system can't help catch missing span bugs
Names are represented inconsistently:
// Surface AST: name is a string, span is separate
type SDeclLet = {
name: string;
nameSpan?: Span; // Separate from name!
// ...
};
// Core AST: name has id but no span at all
type Name = {
id: number;
original: string;
// No span!
};A name IS a node. It has:
- A string value (what it says)
- A location (where it appears)
- An identity (unique ID for the binding it refers to)
These three things should be bundled together, always.
Current State:
In src/surface.ts, bindings have name: string and nameSpan?: Span as siblings:
export type SDeclLet = {
readonly kind: "SDeclLet";
readonly name: string;
readonly nameSpan?: Span;
readonly value: SExpr;
readonly span?: Span;
};
export type SLet = {
readonly kind: "SLet";
readonly name: string;
readonly nameSpan?: Span;
readonly value: SExpr;
readonly body: SExpr;
readonly span?: Span;
};In src/core.ts, the Name type has an ID but no span:
export type Name = {
readonly id: number;
readonly original: string;
};Why This Is Wrong:
-
Separation of concerns violation: The name string and its location are intrinsically linked. Separating them means you can have a name without knowing where it came from.
-
Inconsistent tracking: Some nodes have
nameSpan, others don't. The parser has to remember to set bothnameandnameSpanseparately. -
Lost information in Core AST: When we desugar Surface → Core, we lose span information because
Namedoesn't have a span field. -
LSP workarounds: To support "go to definition" and "hover", we had to create separate tracking structures (
typeMap,exprTypeMap) indexed by position because names don't carry their own positions.
What Should Exist:
// A Name is a node with all three properties
export type Name = {
readonly id: number; // Unique identifier (assigned during resolution)
readonly text: string; // The actual name string
readonly span: Span; // Where this name appears in source
};
// Before resolution, we have UnresolvedName (no id yet)
export type UnresolvedName = {
readonly text: string;
readonly span: Span;
};Current State:
Every AST node type has span?: Span:
// src/surface.ts
export type SVar = {
readonly kind: "SVar";
readonly name: string;
readonly span?: Span;
};
export type SApp = {
readonly kind: "SApp";
readonly func: SExpr;
readonly arg: SExpr;
readonly span?: Span;
};
// ... every single node typeWhy This Is Wrong:
-
Parsing always produces spans: The parser reads tokens which have positions. When it constructs nodes, it knows exactly where they start and end. There's no reason for the span to be missing.
-
Null checks everywhere: Code that uses spans must constantly check for
undefined:if (expr.span) { const range = spanToRange(lineIndex, expr.span); } // What do we do if there's no span? Crash? Return null? Guess?
-
Type system failure: TypeScript can't help us catch "oops, I forgot to set the span" bugs because
undefinedis always valid.
What Should Exist:
// Span is ALWAYS present
export type SVar = {
readonly kind: "SVar";
readonly name: UnresolvedName; // Name has span built-in
readonly span: Span; // Node's overall span (required)
};For compiler-generated nodes (during desugaring), use a synthetic span that points to the source construct:
// When desugaring `if c then a else b` into a match:
// The generated match node gets the span of the original if-expression
const matchSpan = ifExpr.span; // Always available, always validCurrent State:
src/compile.ts contains 300+ lines of span offsetting functions:
const offsetSpan = (span: Span | undefined, offset: number): Span | undefined => {
if (!span) return undefined;
return { start: span.start + offset, end: span.end + offset };
};
const offsetType = (type: SType, offset: number): SType => {
/* 30 lines */
};
const offsetPattern = (pattern: SPattern, offset: number): SPattern => {
/* 50 lines */
};
const offsetCase = (c: SCase, offset: number): SCase => {
/* ... */
};
const offsetDoStmt = (stmt: SDoStmt, offset: number): SDoStmt => {
/* ... */
};
const offsetExpr = (expr: SExpr, offset: number): SExpr => {
/* 120 lines */
};
const offsetDecl = (decl: SDecl, offset: number): SDecl => {
/* 50 lines */
};
const offsetProgram = (program: SProgram, offset: number): SProgram => {
/* ... */
};Why This Is Wrong:
-
Manual traversal for every node type: Every time we add a new AST node kind, we must remember to add offset handling. Easy to forget, hard to test.
-
Repetitive boilerplate: These functions do almost identical things—recursively walk the tree and add an offset to spans.
-
Indicates broken abstraction: If we need to manually adjust positions for multi-file compilation, our position model is wrong.
What Should Exist:
Two options:
Option A: File-relative spans with file ID
type Span = {
readonly fileId: number; // Which file
readonly start: number; // Offset within that file
readonly end: number;
};No offsetting needed—each file's spans are independent.
Option B: Generic AST traversal
// One function to rule them all
const mapSpans = <T extends ASTNode>(node: T, f: (span: Span) => Span): T => {
// Generic traversal that handles all node types
};
// Usage
const adjusted = mapSpans(program, (span) => ({
...span,
start: span.start + offset,
end: span.end + offset,
}));Option A is cleaner and eliminates the problem at its source.
Current State:
Type information lives in multiple places:
// src/checker.ts
export type CheckOutput = {
typeEnv: TypeEnv; // Map<string, Scheme> - name to type
typeMap: Map<number, Scheme>; // Map<nameId, Scheme>
exprTypeMap: Map<number, Type>; // Map<globalOffset, Type>
tupleProjections: Map<number, Type>; // Special case for .0, .1 access
constructorRegistry: ConstructorRegistry;
// ...
};Why This Is Wrong:
-
No single source of truth: To find "what is the type at position X?", you must:
- Check if it's a name reference → look in
typeMapby name ID - Check if it's an expression → look in
exprTypeMapby position - Check if it's a tuple projection → look in
tupleProjections - Fall back to inferring from context
- Check if it's a name reference → look in
-
Position-based indexing is fragile:
exprTypeMapis keyed by global byte offset. This breaks if:- File order changes
- We want incremental compilation
- We forget to offset positions correctly
-
LSP code is complex:
src/lsp/analysis.ts:236-270has to check multiple sources with fallbacks.
What Should Exist:
Every AST node should have a unique ID assigned during parsing:
type NodeId = number;
type SExpr = {
readonly nodeId: NodeId; // Unique across entire compilation
readonly kind: "SVar";
readonly name: UnresolvedName;
readonly span: Span;
};Type information is stored in one map:
type TypeMap = Map<NodeId, Type>;Simple. One lookup. No fallbacks.
Current State:
The resolver optionally builds a symbol table:
// src/resolve.ts
export const resolveProgram = (
program: CProgram,
env: Env = new Map(),
constructorTypes: ConstructorTypes = new Set(),
knownTypeNames: Set<string> = new Set(),
fileRegistry?: FileRegistry, // Optional! Only for LSP
): ResolveResult => {
// ...
// Symbol table only built if fileRegistry provided
const symbolTable = fileRegistry ? createSymbolTableBuilder() : null;Why This Is Wrong:
-
Two code paths: The resolver behaves differently with/without LSP mode. This means bugs might only appear in one mode.
-
Symbol information is duplicated: The resolver tracks bindings in
Env, then separately inSymbolTable. They can get out of sync. -
Type enrichment is a hack: After type checking, we call
enrichWithTypes()to add type info to the symbol table. This is a post-hoc patch, not integrated design.
What Should Exist:
The resolver ALWAYS builds a symbol table. For CLI compilation, we just don't use it. The cost is negligible (it's just storing information we already compute), and it guarantees consistency.
export const resolveProgram = (
program: CProgram,
fileRegistry: FileRegistry, // Required, not optional
): ResolveResult => {
const symbolTable = createSymbolTable();
// Always track symbols
};Current State:
The Core AST (src/core.ts) has no span information:
export type CExpr =
| CVar
| CLit
| CApp
| CAbs
| CLet
| CLetRec
| CMatch
| CCon
| CTuple
| CRecord
| CRecordUpdate
| CField;
// None of these have spans!
export type CVar = {
readonly kind: "CVar";
readonly name: Name; // Name has no span either
};Why This Is Wrong:
-
Error messages in type checker reference positions: When we report "type mismatch at line 5", we need to know where the expression is. Without spans in Core, we lose this.
-
Desugaring erases origin: When
if c then a else bbecomes a match expression, we lose the ability to point back to theifkeyword. -
Forces position-based hacks: The checker uses
exprTypeMapkeyed by global offset because nodes don't have their own identity/position.
What Should Exist:
Core AST should preserve spans:
export type CExpr =
| { kind: "CVar"; name: Name; span: Span }
| { kind: "CLit"; value: Literal; span: Span }
| { kind: "CApp"; func: CExpr; arg: CExpr; span: Span };
// ...Alternatively, use node IDs that map back to spans:
type NodeId = number;
type CExpr = { kind: "CVar"; nodeId: NodeId; name: Name };
// ...
// Spans stored in: Map<NodeId, Span>Current State:
src/checker.ts is 1817 lines containing:
- Type context and environment management
- Unification algorithm
- Type inference for all expression types
- Pattern type inference
- Exhaustiveness checking
- SCC analysis for recursive bindings
- Type class constraint handling
Why This Is Wrong:
Single Responsibility Principle. A 1800-line file is hard to:
- Navigate
- Test in isolation
- Understand as a whole
- Modify safely
What Should Exist:
Split into focused modules:
src/checker/
├── index.ts # Main API, ties everything together
├── context.ts # CheckContext, TypeEnv management
├── unify.ts # Unification algorithm
├── infer.ts # Expression type inference
├── patterns.ts # Pattern type inference
├── exhaustiveness.ts # Exhaustiveness checking
├── constraints.ts # Type class constraint solving
└── scc.ts # Strongly connected components analysis
Current State:
src/optimize.ts is 2655 lines containing:
- Alpha renaming
- Constant folding
- Beta reduction
- Eta reduction
- Dead code elimination
- Tail call optimization
- Case-of-known-constructor
- Case-of-case transformation
- Let floating (inward and outward)
- Function inlining
Why This Is Wrong:
Same as above. Plus:
- Each optimization has its own environment type (
ExtendedEnv,InlineEnv,FunctionEnv,ConstEnv) - No clear pass interface
- Difficult to enable/disable individual optimizations
- Hard to add new optimizations
What Should Exist:
src/optimize/
├── index.ts # Optimization pipeline orchestration
├── pass.ts # Pass interface definition
├── traverse.ts # Common AST traversal utilities
├── fold.ts # Constant folding
├── beta.ts # Beta reduction
├── eta.ts # Eta reduction
├── dce.ts # Dead code elimination
├── tco.ts # Tail call optimization
├── case-of-known.ts # Case-of-known-constructor
├── case-of-case.ts # Case-of-case transformation
├── float.ts # Let floating
└── inline.ts # Function inlining
Current State:
Every match expression generates an async IIFE:
// src/codegen.ts:418-534
const genMatch = (ctx: CodeGenContext, match: IR.IRMatch): string => {
const lines: string[] = [];
lines.push("(async () => {");
// ... generate match body ...
lines.push("})()");
return "await " + lines.join("\n");
};Generated code:
const result = await (async () => {
if (x === null) {
return 0;
} else {
return x.h;
}
})();Why This Is Wrong:
- Bloated output: Every match becomes a function call
- Performance overhead: Creating and awaiting async functions has cost
- Harder to debug: Stack traces show anonymous functions
What Should Exist:
Track whether we're in an expression or statement context:
// In statement context, generate directly:
let result;
if (x === null) {
result = 0;
} else {
result = x.h;
}
// In expression context, use ternary when possible:
const result = x === null ? 0 : x.h;
// Only use IIFE when truly necessary (complex multi-branch in expression position)Current State:
// src/lexer.ts:45
export type Token = [TokenKind, number, number];Why This Is Wrong:
// Hard to read
const kind = token[0];
const start = token[1];
const end = token[2];
// vs
const { kind, start, end } = token;What Should Exist:
export type Token = {
readonly kind: TokenKind;
readonly start: number;
readonly end: number;
};Goal: Make all spans required throughout the AST.
Rationale: This is foundational. Until spans are reliable, we can't build reliable LSP features or error messages.
Changes:
- Change
span?: Spantospan: Spanin all Surface AST types - Update parser to always provide spans (it already does, we just need to make types honest)
- Add
span: Spanto Core AST types - Update desugaring to propagate/synthesize spans
- Remove all
span?.null checks
Goal: Names are first-class nodes with text, span, and (after resolution) ID.
Rationale: This eliminates the name/nameSpan split and carries location through the pipeline.
Changes:
- Define
UnresolvedName = { text: string; span: Span } - Define
Name = { id: number; text: string; span: Span } - Replace
name: string+nameSpan?: Spanwithname: UnresolvedNamein Surface AST - Replace
Namein Core AST to include span - Update parser to create
UnresolvedNamenodes - Update desugarer and resolver accordingly
- Remove all
nameSpanfields
Goal: Spans know which file they belong to, eliminating manual offset adjustment.
Rationale: Multi-file compilation should be transparent, not require 300 lines of offsetting.
Changes:
- Add
FileRegistrythat assigns IDs to files - Change
Span = { fileId: number; start: number; end: number } - Remove all
offset*functions fromcompile.ts - Update span-to-position conversion to use file registry
Goal: Every AST node has a unique ID; type information is stored in one map.
Rationale: Simplifies type lookup and enables incremental compilation.
Changes:
- Add
nodeId: numberto all AST node types - Create ID generator used during parsing
- Replace
typeEnv,typeMap,exprTypeMap,tupleProjectionswith singleTypeMap<NodeId, Type> - Update checker to store types by node ID
- Update LSP to use node ID for type lookup
Goal: Symbol table is always built, not optional.
Rationale: Eliminates dual code paths and ensures consistency.
Changes:
- Make
fileRegistryrequired parameter in resolver - Always build symbol table during resolution
- Store type information directly in symbol table during checking
- Remove
enrichWithTypespost-processing
Goal: Break large files into focused modules.
Rationale: Improves maintainability, testability, and comprehension.
Changes:
- Split
checker.tsintosrc/checker/directory - Split
optimize.tsintosrc/optimize/directory - Define clear interfaces between modules
Goal: Generate cleaner JavaScript without unnecessary async IIFEs.
Rationale: Better output quality and runtime performance.
Changes:
- Track expression vs statement context in codegen
- Generate direct statements when in statement context
- Use ternary operators for simple matches
- Only use IIFE when necessary
- 1.1 Update
Spandefinition to be always required - 1.2 Update all Surface AST types (
SExpr,SPattern,SType,SDecl) to havespan: Span - 1.3 Update parser to always set span (remove any
undefinedassignments) - 1.4 Add span field to all Core AST types (
CExpr,CPattern,CDecl) - 1.5 Update desugarer to propagate spans from Surface to Core
- 1.6 Update desugarer synthetic nodes to use source span
- 1.7 Remove all
?.optional chaining on span access throughout codebase - 1.8 Remove all
span && ...conditional checks - 1.9 Update tests to provide spans in test AST construction
- 1.10 Run full test suite and fix any issues
- 2.1 Define
UnresolvedNametype:{ text: string; span: Span } - 2.2 Define
ResolvedNametype:{ id: number; text: string; span: Span } - 2.3 Update
SVarto usename: UnresolvedNameinstead ofname: string - 2.4 Update
SLetto usename: UnresolvedName, removenameSpan - 2.5 Update
SLetRecbindings to usename: UnresolvedName, removenameSpan - 2.6 Update
SDeclLetto usename: UnresolvedName, removenameSpan - 2.7 Update
SDeclLetRecbindings to usename: UnresolvedName, removenameSpan - 2.8 Update
SDeclForeignto usename: UnresolvedName, removenameSpan - 2.9 Update
SAbsparams to useUnresolvedName - 2.10 Update
SPVarto usename: UnresolvedName - 2.11 Update
SPAsto usename: UnresolvedName - 2.12 Update
Namein Core AST to includespan: Span - 2.13 Update parser to create
UnresolvedNamenodes - 2.14 Update desugarer to convert
UnresolvedNametoName(with placeholder id=-1) - 2.15 Update resolver to assign IDs and preserve spans in
Name - 2.16 Update all code that accesses
namestring to usename.text - 2.17 Remove all references to
nameSpanfield - 2.18 Update formatter to use
name.text - 2.19 Update LSP analysis to use
name.spandirectly - 2.20 Run full test suite and fix any issues
- 3.1 Update
Spantype to includefileId: number - 3.2 Move
FileRegistryfrom LSP to core (it's needed by all spans now) - 3.3 Update parser to accept file ID and embed in all spans
- 3.4 Update
compile.tsto create file registry before parsing - 3.5 Remove
offsetSpanfunction - 3.6 Remove
offsetTypefunction - 3.7 Remove
offsetPatternfunction - 3.8 Remove
offsetCasefunction - 3.9 Remove
offsetDoStmtfunction - 3.10 Remove
offsetExprfunction - 3.11 Remove
offsetDeclfunction - 3.12 Remove
offsetProgramfunction - 3.13 Update
spanToRangeto use file registry for content lookup - 3.14 Update diagnostic printing to resolve file from span
- 3.15 Update LSP position conversion to use file-aware spans
- 3.16 Run full test suite and fix any issues
- 4.1 Create
NodeIdtype alias:type NodeId = number - 4.2 Create
NodeIdGeneratorwithnext(): NodeIdmethod - 4.3 Add
nodeId: NodeIdto all Surface AST types - 4.4 Add
nodeId: NodeIdto all Core AST types - 4.5 Update parser to assign node IDs during parsing
- 4.6 Update desugarer to assign new node IDs to synthetic nodes
- 4.7 Create unified
nodeTypeMap = Map<NodeId, Type>in checker - 4.8 Update checker to store types by node ID
- 4.9
Remove(kept for backward compatibility with lower.ts)typeMap - 4.10 Remove
exprTypeMap: Map<number, Type>from CheckOutput - 4.11
Remove(still used for deferred tuple projections)tupleProjections - 4.12
Update LSP(uses symbolTable, not typeMap)getHoverAtPosition - 4.13 Update lower.ts type lookup to use node IDs
- 4.14 Run full test suite and fix any issues
- 5.1 Make
fileRegistryrequired inresolveProgramsignature - 5.2 Always create symbol table in resolver (remove conditional)
- 5.3 Update
ResolveResultto always includesymbolTableBuilder(mutable) - 5.4 Move type storage into symbol table during checking
- 5.5 Remove
enrichWithTypesfunction (replaced byfreezeSymbolTableWithSubst) - 5.6 Update
compileForLSPto use integrated symbol table - 5.7 Update regular
compileto pass file registry - 5.8 Ensure symbol table is populated even in CLI mode
- 5.9 Run full test suite and fix any issues
- 6.1 Create
src/checker/directory - 6.2 Extract
CheckContexttype and functions tosrc/checker/context.ts - 6.3 Extract unification to
src/checker/unify.ts - 6.4 Extract expression inference to
src/checker/infer.ts - 6.5 Extract pattern inference to
src/checker/patterns.ts - 6.6 Extract exhaustiveness checking to
src/checker/exhaustiveness.ts - 6.7 Extract SCC analysis to
src/checker/scc.ts - 6.8 Create
src/checker/index.tsthat re-exports public API - 6.9 Create
src/optimize/directory - 6.10
DefineCreatedPassinterface insrc/optimize/pass.tssrc/optimize/types.tswith shared types instead - 6.11 Extract constant folding to
src/optimize/fold.ts(includes beta reduction, case-of-known, eta reduction) - 6.12
Extract beta reduction toCombined in fold.tssrc/optimize/beta.ts - 6.13 Extract dead code elimination to
src/optimize/dce.ts - 6.14 Extract tail call optimization to
src/optimize/tco.ts - 6.15
Extract case-of-known toCombined in fold.tssrc/optimize/case-of-known.ts - 6.16 Extract case-of-case to
src/optimize/case-of-case.ts - 6.17 Extract let floating to
src/optimize/float.ts - 6.18 Extract function inlining to
src/optimize/inline.ts - 6.19
Create common traversal utilities inCreatedsrc/optimize/traverse.tssrc/optimize/types.tsandsrc/optimize/alpha.ts - 6.20 Create
src/optimize/index.tsthat orchestrates passes - 6.21 Update imports throughout codebase
- 6.22 Run full test suite and fix any issues
- 7.1 Add
context: "expr" | "stmt"toCodeGenContext - 7.2 Implement
genExprAsStatementsfor return-position code generation - 7.3 Implement direct statement generation for matches in function bodies (no IIFE needed)
- 7.4 Implement ternary generation for simple two-branch matches (True/False and literal patterns)
- 7.5 Only generate async when necessary (lambdas and IIFEs detect if body requires await)
- 7.6 Pure functions (no await needed) are now non-async
- 7.7 Add output quality tests comparing generated code (optional)
- 7.8 Run full test suite and fix any issues
-
C.1 Change(tuple is intentional for efficiency)Tokenfrom tuple to object type - C.2 Remove unnecessary type assertions (used readonly types, type guards)
- C.3 Add stricter TypeScript settings (noUnusedLocals, noUnusedParameters)
- C.4 Review and remove dead code (removed unused exports, TODO comments)
- C.5 Ensure consistent naming conventions (verified consistent patterns)
- C.6 Add missing JSDoc comments to public APIs
Current:
- All nodes have
span?: Span - Names are strings with separate
nameSpanfields
After Phase 1:
- All nodes have
span: Span
After Phase 2:
- Names use
UnresolvedNametype with embedded span nameSpanfields removed
After Phase 3:
SpanincludesfileId
After Phase 4:
- All nodes have
nodeId: NodeId
Current:
Namehasidandoriginalbut no span- No spans on any nodes
After Phase 1:
- All nodes have
span: Span
After Phase 2:
Nameincludesspan: Span
After Phase 4:
- All nodes have
nodeId: NodeId
Current:
- Creates nodes with optional spans
- Creates
name: stringfields
After Phase 1:
- Always sets span (types now require it)
After Phase 2:
- Creates
UnresolvedNamenodes for all names
After Phase 3:
- Accepts
fileIdparameter, embeds in all spans
After Phase 4:
- Uses
NodeIdGeneratorto assign IDs
Current:
- Creates Core nodes without spans
After Phase 1:
- Propagates spans from Surface to Core
- Synthesizes spans for compiler-generated nodes
After Phase 2:
- Converts
UnresolvedNametoName(with span)
Current:
- Optional
fileRegistryparameter - Conditional symbol table creation
After Phase 5:
- Required
fileRegistryparameter - Always builds symbol table
Current:
- 1817 lines, multiple concerns
- Multiple type maps
After Phase 4:
- Single
TypeMap<NodeId, Type>
After Phase 6:
- Split into
src/checker/directory with focused modules
Current:
- 300+ lines of span offsetting functions
After Phase 3:
- All offset functions removed (spans are file-aware)
Current:
- 2655 lines, all optimizations in one file
After Phase 6:
- Split into
src/optimize/directory - Each optimization in its own file
- Common traversal utilities shared
Current:
- Generates async IIFEs for all matches
After Phase 7:
- Context-aware code generation
- Direct statements when possible
- Ternaries for simple matches
- IIFEs only when necessary
Current:
Tokenis a tuple[TokenKind, number, number](intentional for efficiency - no change needed)
Current:
- Complex type lookup with multiple fallbacks
After Phase 4:
- Simple lookup by node ID
After Phase 5:
- Uses integrated symbol table
-
Types don't lie. If something can be
undefined, the type says| undefined. If something is always present, the type doesn't allowundefined. -
Spans are sacred. Every piece of source code has a location. That location should be preserved through all transformations.
-
Names are nodes. A name isn't just a string—it's a piece of source code with a location and (after resolution) an identity.
-
One source of truth. Type information lives in one place. Symbol information lives in one place. No duplication, no synchronization bugs.
-
Small, focused modules. Each file has one job. If you can't describe the file's purpose in one sentence, split it.
-
No special modes. The compiler works the same way whether invoked from CLI or LSP. Different consumers, same pipeline.