feat: Implement Tier 1 Core Language Features (Enums, Options, Maps) #19

Merged
mjm918 merged 15 commits into main from fix/codegen-compilation-errors
Jan 21, 2026

@mjm918 mjm918 commented Jan 21, 2026

Summary

This PR implements the Tier 1 core language features from #15, adding enums, options, maps, and the foundations for generics to naml.

Features Implemented

  • Enums with Variants: Support for enum definitions with unit and data-carrying variants, including pattern matching in switch statements
  • Option Type: Built-in option<T> with some(value), none, .is_some(), .is_none(), .or_default(), and .unwrap() methods
  • Map Type: Hash map implementation with map<K, V> supporting literals {}, indexing map[key], and assignment map[key] = value
  • Memory Management: Scope-based reference counting with proper cleanup for nested heap types (strings in arrays, arrays in maps, etc.)
  • Pattern Matching: Enum variant destructuring in switch/case statements

Key Changes

  • Added Cranelift JIT backend with full runtime system
  • Implemented hash map runtime with FNV-1a hashing and linear probing
  • Added reference counting for all heap-allocated types (strings, arrays, maps, structs)
  • Generated per-struct decref functions for proper nested cleanup
  • Added comprehensive test suite for Tier 1 features
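
To make the "FNV-1a hashing and linear probing" item concrete, here is a minimal sketch of the technique. It is not the PR's `NamlMap`: `ProbeMap`, its fixed capacity, and the `i64` value type are assumptions made for illustration, and resizing and deletion are left out.

```rust
// Illustrative sketch only: ProbeMap and its methods are hypothetical names,
// not taken from namlc.
const FNV_OFFSET_BASIS: u64 = 0xcbf29ce484222325;
const FNV_PRIME: u64 = 0x100000001b3;

/// 64-bit FNV-1a: XOR each byte into the hash, then multiply by the prime.
pub fn fnv1a(bytes: &[u8]) -> u64 {
    let mut hash = FNV_OFFSET_BASIS;
    for &b in bytes {
        hash ^= b as u64;
        hash = hash.wrapping_mul(FNV_PRIME);
    }
    hash
}

/// Fixed-capacity open-addressing table with string keys and i64 values.
pub struct ProbeMap {
    slots: Vec<Option<(String, i64)>>,
    len: usize,
}

impl ProbeMap {
    pub fn with_capacity(capacity: usize) -> Self {
        assert!(capacity > 0);
        ProbeMap { slots: vec![None; capacity], len: 0 }
    }

    /// Linear probing: start at the hashed slot and step forward (wrapping)
    /// until the key or an empty slot is found.
    fn find_slot(&self, key: &str) -> Option<usize> {
        let cap = self.slots.len();
        let start = (fnv1a(key.as_bytes()) % cap as u64) as usize;
        for step in 0..cap {
            let idx = (start + step) % cap;
            match &self.slots[idx] {
                None => return Some(idx),
                Some((k, _)) if k == key => return Some(idx),
                Some(_) => continue,
            }
        }
        None // every slot is occupied by a different key
    }

    /// Inserts or updates a key. Returns false if the table is full.
    pub fn set(&mut self, key: &str, value: i64) -> bool {
        match self.find_slot(key) {
            Some(idx) => {
                if self.slots[idx].is_none() {
                    self.len += 1;
                }
                self.slots[idx] = Some((key.to_string(), value));
                true
            }
            None => false,
        }
    }

    pub fn get(&self, key: &str) -> Option<i64> {
        let idx = self.find_slot(key)?;
        self.slots[idx].as_ref().map(|(_, v)| *v)
    }

    pub fn len(&self) -> usize {
        self.len
    }
}
```

Because the sketch never deletes entries, a lookup can stop at the first empty slot. A table that supports removal would need tombstones or backward shifting.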

New Runtime Functions

Module | Functions
Array  | new, from, push, pop, get, set, len, incref, decref
Map    | new, set, get, contains, len, incref, decref
Option | Built into enum system with some/none variants
Struct | new, get_field, set_field, incref, decref, free

Files Changed

  • namlc/src/codegen/cranelift/ - New JIT backend (3200+ lines)
  • namlc/src/runtime/ - New runtime system (array, map, channel, scheduler, value)
  • namlc/src/parser/patterns.rs - Pattern parsing for switch cases
  • namlc/src/ast/patterns.rs - Pattern AST definitions
  • examples/ - Multiple test files for new features

Test plan

  • Run cargo test - all 116 tests pass
  • Run cargo run -- run examples/tier1_test.naml - comprehensive integration test
  • Run cargo run -- run examples/test_option.naml - option type tests
  • Run cargo run -- run examples/test_enum.naml - enum pattern matching
  • Run cargo run -- run examples/test_map.naml - map operations
  • Run cargo run -- run examples/memory_test.naml - memory cleanup verification
  • Run cargo run -- run examples/struct_cleanup_test.naml - struct nested cleanup

Closes #15

🤖 Generated with Claude Code

mjm918 and others added 13 commits January 20, 2026 23:24
This commit addresses multiple categories of codegen errors in the Rust backend:

**Fixes:**
- Fix await expression throws detection for method calls
- Fix move/clone semantics by cloning function args and array elements
- Add compare() method transformation to use partial_cmp
- Fix integer literals to use unsuffixed format for type inference
- Fix printf/Display by using {:?} for non-Display types
- Fix E0507 move out of mutable reference with borrow+clone
- Fix var...else Box deref for recursive struct fields

**Changes:**
- expressions.rs: Add emit_function_arg helper for cloning
- expressions.rs: Add resolve_receiver_type_name for better type detection
- expressions.rs: Transform compare() to partial_cmp
- statements.rs: Add is_recursive_field_access for Box handling
- statements.rs: Improve var...else pattern for self fields
- mod.rs: Track recursive structs for proper Box wrapping

Reduces errors from 37 to 12. Remaining errors are mostly closure vs
fn pointer issues which require significant design changes to address.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Replace Rust codegen with Cranelift JIT for faster development iteration.
This enables direct execution via `naml run` without Rust compilation.

Changes:
- Add Cranelift JIT codegen (cranelift/mod.rs, types.rs)
- Remove Rust transpilation codegen (no longer needed)
- Add runtime system with value representation (runtime/value.rs)
- Add array runtime with heap allocation (runtime/array.rs)
- Add channel runtime for concurrency (runtime/channel.rs)
- Add cooperative scheduler for spawn tasks (runtime/scheduler.rs)
- Add example files demonstrating language features
- Update type inference for better generic support

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add new Pattern AST nodes to support pattern matching in switch/case
statements. This enables enum destructuring like `case Suspended(reason):`

New types:
- Pattern enum (Literal, Identifier, Variant, Wildcard)
- LiteralPattern for matching literal values
- IdentPattern for binding/comparing identifiers
- VariantPattern for enum variant matching with bindings
- WildcardPattern for the `_` catch-all pattern

Updates SwitchCase to use Pattern instead of Expression.
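
For orientation, the four pattern kinds could be modelled roughly as below. The commit names the node types but not their fields, so every field, the `Value` type, and `match_pattern` are assumptions. The sketch also treats identifier patterns as bindings only, whereas the commit says they can compare as well.

```rust
// Hypothetical shapes: field names and the Value type are illustrative only.
#[derive(Debug, Clone, PartialEq)]
pub enum Literal {
    Int(i64),
    Str(String),
    Bool(bool),
}

#[derive(Debug, Clone, PartialEq)]
pub enum Pattern {
    Literal(Literal),
    Identifier(String),
    Variant { path: Vec<String>, bindings: Vec<String> },
    Wildcard,
}

#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Int(i64),
    Str(String),
    Bool(bool),
    Enum { variant: String, fields: Vec<Value> },
}

/// Returns the variable bindings produced by a successful match, or None.
pub fn match_pattern(pattern: &Pattern, value: &Value) -> Option<Vec<(String, Value)>> {
    match (pattern, value) {
        // `_` matches anything and binds nothing.
        (Pattern::Wildcard, _) => Some(vec![]),
        // Simplification: an identifier always binds the whole value.
        (Pattern::Identifier(name), v) => Some(vec![(name.clone(), v.clone())]),
        (Pattern::Literal(Literal::Int(a)), Value::Int(b)) if a == b => Some(vec![]),
        (Pattern::Literal(Literal::Str(a)), Value::Str(b)) if a == b => Some(vec![]),
        (Pattern::Literal(Literal::Bool(a)), Value::Bool(b)) if a == b => Some(vec![]),
        // A variant matches when the last path segment names the value's
        // variant and the binding count equals the field count.
        (Pattern::Variant { path, bindings }, Value::Enum { variant, fields })
            if path.last() == Some(variant) && bindings.len() == fields.len() =>
        {
            Some(bindings.iter().cloned().zip(fields.iter().cloned()).collect())
        }
        _ => None,
    }
}
```

With these shapes, `case Suspended(reason):` corresponds to a `Pattern::Variant` whose path ends in `Suspended` and whose only binding is `reason`.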

Note: Parser, typechecker, and codegen updates pending in follow-up tasks.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Fix misleading documentation claiming "Copy-free" allocation when
  VariantPattern uses Vec which allocates on the heap
- Add lifetime parameter to Pattern<'ast> for consistency with
  Expression<'ast> and Statement<'ast>
- Update SwitchCase to use Pattern<'ast> with proper lifetime
- Fix import style in statements.rs to use relative imports

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add a dedicated pattern parser that supports:
- Literal patterns (int, float, string, bool, none)
- Identifier patterns (bindings or constant references)
- Variant patterns with optional bindings (e.g., Some(x), Status::Active)
- Wildcard pattern (_)

Update parse_switch_stmt to use parse_pattern instead of parse_expression.

Note: This introduces expected compilation errors in visitor, typechecker,
and codegen modules that will be addressed in subsequent tasks.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add EnumDef and EnumVariantDef structs to the Cranelift JIT compiler
to track enum definitions during compilation. This infrastructure will
be used for generating code for enum construction and pattern matching.

The enum representation uses stack-allocated fixed-size tagged unions:
- Layout: | tag: u32 | padding: u32 | data: [u8; max_size] |
- Size = 8 + max variant data size (aligned to 8 bytes)
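
The size rule above can be sketched as plain arithmetic. The helper names (`enum_size`, `write_tag`, `read_tag`) are hypothetical and only illustrate the stated layout; the actual code emits Cranelift IR instead.

```rust
// Illustrative helpers for the layout | tag: u32 | padding: u32 | data |.
// None of these names come from namlc.
const HEADER_SIZE: usize = 8; // 4-byte tag + 4 bytes of padding

/// Rounds `n` up to the next multiple of `align`.
fn align_up(n: usize, align: usize) -> usize {
    (n + align - 1) / align * align
}

/// Total size of an enum value: header plus the largest variant payload,
/// with the payload rounded up to an 8-byte boundary.
pub fn enum_size(variant_data_sizes: &[usize]) -> usize {
    let max_data = variant_data_sizes.iter().copied().max().unwrap_or(0);
    HEADER_SIZE + align_up(max_data, 8)
}

/// Stores the variant tag in the first four bytes (native byte order).
pub fn write_tag(buf: &mut [u8], tag: u32) {
    buf[0..4].copy_from_slice(&tag.to_ne_bytes());
}

/// Reads the variant tag back from the first four bytes.
pub fn read_tag(buf: &[u8]) -> u32 {
    u32::from_ne_bytes([buf[0], buf[1], buf[2], buf[3]])
}
```

An enum with only unit variants therefore occupies 8 bytes, and one whose largest variant carries 12 bytes of data occupies 8 + 16 = 24 bytes.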

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add support for constructing enum values in Cranelift JIT:
- Add enum_defs to CompileContext for enum type information
- Handle Expression::Path for unit variants (e.g., Color::Red)
- Handle Expression::Call for variants with data (e.g., Status::Suspended("reason"))

Enum layout: | tag: u32 | padding: u32 | data fields... |
Values are stack-allocated with pointers returned.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add visit_pattern method to visitor trait for proper pattern traversal
- Add walk_pattern function to traverse pattern nodes
- Fix switch case handling in visitor to use visit_pattern instead of visit_expr
- Add infer_pattern method to type inferrer for pattern type inference
- Fix switch case handling in typechecker to use infer_pattern
- Fix convert_ast_type to resolve named types (enum/struct) from symbol table
- Add compile_pattern_match function to compile pattern comparisons
- Add bind_pattern_vars function to bind variables in destructuring patterns
- Update switch statement codegen to use pattern matching functions

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add built-in option enum as a polymorphic type with some/none variants.
Implements some() expression, none literal, and is_some/is_none/or_default
method calls using stack-allocated 16-byte option representation.
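
One way to picture a 16-byte option is a tagged struct like the one below. This is a sketch under assumptions: the tag encoding (0 for none, 1 for some), the `i64` payload, and the `or_default(fallback)` signature are not stated in the commit.

```rust
// Hypothetical model of a 16-byte option value; the field layout and tag
// values are assumptions, not taken from the namlc source.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct RawOption {
    tag: u32,     // assumed: 0 = none, 1 = some
    _pad: u32,    // keeps the payload 8-byte aligned
    payload: i64, // meaningful only when tag == 1
}

impl RawOption {
    pub const fn none() -> Self {
        RawOption { tag: 0, _pad: 0, payload: 0 }
    }

    pub const fn some(value: i64) -> Self {
        RawOption { tag: 1, _pad: 0, payload: value }
    }

    pub fn is_some(&self) -> bool {
        self.tag == 1
    }

    pub fn is_none(&self) -> bool {
        self.tag == 0
    }

    /// Returns the payload, or `fallback` when the option is none.
    pub fn or_default(&self, fallback: i64) -> i64 {
        if self.is_some() { self.payload } else { fallback }
    }
}
```

With `#[repr(C)]` the struct is exactly 16 bytes, matching the size quoted in the commit message.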

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Adds NamlMap type with FNV-1a hashing and linear probing for naml's
map<K, V> type with string keys.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Wire up the map runtime functions to the Cranelift codegen:
- Register naml_map_* symbols in JIT compiler
- Handle Expression::Map for map literal compilation
- Support map indexing with string keys (m["key"])
- Support map index assignment (m["key"] = value)
- Add map methods: contains, set
- Add naml_string_from_cstr to convert C strings to NamlStrings
- Add test file examples/test_map.naml

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add comprehensive integration test that exercises all Tier 1 features:
- Enums with unit variants and pattern matching via switch
- Option type (some, none, is_some, is_none, or_default)
- Maps (creation, string key indexing, assignment)
- Structs (definition, instantiation, field access)
- Arrays (creation, len(), for loop iteration)
- Control flow (if/else, while, for, switch)
- Combined feature tests (enums in arrays, options with arithmetic)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…eanup

- Add memory management test files to validate heap allocation, reference counting, and cleanup for strings, arrays, maps, and structs.
- Introduce field heap type tracking in `StructDef` for nested cleanup during decref.
- Register enhanced runtime symbols for specialized decref methods for strings, arrays, maps, and structs.
- Implement per-struct decref function generation for structs with heap-allocated fields.
- Update codegen to manage heap variable lifecycle using incref and decref during assignments and returns.
claude bot commented Jan 21, 2026

Code Review - 3 Issues Found

I've identified 3 significant bugs in this PR that should be addressed:

1. Memory Ordering Bug in HeapHeader::decref

File: namlc/src/runtime/value.rs:52-53

The decref() method uses Ordering::Release for fetch_sub but lacks an acquire fence before freeing memory. This can cause use-after-free bugs in multi-threaded scenarios.

Current code:

pub fn decref(&self) -> bool {
    self.refcount.fetch_sub(1, Ordering::Release) == 1
}

Issue: Without an acquire fence, there's no guarantee that the thread freeing the memory has observed all previous accesses to the object made by other threads that previously held references.

Fix: Add an acquire fence when refcount reaches zero:

pub fn decref(&self) -> bool {
    if self.refcount.fetch_sub(1, Ordering::Release) == 1 {
        std::sync::atomic::fence(Ordering::Acquire);
        true
    } else {
        false
    }
}
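
To show how a caller would rely on that return value, here is a small self-contained sketch. `Shared` and `release` are hypothetical and only model "free the allocation when decref reports the last reference"; the real runtime types differ.

```rust
use std::sync::atomic::{fence, AtomicUsize, Ordering};

// Simplified stand-in for the runtime header; only the refcount is modelled.
pub struct HeapHeader {
    pub refcount: AtomicUsize,
}

impl HeapHeader {
    pub fn new() -> Self {
        HeapHeader { refcount: AtomicUsize::new(1) }
    }

    /// Taking another reference needs no ordering with other memory.
    pub fn incref(&self) {
        self.refcount.fetch_add(1, Ordering::Relaxed);
    }

    /// Returns true when the caller dropped the last reference.
    pub fn decref(&self) -> bool {
        if self.refcount.fetch_sub(1, Ordering::Release) == 1 {
            // Pairs with the Release decrements of other owners so their
            // earlier writes are visible before the object is freed.
            fence(Ordering::Acquire);
            true
        } else {
            false
        }
    }
}

// Hypothetical heap object: a header followed by a payload.
pub struct Shared {
    pub header: HeapHeader,
    pub payload: String,
}

/// Drops one reference and frees the allocation if it was the last one.
///
/// Safety: `ptr` must come from `Box::into_raw` and must not be used after
/// this function returns true.
pub unsafe fn release(ptr: *mut Shared) -> bool {
    let last = unsafe { (&*ptr).header.decref() };
    if last {
        unsafe { drop(Box::from_raw(ptr)) };
    }
    last
}
```

This is the same Release-then-Acquire pattern that `std::sync::Arc` uses when dropping its last strong reference.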


2. Non-Atomic Refcount Operations in Generated Code

File: namlc/src/codegen/cranelift/mod.rs:401-404

The generate_struct_decref function generates non-atomic load/store operations for refcount manipulation, which creates race conditions in multi-threaded code.

Current code:

// Lines 399-404
// For simplicity, just load/decrement/store (single-threaded assumption for now)
let refcount = builder.ins().load(cranelift::prelude::types::I64, MemFlags::new(), struct_ptr, 0);
let one = builder.ins().iconst(cranelift::prelude::types::I64, 1);
let new_refcount = builder.ins().isub(refcount, one);
builder.ins().store(MemFlags::new(), new_refcount, struct_ptr, 0);

Issue: The comment acknowledges "single-threaded assumption for now", but the language has a full M:N task scheduler that runs tasks across multiple OS threads. The non-atomic read-modify-write sequence can cause:

  • Memory leaks: Two threads load the same refcount value and both store decremented values, leaving refcount > 0 when it should be 0
  • Double-free: Two threads both see refcount=1, decrement to 0, and both attempt to free

Fix: Generate an atomic fetch_sub instead of the separate load, subtract, and store. Cranelift supports atomic read-modify-write operations through its atomic_rmw instruction (AtomicRmwOp::Sub).
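
The property the generated code needs can be demonstrated in plain Rust, without Cranelift. In the sketch below (all names are hypothetical), several threads each drop one reference, and because `fetch_sub` returns the previous value atomically, exactly one of them sees `1` and becomes responsible for freeing. It uses `AcqRel` on the decrement as an alternative to a `Release` decrement followed by an `Acquire` fence.

```rust
use std::sync::atomic::{AtomicU64, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Atomic decrement: returns true only for the caller that took the count
/// from 1 to 0.
pub fn atomic_decref(refcount: &AtomicU64) -> bool {
    refcount.fetch_sub(1, Ordering::AcqRel) == 1
}

/// Spawns `owners` threads that each release one reference and reports how
/// many of them concluded they were the last owner.
pub fn count_last_owners(owners: u64) -> usize {
    let refcount = Arc::new(AtomicU64::new(owners));
    let frees = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..owners)
        .map(|_| {
            let refcount = Arc::clone(&refcount);
            let frees = Arc::clone(&frees);
            thread::spawn(move || {
                if atomic_decref(&refcount) {
                    // Real code would free the object here.
                    frees.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
    frees.load(Ordering::Relaxed)
}
```

With a separate load and store, two threads can both read the same count, so the number of "last owners" could be zero (a leak) or more than one (a double free). With the atomic version it is always exactly one.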

Code location:

// Call atomic decref on refcount (at offset 0 in HeapHeader)
// We use an atomic fetch_sub and check if result was 1 (meaning we need to free)
// For simplicity, just load/decrement/store (single-threaded assumption for now)
// HeapHeader layout: refcount (8 bytes), tag (1 byte), pad (7 bytes)
let refcount = builder.ins().load(cranelift::prelude::types::I64, MemFlags::new(), struct_ptr, 0);
let one = builder.ins().iconst(cranelift::prelude::types::I64, 1);
let new_refcount = builder.ins().isub(refcount, one);
builder.ins().store(MemFlags::new(), new_refcount, struct_ptr, 0);


3. Memory Leak in naml_map_set When Updating Values

File: namlc/src/runtime/map.rs:101-103

When an existing key's value is updated, the old value is overwritten without decrementing its refcount, causing a memory leak.

Current code:

if string_eq((*entry).key as *const NamlString, key as *const NamlString) {
    (*entry).value = value;  // Old value overwritten without decref
    return;
}

Issue: When a value is replaced:

  1. The map previously held a reference to the old value (refcount was incremented on insertion)
  2. The old value pointer is overwritten with the new value
  3. The old value's refcount is never decremented
  4. The old value will never be freed even when no other references exist

Fix: Decrement the old value's refcount before overwriting. The implementation is complex because the map supports different value types (strings, arrays, maps, structs). You'll need to either:

  • Track value types in map entries
  • Use a generic decref function that inspects the heap header tag
  • Store type information alongside values
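
As a rough illustration of the second option, the sketch below models the heap as an arena of refcounted objects with a tag, and a map set that releases the reference held on the replaced value. `Heap`, `HeapTag`, `Handle`, and `map_set` are all hypothetical, and a standard `HashMap` stands in for the runtime map.

```rust
use std::collections::HashMap;

// Hypothetical model: objects live in an arena and are addressed by index.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum HeapTag {
    String,
    Array,
    Map,
    Struct,
}

pub type Handle = usize;

struct HeapObject {
    refcount: u64,
    tag: HeapTag,
}

pub struct Heap {
    objects: Vec<HeapObject>,
    /// Records which objects were freed, and with which tag.
    pub freed: Vec<(Handle, HeapTag)>,
}

impl Heap {
    pub fn new() -> Self {
        Heap { objects: Vec::new(), freed: Vec::new() }
    }

    /// Allocates an object owned by the caller (refcount starts at 1).
    pub fn alloc(&mut self, tag: HeapTag) -> Handle {
        self.objects.push(HeapObject { refcount: 1, tag });
        self.objects.len() - 1
    }

    pub fn refcount(&self, h: Handle) -> u64 {
        self.objects[h].refcount
    }

    pub fn incref(&mut self, h: Handle) {
        self.objects[h].refcount += 1;
    }

    /// Generic decref: reads the tag from the header, so callers do not need
    /// to know the value's type. A real runtime would dispatch on the tag to
    /// the matching cleanup routine.
    pub fn decref(&mut self, h: Handle) {
        let obj = &mut self.objects[h];
        obj.refcount -= 1;
        if obj.refcount == 0 {
            let tag = obj.tag;
            self.freed.push((h, tag));
        }
    }
}

/// Stores `value` under `key`, releasing the map's reference to any value
/// it replaces.
pub fn map_set(heap: &mut Heap, map: &mut HashMap<String, Handle>, key: &str, value: Handle) {
    // Take the map's reference to the new value first, so that setting a key
    // to the value it already holds never drops the count to zero.
    heap.incref(value);
    if let Some(old) = map.insert(key.to_string(), value) {
        heap.decref(old);
    }
}
```

Increasing the new value's count before decreasing the old one keeps the self-assignment case (`m[k] = m[k]`) safe.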



Checked for bugs and CLAUDE.md compliance.

…ments

- Replace `///` comments with `//!` for module-level documentation across all `ast` and related files.
- Ensure consistency with Rust's convention for crate/module documentation.
- No logic or functionality changes.
…port

- Introduced specialized `naml_map_set_*` functions (`string`, `array`, `map`, `struct`) in the runtime for type-specific map value setting.
- Updated map entry assignment to properly decref old values prior to updates.
- Enhanced Cranelift codegen to select type-specific map setter functions based on value type.
- Improved atomic refcounting: implemented `atomic_rmw` for thread-safe decref.
@mjm918 mjm918 merged commit 0707ee6 into main Jan 21, 2026
1 check passed
@mjm918 mjm918 deleted the fix/codegen-compilation-errors branch January 21, 2026 14:23

Development

Successfully merging this pull request may close these issues.

Tier 1: Core Language Features - Enums, Options, Maps, Generics
