Add sourcemap support to the Aiken compiler by Quantumplation · Pull Request #1250 · aiken-lang/aiken

Quantumplation · 2026-01-07T18:17:34Z

Motivation

As Aiken matures, developers need better tooling for understanding what their contracts are actually doing at runtime. Three use cases drove this work:

Time-travel debuggers - Tools like https://github.com/SundaeSwap-finance/gastronomy can step through UPLC execution, but without source maps, users see raw UPLC with DeBruijn indices instead of their original Aiken code.
Code coverage - Understanding which parts of your contract are actually exercised by tests is table-stakes for most languages. This requires mapping executed UPLC nodes back to source locations.
Fuzzers - Coverage-guided fuzzing needs the same source mapping to know when new code paths are discovered.

All three require the same fundamental capability: given a UPLC node during execution, answer "where did this come from in the source?"

Approach

The core insight is that UPLC's Term type can carry metadata through compilation and execution. We:

Made Term generic over a context parameter: Term<T, C = ()> where C defaults to unit for backwards compatibility, but can hold source locations when needed.
Threaded source locations through code generation: As the compiler builds UPLC terms, it attaches SourceLocation (module name + byte span) to each node.
Made the CEK machine generic over context: The stepping interface preserves context through execution, so debuggers can query "what source location is this term from?" at each step.
Made optimization passes context-aware: Source locations survive inlining, lambda reduction, and other transformations.
Added source map generation: Post-compilation, we walk the term tree and build a mapping from node indices to source locations, which can be embedded in the blueprint or exported separately.
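
The shape of that change can be illustrated with a small, self-contained sketch. This is not the PR's actual definition: the `Loc` struct, the three-variant `Term`, and the `&dyn Fn` signature of `map_context` are all simplifications assumed here for illustration. It only shows how a defaulted context parameter lets existing `Term<T>` code keep compiling while location-aware code uses `Term<T, Loc>`.

```rust
use std::rc::Rc;

/// Hypothetical stand-in for a source location (module name + byte span).
#[derive(Clone, Debug, Default, PartialEq, Eq)]
pub struct Loc {
    pub module: String,
    pub start: usize,
    pub end: usize,
}

/// Toy term type (assumed shape): `C` is the per-node context, defaulting to `()`.
#[derive(Clone, Debug, PartialEq)]
pub enum Term<T, C = ()> {
    Var { context: C, name: Rc<T> },
    Lambda { context: C, parameter: Rc<T>, body: Rc<Term<T, C>> },
    Apply { context: C, function: Rc<Term<T, C>>, argument: Rc<Term<T, C>> },
}

impl<T, C> Term<T, C> {
    /// Borrow the context attached to this node.
    pub fn context(&self) -> &C {
        match self {
            Term::Var { context, .. }
            | Term::Lambda { context, .. }
            | Term::Apply { context, .. } => context,
        }
    }

    /// Rebuild the tree with every node's context transformed by `f`.
    pub fn map_context<D>(&self, f: &dyn Fn(&C) -> D) -> Term<T, D> {
        match self {
            Term::Var { context, name } => Term::Var {
                context: f(context),
                name: Rc::clone(name),
            },
            Term::Lambda { context, parameter, body } => Term::Lambda {
                context: f(context),
                parameter: Rc::clone(parameter),
                body: Rc::new(body.map_context(f)),
            },
            Term::Apply { context, function, argument } => Term::Apply {
                context: f(context),
                function: Rc::new(function.map_context(f)),
                argument: Rc::new(argument.map_context(f)),
            },
        }
    }
}
```

With this shape, code that writes `Term<String>` is unaffected, while a debugger-oriented caller can attach locations with `map_context` and erase them again by mapping every context to `()`.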

Alternative considered

We initially experimented with a trace-based approach: inject special Trace calls at key points during codegen, then strip them out before final output. The traces would carry source location strings that a debugger could intercept.

This had several problems, mainly that it didn't play well with optimizations and was very brittle.

The generic context approach is cleaner: the metadata is truly out-of-band and doesn't affect the compiled output when not needed.

What's included

Term<T, C> with generic context parameter
SourceLocation type for cross-module location tracking
CEK machine stepping interface (get_initial_machine_state, step)
Source map generation in aiken export --source-map
aiken export --list to discover exportable functions/tests
aiken export --no-optimize for debugging (larger output, but maps directly to source)
aiken coverage command with LCOV output
Variable name tracking in source maps (so debuggers can show x instead of i_42)

⚠️ Review note

This PR was heavily AI-assisted (Claude Code) because I was doing it over Christmas break. Without that assistance, given my busy schedule, it likely wouldn't have happened at all, so I'm hoping the output is coherent enough to offset the influence of AI.

The high-level design and approach were decided by @SupernaviX and @MicroProofs ahead of implementation, and all tests pass. Still, the volume of mechanical changes (particularly threading context through pattern matches) means a careful review is warranted. Key areas to scrutinize:

crates/uplc/src/machine.rs - CEK machine changes
crates/uplc/src/optimize/shrinker.rs - Optimization pass changes
crates/aiken-project/src/blueprint/source_map.rs - Source map format (new public interface)
crates/aiken-lang/src/gen_uplc.rs - Location threading through codegen

The test suite passes and the feature works end-to-end with Gastronomy, but fresh eyes on the implementation details would be valuable.

Riley-Kilgore · 2026-02-05T04:45:18Z

Really glad you had time to make this feature-set!

The only issue I can find relating to the sourcemap support itself is that the validator.title is constructed as {module_name}.{validator_name} where module_name derives from the module path. For a validator file at validators/governance/voting.ak, the title becomes validators/governance/voting.my_validator. When joined with dir, the resulting path is:

dir/validators/governance/voting.my_validator.sourcemap.json

Line 774 creates only dir with fs::create_dir_all(dir), but not dir/validators/governance/. The subsequent fs::write at line 788 will fail with "No such file or directory".

Example Test:

/// Nested module paths contain slashes which causes PathBuf::join
/// to create subdirectories instead of flat filenames.
///
/// This test SHOULD PASS but currently FAILS because the source map code
/// at lib.rs:782 uses the raw validator title (which contains slashes) as
/// a filename, causing PathBuf::join to create nested directories.
///
/// BUG: The code does:
///   let filename = format!("{}.sourcemap.json", validator.title);
///   let file_path = dir.join(&filename);
///
/// This test verifies the CORRECT behavior: source map files should be
/// written to a flat directory structure. It will FAIL until the bug is fixed.
#[test]
fn source_map_path_should_be_flat() {
    use std::path::PathBuf;

    // Simulate the current (buggy) behavior in lib.rs for external source maps
    let dir = PathBuf::from("/tmp/sourcemaps");

    // For a validator at validators/governance/voting.ak with name "my_vote",
    // the title becomes "governance/voting.my_vote.spend"
    let validator_title = "governance/voting.my_vote.spend";

    // Current buggy code: uses title directly as filename
    let filename = format!("{}.sourcemap.json", validator_title);
    let file_path = dir.join(&filename);
    let parent = file_path.parent().unwrap();

    // This assertion FAILS because the current code creates a nested path
    // Expected: /tmp/sourcemaps/governance-voting.my_vote.spend.sourcemap.json (flat)
    // Actual: /tmp/sourcemaps/governance/voting.my_vote.spend.sourcemap.json (nested)
    assert_eq!(
        parent.to_str().unwrap(),
        "/tmp/sourcemaps",
        "Source map path should be flat, but got nested directory: {}",
        parent.display()
    );
}

Separately, is it intended that coverage reporting runs each property test only a single time? It seems coverage reporting should run property-based tests the normal number of times in order to report coverage properly.

Quantumplation · 2026-02-21T18:18:34Z

@Riley-Kilgore sorry, I didn't get notified when you responded! Or, more likely, it got buried. I think both points are fair, so I'll address them soon.

How did the AI generated code stand up?

Riley-Kilgore

Thanks for the updates, and for working through the earlier review.

I've worked through a second pass and found some more issues which seem like they'd be worth addressing.

Span/context loss in optimizer cleanup paths

Where: crates/uplc/src/optimize/shrinker.rs (afterwards, run_once_pass)
Problem: Name <-> NamedDeBruijn cleanup strips context and restores C::default().
Impact: source maps / coverage lose span fidelity.

Full regression tests:

#[test]
fn afterwards_preserves_term_context() {
    #[derive(Clone, Debug, PartialEq, Eq)]
    struct Ctx(u8);

    impl Default for Ctx {
        fn default() -> Self {
            Ctx(0)
        }
    }

    let program: Program<Name, Ctx> = Program {
        version: (1, 0, 0),
        term: Term::<Name, Ctx>::integer(1.into()).map_context(|_| Ctx(1)),
    };

    let optimized = program.afterwards();

    assert_eq!(
        optimized.term.context(),
        Ctx(1),
        "expected optimizer to preserve term context through afterwards()"
    );
}

#[test]
fn run_once_pass_preserves_term_context() {
    #[derive(Clone, Debug, PartialEq, Eq)]
    struct Ctx(u8);

    impl Default for Ctx {
        fn default() -> Self {
            Ctx(0)
        }
    }

    let program: Program<Name, Ctx> = Program {
        version: (1, 0, 0),
        term: Term::<Name, Ctx>::integer(1.into()).map_context(|_| Ctx(1)),
    };

    let optimized = program.run_once_pass();

    assert_eq!(
        optimized.term.context(),
        Ctx(1),
        "expected optimizer to preserve term context through run_once_pass()"
    );
}
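
One way to keep contexts alive across a shape-preserving round-trip is to save them in traversal order, run the conversion on the bare term, and reattach them afterwards. The sketch below illustrates that idea only; the two-variant `Term`, `strip`, and `restore` are hypothetical names, and it assumes the intermediate conversion does not add, remove, or reorder nodes. It is not the code in shrinker.rs.

```rust
/// Toy term with a per-node context `C` (assumed shape, for illustration).
#[derive(Debug, PartialEq)]
pub enum Term<C> {
    Var(C, String),
    Apply(C, Box<Term<C>>, Box<Term<C>>),
}

/// Strip contexts, pushing them onto `saved` in pre-order.
pub fn strip<C>(term: Term<C>, saved: &mut Vec<C>) -> Term<()> {
    match term {
        Term::Var(c, name) => {
            saved.push(c);
            Term::Var((), name)
        }
        Term::Apply(c, f, a) => {
            saved.push(c);
            let f = strip(*f, saved);
            let a = strip(*a, saved);
            Term::Apply((), Box::new(f), Box::new(a))
        }
    }
}

/// Reattach contexts in the same pre-order.
/// Falls back to `C::default()` only if the saved contexts run out.
pub fn restore<C: Default>(term: Term<()>, saved: &mut impl Iterator<Item = C>) -> Term<C> {
    match term {
        Term::Var((), name) => Term::Var(saved.next().unwrap_or_default(), name),
        Term::Apply((), f, a) => {
            let c = saved.next().unwrap_or_default();
            let f = restore(*f, saved);
            let a = restore(*a, saved);
            Term::Apply(c, Box::new(f), Box::new(a))
        }
    }
}
```

The important property is that both functions walk the tree in the same order, so the n-th saved context lands back on the n-th node.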

Source-map index stability after parameter application

Where: crates/aiken-project/src/blueprint/source_map.rs (visit_post_order)
Problem: Apply traversal is argument-first, shifting indices for existing function subtrees when wrapping with Apply.
Impact: can invalidate existing sourceMap / sourceMapFile assumptions.

Full regression test:

#[test]
fn from_term_indices_do_not_shift_when_wrapping_in_apply() {
    use aiken_lang::{ast::Span, line_numbers::LineNumbers};

    let src = "x";
    let module_name = "test";
    let data = (src.to_string(), LineNumbers::new(src));
    let mut module_sources: IndexMap<&str, &(String, LineNumbers)> = IndexMap::new();
    module_sources.insert(module_name, &data);

    let var_x: Term<Name, AstSourceLocation> = Term::Var {
        context: AstSourceLocation::new(module_name, Span { start: 0, end: 1 }),
        name: Rc::new(Name::text("x_id_0")),
    };

    let sm_before = SourceMap::from_term(&var_x, module_name, &module_sources);
    let idx_before = *sm_before
        .locations
        .keys()
        .next()
        .expect("expected a single location entry for Var(x)");

    let wrapped: Term<Name, AstSourceLocation> = Term::Apply {
        function: Rc::new(var_x.clone()),
        argument: Rc::new(Term::Constant {
            value: Rc::new(uplc::ast::Constant::Unit),
            context: AstSourceLocation::empty(),
        }),
        context: AstSourceLocation::empty(),
    };

    let sm_after = SourceMap::from_term(&wrapped, module_name, &module_sources);
    let idx_after = *sm_after
        .locations
        .keys()
        .next()
        .expect("expected Var(x) to still have a location entry after wrapping");

    assert_eq!(
        idx_after, idx_before,
        "wrapping a term in Apply should not shift indices for the original function subtree"
    );
}
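
To see why traversal order matters for index stability, here is a toy post-order numbering that visits the function subtree before the argument. The `Term` enum and `number_post_order` are hypothetical and much simpler than the real `visit_post_order`; the sketch only demonstrates the property the test above asks for.

```rust
/// Toy UPLC-like term; only the shape matters for numbering.
pub enum Term {
    Var(&'static str),
    Unit,
    Apply(Box<Term>, Box<Term>),
}

/// Assign post-order indices, visiting `function` before `argument`.
/// Returns `(index, label)` pairs in visit order.
pub fn number_post_order(term: &Term) -> Vec<(usize, String)> {
    fn go(term: &Term, out: &mut Vec<(usize, String)>) {
        let label = match term {
            Term::Var(name) => format!("Var({name})"),
            Term::Unit => "Unit".to_string(),
            Term::Apply(function, argument) => {
                go(function, out); // function subtree first...
                go(argument, out); // ...then the argument
                "Apply".to_string()
            }
        };
        // A node receives the next free index only after its children.
        let index = out.len();
        out.push((index, label));
    }

    let mut out = Vec::new();
    go(term, &mut out);
    out
}
```

Because the function subtree is numbered first, `Var(x)` keeps index 0 whether it stands alone or is wrapped as `Apply(Var(x), Unit)`. With an argument-first order, the argument would take index 0 and shift the existing subtree.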

aiken coverage can silently pass when property fuzzer crashes

Where:
- crates/aiken-project/src/lib.rs in collect_tests_for_coverage (Err(_) => break)
- crates/aiken-project/src/lib.rs in coverage (empty test list => success path)
Problem: fuzzer sampling errors can be swallowed, producing success with empty coverage output.
Impact: aiken check can fail while aiken coverage appears successful.

Full regression test:

#[cfg(test)]
mod coverage_regressions {
    use super::*;
    use crate::watch::with_project;
    use aiken_lang::ast::{TraceLevel, Tracing};
    use std::{
        fs,
        path::{Path, PathBuf},
        time::{SystemTime, UNIX_EPOCH},
    };

    fn mk_temp_project(name: &str, source: &str) -> PathBuf {
        let nonce = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .expect("clock went backwards")
            .as_nanos();

        let root = std::env::temp_dir().join(format!("aiken_cov_{name}_{nonce}"));
        fs::create_dir_all(root.join("lib")).expect("failed to create lib dir");

        fs::write(
            root.join("aiken.toml"),
            r#"name = "demo/coverage-regression"
version = "0.0.0"
license = "Apache-2.0"
description = "tmp"

[repository]
user = "demo"
project = "coverage-regression"
platform = "github"
"#,
        )
        .expect("failed to write aiken.toml");

        fs::write(root.join("lib/tests.ak"), source).expect("failed to write tests.ak");

        root
    }

    #[test]
    fn coverage_fails_when_property_fuzzer_errors() {
        let root = mk_temp_project(
            "fuzzer_error",
            r#"
fn bad_fuzzer() -> Fuzzer<Int> {
  todo
}

test fuzzer_boom(n via bad_fuzzer()) {
  True
}
"#,
        );

        let output = root.join("coverage.lcov");
        let result = with_project(Some(Path::new(&root)), false, true, true, |p| {
            p.coverage(
                None,
                false,
                1,
                5,
                Tracing::All(TraceLevel::Silent),
                None,
                output.clone(),
            )
        });

        assert!(
            result.is_err(),
            "coverage should fail when a property fuzzer crashes; current behavior can return success with an empty report"
        );

        let _ = fs::remove_dir_all(root);
    }
}

Coverage fail once semantics evaluated per iteration instead of per property

Where: crates/aiken-project/src/lib.rs (coverage result loop)
Problem: coverage expands property tests into name[0], name[1], ... and checks each independently.
Impact: test ... fail once can incorrectly fail coverage even when one iteration fails (which should satisfy fail once).

Full regression test:

#[test]
fn coverage_fail_once_should_aggregate_over_property_iterations() {
    use crate::coverage::CoverageData;
    use aiken_lang::ast::OnTestFailure;
    use uplc::machine::cost_model::ExBudget;

    let results = vec![
        (
            "mod.prop[0]".to_string(),
            OnTestFailure::SucceedImmediately,
            crate::coverage::CoverageResult {
                success: true,
                errored: false,
                returned_false: false,
                machine_error: false,
                coverage: CoverageData::new(),
                remaining_budget: ExBudget::max(),
                logs: vec![],
            },
        ),
        (
            "mod.prop[1]".to_string(),
            OnTestFailure::SucceedImmediately,
            crate::coverage::CoverageResult {
                success: false,
                errored: false,
                returned_false: true,
                machine_error: false,
                coverage: CoverageData::new(),
                remaining_budget: ExBudget::max(),
                logs: vec![],
            },
        ),
    ];

    let mut failed_tests = Vec::new();
    for (name, on_test_failure, cov_result) in results {
        let test_passed = match on_test_failure {
            OnTestFailure::SucceedEventually | OnTestFailure::SucceedImmediately => {
                cov_result.errored || cov_result.returned_false
            }
            OnTestFailure::FailImmediately => cov_result.success,
        };
        if !test_passed {
            failed_tests.push(name);
        }
    }

    assert!(
        failed_tests.is_empty(),
        "for `fail once`, coverage should aggregate iterations by base property and pass when any iteration fails"
    );
}
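
The aggregation this test asks for could look roughly like the following. It is a sketch under assumptions: `Iteration`, `base_name`, and `fail_once_violations` are hypothetical names, and the only thing taken from the discussion is the `name[n]` naming of expanded iterations.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified outcome of one property-test iteration.
pub struct Iteration {
    pub name: String, // e.g. "mod.prop[3]"
    pub failed: bool,
}

/// Strip a trailing "[n]" suffix to recover the property's base name.
pub fn base_name(name: &str) -> &str {
    match name.rfind('[') {
        Some(pos) if name.ends_with(']') => &name[..pos],
        _ => name,
    }
}

/// For `fail once` properties: return the base names for which
/// no iteration failed, i.e. the properties that violate `fail once`.
pub fn fail_once_violations(iterations: &[Iteration]) -> Vec<String> {
    let mut any_failed: BTreeMap<&str, bool> = BTreeMap::new();
    for it in iterations {
        // Group by base name and remember whether any iteration failed.
        *any_failed.entry(base_name(&it.name)).or_insert(false) |= it.failed;
    }
    any_failed
        .into_iter()
        .filter(|(_, failed)| !failed)
        .map(|(name, _)| name.to_string())
        .collect()
}
```

Grouping first and deciding afterwards means a single failing iteration is enough to satisfy `fail once` for the whole property, instead of every iteration being judged on its own.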

External sourceMapFile entries are not resolvable from blueprint location

Where: crates/aiken-project/src/lib.rs (external source-map write path and source_map_file assignment)
Problem: blueprint stores filename only (e.g. foo.foo.spend.sourcemap.json) and drops configured --source-map-dir prefix.
Impact: consumers resolving sourceMapFile relative to plutus.json cannot find the files.

Full regression test:

#[cfg(test)]
mod source_map_reference_regressions {
    use crate::{options::SourceMapMode, watch::with_project};
    use aiken_lang::ast::{TraceLevel, Tracing};
    use std::{
        fs,
        path::{Path, PathBuf},
        time::{SystemTime, UNIX_EPOCH},
    };

    fn mk_temp_project(name: &str, source: &str) -> PathBuf {
        let nonce = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .expect("clock went backwards")
            .as_nanos();

        let root = std::env::temp_dir().join(format!("aiken_sm_{name}_{nonce}"));
        fs::create_dir_all(root.join("validators")).expect("failed to create validators dir");

        fs::write(
            root.join("aiken.toml"),
            r#"name = "demo/source-map-regression"
version = "0.0.0"
license = "Apache-2.0"
description = "tmp"

[repository]
user = "demo"
project = "source-map-regression"
platform = "github"
"#,
        )
        .expect("failed to write aiken.toml");

        fs::write(root.join("validators/foo.ak"), source).expect("failed to write validator");
        root
    }

    #[test]
    fn external_source_map_file_reference_should_include_configured_directory() {
        let root = mk_temp_project(
            "source_map_file_reference",
            r#"
validator foo {
  spend(_d, _r, _o, _c) {
    True
  }
}
"#,
        );

        let map_dir_name = "maps_reference_case";
        let cwd_map_dir = std::env::current_dir()
            .expect("cwd")
            .join(map_dir_name);
        let project_map_dir = root.join(map_dir_name);
        let _ = fs::remove_dir_all(&cwd_map_dir);
        let _ = fs::remove_dir_all(&project_map_dir);

        let result = with_project(Some(Path::new(&root)), false, true, true, |p| {
            p.build(
                false,
                Tracing::All(TraceLevel::Silent),
                p.blueprint_path(None),
                None,
                SourceMapMode::External(PathBuf::from(map_dir_name)),
            )
        });
        assert!(result.is_ok(), "control check: build should succeed");

        let blueprint_path = root.join("plutus.json");
        let blueprint_src = fs::read_to_string(&blueprint_path).expect("failed to read blueprint");
        let blueprint_json: serde_json::Value =
            serde_json::from_str(&blueprint_src).expect("failed to parse blueprint json");

        let validators = blueprint_json
            .get("validators")
            .and_then(|v| v.as_array())
            .expect("expected validators array");

        for validator in validators {
            let source_map_file = validator
                .get("sourceMapFile")
                .and_then(|v| v.as_str())
                .expect("expected sourceMapFile entry");

            assert!(
                source_map_file.starts_with(&format!("{map_dir_name}/")),
                "sourceMapFile should include configured source map directory; got: {source_map_file}"
            );

            let resolved = blueprint_path
                .parent()
                .expect("blueprint should have parent dir")
                .join(source_map_file);

            assert!(
                resolved.exists(),
                "sourceMapFile should resolve relative to blueprint location; missing: {}",
                resolved.display()
            );
        }

        let _ = fs::remove_dir_all(&cwd_map_dir);
        let _ = fs::remove_dir_all(root);
    }
}

Adds a generic "Context" parameter that can be populated with data; any methods which work with terms will preserve the context, meaning it survives execution, optimization, etc. For now, we default it to (), but this is foundational work for source map support, because we can assign sourcemap locations as the context during codegen.

Adds a new location field to Air and AirTree node variants, intended to track the closest span in the source code that generated that node. Initially set to empty spans, but then we can add in more and more source coverage in future commits.

This now passes the location information we have down into the Term! It's not *super* useful yet; the things generated by the compiler that *should* be able to derive their span info from the things around them aren't yet being provided; but it's a good start, all the tests pass! For example, here's a little fibonacci test I put together:

=== Source code ===

fn fib(n: Int) -> Int {
  if n < 2 {
    n
  } else {
    fib(n - 1) + fib(n - 2)
  }
}

test fib_test() {
  fib(10) == 55
}

=== Term tree with source locations ===

Apply (no span)
Apply (no span)
Builtin(EqualsInteger) (no span)
Apply (no span)
Apply (no span)
Lambda(test_module_fib) (no span)
Apply (no span)
Lambda(test_module_fib) (no span)
Var(test_module_fib) @ 164..167 = "fib"
Apply (no span)
Var(test_module_fib) (no span)
Var(test_module_fib) (no span)
Lambda(__no_inline__) (no span)
Lambda(test_module_fib) (no span)
Lambda(n_id_0) (no span)
Force (no span)
Apply (no span)
Apply (no span)
Apply (no span)
Force (no span)
Builtin(IfThenElse) (no span)
Apply (no span)
Apply (no span)
Builtin(LessThanInteger) @ 42..47 = "n < 2"
Var(n_id_0) @ 42..43 = "n"
Constant(Integer(2)) (no span)
Delay (no span)
Var(n_id_0) @ 60..61 = "n"
Delay (no span)
Apply (no span)
Apply (no span)
Builtin(AddInteger) (no span)
Apply (no span)
Apply (no span)
Var(test_module_fib) @ 89..92 = "fib"
Var(test_module_fib) @ 89..92 = "fib"
Apply (no span)
Apply (no span)
Builtin(SubtractInteger) @ 93..98 = "n - 1"
Var(n_id_0) @ 93..94 = "n"
Constant(Integer(1)) (no span)
Apply (no span)
Apply (no span)
Var(test_module_fib) @ 102..105 = "fib"
Var(test_module_fib) @ 102..105 = "fib"
Apply (no span)
Apply (no span)
Builtin(SubtractInteger) @ 106..111 = "n - 2"
Var(n_id_0) @ 106..107 = "n"
Constant(Integer(2)) (no span)
Constant(Integer(10)) (no span)
Constant(Integer(55)) (no span)

- Add SourceMap type for mapping UPLC node indices to source locations
- Add the ability to generate a sourcemap when building, either externally or in the blueprint JSON
- Supports exporting tests (useful as a source of complex examples)
- Adds a --list flag to make identifying what exactly to export easier

Exposes an interface to run the CEK machine one step at a time (essentially just exposing a few functions).

More importantly: now that we've added Context to terms and propagated that through compilation, when we *run* the machine we have to erase the context down to unit (). This makes things like debugger support awkward, because as the machine executes, the term being executed gets manipulated. That means to map it back to the generated source maps, we'd need some kind of pattern matching system. So instead, if we update the Machine to be able to run generically over context (i.e. preserve the context as we juggle the CEK machine), then we can use that context to attach a post-order numbering to each node, and use that to index into the source maps. To that end, we add a context parameter to Value, BuiltinRuntime, and Env.

Of note, we don't make Error generic over Context. While that might be useful for better error messages on failure, it's a much bigger refactor, and isn't critical for debugging steps. So for now, we just erase the context when constructing errors, and provide a utility for lifting Value<()> into the default context.

It doesn't do us much good to carry the source spans through the whole compilation process, if optimizations just screw with the UPLC tree at the end. This ensures that the interning, shrinking, and other optimizations carry through the relevant context. This also fixes the order of operations to apply used functions before optimization.

- Add more source spans during code generation
- Reorder pipeline to ensure source maps survive optimization
- Unify the codegen paths to always use generic context

Spans aren't sufficient to actually track source location; we were trying to match based on module, but inlining messes with that. So, we'll pass the source file name in all the way through, and we introduce a SourceLocation type for this.

Thread SourceLocation through the code generation to all UPLC term construction sites, including:
- Assignments and let bindings
- Boolean operators (and/or chains)
- Function application

Ensure DeBruijn conversion preserves context. This should avoid bugs with diverging implementations, the recurring theme of this PR.

Allows us to do compilation and codegen, but skip any optimizations that might screw with the execution codepath, making it easier to debug a contract

Adds a field to sourcemaps that map var's and lambdas to the variable names that introduce them; this lets debuggers show original source variable names in the environment!

Operates identically to the test command, but prints out a coverage report, using the new source map capabilities!

- Add is_empty() method to Env (len_without_is_empty)
- Allow type_complexity in builtin_curry_reducer
- Remove redundant iter cloning in script_context.rs

- Add #[allow(clippy::too_many_arguments)] to a few functions
- Rename to_string() to render()
- Add #[allow(clippy::type_complexity)] to collect_tests_for_coverage
- Fix a few unused variables
- Remove unused imports
- Fix needless_borrow in export command
- Use arrays instead of vec![] in tests
- Use !is_empty() instead of len() > 0

I had a different Rust version locally so these weren't showing up.
- Used derive(Default) on SourceLocation
- Allow some unused assignments, because they're used by miette
- Add result_large_err to existing allow attribute

Fix 1: Source map path bug (lib.rs:798)

Problem: validator.title can contain slashes from nested module paths (e.g. governance/voting.my_vote.spend). When used directly as a filename in dir.join(), the slash creates nested subdirectories. Since only dir was created with create_dir_all, the intermediate directories don't exist and fs::write fails.

Fix: Replace / with - in the validator title before constructing the filename:

let safe_title = validator.title.replace('/', "-");
let filename = format!("{}.sourcemap.json", safe_title);

This produces flat filenames like governance-voting.my_vote.spend.sourcemap.json.

Fix 2: Coverage property test iterations (lib.rs:615-670)

Problem: property_max_success was ignored (prefixed with _) and property tests were only sampled once during coverage, missing code paths that only get exercised with different fuzzer inputs.

Fix:
- Renamed _property_max_success back to property_max_success and threaded it through to collect_tests_for_coverage
- Replaced the single-sample logic with a loop that chains the PRNG state across property_max_success iterations, generating a separate test program for each sampled value
- Each iteration's coverage gets merged into the aggregate, giving proper coverage reporting across all fuzzer inputs
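
The filename sanitization from Fix 1 can be checked in isolation. The helper name `source_map_path` below is hypothetical; only the `replace('/', "-")` step and the `.sourcemap.json` suffix come from the fix description.

```rust
use std::path::{Path, PathBuf};

/// Build a flat source-map path by replacing '/' in the title with '-'.
/// (Hypothetical helper; the real code lives inline in lib.rs.)
pub fn source_map_path(dir: &Path, validator_title: &str) -> PathBuf {
    let safe_title = validator_title.replace('/', "-");
    dir.join(format!("{safe_title}.sourcemap.json"))
}
```

For `governance/voting.my_vote.spend` this yields a file directly inside `dir`, so `create_dir_all(dir)` is sufficient. One trade-off of this scheme is that titles differing only by `/` versus `-` would map to the same filename.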

1. Preserve term context through DeBruijn round-trip in optimizer (afterwards, run_once_pass) by collecting and restoring contexts instead of replacing with C::default().
2. Fix source-map index stability by visiting function before argument in Apply traversal, so wrapping a term doesn't shift existing indices.
3. Propagate fuzzer sampling errors in coverage instead of silently breaking, which could produce false success with empty reports.
4. Aggregate property test iterations by base name for fail-once semantics: the property passes if ANY iteration fails, not each independently.
5. Include configured directory prefix in sourceMapFile references so they resolve relative to the blueprint location.

The code is sufficiently complex already, and the addition of the source map context makes it a bit more so. That's okay. But duplicating entire chunks of logic with no changes when we have generics is not! There's a similar cleanup to be done around the 'coverage' command, which is, for the most part, a duplicate of the `check` command.

Note that this change also drops 'finalize_minimal' and makes the `no_optimize` flag unused. It is not a mistake. That flag should not exist and shall be removed in upcoming commits.

Signed-off-by: KtorZ <matthias.benkort@gmail.com>

KtorZ · 2026-03-15T17:13:46Z

I've looked into it for a good part of the afternoon and I am afraid this will require more work. Not so much regarding the outcome itself (I didn't really get to the point of even trying it out), but the code:

Quite a lot of logic has been duplicated across commands and internal functions. I've done a first round of cleanup for the internals, leaning more towards generics. The commands are still TODO. In particular, from the quick glance I had at the coverage command, it seems it would be better done by adding flags to check, which would likely avoid the duplication altogether.
I am a little skeptical about how the context now introduces a sheer number of clones. In particular, removing context from a program doesn't look like a cheap operation given the size of the ASTs. Maybe it's not a big deal, but maybe it is, especially around the test framework. I'd like to at least see high-level execution benchmarks (using e.g. hyperfine) to analyze the impact on the compiler.
I have made the no_optimize flag redundant; because I don't quite agree with the idea of producing a non-optimized build. Unless you can clearly argue for it, I would prefer not to expose users to this.

Quantumplation · 2026-03-15T17:28:27Z

I'll take another pass at this, no worries :)

I agree with your first two suggestions, and partially agree with your third. I can definitely see not exposing this to users, but what about making it a secret flag that doesn't show up on man pages etc.? It's quite useful when mucking around in the guts, as I did here, or when adding new optimizations, because you can compare the semantics of the "pure" implementation of the UPLC (i.e. the "most obviously correct" version generated by the compiler just from the raw semantics of the code) with the "optimized" version.

…nchmarks

Merge the standalone `coverage` command into `check --lcov <path>`, eliminating duplicated CLI args, compilation pipeline, and test-filter parsing logic. Extract shared `parse_match_filters` / `test_matches_filters` helpers used by both the normal test path and coverage path.

Remove the dead `no_optimize` parameter from `export` (accepted but never read after the finalize_minimal removal in e3de042).

Add `examples/acceptance_tests/bench-compiler`, a hyperfine wrapper for comparing compiler wall-clock performance across branches.

Benchmark results (main vs source-maps, 4 projects):
- hello_world: 1.00x (within noise)
- 071: 1.07x slower
- 089: 1.06x slower
- 104: 1.03x slower

The ~3-7% overhead comes from carrying SourceLocation context through the AST during code generation. Will investigate optimizations (e.g. strip_context cost, context cloning in test framework) as a follow-up.

Fix pre-existing test breakage from e3de042: update doc tests in `uplc/flat.rs` for the new generic context parameter, fix `Term::Constant` named-field syntax in parser test, adapt gen_uplc tests to the renamed `generate_raw` API and add `strip_context()` before DeBruijn conversion.

SourceLocation is attached to every AST node and cloned frequently during map_context traversals, DeBruijn round-trips, and optimizer passes. The module field was a String, making every clone a heap allocation — despite all terms within a compilation unit sharing the same module name. Replace `module: String` with `module: Rc<str>` so clones are O(1) reference count bumps. The Rc is created once per generate() / generate_raw() call and shared via SourceLocation::with_rc(). Benchmark (104, largest acceptance test): 1.03x slower → 1.00x vs main.
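
The effect of that change can be sketched with a simplified location type. The field layout and the exact signature of `with_rc` below are assumptions made for illustration; only the `module: Rc<str>` idea and the existence of a `with_rc` constructor come from the commit message.

```rust
use std::rc::Rc;

/// Hypothetical, simplified location type with a shared module name.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct SourceLocation {
    pub module: Rc<str>,
    pub start: usize,
    pub end: usize,
}

impl SourceLocation {
    /// Reuse an existing `Rc<str>` instead of allocating a new string.
    /// (Assumed signature; the real constructor may differ.)
    pub fn with_rc(module: &Rc<str>, start: usize, end: usize) -> Self {
        SourceLocation {
            module: Rc::clone(module), // bumps the reference count only
            start,
            end,
        }
    }
}
```

Cloning such a value copies two integers and increments a reference count, so every location created within one `generate()` call can point at the same module-name allocation.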

Quantumplation · 2026-03-28T17:32:37Z

@KtorZ I removed the duplication, added a benchmark script (which showed this branch was only 3-7% slower), switched from strings to Rc<str> (got a few percentage points of speedup so we're within noise of main), and found a 24-35% speedup elsewhere in the compiler to make the whole thing a net win :)

I'll resolve merge conflicts after lunch.

Please don't spend a ton of time reviewing/refactoring, as I fear that will sour you on the idea overall 😅 Feel free to just skim and let me know what you're generally unhappy with (if anything), and I can iterate on that (potentially with @SupernaviX as a second set of eyes) before you do another deep dive review.

Draft PR description for submission to aiken-lang/aiken. Covers motivation, design rationale, relationship to PR aiken-lang#1250, testing results (828 tests, 126 acceptance, benchmarks), and concrete consumer references (CodeTracer, Gastronomy). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Quantumplation requested a review from a team as a code owner January 7, 2026 18:17

Quantumplation force-pushed the pi/source-maps branch 2 times, most recently from a45c83e to d82d733 Compare January 8, 2026 15:00

Riley-Kilgore requested changes Mar 2, 2026

Quantumplation and others added 23 commits March 15, 2026 16:16

Add location tracking to Air(Tree) nodes

20a47eb

Improve codegen pipeline for source maps

d1a5643

Add SourceLocation type

ce51732

Propagate the SourceLocation through code gen

f787e3d

Ensure DeBruijn conversion preserves context

b396a42

Add a no-optimize flag

72251cb

Add variables to source maps

5fee11a

Add code coverage command

84e08e4

Final cargo fmt

4f59912

Fix clippy warnings

a425b70

Even more clippy fixes!

67b4445

Cargo fmt

814a9c8

Clippy

f1d6af6

Cargo fmt

79fc43c

KtorZ force-pushed the pi/source-maps branch from afbfba5 to e3de042 Compare March 15, 2026 17:05

Quantumplation added 3 commits March 28, 2026 11:22

Replace HashSet with Vec in parser error expected patterns

d5eabd1

KtorZ closed this May 14, 2026

KtorZ reopened this May 14, 2026

Quantumplation wants to merge 26 commits into main from pi/source-maps

3 participants