Add decoder-only autoregressive transformer example (#57) #448

Closed
lukstafi wants to merge 35 commits into ahrefs:master from lukstafi:ludics/gh-ocannl-57-s1/root

Conversation

@lukstafi
Collaborator

Summary

  • Add decoder_only_block and decoder_only to lib/nn_blocks.ml — reusable building blocks for decoder-only transformers (masked self-attention + FFN, no cross-attention)
  • Add test/training/transformer_names.ml — a complete training + autoregressive generation example on the Names dataset using character-level encoding and causal masking
  • Add test/operations/decoder_only_test.ml — regression test exercising the new decoder_only API with a 2-layer stack forward pass
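
To make "masked self-attention" and "causal masking" concrete, here is a minimal sketch in plain Python. It is not the OCANNL code from `lib/nn_blocks.ml`; the helper names `causal_mask` and `masked_softmax` are hypothetical and chosen only for illustration. It shows how a lower-triangular mask restricts each position to attend to itself and earlier positions.

```python
import math


def causal_mask(seq_len):
    """Lower-triangular mask: position i may attend to positions j <= i."""
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]


def masked_softmax(scores, mask):
    """Row-wise softmax over `scores`, giving zero weight where mask is False."""
    out = []
    for row, mask_row in zip(scores, mask):
        visible = [s for s, m in zip(row, mask_row) if m]
        peak = max(visible)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) if m else 0.0 for s, m in zip(row, mask_row)]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


# Uniform raw scores: each row spreads its weight evenly over the visible prefix.
scores = [[0.0] * 4 for _ in range(4)]
weights = masked_softmax(scores, causal_mask(4))
```

With uniform scores, row `i` of `weights` puts equal weight on positions `0..i` and zero on every later position, which is the property an autoregressive decoder relies on.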

Test plan

  • dune build @check passes
  • dune runtest test/training/transformer_names.ml passes (loss decreases, generates name-like output)
  • dune runtest test/operations/decoder_only_test.ml passes (output shape validated)

🤖 Generated with Claude Code

@lukstafi lukstafi force-pushed the ludics/gh-ocannl-57-s1/root branch from 72c5cba to 6eb00e5 Compare April 15, 2026 08:20
lukstafi and others added 29 commits April 15, 2026 10:21
Train a small decoder-only transformer on sequences from an 8-state
binary-input finite state machine. The model learns the FSM transition
function, demonstrated by >90% valid-transition accuracy on held-out
sequences.

Key implementation details:
- Single attention block without layer_norm (recentered init workaround)
- Separate forward-only inference routine sharing trained weights
- Valid-transition accuracy metric for hidden-input FSM evaluation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
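
The commit message does not spell out how the valid-transition metric is computed, so the following plain-Python sketch is one plausible reading, not the committed OCaml code. It assumes the model sees only the visited states (the input bits stay hidden), and that a predicted next state counts as valid if either input bit could have produced it. All names (`make_fsm`, `sample_states`, `valid_transition_accuracy`) are hypothetical.

```python
import random

NUM_STATES = 8


def make_fsm(rng):
    """Random transition table: delta[state][bit] -> next state."""
    return [[rng.randrange(NUM_STATES) for _ in range(2)] for _ in range(NUM_STATES)]


def sample_states(delta, length, rng, start=0):
    """Walk the FSM on random hidden bits, recording only the visited states."""
    states = [start]
    for _ in range(length - 1):
        bit = rng.randrange(2)
        states.append(delta[states[-1]][bit])
    return states


def valid_transition_accuracy(delta, contexts, predictions):
    """Fraction of predictions reachable in one step from the context's last state."""
    hits = sum(pred in delta[ctx[-1]] for ctx, pred in zip(contexts, predictions))
    return hits / len(predictions)


rng = random.Random(0)
delta = make_fsm(rng)
seq = sample_states(delta, 16, rng)
contexts = [seq[:i] for i in range(1, len(seq))]
oracle = [seq[i] for i in range(1, len(seq))]  # the true next states
accuracy = valid_transition_accuracy(delta, contexts, oracle)
```

An oracle that replays the true next states scores 1.0 under this definition; a trained model would be scored by passing its own predictions in place of `oracle`.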
…key crash

Add proposal for task-bb30d0be covering two independent test failures:
update test_ndarray_binary_io.expected and fix Map.of_alist_exn crash in
derive_projections for 1x1 output convolution/pooling.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add per-test output directory isolation via build_files_prefix config option
to prevent flaky test failures from parallel test execution.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…assignment (ahrefs#420)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…refs#427)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…hrefs#412)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Covers aligned memory allocation, compiler flag validation, auto-vectorization-
friendly code generation (restrict, pragma hints, aligned locals), and platform
detection macros. Scoped as foundation for tiling task's explicit SIMD micro-kernels.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…#253)

Gap analysis identifying which llm.c techniques apply to OCANNL,
which are already covered by existing tasks (tiling ahrefs#412, AVX ahrefs#164,
megakernels ahrefs#318), and which unique lessons remain (warp shuffles,
fused reductions, GELU, AdamW, vectorized memory access).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…hrefs#277)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
lukstafi and others added 6 commits April 15, 2026 10:21
…nd (ahrefs#170)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Proposes Equality_with_index fetch op and simplify_llc pattern detection
to replace one-hot * matrix multiply with direct row lookup.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
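
The equivalence behind this proposal can be illustrated in a few lines of plain Python. This sketch does not show `Equality_with_index` or `simplify_llc`, which are OCANNL internals; it only demonstrates, with hypothetical helper names, that multiplying a one-hot vector by a matrix selects a single row, so the dense product can be replaced by an index lookup.

```python
def one_hot(index, size):
    """Vector of zeros with a single 1.0 at `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def vec_mat(vec, matrix):
    """Dense vector-matrix product: result[j] = sum_i vec[i] * matrix[i][j]."""
    cols = len(matrix[0])
    return [sum(v * row[j] for v, row in zip(vec, matrix)) for j in range(cols)]


def row_lookup(index, matrix):
    """Direct row selection: no multiplications, no additions."""
    return list(matrix[index])


embedding = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
dense = vec_mat(one_hot(2, 3), embedding)   # touches all rows
direct = row_lookup(2, embedding)           # reads exactly one row
```

The dense product costs work proportional to rows times columns, while the lookup reads a single row, which is why detecting the pattern is worthwhile for embedding-style layers.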
ahrefs#151)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…hrefs#57)

Add decoder_only_block and decoder_only to nn_blocks.ml as reusable
building blocks for autoregressive language models (masked self-attention
+ FFN, no cross-attention).

Add test/training/transformer_names.ml: a complete training + generation
example using character-level encoding on the Names dataset with causal
masking, SGD training, and autoregressive token-by-token generation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
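
As a rough illustration of token-by-token autoregressive generation, the sketch below uses plain Python with a toy stand-in for the trained model. Nothing here is taken from `transformer_names.ml`: the function names, the token numbering, and the use of one shared start/end marker are assumptions made for the example.

```python
import random


def sample(probs, rng):
    """Draw an index from a categorical distribution given as probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate(next_token_probs, bos, eos, max_len, rng):
    """Feed the growing prefix back into the model until EOS or max_len tokens."""
    tokens = [bos]
    while len(tokens) < max_len:
        token = sample(next_token_probs(tokens), rng)
        if token == eos:
            break
        tokens.append(token)
    return tokens[1:]  # drop the start marker


def toy_model(prefix):
    """Hypothetical model: emit two letters (tokens 1 or 2), then the end marker 0."""
    if len(prefix) < 3:
        return [0.0, 0.5, 0.5]
    return [1.0, 0.0, 0.0]


name = generate(toy_model, bos=0, eos=0, max_len=10, rng=random.Random(0))
```

In a real run, `next_token_probs` would wrap a forward pass of the trained decoder over the current prefix, with the causal mask ensuring each position only sees earlier characters.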
Exercises the new Nn_blocks.decoder_only helper with a 2-layer stack,
causal mask, and forward pass, validating output shape. This ensures
the new public API added in the previous commit has CI coverage.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@lukstafi lukstafi force-pushed the ludics/gh-ocannl-57-s1/root branch from 6eb00e5 to ee621a9 Compare April 15, 2026 08:22
@lukstafi
Collaborator Author

@codex review Focus on bugs, correctness issues, and edge cases. Do not check adherence to a spec or plan.

@lukstafi
Collaborator Author

Closing: this PR was filed against the wrong repo due to a gh-resolved bug (see lukstafi/ludics#302). Ported to lukstafi#5.

@lukstafi lukstafi closed this Apr 15, 2026