Refactor data generation return type by ealt · Pull Request #177 · Astera-org/simplexity

ealt · 2026-03-11T00:16:21Z

See docs/plans/refactor-gen-process-returns.md.

Note

Medium Risk
This is a breaking API change: generate() and the generator helpers no longer return tuples, so any external call sites must migrate to dict key access. Moderate risk of subtle shape/empty-sentinel regressions in downstream code that assumed previous tuple semantics.

Overview
Refactors generative-process generation APIs to return structured TypedDict payloads instead of tuples. GenerativeProcess.generate() (and IndependentFactoredGenerativeProcess.generate()) now returns GenerateResult with states, observations, and all_states; when return_all_states=False, all_states becomes a pytree-preserving empty sentinel.
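The return shape described above can be sketched in plain Python. This is a minimal illustration, not the code from the PR: the toy `generate` function, the `toy_step` helper, and the use of nested lists in place of JAX arrays are all assumptions made so the example runs with the standard library alone. Only the field names `states`, `observations`, and `all_states` and the `return_all_states` flag come from the summary.

```python
from typing import Any, TypedDict


class GenerateResult(TypedDict):
    """Stand-in for the PR's GenerateResult; the field types here are assumed."""

    states: dict[str, list[float]]   # final state, i.e. the last loop carry
    observations: list[int]          # one emitted token per step
    all_states: dict[str, list[Any]]  # per-step states, or an empty sentinel


def toy_step(state: dict[str, list[float]]) -> dict[str, list[float]]:
    """Hypothetical transition: swap the two belief entries."""
    return {"belief": list(reversed(state["belief"]))}


def generate(initial_state, steps, return_all_states=False) -> GenerateResult:
    state = initial_state
    history, observations = [], []
    for t in range(steps):
        history.append(state)       # state before emitting observation t
        observations.append(t % 2)  # toy emission, not a real process
        state = toy_step(state)
    if return_all_states:
        # Stack the per-step values under the same keys as the state.
        all_states = {key: [s[key] for s in history] for key in initial_state}
    else:
        # Empty sentinel: same keys as the state, but no per-step entries.
        all_states = {key: [] for key in initial_state}
    return GenerateResult(states=state, observations=observations, all_states=all_states)


full = generate({"belief": [0.9, 0.1]}, steps=3, return_all_states=True)
lean = generate({"belief": [0.9, 0.1]}, steps=3)
```

Callers read fields by key, for example `lean["observations"]`, instead of unpacking a tuple. Because the sentinel keeps the same keys as the populated result, code that walks `all_states` does not need a separate branch for the case where no history was requested.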

Unifies generate_data_batch() and generate_data_batch_with_full_history() so that both return a single DataBatch type in which gen_states, inputs, labels, belief_states, and prefix_probabilities are always present; fields that carry history are empty, with shape (batch, 0), when history isn't requested. The torch wrapper mirrors this via TorchDataBatch and converts only the token tensors to torch. Updates the end-to-end training code and test suites to use dict access, and adds assertions that cover the empty-sentinel shapes and check that states reflects the final scan carry.
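A rough sketch of the "always-present fields" idea follows. The five field names come from the summary; everything else is assumed for illustration: the hypothetical `make_batch` helper, the next-token shift used to derive `inputs` and `labels`, the placeholder history values, the choice of `belief_states` and `prefix_probabilities` as the fields that are emptied, and nested lists standing in for arrays.

```python
from typing import TypedDict

Matrix = list[list[float]]


class DataBatch(TypedDict):
    """Stand-in for the PR's DataBatch; the field types here are assumed."""

    gen_states: Matrix
    inputs: list[list[int]]
    labels: list[list[int]]
    belief_states: Matrix
    prefix_probabilities: Matrix


def empty_rows(batch_size: int) -> Matrix:
    """List-of-lists analogue of an empty array with shape (batch, 0)."""
    return [[] for _ in range(batch_size)]


def make_batch(tokens, gen_states, with_history=False) -> DataBatch:
    # Assumed next-token setup: inputs drop the last token, labels drop the first.
    inputs = [row[:-1] for row in tokens]
    labels = [row[1:] for row in tokens]
    if with_history:
        # Placeholder values; a real implementation would compute these.
        belief_states = [[0.5] * len(row) for row in inputs]
        prefix_probabilities = [[1.0] * len(row) for row in inputs]
    else:
        belief_states = empty_rows(len(tokens))
        prefix_probabilities = empty_rows(len(tokens))
    return DataBatch(
        gen_states=gen_states,
        inputs=inputs,
        labels=labels,
        belief_states=belief_states,
        prefix_probabilities=prefix_probabilities,
    )


batch = make_batch([[1, 2, 3], [4, 5, 6]], gen_states=[[0.5, 0.5], [0.5, 0.5]])
```

With this layout a consumer can always index `batch["belief_states"]` and check whether its rows are empty, rather than handling two different return types.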

Written by Cursor Bugbot for commit b6a67e8. This will update automatically on new commits.

claude

This PR is a well-structured refactor that improves type safety by moving from tuple returns to TypedDicts. The approach is sound and the test coverage is thorough. A few issues need addressing before merge.

simplexity/generative_processes/torch_generator.py

simplexity/generative_processes/generative_process.py

simplexity/generative_processes/torch_generator.py

casperlchristensen · 2026-03-11T22:00:15Z

@claude The suggested changes are good -- please add them.

claude · 2026-03-11T22:00:33Z

Claude finished @casperlchristensen's task in 1m 17s.

Applied Review Suggestions

- Read relevant files
- Fix imports in torch_generator.py: merged 3 separate import blocks from simplexity.generative_processes.generator into one
- Fix generate_data_batch to use TorchDataBatch(...) constructor instead of plain dict literal
- Fix generate_data_batch_with_full_history to use TorchDataBatch(...) constructor instead of plain dict literal
- Update outdated Returns docstring in generate_data_batch_with_full_history to document all 5 fields including gen_states
- Fix stale belief_states → all_states references in generative_process.py docstring
- Committed and pushed to refactor/data-gen

All 4 review suggestions have been applied in commit 85c4ee1.
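For context on the constructor-versus-literal suggestion: at runtime a TypedDict call produces an ordinary dict, so the change mainly affects readability and what a static type checker can verify at the construction site. The reduced two-field `TorchDataBatch` below is an assumption for illustration; per the summary above, the real type has five fields.

```python
from typing import TypedDict


class TorchDataBatch(TypedDict):
    """Reduced, hypothetical version with two fields instead of five."""

    inputs: list[int]
    labels: list[int]


# Plain dict literal, typed only through the annotation.
literal: TorchDataBatch = {"inputs": [1, 2], "labels": [2, 3]}

# Constructor call: the intended type is explicit where the value is built.
constructed = TorchDataBatch(inputs=[1, 2], labels=[2, 3])
```

Both values compare equal and both are plain `dict` instances, so swapping one form for the other does not change runtime behavior.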

- Merge three separate import blocks from simplexity.generative_processes.generator into one
- Use TorchDataBatch(...) constructor instead of plain dict literals in both generate functions
- Update outdated Returns docstring in generate_data_batch_with_full_history to list all 5 fields
- Fix stale belief_states references in generative_process.py docstring to use all_states

Co-authored-by: Casper Lützhøft Christensen <casperlchristensen@users.noreply.github.com>

ealt added 2 commits March 10, 2026 17:14

Create plan doc

dcb6f67

Implement refactor

b6a67e8

ealt marked this pull request as draft March 11, 2026 00:16

ealt marked this pull request as ready for review March 11, 2026 00:22

ealt requested a review from casperlchristensen March 11, 2026 00:22

claude bot requested changes Mar 11, 2026

simplexity/generative_processes/torch_generator.py (outdated, resolved)

claude bot reviewed Mar 11, 2026

simplexity/generative_processes/torch_generator.py (outdated, resolved)

claude bot reviewed Mar 11, 2026

simplexity/generative_processes/torch_generator.py (outdated, resolved)

simplexity/generative_processes/generative_process.py (outdated, resolved)

claude bot reviewed Mar 11, 2026

simplexity/generative_processes/torch_generator.py (resolved)

casperlchristensen approved these changes Mar 12, 2026

ealt wants to merge 3 commits into main from refactor/data-gen

2 participants
