RULER remove answer prefix #953

hsiehjackson · 2025-10-15T20:39:22Z

Summary by CodeRabbit

New Features
- None
Refactor
- Simplified preparation by removing generation-related options from settings and output.
- Question text now appends the answer prefix; the generation field is no longer emitted.
- Max sequence length now accounts for template tokens earlier for more accurate limits.
Chores
- Reduced samples per prepare task from 500 to 100 to speed up data preparation.
- Stopped injecting token count defaults into per-task settings.

Signed-off-by: Cheng-Ping Hsieh <[email protected]>

coderabbitai · 2025-10-15T20:39:47Z

Walkthrough

The RULER data preparation script updates remove generation-specific configs, change how question text is composed, drop generation fields from outputs, adjust max sequence length calculation, reduce per-task sample counts, and stop writing tokens_to_generate into generated per-task init.py files.

Changes

Cohort / File(s)	Summary of changes
RULER data prep adjustments `nemo_skills/dataset/ruler/prepare.py`	- Removed generation args: `tokens_to_generate`, `start_assistant_response_key`, `endpoint_type` usage - Compose question as `original_input + answer_prefix`; removed `generation` field from outputs - Omit `tokens_to_generate` in emitted per-task `__init__.py` DEFAULT_SETTINGS - Reduce samples per prepare task: 500 → 100 - Subtract `template_tokens` earlier when setting `max_seq_length`

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant U as CLI/Caller
  participant P as prepare.py
  participant T as Task Loader
  participant W as Writer

  U->>P: run RULER prepare
  P->>T: load tasks and templates
  Note over P,T: max_seq_length computed earlier<br/>(subtract template_tokens)
  loop per task (limited to 100 samples)
    P->>T: read original_entry
    Note right of P: question = input + answer_prefix<br/>(no generation config)
    P->>W: write prepared sample (no "generation" field)
  end
  P->>W: write per-task __init__.py<br/>(no tokens_to_generate in DEFAULT_SETTINGS)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

I twirl my ears through fields of code,
Snipping tokens from the traveler’s load.
Questions stitched with prefix thread,
Lighter packs, fewer crumbs to spread.
Shorter hops, same moonlit track—
A tidy warren, bouncing back. 🐇✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.
Title Check	❓ Inconclusive	The title "RULER remove answer prefix" references a real aspect of the changeset—the handling of answer_prefix is indeed modified in this PR. However, the term "remove" is potentially misleading. Based on the summary, the answer_prefix is not being deleted entirely; rather, it's being incorporated into the question field by appending it (original_input + answer_prefix), and generation-related configurations are being refactored. The title does not clearly convey the primary architectural changes, such as the removal of the generation field and generation configurations, or the incorporation of the prefix into the question structure. The title is also somewhat vague about what exactly is being "removed" from the user's perspective.	Consider revising the title to be more precise about what is actually being changed. A more descriptive title might be something like "RULER incorporate answer prefix into question field" or "RULER refactor answer prefix handling" to better reflect that the prefix is being reorganized rather than removed. This would clarify the actual behavior change and make the intent clearer to reviewers scanning the PR history.

✅ Passed checks (1 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch chsieh/ruler-remove-prefix

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3a3eba8 and d7d272f.

📒 Files selected for processing (1)

nemo_skills/dataset/ruler/prepare.py (2 hunks)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: unit-tests

🔇 Additional comments (4)

nemo_skills/dataset/ruler/prepare.py (4)

25-32: LGTM! Generation configs properly removed.

The simplified GENERATION_ARGS aligns with the PR objective to remove generation-specific configurations. The removal of tokens_to_generate and related fields is consistent with moving the answer prefix into the question field itself.

104-104: Verify template_tokens adjustment doesn't affect task-specific length requirements.

The template_tokens subtraction is now applied globally before task preparation. While this simplifies the logic, ensure that all tasks can properly handle this uniform adjustment without requiring task-specific template token calculations.

107-115: Verify reduced sample count meets evaluation requirements.

The sample count has been reduced from 500 to 100 (80% reduction). While this significantly speeds up data preparation, ensure that 100 samples per task provide sufficient coverage for robust RULER evaluation and that this aligns with the benchmark's requirements.

44-50: Ensure answer_prefix includes a separator when building the question
Concatenating original_entry["input"] + original_entry["answer_prefix"] can merge words if answer_prefix doesn’t start with a space or newline—confirm every answer_prefix begins with an appropriate separator or explicitly insert one (e.g. f"{input} {answer_prefix}").

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

fayejf

let's pause this for two weeks.

remove answer prefix and reduce num of samples

d7d272f

Signed-off-by: Cheng-Ping Hsieh <[email protected]>

hsiehjackson requested a review from fayejf October 15, 2025 20:39

fayejf requested changes Oct 17, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

RULER remove answer prefix #953

RULER remove answer prefix #953

Uh oh!

hsiehjackson commented Oct 15, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Oct 15, 2025 •

edited

Loading

Uh oh!

fayejf left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

RULER remove answer prefix #953

Are you sure you want to change the base?

RULER remove answer prefix #953

Uh oh!

Conversation

hsiehjackson commented Oct 15, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Oct 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Poem

Pre-merge checks and finishing touches

Uh oh!

fayejf left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

hsiehjackson commented Oct 15, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Oct 15, 2025 •

edited

Loading