Skip to content

Conversation

@hsiehjackson
Copy link
Collaborator

@hsiehjackson hsiehjackson commented Oct 15, 2025

Summary by CodeRabbit

  • New Features

    • None
  • Refactor

    • Simplified preparation by removing generation-related options from settings and output.
    • Question text now appends the answer prefix; the generation field is no longer emitted.
    • Max sequence length now accounts for template tokens earlier for more accurate limits.
  • Chores

    • Reduced samples per prepare task from 500 to 100 to speed up data preparation.
    • Stopped injecting token count defaults into per-task settings.

@hsiehjackson hsiehjackson requested a review from fayejf October 15, 2025 20:39
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Oct 15, 2025

Walkthrough

The RULER data preparation script updates remove generation-specific configs, change how question text is composed, drop generation fields from outputs, adjust max sequence length calculation, reduce per-task sample counts, and stop writing tokens_to_generate into generated per-task init.py files.

Changes

Cohort / File(s) Summary of changes
RULER data prep adjustments
nemo_skills/dataset/ruler/prepare.py
- Removed generation args: tokens_to_generate, start_assistant_response_key, endpoint_type usage
- Compose question as original_input + answer_prefix; removed generation field from outputs
- Omit tokens_to_generate in emitted per-task __init__.py DEFAULT_SETTINGS
- Reduce samples per prepare task: 500 → 100
- Subtract template_tokens earlier when setting max_seq_length

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant U as CLI/Caller
  participant P as prepare.py
  participant T as Task Loader
  participant W as Writer

  U->>P: run RULER prepare
  P->>T: load tasks and templates
  Note over P,T: max_seq_length computed earlier<br/>(subtract template_tokens)
  loop per task (limited to 100 samples)
    P->>T: read original_entry
    Note right of P: question = input + answer_prefix<br/>(no generation config)
    P->>W: write prepared sample (no "generation" field)
  end
  P->>W: write per-task __init__.py<br/>(no tokens_to_generate in DEFAULT_SETTINGS)
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

I twirl my ears through fields of code,
Snipping tokens from the traveler’s load.
Questions stitched with prefix thread,
Lighter packs, fewer crumbs to spread.
Shorter hops, same moonlit track—
A tidy warren, bouncing back. 🐇✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
Title Check ❓ Inconclusive The title "RULER remove answer prefix" references a real aspect of the changeset—the handling of answer_prefix is indeed modified in this PR. However, the term "remove" is potentially misleading. Based on the summary, the answer_prefix is not being deleted entirely; rather, it's being incorporated into the question field by appending it (original_input + answer_prefix), and generation-related configurations are being refactored. The title does not clearly convey the primary architectural changes, such as the removal of the generation field and generation configurations, or the incorporation of the prefix into the question structure. The title is also somewhat vague about what exactly is being "removed" from the user's perspective. Consider revising the title to be more precise about what is actually being changed. A more descriptive title might be something like "RULER incorporate answer prefix into question field" or "RULER refactor answer prefix handling" to better reflect that the prefix is being reorganized rather than removed. This would clarify the actual behavior change and make the intent clearer to reviewers scanning the PR history.
✅ Passed checks (1 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch chsieh/ruler-remove-prefix

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3a3eba8 and d7d272f.

📒 Files selected for processing (1)
  • nemo_skills/dataset/ruler/prepare.py (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: unit-tests
🔇 Additional comments (4)
nemo_skills/dataset/ruler/prepare.py (4)

25-32: LGTM! Generation configs properly removed.

The simplified GENERATION_ARGS aligns with the PR objective to remove generation-specific configurations. The removal of tokens_to_generate and related fields is consistent with moving the answer prefix into the question field itself.


104-104: Verify template_tokens adjustment doesn't affect task-specific length requirements.

The template_tokens subtraction is now applied globally before task preparation. While this simplifies the logic, ensure that all tasks can properly handle this uniform adjustment without requiring task-specific template token calculations.


107-115: Verify reduced sample count meets evaluation requirements.

The sample count has been reduced from 500 to 100 (80% reduction). While this significantly speeds up data preparation, ensure that 100 samples per task provide sufficient coverage for robust RULER evaluation and that this aligns with the benchmark's requirements.


44-50: Ensure answer_prefix includes a separator when building the question
Concatenating original_entry["input"] + original_entry["answer_prefix"] can merge words if answer_prefix doesn’t start with a space or newline—confirm every answer_prefix begins with an appropriate separator or explicitly insert one (e.g. f"{input} {answer_prefix}").


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Collaborator

@fayejf fayejf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's pause this for two weeks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants