
Hardening: Decouple Environment Configuration from Hardcoded Tokenizers #461

@RUFFY-369


Describe the Issue

The environment scripts, specifically environments/gsm8k_server_teacher_distill.py, hardcode the model and tokenizer strings (both point to NousResearch/DeepHermes-3-Llama-3-3B-Preview).

Because of this hardcoding, the framework cannot be used with other model architectures (e.g., Llama 3.1, Qwen, or custom checkpoints) without editing the source. If a user serves a different model while the tokenizer remains hardcoded to DeepHermes, the vocabulary mismatch can cause token ID out-of-range errors, incorrect prompt formatting, or silent performance degradation.
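To make the out-of-range failure mode concrete, here is a minimal sketch of the kind of check that would surface the mismatch early. The function name `find_out_of_range_ids` and the numbers used are hypothetical and purely illustrative; they are not taken from Atropos or from any real checkpoint.

```python
def find_out_of_range_ids(token_ids, model_vocab_size):
    """Return the token IDs that a model with `model_vocab_size` entries cannot index."""
    return [t for t in token_ids if t < 0 or t >= model_vocab_size]


# Illustrative values only: a tokenizer with a larger vocabulary than the served model.
tokenizer_output = [12, 4051, 150_000]
model_vocab_size = 128_000

bad_ids = find_out_of_range_ids(tokenizer_output, model_vocab_size)
if bad_ids:
    print(f"token IDs out of range for the served model: {bad_ids}")
```

A check like this only catches the loud failure. When both vocabularies happen to be the same size but map IDs to different strings, nothing is out of range and the degradation is silent, which is why matching the tokenizer to the model matters more than validating IDs after the fact.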

Environment/API Details

  • Environment Class/Name: environments/gsm8k_server_teacher_distill.py
  • Environment Configuration: GSM8kTeacherDistillEnv
  • API Endpoint/Method Involved: config_init and teacher_config_init

Steps to Reproduce

  1. Attempt to run a training task with a non-DeepHermes model (e.g., Llama-3-8B).
  2. Observe that the environment still tries to load the DeepHermes tokenizer.
  3. Observe potential crashes during decoding or evaluation if the vocabularies differ.

Interaction Details (if applicable)

  • Expected Behavior:
    1. The environment should allow overriding the student/teacher models and tokenizers via environment variables (e.g., STUDENT_MODEL, TEACHER_MODEL).
    2. The configuration should support dynamic resolution to ensure the tokenizer always matches the model being served.
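The expected behavior above could be sketched roughly as follows. This is not the actual Atropos implementation: `ModelSpec`, `resolve_model_spec`, and the `STUDENT_TOKENIZER` / `TEACHER_TOKENIZER` variables are hypothetical names introduced for illustration, and using the DeepHermes checkpoint as the fallback for both roles is an assumption. Only `STUDENT_MODEL` and `TEACHER_MODEL` come from the issue text.

```python
import os
from dataclasses import dataclass

# Fallback taken from the string currently hardcoded in the environment script.
DEFAULT_MODEL = "NousResearch/DeepHermes-3-Llama-3-3B-Preview"


@dataclass(frozen=True)
class ModelSpec:
    model_name: str
    tokenizer_name: str


def resolve_model_spec(prefix, env=None, default_model=DEFAULT_MODEL):
    """Resolve model and tokenizer names for a role such as "STUDENT" or "TEACHER".

    Reads <prefix>_MODEL and the hypothetical <prefix>_TOKENIZER. When no
    tokenizer is given, it follows the model so the two cannot drift apart.
    """
    env = os.environ if env is None else env
    model = env.get(f"{prefix}_MODEL") or default_model
    tokenizer = env.get(f"{prefix}_TOKENIZER") or model
    return ModelSpec(model_name=model, tokenizer_name=tokenizer)


# Usage: override only the student model; the tokenizer follows it automatically.
student = resolve_model_spec("STUDENT", env={"STUDENT_MODEL": "my-org/custom-checkpoint"})
teacher = resolve_model_spec("TEACHER", env={})
print(student)
print(teacher)
```

Defaulting the tokenizer to the model name, rather than to a fixed string, is the design choice that addresses point 2: a user who sets only `STUDENT_MODEL` gets a matching tokenizer without any further configuration.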

Setup Details

  • OS: Linux
  • Python Version: 3.10+
  • Atropos Version: commit c20c852

Additional Context & Logs

The proposed refactoring would make the Atropos environment suite model-agnostic, so that users can experiment across model families without patching the source.
