Hardening: Decouple Environment Configuration from Hardcoded Tokenizers

## Describe the Issue

The environment scripts, specifically `environments/gsm8k_server_teacher_distill.py`, hardcode the model and tokenizer names (both pointing to `NousResearch/DeepHermes-3-Llama-3-3B-Preview`).

This hardcoding prevents the framework from being easily used with other model architectures (e.g., Llama 3.1, Qwen, or custom checkpoints) without manual source code modifications. If a user attempts to use a different model while the tokenizer remains hardcoded to DeepHermes, it can result in `TokenId out of range` errors, incorrect prompt formatting, or silent performance degradation due to vocabulary mismatch.
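
To illustrate the `TokenId out of range` failure mode, here is a minimal sketch of a guard that flags token IDs the served model cannot represent. The helper name `find_out_of_range_ids` and the vocabulary size used in the example are hypothetical and are not part of Atropos.

```python
from typing import Iterable, List


def find_out_of_range_ids(token_ids: Iterable[int], vocab_size: int) -> List[int]:
    """Return the token IDs that fall outside [0, vocab_size).

    Hypothetical helper: IDs produced by one tokenizer are only valid for a
    model whose vocabulary is at least as large as that tokenizer's.
    """
    return [t for t in token_ids if t < 0 or t >= vocab_size]


# Illustrative numbers only: IDs from a tokenizer with a larger vocabulary,
# checked against a model assumed to have 32,000 embedding rows.
bad_ids = find_out_of_range_ids([15, 31999, 32000, 128001], vocab_size=32000)
if bad_ids:
    print(f"Tokenizer/model mismatch, out-of-range IDs: {bad_ids}")
```

A check like this only catches the loud failure. When both vocabularies happen to be the same size, mismatched IDs stay in range and the degradation is silent, which is why the tokenizer should be derived from the model name rather than validated after the fact.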

## Environment/API Details

- **Environment Class/Name:** `environments/gsm8k_server_teacher_distill.py`
- **Environment Configuration:** `GSM8kTeacherDistillEnv`
- **API Endpoint/Method Involved:** `config_init` and `teacher_config_init`

## Steps to Reproduce

1. Attempt to run a training task with a non-DeepHermes model (e.g., Llama-3-8B).
2. Observe that the environment still tries to load the DeepHermes tokenizer.
3. Observe potential crashes during decoding or evaluation if the vocabularies differ.

## Interaction Details (if applicable)

- **Expected Behavior:** 
  1. The environment should allow overriding the student/teacher models and tokenizers via environment variables (e.g., `STUDENT_MODEL`, `TEACHER_MODEL`).
  2. The configuration should support dynamic resolution to ensure the tokenizer always matches the model being served.
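
The expected behavior above could be sketched roughly as follows. This is not the actual Atropos implementation: `ModelSpec`, `resolve_model_spec`, and the `STUDENT_TOKENIZER` / `TEACHER_TOKENIZER` variables are hypothetical names introduced for illustration. Only `STUDENT_MODEL`, `TEACHER_MODEL`, and the DeepHermes default come from this issue.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Current hardcoded value, kept as a fallback so existing setups keep working.
DEFAULT_MODEL = "NousResearch/DeepHermes-3-Llama-3-3B-Preview"


@dataclass(frozen=True)
class ModelSpec:
    """Hypothetical container pairing a model with its tokenizer."""
    model_name: str
    tokenizer_name: str


def resolve_model_spec(
    role: str,
    env: Optional[Mapping[str, str]] = None,
    default_model: str = DEFAULT_MODEL,
) -> ModelSpec:
    """Resolve model and tokenizer names for a role ("student" or "teacher").

    Reads <ROLE>_MODEL and, optionally, <ROLE>_TOKENIZER (an assumed
    variable). The tokenizer falls back to the model name so the two
    cannot drift apart by accident.
    """
    env = os.environ if env is None else env
    prefix = role.upper()
    model_name = env.get(f"{prefix}_MODEL", "").strip() or default_model
    tokenizer_name = env.get(f"{prefix}_TOKENIZER", "").strip() or model_name
    return ModelSpec(model_name=model_name, tokenizer_name=tokenizer_name)


if __name__ == "__main__":
    student = resolve_model_spec("student")
    teacher = resolve_model_spec("teacher")
    print(student)
    print(teacher)
```

`config_init` and `teacher_config_init` could then read their model and tokenizer names from the resolved spec instead of string literals. Defaulting the tokenizer to the model name is the design choice that keeps the two in sync unless a user overrides the tokenizer explicitly.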

## Setup Details

- **OS:** Linux
- **Python Version:** 3.10+
- **Atropos Version:** commit c20c852

## Additional Context & Logs

This refactoring makes the Atropos environment suite model-agnostic, so different model families can be tried without patching the source code.


Hardening: Decouple Environment Configuration from Hardcoded Tokenizers #461
