[Feature]: Text Capability Recovery #11

@sanggusti

Description

Problem Statement

Adding visual capabilities to LLMs consistently degrades performance on base text tasks such as translation and mathematical reasoning. This is especially critical for Tiny Aya, whose smaller parameter count leaves less redundancy to absorb the interference.

Proposed Solution

Build a merging script that linearly interpolates the multimodal fine-tuned weights with the original text-only Tiny Aya weights. The script must accept a tunable merge ratio parameter ($\alpha \in [0.3, 0.7]$) so we can sweep for the optimal configuration.
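A minimal sketch of such a merging script, assuming both checkpoints expose identically named parameters (function names, the simplified state-dict representation, and the default sweep grid are illustrative, not part of the actual codebase):

```python
# Linear weight interpolation between a text-only and a multimodal checkpoint:
#   theta_merged = alpha * theta_text + (1 - alpha) * theta_multimodal
# State dicts are modeled as {param_name: flat list of floats} for clarity;
# a real implementation would operate on framework tensors instead.
from typing import Dict, Iterable, Iterator, List, Tuple

StateDict = Dict[str, List[float]]


def merge_state_dicts(text_sd: StateDict, mm_sd: StateDict, alpha: float) -> StateDict:
    """Interpolate matching parameters from the two checkpoints."""
    if text_sd.keys() != mm_sd.keys():
        raise ValueError("checkpoints must share the same parameter names")
    return {
        name: [alpha * t + (1.0 - alpha) * m
               for t, m in zip(text_sd[name], mm_sd[name])]
        for name in text_sd
    }


def sweep(text_sd: StateDict, mm_sd: StateDict,
          alphas: Iterable[float] = (0.3, 0.4, 0.5, 0.6, 0.7)
          ) -> Iterator[Tuple[float, StateDict]]:
    """Yield (alpha, merged weights) pairs for benchmark evaluation."""
    for alpha in alphas:
        yield alpha, merge_state_dicts(text_sd, mm_sd, alpha)
```

Each merged checkpoint from the sweep would then be evaluated on both the text benchmarks and the visual-grounding suite to pick the best trade-off point.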

Use Case

This Phase 2/3 task is required to finalize the model weights. The goal is to recover text performance on benchmarks such as m-ArenaHard and GlobalMGSM while preserving the newly acquired visual grounding.

Alternatives Considered

  • RMAdapter: Training with a dual-branch adapter that uses separate discrimination and reconstruction paths to enforce consistency and prevent forgetting natively.

Additional Context

Validating cross-modal merging at the 3.35B scale is a core novelty gap for the project, as previous literature has only demonstrated its efficacy at the 8B and 32B scales.

Metadata

Labels

enhancement (New feature or request)
