Summary
The candle-transformers/src/models/ directory has grown to more than 70 flat module entries, mixing full-precision and quantized implementations of the same model families. This makes the codebase harder to navigate and maintain.
Proposal: Group related models into family subdirectories, similar to the pattern demonstrated in SmolLM3 (#3180).
Current State
The models/mod.rs currently has a flat structure:
pub mod llama;
pub mod llama2_c;
pub mod llama2_c_weights;
pub mod quantized_llama;
pub mod quantized_llama2_c;
pub mod mistral;
pub mod quantized_mistral;
pub mod mixtral;
pub mod phi;
pub mod phi3;
pub mod quantized_phi;
pub mod quantized_phi3;
pub mod qwen2;
pub mod qwen2_moe;
pub mod qwen3;
pub mod qwen3_moe;
pub mod qwen3_vl;
pub mod quantized_qwen2;
pub mod quantized_qwen3;
// ... 50+ more entries

Problems:
- 70+ flat modules in a single directory
- Full and quantized versions scattered
- No clear model family grouping
- Harder to navigate and discover related implementations
- Difficult to see which models have quantized versions
Proposed Structure
Group models by family in subdirectories, similar to SmolLM3 (#3180):
models/
├── llama/
│ ├── mod.rs # Re-exports for backward compatibility
│ ├── llama.rs # Full precision
│ ├── llama2_c.rs # Llama2.c variant
│ ├── quantized_llama.rs
│ └── quantized_llama2_c.rs
├── mistral/
│ ├── mod.rs
│ ├── mistral.rs
│ ├── mixtral.rs
│ └── quantized_mistral.rs
├── phi/
│ ├── mod.rs
│ ├── phi.rs
│ ├── phi3.rs
│ ├── quantized_phi.rs
│ └── quantized_phi3.rs
├── qwen/
│ ├── mod.rs
│ ├── qwen2.rs
│ ├── qwen2_moe.rs
│ ├── qwen3.rs
│ ├── qwen3_moe.rs
│ ├── qwen3_vl.rs
│ ├── quantized_qwen2.rs
│ └── quantized_qwen3.rs
├── smol/ # Already implemented in #3180
│ ├── mod.rs
│ ├── smollm3.rs
│ └── quantized_smollm3.rs
└── ... other families
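As a compilable sketch of one of the family trees above, the qwen/ grouping can be mocked up with inline modules (in the real tree each `pub mod` would point at its own file; the empty bodies and the `qwen_family` helper are illustrative only):

```rust
// Toy stand-in for the proposed models/qwen/ family module. In the real
// layout each `pub mod` would resolve to qwen2.rs, qwen2_moe.rs, etc.;
// empty bodies are used here so the sketch compiles on its own.
mod qwen {
    pub mod qwen2 {}
    pub mod qwen2_moe {}
    pub mod qwen3 {}
    pub mod qwen3_moe {}
    pub mod qwen3_vl {}
    pub mod quantized_qwen2 {}
    pub mod quantized_qwen3 {}
}

/// Submodule names of the hypothetical family, listed for inspection.
fn qwen_family() -> Vec<&'static str> {
    vec![
        "qwen2", "qwen2_moe", "qwen3", "qwen3_moe", "qwen3_vl",
        "quantized_qwen2", "quantized_qwen3",
    ]
}

fn main() {
    println!("qwen family has {} modules", qwen_family().len());
}
```

Grouping this way makes the quantized/full pairing visible at a glance: every `quantized_*` file sits next to the module it quantizes.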
Benefits
Better Organization
- Related implementations grouped together
- Easy to see all variants of a model family
- Clear separation between families
- Easier to navigate codebase
Better Discoverability
- Users can find all Llama variants in one place
- Clear which models have quantized versions
- Easier to compare implementations within family
- Better for documentation generation
Backward Compatibility
- Re-export from module for existing imports
- No breaking changes for users
- Can migrate incrementally
Backward Compatibility Strategy
The reorganization maintains backward compatibility through re-exports. Using the Llama family as an example:
New Directory Structure
models/llama/
├── mod.rs
├── llama.rs
├── quantized_llama.rs
└── llama2_c.rs
Re-export Pattern
In models/llama/mod.rs:
// Declare submodules
pub mod llama;
pub mod quantized_llama;
pub mod llama2_c;
// Optional: re-export everything for convenience. Note that glob
// re-exports collide when two submodules export the same item name
// (e.g. a `Config` in both llama and llama2_c), in which case the
// ambiguous name errors at the use site; selective re-exports are safer.
pub use llama::*;
pub use quantized_llama::*;
pub use llama2_c::*;

In models/mod.rs:
// New: expose the family module
pub mod llama;
// For backward compatibility, re-export the remaining submodules at the
// top level. Note that `pub use llama::llama;` would clash with
// `pub mod llama;` above; instead, the legacy `models::llama` path keeps
// working because the family mod.rs surfaces its `llama` submodule's
// items at the family level (`pub use llama::*;`).
pub use llama::quantized_llama;
pub use llama::llama2_c;

Three Import Patterns (All Work!)
Pattern 1: Legacy (backward compatible)
use candle_transformers::models::llama; // Old way still works!
use candle_transformers::models::quantized_llama; // Old way still works!

Pattern 2: New nested (explicit)
use candle_transformers::models::llama::llama; // New explicit way
use candle_transformers::models::llama::quantized_llama;

Pattern 3: Import whole family
use candle_transformers::models::llama::*; // Import entire family

SmolLM3 Example
SmolLM3 (#3180) demonstrates this pattern:
Structure:
models/smol/
├── mod.rs
├── smollm3.rs
└── quantized_smollm3.rs
Current models/smol/mod.rs:
pub mod smollm3;
pub mod quantized_smollm3;

In models/mod.rs:
pub mod smol;

Migration Decision
Suggested Model Families
Based on the current modules, these natural groupings exist:
Core LLM Families:
- llama/ - llama, llama2_c, quantized variants
- mistral/ - mistral, mixtral, quantized_mistral
- phi/ - phi, phi3, quantized variants
- qwen/ - qwen2, qwen3, MoE variants, VL, quantized versions
- gemma/ - quantized_gemma3, quantized_recurrent_gemma, paligemma
- mpt/ - mpt, quantized_mpt
- stablelm/ - quantized_stable_lm (if more variants added)
- t5/ - t5, quantized_t5
- olmo/ - olmo, olmo2
Vision/Multimodal:
- llava/ - llava variants
- blip/ - blip, quantized_blip, quantized_blip_text
- clip/ - openclip, mobileclip
- moondream/ - moondream, quantized_moondream
- pixtral/ - pixtral variants
Specialized Architectures:
- mamba/ - mamba variants
- rwkv/ - quantized_rwkv_v5, quantized_rwkv_v6
- mimi/ - mimi variants
Audio/Speech:
- parler_tts/ - parler_tts variants
- metavoice/ - metavoice, quantized_metavoice
Keep Standalone (for now):
- Single-model families or unique architectures that don't fit groups
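The backward-compatibility strategy can be checked end to end in a self-contained toy crate. The module tree below mirrors the proposed llama layout with inline modules; `Config` and `ModelWeights` are illustrative stand-ins for whatever the real files export, and `self::` paths are used so the sketch compiles on any edition:

```rust
// Toy reproduction of the re-export strategy: a `llama` family module
// with nested submodules, plus top-level re-exports for legacy paths.
mod models {
    pub mod llama {
        // In the real tree these would live in llama.rs / quantized_llama.rs.
        pub mod llama {
            #[derive(Debug, Clone, Copy, PartialEq)]
            pub struct Config { pub hidden_size: usize }
        }
        pub mod quantized_llama {
            pub struct ModelWeights { pub n_layers: usize }
        }
        // Surface the full-precision items at the family level so the
        // legacy `models::llama::Config` path keeps resolving.
        pub use self::llama::*;
    }
    // Legacy flat path: `models::quantized_llama` still works.
    pub use self::llama::quantized_llama;
}

fn main() {
    // Pattern 1: legacy paths.
    let legacy = models::llama::Config { hidden_size: 4096 };
    let _w = models::quantized_llama::ModelWeights { n_layers: 32 };
    // Pattern 2: the new nested path names the same type.
    let nested = models::llama::llama::Config { hidden_size: 4096 };
    assert_eq!(legacy, nested);
    println!("legacy and nested paths agree");
}
```

Because the legacy and nested paths resolve to the same items, existing downstream code compiles unchanged while new code can adopt the family-scoped imports incrementally.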
References
- SmolLM3 PR: Add SmolLM3: Full and Quantized Implementation #3180 (demonstrates pattern)