7 changes: 7 additions & 0 deletions .claude/settings.local.json
@@ -0,0 +1,7 @@
{
"permissions": {
"allow": [
"Bash(pre-commit run:*)"
]
}
}
45 changes: 45 additions & 0 deletions .cursor/rules/rust.mdc
@@ -32,8 +32,53 @@ The project is split in 3 separate crates:
2. `common`: Provides shared utilities and data structures for the model. Any constant definitions should be placed here. As much as possible, any shared logic should also be placed here.
3. `server`: Implements the server-side logic and API endpoints for ModelExpress in a standalone server.

## Adding CLI Arguments

Client CLI arguments are defined in a shared struct to avoid duplication:

1. **Add to `ClientArgs`** in `modelexpress_common/src/client_config.rs`:
- This is the single source of truth for shared arguments
- Use `#[arg(long, env = "MODEL_EXPRESS_...")]` for environment variable support
- Do NOT use `-v` short flag (reserved for CLI's verbose)
Copilot AI Dec 18, 2025

The guidance to "Do NOT use -v short flag (reserved for CLI's verbose)" is contradicted by the actual code. ClientArgs in modelexpress_common/src/client_config.rs line 33 uses short = 'v' for log_level. This creates a conflict with the CLI's use of -v for verbose mode. Either the documentation should acknowledge this existing conflict, or the code should be updated to remove the -v short flag from one of these usages.

Suggested change
- Do NOT use `-v` short flag (reserved for CLI's verbose)
- Avoid introducing new uses of the `-v` short flag; it is reserved for the top-level CLI `--verbose` option (there is a legacy use in `ClientArgs` for `log_level` that will be cleaned up separately).

2. **Update `ClientConfig::load()`** in the same file:
- Add override logic in the "APPLY CLI ARGUMENT OVERRIDES" section

3. **Do NOT duplicate in `Cli`** (`modelexpress_client/src/bin/modules/args.rs`):
- `Cli` embeds `ClientArgs` via `#[command(flatten)]`
- Only add CLI-specific arguments there (e.g., `--format`, `--verbose`)
Comment on lines +48 to +49
Copilot AI Dec 18, 2025

The statement "Cli embeds ClientArgs via #[command(flatten)]" is inaccurate. The Cli struct in modelexpress_client/src/bin/modules/args.rs does not use #[command(flatten)] to embed ClientArgs. Instead, the CLI defines its own arguments and manually constructs a ClientArgs struct from them (see modelexpress_client/src/bin/cli.rs lines 26-38). This documentation should be corrected to accurately reflect the current implementation.

Suggested change
- `Cli` embeds `ClientArgs` via `#[command(flatten)]`
- Only add CLI-specific arguments there (e.g., `--format`, `--verbose`)
- `Cli` defines its own CLI-facing arguments; `modelexpress_client/src/bin/cli.rs` is responsible for constructing a `ClientArgs` instance from them
- Keep shared configuration fields in `ClientArgs` and only add CLI-specific options to `Cli` (e.g., `--format`, `--verbose`)

4. **Add tests** in the `tests` module of `client_config.rs`
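
Step 2 can be sketched as follows. This is a minimal, standard-library-only illustration of the override pattern, not the actual contents of `client_config.rs`: the field names (`endpoint`, `timeout_secs`) and the `apply_overrides` helper are hypothetical, and the real `ClientArgs` is a `clap` struct rather than a plain one.

```rust
/// Hypothetical stand-in for the shared CLI arguments; `None` means "not given".
#[derive(Debug, Default)]
struct ClientArgs {
    endpoint: Option<String>,
    timeout_secs: Option<u64>,
}

/// Hypothetical stand-in for the resolved client configuration.
#[derive(Debug, PartialEq)]
struct ClientConfig {
    endpoint: String,
    timeout_secs: u64,
}

impl ClientConfig {
    /// Applies CLI overrides on top of an already loaded configuration.
    /// Only arguments that were actually supplied replace existing values.
    fn apply_overrides(mut self, args: &ClientArgs) -> Self {
        if let Some(endpoint) = &args.endpoint {
            self.endpoint = endpoint.clone();
        }
        if let Some(timeout_secs) = args.timeout_secs {
            self.timeout_secs = timeout_secs;
        }
        self
    }
}
```

Keeping every argument as an `Option` lets the override step distinguish "not supplied" from "supplied with the default value", so lower-priority sources are not clobbered by accident.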

# Code quality

- Do **NOT** use emojis. These are unprofessional.
- Do not create markdown files to document code changes or decisions.
- Do not over-comment code. Removing code is fine without adding new comments to explain why.

# Pre-commit Hooks

This repository uses pre-commit hooks to enforce code quality. **Run pre-commit after every code change**, even before creating commits:

```bash
# Run all pre-commit hooks on staged files
pre-commit run

# Run on all files (recommended after significant changes)
pre-commit run --all-files
```

The hooks include:
- `cargo fmt` - Code formatting
- `cargo clippy` - Linting with auto-fix
- `cargo check` - Compilation check
- File hygiene checks (trailing whitespace, end-of-file, YAML/TOML/JSON validation, etc.)

Running pre-commit hooks early and often catches issues before they accumulate. Do not wait until commit time to discover problems.

# AI Agent Instructions

When introducing new patterns, conventions, or architectural decisions that affect how code should be written, update ALL AI agent instruction files:
- `CLAUDE.md` (Claude Code)
- `.github/copilot-instructions.md` (GitHub Copilot)
- `.cursor/rules/rust.mdc` (Cursor)
6 changes: 6 additions & 0 deletions .devcontainer/Dockerfile
@@ -29,6 +29,8 @@ RUN apt-get update \
libprotobuf-dev \
libssl-dev \
locales \
pipx \
python3 \
lsb-release \
net-tools \
openssh-client \
@@ -63,6 +65,10 @@ USER ubuntu
# Install Rust
RUN curl https://sh.rustup.rs -sSf | sh -s -- -y

# Install pre-commit via pipx
RUN pipx install pre-commit
ENV PATH="/home/ubuntu/.local/bin:${PATH}"

# Make git happy with codespaces mounted repos
RUN git config --global --add safe.directory /workspaces/ModelExpress

45 changes: 45 additions & 0 deletions .github/copilot-instructions.md
@@ -30,8 +30,53 @@ The project is split in 3 separate crates:
2. `common`: Provides shared utilities and data structures for the model. Any constant definitions should be placed here. As much as possible, any shared logic should also be placed here.
3. `server`: Implements the server-side logic and API endpoints for ModelExpress in a standalone server.

## Adding CLI Arguments

Client CLI arguments are defined in a shared struct to avoid duplication:

1. **Add to `ClientArgs`** in `modelexpress_common/src/client_config.rs`:
- This is the single source of truth for shared arguments
- Use `#[arg(long, env = "MODEL_EXPRESS_...")]` for environment variable support
- Do NOT use `-v` short flag (reserved for CLI's verbose)
Copilot AI Dec 18, 2025

The guidance to "Do NOT use -v short flag (reserved for CLI's verbose)" is contradicted by the actual code. ClientArgs in modelexpress_common/src/client_config.rs line 33 uses short = 'v' for log_level. This creates a conflict with the CLI's use of -v for verbose mode. Either the documentation should acknowledge this existing conflict, or the code should be updated to remove the -v short flag from one of these usages.

Suggested change
- Do NOT use `-v` short flag (reserved for CLI's verbose)
- Do NOT introduce new `-v` short flags (reserved for CLI verbosity and currently used by `log_level`)

2. **Update `ClientConfig::load()`** in the same file:
- Add override logic in the "APPLY CLI ARGUMENT OVERRIDES" section

3. **Do NOT duplicate in `Cli`** (`modelexpress_client/src/bin/modules/args.rs`):
- `Cli` embeds `ClientArgs` via `#[command(flatten)]`
- Only add CLI-specific arguments there (e.g., `--format`, `--verbose`)

Comment on lines +45 to +48
Copilot AI Dec 18, 2025

The statement "Cli embeds ClientArgs via #[command(flatten)]" is inaccurate. The Cli struct in modelexpress_client/src/bin/modules/args.rs does not use #[command(flatten)] to embed ClientArgs. Instead, the CLI defines its own arguments and manually constructs a ClientArgs struct from them (see modelexpress_client/src/bin/cli.rs lines 26-38). This documentation should be corrected to accurately reflect the current implementation.

Suggested change
3. **Do NOT duplicate in `Cli`** (`modelexpress_client/src/bin/modules/args.rs`):
- `Cli` embeds `ClientArgs` via `#[command(flatten)]`
- Only add CLI-specific arguments there (e.g., `--format`, `--verbose`)
3. **Wire CLI arguments into `ClientArgs`**:
- Define user-facing flags in `Cli` (`modelexpress_client/src/bin/modules/args.rs`)
- In `modelexpress_client/src/bin/cli.rs` (see lines 26–38), construct a `ClientArgs` from the `Cli` fields
- Keep `ClientArgs` as the single source of truth for shared client configuration; only add CLI-specific arguments to `Cli` (e.g., `--format`, `--verbose`)

4. **Add tests** in the `tests` module of `client_config.rs`

# Code quality

- Do **NOT** use emojis. These are unprofessional.
- Do not create markdown files to document code changes or decisions.
- Do not over-comment code. Removing code is fine without adding new comments to explain why.

# Pre-commit Hooks

This repository uses pre-commit hooks to enforce code quality. **Run pre-commit after every code change**, even before creating commits:

```bash
# Run all pre-commit hooks on staged files
pre-commit run

# Run on all files (recommended after significant changes)
pre-commit run --all-files
```

The hooks include:
- `cargo fmt` - Code formatting
- `cargo clippy` - Linting with auto-fix
- `cargo check` - Compilation check
- File hygiene checks (trailing whitespace, end-of-file, YAML/TOML/JSON validation, etc.)

Running pre-commit hooks early and often catches issues before they accumulate. Do not wait until commit time to discover problems.

# AI Agent Instructions

When introducing new patterns, conventions, or architectural decisions that affect how code should be written, update ALL AI agent instruction files:
- `CLAUDE.md` (Claude Code)
- `.github/copilot-instructions.md` (GitHub Copilot)
- `.cursor/rules/rust.mdc` (Cursor)
123 changes: 123 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,123 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Build and Development Commands

```bash
# Build the project
cargo build

# Build in release mode
cargo build --release

# Run the server
cargo run --bin modelexpress-server

# Run tests
cargo test

# Run integration tests (starts server, runs test client)
./run_integration_tests.sh

# Run a specific test client
cargo run --bin test_client -- --test-model "google-t5/t5-small"

# Run clippy (required before submitting code)
cargo clippy

# Generate sample configuration file
cargo run --bin config_gen -- --output model-express.yaml
```

## Architecture

ModelExpress is a Rust-based model cache management service that accelerates inference by caching HuggingFace models. It can be deployed standalone or as a sidecar alongside inference solutions like NVIDIA Dynamo.

### Workspace Structure

The project is a Rust workspace with three crates:

- **`modelexpress_server`** (`modelexpress-server`): gRPC server providing model services
- `services.rs`: Implements `HealthService`, `ApiService`, and `ModelService` gRPC services
- `database.rs`: SQLite-based model status persistence via `ModelDatabase`
- `cache.rs`: Cache eviction and management
- Uses global `MODEL_TRACKER` (`LazyLock<ModelDownloadTracker>`) for tracking download state

- **`modelexpress_client`** (`modelexpress-client`): Client library and CLI tool
- `lib.rs`: Main `Client` struct with gRPC clients for health, API, and model services
- `bin/cli.rs`: HuggingFace CLI replacement for model downloads
- Supports automatic fallback to direct download when server unavailable

- **`modelexpress_common`** (`modelexpress-common`): Shared code and protobuf definitions
- `grpc/` module contains generated proto code (health, api, model)
- `providers/huggingface.rs`: HuggingFace download implementation
- `download.rs`: Provider-agnostic download orchestration
- `cache.rs`, `config.rs`, `client_config.rs`: Configuration types
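
The global-tracker pattern mentioned for the server crate can be illustrated with a small sketch. It assumes Rust 1.80 or later for `std::sync::LazyLock`; the `DownloadTracker` type, its `record` method, and the `TRACKER` name are hypothetical and are not taken from the real `ModelDownloadTracker`.

```rust
use std::collections::HashMap;
use std::sync::{LazyLock, Mutex};

/// Hypothetical tracker type; the real `ModelDownloadTracker` may differ.
#[derive(Default)]
struct DownloadTracker {
    bytes_by_model: Mutex<HashMap<String, u64>>,
}

impl DownloadTracker {
    /// Adds `bytes` to the running total for `model` and returns the new total.
    fn record(&self, model: &str, bytes: u64) -> u64 {
        // Recover the map even if another thread panicked while holding the lock.
        let mut map = self
            .bytes_by_model
            .lock()
            .unwrap_or_else(|poisoned| poisoned.into_inner());
        let total = map.entry(model.to_string()).or_insert(0);
        *total += bytes;
        *total
    }
}

/// Initialized on first access, then shared by every caller in the process.
static TRACKER: LazyLock<DownloadTracker> = LazyLock::new(|| DownloadTracker::default());
```

Because the static is initialized lazily and guarded by a `Mutex`, callers on any thread can use `TRACKER.record(...)` without passing a handle around.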

### gRPC Services

Protocol definitions are in `modelexpress_common/proto/`:
- `health.proto`: Health check endpoint
- `api.proto`: Generic request/response API
- `model.proto`: Model download with streaming status updates

### Key Patterns

- Download status tracked in SQLite database with compare-and-swap for concurrent request handling
- Streaming gRPC responses for download progress updates via `ModelStatusUpdate`
- `CacheConfig::discover()` finds cache configuration from environment or config files
- Configuration layering: CLI args > environment variables > config files > defaults
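
Two of the patterns above, compare-and-swap on download status and configuration layering, can be sketched in a few lines. This is an in-memory illustration only: the `Status` variants, `StatusTable`, and `resolve` are hypothetical names, and a `HashMap` stands in for the SQLite table (in SQL the same idea is commonly expressed as a conditional `UPDATE ... WHERE status = ?`).

```rust
use std::collections::HashMap;

/// Hypothetical download states; the real status type may differ.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Status {
    Downloading,
    Downloaded,
}

/// In-memory stand-in for the SQLite status table.
#[derive(Default)]
struct StatusTable {
    rows: HashMap<String, Status>,
}

impl StatusTable {
    /// Compare-and-swap: writes `new` only if the stored status equals
    /// `expected` (`None` means "no row yet"). Returns whether the swap happened.
    fn compare_and_swap(&mut self, model: &str, expected: Option<Status>, new: Status) -> bool {
        if self.rows.get(model).copied() != expected {
            return false;
        }
        self.rows.insert(model.to_string(), new);
        true
    }
}

/// Layering: the first source that provides a value wins, in priority order
/// CLI args > environment variables > config files > defaults.
fn resolve<T>(cli: Option<T>, env: Option<T>, file: Option<T>, default: T) -> T {
    cli.or(env).or(file).unwrap_or(default)
}
```

With this shape, two concurrent requests for the same model both attempt the `None -> Downloading` swap, and only the one that succeeds starts the download; the other can wait on status updates instead.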

### Adding CLI Arguments

Client CLI arguments and environment variables are defined in a shared struct to avoid duplication:

1. **`ClientArgs`** in `modelexpress_common/src/client_config.rs`:
- Single source of truth for shared client arguments (endpoint, timeout, cache settings, etc.)
- Add new arguments here with `#[arg(long, env = "MODEL_EXPRESS_...")]`
- Avoid `-v` short flag (reserved for CLI's verbose)
Copilot AI Dec 18, 2025

The guidance to "Avoid -v short flag (reserved for CLI's verbose)" is contradicted by the actual code. ClientArgs in modelexpress_common/src/client_config.rs line 33 uses short = 'v' for log_level. This creates a conflict with the CLI's use of -v for verbose mode. Either the documentation should acknowledge this existing conflict, or the code should be updated to remove the -v short flag from one of these usages.

Suggested change
- Avoid `-v` short flag (reserved for CLI's verbose)
- Note: `ClientArgs` currently uses `-v` as the short flag for `log_level`, and the CLI also uses `-v` for `--verbose`; avoid introducing any additional uses of `-v` and prefer long-only flags for new options until this duplication is refactored.

2. **`ClientConfig::load()`** in the same file:
- Apply the new argument to the config struct in the "APPLY CLI ARGUMENT OVERRIDES" section

3. **`Cli`** in `modelexpress_client/src/bin/modules/args.rs`:
- Embeds `ClientArgs` via `#[command(flatten)]`
- Only add CLI-specific arguments here (e.g., `--format`, `--verbose`)
Comment on lines +85 to +86
Copilot AI Dec 18, 2025

The statement "Embeds ClientArgs via #[command(flatten)]" is inaccurate. The Cli struct in modelexpress_client/src/bin/modules/args.rs does not use #[command(flatten)] to embed ClientArgs. Instead, the CLI defines its own arguments and manually constructs a ClientArgs struct from them (see modelexpress_client/src/bin/cli.rs lines 26-38). This documentation should be corrected to accurately reflect the current implementation.

Suggested change
- Embeds `ClientArgs` via `#[command(flatten)]`
- Only add CLI-specific arguments here (e.g., `--format`, `--verbose`)
- Defines CLI-specific arguments (e.g., `--format`, `--verbose`, model identifiers)
- Values from `Cli` are used in `modelexpress_client/src/bin/cli.rs` to manually construct a `ClientArgs` instance (no `#[command(flatten)]` embedding)

4. **Tests**: Add tests in `client_config.rs` for argument parsing and config loading
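
The review comments on step 3 report that `cli.rs` builds a `ClientArgs` by hand from the `Cli` fields rather than embedding it with `#[command(flatten)]`. A minimal sketch of that manual wiring might look like the following; the field names and the `From` implementation are hypothetical, and the real structs derive their parsers from `clap`, which is omitted here to keep the example standard-library only.

```rust
/// Hypothetical CLI-facing struct; field names are illustrative only.
struct Cli {
    /// Shared setting that must be forwarded to `ClientArgs`.
    endpoint: Option<String>,
    /// CLI-specific option, not part of the shared arguments.
    format: String,
    /// CLI-specific verbosity counter.
    verbose: u8,
}

/// Hypothetical shared arguments struct.
#[derive(Debug, PartialEq)]
struct ClientArgs {
    endpoint: Option<String>,
}

impl From<&Cli> for ClientArgs {
    /// Copies only the shared fields; CLI-specific ones stay on `Cli`.
    fn from(cli: &Cli) -> Self {
        ClientArgs {
            endpoint: cli.endpoint.clone(),
        }
    }
}
```

With manual wiring, every new shared argument has to be added in three places (`ClientArgs`, `Cli`, and the conversion), which is the duplication that flattening would avoid.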

## Code Standards

- **No `unwrap()`**: Strictly forbidden except in benchmarks. Use `match`, `?`, or `expect()` (tests only)
- **All dependencies in root `Cargo.toml`**: Sub-crates use workspace dependencies exclusively
- **Clippy enforced**: `cargo clippy` must pass with no warnings (multiple lints set to deny)
- **No emojis in code**
- **No markdown documentation files for code changes**
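
As an illustration of the no-`unwrap()` rule, the sketch below propagates a missing value with `?` and handles a parse failure with `match`. The `ConfigError` type and `parse_port` function are hypothetical examples, not code from this repository.

```rust
/// Hypothetical error type for this example.
#[derive(Debug, PartialEq)]
enum ConfigError {
    MissingPort,
    InvalidPort(String),
}

/// Parses an optional port string without calling `unwrap()`.
fn parse_port(raw: Option<&str>) -> Result<u16, ConfigError> {
    // `?` returns early with `MissingPort` when no value was supplied.
    let text = raw.ok_or(ConfigError::MissingPort)?;
    // `match` turns the parse failure into a descriptive domain error.
    match text.trim().parse::<u16>() {
        Ok(port) => Ok(port),
        Err(err) => Err(ConfigError::InvalidPort(format!("{text}: {err}"))),
    }
}
```

The caller decides how to report each failure, and no input can cause a panic.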

## Pre-commit Hooks

This repository uses pre-commit hooks to enforce code quality. **Run pre-commit after every code change**, even before creating commits:

```bash
# Run all pre-commit hooks on staged files
pre-commit run

# Run on all files (recommended after significant changes)
pre-commit run --all-files
```

The hooks include:
- `cargo fmt` - Code formatting
- `cargo clippy` - Linting with auto-fix
- `cargo check` - Compilation check
- File hygiene checks (trailing whitespace, end-of-file, YAML/TOML/JSON validation, etc.)

Running pre-commit hooks early and often catches issues before they accumulate. Do not wait until commit time to discover problems.

## AI Agent Instructions

When introducing new patterns, conventions, or architectural decisions that affect how code should be written, update ALL AI agent instruction files:
- `CLAUDE.md` (Claude Code)
- `.github/copilot-instructions.md` (GitHub Copilot)
- `.cursor/rules/rust.mdc` (Cursor)