-
Notifications
You must be signed in to change notification settings - Fork 4
chore: Updating agents instructions + Adding Claude Code ones. #124
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,7 @@ | ||
| { | ||
| "permissions": { | ||
| "allow": [ | ||
| "Bash(pre-commit run:*)" | ||
| ] | ||
| } | ||
| } |
| Original file line number | Diff line number | Diff line change | ||||||||
|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -32,8 +32,53 @@ The project is split in 3 separate crates: | |||||||||
| 2. `common`: Provides shared utilities and data structures for the model. Any constant definitions should be placed here. As much as possible, any shared logic should also be placed here. | ||||||||||
| 3. `server`: Implements the server-side logic and API endpoints for ModelExpress in a stand alone server. | ||||||||||
|
|
||||||||||
| ## Adding CLI Arguments | ||||||||||
|
|
||||||||||
| Client CLI arguments are defined in a shared struct to avoid duplication: | ||||||||||
|
|
||||||||||
| 1. **Add to `ClientArgs`** in `modelexpress_common/src/client_config.rs`: | ||||||||||
| - This is the single source of truth for shared arguments | ||||||||||
| - Use `#[arg(long, env = "MODEL_EXPRESS_...")]` for environment variable support | ||||||||||
| - Do NOT use `-v` short flag (reserved for CLI's verbose) | ||||||||||
|
|
||||||||||
| 2. **Update `ClientConfig::load()`** in the same file: | ||||||||||
| - Add override logic in the "APPLY CLI ARGUMENT OVERRIDES" section | ||||||||||
|
|
||||||||||
| 3. **Do NOT duplicate in `Cli`** (`modelexpress_client/src/bin/modules/args.rs`): | ||||||||||
| - `Cli` embeds `ClientArgs` via `#[command(flatten)]` | ||||||||||
| - Only add CLI-specific arguments there (e.g., `--format`, `--verbose`) | ||||||||||
|
Comment on lines
+48
to
+49
|
||||||||||
| - `Cli` embeds `ClientArgs` via `#[command(flatten)]` | |
| - Only add CLI-specific arguments there (e.g., `--format`, `--verbose`) | |
| - `Cli` defines its own CLI-facing arguments; `modelexpress_client/src/bin/cli.rs` is responsible for constructing a `ClientArgs` instance from them | |
| - Keep shared configuration fields in `ClientArgs` and only add CLI-specific options to `Cli` (e.g., `--format`, `--verbose`) |
| Original file line number | Diff line number | Diff line change | ||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -30,8 +30,53 @@ The project is split in 3 separate crates: | |||||||||||||||
| 2. `common`: Provides shared utilities and data structures for the model. Any constant definitions should be placed here. As much as possible, any shared logic should also be placed here. | ||||||||||||||||
| 3. `server`: Implements the server-side logic and API endpoints for ModelExpress in a stand alone server. | ||||||||||||||||
|
|
||||||||||||||||
| ## Adding CLI Arguments | ||||||||||||||||
|
|
||||||||||||||||
| Client CLI arguments are defined in a shared struct to avoid duplication: | ||||||||||||||||
|
|
||||||||||||||||
| 1. **Add to `ClientArgs`** in `modelexpress_common/src/client_config.rs`: | ||||||||||||||||
| - This is the single source of truth for shared arguments | ||||||||||||||||
| - Use `#[arg(long, env = "MODEL_EXPRESS_...")]` for environment variable support | ||||||||||||||||
| - Do NOT use `-v` short flag (reserved for CLI's verbose) | ||||||||||||||||
|
||||||||||||||||
| - Do NOT use `-v` short flag (reserved for CLI's verbose) | |
| - Do NOT introduce new `-v` short flags (reserved for CLI verbosity and currently used by `log_level`) |
Copilot
AI
Dec 18, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The statement "Cli embeds ClientArgs via #[command(flatten)]" is inaccurate. The Cli struct in modelexpress_client/src/bin/modules/args.rs does not use #[command(flatten)] to embed ClientArgs. Instead, the CLI defines its own arguments and manually constructs a ClientArgs struct from them (see modelexpress_client/src/bin/cli.rs lines 26-38). This documentation should be corrected to accurately reflect the current implementation.
| 3. **Do NOT duplicate in `Cli`** (`modelexpress_client/src/bin/modules/args.rs`): | |
| - `Cli` embeds `ClientArgs` via `#[command(flatten)]` | |
| - Only add CLI-specific arguments there (e.g., `--format`, `--verbose`) | |
| 3. **Wire CLI arguments into `ClientArgs`**: | |
| - Define user-facing flags in `Cli` (`modelexpress_client/src/bin/modules/args.rs`) | |
| - In `modelexpress_client/src/bin/cli.rs` (see lines 26–38), construct a `ClientArgs` from the `Cli` fields | |
| - Keep `ClientArgs` as the single source of truth for shared client configuration; only add CLI-specific arguments to `Cli` (e.g., `--format`, `--verbose`) |
| Original file line number | Diff line number | Diff line change | ||||||||
|---|---|---|---|---|---|---|---|---|---|---|
| @@ -0,0 +1,123 @@ | ||||||||||
| # CLAUDE.md | ||||||||||
|
|
||||||||||
| This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. | ||||||||||
|
|
||||||||||
| ## Build and Development Commands | ||||||||||
|
|
||||||||||
| ```bash | ||||||||||
| # Build the project | ||||||||||
| cargo build | ||||||||||
|
|
||||||||||
| # Build in release mode | ||||||||||
| cargo build --release | ||||||||||
|
|
||||||||||
| # Run the server | ||||||||||
| cargo run --bin modelexpress-server | ||||||||||
|
|
||||||||||
| # Run tests | ||||||||||
| cargo test | ||||||||||
|
|
||||||||||
| # Run integration tests (starts server, runs test client) | ||||||||||
| ./run_integration_tests.sh | ||||||||||
|
|
||||||||||
| # Run a specific test client | ||||||||||
| cargo run --bin test_client -- --test-model "google-t5/t5-small" | ||||||||||
|
|
||||||||||
| # Run clippy (required before submitting code) | ||||||||||
| cargo clippy | ||||||||||
|
|
||||||||||
| # Generate sample configuration file | ||||||||||
| cargo run --bin config_gen -- --output model-express.yaml | ||||||||||
| ``` | ||||||||||
|
|
||||||||||
| ## Architecture | ||||||||||
|
|
||||||||||
| ModelExpress is a Rust-based model cache management service that accelerates inference by caching HuggingFace models. It can be deployed standalone or as a sidecar alongside inference solutions like NVIDIA Dynamo. | ||||||||||
|
|
||||||||||
| ### Workspace Structure | ||||||||||
|
|
||||||||||
| The project is a Rust workspace with three crates: | ||||||||||
|
|
||||||||||
| - **`modelexpress_server`** (`modelexpress-server`): gRPC server providing model services | ||||||||||
| - `services.rs`: Implements `HealthService`, `ApiService`, and `ModelService` gRPC services | ||||||||||
| - `database.rs`: SQLite-based model status persistence via `ModelDatabase` | ||||||||||
| - `cache.rs`: Cache eviction and management | ||||||||||
| - Uses global `MODEL_TRACKER` (`LazyLock<ModelDownloadTracker>`) for tracking download state | ||||||||||
|
|
||||||||||
| - **`modelexpress_client`** (`modelexpress-client`): Client library and CLI tool | ||||||||||
| - `lib.rs`: Main `Client` struct with gRPC clients for health, API, and model services | ||||||||||
| - `bin/cli.rs`: HuggingFace CLI replacement for model downloads | ||||||||||
| - Supports automatic fallback to direct download when server unavailable | ||||||||||
|
|
||||||||||
| - **`modelexpress_common`** (`modelexpress-common`): Shared code and protobuf definitions | ||||||||||
| - `grpc/` module contains generated proto code (health, api, model) | ||||||||||
| - `providers/huggingface.rs`: HuggingFace download implementation | ||||||||||
| - `download.rs`: Provider-agnostic download orchestration | ||||||||||
| - `cache.rs`, `config.rs`, `client_config.rs`: Configuration types | ||||||||||
|
|
||||||||||
| ### gRPC Services | ||||||||||
|
|
||||||||||
| Protocol definitions are in `modelexpress_common/proto/`: | ||||||||||
| - `health.proto`: Health check endpoint | ||||||||||
| - `api.proto`: Generic request/response API | ||||||||||
| - `model.proto`: Model download with streaming status updates | ||||||||||
|
|
||||||||||
| ### Key Patterns | ||||||||||
|
|
||||||||||
| - Download status tracked in SQLite database with compare-and-swap for concurrent request handling | ||||||||||
| - Streaming gRPC responses for download progress updates via `ModelStatusUpdate` | ||||||||||
| - `CacheConfig::discover()` finds cache configuration from environment or config files | ||||||||||
| - Configuration layering: CLI args > environment variables > config files > defaults | ||||||||||
|
|
||||||||||
| ### Adding CLI Arguments | ||||||||||
|
|
||||||||||
| Client CLI arguments and environment variables are defined in a shared struct to avoid duplication: | ||||||||||
|
|
||||||||||
| 1. **`ClientArgs`** in `modelexpress_common/src/client_config.rs`: | ||||||||||
| - Single source of truth for shared client arguments (endpoint, timeout, cache settings, etc.) | ||||||||||
| - Add new arguments here with `#[arg(long, env = "MODEL_EXPRESS_...")]` | ||||||||||
| - Avoid `-v` short flag (reserved for CLI's verbose) | ||||||||||
|
||||||||||
| - Avoid `-v` short flag (reserved for CLI's verbose) | |
| - Note: `ClientArgs` currently uses `-v` as the short flag for `log_level`, and the CLI also uses `-v` for `--verbose`; avoid introducing any additional uses of `-v` and prefer long-only flags for new options until this duplication is refactored. |
Copilot
AI
Dec 18, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The statement "Embeds ClientArgs via #[command(flatten)]" is inaccurate. The Cli struct in modelexpress_client/src/bin/modules/args.rs does not use #[command(flatten)] to embed ClientArgs. Instead, the CLI defines its own arguments and manually constructs a ClientArgs struct from them (see modelexpress_client/src/bin/cli.rs lines 26-38). This documentation should be corrected to accurately reflect the current implementation.
| - Embeds `ClientArgs` via `#[command(flatten)]` | |
| - Only add CLI-specific arguments here (e.g., `--format`, `--verbose`) | |
| - Defines CLI-specific arguments (e.g., `--format`, `--verbose`, model identifiers) | |
| - Values from `Cli` are used in `modelexpress_client/src/bin/cli.rs` to manually construct a `ClientArgs` instance (no `#[command(flatten)]` embedding) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The guidance to "Do NOT use -v short flag (reserved for CLI's verbose)" is contradicted by the actual code. ClientArgs in modelexpress_common/src/client_config.rs line 33 uses short = 'v' for log_level. This creates a conflict with the CLI's use of -v for verbose mode. Either the documentation should acknowledge this existing conflict, or the code should be updated to remove the -v short flag from one of these usages.