Add --name and --set CLI Configuration Overrides#116
Add --name and --set CLI Configuration Overrides#116juliangieseke wants to merge 2 commits intoalphafrom
Conversation
599c731 to
9eb6073
Compare
Codecov Report❌ Patch coverage is 📢 Thoughts on this report? Let us know! |
8916705 to
f4e714d
Compare
There was a problem hiding this comment.
Pull Request Overview
This PR adds --name and generic --set key=value CLI configuration override functionality to Hypha binaries (data, worker, gateway, scheduler). The changes enable customizable node names during initialization and flexible runtime configuration overrides, replacing specific parameters with a generic override mechanism.
Key Changes
- Added
--nameparameter to allinitcommands with binary-specific defaults - Implemented generic
--set key=valuefor flexible configuration overrides across all commands - Introduced
Config::with_name()method to generate name-based certificate filenames
Reviewed Changes
Copilot reviewed 13 out of 18 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| crates/config/src/cli.rs | New module providing CLI utilities for parsing and applying key-value configuration overrides |
| crates/worker/src/config.rs | Added with_name() method to generate certificate filenames based on node name |
| crates/worker/src/bin/hypha-worker.rs | Integrated --name and --set parameters into init, probe, and run commands |
| crates/worker/tests/cli_init_tests.rs | Comprehensive integration tests for new CLI functionality |
| crates/scheduler/src/config.rs | Added with_name() method for scheduler configuration |
| crates/scheduler/src/bin/hypha-scheduler.rs | Integrated CLI overrides into scheduler binary |
| crates/scheduler/tests/cli_init_tests.rs | Integration tests for scheduler CLI features |
| crates/gateway/src/config.rs | Added with_name() method for gateway configuration |
| crates/gateway/src/bin/hypha-gateway.rs | Integrated CLI overrides into gateway binary |
| crates/gateway/tests/cli_init_tests.rs | Integration tests for gateway CLI features |
| crates/data/src/config.rs | Added with_name() method for data configuration |
| crates/data/src/bin/hypha-data.rs | Integrated CLI overrides into data binary |
| crates/data/tests/cli_init_tests.rs | Integration tests for data CLI features |
| crates/scheduler/Cargo.toml | Added test dependencies (tempfile, toml) |
| crates/gateway/Cargo.toml | Added test dependencies (tempfile, toml) |
| crates/data/Cargo.toml | Added test dependencies (tempfile, toml) |
| Cargo.lock | Updated dependency lock file |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
d2980ef to
cb9c701
Compare
nfnt
left a comment
There was a problem hiding this comment.
This looks good and very useful. Just have 2 minor comments, otherwise would be good to merge.
crates/config/src/cli.rs
Outdated
| pub fn parse_key_val(s: &str) -> std::result::Result<(String, String), String> { | ||
| let pos = s | ||
| .find('=') | ||
| .ok_or_else(|| format!("invalid KEY=value: no `=` found in `{s}`"))?; | ||
| Ok((s[..pos].to_string(), s[pos + 1..].to_string())) | ||
| } |
There was a problem hiding this comment.
Let's return a miette::Result here instead. This will ensure a proper Error implementation in case of an error instead of having a string. While technically, there's nothing wrong in using a String as error type in a Result, it's good practice to use a type that implements the Error trait.
I'm not sure though if this is compatible with using this as a ValueParser in clap.
| pub fn parse_key_val(s: &str) -> std::result::Result<(String, String), String> { | |
| let pos = s | |
| .find('=') | |
| .ok_or_else(|| format!("invalid KEY=value: no `=` found in `{s}`"))?; | |
| Ok((s[..pos].to_string(), s[pos + 1..].to_string())) | |
| } | |
| pub fn parse_key_val(s: &str) -> Result<(String, String), String> { | |
| let pos = s | |
| .find('=') | |
| .ok_or_else(|| miette::miette!("invalid KEY=value: no `=` found in `{s}`"))?; | |
| Ok((s[..pos].to_string(), s[pos + 1..].to_string())) | |
| } |
crates/config/src/cli.rs
Outdated
| /// assert!(toml.contains("port = 8080")); // Number without quotes | ||
| /// assert!(toml.contains("name = \"my-service\"")); // String with quotes | ||
| /// ``` | ||
| pub fn create_overrides_toml(overrides: &[(String, String)]) -> Result<String> { |
There was a problem hiding this comment.
This function doesn't have an error condition, therefore we can remove the Result here.
| pub fn create_overrides_toml(overrides: &[(String, String)]) -> Result<String> { | |
| pub fn create_overrides_toml(overrides: &[(String, String)]) -> String { |
c90f592 to
d4f40b1
Compare
Replace specific CLI parameters with generic --set key=value functionality across all Hypha binaries (data, worker, gateway, scheduler). This allows users to override any TOML configuration value via command line arguments. Key changes: - Add parse_key_val function for parsing KEY=value pairs - Add create_overrides_toml function for TOML generation - Add apply_config_overrides helper for figment integration - Update Init, Probe, and Run commands to accept --set parameters - Support multiple overrides with proper type detection (bool, number, string) - Enable nested configuration via dot notation (e.g., listen_addresses.0) - Replace hypha-data dataset_path parameter with --set dataset_path=value Examples: - hypha-data init --set dataset_path="/data/ml-datasets" - hypha-worker run --set work_dir="/tmp/hypha" - hypha-gateway init --set listen_addresses.0="/ip4/0.0.0.0/tcp/8080" - hypha-scheduler run --set gateway_addresses.0="/ip4/127.0.0.1/tcp/1234" Co-Authored-By: Claude <noreply@anthropic.com>
d4f40b1 to
1359c3e
Compare
There was a problem hiding this comment.
Pull Request Overview
Copilot reviewed 14 out of 19 changed files in this pull request and generated 11 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| impl Config { | ||
| /// Create a default configuration with the specified name prefix for certificate files. | ||
| /// | ||
| /// NOTE: This method enables the CLI `init --name` functionality by generating | ||
| /// configuration files with customized certificate file names based on the provided name. | ||
| /// This is essential for multi-node deployments where each node needs distinct certificates. | ||
| pub fn with_name(name: &str) -> Self { |
There was a problem hiding this comment.
The removal of the Default trait implementation without providing a default method means that Config::default() will no longer work. This is a breaking change that could affect other parts of the codebase that depend on Default. Consider either:
- Keeping the
Defaulttrait and having it callConfig::with_name("gateway") - Or add a
pub fn default() -> Selfmethod that callsSelf::with_name("gateway")to maintain compatibility
| #[test] | ||
| fn config_with_name_handles_empty_string() { | ||
| let config = Config::with_name(""); | ||
|
|
||
| assert_eq!(config.cert_pem, PathBuf::from("-cert.pem")); | ||
| assert_eq!(config.key_pem, PathBuf::from("-key.pem")); | ||
| assert_eq!(config.trust_pem, PathBuf::from("-trust.pem")); | ||
| } |
There was a problem hiding this comment.
The test verifies that empty string names produce certificate paths like "-cert.pem", which is likely not a desired behavior. Consider adding validation in Config::with_name to reject empty or invalid names, or provide a fallback to the default name ("worker") when an empty string is provided.
| #[test] | ||
| fn config_with_name_handles_empty_string() { | ||
| let config = Config::with_name(""); | ||
|
|
||
| assert_eq!(config.cert_pem, PathBuf::from("-cert.pem")); | ||
| assert_eq!(config.key_pem, PathBuf::from("-key.pem")); | ||
| assert_eq!(config.trust_pem, PathBuf::from("-trust.pem")); | ||
| } |
There was a problem hiding this comment.
The test verifies that empty string names produce certificate paths like "-cert.pem", which is likely not a desired behavior. Consider adding validation in Config::with_name to reject empty or invalid names, or provide a fallback to the default name ("gateway") when an empty string is provided.
| #[test] | ||
| fn config_with_name_handles_empty_string() { | ||
| let config = Config::with_name(""); | ||
|
|
||
| assert_eq!(config.cert_pem, PathBuf::from("-cert.pem")); | ||
| assert_eq!(config.key_pem, PathBuf::from("-key.pem")); | ||
| assert_eq!(config.trust_pem, PathBuf::from("-trust.pem")); | ||
| } |
There was a problem hiding this comment.
The test verifies that empty string names produce certificate paths like "-cert.pem", which is likely not a desired behavior. Consider adding validation in Config::with_name to reject empty or invalid names, or provide a fallback to the default name ("data") when an empty string is provided.
| impl Config { | ||
| /// Create a default configuration with the specified name prefix for certificate files. | ||
| /// | ||
| /// NOTE: This method enables the CLI `init --name` functionality by generating | ||
| /// configuration files with customized certificate file names based on the provided name. | ||
| /// This is essential for multi-node deployments where each node needs distinct certificates. | ||
| pub fn with_name(name: &str) -> Self { |
There was a problem hiding this comment.
The removal of the Default trait implementation without providing a default method means that Config::default() will no longer work. This is a breaking change that could affect other parts of the codebase that depend on Default. Consider either:
- Keeping the
Defaulttrait and having it callConfig::with_name("data") - Or add a
pub fn default() -> Selfmethod that callsSelf::with_name("data")to maintain compatibility
| serde.workspace = true | ||
| thiserror.workspace = true | ||
| toml.workspace = true | ||
| toml_edit = "0.23.7" |
There was a problem hiding this comment.
[nitpick] The dependency uses a specific patch version 0.23.7 instead of a more flexible version specification like "0.23". Consider using a more lenient version specification (e.g., "0.23" or "0.23.7" with a caret ^0.23.7) to allow for patch updates that might include bug fixes. Pinning to an exact version can prevent automatic security updates.
| toml_edit = "0.23.7" | |
| toml_edit = "^0.23.7" |
| impl Config { | ||
| /// Create a default configuration with the specified name prefix for certificate files. | ||
| /// | ||
| /// NOTE: This method enables the CLI `init --name` functionality by generating | ||
| /// configuration files with customized certificate file names based on the provided name. | ||
| /// This is essential for multi-node deployments where each node needs distinct certificates. | ||
| pub fn with_name(name: &str) -> Self { |
There was a problem hiding this comment.
The removal of the Default trait implementation without providing a default method means that Config::default() will no longer work. This is a breaking change that could affect other parts of the codebase that depend on Default. Consider either:
- Keeping the
Defaulttrait and having it callConfig::with_name("worker") - Or add a
pub fn default() -> Selfmethod that callsSelf::with_name("worker")to maintain compatibility
| for (key, val) in overrides { | ||
| // NOTE: We ignore errors for individual overrides to keep the function infallible. | ||
| let _ = set_nested_value(&mut doc, key, val); | ||
| } |
There was a problem hiding this comment.
Silently ignoring errors when setting nested values could lead to confusing behavior where users specify --set parameters that appear to be accepted but are silently ignored. Consider either:
- Making
create_overrides_tomlreturn aResultand propagate errors - Logging warnings when individual overrides fail
- At minimum, adding a comment explaining why errors are ignored and under what circumstances they might occur
| impl Config { | ||
| /// Create a default configuration with the specified name prefix for certificate files. | ||
| /// | ||
| /// NOTE: This method enables the CLI `init --name` functionality by generating | ||
| /// configuration files with customized certificate file names based on the provided name. | ||
| /// This is essential for multi-node deployments where each node needs distinct certificates. | ||
| pub fn with_name(name: &str) -> Self { |
There was a problem hiding this comment.
The removal of the Default trait implementation without providing a default method means that Config::default() will no longer work. This is a breaking change that could affect other parts of the codebase that depend on Default. Consider either:
- Keeping the
Defaulttrait and having it callConfig::with_name("scheduler") - Or add a
pub fn default() -> Selfmethod that callsSelf::with_name("scheduler")to maintain compatibility
| #[test] | ||
| fn config_with_name_handles_empty_string() { | ||
| let config = Config::with_name(""); | ||
|
|
||
| assert_eq!(config.cert_pem, PathBuf::from("-cert.pem")); | ||
| assert_eq!(config.key_pem, PathBuf::from("-key.pem")); | ||
| assert_eq!(config.trust_pem, PathBuf::from("-trust.pem")); | ||
| } |
There was a problem hiding this comment.
The test verifies that empty string names produce certificate paths like "-cert.pem", which is likely not a desired behavior. Consider adding validation in Config::with_name to reject empty or invalid names, or provide a fallback to the default name ("scheduler") when an empty string is provided.
orlandohohmeier
left a comment
There was a problem hiding this comment.
I’m on board with a arbitrary key/value UX, but routing it through TOML works against our layered config model’s goals (provenance + clear validation). The module may lean TOML as of today, but its core is tracking where values came from and why they’re invalid. Turning arbitrary sets into TOML and piping them into a provider obscures origin and error context.
I'd vote to just extend the CLI parsing to map explicit, typed options directly into the layered config, rather than introducing a TOML detour for dynamic overwriting. This keeps provenance intact and aligns with the existing setup.
If we really want to support the key/value UX then let's have deep look into the clap arg parsing along with serde parsing to built out this feature.
| telemetry_sample_ratio: Option<f64>, | ||
| } | ||
|
|
||
| impl Default for Config { |
There was a problem hiding this comment.
Why are we removing default?
|
yeah lets find a good solution then, ultimately the goal was to have some way to realize something like this: ... 🙈 |
|
Will open new PRs introducing similar functionality but with a different approach. Turns out the main thing I was trying to handle was passing a job to the schedulers run command - I just didnt realize it |
Summary
Adds
--nameand generic--set key=valuefunctionality to all Hypha binaries, replacing specific parameters with flexible configuration overrides.Commits
1. Add --name param to CLI init commands
--nameparameter toinitcommands across all binaries2. Add generic --set parameter for configuration overrides
--set key=valuefor all binaries (data, worker, gateway, scheduler)dataset_pathparameter with generic--set dataset_path=valueUsage