Skip to content

Add --name and --set CLI Configuration Overrides#116

Closed
juliangieseke wants to merge 2 commits intoalphafrom
jg/cli-params
Closed

Add --name and --set CLI Configuration Overrides#116
juliangieseke wants to merge 2 commits intoalphafrom
jg/cli-params

Conversation

@juliangieseke
Copy link
Contributor

@juliangieseke juliangieseke commented Nov 11, 2025

Summary

Adds --name and generic --set key=value functionality to all Hypha binaries, replacing specific parameters with flexible configuration overrides.

Commits

1. Add --name param to CLI init commands

  • Added --name parameter to init commands across all binaries
  • Allows customizing node names during configuration generation
  • Default values: "data", "worker", "gateway", "scheduler"

2. Add generic --set parameter for configuration overrides

  • Implemented --set key=value for all binaries (data, worker, gateway, scheduler)
  • Supports multiple overrides, type detection, and nested configuration
  • Replaced hypha-data's dataset_path parameter with generic --set dataset_path=value

Usage

# Initialize with custom name and configuration
hypha-data init --name my-node --set dataset_path="custom_folder"

# Override any config value at runtime
hypha-worker run --set work_dir="/tmp" --set listen_addresses.0="/ip4/0.0.0.0/tcp/8080"

@juliangieseke juliangieseke force-pushed the jg/cli-params branch 3 times, most recently from 599c731 to 9eb6073 Compare November 11, 2025 13:02
@codecov
Copy link

codecov bot commented Nov 11, 2025

Codecov Report

❌ Patch coverage is 92.00926% with 138 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
crates/gateway/tests/cli_init_tests.rs 88.23% 38 Missing ⚠️
crates/worker/tests/cli_init_tests.rs 88.37% 38 Missing ⚠️
crates/data/tests/cli_init_tests.rs 88.44% 35 Missing ⚠️
crates/config/src/cli.rs 87.63% 23 Missing ⚠️
crates/scheduler/tests/cli_init_tests.rs 98.65% 4 Missing ⚠️

📢 Thoughts on this report? Let us know!

@juliangieseke juliangieseke changed the title Add additional cli params Add --name and --set CLI Configuration Overrides Nov 11, 2025
@juliangieseke juliangieseke force-pushed the jg/cli-params branch 4 times, most recently from 8916705 to f4e714d Compare November 11, 2025 14:08
@juliangieseke juliangieseke marked this pull request as ready for review November 11, 2025 14:11
Copilot AI review requested due to automatic review settings November 11, 2025 14:11
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds --name and generic --set key=value CLI configuration override functionality to Hypha binaries (data, worker, gateway, scheduler). The changes enable customizable node names during initialization and flexible runtime configuration overrides, replacing specific parameters with a generic override mechanism.

Key Changes

  • Added --name parameter to all init commands with binary-specific defaults
  • Implemented generic --set key=value for flexible configuration overrides across all commands
  • Introduced Config::with_name() method to generate name-based certificate filenames

Reviewed Changes

Copilot reviewed 13 out of 18 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
crates/config/src/cli.rs New module providing CLI utilities for parsing and applying key-value configuration overrides
crates/worker/src/config.rs Added with_name() method to generate certificate filenames based on node name
crates/worker/src/bin/hypha-worker.rs Integrated --name and --set parameters into init, probe, and run commands
crates/worker/tests/cli_init_tests.rs Comprehensive integration tests for new CLI functionality
crates/scheduler/src/config.rs Added with_name() method for scheduler configuration
crates/scheduler/src/bin/hypha-scheduler.rs Integrated CLI overrides into scheduler binary
crates/scheduler/tests/cli_init_tests.rs Integration tests for scheduler CLI features
crates/gateway/src/config.rs Added with_name() method for gateway configuration
crates/gateway/src/bin/hypha-gateway.rs Integrated CLI overrides into gateway binary
crates/gateway/tests/cli_init_tests.rs Integration tests for gateway CLI features
crates/data/src/config.rs Added with_name() method for data configuration
crates/data/src/bin/hypha-data.rs Integrated CLI overrides into data binary
crates/data/tests/cli_init_tests.rs Integration tests for data CLI features
crates/scheduler/Cargo.toml Added test dependencies (tempfile, toml)
crates/gateway/Cargo.toml Added test dependencies (tempfile, toml)
crates/data/Cargo.toml Added test dependencies (tempfile, toml)
Cargo.lock Updated dependency lock file

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@juliangieseke juliangieseke force-pushed the jg/cli-params branch 2 times, most recently from d2980ef to cb9c701 Compare November 11, 2025 14:22
Copy link
Contributor

@nfnt nfnt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good and very useful. Just have 2 minor comments, otherwise would be good to merge.

Comment on lines +49 to +54
pub fn parse_key_val(s: &str) -> std::result::Result<(String, String), String> {
let pos = s
.find('=')
.ok_or_else(|| format!("invalid KEY=value: no `=` found in `{s}`"))?;
Ok((s[..pos].to_string(), s[pos + 1..].to_string()))
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's return a miette::Result here instead. This will ensure a proper Error implementation in case of an error instead of having a string. While technically, there's nothing wrong in using a String as error type in a Result, it's good practice to use a type that implements the Error trait.

I'm not sure though if this is compatible with using this as a ValueParser in clap.

Suggested change
pub fn parse_key_val(s: &str) -> std::result::Result<(String, String), String> {
let pos = s
.find('=')
.ok_or_else(|| format!("invalid KEY=value: no `=` found in `{s}`"))?;
Ok((s[..pos].to_string(), s[pos + 1..].to_string()))
}
pub fn parse_key_val(s: &str) -> Result<(String, String), String> {
let pos = s
.find('=')
.ok_or_else(|| miette::miette!("invalid KEY=value: no `=` found in `{s}`"))?;
Ok((s[..pos].to_string(), s[pos + 1..].to_string()))
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed 👍

/// assert!(toml.contains("port = 8080")); // Number without quotes
/// assert!(toml.contains("name = \"my-service\"")); // String with quotes
/// ```
pub fn create_overrides_toml(overrides: &[(String, String)]) -> Result<String> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function doesn't have an error condition, therefore we can remove the Result here.

Suggested change
pub fn create_overrides_toml(overrides: &[(String, String)]) -> Result<String> {
pub fn create_overrides_toml(overrides: &[(String, String)]) -> String {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed 👍

@juliangieseke juliangieseke force-pushed the jg/cli-params branch 3 times, most recently from c90f592 to d4f40b1 Compare November 11, 2025 15:22
Replace specific CLI parameters with generic --set key=value functionality
across all Hypha binaries (data, worker, gateway, scheduler). This allows
users to override any TOML configuration value via command line arguments.

Key changes:
- Add parse_key_val function for parsing KEY=value pairs
- Add create_overrides_toml function for TOML generation
- Add apply_config_overrides helper for figment integration
- Update Init, Probe, and Run commands to accept --set parameters
- Support multiple overrides with proper type detection (bool, number, string)
- Enable nested configuration via dot notation (e.g., listen_addresses.0)
- Replace hypha-data dataset_path parameter with --set dataset_path=value

Examples:
- hypha-data init --set dataset_path="/data/ml-datasets"
- hypha-worker run --set work_dir="/tmp/hypha"
- hypha-gateway init --set listen_addresses.0="/ip4/0.0.0.0/tcp/8080"
- hypha-scheduler run --set gateway_addresses.0="/ip4/127.0.0.1/tcp/1234"

Co-Authored-By: Claude <noreply@anthropic.com>
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 14 out of 19 changed files in this pull request and generated 11 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +55 to +61
impl Config {
/// Create a default configuration with the specified name prefix for certificate files.
///
/// NOTE: This method enables the CLI `init --name` functionality by generating
/// configuration files with customized certificate file names based on the provided name.
/// This is essential for multi-node deployments where each node needs distinct certificates.
pub fn with_name(name: &str) -> Self {
Copy link

Copilot AI Nov 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The removal of the Default trait implementation without providing a default method means that Config::default() will no longer work. This is a breaking change that could affect other parts of the codebase that depend on Default. Consider either:

  1. Keeping the Default trait and having it call Config::with_name("gateway")
  2. Or add a pub fn default() -> Self method that calls Self::with_name("gateway") to maintain compatibility

Copilot uses AI. Check for mistakes.
Comment on lines +322 to +329
#[test]
fn config_with_name_handles_empty_string() {
let config = Config::with_name("");

assert_eq!(config.cert_pem, PathBuf::from("-cert.pem"));
assert_eq!(config.key_pem, PathBuf::from("-key.pem"));
assert_eq!(config.trust_pem, PathBuf::from("-trust.pem"));
}
Copy link

Copilot AI Nov 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test verifies that empty string names produce certificate paths like "-cert.pem", which is likely not a desired behavior. Consider adding validation in Config::with_name to reject empty or invalid names, or provide a fallback to the default name ("worker") when an empty string is provided.

Copilot uses AI. Check for mistakes.
Comment on lines +175 to +182
#[test]
fn config_with_name_handles_empty_string() {
let config = Config::with_name("");

assert_eq!(config.cert_pem, PathBuf::from("-cert.pem"));
assert_eq!(config.key_pem, PathBuf::from("-key.pem"));
assert_eq!(config.trust_pem, PathBuf::from("-trust.pem"));
}
Copy link

Copilot AI Nov 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test verifies that empty string names produce certificate paths like "-cert.pem", which is likely not a desired behavior. Consider adding validation in Config::with_name to reject empty or invalid names, or provide a fallback to the default name ("gateway") when an empty string is provided.

Copilot uses AI. Check for mistakes.
Comment on lines +200 to +207
#[test]
fn config_with_name_handles_empty_string() {
let config = Config::with_name("");

assert_eq!(config.cert_pem, PathBuf::from("-cert.pem"));
assert_eq!(config.key_pem, PathBuf::from("-key.pem"));
assert_eq!(config.trust_pem, PathBuf::from("-trust.pem"));
}
Copy link

Copilot AI Nov 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test verifies that empty string names produce certificate paths like "-cert.pem", which is likely not a desired behavior. Consider adding validation in Config::with_name to reject empty or invalid names, or provide a fallback to the default name ("data") when an empty string is provided.

Copilot uses AI. Check for mistakes.
Comment on lines +56 to +62
impl Config {
/// Create a default configuration with the specified name prefix for certificate files.
///
/// NOTE: This method enables the CLI `init --name` functionality by generating
/// configuration files with customized certificate file names based on the provided name.
/// This is essential for multi-node deployments where each node needs distinct certificates.
pub fn with_name(name: &str) -> Self {
Copy link

Copilot AI Nov 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The removal of the Default trait implementation without providing a default method means that Config::default() will no longer work. This is a breaking change that could affect other parts of the codebase that depend on Default. Consider either:

  1. Keeping the Default trait and having it call Config::with_name("data")
  2. Or add a pub fn default() -> Self method that calls Self::with_name("data") to maintain compatibility

Copilot uses AI. Check for mistakes.
serde.workspace = true
thiserror.workspace = true
toml.workspace = true
toml_edit = "0.23.7"
Copy link

Copilot AI Nov 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The dependency uses a specific patch version 0.23.7 instead of a more flexible version specification like "0.23". Consider using a more lenient version specification (e.g., "0.23" or "0.23.7" with a caret ^0.23.7) to allow for patch updates that might include bug fixes. Pinning to an exact version can prevent automatic security updates.

Suggested change
toml_edit = "0.23.7"
toml_edit = "^0.23.7"

Copilot uses AI. Check for mistakes.
Comment on lines +133 to +139
impl Config {
/// Create a default configuration with the specified name prefix for certificate files.
///
/// NOTE: This method enables the CLI `init --name` functionality by generating
/// configuration files with customized certificate file names based on the provided name.
/// This is essential for multi-node deployments where each node needs distinct certificates.
pub fn with_name(name: &str) -> Self {
Copy link

Copilot AI Nov 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The removal of the Default trait implementation without providing a default method means that Config::default() will no longer work. This is a breaking change that could affect other parts of the codebase that depend on Default. Consider either:

  1. Keeping the Default trait and having it call Config::with_name("worker")
  2. Or add a pub fn default() -> Self method that calls Self::with_name("worker") to maintain compatibility

Copilot uses AI. Check for mistakes.
Comment on lines +100 to +103
for (key, val) in overrides {
// NOTE: We ignore errors for individual overrides to keep the function infallible.
let _ = set_nested_value(&mut doc, key, val);
}
Copy link

Copilot AI Nov 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Silently ignoring errors when setting nested values could lead to confusing behavior where users specify --set parameters that appear to be accepted but are silently ignored. Consider either:

  1. Making create_overrides_toml return a Result and propagate errors
  2. Logging warnings when individual overrides fail
  3. At minimum, adding a comment explaining why errors are ignored and under what circumstances they might occur

Copilot uses AI. Check for mistakes.
Comment on lines +65 to +71
impl Config {
/// Create a default configuration with the specified name prefix for certificate files.
///
/// NOTE: This method enables the CLI `init --name` functionality by generating
/// configuration files with customized certificate file names based on the provided name.
/// This is essential for multi-node deployments where each node needs distinct certificates.
pub fn with_name(name: &str) -> Self {
Copy link

Copilot AI Nov 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The removal of the Default trait implementation without providing a default method means that Config::default() will no longer work. This is a breaking change that could affect other parts of the codebase that depend on Default. Consider either:

  1. Keeping the Default trait and having it call Config::with_name("scheduler")
  2. Or add a pub fn default() -> Self method that calls Self::with_name("scheduler") to maintain compatibility

Copilot uses AI. Check for mistakes.
Comment on lines +229 to +236
#[test]
fn config_with_name_handles_empty_string() {
let config = Config::with_name("");

assert_eq!(config.cert_pem, PathBuf::from("-cert.pem"));
assert_eq!(config.key_pem, PathBuf::from("-key.pem"));
assert_eq!(config.trust_pem, PathBuf::from("-trust.pem"));
}
Copy link

Copilot AI Nov 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test verifies that empty string names produce certificate paths like "-cert.pem", which is likely not a desired behavior. Consider adding validation in Config::with_name to reject empty or invalid names, or provide a fallback to the default name ("scheduler") when an empty string is provided.

Copilot uses AI. Check for mistakes.
Copy link
Contributor

@orlandohohmeier orlandohohmeier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m on board with a arbitrary key/value UX, but routing it through TOML works against our layered config model’s goals (provenance + clear validation). The module may lean TOML as of today, but its core is tracking where values came from and why they’re invalid. Turning arbitrary sets into TOML and piping them into a provider obscures origin and error context.

I'd vote to just extend the CLI parsing to map explicit, typed options directly into the layered config, rather than introducing a TOML detour for dynamic overwriting. This keeps provenance intact and aligns with the existing setup.

If we really want to support the key/value UX then let's have deep look into the clap arg parsing along with serde parsing to built out this feature.

telemetry_sample_ratio: Option<f64>,
}

impl Default for Config {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we removing default?

@juliangieseke
Copy link
Contributor Author

yeah lets find a good solution then, ultimately the goal was to have some way to realize something like this:

./hypha-scheduler init \
--set=scheduler.job.model.repository="hypha-space/lenet" \
--set=scheduler.job.model.type="vision-classification" \
--set=scheduler.job.model.filenames.0="config.json" \
--set=scheduler.job.model.filenames.1="model.safetensors" \
--set=scheduler.job.model.filenames.2="configuration_lenet.py" \
--set=scheduler.job.model.filenames.3="modeling_lenet.py" \
--set=scheduler.job.model.token="TBD" \
--set=scheduler.job.preprocessor.repository="hypha-space/lenet" \
--set=scheduler.job.preprocessor.filesnames.0="preprocessor_config.json" \
--set=scheduler.job.preprocessor.token="TBD" \
--set=scheduler.job.dataset.dataset="mnist"

... 🙈

@juliangieseke
Copy link
Contributor Author

Will open new PRs introducing similar functionality but with a different approach. Turns out the main thing I was trying to handle was passing a job to the schedulers run command - I just didnt realize it

@juliangieseke juliangieseke deleted the jg/cli-params branch November 12, 2025 18:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants