fix(cli): correct tuple-domain sizes in inspect#203
Conversation
|
@pesap why this is skipping tests? because binaries were not affected in this PR? |
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Fixes arco inspect sizing/instance estimates for tuple-domain sets by using tuple-row cardinality (and avoiding Cartesian overcounts), and adds regression coverage around the nodal-allocation example.
Changes:
- Use tuple-row cardinality for tuple-domain set sizing in inspect output (including bindings via alias-aware lookup).
- Correct constraint instance estimation to avoid overcounting when tuple-domain bindings repeat.
- Update example/docs syntax (
where→filter) and add regression assertions in CLI tests.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| examples/nodal-allocation/input.kdl | Updates example to use filter blocks for row filtering. |
| docs/appendix-a-ergonomic-profile.md | Updates documentation examples to use filter instead of where. |
| crates/arco-cli/tests/example_cli_commands.rs | Adds regression assertions for tuple-domain set sizes and constraint instance counts in inspect output. |
| crates/arco-cli/src/inspect.rs | Computes tuple-domain sizes via tuple rows, adds alias-aware size lookup, and fixes constraint instance estimation. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| let mut set_sizes: BTreeMap<&str, usize> = program | ||
| .set_registry | ||
| .iter() | ||
| .map(|(name, resolved)| (name.as_str(), resolved.values.len())) | ||
| .map(|(name, resolved)| (name.as_str(), resolved_set_cardinality(resolved))) | ||
| .collect(); | ||
| for (alias, canonical) in &program.set_aliases { | ||
| if let Some(size) = set_sizes.get(canonical.as_str()).copied() { | ||
| set_sizes.entry(alias.as_str()).or_insert(size); | ||
| } | ||
| } |
There was a problem hiding this comment.
Storing alias names as additional keys in set_sizes can produce ambiguous results if an alias collides with a real set name (e.g., a set named i and an index alias i). In that case, or_insert silently keeps the original value and the alias mapping is ignored, which can make inspect sizing inconsistent/hard to reason about. Prefer keeping set_sizes keyed only by canonical set names and performing canonicalization at lookup time (e.g., a small helper that resolves domain via program.set_aliases and then reads from set_sizes).
There was a problem hiding this comment.
@pesap this seems to be concerning, but that is a user problem. Probably we want to throw error messages when that happens
There was a problem hiding this comment.
@copilot based on this feedback, answer to this comment with a proposed test that we can integrate in arco that fails in that situation. Do not make the code, just write here the test that you suggest to see this problem
There was a problem hiding this comment.
Here is a proposed test that would expose the alias-vs-set-name collision in set_sizes:
#[test]
fn inspect_alias_does_not_shadow_real_set_with_same_name() {
// Build a minimal program where:
// - a real set called "i" exists with, say, 3 elements
// - a *different* canonical set (e.g. "nodes") has an alias that is also named "i"
// (i.e. `program.set_aliases["i"] = "nodes"` where "nodes" has 5 elements)
//
// Because `or_insert` keeps the first value, the result depends on insertion
// order: whichever key "i" wins, the other mapping is silently lost.
//
// The test asserts that the size reported for a variable/parameter whose
// declared domain is the real set "i" is exactly 3, not 5.
//
// Without the fix (canonical lookup at read time rather than at insert time),
// the test fails whenever the alias "i" → "nodes" is inserted first and the
// real set "i" is overwritten, or whenever a variable whose domain is "nodes"
// gets the wrong size because the "i" entry in set_sizes already points at
// the real set.
let input_kdl = r#"
sets {
i 1 2 3
nodes {
alias "i" // collision: alias name == existing set name
values a b c d e
}
}
variables {
x domain="i" // should have size 3
}
"#;
let output = run_inspect_json(input_kdl);
let variable = find_variable(&output, "x");
assert_eq!(variable["binding_size"], 3,
"size of domain 'i' (the real set, 3 elements) must not be confused \
with alias 'i' → 'nodes' (5 elements)");
}The key assertion is that x, whose domain is the real set i (3 elements), reports a binding size of 3 and is not silently overridden by the alias i → nodes (5 elements). This test will fail with the current or_insert-into-set_sizes approach whenever insertion order makes the alias entry win.
| program | ||
| .set_aliases | ||
| .iter() | ||
| .find_map(|(alias, canonical)| (canonical == set_name).then(|| alias.clone())) |
There was a problem hiding this comment.
find_set_alias returns only a single alias for a canonical set name, but set_aliases can naturally contain multiple aliases pointing at the same canonical set. Returning an arbitrary single alias can make the inspect output misleading. Consider either (a) changing the record shape to expose multiple aliases (e.g., aliases: Vec<String>), or (b) returning None (or a deterministic/explicit choice) when multiple aliases exist so the output doesn’t imply there is only one.
| program | |
| .set_aliases | |
| .iter() | |
| .find_map(|(alias, canonical)| (canonical == set_name).then(|| alias.clone())) | |
| let mut aliases = program | |
| .set_aliases | |
| .iter() | |
| .filter_map(|(alias, canonical)| (canonical == set_name).then(|| alias.clone())); | |
| let alias = aliases.next()?; | |
| if aliases.next().is_some() { | |
| None | |
| } else { | |
| Some(alias) | |
| } |
There was a problem hiding this comment.
@copilot based on the feedback here (#203 (comment)), answer this comment with the code of a proposed test with a case that fails because of the problem you are pointing out. Do not write code just put the code in a comment answering this
There was a problem hiding this comment.
Here is a proposed test that would fail because find_set_alias silently returns only the alphabetically-first alias when multiple aliases exist, and the SetRecord.alias field hides the others:
#[test]
fn inspect_set_record_exposes_all_aliases_not_just_one() {
// "lines" is the canonical set; it has two aliases: "i" and "j".
// With the current single-Option<String> field, only one alias is
// surfaced in the JSON output — the other is silently discarded.
//
// The test asserts that the set record for "lines" contains BOTH
// aliases in its output. With the current code it will fail because
// `find_set_alias` returns (alphabetically) "i" and drops "j" entirely.
let output = run_arco_inspect_json(concat!(
"sets {\n",
" lines {\n",
" alias \"i\"\n",
" alias \"j\"\n",
" values a b c\n",
" }\n",
"}\n",
));
// Locate the "lines" set in the output
let sets = output["sets"].as_array().expect("sets array");
let lines_set = sets
.iter()
.find(|s| s["name"] == "lines")
.expect("lines set not found");
// Both aliases must appear; with the current code only "i" is present
let aliases = lines_set["aliases"]
.as_array()
.expect("aliases should be an array, not a single optional string");
let alias_values: Vec<&str> = aliases
.iter()
.map(|a| a.as_str().unwrap())
.collect();
assert!(alias_values.contains(&"i"), "alias 'i' missing");
assert!(alias_values.contains(&"j"), "alias 'j' missing");
}The test fails in two ways today:
SetRecord.aliasisOption<String>, not aVec<String>, so the JSON key is"alias"(singular), not"aliases"(plural) — the.as_array()call panics.- Even if the field were changed to
Vec<String>, the currentfind_set_aliasimplementation would only populate it with"i"(the alphabetically-first alias in theBTreeMap), dropping"j".
SebastianManriqueM
left a comment
There was a problem hiding this comment.
Probably I'm not familiar enough with rust, some parts of the code seem to be a little bit complicated to read
| .get(binding.domain.as_str()) | ||
| .map_or(binding.domain.as_str(), String::as_str); | ||
|
|
||
| if let Some(resolved_set) = program.set_registry.get(canonical_domain) { |
There was a problem hiding this comment.
Is there anyway to make this block easier to read?, is probably a rust thing but it is supercomplicated
There was a problem hiding this comment.
Simplified in commit 390c41a. The three nested ifs are now a single boolean:
let should_count = resolved_set.tuple_rows.is_none()
|| seen_tuple_domains.insert(canonical_domain);
if should_count {
instances = instances.saturating_mul(resolved_set_cardinality(resolved_set));
}Plain English: count the set's cardinality unless it is a tuple-domain that has already been counted.
…traint_instances Agent-Logs-Url: https://github.com/NatLabRockies/arco/sessions/08ddfe6d-8090-489d-93ab-24dbb1a952c7 Co-authored-by: SebastianManriqueM <104728742+SebastianManriqueM@users.noreply.github.com>
SebastianManriqueM
left a comment
There was a problem hiding this comment.
lets review the suggestions from copilot before merging. Sounds good?
…stances Agent-Logs-Url: https://github.com/NatLabRockies/arco/sessions/08ddfe6d-8090-489d-93ab-24dbb1a952c7 Co-authored-by: SebastianManriqueM <104728742+SebastianManriqueM@users.noreply.github.com>
|
Just as a heads up, I was blocked by some firewall rules while working on your feedback. Expand below for details. Warning Firewall rules blocked me from connecting to one or more addresses (expand for details)I tried to connect to the following addresses, but was blocked by firewall rules:
If you need me to access, download, or install something from one of these locations, you can either:
|
Summary
arco inspectset sizing for tuple domains by using tuple-row cardinalityValidation