@Ronkahn21

What type of PR is this?

/kind feature

What this PR does / why we need it:

This PR adds 5 test scenarios for scaling and hierarchical topology constraint patterns in Topology Aware Scheduling (TAS).

Tests Added:

  • SP1_FullHierarchyWithCascadingConstraints: Tests complete 3-level hierarchy (PCS → PCSG → PCLQ) with constraint inheritance and overrides
  • SP2_PCSPlusPCLQConstraint: Tests PCS-level constraint combined with PCLQ-level override constraint
  • SP3_PCSGScalingWithTopologyConstraints: Tests PCSG-level scaling (3 replicas) with topology constraint propagation
  • SP5_PCSGPlusPCLQNoParentConstraint: Tests PCSG and PCLQ constraints without parent PCS constraint
  • SP8_LargeScalingRatio: Tests large scaling ratio (6+ PCSG replicas) for scalability verification
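
To illustrate the inheritance and override behavior these scenarios exercise, here is a minimal sketch in Go. It assumes the nearest explicitly set constraint wins (PCLQ, then PCSG, then PCS), which is one plausible reading of "inheritance and overrides" and may not match how the operator actually composes constraints. `TopologyConstraint`, `PackDomain`, and `effectiveDomain` are hypothetical names for illustration, not the project's API.

```go
package main

import "fmt"

// TopologyConstraint is a hypothetical stand-in for a packing constraint.
// An empty PackDomain means no constraint is set at that level.
type TopologyConstraint struct {
	PackDomain string // e.g. "zone", "block", "rack", "host"
}

// effectiveDomain returns the pack domain that applies to a clique,
// ASSUMING the nearest explicitly set constraint wins: PCLQ, then PCSG, then PCS.
func effectiveDomain(pcs, pcsg, pclq *TopologyConstraint) string {
	for _, c := range []*TopologyConstraint{pclq, pcsg, pcs} {
		if c != nil && c.PackDomain != "" {
			return c.PackDomain
		}
	}
	return "" // no constraint anywhere in the hierarchy
}

func main() {
	pcs := &TopologyConstraint{PackDomain: "zone"}
	pcsg := &TopologyConstraint{PackDomain: "rack"}
	pclq := &TopologyConstraint{PackDomain: "host"}

	fmt.Println(effectiveDomain(pcs, pcsg, pclq)) // SP1-like: prints "host"
	fmt.Println(effectiveDomain(pcs, nil, pclq))  // SP2-like: prints "host"
	fmt.Println(effectiveDomain(nil, pcsg, nil))  // PCSG only: prints "rack"
	fmt.Println(effectiveDomain(pcs, nil, nil))   // inherited from PCS: prints "zone"
}
```

The same lookup covers the SP5 shape (no PCS constraint) by passing `nil` for the parent.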

Test Coverage:

  • Hierarchical constraints (PCS → PCSG → PCLQ)
  • Constraint inheritance, override, and composition
  • PCSG replica-level topology constraint creation
  • Large-scale replica handling
  • KAI PodGroup SubGroup verification with multi-level constraints
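
As a rough picture of what verifying a packing constraint can look like, the sketch below checks that every pod of a group landed on nodes sharing one topology domain. It is not the helper used in topology_test.go: `nodeLabels`, `packedInOneDomain`, and the `example.com/...` label keys are assumptions made for illustration.

```go
package main

import "fmt"

// nodeLabels maps node name -> label key -> label value.
// The label keys used below are hypothetical.
type nodeLabels map[string]map[string]string

// packedInOneDomain reports whether every node in podNodes carries the same
// non-empty value for labelKey, i.e. all pods sit in one topology domain.
func packedInOneDomain(labels nodeLabels, podNodes []string, labelKey string) bool {
	if len(podNodes) == 0 {
		return true // nothing scheduled, nothing violated
	}
	want := labels[podNodes[0]][labelKey]
	if want == "" {
		return false // node is missing the topology label
	}
	for _, n := range podNodes[1:] {
		if labels[n][labelKey] != want {
			return false
		}
	}
	return true
}

func main() {
	labels := nodeLabels{
		"w1": {"example.com/zone": "z1", "example.com/rack": "r1"},
		"w2": {"example.com/zone": "z1", "example.com/rack": "r1"},
		"w3": {"example.com/zone": "z1", "example.com/rack": "r2"},
	}

	fmt.Println(packedInOneDomain(labels, []string{"w1", "w2"}, "example.com/rack")) // true
	fmt.Println(packedInOneDomain(labels, []string{"w1", "w3"}, "example.com/rack")) // false
	fmt.Println(packedInOneDomain(labels, []string{"w1", "w3"}, "example.com/zone")) // true
}
```

For a scaled PCSG, a check of this kind would run once per replica, since each replica may be packed into a different domain.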

This PR is part 3 of 4 in the TAS e2e test suite.

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Dependencies:

  • Builds on PR ai-dynamo#349 (simple level tests)

Test Verification:

  • All tests compile successfully with -tags e2e
  • Linter passes with 0 issues
  • Added 576 lines of test code across 5 test functions (removed duplicate helper function)
  • Added 5 YAML test scenario files

File Summary:

  • Modified: 1 file (topology_test.go - added 5 tests)
  • New: 5 YAML test scenario files

Does this PR introduce an API change?

NONE

Additional documentation e.g., enhancement proposals, usage docs, etc.:

NONE

- Add 4-level topology hierarchy setup (zone/block/rack/host)
- Add KAI Topology verification utilities
- Add topology constraint verification helpers
- Include 2 foundational tests:
  * TI1: Topology infrastructure verification
  * BP1: Multiple cliques with different constraints
- Update dependencies to KAI Scheduler v0.13.0-rc1
- Add Makefile target for selective test execution
- Add topology-test skaffold profile

Signed-off-by: Ron Kahn <rkahn@nvidia.com>

Add 5 tests for simple topology constraint scenarios:
- SL1: PCS-only constraint (inherited by children)
- SL2: PCSG-only constraint
- SL3: No topology constraints (baseline)
- PC1: Host-level constraint (strictest packing)
- ZL1: Zone-level constraint

These tests verify constraint behavior at different
resource levels (PCS, PCSG, PCLQ) and topology domains
(zone, rack, host, none).

Builds on PR ai-dynamo#348 (infrastructure).

Signed-off-by: Ron Kahn <rkahn@nvidia.com>

Add 5 tests for scaling and hierarchical topology patterns:
- SP1: Full hierarchy with cascading constraints (PCS→PCSG→PCLQ)
- SP2: PCS + PCLQ constraint combination
- SP3: PCSG scaling with topology constraints
- SP5: PCSG + PCLQ without parent PCS constraint
- SP8: Large scaling ratio (6+ replicas)

These tests verify:
- Hierarchical constraint inheritance and overrides
- PCSG-level topology constraint propagation
- Large-scale PCSG replica handling
- KAI PodGroup SubGroup structure with constraints

Builds on PR ai-dynamo#349 (simple level tests).

Signed-off-by: Ron Kahn <rkahn@nvidia.com>