
[AI Explainability] TrustyAI DSC based LMEval configuration#672

Merged
sheltoncyril merged 4 commits into opendatahub-io:main from sheltoncyril:AI-Exp-TAI-DSC-Config
Oct 2, 2025

Conversation

@sheltoncyril
Contributor

@sheltoncyril sheltoncyril commented Oct 2, 2025

This PR adds tests for the newly introduced DSC config feature where the LMEval flags allowCodeExecution and allowOnline are configurable from the DSC.

Summary by CodeRabbit

  • Tests

    • Migrated test setup to cluster-level configuration for TrustyAI evaluations, replacing legacy configuration approach.
    • Ensures scenarios cover online and code-execution permissions, improving stability across LMEval test suites (HF, local offline, emulator).
    • Updated test dependencies to use the new setup consistently.
  • Refactor

    • Consolidated and renamed test fixtures to align with current platform configuration, reducing maintenance overhead.
  • Notes

    • No user-facing functionality changes.

@sheltoncyril sheltoncyril requested review from a team and adolfo-ab as code owners October 2, 2025 12:28
@coderabbitai
Contributor

coderabbitai Bot commented Oct 2, 2025

📝 Walkthrough

Walkthrough

Replaces a ConfigMap-based fixture with a DataScienceCluster-based fixture to patch lmeval-related settings, updates imports and fixture signatures, and adjusts dependent tests and conftest fixtures to consume the new patched_dsc_lmeval_allow_all fixture and types.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Fixture refactor: ConfigMap → DataScienceCluster (tests/fixtures/trustyai.py) | Introduces patched_dsc_lmeval_allow_all(admin_client, trustyai_operator_deployment) -> Generator[DataScienceCluster, None, None]. Switches from creating/patching a ConfigMap to obtaining and patching a DataScienceCluster (via get_data_science_cluster and ResourceEditor) to enable trustyai.eval.lmeval.permitOnline and permitCodeExecution. Updates imports to use DataScienceCluster; removes ConfigMap-related imports and logic. Yields DataScienceCluster instead of ConfigMap. Retains operator replica scale/restart logic targeting the TrustyAI operator. |
| Test updates to new fixture (tests/llama_stack/eval/test_lmeval_provider.py) | Replaces use of patched_trustyai_configmap_allow_online with patched_dsc_lmeval_allow_all in test fixture injection. No other behavioral changes indicated. |
| Conftest fixture signatures and types (tests/model_explainability/lm_eval/conftest.py) | Updates three fixtures (lmevaljob_hf, lmevaljob_local_offline, lmevaljob_vllm_emulator) to depend on patched_dsc_lmeval_allow_all: DataScienceCluster instead of patched_trustyai_configmap_allow_online: ConfigMap. Usage otherwise unchanged. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00%, which is below the required threshold of 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The title concisely conveys the primary change of migrating LMEval configuration in TrustyAI to use DataScienceCluster (DSC) rather than ConfigMap, matching the changes in tests and fixtures. It clearly identifies the relevant subsystem and feature without extraneous details and aligns well with the pull request objectives. |

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ee8c5be and bb3b36c.

📒 Files selected for processing (1)
  • tests/fixtures/trustyai.py (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
tests/fixtures/trustyai.py (2)
utilities/infra.py (1)
  • get_data_science_cluster (865-866)
tests/conftest.py (1)
  • admin_client (68-69)


Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions

github-actions Bot commented Oct 2, 2025

The following are automatically added/executed:

  • PR size label.
  • Run pre-commit
  • Run tox
  • Add PR author as the PR assignee
  • Build image based on the PR

Available user actions:

  • To mark a PR as WIP, add /wip in a comment. To remove the WIP mark, comment /wip cancel on the PR.
  • To block merging of a PR, add /hold in a comment. To unblock merging, comment /hold cancel.
  • To mark a PR as approved, add /lgtm in a comment. To remove the approval, add /lgtm cancel.
    The lgtm label is removed on each new commit push.
  • To mark a PR as verified, comment /verified on the PR. To un-verify it, comment /verified cancel.
    The verified label is removed on each new commit push.
  • To cherry-pick a merged PR, comment /cherry-pick <target_branch_name> on the PR. If <target_branch_name> is valid
    and the current PR is merged, a cherry-picked PR is created and linked to the current PR.
  • To build and push an image to Quay, add /build-push-pr-image in a comment. This creates an image with the tag
    pr-<pr_number> in the Quay repository. The image tag is deleted when the PR is merged or closed.
Supported labels

{'/verified', '/cherry-pick', '/hold', '/build-push-pr-image', '/wip', '/lgtm'}

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7bd44c2 and dbf7470.

📒 Files selected for processing (4)
  • tests/fixtures/trustyai.py (2 hunks)
  • tests/llama_stack/eval/test_lmeval_provider.py (1 hunks)
  • tests/model_explainability/lm_eval/conftest.py (4 hunks)
  • utilities/trustyai_utils.py (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-06-05T10:05:17.642Z
Learnt from: adolfo-ab
PR: opendatahub-io/opendatahub-tests#334
File: tests/model_explainability/trustyai_service/test_trustyai_service.py:52-65
Timestamp: 2025-06-05T10:05:17.642Z
Learning: For TrustyAI image validation tests: operator image tests require admin_client, related_images_refs, and trustyai_operator_configmap fixtures, while service image tests would require different fixtures like trustyai_service_with_pvc_storage, model_namespace, and current_client_token.

Applied to files:

  • tests/fixtures/trustyai.py
🧬 Code graph analysis (4)
tests/llama_stack/eval/test_lmeval_provider.py (1)
tests/fixtures/trustyai.py (1)
  • patched_dsc_lmeval_allow_all (26-36)
utilities/trustyai_utils.py (1)
tests/fixtures/trustyai.py (1)
  • trustyai_operator_deployment (16-22)
tests/model_explainability/lm_eval/conftest.py (1)
tests/fixtures/trustyai.py (1)
  • patched_dsc_lmeval_allow_all (26-36)
tests/fixtures/trustyai.py (3)
utilities/infra.py (1)
  • get_data_science_cluster (865-866)
utilities/trustyai_utils.py (1)
  • patch_dsc_trustyai_lmeval_config (8-48)
tests/conftest.py (1)
  • admin_client (68-69)
🪛 Ruff (0.13.2)
tests/llama_stack/eval/test_lmeval_provider.py

55-55: Unused method argument: minio_pod

(ARG002)


55-55: Unused method argument: minio_data_connection

(ARG002)


55-55: Unused method argument: patched_dsc_lmeval_allow_all

(ARG002)

tests/model_explainability/lm_eval/conftest.py

30-30: Unused function argument: patched_dsc_lmeval_allow_all

(ARG001)


77-77: Unused function argument: patched_dsc_lmeval_allow_all

(ARG001)


107-107: Unused function argument: patched_dsc_lmeval_allow_all

(ARG001)

🔇 Additional comments (9)
tests/llama_stack/eval/test_lmeval_provider.py (1)

54-56: LGTM! Fixture parameter update aligns with DSC-based configuration.

The change from patched_trustyai_configmap_allow_online to patched_dsc_lmeval_allow_all correctly reflects the refactor to DataScienceCluster-based configuration. The static analysis warnings about unused arguments are false positives—pytest fixtures are used for setup/teardown side effects, not necessarily for direct reference in the test body.

utilities/trustyai_utils.py (4)

1-6: LGTM! Imports are appropriate.

The imports correctly bring in the necessary types and resources for patching the DataScienceCluster and managing the operator deployment.


8-13: LGTM! Well-designed function signature.

The signature uses safe defaults (deny by default) and clear typing. The Generator return type is appropriate for use in pytest fixtures with yield from.
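To illustrate why a Generator return type suits `yield from` delegation, here is a minimal sketch using only the standard library. The names `patch_config` and `fixture_like` are hypothetical stand-ins, not the helper or fixture from this PR; the point is only that setup and teardown in the inner generator still run when an outer generator delegates to it.

```python
from collections.abc import Generator

# Records the order of setup/teardown so the flow is observable.
events: list[str] = []


def patch_config(allow_online: bool = False) -> Generator[dict[str, bool], None, None]:
    """Hypothetical helper: set up, yield a config, always tear down."""
    events.append("setup")
    try:
        yield {"allow_online": allow_online}
    finally:
        events.append("teardown")


def fixture_like() -> Generator[dict[str, bool], None, None]:
    """Hypothetical wrapper that delegates to the helper, as a pytest fixture could."""
    yield from patch_config(allow_online=True)


gen = fixture_like()
config = next(gen)  # runs the helper's setup and returns the yielded value
gen.close()         # closes the delegate too, so its finally block runs
```

In a real fixture, pytest drives the generator instead of the explicit `next()` and `close()` calls shown here.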


26-43: LGTM! Patch structure is correct.

The ResourceEditor context manager correctly structures the patch for the DataScienceCluster's trustyai.eval.lmeval settings, and will automatically handle cleanup if an exception occurs.
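As a rough sketch of the nested patch body described above: the function name `build_lmeval_patch` is hypothetical, and the `"allow"` / `"deny"` string values are an assumption (the review only says the defaults deny). The key path follows the `spec.components.trustyai.eval.lmeval` path and the `permitCodeExecution` / `permitOnline` settings named elsewhere in this review.

```python
from typing import Any


def build_lmeval_patch(
    permit_code_execution: str = "deny",  # assumed value format, not confirmed by the PR
    permit_online: str = "deny",          # assumed value format, not confirmed by the PR
) -> dict[str, Any]:
    """Build a nested dict targeting spec.components.trustyai.eval.lmeval."""
    return {
        "spec": {
            "components": {
                "trustyai": {
                    "eval": {
                        "lmeval": {
                            "permitCodeExecution": permit_code_execution,
                            "permitOnline": permit_online,
                        }
                    }
                }
            }
        }
    }


allow_all_patch = build_lmeval_patch(permit_code_execution="allow", permit_online="allow")
```

A dict of this shape is what a patching utility would receive; how it is applied to the cluster is outside this sketch.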


44-48: Handle zero-initial replicas to guarantee restart
In utilities/trustyai_utils.py (lines 44–48), num_replicas may be 0, making the scale 0→0 a no-op. Add before scaling back up:

if num_replicas == 0:
    num_replicas = 1

Also confirm that wait_for_replicas() enforces a finite timeout and raises on failure.
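The zero-replica guard suggested above could be factored out roughly as follows. This is a sketch only: `replicas_to_restore` is a hypothetical name, and treating a missing replica count (`None`) the same as zero is an added assumption rather than something the review states.

```python
from typing import Optional


def replicas_to_restore(current_replicas: Optional[int], minimum: int = 1) -> int:
    """Return the replica count to scale back up to after a forced restart.

    Falls back to `minimum` when the deployment reported 0 (or no) replicas,
    so that scaling 0 -> 0 does not silently skip the restart.
    """
    if not current_replicas:
        return minimum
    return current_replicas
```

Whether forcing at least one replica is desirable depends on whether a zero-replica operator is ever a legitimate starting state for these tests.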

tests/model_explainability/lm_eval/conftest.py (1)

25-32: LGTM! Fixture parameter updates are consistent and correct.

All three fixtures (lmevaljob_hf, lmevaljob_local_offline, lmevaljob_vllm_emulator) have been consistently updated to use patched_dsc_lmeval_allow_all: DataScienceCluster, aligning with the refactor from ConfigMap-based to DataScienceCluster-based configuration.

The static analysis warnings about unused arguments are false positives—pytest fixtures ensure the DSC is properly patched before the LMEvalJob runs, even if the fixture object isn't directly referenced in the function body.

Also applies to: 72-79, 103-111

tests/fixtures/trustyai.py (3)

1-12: LGTM! Imports correctly support the DSC-based fixture.

The new imports for DataScienceCluster, Generator, and the utility functions (get_data_science_cluster, patch_dsc_trustyai_lmeval_config) properly support the refactored fixture.


15-22: LGTM! Deployment fixture with class scope is appropriate.

The trustyai_operator_deployment fixture correctly uses class scope, which is appropriate since the deployment configuration is shared across tests in a class and doesn't need per-test isolation.


25-36: Class-scoped patched_dsc_lmeval_allow_all is safe. Tests using it only read from the shared cluster and don’t mutate its state.

Comment thread utilities/trustyai_utils.py Outdated
Comment thread utilities/trustyai_utils.py Outdated
Co-authored-by: Adolfo Aguirrezabal <aaguirre@redhat.com>
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
utilities/trustyai_utils.py (1)

44-47: Consider adding a wait between scale operations.

The deployment scaling logic scales to 0 and immediately scales back without waiting for the pods to terminate. While this might work in practice, it could lead to race conditions where:

  • The operator pods haven't fully terminated before the scale-up
  • The operator might not properly pick up the DSC changes

Consider adding a wait after scaling down:

     num_replicas: int = trustyai_operator_deployment.replicas
     trustyai_operator_deployment.scale_replicas(replica_count=0)
+    trustyai_operator_deployment.wait_for_replicas(replica_count=0)
     trustyai_operator_deployment.scale_replicas(replica_count=num_replicas)
     trustyai_operator_deployment.wait_for_replicas()

This ensures pods fully terminate before scaling back up, giving the operator a clean restart to reconcile the DSC changes.
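The ordering proposed in this nitpick can be checked against a stand-in object. In the sketch below, `FakeDeployment` and `restart_deployment` are hypothetical; the method names and the `replica_count` keyword simply mirror the diff above and have not been verified against the real Deployment wrapper, whose signatures may differ.

```python
from typing import Optional


class FakeDeployment:
    """In-memory stand-in that records calls instead of talking to a cluster."""

    def __init__(self, replicas: int) -> None:
        self.replicas = replicas
        self.calls: list[tuple[str, Optional[int]]] = []

    def scale_replicas(self, replica_count: int) -> None:
        self.calls.append(("scale", replica_count))
        self.replicas = replica_count

    def wait_for_replicas(self, replica_count: Optional[int] = None) -> None:
        # A real implementation would block until the count is reached.
        self.calls.append(("wait", replica_count))


def restart_deployment(deployment: FakeDeployment) -> None:
    """Scale to zero, wait for termination, then restore the original count."""
    original = deployment.replicas
    deployment.scale_replicas(replica_count=0)
    deployment.wait_for_replicas(replica_count=0)
    deployment.scale_replicas(replica_count=original)
    deployment.wait_for_replicas()


deployment = FakeDeployment(replicas=2)
restart_deployment(deployment)
```

Recording the calls makes the intended sequence explicit: the wait on zero replicas sits between the scale-down and the scale-up.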

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between dbf7470 and ee8c5be.

📒 Files selected for processing (1)
  • utilities/trustyai_utils.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
utilities/trustyai_utils.py (1)
tests/fixtures/trustyai.py (1)
  • trustyai_operator_deployment (16-22)
🔇 Additional comments (3)
utilities/trustyai_utils.py (3)

1-5: LGTM!

All imports are necessary and properly used in the function implementation.


8-13: Function extraction is reasonable despite single usage.

Regarding the past comment questioning whether this file/function is needed since it's only used in patched_dsc_lmeval_allow_all: while it's true this is currently used in one place, extracting this logic into a utility function is a valid design choice that:

  • Separates fixture concerns from DSC patching logic
  • Improves testability of the patching logic
  • Enables easier reuse if additional tests need similar DSC lmeval configuration

The function signature is well-typed and clear.


26-48: Implementation approach is sound for a test fixture.

The use of ResourceEditor as a context manager ensures the DSC patch is automatically reverted after the test completes, which is the correct pattern for a test fixture. The deployment scaling inside the context ensures the operator reconciles the DSC changes while the patch is active.

The patch structure correctly targets spec.components.trustyai.eval.lmeval with the appropriate permitCodeExecution and permitOnline settings.
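The apply-then-revert behaviour described here can be sketched with a plain context manager over a dict. This is not ResourceEditor itself; `temporary_patch` and `deep_merge` are hypothetical helpers, and the `managementState` field in the sample data is an illustrative assumption.

```python
import copy
from collections.abc import Iterator
from contextlib import contextmanager
from typing import Any


def deep_merge(target: dict[str, Any], patch: dict[str, Any]) -> None:
    """Recursively merge `patch` into `target` in place."""
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            deep_merge(target[key], value)
        else:
            target[key] = copy.deepcopy(value)


@contextmanager
def temporary_patch(resource: dict[str, Any], patch: dict[str, Any]) -> Iterator[dict[str, Any]]:
    """Apply `patch` to `resource`, then restore the original on exit."""
    backup = copy.deepcopy(resource)
    deep_merge(resource, patch)
    try:
        yield resource
    finally:
        # Restore even if the body raised, mirroring fixture teardown.
        resource.clear()
        resource.update(backup)


dsc = {"spec": {"components": {"trustyai": {"managementState": "Managed"}}}}
lmeval_patch = {"spec": {"components": {"trustyai": {"eval": {"lmeval": {"permitOnline": "allow"}}}}}}

with temporary_patch(dsc, lmeval_patch) as patched:
    online_during = patched["spec"]["components"]["trustyai"]["eval"]["lmeval"]["permitOnline"]
```

After the `with` block exits, `dsc` no longer contains the `eval` key, which is the property the fixture relies on for cleanup.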

Comment thread utilities/trustyai_utils.py Outdated
@github-actions github-actions Bot added size/m and removed size/l labels Oct 2, 2025
@sheltoncyril sheltoncyril enabled auto-merge (squash) October 2, 2025 13:04
Contributor

@kpunwatk kpunwatk left a comment


/lgtm

@sheltoncyril sheltoncyril merged commit 640b72c into opendatahub-io:main Oct 2, 2025
10 checks passed
@github-actions

github-actions Bot commented Oct 2, 2025

Status of building tag latest: success.
Status of pushing tag latest to image registry: success.

@adolfo-ab
Contributor

/cherry-pick 2.25

rhods-ci-bot pushed a commit that referenced this pull request Oct 2, 2025
* refactor: replace TrustyAI ConfigMap patching with DataScienceCluster config

* Apply suggestion from @adolfo-ab

Co-authored-by: Adolfo Aguirrezabal <aaguirre@redhat.com>

* refactor: remove utils file and fn

* refactor: remove unused file and code

---------

Co-authored-by: Adolfo Aguirrezabal <aaguirre@redhat.com>
@rhods-ci-bot
Contributor

Cherry pick action created PR #673 successfully 🎉!
See: https://github.com/opendatahub-io/opendatahub-tests/actions/runs/18195029174

adolfo-ab added a commit that referenced this pull request Oct 2, 2025
)

* refactor: replace TrustyAI ConfigMap patching with DataScienceCluster config

* Apply suggestion from @adolfo-ab



* refactor: remove utils file and fn

* refactor: remove unused file and code

---------

Co-authored-by: Shelton Cyril <sheltoncyril@gmail.com>
Co-authored-by: Adolfo Aguirrezabal <aaguirre@redhat.com>
mwaykole pushed a commit to mwaykole/opendatahub-tests that referenced this pull request Jan 23, 2026
…hub-io#672)

* refactor: replace TrustyAI ConfigMap patching with DataScienceCluster config

* Apply suggestion from @adolfo-ab

Co-authored-by: Adolfo Aguirrezabal <aaguirre@redhat.com>

* refactor: remove utils file and fn

* refactor: remove unused file and code

---------

Co-authored-by: Adolfo Aguirrezabal <aaguirre@redhat.com>
5 participants