
Commit c02d473

Committed by: dbasunag, renovate[bot], jiridanek, rnetser, lugi0
Collect must-gather at the failure point (opendatahub-io#240)
* updates to test_registering_model() based on previous review comments
* [do-not-review] must-gather collection at failure point
  updates! 1176505 updates! 12d9c08 updates! 12d9c08 updates! 65e0213
* [ModelRegistry] ensure RunAsUser and RunAsGroup are not set explicitly (opendatahub-io#226)
  updates! 4813f2b updates! 20cd457 updates! b126825 updates! 809cca7
* Lock file maintenance (opendatahub-io#241)
  Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
* RHOAIENG-22058: chore(workbenches): add test_create_simple_notebook to smoke (opendatahub-io#238)
* Remove uv cache from dockerfile to support running in envs like openshift-ci (opendatahub-io#239)
* Create size-labeler.yml
* Delete .github/workflows/size-labeler.yml
* model mesh - add auth tests
* xx
* fix: remove uv cache from dockerfile
* `is_managed_cluster` fix condition (opendatahub-io#243)
* Create size-labeler.yml
* Delete .github/workflows/size-labeler.yml
* model mesh - add auth tests
* xx
* fix: replace iter with list
* fix: add logger info
* RHOAIENG-22057: fix(workbenches): correct the check for spawned workbench (opendatahub-io#242)
  There can only ever be a single workbench pod started.
  Co-authored-by: Luca Giorgi <lgiorgi@redhat.com>
* RHOAIENG-22057: fix(workbenches): check for internal image registry and adjust the image path accordingly (opendatahub-io#244)
* now yielding TimeoutSampler get_pods_by_isvc_label func output and handling raised ResourceNotFoundError (opendatahub-io#237)
  Signed-off-by: Brett Thompson <196701379+brettmthompson@users.noreply.github.com>
* [model server] add auth test to upgrade (opendatahub-io#245)
* Create size-labeler.yml
* Delete .github/workflows/size-labeler.yml
* model mesh - add auth tests
* xx
* feat: add auth test to upgrade
* feat: add auth test to upgrade
  feat: add auth test to upgrade
* fix: dsci name in func
* [pre-commit.ci] pre-commit autoupdate (opendatahub-io#246)
  updates:
  - [github.com/astral-sh/ruff-pre-commit: v0.11.4 → v0.11.5](astral-sh/ruff-pre-commit@v0.11.4...v0.11.5)
  - [github.com/gitleaks/gitleaks: v8.24.2 → v8.24.3](gitleaks/gitleaks@v8.24.2...v8.24.3)
  Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
  Co-authored-by: Ruth Netser <rnetser@redhat.com>
* Fix add-remove-labels workflow (opendatahub-io#249)
* Add Cluster sanity checks before test execution (opendatahub-io#235)
* Create size-labeler.yml
* Delete .github/workflows/size-labeler.yml
* model mesh - add auth tests
* xx
* feat: cluster sanity
* feat: cluster sanity
* feat: cluster sanity
* feat: cluster sanity add readme
* fix: tix str typo
* fix: address comments
* fix: address review comments
* fix: address comment
* fix: use dsci from global config
* fix: remove duplicate fixture
* add labeler to add labels to prs based on areas impacted (opendatahub-io#248)
* on rebase clean commented-by- labels (opendatahub-io#251)
* [model registry] update namespace code and rearrange tests (opendatahub-io#247)
* updates to test_registering_model() based on previous review comments
* update namespace code and rearrange tests
* remove unnecessary argument from function call (opendatahub-io#255)
* on rebase clean commented-by- labels
* remove unnecessary argument from function call
* feat: add ocp_interop marker (opendatahub-io#260)
* Lock file maintenance (opendatahub-io#259)
* Lock file maintenance
* fix: add marshmallow version
  ---------
  Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
  Co-authored-by: rnetser <rnetser@redhat.com>
* [pre-commit.ci] pre-commit autoupdate (opendatahub-io#263)
  updates:
  - [github.com/astral-sh/ruff-pre-commit: v0.11.5 → v0.11.6](astral-sh/ruff-pre-commit@v0.11.5...v0.11.6)
  Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
  Co-authored-by: Ruth Netser <rnetser@redhat.com>
* feat: add upgrade tests (opendatahub-io#258)
* Remove flake8 ignore list (opendatahub-io#265)
* fix: remove flake8 ignore
* fix: remove flake8 ignore
* [model server] Remove pod pre-checks for image pull and fix `TestServerlessScaleToZero` (opendatahub-io#256)
* fix: update tests
* fix: update tests
* fix: update tests
* fix: save test dep name
* fix: minio mm external route
* fix: address comemnt
* fix: address comemnt
* fix: address comemnt
* Update python-dependencies (major) (opendatahub-io#267)
* Update python-dependencies
* fix: marshmellow version
  ---------
  Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
  Co-authored-by: rnetser <rnetser@redhat.com>
* Adding Test For InferenceService Zero Initial Scale (opendatahub-io#262)
* adding test for zero initial scale
  Signed-off-by: Brett Thompson <196701379+brettmthompson@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* fixing precommit error
  Signed-off-by: Brett Thompson <196701379+brettmthompson@users.noreply.github.com>
* using label_selectors when getting deployment
  Signed-off-by: Brett Thompson <196701379+brettmthompson@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* adding argument names to func call and running pre-commit on all files
  Signed-off-by: Brett Thompson <196701379+brettmthompson@users.noreply.github.com>
* fixing bug in ovms_kserve_inference_service function that was preventing isvcs from being created with 0 min-replicas
  Signed-off-by: Brett Thompson <196701379+brettmthompson@users.noreply.github.com>
  ---------
  Signed-off-by: Brett Thompson <196701379+brettmthompson@users.noreply.github.com>
  Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* feat: move interop marker (opendatahub-io#268)
* feat: Add upgrade tests for TrustyAIService (opendatahub-io#250)
* feat: Add upgrade tests for TrustyAIService
* Move upgrade README.md to docs/UPGRADE.md
* fix: reuse kwargs in TrustyAIService fixture
* fix: address comments, reuse kwargs, add docstrings
  ---------
  Co-authored-by: Ruth Netser <rnetser@redhat.com>
* Fix ns deletion logic (opendatahub-io#272)
* fix: fix resource deletion fixture logic
* fix: fix resource deletion fixture logic
* feat: fail on missing operators (opendatahub-io#257)
* fix: update tests
* fix: update tests
* feat: fail on missing operators
* fix: rename to dependent
* fix: address comment
* fix: add log on failure
* fix: type in raise
* fix: remove MR check
* fix: remove MR check
* fix: use package scope
* Add basic InferenceGraph deployment check (opendatahub-io#233)
* Add basic InferenceGraph deployment check
  This adds a test that deploys an InferenceGraph (IG), sends an inference request to the IG and verifies that the request succeeds. The deployed InferenceGraph is based on the example in the KServe documentation at https://kserve.github.io/website/0.15/modelserving/inference_graph/image_pipeline/. The example was adapted to run in openvino (which is a supported server in ODH), rather than TorchServe.
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* Use cloud storage in InferenceGraph test
  Use cloud storage for the models, instead of OCI
* Feedback: Ruth
* Feedback: Ruth
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* Apply Ruth suggestions
  Acknowledgement to @rnester for these changes.
* More feedback: Ruth
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
  ---------
  Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
  Co-authored-by: Ruth Netser <rnetser@redhat.com>
* fix: address 503 (opendatahub-io#274)
* [model server] Move to using unprivileged_client in tests (opendatahub-io#273)
* feat: use unprivileged_client
* feat: use unprivileged_client
* feat: use unprivileged_client
* feat: use unprivileged_client
* feat: use unprivileged_client
* feat: use unprivileged_client
* fix: unpri selection
* Update MinIo pod privileges to run on ocp 4.19 (opendatahub-io#277)
* fix: add securityContext for minio pod
* fix: minio on 4.19
* [model server] add multi node args check (opendatahub-io#276)
* feat: add multi node args
* feat: add multi node args
* fix: add wait on delete
* fix: update new test
* [pre-commit.ci] pre-commit autoupdate (opendatahub-io#279)
  updates:
  - [github.com/astral-sh/ruff-pre-commit: v0.11.6 → v0.11.7](astral-sh/ruff-pre-commit@v0.11.6...v0.11.7)
  Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
  Co-authored-by: Ruth Netser <rnetser@redhat.com>
* `verify_no_failed_pods` - exclude container failures when model mesh deployment (opendatahub-io#278)
* fix: mm container
* fix: update condition
* feat: add test for incorrect DB TLS config in Trusty AI (opendatahub-io#221)
* feat: add test for incorrect DB TLS config in Trusty AI
* refactor: remove unused method from utils
* feat: move TrustyAI test to own file
* refactor: change name of db fixtures and deduplicate code
* TrustyAI Service creation code refactor into own method
* Move db secret setter to utils
* Remove test from test_fairness as test moved to own file
* docs: add description to TrustyAI invalid DB TLS config test
* fix: check TrustyAIService container for Terminated status in lastStatus
* fix: change name of terminal_state getter function
* fix: change to a valid certificate and check for service failure
* fix: address PR 221 reviewer feedback
* revert wait_for_pods to wait_for_mariadb_pods
* improve error checking logic
* remove un-necessary wrapper function
* docs: add docstring to create_trustyai_service method
* docs: add docstring to trustyai_service_with_invalid_db_cert
* fix: fix invalid return type for trustyai_db_ca_secret
* feat: use retry decorator in validate trustyai_service_db_conn_failure method
* fix: remove unnecessary return from validate db_conn_failure method
* docs: add spacing between lines of docstring
* refactor: create constants trustyai metrics and db storage config
* refactor: address reviewer feedback
  - change docstring to correct formatting
  - remove len(0) check
  - no templating for error text
* fix: use regex instead of in operator to check for error condition
* docs: add correct formatting to docstrings
* fix: use namespace.name instead of namespace in Pod.get
* fix: remove \s from regex to check for spaces
* refactor: add Raises section in docstring and use single string for pytest.fail
* feat: use raise instead of pytest.fail
  - create new exception TooManyPodsError
  - create new exception UnexpectedFailureError
  - replace pytest.fail with raise and handle exceptions in retry
  -
* fix: change default of teardown to True in TrustyAIService
* docs: correct typo in trustyai docstring
* docs: fix raises in docs and fix formatting
* fix: fix create_trustyai_service namespace args issue
* docs: add default for name arg in create tai svc func
* [model server] Fix runtime request.param name to use external route (opendatahub-io#280)
* fix: fix param name
* fix: fix param name
* feat: add certs when sending requests to TrustyAIService (opendatahub-io#266)
* Wait for pods to be in running state before attempting to create ModelRegistry (opendatahub-io#270)
* on rebase clean commented-by- labels
* Wait for pods to be in running state before attempting to create ModelRegistry
* Address Exception in thread Thread-1 (_monitor) error (opendatahub-io#286)
* chore(deps): lock file maintenance (opendatahub-io#287)
  Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
* [pre-commit.ci] pre-commit autoupdate (opendatahub-io#292)
  updates:
  - [github.com/astral-sh/ruff-pre-commit: v0.11.7 → v0.11.8](astral-sh/ruff-pre-commit@v0.11.7...v0.11.8)
  - [github.com/gitleaks/gitleaks: v8.24.3 → v8.25.1](gitleaks/gitleaks@v8.24.3...v8.25.1)
  Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* Wait for dsc and dsci ready state in cluster_sanity check (opendatahub-io#293)
* fix(workbenches): implement get_username for OpenShift <=4.14 (opendatahub-io#275)
  Turns out SelfSubjectReview is only available starting OpenShift 4.15.
  fixup incorporate User resource
  * RedHatQE/openshift-python-wrapper#2387
  fixup incorporate SelfSubjectReview resource
  * RedHatQE/openshift-python-wrapper#2389
  Co-authored-by: Debarati Basu-Nag <dbasunag@redhat.com>
* replace the bot account with one owned by testdevops (opendatahub-io#291)
* Fix for post upgarde operator check (opendatahub-io#297)
  Signed-off-by: Milind Waykole <mwaykole@mwaykole-thinkpadp1gen4i.bengluru.csb>
  Co-authored-by: Milind Waykole <mwaykole@mwaykole-thinkpadp1gen4i.bengluru.csb>
* Add test for Model Registry RBAC for SA token (opendatahub-io#296)
* feat: add RBAC test for SA token
  Signed-off-by: lugi0 <lgiorgi@redhat.com>
* fix: address review comments
  Signed-off-by: lugi0 <lgiorgi@redhat.com>
* fix: incorporate coderabbit suggestions
  Signed-off-by: lugi0 <lgiorgi@redhat.com>
* fix: remove unneeded variable
  Signed-off-by: lugi0 <lgiorgi@redhat.com>
* fix: remove excessive logs
  Signed-off-by: lugi0 <lgiorgi@redhat.com>
  ---------
  Signed-off-by: lugi0 <lgiorgi@redhat.com>
* Support /build-push-pr-image comment to push image to quay for testing via jenkins (opendatahub-io#290)
  updates! 678b389
* Add tests for model_artifact update validations (opendatahub-io#284)
* Add tests for model_artifact update validations
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
  ---------
  Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* updates fixing pre-commit
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* update package
* minor updates
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* address review comments
  updates! 50ec24b updates! f3a6c3e updates! 792156f updates! 399aa10 updates! 5080e3b updates! c34f4e7 updates! a1d7baa
  ---------
  Signed-off-by: Brett Thompson <196701379+brettmthompson@users.noreply.github.com>
  Signed-off-by: Milind Waykole <mwaykole@mwaykole-thinkpadp1gen4i.bengluru.csb>
  Signed-off-by: lugi0 <lgiorgi@redhat.com>
  Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
  Co-authored-by: Jiri Daněk <jdanek@redhat.com>
  Co-authored-by: Ruth Netser <rnetser@redhat.com>
  Co-authored-by: Luca Giorgi <lgiorgi@redhat.com>
  Co-authored-by: Brett Thompson <196701379+brettmthompson@users.noreply.github.com>
  Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
  Co-authored-by: Adolfo Aguirrezabal <aaguirre@redhat.com>
  Co-authored-by: Edgar Hernández <ehernand@redhat.com>
  Co-authored-by: Shelton Cyril <sheltoncyril@gmail.com>
  Co-authored-by: Milind Waykole <mwaykole@redhat.com>
  Co-authored-by: Milind Waykole <mwaykole@mwaykole-thinkpadp1gen4i.bengluru.csb>
1 parent e41c535 commit c02d473

File tree: 10 files changed (+1029, -571 lines)

.flake8

Lines changed: 1 addition & 0 deletions

```diff
@@ -19,6 +19,7 @@ fcn_exclude_functions =
     re,
     logging,
     LOGGER,
+    BASIC_LOGGER,
     os,
     json,
     pytest,
```

conftest.py

Lines changed: 98 additions & 7 deletions

```diff
@@ -2,14 +2,19 @@
 import os
 import pathlib
 import shutil
+import datetime
+import traceback

 import shortuuid
+from _pytest.runner import CallInfo
+from _pytest.reports import TestReport
 from pytest import (
     Parser,
     Session,
     FixtureRequest,
     FixtureDef,
     Item,
+    Collector,
     Config,
     CollectReport,
 )
@@ -18,8 +23,15 @@
 from pytest_testconfig import config as py_config

 from utilities.constants import KServeDeploymentType
+from utilities.database import Database
 from utilities.logger import separator, setup_logging
-
+from utilities.must_gather_collector import (
+    set_must_gather_collector_directory,
+    set_must_gather_collector_values,
+    get_must_gather_collector_dir,
+    collect_rhoai_must_gather,
+    get_base_dir,
+)

 LOGGER = logging.getLogger(name=__name__)
 BASIC_LOGGER = logging.getLogger(name="basic")
@@ -31,6 +43,7 @@ def pytest_addoption(parser: Parser) -> None:
     runtime_group = parser.getgroup(name="Runtime details")
     upgrade_group = parser.getgroup(name="Upgrade options")
     platform_group = parser.getgroup(name="Platform")
+    must_gather_group = parser.getgroup(name="MustGather")
     cluster_sanity_group = parser.getgroup(name="ClusterSanity")

     # AWS config and credentials options
@@ -118,6 +131,12 @@ def pytest_addoption(parser: Parser) -> None:
         "--applications-namespace",
         help="RHOAI/ODH applications namespace",
     )
+    must_gather_group.addoption(
+        "--collect-must-gather",
+        help="Indicate if must-gather should be collected on failure.",
+        action="store_true",
+        default=False,
+    )

     # Cluster sanity options
     cluster_sanity_group.addoption(
@@ -205,14 +224,22 @@ def _add_upgrade_test(_item: Item, _upgrade_deployment_modes: list[str]) -> bool


 def pytest_sessionstart(session: Session) -> None:
-    tests_log_file = session.config.getoption("log_file") or "pytest-tests.log"
+    log_file = session.config.getoption("log_file") or "pytest-tests.log"
+    tests_log_file = os.path.join(get_base_dir(), log_file)
+    LOGGER.info(f"Writing tests log to {tests_log_file}")
     if os.path.exists(tests_log_file):
         pathlib.Path(tests_log_file).unlink()
-
+    if session.config.getoption("--collect-must-gather"):
+        session.config.option.must_gather_db = Database()
     session.config.option.log_listener = setup_logging(
         log_file=tests_log_file,
         log_level=session.config.getoption("log_cli_level") or logging.INFO,
     )
+    must_gather_dict = set_must_gather_collector_values()
+    shutil.rmtree(
+        path=must_gather_dict["must_gather_base_directory"],
+        ignore_errors=True,
+    )


 def pytest_fixture_setup(fixturedef: FixtureDef[Any], request: FixtureRequest) -> None:
@@ -226,9 +253,23 @@ def pytest_runtest_setup(item: Item) -> None:
     2. Adds `fail_if_missing_dependent_operators` fixture for Serverless tests.
     3. Adds fixtures to enable KServe/model mesh in DSC for model server tests.
     """
-
     BASIC_LOGGER.info(f"\n{separator(symbol_='-', val=item.name)}")
     BASIC_LOGGER.info(f"{separator(symbol_='-', val='SETUP')}")
+    if item.config.getoption("--collect-must-gather"):
+        # set must-gather collection directory:
+        set_must_gather_collector_directory(item=item, directory_path=get_must_gather_collector_dir())
+
+        # At the beginning of setup work, insert current epoch time into the database to indicate test
+        # start time
+
+        try:
+            db = item.config.option.must_gather_db
+            db.insert_test_start_time(
+                test_name=f"{item.fspath}::{item.name}",
+                start_time=int(datetime.datetime.now().timestamp()),
+            )
+        except Exception as db_exception:
+            LOGGER.error(f"Database error: {db_exception}. Must-gather collection may not be accurate")

     if KServeDeploymentType.SERVERLESS.lower() in item.keywords:
         item.fixturenames.insert(0, "fail_if_missing_dependent_operators")
@@ -252,6 +293,10 @@ def pytest_runtest_call(item: Item) -> None:

 def pytest_runtest_teardown(item: Item) -> None:
     BASIC_LOGGER.info(f"{separator(symbol_='-', val='TEARDOWN')}")
+    # reset must-gather collector after each test
+    py_config["must_gather_collector"]["collector_directory"] = py_config["must_gather_collector"][
+        "must_gather_base_directory"
+    ]


 def pytest_report_teststatus(report: CollectReport, config: Config) -> None:
@@ -276,10 +321,56 @@ def pytest_sessionfinish(session: Session, exitstatus: int) -> None:
     session.config.option.log_listener.stop()
     if session.config.option.setupplan or session.config.option.collectonly:
         return
-    base_dir = py_config["tmp_base_dir"]
-    LOGGER.info(f"Deleting pytest base dir {base_dir}")
-    shutil.rmtree(path=base_dir, ignore_errors=True)
+    if session.config.getoption("--collect-must-gather"):
+        db = session.config.option.must_gather_db
+        file_path = db.database_file_path
+        LOGGER.info(f"Removing database file path {file_path}")
+        if os.path.exists(file_path):
+            os.remove(file_path)
+        # clean up the empty folders
+        collector_directory = py_config["must_gather_collector"]["must_gather_base_directory"]
+        if os.path.exists(collector_directory):
+            for root, dirs, files in os.walk(collector_directory, topdown=False):
+                for _dir in dirs:
+                    dir_path = os.path.join(root, _dir)
+                    if not os.listdir(dir_path):
+                        shutil.rmtree(path=dir_path, ignore_errors=True)
+    LOGGER.info(f"Deleting pytest base dir {session.config.option.basetemp}")
+    shutil.rmtree(path=session.config.option.basetemp, ignore_errors=True)

     reporter: Optional[TerminalReporter] = session.config.pluginmanager.get_plugin("terminalreporter")
     if reporter:
         reporter.summary_stats()
+
+
+def calculate_must_gather_timer(test_start_time: int) -> int:
+    default_duration = 300
+    if test_start_time > 0:
+        duration = int(datetime.datetime.now().timestamp()) - test_start_time
+        return duration if duration > 60 else default_duration
+    else:
+        LOGGER.warning(f"Could not get start time of test. Collecting must-gather for last {default_duration}s")
+        return default_duration
+
+
+def pytest_exception_interact(node: Item | Collector, call: CallInfo[Any], report: TestReport | CollectReport) -> None:
+    LOGGER.error(report.longreprtext)
+    if node.config.getoption("--collect-must-gather"):
+        test_name = f"{node.fspath}::{node.name}"
+        LOGGER.info(f"Must-gather collection is enabled for {test_name}.")
+
+        try:
+            db = node.config.option.must_gather_db
+            test_start_time = db.get_test_start_time(test_name=test_name)
+        except Exception as db_exception:
+            test_start_time = 0
+            LOGGER.warning(f"Error: {db_exception} in accessing database.")
+
+        try:
+            collect_rhoai_must_gather(
+                since=calculate_must_gather_timer(test_start_time=test_start_time),
+                target_dir=os.path.join(get_must_gather_collector_dir(), "pytest_exception_interact"),
+            )
+        except Exception as current_exception:
+            LOGGER.warning(f"Failed to collect logs: {test_name}: {current_exception} {traceback.format_exc()}")
```
pyproject.toml

Lines changed: 1 addition & 0 deletions

```diff
@@ -65,6 +65,7 @@ dependencies = [
     "jira>=3.8.0",
     "openshift-python-wrapper>=11.0.50",
     "semver>=3.0.4",
+    "sqlalchemy>=2.0.40",
     "pytest-order>=1.3.0",
     "marshmallow==3.26.1,<4", # this version is needed for pytest-jira
 ]
```

tests/global_config.py

Lines changed: 1 addition & 1 deletion

```diff
@@ -3,9 +3,9 @@
 distribution: str = "downstream"
 applications_namespace: str = "redhat-ods-applications"  # overwritten in conftest.py if distribution is upstream
 dsc_name: str = "default-dsc"
+must_gather_base_dir: str = "must-gather-base-dir"
 dsci_name: str = "default-dsci"
 dependent_operators: str = "servicemeshoperator,authorino-operator,serverless-operator"
-
 use_unprivileged_client: bool = True

 for _dir in dir():
```

utilities/constants.py

Lines changed: 2 additions & 0 deletions

```diff
@@ -272,3 +272,5 @@ class RunTimeConfig:
     },
     "commands": {"GRPC": "vllm_tgis_adapter"},
 }
+
+RHOAI_OPERATOR_NAMESPACE = "redhat-ods-operator"
```

utilities/database.py

Lines changed: 53 additions & 0 deletions (new file)

```python
import logging
import os

from sqlalchemy import Integer, String, create_engine
from sqlalchemy.orm import Mapped, Session, mapped_column
from sqlalchemy.orm import DeclarativeBase
from utilities.must_gather_collector import get_base_dir

LOGGER = logging.getLogger(__name__)

TEST_DB = "opendatahub-tests.db"


class Base(DeclarativeBase):
    pass


class OpenDataHubTestTable(Base):
    __tablename__ = "OpenDataHubTestTable"

    id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True, nullable=False)
    test_name: Mapped[str] = mapped_column(String(500))
    start_time: Mapped[int] = mapped_column(Integer, nullable=False)


class Database:
    def __init__(self, database_file_name: str = TEST_DB, verbose: bool = True) -> None:
        self.database_file_path = os.path.join(get_base_dir(), database_file_name)
        self.connection_string = f"sqlite:///{self.database_file_path}"
        self.verbose = verbose
        self.engine = create_engine(url=self.connection_string, echo=self.verbose)
        Base.metadata.create_all(bind=self.engine)

    def insert_test_start_time(self, test_name: str, start_time: int) -> None:
        with Session(bind=self.engine) as db_session:
            new_table_entry = OpenDataHubTestTable(test_name=test_name, start_time=start_time)
            db_session.add(new_table_entry)
            db_session.commit()

    def get_test_start_time(self, test_name: str) -> int:
        with Session(bind=self.engine) as db_session:
            result_row = (
                db_session.query(OpenDataHubTestTable)
                .with_entities(OpenDataHubTestTable.start_time)
                .filter_by(test_name=test_name)
                .first()
            )
            if result_row:
                start_time_value = result_row[0]
            else:
                start_time_value = 0
                LOGGER.warning(f"No test found with name: {test_name}")
            return start_time_value
```

utilities/exceptions.py

Lines changed: 6 additions & 0 deletions

```diff
@@ -96,6 +96,12 @@ def __str__(self) -> str:
         return f"Failed to log in as user {self.user}."


+class InvalidArgumentsError(Exception):
+    """Raised when mutually exclusive or invalid argument combinations are passed."""
+
+    pass
+
+
 class ResourceNotReadyError(Exception):
     pass
```

utilities/infra.py

Lines changed: 87 additions & 2 deletions

```diff
@@ -1,9 +1,13 @@
+import base64
 import json
+import os
 import re
 import shlex
+import tempfile
 from contextlib import contextmanager
 from functools import cache
-from typing import Any, Callable, Generator, Optional, Set
+from typing import Any, Generator, Optional, Set, Callable
+from json import JSONDecodeError

 import kubernetes
 import pytest
@@ -45,7 +49,8 @@
 from semver import Version
 from simple_logger.logger import get_logger

-from utilities.constants import ApiGroups, Labels, Timeout
+from ocp_resources.subscription import Subscription
+from utilities.constants import ApiGroups, Labels, Timeout, RHOAI_OPERATOR_NAMESPACE
 from utilities.constants import KServeDeploymentType
 from utilities.constants import Annotations
 from utilities.exceptions import (
@@ -851,6 +856,36 @@ def wait_for_isvc_pods(client: DynamicClient, isvc: InferenceService, runtime_na
     return get_pods_by_isvc_label(client=client, isvc=isvc, runtime_name=runtime_name)


+def get_rhods_subscription() -> Subscription | None:
+    subscriptions = Subscription.get(dyn_client=get_client(), namespace=RHOAI_OPERATOR_NAMESPACE)
+    if subscriptions:
+        for subscription in subscriptions:
+            LOGGER.info(f"Checking subscription {subscription.name}")
+            if subscription.name.startswith(tuple(["rhods-operator", "rhoai-operator"])):
+                return subscription
+
+    LOGGER.warning("No RHOAI subscription found. Potentially ODH cluster")
+    return None
+
+
+def get_rhods_operator_installed_csv() -> ClusterServiceVersion | None:
+    subscription = get_rhods_subscription()
+    if subscription:
+        csv_name = subscription.instance.status.installedCSV
+        LOGGER.info(f"Expected CSV: {csv_name}")
+        return ClusterServiceVersion(name=csv_name, namespace=RHOAI_OPERATOR_NAMESPACE, ensure_exists=True)
+    return None
+
+
+def get_rhods_csv_version() -> Version | None:
+    rhoai_csv = get_rhods_operator_installed_csv()
+    if rhoai_csv:
+        LOGGER.info(f"RHOAI CSV version: {rhoai_csv.instance.spec.version}")
+        return Version.parse(version=rhoai_csv.instance.spec.version)
+    LOGGER.warning("No RHOAI CSV found. Potentially ODH cluster")
+    return None
+
+
 @retry(
     wait_timeout=120,
     sleep=5,
@@ -930,3 +965,53 @@ def verify_cluster_sanity(

     # TODO: Write to file to easily report the failure in jenkins
     pytest.exit(reason=error_msg, returncode=return_code)
+
+
+def get_openshift_pull_secret(client: DynamicClient = None) -> Secret:
+    openshift_config_namespace = "openshift-config"
+    pull_secret_name = "pull-secret"  # pragma: allowlist secret
+    secret = Secret(
+        client=client or get_client(),
+        name=pull_secret_name,
+        namespace=openshift_config_namespace,
+    )
+    assert secret.exists, f"Pull-secret {pull_secret_name} not found in namespace {openshift_config_namespace}"
+    return secret
+
+
+def generate_openshift_pull_secret_file(client: DynamicClient = None) -> str:
+    pull_secret = get_openshift_pull_secret(client=client)
+    pull_secret_path = tempfile.mkdtemp(suffix="odh-pull-secret")
+    json_file = os.path.join(pull_secret_path, "pull-secrets.json")
+    secret = base64.b64decode(pull_secret.instance.data[".dockerconfigjson"]).decode(encoding="utf-8")
+    with open(file=json_file, mode="w") as outfile:
+        outfile.write(secret)
+    return json_file
+
+
+def get_oc_image_info(
+    image: str,
+    architecture: str,
+    pull_secret: str | None = None,
+) -> Any:
+    def _get_image_json(cmd: str) -> Any:
+        return json.loads(run_command(command=shlex.split(cmd), check=False)[1])
+
+    base_command = f"oc image -o json info {image} --filter-by-os {architecture}"
+    if pull_secret:
+        base_command = f"{base_command} --registry-config={pull_secret}"
+
+    sample = None
+    try:
+        for sample in TimeoutSampler(
+            wait_timeout=10,
+            sleep=5,
+            exceptions_dict={JSONDecodeError: [], TypeError: []},
+            func=_get_image_json,
+            cmd=base_command,
+        ):
+            if sample:
+                return sample
+    except TimeoutExpiredError:
+        LOGGER.error(f"Failed to parse {base_command}")
+        raise
```
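`get_oc_image_info` above polls `oc image info` until its output parses as JSON, swallowing `JSONDecodeError`/`TypeError` between attempts. The same poll-until-valid-JSON pattern can be sketched without a cluster (a generic stand-in for `TimeoutSampler`; the simulated command below fails twice before producing valid output):

```python
import json
import time
from typing import Any, Callable


def sample_until_json(func: Callable[[], str], wait_timeout: float = 10, sleep: float = 0.01) -> Any:
    """Poll func until its output parses as non-empty JSON, mirroring the
    TimeoutSampler usage in get_oc_image_info (illustrative stand-in only)."""
    deadline = time.monotonic() + wait_timeout
    while time.monotonic() < deadline:
        try:
            result = json.loads(func())
            if result:
                return result
        except (json.JSONDecodeError, TypeError):
            pass  # registry not ready yet or partial output: retry
        time.sleep(sleep)
    raise TimeoutError(f"No valid JSON within {wait_timeout}s")


# Simulated command output: empty, then garbage, then valid JSON.
attempts = iter(["", "not json", '{"digest": "sha256:abc"}'])
info = sample_until_json(func=lambda: next(attempts))
```

Retrying on parse errors rather than checking the process exit code tolerates transient registry hiccups where `oc` exits successfully but emits truncated output.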
