Commit ad2819e: Merge branch 'main' into fix-tls-on-fips
2 parents: cb75d9a + 7fd9561
File tree: 18 files changed, +632 −172 lines

AGENTS.md

Lines changed: 95 additions & 17 deletions
@@ -1,19 +1,97 @@
 # Overview
+
 This is a testing repo for OpenDataHub and OpenShift AI, which are MLOps platforms for OpenShift.
-The tests contained in the repo are high-level integration tests at the Kubernetes API level.
-
-# Documentation
-All the general information about the repo is contained in the /docs directory.
-At the start of each session, consider if you need to consult any of these files in order to answer:
-- [Guidelines for Getting Started](./docs/GETTING_STARTED.md)
-- [Developer Guide](./docs/DEVELOPER_GUIDE.md)
-- [Style Guide](./docs/STYLE_GUIDE.md)
-
-# Specific Instructions
-- Avoid unnecessary complexity: Aim for the simplest solution that works, while keeping the code clean.
-- Avoid obvious comments: Only add comments to explain especially complex code blocks.
-- Maintain code consistency: Follow existing code patterns and architecture.
-- Maintain locality of behavior: Keep code close to where it's used.
-- Make small, focused changes, unless explicitly asked otherwise.
-- Keep security in mind: Avoid filtering sensitive information and running destructive commands.
-- When in doubt about something, ask the user.
+The tests are high-level integration tests at the Kubernetes API level.
+
+You are an expert QE engineer writing maintainable pytest tests that other engineers can understand without deep domain knowledge.
+
+## Commands
+
+### Validation (run before committing)
+```bash
+# Run all pre-commit checks
+pre-commit run --all-files
+
+# Run tox (CI validation)
+tox
+```
+
+### Test Execution
+```bash
+# Collect tests without running (verify structure)
+uv run pytest --collect-only
+
+# Run specific marker
+uv run pytest -m smoke
+uv run pytest -m "model_serving and tier1"
+
+# Run with setup plan (debug fixtures)
+uv run pytest --setup-plan tests/model_serving/
+```
+
+## Project Structure
+
+```text
+tests/                   # Test modules by component
+├── conftest.py          # All shared fixtures
+├── <component>/         # Component test directories
+│   ├── conftest.py      # Component-scoped fixtures
+│   ├── test_*.py        # Test files
+│   └── utils.py         # Component-specific utility functions
+utilities/               # Shared utility functions
+└── <topic>_utils.py     # Topic-specific utility functions
+```
+
+## Essential Patterns
+
+### Tests
+- Every test MUST have a docstring explaining what it tests (see `tests/cluster_health/test_cluster_health.py`)
+- Apply relevant markers from `pytest.ini`: tier (`smoke`, `sanity`, `tier1`, `tier2`), component (`model_serving`, `model_registry`, `llama_stack`), infrastructure (`gpu`, `parallel`, `slow`)
+- Use Given-When-Then format in docstrings for behavioral clarity
+
+### Fixtures
+- Fixture names MUST be nouns: `storage_secret` not `create_secret`
+- Use context managers for resource lifecycle (see `tests/conftest.py:544-550` for pattern)
+- Fixtures do one thing only—compose them rather than nesting
+- Use narrowest scope that meets the need: function > class > module > session
+
+### Kubernetes Resources
+- Use [openshift-python-wrapper](https://github.com/RedHatQE/openshift-python-wrapper) for all K8s API calls
+- Resource lifecycle MUST use context managers to ensure cleanup
+- Use `oc` CLI only when wrapper is not relevant (e.g., must-gather)
+
+## Common Pitfalls
+
+- **ERROR vs FAILED**: Pytest reports fixture failures as ERROR, test failures as FAILED
+- **Heavy imports**: Don't import heavy resources at module level; defer to fixture scope
+- **Flaky tests**: Use `pytest.skip()` with `@pytest.mark.jira("PROJ-123")`, never delete
+- **Fixture scope**: Session fixtures in `tests/conftest.py` run once for entire suite—modify carefully
+
+## Boundaries
+
+### ✅ Always
+- Follow existing patterns before introducing new approaches
+- Add type annotations (mypy strict enforced)
+- Write Google-format docstrings for tests and fixtures
+- Run `pre-commit run --all-files` before suggesting changes
+
+### ⚠️ Ask First
+- Adding new dependencies to `pyproject.toml`
+- Creating new `conftest.py` files
+- Moving fixtures to shared locations
+- Adding new markers to `pytest.ini`
+- Modifying session-scoped fixtures
+
+### 🚫 Never
+- Remove or modify existing tests without explicit request
+- Add code that isn't immediately used (YAGNI)
+- Log secrets, tokens, or credentials
+- Skip pre-commit or type checking
+- Create abstractions for single-use code
+
+## Documentation Reference
+
+Consult these for detailed guidance:
+- [Constitution](./CONSTITUTION.md) - Non-negotiable principles (supersedes all other docs)
+- [Developer Guide](./docs/DEVELOPER_GUIDE.md) - Contribution workflow, fixture examples
+- [Style Guide](./docs/STYLE_GUIDE.md) - Naming, typing, docstrings
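The fixture and resource-lifecycle rules introduced in this AGENTS.md hunk can be sketched outside pytest with a plain context manager. The `storage_secret` name and the dict payload below are hypothetical stand-ins for the wrapper's resource classes, shown only to illustrate the guaranteed-teardown shape:

```python
from contextlib import contextmanager


@contextmanager
def storage_secret(name: str):
    """Yield a fake secret and always tear it down (hypothetical example).

    A real fixture in the repo would wrap an openshift-python-wrapper
    resource; the dict here only models the setup/teardown lifecycle.
    """
    secret = {"name": name, "created": True}
    try:
        yield secret
    finally:
        # Teardown runs even if the test body raises.
        secret["created"] = False
```

A noun-named fixture would then simply `with storage_secret(...) as s: yield s`, keeping setup and cleanup in one place.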

CONSTITUTION.md

Lines changed: 162 additions & 0 deletions
@@ -0,0 +1,162 @@
+# OpenDataHub-Tests Constitution
+
+This constitution defines the non-negotiable principles and governance rules for the opendatahub-tests repository. It applies to all test development, whether performed by humans or AI assistants.
+
+## Core Principles
+
+### I. Simplicity First
+
+All changes MUST favor the simplest solution that works. Complexity MUST be justified.
+
+- Aim for the simplest solution that works while keeping the code clean
+- Do not prepare code for the future just because it may be useful (YAGNI)
+- Every function, variable, fixture, and test written MUST be used, or else removed
+- Flexible code MUST NOT come at the expense of readability
+
+**Rationale**: The codebase is maintained by multiple teams; simplicity ensures maintainability and reduces cognitive load.
+
+### II. Code Consistency
+
+All changes MUST follow existing code patterns and architecture.
+
+- Follow the [Google Python Style Guide](https://google.github.io/styleguide/pyguide.html)
+- Use pre-commit hooks to enforce style (ruff, mypy, flake8)
+- Use absolute import paths; import specific functions rather than modules
+- Use descriptive names; meaningful names are better than short names
+- Add type annotations to all new code; follow the rules defined in [pyproject.toml](./pyproject.toml)
+
+**Rationale**: Consistent patterns reduce the learning curve and prevent architectural drift.
+
+### III. Test Clarity and Dependencies
+
+Each test MUST verify a single aspect of the product and may be dependent on other tests.
+
+- Tests MUST have a clear purpose and be easy to understand
+- Tests MUST be properly documented with docstrings explaining what the test does
+- When test dependencies exist, declare them explicitly with the pytest-dependency plugin's dependency marker(s) whenever possible
+- Group related tests in classes only when they share fixtures; never group unrelated tests
+
+### IV. Fixture Discipline
+
+Fixtures MUST do one thing only and follow proper scoping.
+
+- Fixture names MUST be nouns describing what they provide (e.g., `storage_secret` not `create_secret`)
+- Fixtures MUST handle setup and teardown using context managers where appropriate
+- Use the narrowest fixture scope that meets the need (function > class > module > session)
+- Conftest.py files MUST contain fixtures only; no utility functions or constants
+- Use `request.param` with dict structures for parameterized fixtures
+
+**Rationale**: Single-responsibility fixtures are easier to debug, reuse, and compose.
+
+### V. Interacting with Kubernetes Resources
+
+All cluster interactions MUST use openshift-python-wrapper or oc CLI.
+
+- Use [openshift-python-wrapper](https://github.com/RedHatQE/openshift-python-wrapper) for all K8s API calls
+- For missing resources, generate them using class_generator and contribute to wrapper
+- Use oc CLI only when wrapper is not relevant (e.g., must-gather generation)
+- Resource lifecycle MUST be managed via context managers to ensure cleanup
+
+**Rationale**: Consistent API abstraction ensures portability between ODH (upstream) and RHOAI (downstream).
+
+### VI. Locality of Behavior
+
+Keep code close to where it is used.
+
+- Keep functions and fixtures close to where they're used initially
+- Move to shared locations (utilities, common conftest) only when multiple modules need them
+- Avoid creating abstractions prematurely
+- Small, focused changes are preferred unless explicitly asked otherwise
+
+**Rationale**: Locality reduces navigation overhead and makes the impact of changes obvious.
+
+### VII. Security Awareness
+
+All code MUST consider security implications.
+
+- Never log/expose secrets; redact/mask if printing is unavoidable
+- Avoid running destructive commands without explicit user confirmation
+- Use detect-secrets and gitleaks pre-commit hooks to prevent secret leakage
+- Test code MUST NOT introduce vulnerabilities into the tested systems
+
+**Rationale**: Tests interact with production-like clusters; security lapses can have real consequences.
+
+## Test Development Standards
+
+### Test Documentation
+
+- Every test or test class MUST have a docstring explaining what it tests
+- Docstrings MUST be understandable by engineers from other components, managers, or PMs
+- Use Google-format docstrings
+- Comments are allowed only for complex code blocks (e.g., complex regex)
+
+### Test Markers
+
+- All tests MUST apply relevant markers from pytest.ini
+- Use tier markers (smoke, sanity, tier1, tier2) to indicate test priority
+- Use component markers (model_explainability, llama_stack, rag) for ownership
+- Use infrastructure markers (gpu, parallel, slow) for execution filtering
+
+### Test Organization
+
+- Tests are organized by component in `tests/<component>/`
+- Each component has its own conftest.py for scoped fixtures
+- Utilities go in `utilities/` with topic-specific modules
+
+## AI-Assisted Development Guidelines
+
+### Developer Responsibility
+
+Developers are ultimately responsible for all code, regardless of whether AI tools assisted.
+
+- Always assume AI-generated code is unsafe and incorrect until verified
+- Double-check all AI suggestions against project patterns and this constitution
+- AI tools MUST be guided by AGENTS.md (symlink to CLAUDE.md if needed)
+
+### AI Code Generation Rules
+
+- AI MUST follow existing patterns; never introduce new architectural concepts without justification
+- AI MUST NOT add unnecessary complexity or "helpful" abstractions
+- AI-generated tests MUST have proper docstrings and markers
+- AI MUST ask when in doubt about requirements or patterns
+
+### Specification-Driven Development
+
+When adopting AI-driven spec development:
+
+- Specifications MUST be in structured format (YAML/JSON with defined schema)
+- Tests MUST include requirement traceability (Polarion, Jira markers)
+- Docstrings MUST follow Given-When-Then pattern for behavioral clarity
+- Generated tests MUST pass pre-commit checks before review
+
+## Governance
+
+### Constitution Authority
+
+This constitution supersedes all other practices when there is a conflict. All PRs and reviews MUST verify compliance.
+
+### Amendment Process
+
+1. Propose changes via PR to `CONSTITUTION.md`
+2. Changes require review by at least two maintainers
+3. Breaking changes (principle removal/redefinition) require team discussion
+
+### Versioning Policy
+
+No versioning policy is enforced.
+
+### Compliance Review
+
+- All PRs MUST be verified against constitution principles
+- Pre-commit hooks enforce code quality standards
+- CI (tox) validates test structure and typing
+- Two reviewers required; verified label required before merge
+
+### Guidance Reference
+
+For development runtime guidance, consult:
+- [AGENTS.md](./AGENTS.md) for AI assistant instructions
+- [DEVELOPER_GUIDE.md](./docs/DEVELOPER_GUIDE.md) for contribution details
+- [STYLE_GUIDE.md](./docs/STYLE_GUIDE.md) for code style
+
+**Version**: 1.0.0 | **Ratified**: 2026-01-08 | **Last Amended**: 2026-01-08
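The constitution's fixture rules (noun names, `request.param` with dict structures, narrow scope) can be illustrated with a minimal sketch. The config names and the `replicas` field are hypothetical, invented only to show the pattern:

```python
import pytest

# Hypothetical parameter dicts; real tests would describe actual serving runtimes.
MODEL_CONFIGS = [
    {"name": "small-llm", "replicas": 1},
    {"name": "large-llm", "replicas": 3},
]


@pytest.fixture(params=MODEL_CONFIGS, ids=lambda cfg: cfg["name"])
def model_config(request):
    """Noun-named, function-scoped fixture handing each test one parameter dict."""
    return request.param


def test_replicas_positive(model_config):
    """Verify every configured runtime requests at least one replica.

    Given: a model configuration
    When: the replica count is read
    Then: it is a positive integer
    """
    assert model_config["replicas"] > 0
```

Because the parameters are dicts, adding a field later does not change the fixture's signature, which keeps parameterized fixtures stable as tests evolve.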

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -70,7 +70,7 @@ dependencies = [
     "marshmallow==3.26.2,<4", # this version is needed for pytest-jira
     "pytest-html>=4.1.1",
     "fire",
-    "llama_stack_client>=0.3.0,<0.4",
+    "llama_stack_client>=0.4.0,<0.5",
     "pytest-xdist==3.8.0",
     "dictdiffer>=0.9.0",
     "pytest>=9.0.0",

tests/llama_stack/conftest.py

Lines changed: 9 additions & 5 deletions
@@ -652,7 +652,7 @@ def llama_stack_models(unprivileged_llama_stack_client: LlamaStackClient) -> Mod
     """
     models = unprivileged_llama_stack_client.models.list()
 
-    model_id = next(m for m in models if m.api_model_type == "llm").identifier
+    model_id = next(m for m in models if m.custom_metadata["model_type"] == "llm").id
 
     # Ensure getting the right embedding model depending on the available providers
     providers = unprivileged_llama_stack_client.providers.list()
@@ -664,11 +664,15 @@ def llama_stack_models(unprivileged_llama_stack_client: LlamaStackClient) -> Mod
     else:
         raise ValueError("No embedding provider found")
 
-    embedding_model = next(m for m in models if m.api_model_type == "embedding" and m.provider_id == target_provider_id)
-    embedding_dimension = float(embedding_model.metadata["embedding_dimension"])
+    embedding_model = next(
+        m
+        for m in models
+        if m.custom_metadata["model_type"] == "embedding" and m.custom_metadata["provider_id"] == target_provider_id
+    )
+    embedding_dimension = int(embedding_model.custom_metadata["embedding_dimension"])
 
     LOGGER.info(f"Detected model: {model_id}")
-    LOGGER.info(f"Detected embedding_model: {embedding_model.identifier}")
+    LOGGER.info(f"Detected embedding_model: {embedding_model.id}")
     LOGGER.info(f"Detected embedding_dimension: {embedding_dimension}")
 
     return ModelInfo(model_id=model_id, embedding_model=embedding_model, embedding_dimension=embedding_dimension)
@@ -700,7 +704,7 @@ def vector_store(
     vector_store = unprivileged_llama_stack_client.vector_stores.create(
         name="test_vector_store",
         extra_body={
-            "embedding_model": llama_stack_models.embedding_model.identifier,
+            "embedding_model": llama_stack_models.embedding_model.id,
             "embedding_dimension": llama_stack_models.embedding_dimension,
             "provider_id": vector_io_provider,
         },
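The `next(...)` lookups in this fixture raise a bare `StopIteration` when no model matches, which surfaces as a confusing fixture ERROR. A defensive variant, sketched here with plain dicts standing in for client model objects, fails with a clearer message:

```python
from typing import Any


def first_model_of_type(models: list[dict[str, Any]], model_type: str) -> dict[str, Any]:
    """Return the first model whose custom_metadata marks it as model_type.

    Dicts stand in for llama_stack_client model objects here; the
    `custom_metadata` key mirrors the 0.4 client shape used in this diff.
    """
    try:
        return next(m for m in models if m["custom_metadata"]["model_type"] == model_type)
    except StopIteration:
        # Re-raise as ValueError so pytest reports a readable failure reason.
        raise ValueError(f"No model of type {model_type!r} found") from None
```

The same `try`/`except StopIteration` wrapping could be applied to the attribute-based lookups in the fixture above if clearer diagnostics are wanted.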

tests/llama_stack/constants.py

Lines changed: 1 addition & 1 deletion
@@ -26,7 +26,7 @@ class ModelInfo(NamedTuple):
 
     model_id: str
     embedding_model: Model
-    embedding_dimension: float  # API returns float (e.g., 768.0) despite being conceptually an integer
+    embedding_dimension: int  # API returns integer (e.g., 768)
 
 
 LLS_CORE_POD_FILTER: str = "app=llama-stack"
tests/llama_stack/inference/test_embeddings.py

Lines changed: 4 additions & 4 deletions
@@ -50,7 +50,7 @@ def test_inference_embeddings(
 
     # Embed single input text with encoding_format=float (the returned embedding item is a list of floats)
     embeddings_response = unprivileged_llama_stack_client.embeddings.create(
-        model=llama_stack_models.embedding_model.identifier,
+        model=llama_stack_models.embedding_model.id,
         input="The food was delicious and the waiter...",
         encoding_format="float",
     )
@@ -63,7 +63,7 @@ def test_inference_embeddings(
     # Embed single input text with encoding_format=base64 (the returned embedding item is
     # a single base64-encoded string)
     embeddings_response = unprivileged_llama_stack_client.embeddings.create(
-        model=llama_stack_models.embedding_model.identifier,
+        model=llama_stack_models.embedding_model.id,
         input="The food was delicious and the waiter...",
         encoding_format="base64",
     )
@@ -74,7 +74,7 @@ def test_inference_embeddings(
     # Embed multiple input sets with encoding_format=float (each returned embedding item is a list of floats)
     input_list = ["Input text 1", "Input text 1", "Input text 1"]
     embeddings_response = unprivileged_llama_stack_client.embeddings.create(
-        model=llama_stack_models.embedding_model.identifier, input=input_list, encoding_format="float"
+        model=llama_stack_models.embedding_model.id, input=input_list, encoding_format="float"
     )
     assert isinstance(embeddings_response, CreateEmbeddingsResponse)
     assert len(embeddings_response.data) == len(input_list)
@@ -86,7 +86,7 @@ def test_inference_embeddings(
     # Embed multiple input sets with base64 encoding format (each returned embedding a single base64-encoded string)
     input_list = ["Input text 1", "Input text 1", "Input text 1"]
     embeddings_response = unprivileged_llama_stack_client.embeddings.create(
-        model=llama_stack_models.embedding_model.identifier, input=input_list, encoding_format="base64"
+        model=llama_stack_models.embedding_model.id, input=input_list, encoding_format="base64"
    )
     assert isinstance(embeddings_response, CreateEmbeddingsResponse)
     assert len(embeddings_response.data) == len(input_list)