Skip to content

fix: enable positive_data_acceptance fuzz checks via spec tightening and test hooks#1720

Closed
lugi0 wants to merge 2 commits into
opendatahub-io:mainfrom
lugi0:fix/rhoaieng-58824-spec-tightening
Closed

fix: enable positive_data_acceptance fuzz checks via spec tightening and test hooks#1720
lugi0 wants to merge 2 commits into
opendatahub-io:mainfrom
lugi0:fix/rhoaieng-58824-spec-tightening

Conversation

@lugi0
Copy link
Copy Markdown

@lugi0 lugi0 commented Apr 24, 2026

Summary

Resolves RHOAIENG-58824: Schemathesis's positive_data_acceptance check generates schema-conformant requests and expects 2xx responses, but 53 tests fail with RejectedPositiveData. This PR closes the spec-server contract gaps and adds fuzz test hooks for server-side limitations that cannot be expressed in OpenAPI 3.0.

Result: 53 → 0 stateless fuzz test failures.

Root causes found and addressed

Spec tightening (37 failures)

Root cause Failures Fix
Path parameter IDs defined as type: string with no constraints; server requires numeric int32 19 Added format: int64, pattern: "^[1-9][0-9]{0,8}$" to all ID path params
pageSize query param defined as type: string; server parses as int32 8 Changed to type: integer, format: int32, minimum: 1, maximum: 2147483647
Singleton GET endpoints mark all params optional; server requires externalId OR name+parentResourceId 6 Schemathesis hook enforces valid param combinations
filterQuery allows any string; server lexer only accepts ASCII grammar 3 Added pattern: "^[\x20-\x7E]*$"
nextPageToken has no format constraint; server expects base64 1 Improved description (opaque cursor)

Spec + hook fixes (14 failures)

Root cause Failures Fix
Required string fields accept empty string ""; server's IsZeroValue() treats it as missing 10 Added minLength: 1 to required string fields in Create schemas
Catalog server missing validation middleware for null bytes 3 Added middleware.ValidationMiddleware() to catalog/cmd/catalog.go
Artifact oneOf discriminator not enforced 1 Hook randomizes across valid artifact types for POST; ensures required fields per type

Additional issues found during re-verification

Issue Fix
MetadataIntValue spec says format: int64 but server uses int32 (StringToInt32) Changed to format: int32, pattern: "^-?[0-9]{1,9}$"
MetadataValue types allow fuzz-generated extra properties Added additionalProperties: false to all 6 MetadataValue types
MetadataProtoValue not supported by EmbedMD converter (no case in switch) Hook replaces ProtoValue with StringValue
MetadataStructValue base64 round-trip bug — write decodes, read doesn't re-encode; PATCHing resources with stored StructValues crashes Hook replaces StructValue with StringValue (see server bug section for proposed fix)
Non-ASCII/surrogate characters in request bodies break Go JSON parser Hook strips null bytes and surrogates via encode("utf-8", errors="ignore")
Non-ASCII characters in customProperties keys Hook sanitizes keys to ASCII-only
Backslash/quotes in name/externalId params break server's internal filter query construction Hook strips \, ', " from these params
Go strict JSON decoder rejects extra properties not in struct Per-path property whitelist derived from spec, applied in map_body hook
metadataType field used example instead of enum Changed to single-value enum for proper discriminator enforcement
Generated spec files (catalog.yaml, model-registry.yaml) were stale Properly regenerated via scripts/merge_openapi.sh
Metric step is int64 in Go struct but timestamp is string — spec says both are type: string, format: int64 Hook sets step as integer, timestamp as string
Artifact artifactType is immutable but included in update schema without readOnly Hook GETs existing artifact to match type on PATCH; randomizes type on POST for coverage
Schemathesis ignores minLength/format inherited through allOf Hook replaces empty name with random value; validates numeric-string fields (IDs, timestamps)
BaseResource.name missing minLength: 1 Added at source so all Create schemas inherit it

make test-fuzz reliability fixes

Issue Fix
Port-forwards start in background with no readiness check; tests fail on connection refused Added curl retry loops (30s timeout) for model-registry (:8080), minio (:9000), local registry (:5001)
$STATUS variable never set in test-fuzz; target always exits 0 Combined shell commands to properly capture and propagate exit codes from both stateless and stateful runs
Prior fuzz runs leave corrupted customProperties in DB; PATCH tests fail loading existing resources Added scripts/cleanup.sh before both stateless and stateful test runs
No stateful fuzz tests for the catalog API Added test_catalog_stateful.py and included in make test-fuzz
Stateful test fails non-deterministically with Unsatisfiable after thousands of successful steps Added @pytest.mark.flaky(reruns=2) — caused by tight spec constraints + allOf making data generation borderline

Design decisions

Per-path property whitelist vs additionalProperties: false: The Go server uses strict JSON decoding (DisallowUnknownFields), rejecting any property not in the struct. OpenAPI 3.0's allOf + additionalProperties: false is fundamentally broken — it evaluates per-subschema, so properties valid in the composite are rejected as "additional" in the base. Instead of fighting the spec, the map_body hook maintains a per-path whitelist (_PATH_PROPERTIES) derived from the spec, stripping fuzz-generated extra properties before they reach the server.

Hooks vs spec changes: Some server behaviors cannot be expressed in OpenAPI 3.0 (parameter dependencies, discriminator enforcement, strict decoding). These are handled via Schemathesis hooks in conftest.py rather than incorrect spec annotations.

⚠️ ID format: int64 in spec vs int32 in server — needs discussion

This PR annotates path parameter IDs with format: int64 and pattern: "^[1-9][0-9]{0,8}$". The pattern safely constrains values to the int32 range (max 999,999,999 < 2,147,483,647), so no functional issues arise. However, the server actually validates all path IDs as int32, not int64:

  • internal/apiutils/api_utils.go:42ValidateIDAsInt32() calls strconv.ParseInt(id, 10, 32).
  • Every path ID handler (registered_model.go, model_version.go, experiment.go, inference_service.go, serving_environment.go, artifact.go, serve_model.go, experiment_run.go) uses ValidateIDAsInt32, never ValidateIDAsInt64.
  • Meanwhile, the response body BaseResource.id at common.yaml:98 is already declared as format: int64 — this pre-dates this PR and is the existing convention.

We chose format: int64 for path params to stay consistent with the existing response body id field. But this creates a documented-vs-actual gap:

Layer Declared format Actual behavior
Response body BaseResource.id int64 (pre-existing) Server generates int32-range IDs
Path parameters (this PR) int64 (matches response body) Server validates as int32
MetadataIntValue.int_value (this PR) int32 (changed from int64) Server uses StringToInt32()

Options for follow-up discussion:

  1. Keep int64 everywhere (current state) — consistent spec, pattern constrains to safe range, but spec claims to support larger IDs than the server accepts
  2. Change path params to int32 — accurate to server, but inconsistent with response body id which still says int64
  3. Change everything to int32 — accurate to server across the board, but this is a larger API contract change that could affect generated clients
  4. Fix the server to use int64 — the most forward-looking fix; the database (MySQL/Postgres) likely supports int64 already, and ValidateIDAsInt32 could be changed to ValidateIDAsInt64

Note: ValidateIDAsInt64 already exists in api_utils.go:66 — it's just never called for path parameters.

Server bugs identified (not fixed, worked around)

These are documented for separate follow-up:

  1. MetadataProtoValue not supported — EmbedMD converter (openapi_embedmd_converter_util.go:42-78) has no case for ProtoValue in its switch statement
  2. MetadataStructValue base64 round-trip bug — the write converter (openapi_embedmd_converter_util.go:61) base64-decodes struct_value, but the read converter (embdemd_openapi_converter_util.go:47) returns raw JSON without re-encoding. Any resource with a StructValue becomes un-PATCHable because the server re-processes stored customProperties through the write converter on PATCH. Proposed one-line fix: change line 47 from NewMetadataStructValue(string(asJSON)) to NewMetadataStructValue(base64.StdEncoding.EncodeToString(asJSON)) — identical to what the ByteValue read path already does at line 60-61
  3. name/externalId filter injection — server constructs internal filter queries like name = '<value>' without escaping; backslash/quotes break the participle lexer
  4. Non-ASCII in customProperties keys — Go JSON decoder fails on surrogate pairs and non-UTF-8 characters in property key names
  5. MetadataIntValue int32/int64 mismatch — spec said int64, server uses StringToInt32() (fixed in this PR for int_value; see also ID format discussion above)
  6. All path ID validation uses int32 (ValidateIDAsInt32) despite spec declaring int64 (see section above)
  7. Metric step vs timestamp type inconsistency — both are type: string, format: int64 in spec, but Go struct has step as int64 (JSON number) and timestamp as string (JSON string)
  8. Artifact artifactType immutable but in update schema — server rejects type changes on PATCH, and infers "unknown" when field is omitted; update schema should either exclude artifactType or mark it readOnly

Test plan

  • make openapi/validate — both specs pass
  • Stateless fuzz tests: 86 passed, 0 SUBFAILED, 63 subtests passed
  • Stateful fuzz test (model-registry): passes; @flaky(reruns=2) for occasional Unsatisfiable (see known limitation below)
  • Stateful fuzz test (catalog): added — test_catalog_stateful.py
  • make test-fuzz from scratch (kind cluster creation → deploy → test): no timing failures
  • Reviewer ran cd clients/python && make test-fuzz and stateless/stateful tests directly — confirmed green

⚠️ Known limitation: stateful test Unsatisfiable — needs further investigation

The model-registry stateful test (test_mr_api_stateful) can non-deterministically fail with hypothesis.errors.Unsatisfiable even after thousands of successful API calls. This happens when Hypothesis's state machine exhausts 1000 attempts to find a valid next transition. The failure is mitigated with @pytest.mark.flaky(reruns=2) but can still occur.

What we know:

  • The failure is non-deterministic — depends on the random seed. Some seeds produce runs that complete successfully; others reach a state where all generated transitions are filtered out.
  • It occurs AFTER extensive successful exploration (often 5000+ successful API calls), not at startup.
  • It is NOT caused by server errors — the state machine simply can't generate valid data for its next step.
  • It became more likely after tightening the OpenAPI spec constraints (patterns, minLength, format), because Schemathesis struggles to generate conformant data through allOf composition.

Root cause hypothesis:
Schemathesis does not fully enforce constraints inherited through allOf during data generation. When the spec has tight constraints (e.g., pattern: "^[1-9][0-9]{0,8}$" on IDs, minLength: 1 on names), Hypothesis generates many invalid values that get filtered, eventually hitting the 1000-filter hard limit. The hooks fix values AFTER generation, but Hypothesis's internal filtering happens BEFORE the hooks run.

Possible directions for investigation:

  1. Schemathesis before_generate_body hook — could constrain generation strategies at the source rather than fixing values after the fact, reducing the filter rejection rate
  2. Custom Hypothesis strategies — register custom strategies for string fields with format/pattern constraints so Hypothesis generates valid values directly
  3. Increase Hypothesis filter tolerance — the 1000-attempt Unsatisfiable threshold is hardcoded in Hypothesis; a custom wrapper could retry with a different seed
  4. Reduce spec constraint strictness — loosen some patterns (e.g., "^[0-9]+$" instead of "^[1-9][0-9]{0,8}$") to give Hypothesis more room, at the cost of less precise spec documentation

--hypothesis-show-statistics verbosity

The make test-fuzz target passes --hypothesis-show-statistics to both stateless and stateful pytest runs. This produces extensive per-test statistics output that is useful for debugging Hypothesis generation issues but noisy for routine runs. Consider removing it from the default target and keeping it as an opt-in flag (e.g., make test-fuzz STATS=1).

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

  • New Features

    • Added request validation middleware to the API server.
  • Bug Fixes

    • Enhanced API robustness with stricter input validation for resource identifiers and query parameters.
    • Improved deployment reliability with automatic health checks for service readiness.
    • Strengthened test framework with improved data sanitization and validation.

lugi0 and others added 2 commits April 24, 2026 10:00
…cceptance checks

Resolves RHOAIENG-58824: Schemathesis API rejects valid requests that
conform to the OpenAPI spec. Reduces stateless fuzz test failures from
53 to 0 by closing spec-server contract gaps and adding test hooks for
server-side limitations that cannot be expressed in OpenAPI 3.0.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…verage

Addresses issues found during stateful fuzz test stabilization:

- Replace MetadataStructValue with StringValue in hooks (server base64
  round-trip bug makes resources with StructValue un-PATCHable)
- Randomize artifact types on POST for coverage instead of hardcoding
  doc-artifact; ensure per-type required fields and correct value types
- GET existing artifact on PATCH to match immutable artifactType
- Validate numeric-string fields (IDs, timestamps) that Schemathesis
  fills with arbitrary Unicode despite format constraints in allOf
- Replace empty name with random value (Schemathesis ignores minLength
  through allOf); add minLength: 1 to BaseResource.name at source
- Add catalog stateful test and include in make test-fuzz
- Add @pytest.mark.flaky(reruns=2) for non-deterministic Unsatisfiable

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@openshift-ci openshift-ci Bot requested review from dbasunag and fege April 24, 2026 11:02
@openshift-ci
Copy link
Copy Markdown

openshift-ci Bot commented Apr 24, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: lugi0
Once this PR has been reviewed and has the lgtm label, please assign rareddy for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 24, 2026

📝 Walkthrough

Walkthrough

This pull request tightens OpenAPI schema validation across three catalog and model-registry API specifications by enforcing stricter constraints on resource identifiers (minLength, readOnly, numeric patterns), metadata value objects (additionalProperties: false, enum-constrained metadataType), and query parameters (ASCII patterns, numeric formats). A ValidationMiddleware is integrated into the catalog HTTP server handler. Python fuzz tests are expanded with Schemathesis hooks to sanitize inputs (null bytes, invalid properties, type corrections) and a new stateful test suite for the catalog API is introduced. Test configuration enables positive data acceptance checks and deployment scripts add health-check polling.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes


Security & Quality Observations

OpenAPI Schema Hardening

  • Input pattern constraints on filter queries ([\x20-\x7E] printable ASCII) and numeric ID fields (^[1-9][0-9]{0,8}$) reduce parser/injection surface area. Valid approach for specification-level defense.
  • additionalProperties: false on metadata schemas prevents schema-drift attacks; validates discriminator consistency via enum on metadataType.
  • readOnly: true on server-generated id fields correctly prevents client tampering at the spec level.

ValidationMiddleware Integration

  • Middleware addition in catalog/cmd/catalog.go (line 2/-1) is minimal and correct, but confirm the middleware actually validates against the schema—wrapping a handler doesn't enforce validation without explicit OpenAPI validator involvement.

Fuzz Test Conftest.py (250 new lines)

  • Sanitization logic is shallow: map_body recursively cleans null bytes but doesn't validate against a schema's actual type constraints before sending—for example, coercing numeric strings or rewriting invalid enums masks deeper API contract failures.
  • Hook function correctness: map_query removes tokens like "nextPageToken" from unsupported endpoints but the token removal logic is hardcoded and fragile; any OpenAPI path structure change could silently skip validation.
  • Non-determinism in test data: Use of secrets module for random name generation in fuzz tests improves coverage but should seed a deterministic random state for reproducibility (CWE-330).
  • Property-set restriction: Logic restricting properties to spec-allowed sets is correct but doesn't validate cardinality or required-field presence—misses catching API mutations that drop mandatory fields.

Stateful Fuzz Test

  • Configuration disables Hypothesis health checks for too_much_data, slow_runs, and large_data via settings.register_profile(...). This suppresses data-size and performance warnings; acceptable for stress testing but risks masking slow/memory-bound bugs.

Flaky Test Marker

  • @pytest.mark.flaky(reruns=2) on test_mr_api_stateful masks intermittency instead of fixing root cause. Acceptable as temporary mitigation but flag for root-cause investigation.
🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed Title accurately summarizes the main change: enabling positive_data_acceptance fuzz checks through OpenAPI spec tightening and test hooks.
Description check ✅ Passed Description comprehensively covers root causes (37 spec tightening, 14 spec+hook fixes), design decisions, server bugs, test plan results, and known limitations. Follows template with adequate detail.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

🧹 Nitpick comments (3)
api/openapi/model-registry.yaml (1)

2070-2073: experimentId has both pattern and minLength constraints.

The pattern ^[1-9][0-9]{0,8}$ already ensures at least one character, making minLength: 1 redundant. Not harmful, but adds unnecessary validation overhead.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/openapi/model-registry.yaml` around lines 2070 - 2073, The experimentId
property in model-registry.yaml redundantly includes minLength: 1 while its
regex pattern ^[1-9][0-9]{0,8}$ already enforces at least one digit; remove the
minLength: 1 entry from the experimentId schema to avoid duplicate validation
rules (locate the experimentId field in the OpenAPI model-registry.yaml and
delete the minLength line).
clients/python/tests/fuzz_api/model_catalog/test_catalog_stateless.py (1)

5-6: Cross-module import couples catalog tests to model_registry test module.

Importing call_and_validate_with_null_byte_handling from tests.fuzz_api.model_registry.test_mr_stateless creates a dependency that can cause confusing failures if the model_registry module changes. Consider moving shared helpers to a common module like tests.fuzz_api.helpers or conftest.py.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clients/python/tests/fuzz_api/model_catalog/test_catalog_stateless.py` around
lines 5 - 6, The test imports the helper
call_and_validate_with_null_byte_handling directly from
tests.fuzz_api.model_registry.test_mr_stateless which couples catalog tests to
the model_registry test module; extract that helper into a shared location
(e.g., create tests.fuzz_api.helpers or add it to conftest.py) and update the
import in clients/python/tests/fuzz_api/model_catalog/test_catalog_stateless.py
to import call_and_validate_with_null_byte_handling from the new shared module;
also update any other tests that used the old import to reference the shared
helper so tests no longer depend on test_mr_stateless.
clients/python/tests/fuzz_api/model_registry/test_mr_stateful.py (1)

8-8: Flaky reruns mask root cause of Hypothesis Unsatisfiable failures.

While @pytest.mark.flaky(reruns=2) mitigates CI noise, Unsatisfiable typically indicates over-constrained generation or conflicting schema rules. Track whether these failures persist and consider adjusting Hypothesis settings or schema hooks if they do.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clients/python/tests/fuzz_api/model_registry/test_mr_stateful.py` at line 8,
The flaky decorator (`@pytest.mark.flaky`) is hiding Hypothesis Unsatisfiable
failures; remove that decorator from the test(s) in
clients/python/tests/fuzz_api/model_registry/test_mr_stateful.py and instead
surface and debug Hypothesis failures by applying explicit Hypothesis settings
on the test function(s) (e.g. add
`@hypothesis.settings`(max_examples=<reasonable>, deadline=None,
suppress_health_check=[...]) or adjust strategies/assume calls) so the
Unsatisfiable error is reported and you can relax conflicting schema constraints
or generation limits accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@api/openapi/catalog.yaml`:
- Around line 2360-2362: The OpenAPI parameter currently declares type: string
and format: int64 but uses pattern: "^[1-9][0-9]{0,8}$", which incorrectly
limits IDs to 9 digits; update the pattern to match 64-bit signed integer widths
(for example use "^[1-9][0-9]{0,18}$") or remove the pattern entirely so it
aligns with BaseResource.id's int64 representation, and make the same change for
the other occurrences noted (the pattern at the other block around lines
2394-2396).
- Around line 1752-1755: The schema for int_value currently advertises int32 but
uses a restrictive pattern ("^-?[0-9]{1,9}$") that rejects valid int32 numbers;
update the schema for int_value to either remove the string pattern and use a
numeric representation (type: integer, format: int32) or, if you must keep it as
string, replace the pattern with one that matches the full signed 32-bit range
(e.g., a pattern that allows -2147483648 through 2147483647) so values like
1000000000 and -2147483648 validate correctly; locate the int_value entry and
change the type/format or widen the pattern accordingly.
- Around line 2405-2409: The OpenAPI schema for parameter "pageSize" currently
allows up to 2147483647 which effectively permits abusive large requests; change
the "maximum" value for the pageSize schema to the server's actual hard cap (the
real maximum entries your list handlers will accept) so the API contract rejects
oversized values at the edge, and update the schema description for pageSize to
document the enforced server maximum and any clamping behavior; make these edits
where "pageSize" is defined in api/openapi/catalog.yaml (schema type
integer/format int32).

In `@api/openapi/src/model-registry.yaml`:
- Around line 149-151: The OpenAPI schema declares fields as type: string with
format: int64 but applies a pattern "^[1-9][0-9]{0,8}$" which limits IDs to
1..999,999,999 and contradicts the int64 claim; fix by choosing one of two
consistent options: (A) if IDs must support full int64, remove the restrictive
pattern (or replace it with a pattern that allows the full signed 64-bit range)
and keep format: int64, or (B) if IDs are intentionally capped, change format:
int64 to format: string (or remove int64) and update the pattern and description
to document the actual max allowed value; apply the chosen fix wherever the same
trio appears (the block containing "type: string", "format: int64", "pattern:
\"^[1-9][0-9]{0,8}$\"" and the other occurrences called out in the comment).

In `@catalog/cmd/catalog.go`:
- Around line 199-202: The current ValidationMiddleware only filters null bytes
and does not enforce OpenAPI schema constraints, so requests routed via
openapi.NewRouter(ctrl, mcpCtrl) (e.g., handlers like GetModel) can receive
unvalidated input; update the server setup to wrap the router with a full
OpenAPI request validator (for example oapi-codegen's middleware or
kin-openapi’s router/validation middleware) instead of or in addition to
ValidationMiddleware so that formats, regexes, min/max lengths and enums are
enforced at runtime; specifically, replace or augment the Handler passed to the
http.Server (where server := &http.Server{ Handler:
middleware.ValidationMiddleware(openapi.NewRouter(...)) }) with a validator
middleware that loads the compiled OpenAPI spec and validates incoming requests
before they reach GetModel and other handlers, or alternatively extend
middleware.ValidationMiddleware to perform full schema validation and plug that
in.

In `@clients/python/tests/fuzz_api/conftest.py`:
- Around line 151-154: The PATCH handling in map_case() calls
_get_artifact_type() without the same request kwargs (Authorization header and
verify_ssl) used elsewhere, so the probe GET can 401 or fail TLS and map_case()
will incorrectly overwrite artifactType; update the call sites in map_case()
(and the similar block at lines ~185-193) to pass the same request kwargs (auth
headers, verify_ssl) through to _get_artifact_type(), or alternatively
short-circuit and do not mutate case.body["artifactType"] when the lookup
raises/auth-fails/unverifiable TLS (i.e., treat a failed probe as "unknown" and
leave the original artifactType intact). Ensure the unique symbols
_get_artifact_type and map_case are updated accordingly and add a small check
for probe failure to avoid silent fallback to fuzzed values.
- Around line 68-72: Replace the brittle int()-based validation for
numeric-string ID fields with a regex check matching the spec ^[1-9][0-9]{0,8}$:
add "import re" with the other imports, use re.fullmatch(r'^[1-9][0-9]{0,8}$',
value) instead of int(value) tests when validating fields listed in
_NUMERIC_STRING_FIELDS, and update any fallback random generation (e.g.,
randbelow(999999999)) to produce only 1..999999999 (for example
randbelow(999999999) + 1) so generated IDs never produce "0" or other
schema-invalid forms; apply the same change to the other occurrence around lines
91-96.

---

Nitpick comments:
In `@api/openapi/model-registry.yaml`:
- Around line 2070-2073: The experimentId property in model-registry.yaml
redundantly includes minLength: 1 while its regex pattern ^[1-9][0-9]{0,8}$
already enforces at least one digit; remove the minLength: 1 entry from the
experimentId schema to avoid duplicate validation rules (locate the experimentId
field in the OpenAPI model-registry.yaml and delete the minLength line).

In `@clients/python/tests/fuzz_api/model_catalog/test_catalog_stateless.py`:
- Around line 5-6: The test imports the helper
call_and_validate_with_null_byte_handling directly from
tests.fuzz_api.model_registry.test_mr_stateless which couples catalog tests to
the model_registry test module; extract that helper into a shared location
(e.g., create tests.fuzz_api.helpers or add it to conftest.py) and update the
import in clients/python/tests/fuzz_api/model_catalog/test_catalog_stateless.py
to import call_and_validate_with_null_byte_handling from the new shared module;
also update any other tests that used the old import to reference the shared
helper so tests no longer depend on test_mr_stateless.

In `@clients/python/tests/fuzz_api/model_registry/test_mr_stateful.py`:
- Line 8: The flaky decorator (`@pytest.mark.flaky`) is hiding Hypothesis
Unsatisfiable failures; remove that decorator from the test(s) in
clients/python/tests/fuzz_api/model_registry/test_mr_stateful.py and instead
surface and debug Hypothesis failures by applying explicit Hypothesis settings
on the test function(s) (e.g. add
`@hypothesis.settings`(max_examples=<reasonable>, deadline=None,
suppress_health_check=[...]) or adjust strategies/assume calls) so the
Unsatisfiable error is reported and you can relax conflicting schema constraints
or generation limits accordingly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 546d8130-ecef-44c7-a4f8-358a6e29be10

📥 Commits

Reviewing files that changed from the base of the PR and between 35327a4 and bf0e61c.

📒 Files selected for processing (12)
  • api/openapi/catalog.yaml
  • api/openapi/model-registry.yaml
  • api/openapi/src/catalog.yaml
  • api/openapi/src/lib/common.yaml
  • api/openapi/src/model-registry.yaml
  • catalog/cmd/catalog.go
  • clients/python/Makefile
  • clients/python/schemathesis.toml
  • clients/python/tests/fuzz_api/conftest.py
  • clients/python/tests/fuzz_api/model_catalog/test_catalog_stateful.py
  • clients/python/tests/fuzz_api/model_catalog/test_catalog_stateless.py
  • clients/python/tests/fuzz_api/model_registry/test_mr_stateful.py

Comment thread api/openapi/catalog.yaml
Comment on lines 1752 to +1755
int_value:
format: int64
format: int32
type: string
pattern: "^-?[0-9]{1,9}$"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

int32 constraint rejects valid int32 values.

^-?[0-9]{1,9}$ caps int_value at 9 digits, so values like 1000000000 and -2147483648 become schema-invalid even though this field is documented as int32. Either widen the constraint to the full signed int32 domain or stop advertising int32 here.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/openapi/catalog.yaml` around lines 1752 - 1755, The schema for int_value
currently advertises int32 but uses a restrictive pattern ("^-?[0-9]{1,9}$")
that rejects valid int32 numbers; update the schema for int_value to either
remove the string pattern and use a numeric representation (type: integer,
format: int32) or, if you must keep it as string, replace the pattern with one
that matches the full signed 32-bit range (e.g., a pattern that allows
-2147483648 through 2147483647) so values like 1000000000 and -2147483648
validate correctly; locate the int_value entry and change the type/format or
widen the pattern accordingly.

Comment thread api/openapi/catalog.yaml
Comment on lines 2360 to +2362
type: string
format: int64
pattern: "^[1-9][0-9]{0,8}$"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

ID regex narrows advertised int64 IDs to 9 digits.

Both parameters are marked format: int64, but the new regex only permits 1..999999999. Once IDs cross 9 digits, valid routes and parent filters will fail schema validation while BaseResource.id still advertises int64 strings. Align the regex with the real ID width everywhere, or downgrade the format consistently.

Also applies to: 2394-2396

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/openapi/catalog.yaml` around lines 2360 - 2362, The OpenAPI parameter
currently declares type: string and format: int64 but uses pattern:
"^[1-9][0-9]{0,8}$", which incorrectly limits IDs to 9 digits; update the
pattern to match 64-bit signed integer widths (for example use
"^[1-9][0-9]{0,18}$") or remove the pattern entirely so it aligns with
BaseResource.id's int64 representation, and make the same change for the other
occurrences noted (the pattern at the other block around lines 2394-2396).

Comment thread api/openapi/catalog.yaml
Comment on lines 2405 to +2409
schema:
type: string
type: integer
format: int32
minimum: 1
maximum: 2147483647
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Major: pageSize is effectively unbounded (CWE-770).

Exploit scenario: a caller can send pageSize=2147483647; if validation middleware accepts it, downstream list handlers must clamp it themselves or absorb the query/load. Set this to the server’s real hard cap so the contract blocks abusive values at the edge.

🔧 Remediation
       schema:
         type: integer
         format: int32
         minimum: 1
-        maximum: 2147483647
+        maximum: 1000 # replace with the handler's actual enforced cap

As per coding guidelines, REVIEW PRIORITIES: 1. Security vulnerabilities (provide severity, exploit scenario, and remediation code).

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
schema:
type: string
type: integer
format: int32
minimum: 1
maximum: 2147483647
schema:
type: integer
format: int32
minimum: 1
maximum: 1000 # replace with the handler's actual enforced cap
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/openapi/catalog.yaml` around lines 2405 - 2409, The OpenAPI schema for
parameter "pageSize" currently allows up to 2147483647 which effectively permits
abusive large requests; change the "maximum" value for the pageSize schema to
the server's actual hard cap (the real maximum entries your list handlers will
accept) so the API contract rejects oversized values at the edge, and update the
schema description for pageSize to document the enforced server maximum and any
clamping behavior; make these edits where "pageSize" is defined in
api/openapi/catalog.yaml (schema type integer/format int32).

Comment on lines 149 to +151
type: string
format: int64
pattern: "^[1-9][0-9]{0,8}$"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

The new ID regex contradicts the declared int64 range.

These fields are now documented as int64, but the regex only allows 1..999999999. That makes the contract self-contradictory: either these IDs are truly int64-like and the pattern is too narrow, or the format is overstating the allowed range. As written, any legitimate ID beyond 9 digits becomes undeclarable in the spec.

Patch
- pattern: "^[1-9][0-9]{0,8}$"
+ pattern: "^[1-9][0-9]{0,18}$"

If the server is intentionally capped below full int64, encode that real bound consistently and drop the misleading int64 claim.

As per coding guidelines, **: REVIEW PRIORITIES: 3. Bug-prone patterns and error handling gaps.

Also applies to: 1771-1779, 2000-2009, 2164-2169, 2277-2282, 2407-2412

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/openapi/src/model-registry.yaml` around lines 149 - 151, The OpenAPI
schema declares fields as type: string with format: int64 but applies a pattern
"^[1-9][0-9]{0,8}$" which limits IDs to 1..999,999,999 and contradicts the int64
claim; fix by choosing one of two consistent options: (A) if IDs must support
full int64, remove the restrictive pattern (or replace it with a pattern that
allows the full signed 64-bit range) and keep format: int64, or (B) if IDs are
intentionally capped, change format: int64 to format: string (or remove int64)
and update the pattern and description to document the actual max allowed value;
apply the chosen fix wherever the same trio appears (the block containing "type:
string", "format: int64", "pattern: \"^[1-9][0-9]{0,8}$\"" and the other
occurrences called out in the comment).

Comment thread catalog/cmd/catalog.go
Comment on lines 199 to 202
server := &http.Server{
Addr: catalogCfg.ListenAddress,
Handler: openapi.NewRouter(ctrl, mcpCtrl),
Handler: middleware.ValidationMiddleware(openapi.NewRouter(ctrl, mcpCtrl)),
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if openapi.NewRouter or related code includes request validation
rg -n "Validate|validate|Schema|schema" catalog/internal/server/openapi/

Repository: opendatahub-io/model-registry

Length of output: 525


🏁 Script executed:

#!/bin/bash
# Find and read the ValidationMiddleware implementation
find . -type f -name "*.go" | xargs grep -l "ValidationMiddleware" | head -5

Repository: opendatahub-io/model-registry

Length of output: 228


🏁 Script executed:

#!/bin/bash
# Search for the middleware package and ValidationMiddleware definition
rg -n "func.*ValidationMiddleware" --type go

Repository: opendatahub-io/model-registry

Length of output: 380


🏁 Script executed:

#!/bin/bash
# Find openapi router implementation
rg -n "func.*NewRouter" catalog/internal/server/openapi/ --type go

Repository: opendatahub-io/model-registry

Length of output: 166


🏁 Script executed:

#!/bin/bash
# Read ValidationMiddleware implementation
cat -n internal/server/middleware/validation.go

Repository: opendatahub-io/model-registry

Length of output: 3623


🏁 Script executed:

#!/bin/bash
# Read NewRouter implementation
cat -n catalog/internal/server/openapi/routers.go

Repository: opendatahub-io/model-registry

Length of output: 2027


🏁 Script executed:

#!/bin/bash
# Read validation tests to understand what is being validated
cat -n internal/server/middleware/validation_test.go

Repository: opendatahub-io/model-registry

Length of output: 7198


🏁 Script executed:

#!/bin/bash
# Find OpenAPI service handler implementations to check if they validate schemas
find catalog/internal/server/openapi -name "*.go" -not -name "*test.go" -not -name "routers.go" | head -10

Repository: opendatahub-io/model-registry

Length of output: 580


🏁 Script executed:

#!/bin/bash
# Look for typical handler patterns with validation in the OpenAPI service
rg -n "func.*Handler|func \(.*\) " catalog/internal/server/openapi/api_model_catalog_service_service.go | head -20

Repository: opendatahub-io/model-registry

Length of output: 2221


🏁 Script executed:

#!/bin/bash
# Read GetModel handler implementation
sed -n '336,380p' catalog/internal/server/openapi/api_model_catalog_service_service.go

Repository: opendatahub-io/model-registry

Length of output: 1594


🏁 Script executed:

#!/bin/bash
# Search for any validation of IDs or input parameters in handlers
rg -A 5 "Validate|validate.*ID|validate.*[Pp]aram" catalog/internal/server/openapi/api_model_catalog_service_service.go | head -40

Repository: opendatahub-io/model-registry

Length of output: 253


OpenAPI schema validation missing; only null-byte filtering enforced by middleware.

ValidationMiddleware checks for null bytes but skips schema validation: format, regex patterns, minLength, maxLength, enums. Handlers (e.g., GetModel) receive unvalidated input and pass it directly to the provider. Implement request schema validation (e.g., via kin-openapi or oapi-codegen middleware) or document that OpenAPI spec constraints are not enforced server-side.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@catalog/cmd/catalog.go` around lines 199 - 202, The current
ValidationMiddleware only filters null bytes and does not enforce OpenAPI schema
constraints, so requests routed via openapi.NewRouter(ctrl, mcpCtrl) (e.g.,
handlers like GetModel) can receive unvalidated input; update the server setup
to wrap the router with a full OpenAPI request validator (for example
oapi-codegen's middleware or kin-openapi’s router/validation middleware) instead
of or in addition to ValidationMiddleware so that formats, regexes, min/max
lengths and enums are enforced at runtime; specifically, replace or augment the
Handler passed to the http.Server (where server := &http.Server{ Handler:
middleware.ValidationMiddleware(openapi.NewRouter(...)) }) with a validator
middleware that loads the compiled OpenAPI spec and validates incoming requests
before they reach GetModel and other handlers, or alternatively extend
middleware.ValidationMiddleware to perform full schema validation and plug that
in.

Comment on lines +68 to +72
_NUMERIC_STRING_FIELDS = {
"registeredModelId", "servingEnvironmentId", "modelVersionId",
"experimentId", "experimentRunId", "startTimeSinceEpoch",
"endTimeSinceEpoch", "timestamp",
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Validate numeric-string IDs against the actual regex, not int().

int() accepts values like "0", "-1", " 7", and "0001", which still violate the tightened ^[1-9][0-9]{0,8}$ contract in api/openapi/src/model-registry.yaml. The fallback also uses randbelow(999999999), so it can generate "0" and keep producing schema-invalid payloads.

Patch
+_POSITIVE_ID_RE = re.compile(r"^[1-9][0-9]{0,8}$")
+
         for field in _NUMERIC_STRING_FIELDS:
             if field in body and isinstance(body[field], str):
-                try:
-                    int(body[field])
-                except ValueError:
-                    body[field] = str(secrets.randbelow(999999999))
+                if not _POSITIVE_ID_RE.fullmatch(body[field]):
+                    body[field] = str(secrets.randbelow(999999999) + 1)

Add import re with the other imports.

As per coding guidelines, **: REVIEW PRIORITIES: 3. Bug-prone patterns and error handling gaps.

Also applies to: 91-96

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clients/python/tests/fuzz_api/conftest.py` around lines 68 - 72, Replace the
brittle int()-based validation for numeric-string ID fields with a regex check
matching the spec ^[1-9][0-9]{0,8}$: add "import re" with the other imports, use
re.fullmatch(r'^[1-9][0-9]{0,8}$', value) instead of int(value) tests when
validating fields listed in _NUMERIC_STRING_FIELDS, and update any fallback
random generation (e.g., randbelow(999999999)) to produce only 1..999999999 (for
example randbelow(999999999) + 1) so generated IDs never produce "0" or other
schema-invalid forms; apply the same change to the other occurrence around lines
91-96.

Comment on lines +151 to +154
elif method == "PATCH":
existing_type = _get_artifact_type(case)
if existing_type:
case.body["artifactType"] = existing_type
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

PATCH artifact-type preservation is bypassed in authenticated or self-signed environments.

_get_artifact_type() does its probe GET without the same Authorization header or verify_ssl setting used everywhere else. When that lookup 401s or fails TLS verification, map_case() falls back to the fuzzed artifactType, which is exactly the immutability failure this hook is trying to avoid.

Thread the test request kwargs into this lookup, or stop mutating artifactType when the lookup cannot be performed reliably.

As per coding guidelines, **: REVIEW PRIORITIES: 3. Bug-prone patterns and error handling gaps.

Also applies to: 185-193

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clients/python/tests/fuzz_api/conftest.py` around lines 151 - 154, The PATCH
handling in map_case() calls _get_artifact_type() without the same request
kwargs (Authorization header and verify_ssl) used elsewhere, so the probe GET
can 401 or fail TLS and map_case() will incorrectly overwrite artifactType;
update the call sites in map_case() (and the similar block at lines ~185-193) to
pass the same request kwargs (auth headers, verify_ssl) through to
_get_artifact_type(), or alternatively short-circuit and do not mutate
case.body["artifactType"] when the lookup raises/auth-fails/unverifiable TLS
(i.e., treat a failed probe as "unknown" and leave the original artifactType
intact). Ensure the unique symbols _get_artifact_type and map_case are updated
accordingly and add a small check for probe failure to avoid silent fallback to
fuzzed values.

@lugi0 lugi0 closed this Apr 24, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant