fix: enable positive_data_acceptance fuzz checks via spec tightening and test hooks #1720
lugi0 wants to merge 2 commits into
Conversation
…cceptance checks
Resolves RHOAIENG-58824: Schemathesis API rejects valid requests that conform to the OpenAPI spec. Reduces stateless fuzz test failures from 53 to 0 by closing spec-server contract gaps and adding test hooks for server-side limitations that cannot be expressed in OpenAPI 3.0.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…verage
Addresses issues found during stateful fuzz test stabilization:
- Replace MetadataStructValue with StringValue in hooks (server base64 round-trip bug makes resources with StructValue un-PATCHable)
- Randomize artifact types on POST for coverage instead of hardcoding doc-artifact; ensure per-type required fields and correct value types
- GET existing artifact on PATCH to match immutable artifactType
- Validate numeric-string fields (IDs, timestamps) that Schemathesis fills with arbitrary Unicode despite format constraints in allOf
- Replace empty name with random value (Schemathesis ignores minLength through allOf); add minLength: 1 to BaseResource.name at source
- Add catalog stateful test and include in make test-fuzz
- Add @pytest.mark.flaky(reruns=2) for non-deterministic Unsatisfiable
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: lugi0
📝 Walkthrough
This pull request tightens OpenAPI schema validation across three catalog and model-registry API specifications by enforcing stricter constraints on resource identifiers (minLength, readOnly, numeric patterns), metadata value objects (additionalProperties: false, enum-constrained metadataType), and query parameters (ASCII patterns, numeric formats). A ValidationMiddleware is integrated into the catalog HTTP server handler. Python fuzz tests are expanded with Schemathesis hooks to sanitize inputs (null bytes, invalid properties, type corrections) and a new stateful test suite for the catalog API is introduced. Test configuration enables positive data acceptance checks and deployment scripts add health-check polling.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Security & Quality Observations
OpenAPI Schema Hardening
ValidationMiddleware Integration
Fuzz Test Conftest.py (250 new lines)
Stateful Fuzz Test
Flaky Test Marker
🚥 Pre-merge checks | ✅ 4 passed
Actionable comments posted: 7
🧹 Nitpick comments (3)
api/openapi/model-registry.yaml (1)
2070-2073: experimentId has both pattern and minLength constraints.
The pattern `^[1-9][0-9]{0,8}$` already ensures at least one character, making `minLength: 1` redundant. Not harmful, but adds unnecessary validation overhead.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/openapi/model-registry.yaml` around lines 2070 - 2073, The experimentId property in model-registry.yaml redundantly includes minLength: 1 while its regex pattern ^[1-9][0-9]{0,8}$ already enforces at least one digit; remove the minLength: 1 entry from the experimentId schema to avoid duplicate validation rules (locate the experimentId field in the OpenAPI model-registry.yaml and delete the minLength line).

clients/python/tests/fuzz_api/model_catalog/test_catalog_stateless.py (1)
5-6: Cross-module import couples catalog tests to model_registry test module.
Importing `call_and_validate_with_null_byte_handling` from `tests.fuzz_api.model_registry.test_mr_stateless` creates a dependency that can cause confusing failures if the model_registry module changes. Consider moving shared helpers to a common module like `tests.fuzz_api.helpers` or `conftest.py`.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/python/tests/fuzz_api/model_catalog/test_catalog_stateless.py` around lines 5 - 6, The test imports the helper call_and_validate_with_null_byte_handling directly from tests.fuzz_api.model_registry.test_mr_stateless which couples catalog tests to the model_registry test module; extract that helper into a shared location (e.g., create tests.fuzz_api.helpers or add it to conftest.py) and update the import in clients/python/tests/fuzz_api/model_catalog/test_catalog_stateless.py to import call_and_validate_with_null_byte_handling from the new shared module; also update any other tests that used the old import to reference the shared helper so tests no longer depend on test_mr_stateless.

clients/python/tests/fuzz_api/model_registry/test_mr_stateful.py (1)
8-8: Flaky reruns mask root cause of Hypothesis Unsatisfiable failures.
While `@pytest.mark.flaky(reruns=2)` mitigates CI noise, Unsatisfiable typically indicates over-constrained generation or conflicting schema rules. Track whether these failures persist and consider adjusting Hypothesis settings or schema hooks if they do.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/python/tests/fuzz_api/model_registry/test_mr_stateful.py` at line 8, The flaky decorator (`@pytest.mark.flaky`) is hiding Hypothesis Unsatisfiable failures; remove that decorator from the test(s) in clients/python/tests/fuzz_api/model_registry/test_mr_stateful.py and instead surface and debug Hypothesis failures by applying explicit Hypothesis settings on the test function(s) (e.g. add `@hypothesis.settings`(max_examples=<reasonable>, deadline=None, suppress_health_check=[...]) or adjust strategies/assume calls) so the Unsatisfiable error is reported and you can relax conflicting schema constraints or generation limits accordingly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Central YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 546d8130-ecef-44c7-a4f8-358a6e29be10
📒 Files selected for processing (12)
- api/openapi/catalog.yaml
- api/openapi/model-registry.yaml
- api/openapi/src/catalog.yaml
- api/openapi/src/lib/common.yaml
- api/openapi/src/model-registry.yaml
- catalog/cmd/catalog.go
- clients/python/Makefile
- clients/python/schemathesis.toml
- clients/python/tests/fuzz_api/conftest.py
- clients/python/tests/fuzz_api/model_catalog/test_catalog_stateful.py
- clients/python/tests/fuzz_api/model_catalog/test_catalog_stateless.py
- clients/python/tests/fuzz_api/model_registry/test_mr_stateful.py
  int_value:
-   format: int64
+   format: int32
    type: string
+   pattern: "^-?[0-9]{1,9}$"
int32 constraint rejects valid int32 values.
^-?[0-9]{1,9}$ caps int_value at 9 digits, so values like 1000000000 and -2147483648 become schema-invalid even though this field is documented as int32. Either widen the constraint to the full signed int32 domain or stop advertising int32 here.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@api/openapi/catalog.yaml` around lines 1752 - 1755, The schema for int_value
currently advertises int32 but uses a restrictive pattern ("^-?[0-9]{1,9}$")
that rejects valid int32 numbers; update the schema for int_value to either
remove the string pattern and use a numeric representation (type: integer,
format: int32) or, if you must keep it as string, replace the pattern with one
that matches the full signed 32-bit range (e.g., a pattern that allows
-2147483648 through 2147483647) so values like 1000000000 and -2147483648
validate correctly; locate the int_value entry and change the type/format or
widen the pattern accordingly.
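To make the mismatch in the int32 comment above concrete, here is a minimal Python sketch (illustrative only, not part of the PR) that compares the 9-digit pattern from the diff with the signed 32-bit range. The helper name `fits_int32` is an assumption introduced for this example.

```python
import re

# Pattern quoted from the diff under review.
NINE_DIGIT = re.compile(r"^-?[0-9]{1,9}$")

INT32_MIN, INT32_MAX = -(2**31), 2**31 - 1


def fits_int32(text: str) -> bool:
    """Hypothetical helper: True if text is a decimal integer within int32."""
    if not re.fullmatch(r"-?[0-9]+", text):
        return False
    return INT32_MIN <= int(text) <= INT32_MAX


# Each row: sample, accepted by the 9-digit pattern?, inside int32?
for sample in ["999999999", "1000000000", "2147483647", "-2147483648"]:
    print(sample, bool(NINE_DIGIT.fullmatch(sample)), fits_int32(sample))
```

Every 10-digit sample is inside int32 yet fails the pattern, which is the gap the comment describes. A pure-regex fix would have to enumerate the boundary digits, so a range check like the one sketched here may be simpler wherever validation runs in code.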
type: string
format: int64
pattern: "^[1-9][0-9]{0,8}$"
ID regex narrows advertised int64 IDs to 9 digits.
Both parameters are marked format: int64, but the new regex only permits 1..999999999. Once IDs cross 9 digits, valid routes and parent filters will fail schema validation while BaseResource.id still advertises int64 strings. Align the regex with the real ID width everywhere, or downgrade the format consistently.
Also applies to: 2394-2396
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@api/openapi/catalog.yaml` around lines 2360 - 2362, The OpenAPI parameter
currently declares type: string and format: int64 but uses pattern:
"^[1-9][0-9]{0,8}$", which incorrectly limits IDs to 9 digits; update the
pattern to match 64-bit signed integer widths (for example use
"^[1-9][0-9]{0,18}$") or remove the pattern entirely so it aligns with
BaseResource.id's int64 representation, and make the same change for the other
occurrences noted (the pattern at the other block around lines 2394-2396).
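One caveat on the suggested `^[1-9][0-9]{0,18}$`: it bounds the digit count, not the value, so it also admits 19-digit strings above the signed 64-bit maximum. The Python sketch below (an illustration under that reading, with the hypothetical helper `is_positive_int64_string`) pairs the pattern with an explicit range check.

```python
import re

SUGGESTED = re.compile(r"^[1-9][0-9]{0,18}$")  # pattern proposed in the comment
INT64_MAX = 2**63 - 1  # 9223372036854775807, a 19-digit number


def is_positive_int64_string(text: str) -> bool:
    """Hypothetical check: pattern match plus an explicit upper bound."""
    return bool(SUGGESTED.fullmatch(text)) and int(text) <= INT64_MAX


print(is_positive_int64_string("1000000000"))            # 10 digits
print(bool(SUGGESTED.fullmatch("9999999999999999999")))  # 19 digits, pattern only
print(is_positive_int64_string("9999999999999999999"))   # 19 digits, with range check
```

Whether that extra bound matters depends on how the server parses IDs; the PR description states that path IDs are currently validated as int32.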
  schema:
-   type: string
+   type: integer
+   format: int32
+   minimum: 1
+   maximum: 2147483647
Major: pageSize is effectively unbounded (CWE-770).
Exploit scenario: a caller can send pageSize=2147483647; if validation middleware accepts it, downstream list handlers must clamp it themselves or absorb the query/load. Set this to the server’s real hard cap so the contract blocks abusive values at the edge.
🔧 Remediation
  schema:
    type: integer
    format: int32
    minimum: 1
-   maximum: 2147483647
+   maximum: 1000 # replace with the handler's actual enforced cap
As per coding guidelines, REVIEW PRIORITIES: 1. Security vulnerabilities (provide severity, exploit scenario, and remediation code).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@api/openapi/catalog.yaml` around lines 2405 - 2409, The OpenAPI schema for
parameter "pageSize" currently allows up to 2147483647 which effectively permits
abusive large requests; change the "maximum" value for the pageSize schema to
the server's actual hard cap (the real maximum entries your list handlers will
accept) so the API contract rejects oversized values at the edge, and update the
schema description for pageSize to document the enforced server maximum and any
clamping behavior; make these edits where "pageSize" is defined in
api/openapi/catalog.yaml (schema type integer/format int32).
type: string
format: int64
pattern: "^[1-9][0-9]{0,8}$"
The new ID regex contradicts the declared int64 range.
These fields are now documented as int64, but the regex only allows 1..999999999. That makes the contract self-contradictory: either these IDs are truly int64-like and the pattern is too narrow, or the format is overstating the allowed range. As written, any legitimate ID beyond 9 digits becomes undeclarable in the spec.
Patch
- pattern: "^[1-9][0-9]{0,8}$"
+ pattern: "^[1-9][0-9]{0,18}$"
If the server is intentionally capped below full int64, encode that real bound consistently and drop the misleading int64 claim.
As per coding guidelines, **: REVIEW PRIORITIES: 3. Bug-prone patterns and error handling gaps.
Also applies to: 1771-1779, 2000-2009, 2164-2169, 2277-2282, 2407-2412
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@api/openapi/src/model-registry.yaml` around lines 149 - 151, The OpenAPI
schema declares fields as type: string with format: int64 but applies a pattern
"^[1-9][0-9]{0,8}$" which limits IDs to 1..999,999,999 and contradicts the int64
claim; fix by choosing one of two consistent options: (A) if IDs must support
full int64, remove the restrictive pattern (or replace it with a pattern that
allows the full signed 64-bit range) and keep format: int64, or (B) if IDs are
intentionally capped, change format: int64 to format: string (or remove int64)
and update the pattern and description to document the actual max allowed value;
apply the chosen fix wherever the same trio appears (the block containing "type:
string", "format: int64", "pattern: \"^[1-9][0-9]{0,8}$\"" and the other
occurrences called out in the comment).
  server := &http.Server{
      Addr:    catalogCfg.ListenAddress,
-     Handler: openapi.NewRouter(ctrl, mcpCtrl),
+     Handler: middleware.ValidationMiddleware(openapi.NewRouter(ctrl, mcpCtrl)),
  }
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check if openapi.NewRouter or related code includes request validation
rg -n "Validate|validate|Schema|schema" catalog/internal/server/openapi/
Repository: opendatahub-io/model-registry
Length of output: 525
🏁 Script executed:
#!/bin/bash
# Find and read the ValidationMiddleware implementation
find . -type f -name "*.go" | xargs grep -l "ValidationMiddleware" | head -5
Repository: opendatahub-io/model-registry
Length of output: 228
🏁 Script executed:
#!/bin/bash
# Search for the middleware package and ValidationMiddleware definition
rg -n "func.*ValidationMiddleware" --type go
Repository: opendatahub-io/model-registry
Length of output: 380
🏁 Script executed:
#!/bin/bash
# Find openapi router implementation
rg -n "func.*NewRouter" catalog/internal/server/openapi/ --type go
Repository: opendatahub-io/model-registry
Length of output: 166
🏁 Script executed:
#!/bin/bash
# Read ValidationMiddleware implementation
cat -n internal/server/middleware/validation.go
Repository: opendatahub-io/model-registry
Length of output: 3623
🏁 Script executed:
#!/bin/bash
# Read NewRouter implementation
cat -n catalog/internal/server/openapi/routers.go
Repository: opendatahub-io/model-registry
Length of output: 2027
🏁 Script executed:
#!/bin/bash
# Read validation tests to understand what is being validated
cat -n internal/server/middleware/validation_test.go
Repository: opendatahub-io/model-registry
Length of output: 7198
🏁 Script executed:
#!/bin/bash
# Find OpenAPI service handler implementations to check if they validate schemas
find catalog/internal/server/openapi -name "*.go" -not -name "*test.go" -not -name "routers.go" | head -10
Repository: opendatahub-io/model-registry
Length of output: 580
🏁 Script executed:
#!/bin/bash
# Look for typical handler patterns with validation in the OpenAPI service
rg -n "func.*Handler|func \(.*\) " catalog/internal/server/openapi/api_model_catalog_service_service.go | head -20
Repository: opendatahub-io/model-registry
Length of output: 2221
🏁 Script executed:
#!/bin/bash
# Read GetModel handler implementation
sed -n '336,380p' catalog/internal/server/openapi/api_model_catalog_service_service.go
Repository: opendatahub-io/model-registry
Length of output: 1594
🏁 Script executed:
#!/bin/bash
# Search for any validation of IDs or input parameters in handlers
rg -A 5 "Validate|validate.*ID|validate.*[Pp]aram" catalog/internal/server/openapi/api_model_catalog_service_service.go | head -40
Repository: opendatahub-io/model-registry
Length of output: 253
OpenAPI schema validation missing; only null-byte filtering enforced by middleware.
ValidationMiddleware checks for null bytes but skips schema validation: format, regex patterns, minLength, maxLength, enums. Handlers (e.g., GetModel) receive unvalidated input and pass it directly to the provider. Implement request schema validation (e.g., via kin-openapi or oapi-codegen middleware) or document that OpenAPI spec constraints are not enforced server-side.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@catalog/cmd/catalog.go` around lines 199 - 202, The current
ValidationMiddleware only filters null bytes and does not enforce OpenAPI schema
constraints, so requests routed via openapi.NewRouter(ctrl, mcpCtrl) (e.g.,
handlers like GetModel) can receive unvalidated input; update the server setup
to wrap the router with a full OpenAPI request validator (for example
oapi-codegen's middleware or kin-openapi’s router/validation middleware) instead
of or in addition to ValidationMiddleware so that formats, regexes, min/max
lengths and enums are enforced at runtime; specifically, replace or augment the
Handler passed to the http.Server (where server := &http.Server{ Handler:
middleware.ValidationMiddleware(openapi.NewRouter(...)) }) with a validator
middleware that loads the compiled OpenAPI spec and validates incoming requests
before they reach GetModel and other handlers, or alternatively extend
middleware.ValidationMiddleware to perform full schema validation and plug that
in.
_NUMERIC_STRING_FIELDS = {
    "registeredModelId", "servingEnvironmentId", "modelVersionId",
    "experimentId", "experimentRunId", "startTimeSinceEpoch",
    "endTimeSinceEpoch", "timestamp",
}
Validate numeric-string IDs against the actual regex, not int().
int() accepts values like "0", "-1", " 7", and "0001", which still violate the tightened ^[1-9][0-9]{0,8}$ contract in api/openapi/src/model-registry.yaml. The fallback also uses randbelow(999999999), so it can generate "0" and keep producing schema-invalid payloads.
Patch
+_POSITIVE_ID_RE = re.compile(r"^[1-9][0-9]{0,8}$")
+
 for field in _NUMERIC_STRING_FIELDS:
     if field in body and isinstance(body[field], str):
-        try:
-            int(body[field])
-        except ValueError:
-            body[field] = str(secrets.randbelow(999999999))
+        if not _POSITIVE_ID_RE.fullmatch(body[field]):
+            body[field] = str(secrets.randbelow(999999999) + 1)
Add `import re` with the other imports.
As per coding guidelines, **: REVIEW PRIORITIES: 3. Bug-prone patterns and error handling gaps.
Also applies to: 91-96
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@clients/python/tests/fuzz_api/conftest.py` around lines 68 - 72, Replace the
brittle int()-based validation for numeric-string ID fields with a regex check
matching the spec ^[1-9][0-9]{0,8}$: add "import re" with the other imports, use
re.fullmatch(r'^[1-9][0-9]{0,8}$', value) instead of int(value) tests when
validating fields listed in _NUMERIC_STRING_FIELDS, and update any fallback
random generation (e.g., randbelow(999999999)) to produce only 1..999999999 (for
example randbelow(999999999) + 1) so generated IDs never produce "0" or other
schema-invalid forms; apply the same change to the other occurrence around lines
91-96.
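The difference between the two checks can be seen in a short, self-contained Python sketch. It is illustrative only: `int_accepts` and `random_valid_id` are names made up for this example, and the pattern is the one quoted from the spec.

```python
import re
import secrets

POSITIVE_ID_RE = re.compile(r"^[1-9][0-9]{0,8}$")  # pattern quoted from the spec


def int_accepts(text: str) -> bool:
    """Mirror of the int()-based check: True if int() parses the string."""
    try:
        int(text)
        return True
    except ValueError:
        return False


def random_valid_id() -> str:
    """Hypothetical fallback generator.

    secrets.randbelow(n) returns 0..n-1, so adding 1 yields 1..999999999.
    """
    return str(secrets.randbelow(999_999_999) + 1)


# Each row: sample, accepted by int()?, accepted by the spec pattern?
for sample in ["0", "-1", " 7", "0001", "42"]:
    print(repr(sample), int_accepts(sample), bool(POSITIVE_ID_RE.fullmatch(sample)))
```

Only `"42"` passes both checks; the other samples parse with `int()` but do not match the pattern.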
elif method == "PATCH":
    existing_type = _get_artifact_type(case)
    if existing_type:
        case.body["artifactType"] = existing_type
PATCH artifact-type preservation is bypassed in authenticated or self-signed environments.
_get_artifact_type() does its probe GET without the same Authorization header or verify_ssl setting used everywhere else. When that lookup 401s or fails TLS verification, map_case() falls back to the fuzzed artifactType, which is exactly the immutability failure this hook is trying to avoid.
Thread the test request kwargs into this lookup, or stop mutating artifactType when the lookup cannot be performed reliably.
As per coding guidelines, **: REVIEW PRIORITIES: 3. Bug-prone patterns and error handling gaps.
Also applies to: 185-193
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@clients/python/tests/fuzz_api/conftest.py` around lines 151 - 154, The PATCH
handling in map_case() calls _get_artifact_type() without the same request
kwargs (Authorization header and verify_ssl) used elsewhere, so the probe GET
can 401 or fail TLS and map_case() will incorrectly overwrite artifactType;
update the call sites in map_case() (and the similar block at lines ~185-193) to
pass the same request kwargs (auth headers, verify_ssl) through to
_get_artifact_type(), or alternatively short-circuit and do not mutate
case.body["artifactType"] when the lookup raises/auth-fails/unverifiable TLS
(i.e., treat a failed probe as "unknown" and leave the original artifactType
intact). Ensure the unique symbols _get_artifact_type and map_case are updated
accordingly and add a small check for probe failure to avoid silent fallback to
fuzzed values.
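A minimal Python sketch of the second option (treat a failed probe as "unknown" and make that visible to the caller) could look like the following. It does not use the Schemathesis API: `apply_existing_artifact_type` and its `lookup` argument are hypothetical stand-ins, with `lookup` representing a probe GET that carries the same auth headers and TLS settings as the test requests.

```python
from typing import Callable, Optional


def apply_existing_artifact_type(
    body: dict, lookup: Callable[[], Optional[str]]
) -> bool:
    """Copy the stored artifactType into body if the probe succeeds.

    Returns True when the probe produced a type and body was updated,
    False when the probe failed or returned nothing (body is left as-is).
    """
    try:
        existing = lookup()
    except Exception:  # e.g. a 401 or a TLS verification failure in the probe
        return False
    if not existing:
        return False
    body["artifactType"] = existing
    return True


def failing_probe() -> Optional[str]:
    raise RuntimeError("simulated 401 from probe GET")


patched = {"artifactType": "fuzzed-value"}
print(apply_existing_artifact_type(patched, lambda: "doc-artifact"), patched)

untouched = {"artifactType": "fuzzed-value"}
print(apply_existing_artifact_type(untouched, failing_probe), untouched)
```

Returning a boolean lets the caller decide what to do with an unverified case, for example skipping it, instead of silently sending the fuzzed value.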
Summary
Resolves RHOAIENG-58824: Schemathesis's `positive_data_acceptance` check generates schema-conformant requests and expects 2xx responses, but 53 tests fail with `RejectedPositiveData`. This PR closes the spec-server contract gaps and adds fuzz test hooks for server-side limitations that cannot be expressed in OpenAPI 3.0.

Result: 53 → 0 stateless fuzz test failures.
Root causes found and addressed
Spec tightening (37 failures)
- `type: string` with no constraints; server requires numeric int32 → `format: int64`, `pattern: "^[1-9][0-9]{0,8}$"` to all ID path params
- `pageSize` query param defined as `type: string`; server parses as int32 → `type: integer, format: int32, minimum: 1, maximum: 2147483647`
- `externalId` OR `name` + `parentResourceId`
- `filterQuery` allows any string; server lexer only accepts ASCII grammar → `pattern: "^[\x20-\x7E]*$"`
- `nextPageToken` has no format constraint; server expects base64

Spec + hook fixes (14 failures)
- `""`; server's `IsZeroValue()` treats it as missing → `minLength: 1` to required string fields in Create schemas
- `middleware.ValidationMiddleware()` to `catalog/cmd/catalog.go`
- `oneOf` discriminator not enforced

Additional issues found during re-verification
- `MetadataIntValue` spec says `format: int64` but server uses int32 (`StringToInt32`) → `format: int32`, `pattern: "^-?[0-9]{1,9}$"`
- `additionalProperties: false` to all 6 MetadataValue types
- `MetadataProtoValue` not supported by EmbedMD converter (no case in switch)
- `MetadataStructValue` base64 round-trip bug: write decodes, read doesn't re-encode; PATCHing resources with stored StructValues crashes
- `encode("utf-8", errors="ignore")`
- `customProperties` keys
- `name`/`externalId` params break server's internal filter query construction → `\`, `'`, `"` from these params
- `map_body` hook
- `metadataType` field used `example` instead of `enum` → `enum` for proper discriminator enforcement
- `scripts/merge_openapi.sh`
- `step` is int64 in Go struct but `timestamp` is string; spec says both are `type: string, format: int64` → `step` as integer, `timestamp` as string
- `artifactType` is immutable but included in update schema without `readOnly`
- `minLength`/`format` inherited through `allOf` → `name` with random value; validates numeric-string fields (IDs, timestamps)
- `BaseResource.name` missing `minLength: 1`

`make test-fuzz` reliability fixes
- … (`:8080`), minio (`:9000`), local registry (`:5001`)
- `$STATUS` variable never set in `test-fuzz`; target always exits 0
- `customProperties` in DB; PATCH tests fail loading existing resources → `scripts/cleanup.sh` before both stateless and stateful test runs
- `test_catalog_stateful.py` and included in `make test-fuzz`
- `Unsatisfiable` after thousands of successful steps → `@pytest.mark.flaky(reruns=2)`, caused by tight spec constraints + allOf making data generation borderline

Design decisions
Per-path property whitelist vs `additionalProperties: false`: The Go server uses strict JSON decoding (`DisallowUnknownFields`), rejecting any property not in the struct. OpenAPI 3.0's `allOf` + `additionalProperties: false` is fundamentally broken: it evaluates per-subschema, so properties valid in the composite are rejected as "additional" in the base. Instead of fighting the spec, the `map_body` hook maintains a per-path whitelist (`_PATH_PROPERTIES`) derived from the spec, stripping fuzz-generated extra properties before they reach the server.

Hooks vs spec changes: Some server behaviors cannot be expressed in OpenAPI 3.0 (parameter dependencies, discriminator enforcement, strict decoding). These are handled via Schemathesis hooks in `conftest.py` rather than incorrect spec annotations.

This PR annotates path parameter IDs with `format: int64` and `pattern: "^[1-9][0-9]{0,8}$"`. The pattern safely constrains values to the int32 range (max 999,999,999 < 2,147,483,647), so no functional issues arise. However, the server actually validates all path IDs as int32, not int64:
- `internal/apiutils/api_utils.go:42`: `ValidateIDAsInt32()` calls `strconv.ParseInt(id, 10, 32)`.
- … (`registered_model.go`, `model_version.go`, `experiment.go`, `inference_service.go`, `serving_environment.go`, `artifact.go`, `serve_model.go`, `experiment_run.go`) uses `ValidateIDAsInt32`, never `ValidateIDAsInt64`.
- `BaseResource.id` at `common.yaml:98` is already declared as `format: int64`; this pre-dates this PR and is the existing convention.

We chose `format: int64` for path params to stay consistent with the existing response body `id` field. But this creates a documented-vs-actual gap:
- `BaseResource.id` | `int64` (pre-existing)
- … | `int64` (matches response body)
- `MetadataIntValue.int_value` (this PR) | `int32` (changed from int64) | `StringToInt32()`

Options for follow-up discussion:
- … `id` which still says int64
- `ValidateIDAsInt32` could be changed to `ValidateIDAsInt64`

Note: `ValidateIDAsInt64` already exists in `api_utils.go:66`; it's just never called for path parameters.

Server bugs identified (not fixed, worked around)
These are documented for separate follow-up:
- MetadataProtoValue not supported: EmbedMD converter (openapi_embedmd_converter_util.go:42-78) has no case for ProtoValue in its switch statement
- MetadataStructValue base64 round-trip bug: the write converter (openapi_embedmd_converter_util.go:61) base64-decodes struct_value, but the read converter (embdemd_openapi_converter_util.go:47) returns raw JSON without re-encoding. Any resource with a StructValue becomes un-PATCHable because the server re-processes stored customProperties through the write converter on PATCH. Proposed one-line fix: change line 47 from NewMetadataStructValue(string(asJSON)) to NewMetadataStructValue(base64.StdEncoding.EncodeToString(asJSON)), identical to what the ByteValue read path already does at lines 60-61
- name/externalId filter injection: server constructs internal filter queries like name = '<value>' without escaping; backslash/quotes break the participle lexer
- customProperties keys: Go JSON decoder fails on surrogate pairs and non-UTF-8 characters in property key names
- MetadataIntValue int32/int64 mismatch: spec said int64, server uses StringToInt32() (fixed in this PR for int_value; see also ID format discussion above)
- Path IDs validated as int32 (ValidateIDAsInt32) despite spec declaring int64 (see section above)
- step vs timestamp type inconsistency: both are type: string, format: int64 in spec, but Go struct has step as int64 (JSON number) and timestamp as string (JSON string)
- artifactType immutable but in update schema: server rejects type changes on PATCH, and infers "unknown" when field is omitted; update schema should either exclude artifactType or mark it readOnly

Test plan
- make openapi/validate: both specs pass
- @flaky(reruns=2) for occasional Unsatisfiable (see known limitation below)
- test_catalog_stateful.py
- make test-fuzz from scratch (kind cluster creation → deploy → test): no timing failures
- cd clients/python && make test-fuzz and stateless/stateful tests directly: confirmed green

Unsatisfiable: needs further investigation

The model-registry stateful test (test_mr_api_stateful) can non-deterministically fail with hypothesis.errors.Unsatisfiable even after thousands of successful API calls. This happens when Hypothesis's state machine exhausts 1000 attempts to find a valid next transition. The failure is mitigated with @pytest.mark.flaky(reruns=2) but can still occur.

What we know:

- allOf composition.

Root cause hypothesis:
Schemathesis does not fully enforce constraints inherited through allOf during data generation. When the spec has tight constraints (e.g., pattern: "^[1-9][0-9]{0,8}$" on IDs, minLength: 1 on names), Hypothesis generates many invalid values that get filtered, eventually hitting the 1000-filter hard limit. The hooks fix values AFTER generation, but Hypothesis's internal filtering happens BEFORE the hooks run.

Possible directions for investigation:
- before_generate_body hook: could constrain generation strategies at the source rather than fixing values after the fact, reducing the filter rejection rate
- Unsatisfiable threshold is hardcoded in Hypothesis; a custom wrapper could retry with a different seed
- "^[0-9]+$" instead of "^[1-9][0-9]{0,8}$") to give Hypothesis more room, at the cost of less precise spec documentation

--hypothesis-show-statistics verbosity

The make test-fuzz target passes --hypothesis-show-statistics to both stateless and stateful pytest runs. This produces extensive per-test statistics output that is useful for debugging Hypothesis generation issues but noisy for routine runs. Consider removing it from the default target and keeping it as an opt-in flag (e.g., make test-fuzz STATS=1).

🤖 Generated with Claude Code
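As a rough illustration of the root cause hypothesis and the "constrain generation at the source" direction above, the sketch below compares two ways of producing path IDs for the pattern "^[1-9][0-9]{0,8}$". It is a simplified stdlib-only model, not Hypothesis's or Schemathesis's actual generation logic; the function names, the candidate alphabet, and the attempt count are assumptions made for illustration only.

```python
import random
import re

# Path-ID pattern quoted in the PR description.
ID_PATTERN = re.compile(r"^[1-9][0-9]{0,8}$")
INT32_MAX = 2**31 - 1  # 2,147,483,647


def filter_after_generation(rng, attempts):
    """Draw arbitrary short strings and count how many happen to match.

    Hypothetical stand-in for "generate first, reject non-matching values".
    """
    alphabet = "0123456789abcXYZ-_ "
    accepted = 0
    for _ in range(attempts):
        length = rng.randint(0, 9)
        candidate = "".join(rng.choice(alphabet) for _ in range(length))
        if ID_PATTERN.fullmatch(candidate):
            accepted += 1
    return accepted


def generate_at_source(rng):
    """Build a matching ID directly: one non-zero digit plus 0 to 8 digits."""
    first = rng.choice("123456789")
    rest = "".join(rng.choice("0123456789") for _ in range(rng.randint(0, 8)))
    return first + rest


rng = random.Random(0)  # fixed seed so runs are repeatable
attempts = 1000
accepted = filter_after_generation(rng, attempts)
constructed = [generate_at_source(rng) for _ in range(attempts)]
valid = sum(bool(ID_PATTERN.fullmatch(c)) for c in constructed)

print(f"filter-after-generation: {accepted}/{attempts} candidates accepted")
print(f"generate-at-source:      {valid}/{attempts} candidates valid")
print(f"largest constructed ID fits int32: {max(map(int, constructed)) <= INT32_MAX}")
```

The constructed IDs always match the pattern and, having at most nine digits, stay below the int32 maximum, which is the same bound the ID format discussion relies on. Whether a before_generate_body hook can express this kind of direct construction for the real schemas is still an open question.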
Summary by CodeRabbit
Release Notes
New Features
Bug Fixes