fix: enable positive_data_acceptance fuzz checks via spec tightening and test hooks#2638
fix: enable positive_data_acceptance fuzz checks via spec tightening and test hooks#2638lugi0 wants to merge 3 commits intokubeflow:mainfrom
Conversation
…cceptance checks Resolves RHOAIENG-58824: Schemathesis API rejects valid requests that conform to the OpenAPI spec. Reduces stateless fuzz test failures from 53 to 0 by closing spec-server contract gaps and adding test hooks for server-side limitations that cannot be expressed in OpenAPI 3.0. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…verage Addresses issues found during stateful fuzz test stabilization: - Replace MetadataStructValue with StringValue in hooks (server base64 round-trip bug makes resources with StructValue un-PATCHable) - Randomize artifact types on POST for coverage instead of hardcoding doc-artifact; ensure per-type required fields and correct value types - GET existing artifact on PATCH to match immutable artifactType - Validate numeric-string fields (IDs, timestamps) that Schemathesis fills with arbitrary Unicode despite format constraints in allOf - Replace empty name with random value (Schemathesis ignores minLength through allOf); add minLength: 1 to BaseResource.name at source - Add catalog stateful test and include in make test-fuzz - Add @pytest.mark.flaky(reruns=2) for non-deterministic Unsatisfiable Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
pboyd
left a comment
There was a problem hiding this comment.
I added a few comments, but it's looking good.
| description: Number of entities in each page. | ||
| schema: | ||
| type: string | ||
| type: integer |
There was a problem hiding this comment.
This change might be a problem. Can it remain a string but have an int32 format?
There was a problem hiding this comment.
I believe this is due to how schemathesis would generate cases, leading to preventable failures in fuzzing tests. I can look into forcing acceptable values via schemathesis hooks, but I'm not sure why we would want to define a field as string when what we want is an integer
| schema: | ||
| type: string | ||
| format: int64 | ||
| pattern: "^[1-9][0-9]{0,8}$" |
There was a problem hiding this comment.
Positive int64's could be up to 19 digits, there are a few instances of this.
There was a problem hiding this comment.
For the int64 issues, I would ask you to read through the ID format: int64 in spec vs int32 in server paragraph in the description - I think we should have a discussion within the team on how to handle this moving forward.
| format: int64 | ||
| format: int32 | ||
| type: string | ||
| pattern: "^-?[0-9]{1,9}$" |
There was a problem hiding this comment.
Do you want to allow leading zero's here? I see the unsigned int64 pattern prevents them.
Signed-off-by: Luca Giorgi <lgiorgi@redhat.com>
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
Summary
Resolves RHOAIENG-58824: Schemathesis's
positive_data_acceptancecheck generates schema-conformant requests and expects 2xx responses, but 53 tests fail withRejectedPositiveData. This PR closes the spec-server contract gaps and adds fuzz test hooks for server-side limitations that cannot be expressed in OpenAPI 3.0.Result: 53 → 0 stateless fuzz test failures.
Root causes found and addressed
Spec tightening (37 failures)
type: stringwith no constraints; server requires numeric int32format: int64,pattern: "^[1-9][0-9]{0,8}$"to all ID path paramspageSizequery param defined astype: string; server parses as int32type: integer, format: int32, minimum: 1, maximum: 2147483647externalIdORname+parentResourceIdfilterQueryallows any string; server lexer only accepts ASCII grammarpattern: "^[\x20-\x7E]*$"nextPageTokenhas no format constraint; server expects base64Spec + hook fixes (14 failures)
""; server'sIsZeroValue()treats it as missingminLength: 1to required string fields in Create schemasmiddleware.ValidationMiddleware()tocatalog/cmd/catalog.gooneOfdiscriminator not enforcedAdditional issues found during re-verification
MetadataIntValuespec saysformat: int64but server uses int32 (StringToInt32)format: int32,pattern: "^-?[0-9]{1,9}$"additionalProperties: falseto all 6 MetadataValue typesMetadataProtoValuenot supported by EmbedMD converter (no case in switch)MetadataStructValuebase64 round-trip bug — write decodes, read doesn't re-encode; PATCHing resources with stored StructValues crashesencode("utf-8", errors="ignore")customPropertieskeysname/externalIdparams break server's internal filter query construction\,',"from these paramsmap_bodyhookmetadataTypefield usedexampleinstead ofenumenumfor proper discriminator enforcementscripts/merge_openapi.shstepis int64 in Go struct buttimestampis string — spec says both aretype: string, format: int64stepas integer,timestampas stringartifactTypeis immutable but included in update schema withoutreadOnlyminLength/formatinherited throughallOfnamewith random value; validates numeric-string fields (IDs, timestamps)BaseResource.namemissingminLength: 1make test-fuzzreliability fixes:8080), minio (:9000), local registry (:5001)$STATUSvariable never set intest-fuzz; target always exits 0customPropertiesin DB; PATCH tests fail loading existing resourcesscripts/cleanup.shbefore both stateless and stateful test runstest_catalog_stateful.pyand included inmake test-fuzzUnsatisfiableafter thousands of successful steps@pytest.mark.flaky(reruns=2)— caused by tight spec constraints + allOf making data generation borderlineDesign decisions
Per-path property whitelist vs
additionalProperties: false: The Go server uses strict JSON decoding (DisallowUnknownFields), rejecting any property not in the struct. OpenAPI 3.0'sallOf+additionalProperties: falseis fundamentally broken — it evaluates per-subschema, so properties valid in the composite are rejected as "additional" in the base. Instead of fighting the spec, themap_bodyhook maintains a per-path whitelist (_PATH_PROPERTIES) derived from the spec, stripping fuzz-generated extra properties before they reach the server.Hooks vs spec changes: Some server behaviors cannot be expressed in OpenAPI 3.0 (parameter dependencies, discriminator enforcement, strict decoding). These are handled via Schemathesis hooks in
conftest.pyrather than incorrect spec annotations.This PR annotates path parameter IDs with
format: int64andpattern: "^[1-9][0-9]{0,8}$". The pattern safely constrains values to the int32 range (max 999,999,999 < 2,147,483,647), so no functional issues arise. However, the server actually validates all path IDs as int32, not int64:internal/apiutils/api_utils.go:42—ValidateIDAsInt32()callsstrconv.ParseInt(id, 10, 32).registered_model.go,model_version.go,experiment.go,inference_service.go,serving_environment.go,artifact.go,serve_model.go,experiment_run.go) usesValidateIDAsInt32, neverValidateIDAsInt64.BaseResource.idatcommon.yaml:98is already declared asformat: int64— this pre-dates this PR and is the existing convention.We chose
format: int64for path params to stay consistent with the existing response bodyidfield. But this creates a documented-vs-actual gap:BaseResource.idint64(pre-existing)int64(matches response body)MetadataIntValue.int_value(this PR)int32(changed from int64)StringToInt32()Options for follow-up discussion:
idwhich still says int64ValidateIDAsInt32could be changed toValidateIDAsInt64Note:
ValidateIDAsInt64already exists inapi_utils.go:66— it's just never called for path parameters.Server bugs identified (not fixed, worked around)
These are documented for separate follow-up:
MetadataProtoValuenot supported — EmbedMD converter (openapi_embedmd_converter_util.go:42-78) has no case for ProtoValue in its switch statementMetadataStructValuebase64 round-trip bug — the write converter (openapi_embedmd_converter_util.go:61) base64-decodesstruct_value, but the read converter (embdemd_openapi_converter_util.go:47) returns raw JSON without re-encoding. Any resource with a StructValue becomes un-PATCHable because the server re-processes stored customProperties through the write converter on PATCH. Proposed one-line fix: change line 47 fromNewMetadataStructValue(string(asJSON))toNewMetadataStructValue(base64.StdEncoding.EncodeToString(asJSON))— identical to what the ByteValue read path already does at line 60-61name/externalIdfilter injection — server constructs internal filter queries likename = '<value>'without escaping; backslash/quotes break the participle lexercustomPropertieskeys — Go JSON decoder fails on surrogate pairs and non-UTF-8 characters in property key namesMetadataIntValueint32/int64 mismatch — spec said int64, server usesStringToInt32()(fixed in this PR for int_value; see also ID format discussion above)ValidateIDAsInt32) despite spec declaring int64 (see section above)stepvstimestamptype inconsistency — both aretype: string, format: int64in spec, but Go struct hasstepasint64(JSON number) andtimestampasstring(JSON string)artifactTypeimmutable but in update schema — server rejects type changes on PATCH, and infers"unknown"when field is omitted; update schema should either excludeartifactTypeor mark itreadOnlyTest plan
make openapi/validate— both specs pass@flaky(reruns=2)for occasionalUnsatisfiable(see known limitation below)test_catalog_stateful.pymake test-fuzzfrom scratch (kind cluster creation → deploy → test): no timing failurescd clients/python && make test-fuzzand stateless/stateful tests directly — confirmed greenUnsatisfiable— needs further investigationThe model-registry stateful test (
test_mr_api_stateful) can non-deterministically fail withhypothesis.errors.Unsatisfiableeven after thousands of successful API calls. This happens when Hypothesis's state machine exhausts 1000 attempts to find a valid next transition. The failure is mitigated with@pytest.mark.flaky(reruns=2)but can still occur.What we know:
allOfcomposition.Root cause hypothesis:
Schemathesis does not fully enforce constraints inherited through
allOfduring data generation. When the spec has tight constraints (e.g.,pattern: "^[1-9][0-9]{0,8}$"on IDs,minLength: 1on names), Hypothesis generates many invalid values that get filtered, eventually hitting the 1000-filter hard limit. The hooks fix values AFTER generation, but Hypothesis's internal filtering happens BEFORE the hooks run.Possible directions for investigation:
before_generate_bodyhook — could constrain generation strategies at the source rather than fixing values after the fact, reducing the filter rejection rateUnsatisfiablethreshold is hardcoded in Hypothesis; a custom wrapper could retry with a different seed"^[0-9]+$"instead of"^[1-9][0-9]{0,8}$") to give Hypothesis more room, at the cost of less precise spec documentation--hypothesis-show-statisticsverbosityThe
make test-fuzztarget passes--hypothesis-show-statisticsto both stateless and stateful pytest runs. This produces extensive per-test statistics output that is useful for debugging Hypothesis generation issues but noisy for routine runs. Consider removing it from the default target and keeping it as an opt-in flag (e.g.,make test-fuzz STATS=1).🤖 Generated with Claude Code