- Hybrid memory/file support: small bodies stay in memory, large bodies are read from NGINX temporary files.
- Memory pre-allocation is capped at 1MB to avoid large upfront allocations. In-memory accumulation may still grow up to the configured `inference_bbr_max_body_size` limit; large payloads spill to disk and are read incrementally.
@@ -87,15 +88,16 @@ Current behavior and defaults
- Directive `inference_epp_header_name` configures the upstream header name to read from EPP responses (default `X-Inference-Upstream`).
- Directive `inference_epp_timeout_ms` sets the gRPC timeout for EPP communication (default `200ms`).
- Directive `inference_epp_ca_file /path/to/ca.crt` specifies the CA certificate file used for TLS verification (optional).
- EPP follows the Gateway API Inference Extension specification: it performs a headers-only exchange, reads header mutations from responses, and sets the upstream header for endpoint selection.
- The `$inference_upstream` NGINX variable exposes the EPP-selected endpoint (read from the header configured by `inference_epp_header_name`) and can be used in `proxy_pass` directives.
- Fail-open/closed:
-  - `inference_bbr_failure_mode_allow on|off` and `inference_epp_failure_mode_allow on|off` control fail-open vs fail-closed behavior.
-  - In fail-closed mode, BBR enforces size limits and may return `413 Request Entity Too Large` or `500 Internal Server Error` on processing errors; EPP failures return `502 Bad Gateway`.
-  - In fail-open mode, processing continues without terminating the request.
+  - `inference_epp_failure_mode_allow on|off` controls EPP fail-open vs fail-closed behavior.
+  - EPP fail-closed mode returns `500 Internal Server Error` on EPP processing failures.
+  - EPP fail-open mode continues processing when EPP fails. When `inference_epp_failure_mode_allow` is `on`, you can configure `inference_default_upstream` to specify a fallback upstream that is used when EPP fails.
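
The EPP directives described above could be combined roughly as follows. This is a minimal sketch, not a verified configuration: the `location` context, the `/v1/` path, the bare-number argument to `inference_epp_timeout_ms`, and the `host:port` form and placeholder address given to `inference_default_upstream` are all assumptions. Any directive needed to enable EPP or to point it at an endpoint picker is not covered by this section and is left out.

```nginx
# Hypothetical location block; context and argument forms are assumptions.
location /v1/ {
    # Header read from EPP responses (shown with its documented default).
    inference_epp_header_name X-Inference-Upstream;

    # gRPC timeout for EPP communication (assumed to take a bare millisecond value).
    inference_epp_timeout_ms 200;

    # Optional CA certificate file for TLS verification.
    inference_epp_ca_file /path/to/ca.crt;

    # Fail open: continue processing when EPP fails.
    inference_epp_failure_mode_allow on;

    # Fallback upstream for the fail-open case (placeholder address).
    inference_default_upstream 127.0.0.1:8080;

    # Route to the endpoint selected by EPP.
    proxy_pass http://$inference_upstream;
}
```

With `inference_epp_failure_mode_allow off`, the same block would instead return `500 Internal Server Error` when EPP processing fails, and the fallback upstream would presumably not be consulted.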