[TT-16767][Global] [Implementation] Centralised Error Overrides Infrastructure #7867
MFCaballero merged 32 commits into master
This PR introduces a centralized Error Overrides feature, enabling the customization of error responses at the gateway level. This allows users to standardize error formats, hide internal details, and provide custom messages for both gateway-generated errors (e.g., auth failures, rate limits) and upstream service errors (e.g., 4xx/5xx). The implementation is optimized for performance, featuring a near-zero overhead path when no overrides are configured. All matching rules, including regular expressions and inline templates, are pre-compiled at gateway startup to ensure minimal latency.
Files Changed Analysis
This is a significant feature addition, reflected in the 30 files changed with 2,398 additions and only 16 deletions. The changes are well-organized:
Architecture & Impact Assessment
Error Handling Flow with Overrides
sequenceDiagram
participant Client
participant Gateway
participant ErrorHandler
participant OverrideEngine
Client->>Gateway: Makes a request
Gateway-->>ErrorHandler: Request fails (e.g., auth error, upstream 5xx)
ErrorHandler->>OverrideEngine: tryWriteOverride(code, message, body)
alt Rule matches
OverrideEngine-->>ErrorHandler: Return OverrideResult{new_code, new_body, ...}
ErrorHandler->>ErrorHandler: writeOverrideResponse()
ErrorHandler-->>Client: HTTP Response with custom body/code/headers
else No matching override
OverrideEngine-->>ErrorHandler: Return nil (no match)
ErrorHandler->>ErrorHandler: writeTemplateErrorResponse() (default behavior)
ErrorHandler-->>Client: Default error template response
end
Scope Discovery & Context Expansion
Metadata
Powered by Visor from Probelabs. Last updated: 2026-04-20T15:49:30.966Z | Triggered by: pr_updated | Commit: 0e739a5
API Changes
--- prev.txt 2026-04-20 15:11:20.276889259 +0000
+++ current.txt 2026-04-20 15:11:14.634751057 +0000
@@ -310,6 +310,10 @@
// SecurityRequirements stores all OAS security requirements (auto-populated from OpenAPI description import)
// When len(SecurityRequirements) > 1, OR logic is automatically applied
SecurityRequirements [][]string `json:"security_requirements,omitempty" bson:"security_requirements,omitempty"`
+
+ // ErrorOverrides contains the configurations for error response customization.
+ ErrorOverrides ErrorOverridesMap `bson:"error_overrides" json:"error_overrides"`
+ ErrorOverridesDisabled bool `bson:"error_overrides_disabled" json:"error_overrides_disabled" `
}
APIDefinition represents the configuration for a single proxied API and it's
versions.
@@ -523,6 +527,72 @@
Headers map[string]string `bson:"headers" json:"headers"`
}
+type ErrorMatcher struct {
+ // Flag matches against the error classification flag from the request context.
+ Flag errors.ResponseFlag `bson:"flag,omitempty" json:"flag,omitempty"`
+
+ // MessagePattern is a regex pattern to match against the response body.
+ MessagePattern string `bson:"message_pattern,omitempty" json:"message_pattern,omitempty"`
+
+ // BodyField is a JSON path (gjson syntax) to extract a value from the response body.
+ BodyField string `bson:"body_field,omitempty" json:"body_field,omitempty"`
+
+ // BodyValue is the expected value at BodyField for the match to succeed.
+ BodyValue string `bson:"body_value,omitempty" json:"body_value,omitempty"`
+
+ // CompiledPattern is the pre-compiled regex for MessagePattern.
+ CompiledPattern *regexp.Regexp `bson:"-" json:"-" ignored:"true"`
+}
+ ErrorMatcher defines additional matching criteria for error overrides.
+
+func (m *ErrorMatcher) Compile() error
+ Compile compiles the MessagePattern regex if present. Should be called after
+ unmarshaling from JSON or YAML.
+
+type ErrorOverride struct {
+ // Match contains optional additional matching criteria.
+ Match *ErrorMatcher `bson:"match,omitempty" json:"match,omitempty"`
+
+ // Response defines the response to return when matched.
+ Response ErrorResponse `bson:"response" json:"response"`
+
+ // Has unexported fields.
+}
+ ErrorOverride combines an optional matcher with its response.
+
+func (e *ErrorOverride) GetCompiledTemplate(isXML bool) any
+ GetCompiledTemplate returns the pre-compiled template for the given content
+ type. Returns nil if no inline Body template was compiled (e.g., using file
+ template).
+
+func (e *ErrorOverride) HasCompiledTemplate() bool
+ HasCompiledTemplate returns true if this override has a pre-compiled inline
+ Body template.
+
+func (e *ErrorOverride) SetCompiledTemplates(textTmpl, htmlTmpl any)
+ SetCompiledTemplates stores the pre-compiled templates for inline Body.
+
+type ErrorOverridesMap map[string][]ErrorOverride
+ ErrorOverridesMap maps status codes to their override rules.
+
+type ErrorResponse struct {
+ // StatusCode is the HTTP status code to return.
+ StatusCode int `bson:"status_code" json:"status_code"`
+
+ // Body is the HTTP response body (literal or inline template).
+ Body string `bson:"body,omitempty" json:"body,omitempty"`
+
+ // Message is the semantic error message passed to templates as {{.Message}}.
+ Message string `bson:"message,omitempty" json:"message,omitempty"`
+
+ // Template references an error template file in the templates/ directory.
+ Template string `bson:"template,omitempty" json:"template,omitempty"`
+
+ // Headers are HTTP headers to include in the response.
+ Headers map[string]string `bson:"headers,omitempty" json:"headers,omitempty"`
+}
+ ErrorResponse defines the override response for error overrides.
+
type EventHandlerMetaConfig struct {
Events map[TykEvent][]EventHandlerTriggerConfig `bson:"events" json:"events"`
}
@@ -3003,6 +3073,77 @@
func (et *EnforceTimeout) Fill(meta apidef.HardTimeoutMeta)
Fill fills *EnforceTimeout from apidef.HardTimeoutMeta.
+type ErrorMatcher struct {
+ // Flag matches against the error classification flag from the request context.
+ Flag errors.ResponseFlag `bson:"flag,omitempty" json:"flag,omitempty"`
+
+ // MessagePattern is a regex pattern to match against the response body.
+ MessagePattern string `bson:"messagePattern,omitempty" json:"messagePattern,omitempty"`
+
+ // BodyField is a JSON path (gjson syntax) to extract a value from the response body.
+ BodyField string `bson:"bodyField,omitempty" json:"bodyField,omitempty"`
+
+ // BodyValue is the expected value at BodyField for the match to succeed.
+ BodyValue string `bson:"bodyValue,omitempty" json:"bodyValue,omitempty"`
+}
+ ErrorMatcher defines additional matching criteria for error overrides.
+
+func (em *ErrorMatcher) ExtractTo(api *apidef.ErrorMatcher)
+
+type ErrorOverride struct {
+ // Match contains optional additional matching criteria.
+ Match *ErrorMatcher `bson:"match,omitempty" json:"match,omitempty"`
+
+ // Response defines the response to return when matched.
+ Response ErrorResponse `bson:"response" json:"response"`
+}
+ ErrorOverride combines an optional matcher with its response.
+
+func (eo *ErrorOverride) ExtractTo(api *apidef.ErrorOverride)
+
+func (eo *ErrorOverride) Fill(api apidef.ErrorOverride)
+
+type ErrorOverrides struct {
+ // Enabled determines if error overrides are active for this API.
+ // Maps to Tyk classic API definition: `error_overrides_disabled`
+ Enabled bool `bson:"enabled" json:"enabled"`
+
+ // Value contains the map of status codes to their override rules.
+ Value ErrorOverridesMap `bson:"value,omitempty" json:"value,omitempty"`
+}
+ ErrorOverrides defines the OAS extension configuration for error overrides.
+
+func (e *ErrorOverrides) ExtractTo(api *apidef.APIDefinition)
+
+func (e *ErrorOverrides) Fill(api apidef.APIDefinition)
+
+type ErrorOverridesMap map[string][]ErrorOverride
+ ErrorOverridesMap maps status codes to their override rules.
+
+func (e *ErrorOverridesMap) ExtractTo(api *apidef.APIDefinition)
+
+func (e *ErrorOverridesMap) Fill(api apidef.APIDefinition)
+
+type ErrorResponse struct {
+ // StatusCode is the HTTP status code to return.
+ StatusCode int `bson:"statusCode" json:"statusCode"`
+
+ // Body is the HTTP response body (literal or inline template).
+ Body string `bson:"body,omitempty" json:"body,omitempty"`
+
+ // Message is the semantic error message passed to templates as {{.Message}}.
+ Message string `bson:"message,omitempty" json:"message,omitempty"`
+
+ // Template references an error template file in the templates/ directory.
+ Template string `bson:"template,omitempty" json:"template,omitempty"`
+
+ // Headers are HTTP headers to include in the response.
+ Headers map[string]string `bson:"headers,omitempty" json:"headers,omitempty"`
+}
+ ErrorResponse defines the override response for error overrides.
+
+func (er ErrorResponse) ExtractTo(api *apidef.ErrorResponse)
+
type EventHandler struct {
// Enabled enables the event handler.
//
@@ -5576,6 +5717,8 @@
Server Server `bson:"server" json:"server"` // required
// Middleware contains the configurations related to the Tyk middleware.
Middleware *Middleware `bson:"middleware,omitempty" json:"middleware,omitempty"`
+ // ErrorOverrides contains the configurations for error response customization.
+ ErrorOverrides *ErrorOverrides `bson:"errorOverrides,omitempty" json:"errorOverrides,omitempty"`
}
XTykAPIGateway contains custom Tyk API extensions for the OpenAPI
definition. The values for the extensions are stored inside the OpenAPI
@@ -6829,6 +6972,25 @@
// ```
OverrideMessages map[string]TykError `bson:"override_messages" json:"override_messages"`
+ // ErrorOverrides allows you to customize the error responses that the Gateway will return to API clients.
+ // This configuration will be used to override both Gateway-generated errors (e.g. authentication failures, rate limits, validation errors)
+ // and errors returned by the upstream service (4xx/5xx responses from backend APIs).
+ // Rules are organized by HTTP status code and can include additional matching criteria.
+ // These rules will be superseded by any overrides configured in the API definition
+ //
+ // Sample Override Setting
+ // ```
+ // "error_overrides": {
+ // "500": [{
+ // "response": {
+ // "status_code": 503,
+ // "body": "{\"error\": \"Service temporarily unavailable\"}"
+ // }
+ // }]
+ // }
+ // ```
+ ErrorOverrides apidef.ErrorOverridesMap `json:"error_overrides,omitempty"`
+
// Cloud flag shows the Gateway runs in Tyk Cloud.
Cloud bool `json:"cloud"`
@@ -9704,6 +9866,12 @@
APIError is generic error object returned if there is something wrong with
the request
+type APIErrorWithContext struct {
+ Message htmltemplate.HTML
+ StatusCode int
+}
+ APIErrorWithContext provides context for error override templates.
+
type APISpec struct {
*apidef.APIDefinition
OAS oas.OAS
@@ -9770,6 +9938,7 @@
// all primitives on every JSON-RPC request that doesn't match a VEM.
// This is a convenience flag that combines ToolsAllowListEnabled, ResourcesAllowListEnabled, and PromptsAllowListEnabled.
MCPAllowListEnabled bool
+
// Has unexported fields.
}
APISpec represents a path specification for an API, to avoid enumerating
@@ -9806,6 +9975,10 @@
func (s *APISpec) FireEvent(name apidef.TykEvent, meta interface{})
+func (a *APISpec) GetCompiledErrorOverrides() *CompiledErrorOverrides
+ GetCompiledErrorOverrides returns the compiled error overrides for O(1)
+ lookup.
+
func (a *APISpec) GetPRMConfig() *oas.ProtectedResourceMetadata
GetPRMConfig returns the Protected Resource Metadata configuration if the
API is an OAS API Definition (OAS API, MCP Proxy, Stream API) with PRM
@@ -9832,6 +10005,9 @@
func (a *APISpec) SanitizeProxyPaths(r *http.Request)
+func (a *APISpec) SetCompiledErrorOverrides(compiled *CompiledErrorOverrides)
+ SetCompiledErrorOverrides stores the compiled error overrides.
+
func (a *APISpec) StopSessionManagerPool()
func (a *APISpec) StripListenPath(reqPath string) string
@@ -10195,6 +10371,23 @@
ObjectPostProcess does CoProcessObject post-processing (adding/removing
headers or params, etc.).
+type CompiledErrorOverrides struct {
+ // ByExactCode maps exact status codes to their override rules.
+ ByExactCode map[int][]*apidef.ErrorOverride
+
+ // ByPrefix maps status code prefixes to pattern rules.
+ ByPrefix map[int][]*apidef.ErrorOverride
+}
+ CompiledErrorOverrides provides lookup for error overrides by status code.
+
+func CompileErrorOverrides(overrides apidef.ErrorOverridesMap) *CompiledErrorOverrides
+ CompileErrorOverrides compiles all regex patterns, pre-compiles inline
+ message templates, and builds an indexed lookup structure for O(1) status
+ code matching. Called during config load (gateway-level) or API load
+ (API-level). Compilation failures are logged as warnings and those rules
+ are skipped. Returns nil if no overrides are provided or all rules failed to
+ compile.
+
type ComplexityFailReason int
const (
@@ -10372,10 +10565,58 @@
most middleware will invoke the ErrorHandler if something is wrong with the
request and halt the request processing through the chain
+func (e *ErrorHandler) ExecuteErrorTemplate(w http.ResponseWriter, tmpl TemplateExecutor, data any, errCode int) *http.Response
+ ExecuteErrorTemplate executes a template and captures output for analytics.
+ Uses io.MultiWriter to write to both the response and a buffer for
+ recording.
+
func (e *ErrorHandler) HandleError(w http.ResponseWriter, r *http.Request, errMsg string, errCode int, writeResponse bool)
HandleError is the actual error handler and will store the error details in
analytics if analytics processing is enabled.
+func (e *ErrorHandler) SetErrorResponseHeaders(w http.ResponseWriter, contentType string) http.Header
+ SetErrorResponseHeaders sets common error response headers on both the
+ ResponseWriter and returns a copy for analytics recording.
+
+type ErrorOverrides struct {
+ Spec *APISpec
+ Gw *Gateway
+}
+ ErrorOverrides provides centralized error override logic for both
+ Tyk-generated errors (via HandleError) and upstream error responses (via
+ response middleware).
+
+func NewErrorOverrides(spec *APISpec, gw *Gateway) *ErrorOverrides
+ NewErrorOverrides creates a new ErrorOverrides instance.
+
+func (o *ErrorOverrides) ApplyOverride(r *http.Request, statusCode int, body []byte) *OverrideResult
+ ApplyOverride attempts to match and apply an override for the given error.
+ Uses O(1) lookup by status code, then checks additional matching criteria.
+ Returns nil if no override matches.
+
+func (o *ErrorOverrides) ApplyUpstreamOverride(statusCode int, readBody func() []byte) *OverrideResult
+ ApplyUpstreamOverride applies overrides for upstream 4xx/5xx responses.
+ Uses lazy body reading via closure.
+
+type ErrorResponseContext struct {
+ // ContentType is the Content-Type header value to use in the response.
+ ContentType string
+
+ // TemplateExtension is the file extension for template lookup ("json" or "xml").
+ TemplateExtension string
+
+ // IsXML indicates whether XML content type was detected.
+ // When true, text/template is used; otherwise html/template is used.
+ IsXML bool
+}
+ ErrorResponseContext holds content-type detection results for error
+ responses. Used to determine template extension and template engine
+ selection.
+
+func DetectErrorResponseContext(r *http.Request) *ErrorResponseContext
+ DetectErrorResponseContext extracts content type info from the request.
+ Follows the same pattern as writeTemplateErrorResponse for consistency.
+
type EventCurcuitBreakerMeta struct {
EventMetaDefault
Path string
@@ -10641,6 +10882,10 @@
func (gw *Gateway) GetCoProcessGrpcServerTargetURL() (*url.URL, error)
+func (gw *Gateway) GetCompiledErrorOverrides() *CompiledErrorOverrides
+ GetCompiledErrorOverrides returns the compiled error overrides for O(1)
+ lookup.
+
func (gw *Gateway) GetConfig() config.Config
func (gw *Gateway) GetLoadedAPIIDs() []model.LoadedAPIInfo
@@ -10710,6 +10955,9 @@
func (gw *Gateway) SetCheckerHostList()
+func (gw *Gateway) SetCompiledErrorOverrides(compiled *CompiledErrorOverrides)
+ SetCompiledErrorOverrides stores the compiled error overrides.
+
func (gw *Gateway) SetConfig(conf config.Config, skipReload ...bool)
func (gw *Gateway) SetNodeID(nodeID string)
@@ -11775,6 +12023,40 @@
func (k *OrganizationMonitor) SetOrgSentinel(orgChan chan bool, orgId string)
+type OverrideResult struct {
+ // StatusCode is the HTTP status code to return.
+ StatusCode int
+
+ // Headers are additional HTTP headers to include.
+ Headers map[string]string
+
+ // OriginalCode is the original error status code before override.
+ OriginalCode int
+
+ // Has unexported fields.
+}
+ OverrideResult contains the result of applying an error override. Holds
+ context needed for response writing including the matched rule.
+
+func (r *OverrideResult) GetBody() string
+ GetBody returns the response body.
+
+func (r *OverrideResult) GetMessageForTemplate() string
+ GetMessageForTemplate returns the semantic message for {{.Message}} in
+ templates.
+
+func (r *OverrideResult) GetTemplateExecutor(gw *Gateway, errCtx *ErrorResponseContext) TemplateExecutor
+ GetTemplateExecutor returns the template to execute, or nil if body should
+ be written directly.
+
+func (r *OverrideResult) ShouldUseDefaultTemplate() bool
+ ShouldUseDefaultTemplate returns true when only Message is set (no Body,
+ no Template).
+
+func (r *OverrideResult) ShouldWriteDirectly() bool
+ ShouldWriteDirectly returns true if body should be written as-is (no
+ template variables).
+
type PRMMiddleware struct {
*BaseMiddleware
}
@@ -12341,6 +12623,29 @@
func (m *ResponseCacheMiddleware) Name() string
+type ResponseErrorOverrideMiddleware struct {
+ BaseTykResponseHandler
+}
+ ResponseErrorOverrideMiddleware intercepts upstream 4xx/5xx responses and
+ applies configured error overrides before they reach the client.
+
+func (r *ResponseErrorOverrideMiddleware) Base() *BaseTykResponseHandler
+
+func (r *ResponseErrorOverrideMiddleware) Enabled() bool
+
+func (r *ResponseErrorOverrideMiddleware) HandleError(_ http.ResponseWriter, _ *http.Request)
+
+func (r *ResponseErrorOverrideMiddleware) HandleResponse(
+ _ http.ResponseWriter,
+ res *http.Response,
+ req *http.Request,
+ _ *user.SessionState,
+) error
+
+func (r *ResponseErrorOverrideMiddleware) Init(_ any, spec *APISpec) error
+
+func (r *ResponseErrorOverrideMiddleware) Name() string
+
type ResponseGoPluginMiddleware struct {
BaseTykResponseHandler
 Path string // path to .so file
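To illustrate how the pieces in the diff above might fit together, here is a minimal Go sketch of compiling an `error_overrides` map into exact-code and prefix indexes and then matching with a lazily read body. It is an assumption-laden illustration, not the gateway's implementation: the lowercase types are hypothetical, trimmed-down mirrors of the `apidef` types, the handling of `"4xx"`-style keys is guessed from the `ByPrefix` field and the benchmark notes, and `compile`/`lookup` are invented names.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// Hypothetical, trimmed-down mirrors of the apidef types shown in the diff.
type errorMatcher struct {
	MessagePattern string         `json:"message_pattern,omitempty"`
	compiled       *regexp.Regexp // filled in by compile, never serialized
}

type errorResponse struct {
	StatusCode int    `json:"status_code"`
	Body       string `json:"body,omitempty"`
}

type errorOverride struct {
	Match    *errorMatcher `json:"match,omitempty"`
	Response errorResponse `json:"response"`
}

// compiledOverrides indexes rules by exact status code and by leading digit.
type compiledOverrides struct {
	byExactCode map[int][]*errorOverride
	byPrefix    map[int][]*errorOverride
}

// compile pre-compiles regex patterns and builds both indexes.
// Rules whose pattern fails to compile are skipped.
func compile(raw map[string][]errorOverride) *compiledOverrides {
	c := &compiledOverrides{
		byExactCode: map[int][]*errorOverride{},
		byPrefix:    map[int][]*errorOverride{},
	}
	for key, rules := range raw {
		for i := range rules {
			rule := &rules[i]
			if rule.Match != nil && rule.Match.MessagePattern != "" {
				re, err := regexp.Compile(rule.Match.MessagePattern)
				if err != nil {
					continue
				}
				rule.Match.compiled = re
			}
			// Assumption: keys such as "4xx" or "5xx" denote a status class.
			if len(key) == 3 && strings.HasSuffix(strings.ToLower(key), "xx") {
				if prefix, err := strconv.Atoi(key[:1]); err == nil {
					c.byPrefix[prefix] = append(c.byPrefix[prefix], rule)
				}
				continue
			}
			if code, err := strconv.Atoi(key); err == nil {
				c.byExactCode[code] = append(c.byExactCode[code], rule)
			}
		}
	}
	return c
}

// lookup checks exact-code rules first, then prefix rules. The body is read
// at most once, and only when a rule actually needs to inspect it.
func (c *compiledOverrides) lookup(status int, readBody func() []byte) *errorOverride {
	var body []byte
	loaded := false
	for _, rules := range [][]*errorOverride{c.byExactCode[status], c.byPrefix[status/100]} {
		for _, rule := range rules {
			if rule.Match == nil || rule.Match.compiled == nil {
				return rule // no extra criteria: the status code alone matches
			}
			if !loaded {
				body, loaded = readBody(), true
			}
			if rule.Match.compiled.Match(body) {
				return rule
			}
		}
	}
	return nil
}

func main() {
	const cfg = `{
	  "500": [{"response": {"status_code": 503, "body": "{\"error\": \"Service temporarily unavailable\"}"}}],
	  "4xx": [{"match": {"message_pattern": "token.*expired"}, "response": {"status_code": 401}}]
	}`
	var raw map[string][]errorOverride
	if err := json.Unmarshal([]byte(cfg), &raw); err != nil {
		panic(err)
	}
	c := compile(raw)

	reads := 0
	body := func() []byte { reads++; return []byte("token has expired") }

	rule := c.lookup(500, body)
	fmt.Println(rule.Response.StatusCode, "body reads:", reads)
	rule = c.lookup(404, body)
	fmt.Println(rule.Response.StatusCode, "body reads:", reads)
}
```

Passing the body as a closure means a status-only rule never triggers a read, which is the behaviour the `ApplyUpstreamOverride` doc comment describes as lazy body reading.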
Security Issues (2)
No architecture issues found – changes LGTM.
✅ Performance Check Passed
No performance issues found – changes LGTM.
Quality Issues (2)
Powered by Visor from Probelabs. Last updated: 2026-04-20T15:49:11.453Z | Triggered by: pr_updated | Commit: 0e739a5
andyo-tyk left a comment
Proposed some tweaks to the wording
Co-authored-by: andyo-tyk <99968932+andyo-tyk@users.noreply.github.com>
…mport cycling (#7893)
Co-authored-by: Vlad Zabolotnyi <vlad.z@tyk.io>
## Description

This PR adds integration tests for the error overrides feature. Depends on #7867.

CI tests have been run against a master-based branch and are passing.

Base infra taken from @tbuchaillot: https://github.com/tbuchaillot/test-access-logs

<!---TykTechnologies/jira-linter starts here-->
### Ticket Details
<details>
<summary><a href="https://tyktech.atlassian.net/browse/TT-16775" title="TT-16775" target="_blank">TT-16775</a></summary>

| | |
|---------|----|
| Status | Merge |
| Summary | [1b] Testing Centralised ErrorOverrides Infrastructure |

Generated at: 2026-03-17 16:18:07
</details>
<!---TykTechnologies/jira-linter ends here-->
<!-- Provide a general summary of your changes in the Title above -->
## Description
<!-- Describe your changes in detail -->
Upstream errors override implementation

# Error Override Middleware Performance Benchmarks

## Executive Summary

The upstream error override middleware adds **negligible performance overhead**:
- **Success responses**: ~0.82 ns/op overhead (sub-nanosecond status code check)
- **No overrides configured**: ~0.84 ns/op overhead (fast path with map length check)
- **Error responses**: ~0.83 ns/op (all fast paths are essentially identical)
- **Error with no match**: ~19 ns/op (fast rejection)
- **Error with exact match**: ~77 ns/op (includes map lookup and result creation)
- **Error with body inspection**: ~292 ns/op (includes JSON parsing)

**Optimization:** The middleware checks `statusCode >= 400` and `len(ErrorOverrides) > 0` before any processing, ensuring zero impact on success responses and deployments without overrides.

## Detailed Results

### 1. Fast Path Performance

Testing the critical fast path checks that protect the hot path (averaged over 10 runs):

```
BenchmarkShouldProcessResponse/fast_path_success      0.82 ns/op    0 B/op    0 allocs/op
BenchmarkShouldProcessResponse/fast_path_no_config    0.84 ns/op    0 B/op    0 allocs/op
BenchmarkShouldProcessResponse/error_response         0.83 ns/op    0 B/op    0 allocs/op
```

**Key Findings:**
- **All three paths are essentially identical**: 0.82-0.84 ns - differences are measurement noise
- **Sub-nanosecond overhead**: Effectively immeasurable in production
- **Zero allocations** on all paths
- **CPU-optimized**: Branch prediction and L1 cache make the check nearly free
- The function performs two integer comparisons (status >= 400, len > 0) with short-circuit evaluation
- Modern CPUs execute this in a fraction of a nanosecond

**Note on Variance**: At sub-nanosecond scales, individual measurements can vary by ±0.2 ns due to CPU scheduling, cache effects, and branch prediction.
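
As a rough illustration of the fast path described above, the guard can be pictured as a single short-circuit expression. This is a sketch only: `shouldProcessResponse` and the shape of the `overrides` argument are assumptions made for the example, not the middleware's actual signature.

```go
package main

import "fmt"

// shouldProcessResponse is a hypothetical guard mirroring the two checks
// named above: the status code must be an error (>= 400) and at least one
// override must be configured. Both are cheap integer comparisons, and the
// second is skipped entirely for success responses.
func shouldProcessResponse(statusCode int, overrides map[string][]string) bool {
	return statusCode >= 400 && len(overrides) > 0
}

func main() {
	// Illustrative configuration; the values are placeholders for rules.
	configured := map[string][]string{"500": {"rule"}}

	fmt.Println(shouldProcessResponse(200, configured)) // success response: skipped
	fmt.Println(shouldProcessResponse(502, nil))        // no overrides configured: skipped
	fmt.Println(shouldProcessResponse(502, configured)) // error with overrides: processed
}
```

Calling `len` on a nil map is valid in Go and returns 0, so a deployment without overrides needs no separate nil check.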
Statistical averages over multiple runs show all paths perform identically.

### 2. Lazy Body Reader

Testing the lazy body reading mechanism that defers I/O until needed:

```
BenchmarkLazyBodyReader/no_read         7.8 ns/op       0 B/op     0 allocs/op
BenchmarkLazyBodyReader/small_body    660.0 ns/op     600 B/op     4 allocs/op
BenchmarkLazyBodyReader/large_body     9907 ns/op   17496 B/op    10 allocs/op
BenchmarkLazyBodyReader/cached_read     6.4 ns/op       0 B/op     0 allocs/op
BenchmarkLazyBodyReader/restore_body  863.6 ns/op     720 B/op     8 allocs/op
```

**Key Findings:**
- Creating reader has no measurable cost (7.8 ns, 0 allocs)
- Body only read when rule requires inspection
- Cached reads enable multiple rule checks without re-reading (6.4 ns)
- Large bodies respect maxBodySizeForMatching limit (16 KB)
- Restore mechanism preserves full body with minimal overhead

### 3. ApplyUpstreamOverride - Core Matching

Testing the core matching logic for upstream error responses:

```
BenchmarkApplyUpstreamOverride/no_match                     18.9 ns/op    0 B/op    0 allocs/op
BenchmarkApplyUpstreamOverride/exact_match_no_body          77.0 ns/op   32 B/op    1 allocs/op
BenchmarkApplyUpstreamOverride/pattern_match_5xx            78.2 ns/op   32 B/op    1 allocs/op
BenchmarkApplyUpstreamOverride/URS_flag                     84.4 ns/op   32 B/op    1 allocs/op
BenchmarkApplyUpstreamOverride/body_field_match            292.4 ns/op   48 B/op    2 allocs/op
BenchmarkApplyUpstreamOverride/message_pattern_match       261.2 ns/op   32 B/op    1 allocs/op
BenchmarkApplyUpstreamOverride/multiple_rules_first_match   68.7 ns/op   32 B/op    1 allocs/op
BenchmarkApplyUpstreamOverride/multiple_rules_last_match    92.6 ns/op   32 B/op    1 allocs/op
```

**Key Findings:**
- **No match**: 18.9 ns with zero allocations (fast rejection)
- **Exact status code match** (e.g., "503"): 77.0 ns (O(1) hash map lookup)
- **Pattern match** (e.g., "5xx"): 78.2 ns (prefix calculation + map lookup)
- **URS flag matching**: 84.4 ns (integer range check for 500-599)
- **Body field matching**: 292.4 ns (JSON path extraction adds overhead)
- **Regex pattern matching**: 261.2 ns (pre-compiled patterns)
- **Multiple rules**: ~24 ns difference between first and last match (~24 ns per rule iteration)

### 4. CompiledErrorOverrides - Direct Map Access

Testing the optimized compiled structure with direct map lookups:

```
BenchmarkCompiledErrorOverrides/exact_code_lookup    5.9 ns/op    0 B/op    0 allocs/op
BenchmarkCompiledErrorOverrides/prefix_lookup        6.4 ns/op    0 B/op    0 allocs/op
BenchmarkCompiledErrorOverrides/no_match             6.9 ns/op    0 B/op    0 allocs/op
```

**Key Findings:**
- O(1) exact match check: 5.9 ns (direct map lookup)
- Prefix check: 6.4 ns (includes prefix calculation from status code)
- No match: 6.9 ns (checks both exact and prefix maps)
- Zero allocations enable efficient early rejection
- Compiled structure eliminates runtime parsing overhead

### 5. MatchesUpstreamCriteria

Testing different matching criteria types:

```
BenchmarkMatchesUpstreamCriteria/no_criteria                 5.6 ns/op    0 B/op    0 allocs/op
BenchmarkMatchesUpstreamCriteria/URS_flag                    6.0 ns/op    0 B/op    0 allocs/op
BenchmarkMatchesUpstreamCriteria/body_field_small_JSON     193.6 ns/op   16 B/op    1 allocs/op
BenchmarkMatchesUpstreamCriteria/body_field_large_JSON      1576 ns/op    8 B/op    1 allocs/op
BenchmarkMatchesUpstreamCriteria/message_pattern_simple    175.4 ns/op    0 B/op    0 allocs/op
BenchmarkMatchesUpstreamCriteria/message_pattern_complex   205.1 ns/op    0 B/op    0 allocs/op
```

**Key Findings:**
- **No criteria** (match all): 5.6 ns
- **URS flag** (5xx check): 6.0 ns - simplest semantic matching
- **Small JSON body field**: 193.6 ns - gjson path extraction
- **Large JSON body field**: 1576 ns - deeper nesting increases overhead
- **Simple regex**: 175.4 ns - pre-compiled pattern
- **Complex regex**: 205.1 ns - alternation and capture groups
- Zero allocations for flag and regex matching

### 6. Rule Matching Scalability

Performance with varying rule counts (worst case: matching last rule):

```
BenchmarkFindMatchingRuleGeneric/10_rules     60.3 ns/op    0 B/op    0 allocs/op
BenchmarkFindMatchingRuleGeneric/50_rules    329.1 ns/op    0 B/op    0 allocs/op
BenchmarkFindMatchingRuleGeneric/100_rules   565.6 ns/op    0 B/op    0 allocs/op
```

**Key Findings:**
- Linear scaling: ~5.4 ns per rule
- Zero allocations regardless of rule count
- 10 rules (typical): 60.3 ns
- 50 rules (large): 329.1 ns
- 100 rules (extreme): 565.6 ns
- First-match semantics: place frequent rules first

### 7. End-to-End Middleware Performance

Full middleware execution (includes HTTP response handling overhead):

```
BenchmarkHandleResponse/no_override_passthrough     1310 ns/op     672 B/op     9 allocs/op
BenchmarkHandleResponse/success_response_skip      207.4 ns/op     136 B/op     4 allocs/op
BenchmarkHandleResponse/exact_match_status_only     1727 ns/op    1216 B/op    18 allocs/op
BenchmarkHandleResponse/exact_match_with_body       3531 ns/op    1232 B/op    19 allocs/op
BenchmarkHandleResponse/pattern_match_small_body    2595 ns/op    1776 B/op    21 allocs/op
BenchmarkHandleResponse/pattern_match_large_body   13394 ns/op   29171 B/op    21 allocs/op
```

**Note:** These times include benchmark setup overhead (HTTP response object creation). Actual middleware overhead is shown in fast path benchmarks (0.82 ns).

**Key Findings:**
- Success response: 207.4 ns total (middleware: 0.82 ns, rest: test harness)
- Error passthrough: 1310 ns
- Override application: ~1.7-3.5 μs for status/body changes
- Large body: 13.4 μs (dominated by I/O)

### 8. Real-World Scenarios

Production workload simulations:

```
BenchmarkRealWorld/high_traffic_no_override     329.7 ns/op     125 B/op     4 allocs/op
  (99% success, 1% errors without matching rules)
BenchmarkRealWorld/high_traffic_with_override   270.5 ns/op     141 B/op     4 allocs/op
  (98% success, 2% errors with matching overrides)
BenchmarkRealWorld/complex_ruleset               1660 ns/op    1201 B/op    18 allocs/op
  (10 error codes, multiple rules, mixed traffic)
```

**Key Findings:**
- Typical API (1% errors, no override): 329.7 ns average
- With error overrides (2% errors): 270.5 ns average
- Complex configuration: 1.66 μs

**Production Impact Analysis:**

For typical API handling **10,000 req/sec**:

**Scenario 1: No Overrides Configured**
```
10,000 req/sec × 0.84 ns = 8.4 μs/sec = 0.0008% CPU
```

**Scenario 2: 99% Success, 1% Errors**
```
Success: 9,900 req/sec × 0.82 ns = 8.1 μs/sec
Errors:  100 req/sec × 270 ns = 27 μs/sec
Total:   35.1 μs/sec = 0.004% CPU
```

**Scenario 3: High Error Rate (10% errors)**
```
Success: 9,000 req/sec × 0.82 ns = 7.4 μs/sec
Errors:  1,000 req/sec × 270 ns = 270 μs/sec
Total:   277 μs/sec = 0.028% CPU
```

## Performance Impact Analysis

### Hot Path (Every Request)

When **no overrides are configured** (common case):
- Overhead: **0.84 ns** per request
- Memory: 0 bytes allocated
- **Impact: Zero** - immeasurable in production

When **overrides are configured**:
- Success responses: **0.82 ns** (status code check only)
- Error with exact match: **77 ns** (O(1) map lookup)
- Error with pattern match: **78 ns** (prefix + map lookup)
- Error with body inspection: **292 ns** (includes JSON parsing)

### Cold Path (Gateway Startup)

Compilation overhead (one-time at startup):
- Tested via `CompileErrorOverrides` function
- Simple rules: ~1-2 μs
- With regex patterns: ~5-10 μs
- **Impact: Negligible** - happens once

## Memory Allocation Analysis

Memory allocations per operation:

```
Operation                  Allocations    Bytes
----------------------------------------------------
Success response check     0 allocs       0 B
No config check            0 allocs       0 B
No matching rule           0 allocs       0 B
Exact code match           1 alloc        32 B
Pattern match (5xx)        1 alloc        32 B
URS flag match             1 alloc        32 B
Body field match           2 allocs       48 B
Regex pattern match        1 alloc        32 B
Body read (small)          4 allocs       600 B
Body read (large)          10 allocs      17 KB
```

**Key Findings:**
- Zero allocations on all fast/rejection paths
- Single 32B allocation for matched overrides
- Lazy body reading prevents unnecessary allocations
- No memory leaks or unbounded growth

## Scalability Considerations

### Rule Count Impact
- 10 rules: 60.3 ns (typical configuration)
- 50 rules: 329.1 ns (large configuration)
- 100 rules: 565.6 ns (extreme configuration)
- Linear scaling: 5.4 ns per additional rule

### Body Size Impact
- Small bodies (< 1 KB): ~660 ns read time
- Large bodies (16 KB): ~9.9 μs read time
- Respects `maxBodySizeForMatching` limit
- Lazy reading: only when rule requires it

## Conclusions

- **Virtually zero overhead when disabled**: 0.84 ns (map length check) - immeasurable in production
- **Sub-nanosecond fast paths**: All paths (success, no-config, error) have identical overhead (~0.83 ns)
- **Efficient matching**: O(1) status code lookups (exact: 77 ns, pattern: 78 ns)
- **URS flag is fastest semantic matching**: 6.0 ns for simple 5xx range check
- **Body inspection adds reasonable overhead**: 194-1576 ns depending on JSON complexity
- **Scalable**: Linear performance up to 100+ rules with 5.4 ns per rule
- **Memory efficient**: Zero allocations on fast paths, single allocation (32B) for matches
- **Production ready**: < 0.03% CPU impact even with 10% error rate and full override processing

## Recommendations

For **best performance**:
1. Use URS flag for 5xx matching (6.0 ns - fastest semantic match)
2. Use exact status codes for specific errors (77.0 ns)
3. Use pattern matching (5xx) for broad categories (78.2 ns)
4. Place frequently matched rules first (saves ~5 ns per rule skipped)
5. Minimize body inspection when possible (adds ~200-300 ns)

For **body inspection** (when you need to match on response content):

**Use regex patterns when:**
- Large JSON responses or deeply nested structures
- Body size > 1 KB or nesting depth > 2-3 levels
- Performance is critical (regex: 175-205 ns regardless of JSON size)
- Matching error messages or text patterns
- Example: `"message_pattern": "database.*unavailable"`

**Use body field matching when:**
- Small JSON responses (< 1 KB) with shallow structure
- Need precise field extraction (e.g., `error.code == "TIMEOUT"`)
- Fields are at root or 1-2 levels deep
- Performance: 194 ns for small/shallow JSON, but degrades to 1576 ns for large/nested JSON
- Example: `"body_field": "error.code", "body_value": "TIMEOUT"`

**Performance comparison:**
```
Small JSON (< 1 KB, shallow):
- Body field: 193.6 ns ≈ Regex: 175-205 ns (comparable)

Large/nested JSON:
- Body field: 1576 ns vs Regex: 175-205 ns (regex 8x faster!)
```

For **optimal flexibility**:
1. Use URS flag for semantic upstream error matching
2. Use regex patterns for most body matching (consistent performance)
3. Use body field matching only for small JSON with shallow fields
4. Use status code + criteria combinations for precise matching
5. Both exact and pattern (4xx/5xx) matching are very efficient

---

**Test Environment:**
- Machine: Apple M1 Pro
- Go Version: 1.25.1
- OS: macOS (darwin/arm64)
- Benchmark Duration: 1s per benchmark (with 10 runs for fast paths)
- Total Benchmarks: 39
- Run Date: 2026-03-16

## Related Issue
<!-- This project only accepts pull requests related to open issues. -->
<!-- If suggesting a new feature or change, please discuss it in an issue first. -->
<!-- If fixing a bug, there should be an issue describing it with steps to reproduce. -->
<!-- OSS: Please link to the issue here. Tyk: please create/link the JIRA ticket. -->

## Motivation and Context
<!-- Why is this change required? What problem does it solve?
--> ## How This Has Been Tested <!-- Please describe in detail how you tested your changes --> <!-- Include details of your testing environment, and the tests --> <!-- you ran to see how your change affects other areas of the code, etc. --> <!-- This information is helpful for reviewers and QA. --> ## Screenshots (if appropriate) ## Types of changes <!-- What types of changes does your code introduce? Put an `x` in all the boxes that apply: --> - [ ] Bug fix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) - [ ] Refactoring or add test (improvements in base code or adds test coverage to functionality) ## Checklist <!-- Go over all the following points, and put an `x` in all the boxes that apply --> <!-- If there are no documentation updates required, mark the item as checked. --> <!-- Raise up any additional concerns not covered by the checklist. 
--> - [ ] I ensured that the documentation is up to date - [ ] I explained why this PR updates go.mod in detail with reasoning why it's required - [ ] I would like a code coverage CI quality gate exception and have explained why <!---TykTechnologies/jira-linter starts here--> ### Ticket Details <details> <summary> <a href="https://tyktech.atlassian.net/browse/TT-16772" title="TT-16772" target="_blank">TT-16772</a> </summary> | | | |---------|----| | Status | In Code Review | | Summary | [2] Implement Upstream Error Response Overrides | Generated at: 2026-03-17 15:57:40 </details> <!---TykTechnologies/jira-linter ends here--> --------- Co-authored-by: Vlad Zabolotnyi <109525963+vladzabolotnyi@users.noreply.github.com> Co-authored-by: Vlad Zabolotnyi <vlad.z@tyk.io> Co-authored-by: Leonid Bugaev <leonsbox@gmail.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: andrei-tyk <97896463+andrei-tyk@users.noreply.github.com> Co-authored-by: Laurentiu <6229829+lghiur@users.noreply.github.com>
Swagger Changes

dyff between swagger-prev.yml and swagger-current.yml returned two differences

components.schemas
  + three map entries added:
    ErrorMatcher:
    ErrorOverride:
    ErrorResponse:

components.schemas.APIDefinition.properties
  + two map entries added:
    error_overrides:
    error_overrides_disabled:
…rors (#7935) PR for https://tyktech.atlassian.net/browse/TT-16770 --------- Co-authored-by: María Florencia Caballero <66144664+MFCaballero@users.noreply.github.com>
bojank93
left a comment
LGTM.
@MFCaballero can you please merge the PR?
Thank you in advance
🚨 Jira Linter Failed
Commit:
The Jira linter failed to validate your PR. Please check the error details below:
Next Steps
This comment will be automatically deleted once the linter passes.



Description
This PR introduces a centralized Error Overrides feature, allowing for the customization of error responses at the gateway level. This functionality enables users to standardize error formats, hide internal error details, and provide branded or localized messages for both gateway-generated errors (e.g., auth failures, rate limits) and upstream service errors (e.g., 4xx/5xx responses).
The implementation is optimized for performance, featuring a near-zero overhead path when no overrides are configured. All matching rules, including regular expressions and inline templates, are pre-compiled at gateway startup to ensure minimal latency during error handling.
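To illustrate the pre-compilation idea described above, here is a minimal sketch in Go. It is not the code from this PR: the `rule` and `compiledRule` types, the field names, and the `compileRules` and `render` helpers are hypothetical, and only the standard `regexp` and `text/template` packages are assumed. The point is that patterns and templates are parsed once at startup, so the error path only runs already-compiled objects.

```go
package main

import (
	"bytes"
	"fmt"
	"regexp"
	"text/template"
)

// rule is a hypothetical, simplified override rule as it might appear in config.
type rule struct {
	MessagePattern string // regex matched against the error message
	BodyTemplate   string // inline template for the replacement body
}

// compiledRule holds the pre-compiled forms so the request path never compiles.
type compiledRule struct {
	pattern *regexp.Regexp
	body    *template.Template
}

// compileRules runs once (for example at startup or reload) and fails fast on bad input.
func compileRules(rules []rule) ([]compiledRule, error) {
	out := make([]compiledRule, 0, len(rules))
	for i, r := range rules {
		re, err := regexp.Compile(r.MessagePattern)
		if err != nil {
			return nil, fmt.Errorf("rule %d: bad pattern: %w", i, err)
		}
		tmpl, err := template.New(fmt.Sprintf("rule-%d", i)).Parse(r.BodyTemplate)
		if err != nil {
			return nil, fmt.Errorf("rule %d: bad template: %w", i, err)
		}
		out = append(out, compiledRule{pattern: re, body: tmpl})
	}
	return out, nil
}

// render returns the override body for the first rule whose pattern matches msg.
func render(rules []compiledRule, code int, msg string) (string, bool) {
	for _, r := range rules {
		if !r.pattern.MatchString(msg) {
			continue
		}
		var buf bytes.Buffer
		data := map[string]any{"Code": code, "Message": msg}
		if err := r.body.Execute(&buf, data); err != nil {
			return "", false
		}
		return buf.String(), true
	}
	return "", false
}

func main() {
	rules, err := compileRules([]rule{{
		MessagePattern: `database.*unavailable`,
		BodyTemplate:   `{"error":"service unavailable","status":{{.Code}}}`,
	}})
	if err != nil {
		panic(err)
	}
	body, ok := render(rules, 503, "database connection unavailable")
	fmt.Println(body, ok)
}
```

Failing at compile time rather than at match time means an invalid pattern or template surfaces when the configuration is loaded, not on the first error response that happens to hit the rule.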
Error Override Performance Benchmarks
Executive Summary
The error override feature adds negligible performance overhead to the error handling path:
Optimization: The code checks if overrides exist before entering the override path, ensuring minimal impact on existing deployments.
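The early-exit check mentioned above can be sketched as follows. This is an assumption-laden illustration, not the gateway's code: `gatewayConfig`, `overrideRule`, and `tryOverride` are hypothetical names, and the map shape is a guess based on the "map length check" wording in the benchmarks. It shows only the ordering of the cheap checks before any rule lookup happens.

```go
package main

import (
	"fmt"
	"strconv"
)

// overrideRule and gatewayConfig are hypothetical stand-ins for the real config types.
type overrideRule struct {
	Code int    // status code to send instead of the original
	Body string // replacement body
}

type gatewayConfig struct {
	// ErrorOverrides maps a status-code key (e.g. "401") to its rules.
	ErrorOverrides map[string][]overrideRule
}

// tryOverride returns the override to apply, or false when the default
// error handling should run unchanged.
func tryOverride(cfg *gatewayConfig, status int) (overrideRule, bool) {
	// Fast path: nothing configured, so return before doing any other work.
	if len(cfg.ErrorOverrides) == 0 {
		return overrideRule{}, false
	}
	// Non-error responses are never candidates for an override.
	if status < 400 {
		return overrideRule{}, false
	}
	rules := cfg.ErrorOverrides[strconv.Itoa(status)]
	if len(rules) == 0 {
		return overrideRule{}, false
	}
	return rules[0], true
}

func main() {
	empty := &gatewayConfig{}
	_, ok := tryOverride(empty, 500)
	fmt.Println("no config:", ok)

	cfg := &gatewayConfig{ErrorOverrides: map[string][]overrideRule{
		"401": {{Code: 403, Body: "access denied"}},
	}}
	r, ok := tryOverride(cfg, 401)
	fmt.Println("configured:", r.Code, r.Body, ok)
}
```

Because `len` on a nil or empty map is a constant-time read, deployments that never configure overrides pay only for that comparison.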
Detailed Results
1. ApplyOverride - Matching Performance
Testing the core matching logic that determines if an override should be applied:
Key Findings:
2. Flag-Based Matching (Error Classification)
Flag matching uses the error classification system for semantic matching - matching by error type rather than text patterns:
Flag vs Regex Performance Comparison:
Key Findings:
3. WriteOverrideResponse vs WriteTemplateErrorResponse
Direct comparison of error response writing:
Key Findings:
4. Compilation Performance
One-time cost during gateway startup or API reload:
Key Findings:
Fast Path Optimization
Testing the entry point tryWriteOverride with empty vs configured overrides:
Key Findings:
len(e.Spec.GlobalConfig.ErrorOverrides) == 0 before proceeding
Performance Impact Analysis
Hot Path (Every Error Response)
When no overrides are configured (most common case):
When overrides are configured and match:
Cold Path (Gateway Startup)
Compilation happens once during:
Memory Allocation Analysis
Memory allocations per error response:
Key Findings:
Scalability Considerations
Large Body Handling
Multiple Rules
Conclusions
Virtually zero overhead when disabled: ~5 ns (fast path with map length check) - immeasurable in production
Optimized check path: Early exit when no overrides configured prevents significant overhead
Status code matching is fast: Both exact (~55 ns) and pattern (~66 ns) matching are very efficient
Flag matching is ~2.3x faster than regex: 75 ns vs 173 ns for semantic error matching
Faster for simple overrides: Direct message writing is 48% faster than default templates
Acceptable overhead for advanced features: Template execution adds reasonable overhead for the flexibility gained
Efficient matching: O(1) lookups for exact codes, fast flag comparison, pre-compiled regex
No memory leaks: Bounded allocations, pre-compiled patterns
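The matching strategy summarized in these conclusions (exact status code lookup, a class pattern such as 5xx, and flag comparison) can be sketched roughly as below. The `rule` type, the `findRule` helper, the index layout, and the choice to try the exact code before the class pattern are all assumptions made for illustration; only the `RLT` flag name and the 4xx/5xx pattern notation come from the PR text.

```go
package main

import (
	"fmt"
	"strconv"
)

// rule is a hypothetical override rule; an empty Flag matches any error.
type rule struct {
	Flag string // error classification flag, e.g. "RLT" for rate limiting
	Body string
}

// classKey turns 503 into "5xx" so a whole status class shares one map key.
func classKey(status int) string {
	return strconv.Itoa(status/100) + "xx"
}

// findRule does two map lookups (exact code, then class pattern) and returns
// the first rule in configuration order whose flag criterion is satisfied.
func findRule(index map[string][]rule, status int, flag string) (rule, bool) {
	keys := [2]string{strconv.Itoa(status), classKey(status)}
	for _, key := range keys {
		for _, r := range index[key] {
			if r.Flag == "" || r.Flag == flag {
				return r, true
			}
		}
	}
	return rule{}, false
}

func main() {
	index := map[string][]rule{
		"429": {{Flag: "RLT", Body: "slow down"}},
		"5xx": {{Body: "upstream error"}},
	}
	r, ok := findRule(index, 429, "RLT")
	fmt.Println(r.Body, ok)
	r, ok = findRule(index, 502, "")
	fmt.Println(r.Body, ok)
	_, ok = findRule(index, 404, "")
	fmt.Println(ok)
}
```

With this layout the cost of a lookup depends on how many rules share a key rather than on the total number of rules, and a flag comparison is a plain string equality check instead of a regex evaluation.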
Recommendations
For best performance:
For optimal flexibility:
RLT for rate limiting)
Test Environment:
Related Issue
Motivation and Context
How This Has Been Tested
Screenshots (if appropriate)
Types of changes
Checklist
Ticket Details
TT-16767
Generated at: 2026-03-17 16:11:43