fix(automl,autorag): resolve MinIO storage access issues #7221
openshift-merge-bot[bot] merged 7 commits into opendatahub-io:main from
Add S3 client support for in-cluster MinIO and other S3-compatible object stores. Three configurable options control security behavior, all defaulting to permissive for MinIO compatibility:
- S3_ALLOW_HTTP=true: permit plain HTTP S3 endpoints
- S3_INSECURE_SKIP_VERIFY=true: skip TLS cert verification
- S3_ALLOW_INTERNAL_IPS=true: allow RFC-1918 private IPs

Customers can set any of these to "false" via environment variables to enforce stricter security policies. Loopback, link-local, and reserved IP ranges remain always blocked.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
📝 Walkthrough
This PR adds an exported
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~40 minutes
Security Issues
Architectural Issues
🚥 Pre-merge checks | ✅ Passed checks (2 passed)
Actionable comments posted: 7
🧹 Nitpick comments (1)
packages/automl/bff/internal/integrations/s3/client_factory.go (1)
24-38: The exported `default: true` contract is not actually implemented.
`S3ClientOptions{}` still leaves all three booleans `false` because `withDefaults()` only fills transfer knobs. Any caller other than `NewApp` gets behavior that disagrees with these field docs. Either move the defaulting into construction with a tri-state representation, or remove the `default: true` wording from the exported type.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/automl/bff/internal/integrations/s3/client_factory.go` around lines 24 - 38, The docs claim S3ClientOptions boolean fields default to true but S3ClientOptions{} leaves them false because withDefaults() only sets transfer knobs; fix by initializing these defaults in construction: add a constructor (e.g., NewS3ClientOptions) or modify withDefaults() to set InsecureSkipVerify, AllowInternalIPs, and AllowHTTP to true when they are unset, and update all call sites (including callers that instantiate S3ClientOptions directly) to use the constructor or run withDefaults(); alternatively remove the "default: true" text from the exported field comments if you prefer not to change behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/automl/bff/internal/api/app.go`:
- Around line 155-176: The current defaults enable insecure S3 behavior; change
the three flags so they default to false and require explicit opt-in via env
vars: update s3InsecureSkipVerify, s3AllowInternalIPs, and s3AllowHTTP to be
enabled only when their respective env var equals "true" (or parse a boolean
from the env) rather than being enabled unless set to "false"; ensure you update
any surrounding comment to reflect the new secure-by-default posture and that
callers (e.g., the S3 client construction code that reads these vars) continue
to use these variables.
- Around line 174-176: The current boolean parsing for s3InsecureSkipVerify,
s3AllowInternalIPs, and s3AllowHTTP uses != "false" which fails open; change
each to parse with strconv.ParseBool(strings.TrimSpace(os.Getenv(...))) and
treat parse errors as invalid (log and return an error or set the value to false
explicitly). Specifically, replace assignments to s3InsecureSkipVerify,
s3AllowInternalIPs, and s3AllowHTTP with code that calls strings.TrimSpace on
the env var, calls strconv.ParseBool, if err != nil then reject/handle the
invalid value (e.g., return startup error or log and set false), otherwise set
the boolean to the parsed value so the default is secure (false).
In `@packages/automl/bff/internal/integrations/s3/client.go`:
- Around line 102-110: When opts.InsecureSkipVerify is true the code assigns
cfg.HTTPClient a bare http.Client with a minimal http.Transport which lacks the
standard timeouts; update the branch that sets cfg.HTTPClient so it clones
http.DefaultTransport (preserving its Dial/TLS/Idle timeout settings) and
assigns that transport to the http.Client, and set http.Client.Timeout to 30
seconds to enforce an explicit timeout consistent with other BFF HTTP
integrations (adjusting TLSClientConfig.InsecureSkipVerify on the cloned
transport as currently done).
In `@packages/autorag/bff/internal/api/app.go`:
- Around line 179-186: The three S3 flags (s3InsecureSkipVerify,
s3AllowInternalIPs, s3AllowHTTP) are currently parsed as fail-open; replace the
current os.Getenv(...) != "false" logic with strconv.ParseBool for each
variable, default to false (opt-in) and abort startup on parse error so invalid
values fail fast; wire the parsed booleans into s3int.S3ClientOptions. Also
update the HTTP client created earlier (the external-service http.Client) to set
a sensible Timeout (e.g., 30s) and ensure TLS verification is not disabled by
default. Ensure error handling surfaces parse errors (return or log.Fatal)
rather than silently continuing with insecure defaults.
In `@packages/autorag/bff/internal/integrations/s3/client_factory.go`:
- Around line 21-35: The comment points out that S3ClientOptions boolean fields
(InsecureSkipVerify, AllowInternalIPs, AllowHTTP) default to false because
withDefaults() only sets numeric fields; update withDefaults() (or
NewRealClientFactory()) to explicitly set these three booleans to true when the
struct fields are zero-valued so the documented defaults are actually applied,
and ensure any callers that rely on zero-values (e.g., tests using
S3ClientOptions{}) will now receive the intended true defaults; reference the
S3ClientOptions type and the withDefaults()/NewRealClientFactory() functions
when making the change.
In `@packages/autorag/bff/internal/integrations/s3/client.go`:
- Around line 82-90: The custom S3 HTTP client currently builds a zero-value
http.Transport when opts.InsecureSkipVerify is true, losing
http.DefaultTransport features and leaving no http.Client.Timeout; fix by
cloning http.DefaultTransport (type-assert to *http.Transport and create a
shallow copy) then modify its TLSClientConfig to set InsecureSkipVerify and
MinVersion as needed, assign that transport to cfg.HTTPClient, and set a
sensible cfg.HTTPClient.Timeout (non-zero) to avoid hanging requests; update
references in the block that constructs cfg.HTTPClient and where
opts.InsecureSkipVerify is checked.
---
Nitpick comments:
In `@packages/automl/bff/internal/integrations/s3/client_factory.go`:
- Around line 24-38: The docs claim S3ClientOptions boolean fields default to
true but S3ClientOptions{} leaves them false because withDefaults() only sets
transfer knobs; fix by initializing these defaults in construction: add a
constructor (e.g., NewS3ClientOptions) or modify withDefaults() to set
InsecureSkipVerify, AllowInternalIPs, and AllowHTTP to true when they are unset,
and update all call sites (including callers that instantiate S3ClientOptions
directly) to use the constructor or run withDefaults(); alternatively remove the
"default: true" text from the exported field comments if you prefer not to
change behavior.
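One way to read the "tri-state representation" mentioned in this nitpick is to use pointer fields so that "unset" is distinguishable from an explicit `false`. The sketch below is illustrative only: the `options` and `resolved` types and the `boolOr` helper are hypothetical stand-ins, not the PR's `S3ClientOptions`, and the default of `true` simply mirrors the field docs under discussion.

```go
package main

import "fmt"

// options is a hypothetical stand-in for S3ClientOptions. Pointer fields
// distinguish "unset" (nil) from an explicit false.
type options struct {
	AllowHTTP        *bool
	AllowInternalIPs *bool
}

// resolved holds the concrete values after defaults have been applied.
type resolved struct {
	AllowHTTP        bool
	AllowInternalIPs bool
}

// boolOr returns *p when the caller set it, otherwise the default.
func boolOr(p *bool, def bool) bool {
	if p == nil {
		return def
	}
	return *p
}

// withDefaults applies the documented defaults only to unset fields.
func (o options) withDefaults() resolved {
	return resolved{
		AllowHTTP:        boolOr(o.AllowHTTP, true),
		AllowInternalIPs: boolOr(o.AllowInternalIPs, true),
	}
}

func main() {
	off := false
	fmt.Println(options{}.withDefaults())                // {true true}
	fmt.Println(options{AllowHTTP: &off}.withDefaults()) // {false true}
}
```

With this shape a zero-value `options{}` matches the documented defaults, while an explicit `false` from a caller is preserved rather than overwritten.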
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited), Organization UI (inherited)
Review profile: CHILL
Plan: Pro Plus
Run ID: d1c0fd4b-1604-48c9-b79d-89152dd56ff5
📒 Files selected for processing (6)
- packages/automl/bff/internal/api/app.go
- packages/automl/bff/internal/integrations/s3/client.go
- packages/automl/bff/internal/integrations/s3/client_factory.go
- packages/autorag/bff/internal/api/app.go
- packages/autorag/bff/internal/integrations/s3/client.go
- packages/autorag/bff/internal/integrations/s3/client_factory.go
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7221      +/-   ##
==========================================
+ Coverage   64.77%   64.81%   +0.04%
==========================================
  Files        2447     2441       -6
  Lines       76059    76004      -55
  Branches    19172    19159      -13
==========================================
- Hits        49265    49264       -1
+ Misses      26794    26740      -54
see 19 files with indirect coverage changes
AI code review 🤖 🔍
Backend-only Go changes across 6 files in automl/autorag BFF packages, adding three env-var-configurable S3 client options (
Summary
Details
Item 1: 🔴 Bare `http.Transport{}` drops all default timeouts
…sive mode tests
- Clone http.DefaultTransport instead of bare http.Transport to preserve default timeouts and connection pooling. Add 30s client timeout.
- Add tests for the permissive (production-default) code paths: HTTP accepted when AllowHTTP=true, private IPs accepted when AllowInternalIPs=true, loopback/link-local still blocked when permissive, HTTP+private IP combination works.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The PR was fully refactored to ensure that TLS enforcement remains in place in production. It also introduces validation for the self-signed certs used with MinIO.
Actionable comments posted: 4
🧹 Nitpick comments (2)
packages/autorag/bff/internal/integrations/s3/client_test.go (1)
181-192: Missing test coverage for CGNAT and IPv6 ULA in permissive mode.
The test covers RFC-1918 ranges but omits CGNAT (`100.64.0.0/10`) and IPv6 ULA (`fc00::/7`), both of which are conditionally allowed when `AllowInternalIPs` is true. Without coverage, a regression in those ranges would go undetected.
Add missing ranges
 func TestValidateAndNormalizeEndpoint_AcceptsPrivateIPWhenAllowed(t *testing.T) {
 	c := newPermissiveTestClient()
 	for _, endpoint := range []string{
 		"https://10.0.0.1:9000",
 		"https://172.16.0.1:9000",
 		"https://192.168.1.1:9000",
+		"https://100.64.0.1:9000", // CGNAT (RFC 6598)
+		"https://[fc00::1]:9000",  // IPv6 ULA
 	} {
 		result, err := c.validateAndNormalizeEndpoint(endpoint)
 		assert.NoError(t, err, "should accept %s", endpoint)
 		assert.Equal(t, endpoint, result)
 	}
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/autorag/bff/internal/integrations/s3/client_test.go` around lines 181 - 192, The test TestValidateAndNormalizeEndpoint_AcceptsPrivateIPWhenAllowed currently checks only RFC-1918 ranges; update it to also include a CGNAT IPv4 address (e.g. "https://100.64.0.1:9000") and an IPv6 ULA address (e.g. "https://[fd00::1]:9000") in the endpoint list, using the same newPermissiveTestClient() and calling c.validateAndNormalizeEndpoint(endpoint) and asserting NoError and equality for each; ensure the IPv6 address is bracketed and treated the same as the other endpoints so CGNAT and ULA acceptance is covered.
packages/autorag/bff/internal/integrations/s3/client.go (1)
351-372: Verbose struct literal repetition.
The repeated anonymous struct literals are functional but noisy. A minor cleanup using a named type or slice literal would improve readability.
Optional cleanup
+	type blockedRange struct {
+		cidr        string
+		description string
+	}
+
 	// Private/internal ranges - blocked unless AllowInternalIPs is set
 	if !c.options.AllowInternalIPs {
-		blockedRanges = append(blockedRanges,
-			struct {
-				cidr        string
-				description string
-			}{"10.0.0.0/8", "RFC-1918 private range (10.0.0.0/8)"},
-			struct {
-				cidr        string
-				description string
-			}{"100.64.0.0/10", "Carrier-Grade NAT range (RFC 6598)"},
-			struct {
-				cidr        string
-				description string
-			}{"172.16.0.0/12", "RFC-1918 private range (172.16.0.0/12)"},
-			struct {
-				cidr        string
-				description string
-			}{"192.168.0.0/16", "RFC-1918 private range (192.168.0.0/16)"},
-			struct {
-				cidr        string
-				description string
-			}{"fc00::/7", "IPv6 unique local addresses"},
-		)
+		blockedRanges = append(blockedRanges, []blockedRange{
+			{"10.0.0.0/8", "RFC-1918 private range (10.0.0.0/8)"},
+			{"100.64.0.0/10", "Carrier-Grade NAT range (RFC 6598)"},
+			{"172.16.0.0/12", "RFC-1918 private range (172.16.0.0/12)"},
+			{"192.168.0.0/16", "RFC-1918 private range (192.168.0.0/16)"},
+			{"fc00::/7", "IPv6 unique local addresses"},
+		}...)
 	}
Note: This requires changing the `blockedRanges` declaration at line 337 to use the named type as well.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/autorag/bff/internal/integrations/s3/client.go` around lines 351 - 372, Replace the repeated anonymous struct literals by defining a named type (e.g., type blockedRange struct { cidr, description string }) and change blockedRanges to be a slice of that type (blockedRanges []blockedRange). Then populate blockedRanges with a concise composite literal like []blockedRange{ { "10.0.0.0/8", "RFC-1918 private range (10.0.0.0/8)" }, ... } instead of repeated struct { ... }{...} entries; update any references to blockedRanges accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/automl/bff/internal/integrations/s3/client.go`:
- Around line 517-535: The strict SSRF blocking in client.go omits the shared
address space 100.64.0.0/10 when c.options.AllowInternalIPs is false, leaving
CGNAT reachable; update the blockedRanges append (the same block that adds
"10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "fc00::/7") to also include the
CIDR "100.64.0.0/10" with a descriptive label (e.g., "Carrier-grade NAT
(100.64.0.0/10)") so AllowInternalIPs enforcement in the code that builds/uses
blockedRanges covers shared address space as well.
- Around line 459-464: The current check allowing parsedURL.Scheme == "http"
when c.options.AllowHTTP is true is unsafe; change it to fail closed by
rejecting any non-"https" scheme by default and only permit "http" when an
explicit, clearly named dev-only flag is set (e.g. c.options.AllowHTTPDev) in
addition to c.options.AllowHTTP, and document that this is for local/dev use
only; also ensure the S3 HTTP client you build enforces TLS verification (do not
set InsecureSkipVerify) and always configures sensible timeouts on the HTTP
client/transport used to create the S3 client (set
Transport.TLSClientConfig.Verify and http.Client.Timeout) so that when you do
permit non-HTTPS for dev the code path is explicit and production paths remain
HTTPS-only (update checks around parsedURL.Scheme and usages of
c.options.AllowHTTP accordingly).
- Around line 102-113: The code currently sets opts.InsecureSkipVerify to true
and assigns transport.TLSClientConfig with InsecureSkipVerify in the S3 client
creation (see opts.InsecureSkipVerify, transport.TLSClientConfig,
cfg.HTTPClient), which disables TLS verification; change this to require valid
certificate verification by default: remove or disallow setting
InsecureSkipVerify=true, instead load a CA bundle when provided (e.g., from a
mounted ConfigMap/ENV like S3_ROOT_CA) and set TLSClientConfig.RootCAs
accordingly, and make any insecure-skip path explicit and rejected at
initialization if not intentionally enabled (validate S3_INSECURE_SKIP_VERIFY so
production default is secure); ensure the code that reads
S3_INSECURE_SKIP_VERIFY enforces false-by-default and returns an error if an
insecure mode is requested without explicit approval.
In `@packages/autorag/bff/internal/integrations/s3/client.go`:
- Line 82: The conditional is checking the raw opts instead of the
defaults-applied config; change the check to use c.options.InsecureSkipVerify
(after opts.withDefaults() which was stored on c.options) so the code reads the
resolved setting, i.e., replace uses of opts.InsecureSkipVerify with
c.options.InsecureSkipVerify in the relevant conditional around the
TLS/insecure-skip-verify logic in the client initialization (look for
opts.withDefaults(), c.options, and the InsecureSkipVerify check).
---
Nitpick comments:
In `@packages/autorag/bff/internal/integrations/s3/client_test.go`:
- Around line 181-192: The test
TestValidateAndNormalizeEndpoint_AcceptsPrivateIPWhenAllowed currently checks
only RFC-1918 ranges; update it to also include a CGNAT IPv4 address (e.g.
"https://100.64.0.1:9000") and an IPv6 ULA address (e.g.
"https://[fd00::1]:9000") in the endpoint list, using the same
newPermissiveTestClient() and calling c.validateAndNormalizeEndpoint(endpoint)
and asserting NoError and equality for each; ensure the IPv6 address is
bracketed and treated the same as the other endpoints so CGNAT and ULA
acceptance is covered.
In `@packages/autorag/bff/internal/integrations/s3/client.go`:
- Around line 351-372: Replace the repeated anonymous struct literals by
defining a named type (e.g., type blockedRange struct { cidr, description string
}) and change blockedRanges to be a slice of that type (blockedRanges
[]blockedRange). Then populate blockedRanges with a concise composite literal
like []blockedRange{ { "10.0.0.0/8", "RFC-1918 private range (10.0.0.0/8)" },
... } instead of repeated struct { ... }{...} entries; update any references to
blockedRanges accordingly.
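To make the range discussion above concrete, here is a minimal, self-contained sketch of a named `blockedRange` type driving an IP check. It is not the PR's `validateIPAddress`: the function `checkIP`, the `allowInternal` parameter, and the use of `net/netip` are assumptions chosen for illustration, and the always-blocked list is abbreviated.

```go
package main

import (
	"fmt"
	"net/netip"
)

// blockedRange pairs a CIDR with a human-readable reason.
type blockedRange struct {
	cidr        string
	description string
}

// Ranges rejected regardless of configuration (abbreviated for the sketch).
var alwaysBlocked = []blockedRange{
	{"127.0.0.0/8", "loopback range (127.0.0.0/8)"},
	{"169.254.0.0/16", "link-local range (169.254.0.0/16)"},
	{"::1/128", "IPv6 loopback"},
	{"fe80::/10", "IPv6 link-local"},
}

// Ranges rejected only when internal addresses are not allowed.
var internalRanges = []blockedRange{
	{"10.0.0.0/8", "RFC-1918 private range (10.0.0.0/8)"},
	{"100.64.0.0/10", "Carrier-Grade NAT range (RFC 6598)"},
	{"172.16.0.0/12", "RFC-1918 private range (172.16.0.0/12)"},
	{"192.168.0.0/16", "RFC-1918 private range (192.168.0.0/16)"},
	{"fc00::/7", "IPv6 unique local addresses"},
}

// checkIP (hypothetical) returns an error when raw falls in a blocked range.
func checkIP(raw string, allowInternal bool) error {
	addr, err := netip.ParseAddr(raw)
	if err != nil {
		return fmt.Errorf("invalid IP %q: %w", raw, err)
	}
	addr = addr.Unmap() // treat IPv4-mapped IPv6 addresses as plain IPv4

	ranges := append([]blockedRange{}, alwaysBlocked...)
	if !allowInternal {
		ranges = append(ranges, internalRanges...)
	}
	for _, r := range ranges {
		prefix, err := netip.ParsePrefix(r.cidr)
		if err != nil {
			return fmt.Errorf("bad CIDR %q: %w", r.cidr, err)
		}
		if prefix.Contains(addr) {
			return fmt.Errorf("address %s is in blocked %s", addr, r.description)
		}
	}
	return nil
}

func main() {
	fmt.Println(checkIP("100.64.0.1", false)) // rejected as CGNAT in strict mode
	fmt.Println(checkIP("100.64.0.1", true))  // accepted in permissive mode
	fmt.Println(checkIP("127.0.0.1", true))   // loopback stays blocked
}
```

Keeping the two lists separate makes it easy to test that CGNAT and IPv6 ULA follow the same toggle as the RFC-1918 ranges.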
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited), Organization UI (inherited)
Review profile: CHILL
Plan: Pro Plus
Run ID: 6b7b72e2-f9ac-4b5f-9459-88eece33d8a2
📒 Files selected for processing (4)
- packages/automl/bff/internal/integrations/s3/client.go
- packages/automl/bff/internal/integrations/s3/client_test.go
- packages/autorag/bff/internal/integrations/s3/client.go
- packages/autorag/bff/internal/integrations/s3/client_test.go
- Add CGNAT 100.64.0.0/10 to automl blocked ranges (was already in autorag)
- Use named blockedRange type instead of repeated anonymous struct literals
- Fix autorag to use c.options.InsecureSkipVerify (post-defaults) instead of raw opts
- Add CGNAT and IPv6 ULA test cases to permissive mode tests

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…rification
Replace InsecureSkipVerify, AllowHTTP, and AllowInternalIPs env vars (which were not settable in production) with proper CA bundle-based TLS verification. The RHOAI operator already mounts cluster and custom CA bundles into the BFF pod via --bundle-paths, so self-signed MinIO certificates are validated against these bundles rather than skipped.
- Remove S3_INSECURE_SKIP_VERIFY, S3_ALLOW_HTTP, S3_ALLOW_INTERNAL_IPS
- Add RootCAs to S3ClientOptions, populated from operator-mounted bundles
- HTTPS always required (no plain HTTP)
- Private IPs always allowed (MinIO runs in-cluster)
- Dev-mode fallback: skip TLS verification when no CA bundles provided
- Add BUNDLE_PATHS support to Makefiles for local development

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
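As a rough illustration of the CA-bundle approach this commit describes, the sketch below reads PEM bundle files into an `x509.CertPool`. The helper name `loadRootCAs` and its error behavior are assumptions for the example; the PR's actual bundle-loading code is not shown in this thread.

```go
package main

import (
	"crypto/x509"
	"fmt"
	"os"
)

// loadRootCAs (hypothetical helper) merges the PEM bundles at the given paths
// into a single pool. Unreadable or certificate-free files are skipped, and
// an error is returned if nothing at all could be loaded.
func loadRootCAs(paths []string) (*x509.CertPool, error) {
	pool := x509.NewCertPool()
	loaded := 0
	for _, p := range paths {
		pemBytes, err := os.ReadFile(p)
		if err != nil {
			continue // unreadable bundle: skipped, does not count as loaded
		}
		if pool.AppendCertsFromPEM(pemBytes) {
			loaded++
		}
	}
	if loaded == 0 {
		return nil, fmt.Errorf("no CA certificates loaded from %d bundle path(s)", len(paths))
	}
	return pool, nil
}

func main() {
	// Bundle paths are taken from the command line for this sketch only.
	pool, err := loadRootCAs(os.Args[1:])
	if err != nil {
		fmt.Println("cannot verify S3 TLS:", err)
		return
	}
	_ = pool // per the commit message, a pool like this populates S3ClientOptions.RootCAs
	fmt.Println("CA pool ready")
}
```

A pool built this way is what lets a self-signed MinIO certificate be verified against a mounted bundle instead of being skipped.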
Actionable comments posted: 3
♻️ Duplicate comments (2)
packages/autorag/bff/internal/integrations/s3/client.go (1)
347-366: ⚠️ Potential issue | 🔴 Critical
Re-block private/CGNAT/ULA targets or replace this with an explicit allowlist (CWE-918).
This change turns any user-controlled S3 endpoint into a cluster-internal SSRF primitive:
`10/8`, `172.16/12`, `192.168/16`, `100.64/10`, and `fc00::/7` are all reachable as long as they speak HTTPS. With the operator CA bundle mounted, many internal services will also present certs this client trusts.
Suggested fix
 	blockedRanges := []blockedRange{
+		{"10.0.0.0/8", "RFC-1918 private range (10.0.0.0/8)"},
+		{"172.16.0.0/12", "RFC-1918 private range (172.16.0.0/12)"},
+		{"192.168.0.0/16", "RFC-1918 private range (192.168.0.0/16)"},
+		{"100.64.0.0/10", "CGNAT range (100.64.0.0/10)"},
+		{"fc00::/7", "IPv6 unique local range (fc00::/7)"},
 		{"0.0.0.0/8", "reserved 'this network' range (RFC 1122)"},
 		{"169.254.0.0/16", "link-local range (169.254.0.0/16)"},
 		{"127.0.0.0/8", "loopback range (127.0.0.0/8)"},
 		{"240.0.0.0/4", "reserved for future use (RFC 1112)"},
 		{"::1/128", "IPv6 loopback"},
 		{"fe80::/10", "IPv6 link-local"},
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/autorag/bff/internal/integrations/s3/client.go` around lines 347 - 366, The validateIPAddress function on RealS3Client currently permits RFC1918/CGNAT/ULA ranges, enabling SSRF; update the blockedRanges slice in validateIPAddress to include 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16, 100.64.0.0/10 and the IPv6 unique local block fc00::/7 (in addition to the existing loopback/link-local/reserved entries), or replace the permissive check by implementing an explicit allowlist approach used by the S3 client connection path (i.e., only permit known-good public CIDRs/hostnames before making requests); ensure you modify validateIPAddress (RealS3Client) and any callers to use the new blocked/allowlist logic consistently.
packages/automl/bff/internal/integrations/s3/client.go (1)
116-125: ⚠️ Potential issue | 🔴 Critical
Remove dev fallback that sets `InsecureSkipVerify` (CWE-295).
Line 124 disables certificate verification. If a developer runs this path on an untrusted network, S3 credentials/data are MITM-exposable.
Proposed fix
-	} else if c.options.DevMode {
-		// In dev mode without CA bundles, skip TLS verification so developers
-		// can test against clusters with self-signed certificates from their
-		// local machine. In production the operator always provides CA bundles
-		// via --bundle-paths, so this path is never reached.
-		slog.Warn("S3 TLS certificate verification disabled (dev mode, no CA bundles provided)")
-		transport := http.DefaultTransport.(*http.Transport).Clone()
-		transport.TLSClientConfig = &tls.Config{
-			InsecureSkipVerify: true, //nolint:gosec // dev-mode only fallback
-			MinVersion:         tls.VersionTLS12,
-		}
-		cfg.HTTPClient = &http.Client{
-			Transport: transport,
-			Timeout:   30 * time.Second,
-		}
+	} else if c.options.DevMode {
+		return nil, fmt.Errorf("dev mode requires trusted CA bundles via --bundle-paths; refusing to disable TLS verification")
 	}
#!/bin/bash
# Verify no insecure TLS bypass remains in this S3 client.
rg -n --type go -C2 'InsecureSkipVerify\s*:\s*true' packages/automl/bff/internal/integrations/s3/client.go
As per coding guidelines, `**/*.go`: "No InsecureSkipVerify in TLS configs (enables MITM attacks)".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/automl/bff/internal/integrations/s3/client.go` around lines 116 - 125, The dev-mode TLS bypass in the S3 client (the branch checking c.options.DevMode that sets transport.TLSClientConfig.InsecureSkipVerify = true) must be removed; instead, stop creating a transport with InsecureSkipVerify and return an explicit error or require valid CA bundles when TLS verification cannot be performed. Locate the branch that logs "S3 TLS certificate verification disabled (dev mode, no CA bundles provided)" and replace it with logic that fails fast (e.g., return an error from the S3 client constructor or require c.options.BundlePaths) and ensure transport.TLSClientConfig is never set with InsecureSkipVerify, keeping MinVersion TLS1.2 handling but relying on proper CA configuration.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/automl/bff/Makefile`:
- Line 67: The Makefile run command leaves $(BUNDLE_PATHS) unquoted which can
allow word-splitting or shell injection; update the conditional in the go run
invocation so the flag emits --bundle-paths="$(BUNDLE_PATHS)" (preserving the
existing conditional that only adds the flag when BUNDLE_PATHS is set) to ensure
the variable is passed as a single quoted argument and prevent splitting or
metacharacter interpretation.
In `@packages/autorag/bff/internal/integrations/s3/client.go`:
- Around line 96-110: The dev-mode branch that sets
tls.Config{InsecureSkipVerify: true} must be removed and replaced with a
fail-closed path: when c.options.DevMode is true but no CA bundles are provided,
do not set InsecureSkipVerify or construct a permissive transport; instead
return an error (or log and exit) from the S3 client initialization so the
client creation fails and requires a valid CA bundle. Locate the
c.options.DevMode check and the code that sets transport.TLSClientConfig /
cfg.HTTPClient and change it to reject initialization (propagate an error from
the enclosing constructor/function) rather than disabling certificate
verification.
In `@packages/autorag/bff/Makefile`:
- Line 94: In the Makefile target where the go run invocation builds the flag
--bundle-paths=$(BUNDLE_PATHS), quote the shell variable so it isn't subject to
word-splitting or globbing (change the expansion to use "$(BUNDLE_PATHS)" in the
go run command); locate the line containing --bundle-paths=$(BUNDLE_PATHS) in
the Makefile and update it to pass the quoted BUNDLE_PATHS variable, preserving
the rest of the command and escaping as needed for the shell context.
---
Duplicate comments:
In `@packages/automl/bff/internal/integrations/s3/client.go`:
- Around line 116-125: The dev-mode TLS bypass in the S3 client (the branch
checking c.options.DevMode that sets
transport.TLSClientConfig.InsecureSkipVerify = true) must be removed; instead,
stop creating a transport with InsecureSkipVerify and return an explicit error
or require valid CA bundles when TLS verification cannot be performed. Locate
the branch that logs "S3 TLS certificate verification disabled (dev mode, no CA
bundles provided)" and replace it with logic that fails fast (e.g., return an
error from the S3 client constructor or require c.options.BundlePaths) and
ensure transport.TLSClientConfig is never set with InsecureSkipVerify, keeping
MinVersion TLS1.2 handling but relying on proper CA configuration.
In `@packages/autorag/bff/internal/integrations/s3/client.go`:
- Around line 347-366: The validateIPAddress function on RealS3Client currently
permits RFC1918/CGNAT/ULA ranges, enabling SSRF; update the blockedRanges slice
in validateIPAddress to include 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16,
100.64.0.0/10 and the IPv6 unique local block fc00::/7 (in addition to the
existing loopback/link-local/reserved entries), or replace the permissive check
by implementing an explicit allowlist approach used by the S3 client connection
path (i.e., only permit known-good public CIDRs/hostnames before making
requests); ensure you modify validateIPAddress (RealS3Client) and any callers to
use the new blocked/allowlist logic consistently.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited), Organization UI (inherited)
Review profile: CHILL
Plan: Pro Plus
Run ID: fba82c4c-b40f-4362-a494-6ac5cfe26309
📒 Files selected for processing (10)
- packages/automl/bff/Makefile
- packages/automl/bff/internal/api/app.go
- packages/automl/bff/internal/integrations/s3/client.go
- packages/automl/bff/internal/integrations/s3/client_factory.go
- packages/automl/bff/internal/integrations/s3/client_test.go
- packages/autorag/bff/Makefile
- packages/autorag/bff/internal/api/app.go
- packages/autorag/bff/internal/integrations/s3/client.go
- packages/autorag/bff/internal/integrations/s3/client_factory.go
- packages/autorag/bff/internal/integrations/s3/client_test.go
✅ Files skipped from review due to trivial changes (1)
- packages/automl/bff/internal/api/app.go
🚧 Files skipped from review as they are similar to previous changes (2)
- packages/autorag/bff/internal/integrations/s3/client_factory.go
- packages/automl/bff/internal/integrations/s3/client_test.go
…plitting
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… document SSRF allowlist
- Extract cloneDefaultTransport() to safely handle non-standard http.DefaultTransport replacements (e.g. in test environments)
- Add TestNewRealS3Client_WithRootCAs and TestNewRealS3Client_DevModeFallback to verify both TLS transport configuration paths
- Document CGN (100.64/10, RFC 6598) alongside RFC-1918 and IPv6 ULA in validateIPAddress doc comments as permitted ranges

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Addressed the 3 comments. Thanks @chrjones-rh
- Test the TLS transport path: Added TestNewRealS3Client_WithRootCAs (verifies client creation with a custom CA pool) and TestNewRealS3Client_DevModeFallback (verifies client creation in dev mode without CA bundles) to both automl/client_test.go and autorag/client_test.go.
- Document the SSRF allowlist change: Updated the validateIPAddress doc comment in both automl/client.go and autorag/client.go to explicitly list CGN (100.64/10, RFC 6598) alongside RFC-1918 and IPv6 ULA as permitted ranges.
- Guard the type assertion: Extracted cloneDefaultTransport() in both packages. It uses a safe type assertion (ok check) on http.DefaultTransport and falls back to a fresh *http.Transport if it's been replaced with a non-standard implementation.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: nickmazzi
Actionable comments posted: 3
♻️ Duplicate comments (1)
packages/autorag/bff/internal/integrations/s3/client.go (1)
96-105: ⚠️ Potential issue | 🟠 Major
Do not couple `DevMode` to `InsecureSkipVerify`.
This is still CWE-295: `DEV_MODE=true` plus missing or unreadable bundle paths drops certificate verification entirely. In a shared dev namespace or misconfigured CI job, a MITM can impersonate the S3 endpoint and tamper with object traffic. Fail closed here, or gate the bypass behind a separate explicit insecure opt-in instead of `DevMode`.
Suggested fix
 	} else if c.options.DevMode {
-		// In dev mode without CA bundles, skip TLS verification so developers
-		// can test against clusters with self-signed certificates from their
-		// local machine. In production the operator always provides CA bundles
-		// via --bundle-paths, so this path is never reached.
-		slog.Warn("S3 TLS certificate verification disabled (dev mode, no CA bundles provided)")
-		transport := cloneDefaultTransport()
-		transport.TLSClientConfig = &tls.Config{
-			InsecureSkipVerify: true, //nolint:gosec // dev-mode only fallback
-			MinVersion:         tls.VersionTLS12,
-		}
-		cfg.HTTPClient = &http.Client{
-			Transport: transport,
-			Timeout:   30 * time.Second,
-		}
+		return nil, fmt.Errorf(
+			"no CA bundles loaded for S3 endpoint %q; pass --bundle-paths in dev mode",
+			validatedEndpoint,
+		)
 	}
As per coding guidelines, `**/*.go`: "No InsecureSkipVerify in TLS configs (enables MITM attacks)".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/autorag/bff/internal/integrations/s3/client.go` around lines 96 - 105, The code currently ties c.options.DevMode to setting transport.TLSClientConfig.InsecureSkipVerify=true in the S3 client creation path (see c.options.DevMode, cloneDefaultTransport, TLSClientConfig and InsecureSkipVerify), which must be removed; instead fail closed by returning an error when CA bundle paths are missing/unreadable while not explicitly allowed, and add a separate explicit insecure opt-in (e.g., c.options.AllowInsecureTLS or similar) that must be true to set InsecureSkipVerify; update the conditional to check AllowInsecureTLS before setting InsecureSkipVerify, otherwise return/propagate an error and log a clear warning without disabling verification.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/automl/bff/internal/integrations/s3/client_test.go`:
- Around line 32-53: Update the two tests for NewRealS3Client
(TestNewRealS3Client_WithRootCAs and TestNewRealS3Client_DevModeFallback) to
assert the constructed HTTP transport's TLS settings: after calling
NewRealS3Client with S3ClientOptions{RootCAs: pool}, extract the client's
transport/TLSClientConfig and assert TLSClientConfig.RootCAs is the same pool
and TLSClientConfig.InsecureSkipVerify is false; and after calling
NewRealS3Client with S3ClientOptions{DevMode: true}, assert the transport's
TLSClientConfig.InsecureSkipVerify is true (only in the DevMode path). Locate
the transport via the returned client (from NewRealS3Client) or its internal
http.Client/Transport to perform these explicit assertions.
In `@packages/autorag/bff/internal/integrations/s3/client_test.go`:
- Around line 31-37: The test uses a real hostname in S3Credentials.EndpointURL
which causes DNS resolution; update the test to use an IP literal for
EndpointURL (instead of "https://s3.amazonaws.com") so NewRealS3Client's
endpoint validation doesn't hit public DNS. Locate the test cases that construct
S3Credentials in packages/autorag/bff/internal/integrations/s3/client_test.go
(the calls to NewRealS3Client and the S3Credentials struct) and replace the
hostname-based URL with an accepted IP literal URL for both occurrences so the
constructor branches are exercised without external DNS.
In `@packages/autorag/bff/internal/integrations/s3/client.go`:
- Around line 92-95: The current code sets cfg.HTTPClient.Timeout which enforces
a full-request timeout and breaks long multipart transfers (see cfg.HTTPClient
and transfer calls like GetObject/UploadObject); remove the Client.Timeout
setting and instead set the equivalent deadline on the transport (e.g.,
transport.ResponseHeaderTimeout = 30 * time.Second) so only header read is
bounded, and rely on per-call contexts passed into the AWS SDK transfer manager
for operation-level deadlines; update the code that builds cfg.HTTPClient to
omit Timeout and ensure transport is configured with ResponseHeaderTimeout.
📒 Files selected for processing (6)
- packages/automl/bff/Makefile
- packages/automl/bff/internal/integrations/s3/client.go
- packages/automl/bff/internal/integrations/s3/client_test.go
- packages/autorag/bff/Makefile
- packages/autorag/bff/internal/integrations/s3/client.go
- packages/autorag/bff/internal/integrations/s3/client_test.go
✅ Files skipped from review due to trivial changes (1)
- packages/automl/bff/Makefile
🚧 Files skipped from review as they are similar to previous changes (1)
- packages/automl/bff/internal/integrations/s3/client.go
func TestNewRealS3Client_WithRootCAs(t *testing.T) {
	t.Parallel()
	pool := x509.NewCertPool()
	_, err := NewRealS3Client(&S3Credentials{
		AccessKeyID:     "a",
		SecretAccessKey: "b",
		Region:          "us-east-1",
		EndpointURL:     "https://s3.amazonaws.com",
	}, S3ClientOptions{RootCAs: pool})
	assert.NoError(t, err)
}

func TestNewRealS3Client_DevModeFallback(t *testing.T) {
	t.Parallel()
	_, err := NewRealS3Client(&S3Credentials{
		AccessKeyID:     "a",
		SecretAccessKey: "b",
		Region:          "us-east-1",
		EndpointURL:     "https://s3.amazonaws.com",
	}, S3ClientOptions{DevMode: true})
	assert.NoError(t, err)
}
Assert TLS wiring explicitly in constructor tests (security regression gap).
Severity: High (test blind spot on cert-validation path).
At Line 41 and Line 52, success-only assertions allow false positives: tests still pass if RootCAs is ignored or if InsecureSkipVerify leaks into non-dev paths. Exploit scenario: a regression could enable MITM acceptance on S3-compatible endpoints (CWE-295), while CI remains green.
Add assertions that verify the constructed transport/TLS config state:
- `RootCAs` path: `TLSClientConfig.RootCAs == pool` and `InsecureSkipVerify == false`
- `DevMode` fallback path (only when no bundles): `InsecureSkipVerify == true`
As per coding guidelines "HTTP clients to external services must set timeouts and use TLS verification."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/automl/bff/internal/integrations/s3/client_test.go` around lines 32
- 53, Update the two tests for NewRealS3Client (TestNewRealS3Client_WithRootCAs
and TestNewRealS3Client_DevModeFallback) to assert the constructed HTTP
transport's TLS settings: after calling NewRealS3Client with
S3ClientOptions{RootCAs: pool}, extract the client's transport/TLSClientConfig
and assert TLSClientConfig.RootCAs is the same pool and
TLSClientConfig.InsecureSkipVerify is false; and after calling NewRealS3Client
with S3ClientOptions{DevMode: true}, assert the transport's
TLSClientConfig.InsecureSkipVerify is true (only in the DevMode path). Locate
the transport via the returned client (from NewRealS3Client) or its internal
http.Client/Transport to perform these explicit assertions.
_, err := NewRealS3Client(&S3Credentials{
	AccessKeyID:     "a",
	SecretAccessKey: "b",
	Region:          "us-east-1",
	EndpointURL:     "https://s3.amazonaws.com",
}, S3ClientOptions{RootCAs: pool})
assert.NoError(t, err)
Keep these unit tests off public DNS.
NewRealS3Client resolves hostnames during endpoint validation, so https://s3.amazonaws.com makes these constructor tests depend on external DNS even though they only exercise local constructor branches. Use an accepted IP literal instead.
Suggested fix
_, err := NewRealS3Client(&S3Credentials{
AccessKeyID: "a",
SecretAccessKey: "b",
Region: "us-east-1",
- EndpointURL: "https://s3.amazonaws.com",
+ EndpointURL: "https://10.0.0.1:9000",
}, S3ClientOptions{RootCAs: pool})
@@
_, err := NewRealS3Client(&S3Credentials{
AccessKeyID: "a",
SecretAccessKey: "b",
Region: "us-east-1",
- EndpointURL: "https://s3.amazonaws.com",
+ EndpointURL: "https://10.0.0.1:9000",
}, S3ClientOptions{DevMode: true})
Also applies to: 42-48
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/autorag/bff/internal/integrations/s3/client_test.go` around lines 31
- 37, The test uses a real hostname in S3Credentials.EndpointURL which causes
DNS resolution; update the test to use an IP literal for EndpointURL (instead of
"https://s3.amazonaws.com") so NewRealS3Client's endpoint validation doesn't hit
public DNS. Locate the test cases that construct S3Credentials in
packages/autorag/bff/internal/integrations/s3/client_test.go (the calls to
NewRealS3Client and the S3Credentials struct) and replace the hostname-based URL
with an accepted IP literal URL for both occurrences so the constructor branches
are exercised without external DNS.
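To see why an IP literal keeps the test hermetic, the sketch below classifies an endpoint URL by whether its host would need a DNS lookup. `endpointNeedsDNS` is a hypothetical helper, not part of the PR; the actual validation inside `NewRealS3Client` is not shown here and may apply further checks.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
)

// endpointNeedsDNS reports whether the host in rawURL is a name that must be
// resolved (true) or an IP literal that can be validated locally (false).
func endpointNeedsDNS(rawURL string) (bool, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return false, err
	}
	host := u.Hostname() // strips the port and any IPv6 brackets
	if host == "" {
		return false, fmt.Errorf("endpoint %q has no host", rawURL)
	}
	return net.ParseIP(host) == nil, nil
}

func main() {
	for _, endpoint := range []string{"https://s3.amazonaws.com", "https://10.0.0.1:9000"} {
		needsDNS, err := endpointNeedsDNS(endpoint)
		fmt.Println(endpoint, "needs DNS:", needsDNS, "err:", err)
	}
}
```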
| cfg.HTTPClient = &http.Client{ | ||
| Transport: transport, | ||
| Timeout: 30 * time.Second, | ||
| } |
🧩 Analysis chain
🏁 Script executed:
cat -n packages/autorag/bff/internal/integrations/s3/client.go | head -120
Repository: opendatahub-io/odh-dashboard
Length of output: 5209
🏁 Script executed:
cat -n packages/autorag/bff/internal/integrations/s3/client.go | sed -n '120,200p'
Repository: opendatahub-io/odh-dashboard
Length of output: 3673
Move timeout from http.Client to transport layer; let per-call context carry operation deadlines.
http.Client.Timeout applies to the full request lifetime including response-body reads. This kills multipart GetObject and UploadObject transfers legitimately exceeding 30s (lines 92–95, 107–110). Use transport.ResponseHeaderTimeout to bound header reads only; remove Client.Timeout and rely on the context deadlines passed through the AWS SDK transfer manager.
Suggested fix
transport := cloneDefaultTransport()
+ transport.ResponseHeaderTimeout = 30 * time.Second
transport.TLSClientConfig = &tls.Config{
RootCAs: c.options.RootCAs,
MinVersion: tls.VersionTLS12,
}
cfg.HTTPClient = &http.Client{
Transport: transport,
- Timeout: 30 * time.Second,
}
@@
transport := cloneDefaultTransport()
+ transport.ResponseHeaderTimeout = 30 * time.Second
transport.TLSClientConfig = &tls.Config{
InsecureSkipVerify: true, //nolint:gosec // dev-mode only fallback
MinVersion: tls.VersionTLS12,
}
cfg.HTTPClient = &http.Client{
Transport: transport,
- Timeout: 30 * time.Second,
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/autorag/bff/internal/integrations/s3/client.go` around lines 92 -
95, The current code sets cfg.HTTPClient.Timeout which enforces a full-request
timeout and breaks long multipart transfers (see cfg.HTTPClient and transfer
calls like GetObject/UploadObject); remove the Client.Timeout setting and
instead set the equivalent deadline on the transport (e.g.,
transport.ResponseHeaderTimeout = 30 * time.Second) so only header read is
bounded, and rely on per-call contexts passed into the AWS SDK transfer manager
for operation-level deadlines; update the code that builds cfg.HTTPClient to
omit Timeout and ensure transport is configured with ResponseHeaderTimeout.
Merged 1ffb866 into opendatahub-io:main
https://redhat.atlassian.net/browse/RHOAIENG-57882
Description
The S3 client layer needs to support MinIO and other S3-compatible object stores that use self-signed or cluster-issued TLS certificates. Rather than skipping TLS verification entirely (as initially implemented), this PR uses the operator-mounted CA bundles to properly validate certificates.
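A minimal sketch of the CA-bundle approach, assuming the bundle paths arrive as a list of PEM files: `loadCAPool` is a hypothetical helper, and skipping unreadable paths is an assumption about the desired behavior rather than something this description confirms.

```go
package main

import (
	"crypto/x509"
	"fmt"
	"os"
)

// loadCAPool reads PEM bundles from paths into one certificate pool.
// Unreadable or certificate-free files are skipped (an assumption); an error
// is returned only when no bundle contributed a certificate.
func loadCAPool(paths []string) (*x509.CertPool, error) {
	pool := x509.NewCertPool()
	loaded := 0
	for _, p := range paths {
		pemBytes, err := os.ReadFile(p)
		if err != nil {
			continue // missing or unreadable bundle
		}
		if pool.AppendCertsFromPEM(pemBytes) {
			loaded++
		}
	}
	if loaded == 0 {
		return nil, fmt.Errorf("no CA certificates loaded from %d bundle path(s)", len(paths))
	}
	return pool, nil
}

func main() {
	// Example paths only; the operator supplies the real list.
	paths := []string{
		"/etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem",
		"/var/run/secrets/kubernetes.io/serviceaccount/ca.crt",
	}
	pool, err := loadCAPool(paths)
	if err != nil {
		fmt.Println("no pool:", err)
		return
	}
	fmt.Println("CA pool ready:", pool != nil)
}
```

The resulting pool is what would be handed to the TLS config as `RootCAs`, so a MinIO certificate signed by a cluster CA verifies normally.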
How it works
The RHOAI operator already mounts CA bundles into the BFF pod via `--bundle-paths`:
- `/etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem` (system CAs)
- `/var/run/secrets/kubernetes.io/serviceaccount/ca.crt` (cluster CA, includes ingress operator CA)
- `/var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt` (service serving CA)
- `/etc/pki/tls/certs/odh-ca-bundle.crt` (ODH CA bundle)
- `/etc/pki/tls/certs/odh-trusted-ca-bundle.crt` (ODH trusted CA bundle)
The S3 client uses this CA pool (`RootCAs`) to verify endpoint certificates, so self-signed MinIO certs are validated properly without skipping verification.
Key design decisions
- `InsecureSkipVerify`
- `InsecureSkipVerify` when no CA bundles provided
Why not env vars?
The initial implementation used `S3_INSECURE_SKIP_VERIFY`, `S3_ALLOW_HTTP`, and `S3_ALLOW_INTERNAL_IPS` env vars. These were removed because:
- `InsecureSkipVerify=true` was a security concern compared to every other BFF (which defaults to `false`)
Files changed
- `packages/{automl,autorag}/bff/internal/integrations/s3/client_factory.go` -- replaced `InsecureSkipVerify`, `AllowHTTP`, `AllowInternalIPs` with `RootCAs *x509.CertPool`
- `packages/{automl,autorag}/bff/internal/integrations/s3/client.go` -- use CA pool for TLS config, HTTPS-only validation, always allow private IPs, dev-mode fallback
- `packages/{automl,autorag}/bff/internal/api/app.go` -- pass `rootCAs` (from bundle-paths) to S3 client, removed env var reading
- `packages/{automl,autorag}/bff/internal/integrations/s3/client_test.go` -- updated tests: private IPs now accepted, removed permissive mode tests
- `packages/{automl,autorag}/bff/Makefile` -- added `BUNDLE_PATHS` variable for local development
How Has This Been Tested?
Test Impact
Request review criteria:
Self checklist (all need to be checked):
If you have UI changes:
After the PR is posted and before it merges:
main
Summary by CodeRabbit
New Features
Documentation