fix(s3): skip hydrate calls for buckets outside name/region qualifiers #2740
Open
rmnjaat wants to merge 1 commit into turbot:main from
Querying aws_s3_bucket with a WHERE region or WHERE name filter was triggering all 14 hydrate API calls for every bucket globally before any filtering occurred. For accounts with many buckets (e.g. 272), this caused ~3,800 API calls and query timeouts exceeding 12 minutes.

Two early-exit gates are introduced:

1. Name filter (listS3Buckets): buckets whose name does not match a name = '...' or name IN (...) qualifier are skipped before d.StreamListItem, so they never enter the hydrate pipeline at all. This saves HeadBucket + all 13 downstream API calls per bucket.
2. Region filter (doGetBucketRegion / getBucketRegion): after HeadBucket resolves the bucket's actual region, buckets outside the requested region return nil, nil, causing all 13 dependent hydrate functions to exit immediately via a nil guard added at the top of each function.

Both filters handle = and IN operators via d.Quals loops. Other operators (LIKE, !=) fall back to full hydration. The region cache check also applies the qual filter so warm-cache paths are equally optimised.

Fixes turbot#2737
Problem
Querying aws_s3_bucket with WHERE region = 'x' or WHERE name = 'x' triggered all 14 hydrate API calls on every bucket globally before any filtering occurred.

For an account with 276 buckets and WHERE region = 'ap-southeast-2' (zero matching buckets), this caused ~3,800 API calls and query timeouts exceeding 12 minutes despite returning zero results.
Fixes #2737
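As a rough sanity check on the figure quoted above, the number of hydrate calls without any early exit is simply the bucket count times the 14 calls per bucket. The sketch below is only an estimate built from those two stated numbers; the function name is hypothetical and nothing here is a measurement.

```go
package main

import "fmt"

// hydrateCallsPerBucket is the per-bucket figure stated in this PR:
// HeadBucket plus 13 dependent hydrate calls.
const hydrateCallsPerBucket = 14

// hydrateCalls (hypothetical helper) estimates the hydrate API calls made
// when every listed bucket is fully hydrated before filtering.
func hydrateCalls(totalBuckets int) int {
	return totalBuckets * hydrateCallsPerBucket
}

func main() {
	fmt.Println(hydrateCalls(276)) // prints 3864, in line with "~3,800"
}
```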
Root Cause
- ListBuckets always returns all buckets globally (unavoidable)
- getBucketRegion (HeadBucket) ran for every bucket regardless of filters

Solution
Two early-exit gates inserted at the earliest possible stage:
Gate 1: Name filter (inside listS3Buckets, before StreamListItem)
name is available directly from ListBuckets. Buckets not matching a name = '...' or name IN (...) qualifier are skipped before d.StreamListItem, so they never enter the hydrate pipeline.
Saves: HeadBucket + 13 downstream API calls per non-matching bucket.
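The name gate can be pictured with a minimal, self-contained sketch. It does not use the plugin SDK: the bucket type, the streamMatching function, and the stream callback (standing in for d.StreamListItem) are all hypothetical names chosen for illustration, and the allowed set is assumed to have been built from the name qualifiers beforehand.

```go
package main

import "fmt"

// bucket is a hypothetical stand-in for one entry returned by ListBuckets.
type bucket struct {
	Name string
}

// streamMatching models the name gate. A bucket is handed to stream (the
// stand-in for d.StreamListItem) only if no name filter is active
// (allowed == nil) or its name is in the allowed set. It returns how many
// buckets were skipped.
func streamMatching(buckets []bucket, allowed map[string]bool, stream func(bucket)) int {
	skipped := 0
	for _, b := range buckets {
		if allowed != nil && !allowed[b.Name] {
			skipped++ // skipped buckets never enter the hydrate pipeline
			continue
		}
		stream(b)
	}
	return skipped
}

func main() {
	listed := []bucket{{Name: "logs"}, {Name: "my-bucket"}, {Name: "backups"}}
	allowed := map[string]bool{"my-bucket": true} // e.g. from name = 'my-bucket'

	var streamed []string
	skipped := streamMatching(listed, allowed, func(b bucket) {
		streamed = append(streamed, b.Name)
	})
	fmt.Println(streamed, skipped) // prints [my-bucket] 2
}
```

Passing nil for the allowed set keeps the original behaviour of streaming every bucket, which is what an unfiltered query needs.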
Gate 2: Region filter (inside doGetBucketRegion, after HeadBucket)
After HeadBucket resolves the actual bucket region, buckets outside the requested region return nil, nil from getBucketRegion. All 13 dependent hydrate functions have a nil guard at the top and exit immediately without making any API calls.
Saves: 13 API calls per non-matching bucket (HeadBucket is unavoidable as region is unknown before it runs).
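A simplified sketch of the region gate and the nil guard follows. It is not the plugin code: regionInfo, resolveRegion, hydrateVersioning, and the headBucket callback are hypothetical stand-ins, and the call counter exists only to make the saved API calls visible.

```go
package main

import "fmt"

// regionInfo is a hypothetical stand-in for the value that getBucketRegion
// passes on to dependent hydrate functions.
type regionInfo struct {
	Region string
}

// resolveRegion models the region gate. headBucket stands in for the
// HeadBucket call; wanted is the set of regions requested by the query
// (nil means no region filter). Buckets outside wanted yield nil, nil.
func resolveRegion(name string, wanted map[string]bool, headBucket func(string) (string, error)) (*regionInfo, error) {
	region, err := headBucket(name) // always needed: region is unknown before this
	if err != nil {
		return nil, err
	}
	if wanted != nil && !wanted[region] {
		return nil, nil // filtered out, which is not an error
	}
	return &regionInfo{Region: region}, nil
}

// hydrateVersioning models one dependent hydrate function. The nil guard at
// the top returns before any API call would be made.
func hydrateVersioning(info *regionInfo, apiCalls *int) (interface{}, error) {
	if info == nil {
		return nil, nil
	}
	*apiCalls++ // a real hydrate function would call the S3 API here
	return "versioning looked up in " + info.Region, nil
}

func main() {
	regions := map[string]string{"a": "us-east-1", "b": "eu-west-1"}
	head := func(name string) (string, error) { return regions[name], nil }
	wanted := map[string]bool{"us-east-1": true} // e.g. from region = 'us-east-1'

	calls := 0
	for _, name := range []string{"a", "b"} {
		info, _ := resolveRegion(name, wanted, head)
		hydrateVersioning(info, &calls)
	}
	fmt.Println(calls) // prints 1: only bucket "a" reached the API call
}
```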
Both gates handle = and IN operators via d.Quals loops. Other operators fall back to full hydration. The region cache path also applies the qualifier check so warm-cache queries are equally optimised.
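The operator handling described above might look roughly like the sketch below. The qual type is a deliberately simplified, hypothetical model of an entry in d.Quals, not the SDK's actual representation, and allowedValues is an illustrative name. A nil result means "do not filter", which corresponds to the fallback to full hydration.

```go
package main

import "fmt"

// qual is a hypothetical, simplified stand-in for one qualifier on a column:
// an operator plus one value ("=") or several values ("IN").
type qual struct {
	Operator string
	Values   []string
}

// allowedValues collects the values accepted by "=" and "IN" qualifiers.
// It returns nil when there are no qualifiers or when any qualifier uses
// another operator, signalling that the caller should not filter.
// Values from multiple qualifiers are merged, which can only keep extra
// buckets, never drop a bucket the query could still return.
func allowedValues(quals []qual) map[string]bool {
	if len(quals) == 0 {
		return nil
	}
	allowed := make(map[string]bool)
	for _, q := range quals {
		switch q.Operator {
		case "=", "IN":
			for _, v := range q.Values {
				allowed[v] = true
			}
		default:
			return nil // LIKE, != and others: fall back to full hydration
		}
	}
	return allowed
}

func main() {
	in := []qual{{Operator: "IN", Values: []string{"us-east-1", "us-west-2"}}}
	fmt.Println(len(allowedValues(in))) // prints 2

	like := []qual{{Operator: "LIKE", Values: []string{"us-%"}}}
	fmt.Println(allowedValues(like) == nil) // prints true
}
```

Returning nil rather than an empty map keeps "no usable filter" distinct from "filter that matches nothing", so unsupported operators never cause rows to be dropped.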
Performance Impact
- WHERE region = 'ap-southeast-2' (0 match)
- WHERE region = 'us-east-1' (21 match)
- WHERE name = 'my-bucket' (1 match)

Notes
- InvalidToken issue that caused PR "Revert the region changes for the aws_s3_* tables" #2536 to be reverted in "Add ignore error config to aws_rds_db_instance and aws_rds_pending_maintenance_action tables fixes #2543" #2545)
- name and region declared as OptionalColumns in ListConfig solely to make d.Quals accessible in hydrate functions; listS3Buckets still fetches all buckets globally
- name or region filters

Integration test logs
Logs
Example query results
Results