Description
Describe the bug
When the s3.iam_role_arn option (introduced in #23775) is set while creating an Iceberg source with a JDBC catalog, the initial Java loadTable call does not assume the configured role.
The call fails if the S3 bucket that stores the table metadata is only accessible through the assumed role, which is the expected setup when the s3.iam_role_arn property is used.
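For illustration only, here is a minimal sketch of the kind of option translation that would let the Java side assume the role. It assumes the catalog is built from a plain property map and that Iceberg's AWS module keys `client.factory`, `client.assume-role.arn`, and `client.assume-role.region` are honored; the function name `iceberg_assume_role_props` is hypothetical, and the issue does not confirm how RisingWave actually passes properties to the catalog.

```python
from typing import Dict


def iceberg_assume_role_props(options: Dict[str, str]) -> Dict[str, str]:
    """Map source options to Iceberg AWS client properties (hypothetical helper).

    Only the assume-role related keys are produced; everything else in
    `options` is ignored by this sketch.
    """
    props: Dict[str, str] = {}
    role_arn = options.get("s3.iam_role_arn")
    if not role_arn:
        # No role configured: leave the default credential chain in place.
        return props

    # Iceberg's AWS module ships a client factory that wraps every AWS client
    # (including S3) in STS assume-role credentials.
    props["client.factory"] = "org.apache.iceberg.aws.AssumeRoleAwsClientFactory"
    props["client.assume-role.arn"] = role_arn

    # The assume-role factory also expects a region for the clients it creates.
    region = options.get("s3.region")
    if region:
        props["client.assume-role.region"] = region
    return props
```

With the options from the reproduction below, this would yield the factory class, the role ARN, and `us-east-1` as the assume-role region. Whether the missing piece is this mapping or something else in the loadTable path is an open question for the maintainers.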
Error message/log
software.amazon.awssdk.services.s3.model.S3Exception: User: <arn of the Risingwave Server> is not authorized to perform: s3:GetObject on resource: "<s3_bucketName>/<metadata_json_location>" because no resource-based policy allows the s3:GetObject action (Service: S3, Status Code: 403, Request ID: xxx, Extended Request ID: xxx)
    at software.amazon.awssdk.services.s3.model.S3Exception$BuilderImpl.build(S3Exception.java:113)
    at software.amazon.awssdk.services.s3.model.S3Exception$BuilderImpl.build(S3Exception.java:61)
    at software.amazon.awssdk.core.internal.http.pipeline.stages.utils.RetryableStageHelper.retryPolicyDisallowedRetryException(RetryableStageHelper.java:168)
    at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:73)
    at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:36)
    at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
    at software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:53)
    at software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:35)
    at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.executeWithTimer(ApiCallTimeoutTrackingStage.java:82)
    at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:62)
    at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:43)
    at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:50)
    at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:32)
    at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
    at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
    at software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:37)
    at software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:26)
    at software.amazon.awssdk.core.internal.http.AmazonSyncHttpClient$RequestExecutionBuilderImpl.execute(AmazonSyncHttpClient.java:210)
    at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.invoke(BaseSyncClientHandler.java:103)
    at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.doExecute(BaseSyncClientHandler.java:173)
    at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.lambda$execute$0(BaseSyncClientHandler.java:66)
    at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.measureApiCallSuccess(BaseSyncClientHandler.java:182)
    at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.execute(BaseSyncClientHandler.java:60)
    at software.amazon.awssdk.core.client.handler.SdkSyncClientHandler.execute(SdkSyncClientHandler.java:52)
    at software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler.execute(AwsSyncClientHandler.java:60)
    at software.amazon.awssdk.services.s3.DefaultS3Client.getObject(DefaultS3Client.java:6416)
    at org.apache.iceberg.aws.s3.S3InputStream.openStream(S3InputStream.java:240)
    at org.apache.iceberg.aws.s3.S3InputStream.openStream(S3InputStream.java:225)
    at org.apache.iceberg.aws.s3.S3InputStream.positionStream(S3InputStream.java:221)
    at org.apache.iceberg.aws.s3.S3InputStream.read(S3InputStream.java:143)
    at com.fasterxml.jackson.core.json.ByteSourceJsonBootstrapper.ensureLoaded(ByteSourceJsonBootstrapper.java:547)
    at com.fasterxml.jackson.core.json.ByteSourceJsonBootstrapper.detectEncoding(ByteSourceJsonBootstrapper.java:137)
    at com.fasterxml.jackson.core.json.ByteSourceJsonBootstrapper.constructParser(ByteSourceJsonBootstrapper.java:266)
    at com.fasterxml.jackson.core.JsonFactory._createParser(JsonFactory.java:1874)
    at com.fasterxml.jackson.core.JsonFactory.createParser(JsonFactory.java:1273)
    at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3924)
    at org.apache.iceberg.TableMetadataParser.read(TableMetadataParser.java:291)
    at org.apache.iceberg.TableMetadataParser.read(TableMetadataParser.java:284)
    at org.apache.iceberg.BaseMetastoreTableOperations.lambda$refreshFromMetadataLocation$0(BaseMetastoreTableOperations.java:180)
    at org.apache.iceberg.BaseMetastoreTableOperations.lambda$refreshFromMetadataLocation$1(BaseMetastoreTableOperations.java:199)
    at org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:413)
    at org.apache.iceberg.util.Tasks$Builder.runSingleThreaded(Tasks.java:219)
    at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:203)
    at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:196)
    at org.apache.iceberg.BaseMetastoreTableOperations.refreshFromMetadataLocation(BaseMetastoreTableOperations.java:199)
    at org.apache.iceberg.BaseMetastoreTableOperations.refreshFromMetadataLocation(BaseMetastoreTableOperations.java:176)
    at org.apache.iceberg.BaseMetastoreTableOperations.refreshFromMetadataLocation(BaseMetastoreTableOperations.java:167)
    at org.apache.iceberg.jdbc.JdbcTableOperations.doRefresh(JdbcTableOperations.java:100)
    at org.apache.iceberg.BaseMetastoreTableOperations.refresh(BaseMetastoreTableOperations.java:88)
    at org.apache.iceberg.BaseMetastoreTableOperations.current(BaseMetastoreTableOperations.java:71)
    at org.apache.iceberg.BaseMetastoreCatalog.loadTable(BaseMetastoreCatalog.java:49)
    at org.apache.iceberg.rest.CatalogHandlers.loadTable(CatalogHandlers.java:328)
    at com.risingwave.connector.catalog.JniCatalogWrapper.loadTable(JniCatalogWrapper.java:53
To Reproduce
The setup requires a PostgreSQL database that is already set up with the Iceberg metadata schemas.
CREATE SOURCE test_bwr_3 WITH (
    connector = 'iceberg',
    catalog.type = 'jdbc',
    catalog.uri = 'jdbc:postgresql://localhost:5432/postgres',
    catalog.jdbc.user = 'user',
    catalog.jdbc.password = 'password',
    warehouse.path = 's3://my_bucket',
    s3.region = 'us-east-1',
    s3.endpoint = 'https://s3.amazonaws.com',
    s3.iam_role_arn = 'arn:aws:iam::xxx:role/my_role',
    table.name = 'table',
    catalog.name = 'catalog',
    database.name = 'database',
    enable_config_load = 'true'
);
Expected behavior
When reading table metadata that is registered in the JDBC catalog and stored in the S3 bucket, the Java process should use the configured assumed role.
This is currently not the case. The process fails with the stack trace above, and the principal reported after "User:" is the RisingWave server's own ARN, which indicates that the process did not attempt to assume the expected role before retrieving the S3 file.
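As a rough diagnostic aid, the check described above can be sketched in a few lines. The sketch assumes the usual AWS convention that credentials obtained through STS AssumeRole appear as `arn:aws:sts::<account>:assumed-role/<role-name>/<session-name>`; the helper names `principal_from_error` and `assumed_role_name` are hypothetical and not part of RisingWave or the AWS SDK.

```python
import re
from typing import Optional

# Principal as it appears in an S3 AccessDenied message: "User: <arn> is not authorized ...".
_PRINCIPAL = re.compile(r"User: (\S+) is not authorized")

# STS assumed-role ARN: arn:<partition>:sts::<account>:assumed-role/<role>/<session>.
_ASSUMED_ROLE = re.compile(
    r"^arn:aws[a-z-]*:sts::\d{12}:assumed-role/(?P<role>[^/]+)/(?P<session>.+)$"
)


def principal_from_error(message: str) -> Optional[str]:
    """Return the ARN that made the denied request, or None if not found."""
    match = _PRINCIPAL.search(message)
    return match.group(1) if match else None


def assumed_role_name(principal_arn: str) -> Optional[str]:
    """Return the role name if the principal is an assumed-role session, else None."""
    match = _ASSUMED_ROLE.match(principal_arn)
    return match.group("role") if match else None
```

If `assumed_role_name(principal_from_error(msg))` returns `None` for the 403 message, the request was signed with the server's base identity; if it returns a role name, that name can be compared with the last path segment of the configured s3.iam_role_arn.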
How did you deploy RisingWave?
RisingWave is deployed on Kubernetes using the official Docker image v2.7.2 with the vanilla configuration.
The version of RisingWave
PostgreSQL 13.14.0-RisingWave-2.7.2 (30301dc965a6f30c08de859e2be0e6cb1b66f6b0)
Additional context
No response