Conversation
Contributor
There was a problem hiding this comment.
Pull request overview
Implements the core AWS Glue-backed catalog functionality for the catalog-glue module, adding schema/table CRUD, partition operations, and Glue→Gravitino exception/type conversions, with unit tests using mocked Glue clients and synthetic SDK model objects.
Changes:
- Add
GlueCatalogOperationsimplementingSupportsSchemasandTableCatalogusing the AWS Glue API (including pagination and optionalcatalogId). - Add table/partition conversion and operations (
GlueTable,GlueTableOperations,GlueSchema,GlueColumn,GlueTypeConverter,GlueExceptionConverter, constants/properties metadata). - Add unit tests for schema/table/partition CRUD and conversion logic; add optional AWS-backed tests gated by env vars.
Reviewed changes
Copilot reviewed 20 out of 20 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| catalogs/catalog-glue/src/main/java/org/apache/gravitino/catalog/glue/GlueCatalogOperations.java | Implements schema + table CRUD on top of Glue API, pagination, format filtering, and change application. |
| catalogs/catalog-glue/src/main/java/org/apache/gravitino/catalog/glue/GlueTableOperations.java | Implements Glue-backed partition listing/get/add/drop. |
| catalogs/catalog-glue/src/main/java/org/apache/gravitino/catalog/glue/GlueTable.java | Converts Glue Table→Gravitino and wires table ops to partition operations. |
| catalogs/catalog-glue/src/main/java/org/apache/gravitino/catalog/glue/GlueSchema.java | Converts Glue Database→Gravitino schema. |
| catalogs/catalog-glue/src/main/java/org/apache/gravitino/catalog/glue/GlueColumn.java | Converts Glue Column→Gravitino column using a type converter. |
| catalogs/catalog-glue/src/main/java/org/apache/gravitino/catalog/glue/GlueTypeConverter.java | Converts between Glue/Hive type strings and Gravitino Type. |
| catalogs/catalog-glue/src/main/java/org/apache/gravitino/catalog/glue/GlueExceptionConverter.java | Maps common Glue SDK exceptions to Gravitino exceptions. |
| catalogs/catalog-glue/src/main/java/org/apache/gravitino/catalog/glue/GlueConstants.java | Adds connector config keys and StorageDescriptor-derived table property keys. |
| catalogs/catalog-glue/src/main/java/org/apache/gravitino/catalog/glue/GlueTablePropertiesMetadata.java | Updates property description for table format values. |
| catalogs/catalog-glue/src/test/java/org/apache/gravitino/catalog/glue/*.java | Adds mocked CRUD tests plus shared conversion test scenarios; adds optional AWS endpoint tests gated by env vars. |
| catalogs/catalog-glue/build.gradle.kts | Adjusts test task behavior (removes -PskipITs conditional). |
Code Coverage Report
Files
|
…etadata - Add GlueConstants for all catalog and table property keys - Add GlueClientProvider: static creds / DefaultCredentialChain selection, region, and endpoint override (for VPC endpoints / LocalStack) - Implement GlueCatalogPropertiesMetadata: required aws-region + aws-glue-catalog-id, optional credentials (hidden), endpoint, default-table-format, table-type-filter - Implement GlueCatalogCapability: case-insensitive names, no NOT NULL, no DEFAULT - Implement GlueTablePropertiesMetadata: table_type, metadata_location, location - Add TestGlueClientProvider unit tests
- Fix stale JavaDoc: DEFAULT_TABLE_FORMAT "Defaults to iceberg" -> "Defaults to hive" - GlueClientProvider: fail-fast on partial credentials (one key without the other) - GlueClientProvider: wrap URI.create with property-context error message - GlueCatalogCapability: remove COLUMN from case-insensitive scope (no AWS docs backing) - GlueTablePropertiesMetadata: remove ephemeral PR-05 forward reference from comment - TestGlueClientProvider: use try-with-resources; update partial-cred test to expect exception; add tests for secret-only and invalid endpoint cases - Add TestGlueCatalogCapability: covers all capability method contracts - Add TestGlueCatalogPropertiesMetadata: covers required/hidden/immutable flags and defaults
- Use StringUtils.isNotBlank() to reject blank region and credential values - Make aws-glue-catalog-id optional (Glue defaults to caller's account ID) - Add casing note to TABLE_FORMAT_TYPE JavaDoc distinguishing Glue uppercase from filter lowercase - Clarify deferred validation for default-table-format and table-type-filter
- Rename TABLE_TYPE_FILTER_ALL to DEFAULT_TABLE_TYPE_FILTER - Add default credential chain order comment in GlueClientProvider - Remove try-catch wrapping on URI.create for endpoint validation - Make aws-glue-catalog-id optional - Add casing note to TABLE_FORMAT_TYPE JavaDoc - Clarify deferred validation for default-table-format and table-type-filter - Add StringUtils.isNotBlank() for blank value checks
…ests Add model layer for catalog-glue (PR-03): - GlueConstants: storage descriptor and table-format constants - GlueTypeConverter: Glue/Hive type string to Gravitino Type mapping - GlueSchema: maps AWS Glue Database -> Gravitino BaseSchema - GlueColumn: maps AWS Glue Column -> Gravitino BaseColumn - GlueTable: maps AWS Glue Table -> Gravitino BaseTable (columns, partitioning, distribution, sort orders, properties) Test architecture: abstract base class + two implementations: - SyntheticGlueXxxTest: SDK builder, no network, always runs - AwsGlueXxxIT: real AWS Glue API, tagged gravitino-aws-test, skipped by default
…Table - Rename test classes to follow TestXxx naming convention - Add error handling for malformed type strings in GlueTypeConverter - Fix NPE in GlueTable distribution and sort order null-checks - Add null-safe handling for GlueSchema parameters - Use GlueException for AWS cleanup errors - Add aws.glue test dependency for SdkException class - Fix TABLE_FORMAT description to use uppercase values
Glue/Hive timestamps are timezoneless. fromGravitino now throws IllegalArgumentException for TimestampType.withTimeZone(), consistent with HiveDataTypeConverter. Ref: https://cwiki.apache.org/confluence/display/hive/languagemanual+types
…ests Replace @tag("gravitino-aws-test") + Gradle excludeTags with @EnabledIfEnvironmentVariable(named = "AWS_ACCESS_KEY_ID"). Tests now auto-skip when the env var is absent, no Gradle flag change needed.
…nd exception mapping
…t during rebase Restore GlueTypeConverter to the full DataTypeConverter implementation from glue-pr03 (now in main), including complex type support (array, map, struct, uniontype) and the splitTopLevel helper. Update GlueColumn.fromGlueColumn and GlueTable.fromGlueTable to accept a GlueTypeConverter instance. Add a typeConverter field to GlueCatalogOperations and make toGlueColumn non-static to use it. Restore AbstractGlueTableTest and TestGlueTypeConverter from main.
…estGlueCatalogTable
…le into TestGlueCatalogOperations
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What changes were proposed in this pull request?
Implement schema CRUD, table CRUD, partition support, and exception mapping for the catalog-glue module:
GlueExceptionConverter— mapsEntityNotFoundException/AlreadyExistsException/InvalidInputExceptionto Gravitino exceptionsGlueCatalogOperations— implementsSupportsSchemasandTableCatalogbacked by AWS Glue APIGlueTableOperations— implementsSupportsPartitionsbacked by Glue partition APIsGlueTable—newOps()wired to returnGlueTableOperationsWhy are the changes needed?
These classes implement the core metadata operations required to make the catalog-glue module functional.
Fix: #10828
Does this PR introduce any user-facing change?
No. This is internal implementation of the catalog-glue module.
How was this patch tested?
Unit tests covering schema/table/partition CRUD with mocked
GlueClient.