This document provides a step-by-step guide to using the Iceberg Catalog Migrator. The example migrates tables from one Polaris catalog to another, both backed by an AWS S3 bucket.
Before starting, make sure the following prerequisites are met:
- Have Java 21 or later installed
- Have a target catalog created and configured
- Have a source catalog to migrate from
- Block in-progress commits to the source catalog
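The Java requirement can be checked from the shell before building. The following is a minimal sketch, not part of the tool: the helper name `java_major_version` is hypothetical, and it assumes `java -version` prints a line containing `version "<number>"`, which is the usual format for OpenJDK and Oracle builds.

```shell
# Extract the major Java version from the first line of `java -version` output.
# Handles the modern scheme ("21.0.2"), the legacy one ("1.8.0_292"),
# and early-access suffixes ("22-ea").
java_major_version() {
  local version_line="$1" version major
  # Pull out the quoted version string, e.g. 21.0.2
  version=$(printf '%s\n' "$version_line" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
  major=${version%%.*}
  if [ "$major" = "1" ]; then
    # Legacy scheme: 1.8.0_292 means Java 8
    version=${version#1.}
    major=${version%%.*}
  fi
  # Drop anything after the leading digits, e.g. "22-ea" becomes "22"
  major=${major%%[!0-9]*}
  printf '%s\n' "$major"
}

# `java -version` writes to stderr, so redirect it before reading the first line.
if command -v java >/dev/null 2>&1; then
  major=$(java_major_version "$(java -version 2>&1 | head -n 1)")
  if [ "${major:-0}" -ge 21 ]; then
    echo "Java $major found"
  else
    echo "Java 21 or later is required, found: ${major:-unknown}" >&2
  fi
fi
```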
Migration happens in five steps:
- Build the Iceberg Catalog Migrator
- Set the object storage environment variables
- Get access to the source and target catalogs
- Validate the migration
- Migrate the tables
Execute the following commands to build the tool:
git clone https://github.com/apache/polaris-tools.git
cd polaris-tools/iceberg-catalog-migrator
./gradlew build
These commands:
- Clone the repository
- Navigate to the iceberg-catalog-migrator directory
- Build the tool
- Create a JAR file in the iceberg-catalog-migrator/cli/build/libs/ directory
The JAR file is named iceberg-catalog-migrator-cli-<version>.jar, where <version> is the version of the tool found in the iceberg-catalog-migrator/version.txt file. The examples below assume the version is 0.0.1-SNAPSHOT, so the JAR file name is iceberg-catalog-migrator-cli-0.0.1-SNAPSHOT.jar.
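Rather than hard-coding the version, the JAR path can be derived from version.txt. This is a sketch under the naming rule described above; the function name `migrator_jar_path` is hypothetical, and it assumes version.txt contains only the version string.

```shell
# Print the expected JAR path for a given iceberg-catalog-migrator checkout.
# The first argument is the project directory (defaults to the current directory).
migrator_jar_path() {
  local project_dir="${1:-.}" version
  # version.txt holds the tool version, e.g. 0.0.1-SNAPSHOT; strip all whitespace
  version=$(tr -d '[:space:]' < "$project_dir/version.txt") || return 1
  printf '%s\n' "$project_dir/cli/build/libs/iceberg-catalog-migrator-cli-$version.jar"
}
```

Run from the iceberg-catalog-migrator directory, `JAR=$(migrator_jar_path .)` would then let later commands use `java -jar "$JAR"` in place of the literal file name.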
The tool needs access to the underlying object storage through environment variables. This example uses AWS S3 with an access key ID and a secret access key:
export AWS_ACCESS_KEY_ID=<access_key>
export AWS_SECRET_ACCESS_KEY=<secret_key>
For more information on configuring access to object storage, please see this guide.
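Because the Java process only sees exported variables, it can help to confirm they are present before running the tool. The helper below is an illustrative sketch; `require_env` is a hypothetical name, and it relies on the `printenv` utility being available.

```shell
# Report every named variable that is missing from the exported environment.
# Returns 0 when all are present and non-empty, 1 otherwise.
require_env() {
  local name missing=0
  for name in "$@"; do
    # printenv only sees exported variables, which is what child processes inherit
    if [ -z "$(printenv "$name")" ]; then
      echo "Missing environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For the S3 example, a call such as `require_env AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY || exit 1` at the top of a migration script would stop early if either credential has not been exported.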
The tool must be authorized against both the source and target catalogs. This example uses two Polaris catalogs. To get access to a Polaris catalog, request a token from the OAuth token endpoint, for example:
curl -X POST http://sourcecatalog:8181/api/catalog/v1/oauth/tokens \
-d "grant_type=client_credentials" \
-d "client_id=my-client-id" \
-d "client_secret=my-client-secret" \
-d "scope=PRINCIPAL_ROLE:ALL"
export TOKEN_SOURCE=xxxxxxx
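The `xxxxxxx` value is a placeholder for the token returned by the `curl` call. As a sketch, assuming the endpoint returns a JSON object with an `access_token` field (the standard OAuth 2.0 response shape), the token can be extracted without extra tooling. The function name `extract_access_token` is hypothetical.

```shell
# Read a JSON token response on stdin and print the value of "access_token".
# Assumes the field appears once and its value contains no escaped quotes.
extract_access_token() {
  sed -n 's/.*"access_token"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}
```

Piping the `curl` output into `extract_access_token` and assigning the result to `TOKEN_SOURCE` (or `TOKEN_TARGET`) avoids copying the token by hand. If `jq` is installed, `jq -r .access_token` is a more robust alternative.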
curl -X POST http://targetcatalog:8181/api/catalog/v1/oauth/tokens \
-d "grant_type=client_credentials" \
-d "client_id=my-client-id" \
-d "client_secret=my-client-secret" \
-d "scope=PRINCIPAL_ROLE:ALL"
export TOKEN_TARGET=xxxxxxx
Execute the following command to understand how to migrate the tables:
java -jar ./cli/build/libs/iceberg-catalog-migrator-cli-0.0.1-SNAPSHOT.jar register -h
In the example, execute the following command to perform a dry run migration. This will not migrate the tables but will provide information on the operation:
java -jar ./cli/build/libs/iceberg-catalog-migrator-cli-0.0.1-SNAPSHOT.jar register \
--source-catalog-type REST \
--source-catalog-properties uri=http://sourcecatalog:8181/api/catalog,warehouse=test,token=$TOKEN_SOURCE \
--target-catalog-type REST \
--target-catalog-properties uri=http://targetcatalog:8181/api/catalog,warehouse=test,token=$TOKEN_TARGET \
--dry-run
After validating all inputs, the console will display the table identifiers selected for migration. This information will also be written to a file called dry_run.txt.
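The source and target property strings in the command above share the same `uri`, `warehouse`, and `token` layout. A small helper can build them consistently; this is only a convenience sketch, and `rest_catalog_properties` is a hypothetical name, not something the tool provides.

```shell
# Build the comma-separated property list passed to
# --source-catalog-properties or --target-catalog-properties.
# Arguments: catalog URI, warehouse name, bearer token.
rest_catalog_properties() {
  local uri="$1" warehouse="$2" token="$3"
  printf 'uri=%s,warehouse=%s,token=%s\n' "$uri" "$warehouse" "$token"
}
```

For instance, `--source-catalog-properties "$(rest_catalog_properties http://sourcecatalog:8181/api/catalog test "$TOKEN_SOURCE")"` produces the same value as the literal string used in the dry run.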
In the example, execute the following command to perform a migration:
java -jar ./cli/build/libs/iceberg-catalog-migrator-cli-0.0.1-SNAPSHOT.jar migrate \
--source-catalog-type REST \
--source-catalog-properties uri=http://sourcecatalog:8181/api/catalog,warehouse=test,token=$TOKEN_SOURCE \
--target-catalog-type REST \
--target-catalog-properties uri=http://targetcatalog:8181/api/catalog,warehouse=test,token=$TOKEN_TARGET
Note that a log file is created, which can be used to verify that the migration completed successfully. If any issues occur, please use the troubleshooting guide.
For more example migrations, please see this guide.