---
title: Import Data from Amazon S3 into {{{ .premium }}}
summary: Learn how to import CSV files from Amazon S3 into {{{ .premium }}} instances using the console wizard.
---

# Import Data from Amazon S3 into {{{ .premium }}}

This document describes how to import CSV files from Amazon Simple Storage Service (Amazon S3) into {{{ .premium }}} instances. The steps reflect the current private preview user interface and serve as an initial framework for the upcoming public preview launch.

> **Warning:**
>
> {{{ .premium }}} is currently available in private preview in select AWS regions.
>
> If {{{ .premium }}} is not yet enabled for your organization, or if you need access in another cloud provider or region, click **Support** in the lower-left corner of the TiDB Cloud console, or submit a request through the **Contact Us** form on the website.

## Limitations

  - To ensure data consistency, {{{ .premium }}} imports CSV files into empty tables only. If the target table already contains data, import into an empty staging table first, and then copy the rows into the target table with an `INSERT ... SELECT` statement.
  - During the private preview, Amazon S3 is the only supported storage provider. Support for additional providers will be added in future releases.
  - Each import job maps a single source file pattern to one destination table.
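The staging-table workaround can be sketched in SQL. The table names `orders` and `orders_staging` are hypothetical placeholders for your own schema:

```sql
-- The real target table `orders` already contains rows, so it cannot be
-- the destination of an import job. Create an empty copy of it instead.
CREATE TABLE orders_staging LIKE orders;

-- ... run the S3 import with `orders_staging` as the destination table ...

-- After the import completes, copy the rows into the real target table
-- and drop the staging table.
INSERT INTO orders SELECT * FROM orders_staging;
DROP TABLE orders_staging;
```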

## Step 1. Prepare the CSV files

  1. If a CSV file is larger than 256 MiB, consider splitting it into multiple files of about 256 MiB each so that {{{ .premium }}} can process them in parallel.
  2. Name your CSV files according to the Dumpling naming conventions:
    - Full-table files: use the `${db_name}.${table_name}.csv` format.
    - Sharded files: append numeric suffixes, such as `${db_name}.${table_name}.000001.csv`.
    - Compressed files: use the `${db_name}.${table_name}.${suffix}.csv.${compress}` format.
  3. Optional schema files (`${db_name}-schema-create.sql` and `${db_name}.${table_name}-schema.sql`) let {{{ .premium }}} create the databases and tables automatically.
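Splitting and renaming can be done in one step with GNU `split`. This is a sketch with hypothetical names (`mydb`, `orders`) and a small generated stand-in file; for real exports, replace the stand-in with your dump file and use `-C 256m`:

```shell
# `mydb` and `orders` are hypothetical database/table names.
db=mydb
table=orders

# Create a small stand-in CSV for demonstration purposes.
seq 1 1000 | awk '{print $1 ",item" $1}' > big_export.csv

# Split into chunks following the Dumpling sharded-file convention.
# GNU split: -C keeps whole lines together, -d -a 6 yields numeric
# suffixes 000000, 000001, ... (use -C 256m for real files; 4k here
# so the small demo file actually produces multiple shards).
split -C 4k -d -a 6 --additional-suffix=.csv big_export.csv "${db}.${table}."

ls "${db}.${table}."*.csv
```

Note that `split -d` numbers shards from `000000`, while the example in the list above starts at `000001`; both match the `${db_name}.${table_name}.${suffix}.csv` pattern.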

## Step 2. Create target schemas (optional)

If you want {{{ .premium }}} to create the databases and tables automatically, place the schema files generated by Dumpling in the same S3 directory. Otherwise, create the databases and tables manually in {{{ .premium }}} before running the import.
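For reference, schema files like those Dumpling generates look as follows. The database `mydb`, table `orders`, and column definitions are hypothetical; upload the files to the same S3 prefix as the CSV files:

```shell
# Database-creation file: ${db_name}-schema-create.sql
cat > mydb-schema-create.sql <<'EOF'
CREATE DATABASE IF NOT EXISTS `mydb`;
EOF

# Table-creation file: ${db_name}.${table_name}-schema.sql
cat > mydb.orders-schema.sql <<'EOF'
CREATE TABLE IF NOT EXISTS `orders` (
    `id` BIGINT PRIMARY KEY,
    `item` VARCHAR(64)
);
EOF
```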

## Step 3. Configure access to Amazon S3

To allow {{{ .premium }}} to read your bucket, use either of the following methods:

  - Provide an AWS Role ARN that trusts TiDB Cloud and grants the `s3:GetObject` and `s3:ListBucket` permissions on the relevant paths.
  - Provide an AWS access key (access key ID and secret access key) with equivalent permissions.

The wizard includes a helper link labeled **Click here to create a new one with AWS CloudFormation**. Follow this link if you want the console to pre-fill a CloudFormation stack that creates the role for you.
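If you prefer to create the role or key yourself, the required permissions can be expressed as an IAM policy. This is a minimal sketch; the bucket name `my-bucket` and prefix `import/` are placeholders for your own paths:

```shell
# Write a minimal IAM policy granting read access to the import path.
# `s3:GetObject` applies to the objects, `s3:ListBucket` to the bucket itself.
cat > tidb-import-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject"],
      "Resource": "arn:aws:s3:::my-bucket/import/*"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": "arn:aws:s3:::my-bucket"
    }
  ]
}
EOF

# Sanity-check the file before attaching it to the role or user.
python3 -m json.tool tidb-import-policy.json > /dev/null && echo "policy is valid JSON"
```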

## Step 4. Import CSV files from Amazon S3

  1. In the TiDB Cloud console, navigate to the TiDB Instances page, and then click the name of your TiDB instance.

  2. In the left navigation pane, click Data > Import, and choose Import data from Cloud Storage.

  3. In the Source Connection dialog:

    - Set Storage Provider to Amazon S3.
    - Enter the Source Files URI for a single file (`s3://bucket/path/file.csv`) or for a folder (`s3://bucket/path/`).
    - Choose AWS Role ARN or AWS Access Key and provide the credentials.
    - Click Test Bucket Access to validate connectivity.
  4. Click Next and provide the TiDB SQL username and password for the import job. Optionally, test the connection.

  5. Review the automatically generated source-to-target mapping. Disable automatic mapping if you need to define custom patterns and destination tables.

  6. Click Next to run the pre-check. Resolve any warnings about missing files or incompatible schemas.

  7. Click Start Import to launch the job group.

  8. Monitor the job statuses until they show Completed, then verify the imported data in TiDB Cloud.
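The Source Files URI entered in the Source Connection dialog must end in `/` for a folder import or name a single file. A quick local check of the URI shape before pasting it into the wizard (the bucket and prefix are placeholders):

```shell
# Hypothetical URI; replace with your own bucket and prefix.
uri="s3://my-bucket/import/"

case "$uri" in
  s3://*/)   echo "folder URI: the job imports every matching file under it" ;;
  s3://*.csv*) echo "single-file URI" ;;
  *)         echo "unexpected URI: expected s3://bucket/path/ or s3://bucket/path/file.csv" ;;
esac

# With AWS credentials configured, list what the import job would see:
# aws s3 ls "$uri" --recursive
```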

## Troubleshooting

  - If the pre-check reports zero files, verify the S3 path and the IAM permissions.
  - If jobs remain in the Preparing state, ensure that the destination tables are empty and the required schema files exist.
  - Use the Cancel action to stop a job group if you need to adjust mappings or credentials.

## Next steps