Export time-series data snapshots from InfluxDB into Apache Iceberg format. Integrate data with Snowflake and other Iceberg-compatible tools without the need for complex ETL processes.
- Efficient data access: Query your data directly from Snowflake.
- Cost-effective storage: Optimize data retention and minimize storage costs.
- AI and ML support: Enhance machine learning applications by making time-series data accessible in Snowflake.
Follow these steps to integrate InfluxDB 3 with Snowflake using Apache Iceberg:
- Configure external storage
- Set up a catalog integration in Snowflake
- Export InfluxDB data to Iceberg format
- Create an Iceberg table in Snowflake
- Query your data in Snowflake
Before you begin, ensure you have the following:
- A Snowflake account with the necessary permissions.
- Access to an external object store (such as AWS S3).
- Familiarity with Apache Iceberg and Snowflake.
Set up an external storage location (such as AWS S3) to store Iceberg table data and metadata.
```sql
CREATE STAGE my_s3_stage
  URL = 's3://my-bucket/'
  STORAGE_INTEGRATION = my_storage_integration;
```
For more details, refer to the Snowflake documentation.
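The stage above references a storage integration. If you don't already have one, the following is a minimal sketch; the integration name matches the stage example, but the role ARN and bucket are placeholders for your own values:

```sql
-- Hypothetical example: create the storage integration referenced by the stage.
-- Replace the role ARN and bucket with your own values.
CREATE STORAGE INTEGRATION my_storage_integration
  TYPE = EXTERNAL_STAGE
  STORAGE_PROVIDER = 'S3'
  ENABLED = TRUE
  STORAGE_AWS_ROLE_ARN = 'arn:aws:iam::123456789012:role/my-iceberg-role'
  STORAGE_ALLOWED_LOCATIONS = ('s3://my-bucket/');
```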
Set up a catalog integration in Snowflake to manage and load Iceberg tables efficiently.
```sql
CREATE CATALOG INTEGRATION my_catalog_integration
  CATALOG_SOURCE = 'OBJECT_STORE'
  TABLE_FORMAT = 'ICEBERG'
  ENABLED = TRUE;
```
For more information, refer to the Snowflake documentation.
Use InfluxData's Iceberg exporter to convert and export your time-series data from your {{% product-name omit="Clustered" %}} cluster to the Iceberg table format.
This example assumes the following:
- You have followed the example for writing and querying data in the IOx README.
- You've configured compaction to trigger more quickly with these environment variables:

  ```sh
  INFLUXDB_IOX_COMPACTION_MIN_NUM_L0_FILES_TO_COMPACT=1
  INFLUXDB_IOX_COMPACTION_MIN_NUM_L1_FILES_TO_COMPACT=1
  ```
- You have a `config.json`:

  ```json
  {
    "exports": [
      {
        "namespace": "company_sensors",
        "table_name": "cpu"
      }
    ]
  }
  ```
```sh
$ influxdb_iox iceberg export \
  --catalog-dsn postgresql://postgres@localhost:5432/postgres \
  --source-object-store file \
  --source-data-dir ~/.influxdb_iox/object_store \
  --sink-object-store file \
  --sink-data-dir /tmp/iceberg \
  --export-config-path config.json
```
The export command outputs an absolute path to an Iceberg metadata file:
```
/tmp/iceberg/company_sensors/cpu/metadata/v1.metadata.json
```
To verify the export, you can query the metadata file with DuckDB's `iceberg` extension:

```sh
$ duckdb
D INSTALL iceberg;
D LOAD iceberg;
D SELECT * FROM iceberg_scan('/tmp/iceberg/company_sensors/cpu/metadata/v1.metadata.json') LIMIT 1;
┌───────────┬──────────────────────┬─────────────────────┬─────────────┬───┬────────────┬───────────────┬─────────────┬────────────────────┬────────────────────┐
│    cpu    │         host         │        time         │ usage_guest │ … │ usage_nice │ usage_softirq │ usage_steal │    usage_system    │     usage_user     │
│  varchar  │       varchar        │      timestamp      │   double    │   │   double   │    double     │   double    │       double       │       double       │
├───────────┼──────────────────────┼─────────────────────┼─────────────┼───┼────────────┼───────────────┼─────────────┼────────────────────┼────────────────────┤
│ cpu-total │ Andrews-MBP.hsd1.m…  │ 2020-06-11 16:52:00 │         0.0 │ … │        0.0 │           0.0 │         0.0 │ 1.1173184357541899 │ 0.9435133457479826 │
├───────────┴──────────────────────┴─────────────────────┴─────────────┴───┴────────────┴───────────────┴─────────────┴────────────────────┴────────────────────┤
│ 1 rows                                                                                                                                   13 columns (9 shown) │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
```
After exporting the data, create an Iceberg table in Snowflake.
```sql
CREATE ICEBERG TABLE my_iceberg_table
  EXTERNAL_VOLUME = 'my_external_volume'
  CATALOG = 'my_catalog_integration'
  METADATA_FILE_PATH = 's3://my-bucket/path/to/metadata.json';
```
Ensure that `EXTERNAL_VOLUME` and `METADATA_FILE_PATH` point to your external storage and metadata file.
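The table definition depends on an external volume in Snowflake. If you haven't created one, the following is a sketch; the location name, bucket, and role ARN are placeholders for your own values:

```sql
-- Hypothetical example: create the external volume referenced by the Iceberg table.
-- Replace the bucket and role ARN with your own values.
CREATE EXTERNAL VOLUME my_external_volume
  STORAGE_LOCATIONS = (
    (
      NAME = 'my-s3-location'
      STORAGE_PROVIDER = 'S3'
      STORAGE_BASE_URL = 's3://my-bucket/'
      STORAGE_AWS_ROLE_ARN = 'arn:aws:iam::123456789012:role/my-iceberg-role'
    )
  );
```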
Once the Iceberg table is set up, you can query it using standard SQL in Snowflake.
```sql
SELECT * FROM my_iceberg_table
WHERE time > '2025-01-01';
```
You can manage and query snapshot exports in the following ways:
- Use the CLI to trigger snapshot exports
- Use the API to manage and configure snapshots
- Use SQL in Snowflake to query Iceberg tables
```sh
# Enable Iceberg feature
$ influxctl enable-iceberg

# Export a snapshot
$ influxctl export --namespace foo --table bar
```
Use the {{% product-name %}} HTTP API to export snapshots and check status.
This example demonstrates how to export a snapshot of your data from InfluxDB to an Iceberg table using the HTTP API.
- Method: `POST`
- Endpoint: `/snapshots/export`
- Request body:

  ```json
  {
    "namespace": "foo",
    "table": "bar"
  }
  ```
The `POST` request to the `/snapshots/export` endpoint triggers the export of data from the specified namespace and table in InfluxDB to an Iceberg table. The request body specifies the namespace (`foo`) and the table (`bar`) to be exported.
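For example, using curl (the host and authentication token below are placeholders; substitute your cluster's host and a valid token):

```sh
# Hypothetical host and token; substitute your own values.
curl --request POST "https://cluster-host.example.com/snapshots/export" \
  --header "Authorization: Bearer $INFLUX_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
    "namespace": "foo",
    "table": "bar"
  }'
```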
This example shows how to check the status of an ongoing or completed snapshot export using the HTTP API.
- Method: `GET`
- Endpoint: `/snapshots/status`
The `GET` request to the `/snapshots/status` endpoint retrieves the status of the snapshot export. Use it to monitor the progress of the export or verify its completion.
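For example, using curl (again with a placeholder host and token):

```sh
# Hypothetical host and token; substitute your own values.
curl --request GET "https://cluster-host.example.com/snapshots/status" \
  --header "Authorization: Bearer $INFLUX_TOKEN"
```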
When exporting data from InfluxDB to an Iceberg table, keep the following considerations and limitations in mind:
- Data consistency: Ensure that the exported data in the Iceberg table is consistent with the source data in InfluxDB.
- Performance: Query performance may vary based on data size and query complexity.
- Feature support: Some advanced features of InfluxDB may not be fully supported in Snowflake through Iceberg integration.