This guide covers table operations for managing physical datasets in Dremio.
Table operations let you promote datasets to physical datasets (tables), configure file formats, and update table metadata.
Promote a dataset to a physical dataset (table).
alt-dremio-cli table promote <DATASET_ID>
Arguments:
DATASET_ID - The dataset ID (UUID)
Examples:
# Promote a dataset to table
alt-dremio-cli table promote abc-123-def-456

Configure the file format for a physical dataset.
alt-dremio-cli table format <DATASET_ID> --type <FORMAT> [--from-file <FILE>]
Arguments:
DATASET_ID - The dataset ID (UUID)
Options:
--type - Format type: CSV, JSON, Parquet, etc. (required)
--from-file - Load format configuration from a JSON file
Examples:
# Set CSV format
alt-dremio-cli table format abc-123 --type CSV
# Set format with configuration file
alt-dremio-cli table format abc-123 --type CSV --from-file csv_format.json
# Set JSON format
alt-dremio-cli table format abc-123 --type JSON

Update table metadata.
alt-dremio-cli table update <DATASET_ID> --from-file <FILE>
Arguments:
DATASET_ID - The dataset ID (UUID)
Options:
--from-file - Updated table JSON file (required)
Examples:
# Update table metadata
alt-dremio-cli table update abc-123 --from-file updated_table.json

CSV format configuration:
{
  "type": "CSV",
  "fieldDelimiter": ",",
  "lineDelimiter": "\n",
  "quote": "\"",
  "escape": "\\",
  "skipFirstLine": true,
  "extractHeader": true
}

JSON format configuration:
{
  "type": "JSON"
}

Parquet format configuration:
{
  "type": "Parquet",
  "autoCorrectCorruptDates": true
}

# 1. Get dataset ID
DATASET_ID=$(alt-dremio-cli --output json catalog get-by-path "MySource.data.customers.csv" | jq -r '.id')
# 2. Promote to table
alt-dremio-cli table promote "$DATASET_ID"
# 3. Configure CSV format
cat > csv_format.json <<EOF
{
"type": "CSV",
"fieldDelimiter": ",",
"skipFirstLine": true,
"extractHeader": true
}
EOF
alt-dremio-cli table format $DATASET_ID --type CSV --from-file csv_format.json

# Get JSON file dataset
DATASET_ID=$(alt-dremio-cli --output json catalog get-by-path "MySource.data.events.json" | jq -r '.id')
# Promote and set format
alt-dremio-cli table promote $DATASET_ID
alt-dremio-cli table format $DATASET_ID --type JSON

#!/bin/bash
# promote_all_csv.sh - Promote all CSV files in a source
SOURCE="MySource"
# Find all CSV files
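# Pass the source name to jq with --arg instead of splicing it into the filter
alt-dremio-cli --output json catalog list \
  | jq -r --arg source "$SOURCE" '.data[] | select(.path[0] == $source and (.path[-1] | endswith(".csv"))) | .id' \
  | while read -r dataset_id; do
      echo "Promoting: $dataset_id"
      alt-dremio-cli table promote "$dataset_id"
      # Assumes csv_format.json already exists in the current directory
      alt-dremio-cli table format "$dataset_id" --type CSV --from-file csv_format.json
    done

#!/bin/bash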
# apply_format.sh - Apply format template
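DATASET_ID=$1
FORMAT_TYPE=$2
# Both arguments are required
if [ -z "$DATASET_ID" ] || [ -z "$FORMAT_TYPE" ]; then
  echo "Usage: $0 <DATASET_ID> <csv|tsv|json>" >&2
  exit 1
fi
case "$FORMAT_TYPE" in
  csv)
    cat > format.json <<EOF
{
"type": "CSV",
"fieldDelimiter": ",",
"skipFirstLine": true,
"extractHeader": true
}
EOF
    ;;
  tsv)
    # Same template as csv, with a tab as the field delimiter
    cat > format.json <<EOF
{
"type": "CSV",
"fieldDelimiter": "\t",
"skipFirstLine": true,
"extractHeader": true
}
EOF
    ;;
  json)
    cat > format.json <<EOF
{
"type": "JSON"
}
EOF
    ;;
  *)
    # Without this branch an unknown type would reach the CLI with no format.json
    echo "Unsupported format type: $FORMAT_TYPE" >&2
    exit 1
    ;;
esac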
alt-dremio-cli table format "$DATASET_ID" --type "${FORMAT_TYPE^^}" --from-file format.json
rm format.json

- Promote before formatting: Always promote a dataset before configuring its format.
  alt-dremio-cli table promote $ID
  alt-dremio-cli table format $ID --type CSV
- Test format settings: Verify the format with a query.
  alt-dremio-cli sql execute "SELECT * FROM dataset LIMIT 10"
- Use format files: Store format configurations for reuse.
  alt-dremio-cli table format $ID --type CSV --from-file standard_csv.json
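
The three tips above can be combined into one helper that runs the steps in order and stops at the first failure. This is a minimal sketch, not part of the CLI: `promote_format_verify` and the `DREMIO_CLI` variable are illustrative names, and it assumes the CLI exits with a non-zero status when a command fails.

```shell
#!/bin/bash
# Sketch: enforce the promote -> format -> verify order.
# DREMIO_CLI and promote_format_verify are hypothetical names for illustration.
DREMIO_CLI="${DREMIO_CLI:-alt-dremio-cli}"

promote_format_verify() {
  if [ "$#" -ne 4 ]; then
    echo "usage: promote_format_verify <DATASET_ID> <FORMAT> <FILE> <TABLE>" >&2
    return 2
  fi
  local dataset_id="$1" format_type="$2" format_file="$3" table_ref="$4"

  # Each step runs only if the previous one succeeded
  # (assumes the CLI signals errors through its exit status).
  "$DREMIO_CLI" table promote "$dataset_id" || return 1
  "$DREMIO_CLI" table format "$dataset_id" --type "$format_type" --from-file "$format_file" || return 1
  "$DREMIO_CLI" sql execute "SELECT * FROM $table_ref LIMIT 10" || return 1
}
```

A call might look like `promote_format_verify "$ID" CSV standard_csv.json dataset`, where the last argument is whatever table reference the verification query should select from.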
$ alt-dremio-cli table promote abc-123
Error: Dataset is already a physical dataset
Solution: Skip promotion and proceed with format configuration.
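
In a script, this case can be treated as a success so that formatting still runs. The sketch below assumes the error text matches the message shown above and that the CLI returns a non-zero exit status on failure; `promote_if_needed` and `DREMIO_CLI` are illustrative names, not CLI features.

```shell
#!/bin/bash
# Sketch: tolerate the "already a physical dataset" error during promotion.
# DREMIO_CLI and promote_if_needed are hypothetical names for illustration.
DREMIO_CLI="${DREMIO_CLI:-alt-dremio-cli}"

promote_if_needed() {
  local dataset_id="$1" output status

  # Capture stdout and stderr so the error text can be inspected.
  output=$("$DREMIO_CLI" table promote "$dataset_id" 2>&1) && return 0
  status=$?

  case "$output" in
    *"already a physical dataset"*)
      # Assumed wording, taken from the error message documented above.
      echo "Skipping promotion: $dataset_id is already a physical dataset" >&2
      return 0
      ;;
  esac

  # Any other failure is passed through unchanged.
  echo "$output" >&2
  return "$status"
}
```

With this in place, `promote_if_needed "$ID" && alt-dremio-cli table format "$ID" --type CSV` proceeds to formatting whether or not the dataset was promoted earlier.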

$ alt-dremio-cli table format abc-123 --type CSV --from-file bad_format.json
Error: Invalid format configuration
Solution: Verify that the file is valid JSON and includes the required fields.
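
A quick local check before calling the CLI can catch the most common mistakes. The sketch below is an assumption about what "required fields" means: it only requires a non-empty string `"type"` field, since that is the one field every example in this guide includes. `check_format_file` is an illustrative helper name. When `jq` is installed the file is fully parsed; otherwise the fallback is a plain text match and does not prove the JSON is well formed.

```shell
#!/bin/bash
# Sketch: pre-flight check for a format configuration file.
# check_format_file is a hypothetical helper, not a CLI command.
check_format_file() {
  local file="$1"

  if [ ! -s "$file" ]; then
    echo "Format file missing or empty: $file" >&2
    return 1
  fi

  if command -v jq >/dev/null 2>&1; then
    # Parses the file and requires "type" to be a non-empty string.
    jq -e '(.type | type) == "string" and (.type | length) > 0' "$file" >/dev/null 2>&1 && return 0
  else
    # Fallback without jq: only checks that a "type" key with a string value appears.
    grep -Eq '"type"[[:space:]]*:[[:space:]]*"[^"]+"' "$file" && return 0
  fi

  echo "Format file is not valid JSON or lacks a \"type\" field: $file" >&2
  return 1
}
```

For example, `check_format_file bad_format.json && alt-dremio-cli table format abc-123 --type CSV --from-file bad_format.json` skips the CLI call when the file fails the check.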

Dremio Software:
- Full table operations support
- All format types available
- Promotion and format configuration

Dremio Cloud:
- Table operations available
- Format types may vary
- Project-scoped operations

- Promote systematically: Promote datasets as part of source setup
- Document formats: Keep format configurations in version control
- Test configurations: Verify format settings with sample queries
- Use templates: Standardize format configurations
- Automate promotion: Script bulk dataset promotion
- CSV - Comma-separated values
- TSV - Tab-separated values
- JSON - JSON documents
- Parquet - Columnar format
- Avro - Row-based format
- Excel - Excel spreadsheets
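
When scripting bulk promotion, the format type can be derived from the file name. The sketch below maps a file extension to one of the format names listed above. The extension mapping itself (for example `.xls` and `.xlsx` for Excel) is an assumption, `format_for_path` is an illustrative name, and whether `--type` accepts every one of these values should be confirmed against your CLI version.

```shell
#!/bin/bash
# Sketch: pick a format name from a file extension.
# format_for_path and the extension mapping are assumptions for illustration.
format_for_path() {
  local name="${1##*/}"           # strip any directory part
  local ext="${name##*.}"         # text after the last dot
  [ "$ext" = "$name" ] && ext=""  # no dot means no extension

  # Lowercase with tr so the function also works on older Bash versions.
  ext=$(printf '%s' "$ext" | tr '[:upper:]' '[:lower:]')

  case "$ext" in
    csv)      echo "CSV" ;;
    tsv)      echo "TSV" ;;
    json)     echo "JSON" ;;
    parquet)  echo "Parquet" ;;
    avro)     echo "Avro" ;;
    xls|xlsx) echo "Excel" ;;
    *)        echo "Unsupported file extension: $1" >&2; return 1 ;;
  esac
}
```

Because only the text after the last dot is used, dotted catalog paths such as `MySource.data.customers.csv` work as well as file system paths.
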
- Promote: Convert datasets to physical datasets
- Format: Configure file format settings
- Update: Modify table metadata
- Automate: Use scripts for bulk operations
- Test: Verify format with queries