Dremio-as-Code (GitOps)

Dremio-as-Code (DAC) allows you to manage your Dremio catalog (Spaces, Folders, Views) using local files, enabling GitOps workflows.

Quick Start

1. Initialize Configuration

Create a dremio.yaml file in your project root:

version: "1.0"
scope:
  path: "dremio-catalog.finance"  # The Dremio path to sync
  type: "SPACE"                   # SPACE or ICEBERGCATALOG
ignore:
  - "*.tmp"
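
The ignore list reads like a set of glob patterns. As a minimal sketch of how such patterns could filter catalog entries, assuming shell-style matching on the entry name (the matching rules alt-dremio-cli actually applies are not described here, and `is_ignored` is a hypothetical helper):

```python
from fnmatch import fnmatch


def is_ignored(name: str, patterns: list[str]) -> bool:
    """Return True if the entry name matches any glob-style ignore pattern."""
    return any(fnmatch(name, pattern) for pattern in patterns)


ignore = ["*.tmp"]  # the ignore list from dremio.yaml above
entries = ["monthly_report", "scratch.tmp", "staging.tmp"]  # made-up entry names

kept = [e for e in entries if not is_ignored(e, ignore)]
print(kept)
```

Only `monthly_report` survives the filter; the two `.tmp` entries are skipped.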

2. Pull State

Capture the current state of your Dremio space into local files.

alt-dremio-cli sync pull

This writes a directory structure that mirrors the Dremio space:

  • finance/
    • monthly_report.sql
    • monthly_report.yaml
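
If the pull writes one .sql and one .yaml file per view, as the listing above suggests (an assumption, since a view can also carry its SQL inline), a quick check for files that lost their partner could look like this. `unpaired_views` is a hypothetical helper, not part of alt-dremio-cli:

```python
from pathlib import Path
import tempfile


def unpaired_views(root: Path) -> list[str]:
    """List views under root that have a .sql file or a .yaml file, but not both."""
    sql = {p.with_suffix("") for p in root.rglob("*.sql")}
    meta = {p.with_suffix("") for p in root.rglob("*.yaml")}
    # Symmetric difference: stems present in exactly one of the two sets.
    return sorted(p.relative_to(root).as_posix() for p in sql ^ meta)


# Demo on a throwaway tree shaped like the listing above, plus one orphan.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "finance").mkdir()
    (root / "finance" / "monthly_report.sql").write_text("SELECT 1")
    (root / "finance" / "monthly_report.yaml").write_text("type: VIEW\n")
    (root / "finance" / "orphan.sql").write_text("SELECT 2")
    missing = unpaired_views(root)
    print(missing)
```

Point it at the pulled catalog directory rather than the project root, otherwise dremio.yaml itself is reported as unpaired.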

View Definition (view.yaml)

Define your virtual datasets with SQL, dependencies, tags, wiki content, governance policies, and reflections.

name: revenue_report
type: VIRTUAL_DATASET
# Full path in Dremio
path: ["dremio-catalog", "finance", "reports", "revenue_report"]
# SQL Definition
sql: |
  SELECT region, sum(amount) as total 
  FROM "dremio-catalog".finance.stg_sales 
  GROUP BY region
# Dependencies
dependencies: 
  - "stg_sales"
# Tags & Wiki
tags: ["finance", "official"]
description: "docs/revenue_report.md" 

# Governance: Access Control (RBAC)
access_control:
  roles:
    - name: "finance_managers"
      privileges: ["SELECT"]
  users:
    - name: "auditor@example.com"
      privileges: ["SELECT", "ALTER"]

# Governance: Row/Column Policies
# (Requires UDFs to be defined separately)
governance:
  row_access_policy:
    name: "protect_region_udf"
    args: ["region"]
  masking_policies:
    - column: "total"
      name: "mask_amount_udf"

# Reflections
reflections:
  - name: "raw_sales_agg"
    type: "RAW"
    displayFields: ["region", "total"]
  - name: "agg_sales_by_region"
    type: "AGGREGATION"
    dimensionFields: ["region"]
    measureFields: ["total"]
    distributionFields: ["region"]
    partitionFields: ["region"]
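
To make the governance block more concrete, here is a rough sketch of the kind of SQL it could translate into. The statements are modeled on Dremio's ALTER VIEW syntax for row access and masking policies. The exact commands alt-dremio-cli issues are not shown in this document, so treat `policy_statements` and `quote_path` as hypothetical helpers and verify the syntax against your Dremio version.

```python
def quote_path(path: list[str]) -> str:
    """Render a Dremio path as a dot-separated, double-quoted identifier."""
    return ".".join('"' + part.replace('"', '""') + '"' for part in path)


def policy_statements(view: dict) -> list[str]:
    """Build ALTER VIEW statements for a view's governance block (illustrative only)."""
    target = quote_path(view["path"])
    gov = view.get("governance", {})
    stmts = []
    row = gov.get("row_access_policy")
    if row:
        args = ", ".join(f'"{a}"' for a in row["args"])
        stmts.append(f'ALTER VIEW {target} ADD ROW ACCESS POLICY {row["name"]}({args})')
    for mask in gov.get("masking_policies", []):
        col = f'"{mask["column"]}"'
        # Assumption: the masked column is passed as the UDF's only argument.
        stmts.append(
            f'ALTER VIEW {target} MODIFY COLUMN {col} SET MASKING POLICY {mask["name"]}({col})'
        )
    return stmts


# The governance block from the view definition above, as a Python dict.
view = {
    "path": ["dremio-catalog", "finance", "reports", "revenue_report"],
    "governance": {
        "row_access_policy": {"name": "protect_region_udf", "args": ["region"]},
        "masking_policies": [{"column": "total", "name": "mask_amount_udf"}],
    },
}

stmts = policy_statements(view)
for stmt in stmts:
    print(stmt)
```

Both UDFs must already exist in Dremio, which matches the note in the YAML that they are defined separately.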

Workflow

  1. Push: alt-dremio-cli sync push

    • Recurses through the local files, orders views by their dependencies, applies the SQL, and updates tags and wiki content.
    • Applies Grants: Resolves role and user names to IDs and enforces access control.
    • Applies Policies: Executes SQL commands to attach row access and masking policies.
  2. Pull: alt-dremio-cli sync pull

    • Fetches the current state and rebuilds folders, views, and wikis as local files.
    • Important Limitation: Governance policies (RBAC, Row Access, Masking) and Reflections are NOT retrieved from Dremio during a pull.
      • To manage them via DAC, you must manually define the access_control, governance, and reflections blocks in your YAML files.
      • The pull command generates only the standard SQL and metadata.
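
The dependency sorting done during a push can be pictured as a topological sort: a view is created only after everything it depends on. This is a minimal sketch using Python's standard graphlib, not the tool's actual implementation. `exec_summary` is a made-up view added only to make the ordering visible.

```python
from graphlib import TopologicalSorter

# View name -> names of the views it depends on
# (the `dependencies` list in each view's YAML file).
dependencies = {
    "stg_sales": [],
    "revenue_report": ["stg_sales"],
    "exec_summary": ["revenue_report"],  # hypothetical view
}

# static_order() yields each node only after all of its dependencies.
# It raises graphlib.CycleError if views depend on each other in a loop.
push_order = list(TopologicalSorter(dependencies).static_order())
print(push_order)
```

For this chain the only valid order is stg_sales, then revenue_report, then exec_summary. A circular dependency fails fast instead of producing a half-applied push.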

3. Make Changes

Edit the pulled files or add new ones. For example, to add a view:

new_view.sql:

SELECT * FROM parent.table

new_view.yaml:

type: VIEW
path: ["dremio-catalog", "finance", "new_view"]
sql_file: "new_view.sql"
context: []
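
A view definition can carry its SQL inline (the sql key) or point to a separate file (the sql_file key). As a sketch of how a loader might resolve either form, assuming sql_file is relative to the directory that holds the YAML file (`resolve_sql` is a hypothetical helper):

```python
from pathlib import Path
import tempfile


def resolve_sql(view: dict, yaml_dir: Path) -> str:
    """Return the view's SQL from the inline `sql` key or the file named by `sql_file`."""
    if "sql" in view:
        return view["sql"].strip()
    # Assumption: sql_file is relative to the directory holding the YAML file.
    return (yaml_dir / view["sql_file"]).read_text().strip()


# Demo using the new_view files shown above.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "new_view.sql").write_text("SELECT * FROM parent.table\n")
    view = {
        "type": "VIEW",
        "path": ["dremio-catalog", "finance", "new_view"],
        "sql_file": "new_view.sql",
    }
    sql = resolve_sql(view, folder)
    print(sql)
```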

4. Push Changes

Apply your local changes back to Dremio.

alt-dremio-cli sync push

Feature Guides

  • Sources: Manage Dremio Sources (S3, Nessie, Relational).
  • Tables: Manage Physical Datasets and Iceberg Tables.
  • Validations: Define Data Quality Checks.
  • Reflections: Manage RAW and AGGREGATION reflections.
  • Governance: Manage Access Control (RBAC) and Row/Column Policies.

Concepts

  • Scope: Limits the sync to a specific subtree to support multi-team environments.
  • State File: .dremio_state.json tracks the last known state to enable efficient updates and deletes. Do not edit this file manually.
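
To illustrate why a last-known state makes updates and deletes cheap, here is a generic sketch that compares stored fingerprints with the current local files. The real layout of .dremio_state.json is not described in this document. The path-to-hash mapping, `fingerprint`, and `plan_changes` are all assumptions made for illustration.

```python
import hashlib


def fingerprint(sql: str) -> str:
    """Hash a view's SQL so that changes can be detected without storing the text."""
    return hashlib.sha256(sql.encode("utf-8")).hexdigest()


def plan_changes(previous: dict[str, str], local: dict[str, str]) -> dict[str, list[str]]:
    """Compare last-known fingerprints with local ones, keyed by view path."""
    return {
        "create": sorted(local.keys() - previous.keys()),
        "delete": sorted(previous.keys() - local.keys()),
        "update": sorted(k for k in local.keys() & previous.keys() if local[k] != previous[k]),
    }


# Hypothetical state from the last sync versus the files on disk now.
previous = {
    "finance.monthly_report": fingerprint("SELECT 1"),
    "finance.old_view": fingerprint("SELECT 2"),
}
local = {
    "finance.monthly_report": fingerprint("SELECT 1 AS x"),
    "finance.new_view": fingerprint("SELECT * FROM parent.table"),
}

plan = plan_changes(previous, local)
print(plan)
```

Without the previous state, a view that was removed locally is indistinguishable from one that was never managed by DAC. That is why deletes depend on the state file.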