Proposed
Unity Catalog provides centralized governance, access control, and lineage tracking across Databricks workspaces. Following the guidelines in ADR-001 for medallion architecture inside data products, we need to establish how data products map to Unity Catalog's three-level hierarchy (catalog.schema.table) to ensure consistent governance and discoverability.
Data products represent autonomous software development lifecycle scopes in our decentralized data architecture, making them natural candidates for catalog-level isolation in Unity Catalog.
- Catalog: {environment}_{domain}_{product_name}_dp
  - Examples: dev_sales_orders_dp, prod_finance_revenue_dp
- Schemas: bronze, silver, gold (aligned with medallion layers from ADR-001)
- Tables: {source_or_entity}_{type}_tbl
  - Examples: orders_data_tbl, customer_metrics_tbl
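
The naming convention above can be sketched as a small helper that assembles fully qualified table names. This is a minimal illustration, not a prescribed implementation: the function names are hypothetical, and the set of allowed environments is assumed from the examples (only dev and prod appear in this ADR).

```python
# Environments seen in this ADR's examples; extend as needed (assumption).
ENVIRONMENTS = {"dev", "prod"}
# Schemas are fixed to the medallion layers from ADR-001.
MEDALLION_SCHEMAS = ("bronze", "silver", "gold")


def catalog_name(environment: str, domain: str, product_name: str) -> str:
    """Build {environment}_{domain}_{product_name}_dp."""
    if environment not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {environment!r}")
    return f"{environment}_{domain}_{product_name}_dp"


def table_name(source_or_entity: str, type_: str) -> str:
    """Build {source_or_entity}_{type}_tbl."""
    return f"{source_or_entity}_{type_}_tbl"


def full_table_name(environment: str, domain: str, product_name: str,
                    schema: str, source_or_entity: str, type_: str) -> str:
    """Build the three-level name catalog.schema.table."""
    if schema not in MEDALLION_SCHEMAS:
        raise ValueError(f"schema must be one of {MEDALLION_SCHEMAS}")
    catalog = catalog_name(environment, domain, product_name)
    return f"{catalog}.{schema}.{table_name(source_or_entity, type_)}"


print(full_table_name("dev", "sales", "orders", "bronze", "orders", "data"))
# dev_sales_orders_dp.bronze.orders_data_tbl
```

Generating names from one helper, rather than typing them by hand, keeps the convention enforceable in provisioning scripts.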
- Bronze Schema: Data engineers and ETL service accounts only
- Silver Schema: Data analysts and approved business users
- Gold Schema: Business stakeholders and BI tools
- Cross-layer: Row-level security and column masking applied as needed
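
The per-schema access tiers above could be expressed as Unity Catalog GRANT statements generated per data product. The sketch below is an assumption-laden illustration: the group names are hypothetical placeholders for the roles listed, only read access is modeled (write privileges for engineers and ETL accounts are omitted), and row-level security and column masking are not covered.

```python
# Hypothetical group names standing in for the roles in this ADR.
SCHEMA_READERS = {
    "bronze": ["data_engineers", "etl_service_accounts"],
    "silver": ["data_analysts", "approved_business_users"],
    "gold": ["business_stakeholders", "bi_tools"],
}


def grant_statements(catalog: str) -> list[str]:
    """Return read-access GRANT statements for one data product catalog."""
    statements = []
    for schema, principals in SCHEMA_READERS.items():
        for principal in principals:
            # A principal needs USE CATALOG and USE SCHEMA before SELECT takes effect.
            statements.append(
                f"GRANT USE CATALOG ON CATALOG {catalog} TO `{principal}`")
            statements.append(
                f"GRANT USE SCHEMA ON SCHEMA {catalog}.{schema} TO `{principal}`")
            statements.append(
                f"GRANT SELECT ON SCHEMA {catalog}.{schema} TO `{principal}`")
    return statements


for statement in grant_statements("dev_sales_orders_dp"):
    print(statement)
```

Because every data product shares the same three schemas, the same mapping can be applied unchanged to each new catalog.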
- Lineage: Automatic tracking across medallion layers within each data product
- Data Classification: Sensitive data tagged at column level
- Audit Logging: All access and modifications tracked per data product catalog
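
Column-level classification, as described above, might be applied with Unity Catalog column tags. The following is a rough sketch only: the helper name, the `email` column, and the `classification` tag key are hypothetical, and the function does not escape quotes in tag values.

```python
def tag_column_statement(table: str, column: str, tags: dict[str, str]) -> str:
    """Build an ALTER TABLE statement that sets tags on a single column."""
    tag_list = ", ".join(f"'{key}' = '{value}'" for key, value in tags.items())
    return f"ALTER TABLE {table} ALTER COLUMN {column} SET TAGS ({tag_list})"


print(tag_column_statement(
    "prod_finance_revenue_dp.silver.customer_metrics_tbl",
    "email",                      # hypothetical column
    {"classification": "pii"},    # hypothetical tag key and value
))
# ALTER TABLE prod_finance_revenue_dp.silver.customer_metrics_tbl ALTER COLUMN email SET TAGS ('classification' = 'pii')
```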
- Isolation: Each data product has a dedicated catalog, ensuring clear boundaries
- Governance: Consistent access patterns aligned with medallion architecture
- Discoverability: Standardized naming enables automated discovery and cataloging
- Compliance: Centralized audit trails and data classification per data product
- Catalog Proliferation: Each data product creates a separate catalog
- Cross-Product Queries: Joining data across products requires cross-catalog permissions
- Management Overhead: More catalogs to provision and maintain
- Risk: Catalog sprawl becomes unmanageable
- Mitigation: Implement automated catalog lifecycle management and naming conventions
- Risk: Complex cross-product access patterns
- Mitigation: Define clear data sharing contracts through output ports (ADR-001)
- Risk: Inconsistent governance across catalogs
- Mitigation: Standardized templates and automated policy enforcement
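
As one possible building block for the automated enforcement mentioned in the mitigations, a periodic job could flag catalogs whose names do not follow the convention. This sketch assumes the list of catalog names is obtained elsewhere, that only dev and prod environments exist, and that name segments are lowercase alphanumerics; the function name is hypothetical.

```python
import re

# {environment}_{domain}_{product_name}_dp, where product_name may itself
# contain underscores. Environments and character set are assumptions.
CATALOG_PATTERN = re.compile(
    r"^(dev|prod)_[a-z0-9]+_[a-z0-9]+(?:_[a-z0-9]+)*_dp$")


def nonconforming_catalogs(names: list[str]) -> list[str]:
    """Return the catalog names that do not match the naming convention."""
    return [name for name in names if not CATALOG_PATTERN.fullmatch(name)]


print(nonconforming_catalogs(
    ["dev_sales_orders_dp", "sandbox_tmp", "prod_finance_dp"]))
# ['sandbox_tmp', 'prod_finance_dp']
```

The second flagged name is rejected because it has a domain but no product name segment.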
- ADR-001: Medallion Architecture Mapping to Data Product Internals