clickzetta-skills

This repository contains AI Agent skills for ClickZetta Lakehouse. The skills are designed for Codex, Claude Code, czcode, and other assistants that can load a SKILL.md-based instruction set.

The repository turns ClickZetta operational knowledge into reusable routing rules, workflows, SQL patterns, and reference documents. An assistant can use these skills to select the correct domain workflow, read the minimum required reference material, and generate or execute the right ClickZetta steps for the user's task.

Repository Contents

The repository currently contains 26 top-level clickzetta-* skills and one official documentation knowledge base:

  • lakehouse-doc-en: English ClickZetta Lakehouse official documentation index and reference corpus.
  • clickzetta-*: task-oriented skills for ingestion, Studio tasks, dbt, modeling, dynamic tables, connectors, external integrations, governance, and operations.

Skill Catalog

| Category | Skill | Scope |
| --- | --- | --- |
| Official docs | lakehouse-doc-en | Official ClickZetta Lakehouse documentation covering SQL, functions, permissions, VClusters, data sharing, SDKs, BI tools, AI functions, and general platform usage. |
| Foundation and connectivity | clickzetta-overview | Product overview, object model, architecture, Studio modules, brand naming, and service endpoints. |
| Foundation and connectivity | clickzetta-sql-migration | SQL migration guidance from Snowflake, Databricks, and Spark SQL to ClickZetta SQL. |
| Data ingestion and pipelines | clickzetta-data-ingest-pipeline | Router for choosing ingestion methods based on source type, latency, sync scope, and whether ingestion is one-time or continuous. |
| Data ingestion and pipelines | clickzetta-file-import-pipeline | Import data from URLs, local files, or Volume paths using format inference, table creation, and COPY INTO. |
| Data ingestion and pipelines | clickzetta-oss-ingest-pipeline | Batch or continuous ingestion from OSS, S3, or COS through Storage Connections, External Volumes, Pipes, and COPY INTO. |
| Data ingestion and pipelines | clickzetta-kafka-ingest-pipeline | Kafka ingestion through READ_KAFKA Pipes or Kafka External Table plus Table Stream pipelines. |
| Data ingestion and pipelines | clickzetta-batch-sync-pipeline | Studio offline batch sync tasks for single-table sync, multi-table mirrors, and sharded-table merge scenarios. |
| Data ingestion and pipelines | clickzetta-realtime-sync-pipeline | Studio single-table real-time sync tasks for Kafka, MySQL, PostgreSQL, and related sources. |
| Data ingestion and pipelines | clickzetta-cdc-sync-pipeline | Studio multi-table CDC sync from MySQL or PostgreSQL into Lakehouse, including full database mirror and sharded-table merge modes. |
| Data ingestion and pipelines | clickzetta-sql-pipeline-manager | SQL-native management for Dynamic Tables, Materialized Views, Table Streams, Pipes, and layered SQL pipelines. |
| Data ingestion and pipelines | clickzetta-table-stream-pipeline | Table Stream change data capture workflows, offset handling, preview, consumption, and idempotent downstream writes. |
| Data ingestion and pipelines | clickzetta-studio-task-manager | Studio task creation, folder organization, scheduling, dependencies, deployment, task operations, and engineering conventions. |
| Data ingestion and pipelines | clickzetta-pipeline-review | Pipeline review and diagnostics across Studio tasks, Lakehouse objects, pipeline SQL, and run histories. |
| Data ingestion and pipelines | clickzetta-dbt-studio-pipeline | Publish dbt models into Studio assets and configure scheduled execution from dbt artifacts. |
| Modeling and analytics | clickzetta-dw-modeling | Data warehouse modeling for ODS/DWD/DWS/ADS, Medallion architecture, schema design, and pipeline-aware modeling. |
| Modeling and analytics | clickzetta-dbt-project-setup | dbt-clickzetta project initialization, profiles.yml, dbt_project.yml, and layered project standards. |
| Modeling and analytics | clickzetta-dbt-modeling | dbt source discovery, model design, incremental materialization, tests, and model generation. |
| Modeling and analytics | clickzetta-dynamic-table | Dynamic Table creation, refresh configuration, incremental computation, ALTER workflows, refresh history, and best practices. |
| Modeling and analytics | clickzetta-data-science | Data science workflows using SQL, ZettaPark, notebooks, EDA, feature engineering, inference, and vector retrieval. |
| Modeling and analytics | clickzetta-semantic-view | Semantic View modeling with logical tables, dimensions, metrics, filters, and semantic layer queries. |
| SDK and integrations | clickzetta-zettapark | ZettaPark DataFrame API, Session setup, reads, transformations, writes, file operations, and SQL execution. |
| SDK and integrations | clickzetta-spark-flink-connector | Spark Connector reads/writes and Flink Write Connector CDC or append-only writes. |
| SDK and integrations | clickzetta-external-function | External Functions, Python/Java UDF packaging, and cloud function integration. |
| SDK and integrations | clickzetta-ai-function | Built-in AI functions: AI_COMPLETE (call LLMs) and AI_EMBEDDING (text vectors) with API CONNECTION setup. |
| Operations and governance | clickzetta-volume-manager | External Volume, User Volume, Table Volume, object storage mounting, file operations, import, and export. |
| Operations and governance | clickzetta-table-lineage | Table lineage and cost visualization based on information_schema.job_history and generated HTML artifacts. |

Routing Guide

Use the table below when deciding which skill should handle a user request.

| User intent | Recommended entry point |
| --- | --- |
| Understand ClickZetta concepts, Workspace, Schema, VCluster, object hierarchy, or Studio modules. | clickzetta-overview |
| Configure Python, JDBC, SQLAlchemy, ZettaPark, or general Lakehouse connections. | clickzetta-zettapark / lakehouse-doc-en |
| Choose an ingestion method without knowing whether to use files, object storage, Kafka, batch sync, real-time sync, or CDC. | clickzetta-data-ingest-pipeline |
| Import files from a local path, URL, Volume, object storage, or Kafka. | clickzetta-file-import-pipeline / clickzetta-oss-ingest-pipeline / clickzetta-kafka-ingest-pipeline |
| Build Studio offline sync, single-table real-time sync, or multi-table CDC tasks. | clickzetta-batch-sync-pipeline / clickzetta-realtime-sync-pipeline / clickzetta-cdc-sync-pipeline |
| Manage Studio tasks, folders, schedules, dependencies, deployments, and task state. | clickzetta-studio-task-manager |
| Initialize a dbt project, build dbt models, or publish dbt models to Studio. | clickzetta-dbt-project-setup / clickzetta-dbt-modeling / clickzetta-dbt-studio-pipeline |
| Design SQL pipelines, Dynamic Tables, Materialized Views, Pipes, or Table Streams. | clickzetta-sql-pipeline-manager / clickzetta-dynamic-table / clickzetta-table-stream-pipeline |
| Review an existing pipeline, diagnose task failures, inspect dependencies, or identify data quality issues. | clickzetta-pipeline-review |
| Write native ClickZetta SQL, look up syntax, functions, permissions, VClusters, or official product behavior. | lakehouse-doc-en |
| Migrate SQL from Snowflake, Databricks, or Spark SQL. | clickzetta-sql-migration |
| Query metadata, table structures, job history, cost attribution, permissions, Time Travel, recovery, or platform operations. | lakehouse-doc-en |
| Investigate query performance, EXPLAIN, Result Cache, OPTIMIZE, small files, or execution plans. | lakehouse-doc-en |
| Manage users, roles, grants, masking policy, network policy, lifecycle, data sharing, SDK, BI, or Java/Python application docs. | lakehouse-doc-en |
| Manage Volumes, object storage mounts, file upload/download, import, or export. | clickzetta-volume-manager |
| Work with Spark, Flink, ZettaPark, External Functions, or External Catalogs. | clickzetta-spark-flink-connector / clickzetta-zettapark / clickzetta-external-function / lakehouse-doc-en |
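
To make the routing guide concrete, the sketch below treats a few of its rows as data. It is only an illustration: in practice an assistant routes by reading each skill's SKILL.md description, not by keyword matching. The keyword lists and the `route` function are hypothetical and not part of this repository. The fallback to lakehouse-doc-en mirrors its role as the documentation fallback.

```python
from __future__ import annotations

# Hypothetical keyword lists; the skill names are taken from the routing guide.
ROUTES = [
    (("cdc", "multi-table"), ["clickzetta-cdc-sync-pipeline"]),
    (("kafka",), ["clickzetta-kafka-ingest-pipeline"]),
    (("dynamic table",), ["clickzetta-dynamic-table"]),
    (("snowflake", "databricks", "spark sql"), ["clickzetta-sql-migration"]),
    (("volume", "upload", "download"), ["clickzetta-volume-manager"]),
]

# The official documentation skill acts as the fallback entry point.
FALLBACK = ["lakehouse-doc-en"]


def route(request: str) -> list[str]:
    """Return candidate skills for a request, in routing-table order."""
    text = request.lower()
    matched: list[str] = []
    for keywords, skills in ROUTES:
        if any(keyword in text for keyword in keywords):
            for skill in skills:
                if skill not in matched:  # keep order, avoid duplicates
                    matched.append(skill)
    return matched or FALLBACK


print(route("Build a CDC pipeline from MySQL to Lakehouse"))
print(route("How do I grant a role to a user?"))
```

A request that matches several rows returns several candidates, which corresponds to the rows in the guide that list more than one entry point.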

Repository Layout

Each top-level skill is stored in one directory:

clickzetta-<domain>/
├── SKILL.md
└── references/
    └── *.md

SKILL.md is the routing and workflow entry point. It should define:

  • the skill name and trigger description in front matter;
  • when the skill should and should not be used;
  • the minimal workflow the agent should follow;
  • which reference files to read for detailed syntax, examples, or troubleshooting.
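
As a rough illustration of the front matter mentioned above, the following sketch splits a SKILL.md string into its fields and body. It assumes the front matter is a block delimited by `---` lines containing simple `key: value` pairs, and that the keys are `name` and `description`. Both are assumptions for this example, as is the skill `clickzetta-example`, which does not exist in the repository.

```python
from __future__ import annotations


def parse_front_matter(text: str) -> tuple[dict[str, str], str]:
    """Split a SKILL.md string into (front-matter fields, body).

    Assumes '---' delimiters and flat 'key: value' lines (not full YAML).
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    fields: dict[str, str] = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            # Everything after the closing delimiter is the workflow body.
            return fields, "\n".join(lines[i + 1:])
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    # No closing delimiter: treat the file as having no front matter.
    return {}, text


# Hypothetical SKILL.md content, used only to show the layout.
EXAMPLE = """---
name: clickzetta-example
description: Hypothetical skill used only to illustrate the layout.
---
# Workflow
1. Read references/syntax.md only when detailed syntax is needed.
"""

fields, body = parse_front_matter(EXAMPLE)
print(fields["name"])
print(body.splitlines()[0])
```

A real SKILL.md may use richer YAML than this parser handles. A YAML library would be the safer choice outside of a sketch.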

references/ contains detailed technical material, SQL snippets, operational playbooks, API notes, and examples. Agents should load only the specific reference files needed for the current task.
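
The "load only what is needed" rule can be sketched as two small helpers: one that lists the reference files a skill offers, and one that reads a single file by name. The function names and the temporary `clickzetta-example` directory are hypothetical. Only the `SKILL.md` plus `references/*.md` layout comes from this README.

```python
from __future__ import annotations

import tempfile
from pathlib import Path


def list_references(skill_dir: Path) -> list[str]:
    """Return the names of available reference files without reading them."""
    refs = skill_dir / "references"
    if not refs.is_dir():
        return []
    return sorted(p.name for p in refs.glob("*.md"))


def load_reference(skill_dir: Path, name: str) -> str:
    """Read one reference file; reject names that point outside references/."""
    path = skill_dir / "references" / name
    if path.name != name or path.suffix != ".md":
        raise ValueError(f"not a reference file name: {name!r}")
    return path.read_text(encoding="utf-8")


# Demo against a throwaway directory that mimics the documented layout.
with tempfile.TemporaryDirectory() as tmp:
    skill = Path(tmp) / "clickzetta-example"
    (skill / "references").mkdir(parents=True)
    (skill / "SKILL.md").write_text("# placeholder\n", encoding="utf-8")
    (skill / "references" / "syntax.md").write_text("COPY INTO notes", encoding="utf-8")
    (skill / "references" / "troubleshooting.md").write_text("Common errors", encoding="utf-8")

    available = list_references(skill)       # cheap: names only
    loaded = load_reference(skill, "syntax.md")  # read just the file the task needs

print(available)
print(loaded)
```

Listing names first and reading on demand keeps the amount of reference text loaded proportional to the task rather than to the size of the skill.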

Some skills contain additional sub-skill or best-practice directories. For example:

clickzetta-dynamic-table/
├── dt-creator/
├── sql-to-dt/
└── best-practices/

Official Documentation Skill

lakehouse-doc-en is the authoritative fallback for native ClickZetta behavior. Use it when:

  • a topic has been consolidated into official documentation rather than a dedicated operational skill;
  • the user asks for SQL syntax, functions, permissions, VCluster behavior, data sharing, Time Travel, SDK, BI, AI functions, or other official product capabilities;
  • a workflow skill needs product-level confirmation from the docs.

Maintenance Notes

When adding, renaming, or deleting a skill:

  1. Update the top-level skill directory.
  2. Update .well-known/skills/index.json.
  3. Update this README.md catalog and routing table.
  4. Search the repository for stale skill references.
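
Step 4 can be partly automated. The sketch below scans a piece of text for `clickzetta-*` names and reports those that do not match an existing skill directory. It assumes a skill directory is any top-level directory containing a SKILL.md. The function names and the ignore list are hypothetical, and the ignore list exists because the repository name `clickzetta-skills` also matches the pattern. The sketch does not parse .well-known/skills/index.json, because its schema is not described here.

```python
from __future__ import annotations

import re
from pathlib import Path

# Matches names such as clickzetta-dynamic-table; skips "clickzetta-*" and "clickzetta-<domain>".
SKILL_NAME = re.compile(r"\bclickzetta-[a-z0-9]+(?:-[a-z0-9]+)*\b")

# Names that look like skills but are not (assumption: only the repository name).
DEFAULT_IGNORE = frozenset({"clickzetta-skills"})


def existing_skills(repo_root: Path) -> set[str]:
    """Top-level directories that contain a SKILL.md."""
    return {
        p.name
        for p in repo_root.iterdir()
        if p.is_dir() and (p / "SKILL.md").is_file()
    }


def stale_skill_references(
    text: str,
    existing: set[str],
    ignore: frozenset[str] = DEFAULT_IGNORE,
) -> set[str]:
    """Return skill names mentioned in text that have no matching directory."""
    mentioned = set(SKILL_NAME.findall(text))
    return mentioned - existing - ignore


sample = "Use clickzetta-overview or clickzetta-old-skill. See clickzetta-skills."
stale = stale_skill_references(sample, existing={"clickzetta-overview"})
print(sorted(stale))
```

In a real check, `existing` would come from `existing_skills(Path("."))` and `text` from README.md or any other file that names skills.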

For deleted or consolidated topics, route users to lakehouse-doc-en unless there is another active task-specific skill that clearly owns the workflow.

Usage

Install or expose this repository's skill directories to an AI coding assistant that supports SKILL.md-based skills. Users can then describe ClickZetta tasks directly, for example:

Import CSV files from OSS into public.orders on a schedule.
Build a CDC pipeline from MySQL to Lakehouse and publish it as a Studio task.
Create a Dynamic Table for the DWS order summary layer and configure refresh behavior.

The assistant should route the request through the appropriate SKILL.md, read only the required references, and generate the concrete SQL, cz-cli commands, Studio workflow, or troubleshooting steps for the task.

About

A collection of AI Agent Skills for ClickZetta Lakehouse (云器 Lakehouse), intended for Codex, Claude Code, cz-cli, czcode, and other AI coding assistants that support the Skills mechanism.