# Plugin Contract Reference

Concrete proto↔plugin mappings for the three core AppKit plugins.

## Files Plugin Contract

**Plugin manifest**: `files/manifest.json`
**Resource**: UC Volume with `WRITE_VOLUME` permission
**Env**: `DATABRICKS_VOLUME_FILES` for volume path

### Boundary: What the files plugin owns

The files plugin is the ONLY module that touches UC Volumes. Other modules
interact with files through typed proto messages, never raw paths.

```
┌─────────────┐    UploadRequest    ┌──────────────┐
│  api module │ ──────────────────→ │ files plugin │
│             │ ←────────────────── │              │
│             │   StoredArtifact    │  UC Volumes  │
└─────────────┘                     └──────────────┘
```

### Proto → Plugin Method Mapping

| Proto Message | Plugin Method | Direction |
|---------------|---------------|-----------|
| `UploadRequest` | `files.upload(path, content, opts)` | IN |
| `StoredArtifact` | Return type of upload/getInfo | OUT |
| `VolumeLayout` | `files.config.volumePath` + conventions | CONFIG |
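From the caller's side, the mapping above can be sketched as follows. The interface shapes here (`FilesPlugin`, the `UploadRequest`/`StoredArtifact` fields) are assumptions drawn from the table, not the plugin's actual typings:

```typescript
// Hypothetical typings for the contract in the table above.
interface UploadRequest {
  destinationPath: string; // relative to the volume root, e.g. "uploads/report.csv"
  content: Uint8Array;
}

interface StoredArtifact {
  volumePath: string; // absolute /Volumes/... path, owned by the plugin
  sizeBytes: number;
}

interface FilesPlugin {
  upload(
    path: string,
    content: Uint8Array,
    opts?: { overwrite?: boolean },
  ): Promise<StoredArtifact>;
}

// A caller in the api module passes the typed request through; it never
// builds raw /Volumes paths itself — only the plugin knows the volume root.
async function storeUpload(
  files: FilesPlugin,
  req: UploadRequest,
): Promise<StoredArtifact> {
  return files.upload(req.destinationPath, req.content, { overwrite: false });
}
```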
### Volume Path Convention (from VolumeLayout proto)

```
/Volumes/{catalog}/{schema}/{volume}/
├── uploads/      # User uploads (UploadRequest.destination_path)
├── results/      # Computed outputs (StoredArtifact)
│   └── {run_id}/
│       ├── output.proto.bin   # Binary proto serialization
│       └── output.json        # JSON for debugging
└── artifacts/    # Build artifacts, archives
    └── {app_name}/
        └── {version}/
```
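The layout above lends itself to small path builders, so call sites never concatenate `/Volumes/...` strings by hand. The `VolumeLayout` field names here are assumed from the path template:

```typescript
// Assumed shape of the VolumeLayout proto; field names are illustrative.
interface VolumeLayout {
  catalog: string;
  schema: string;
  volume: string;
}

const root = (l: VolumeLayout) => `/Volumes/${l.catalog}/${l.schema}/${l.volume}`;

// Builders mirroring the convention tree above, one per top-level directory.
const uploadPath = (l: VolumeLayout, name: string) => `${root(l)}/uploads/${name}`;
const resultPath = (l: VolumeLayout, runId: string, file: string) =>
  `${root(l)}/results/${runId}/${file}`;
const artifactPath = (l: VolumeLayout, app: string, version: string) =>
  `${root(l)}/artifacts/${app}/${version}`;
```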

### Config ↔ Proto Mapping

| manifest.json field | Proto field | Notes |
|---------------------|-------------|-------|
| `config.timeout` (30000) | Not in proto | Plugin-internal config |
| `config.maxUploadSize` (5GB) | `UploadRequest.content` max size | Validation constraint |
| `resources.path` env | `VolumeLayout.root` | Runtime injection |

---

## Lakebase Plugin Contract

**Plugin manifest**: `lakebase/manifest.json`
**Resource**: Postgres with `CAN_CONNECT_AND_CREATE` permission
**Env**: `PGHOST`, `PGDATABASE`, `PGPORT`, `PGSSLMODE`, `LAKEBASE_ENDPOINT`

### Boundary: What the lakebase plugin owns

Lakebase owns ALL structured data. Every table's schema is derived from a proto
message in `database.proto`. No ad-hoc `CREATE TABLE` statements.

```
┌─────────────┐      RunRecord      ┌──────────────┐
│ compute mod │ ──────────────────→ │   lakebase   │
│             │                     │    plugin    │
│             │     MetricRecord    │              │
│             │ ──────────────────→ │   Postgres   │
└─────────────┘                     └──────┬───────┘
                                           │
┌─────────────┐         SQL query          │
│  analytics  │ ←──────────────────────────┘
│   module    │       RunRecord[]
└─────────────┘
```

### Proto → Table Mapping

| Proto Message | Table Name | Primary Key | Notes |
|---------------|------------|-------------|-------|
| `RunRecord` | `runs` | `(run_id, app_name)` | One row per run |
| `MetricRecord` | `metrics` | auto-increment | FK to `runs.run_id` |
| `ConfigRecord` | `configs` | `config_id` | Versioned configs |
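One way the composite key above pays off: it makes per-run writes naturally idempotent. A sketch of the SQL the plugin might emit for a `RunRecord` (the column set and `status` field are illustrative, not the real schema):

```typescript
// Illustrative subset of the RunRecord proto.
interface RunRecord {
  runId: string;
  appName: string;
  status: string;
}

// Hypothetical parameterized upsert for the runs table. The composite
// primary key (run_id, app_name) from the table above drives the
// ON CONFLICT clause, so replaying the same run overwrites its row
// instead of failing.
function upsertRunSql(r: RunRecord): { text: string; values: string[] } {
  return {
    text:
      `INSERT INTO runs (run_id, app_name, status) VALUES ($1, $2, $3) ` +
      `ON CONFLICT (run_id, app_name) DO UPDATE SET status = EXCLUDED.status`,
    values: [r.runId, r.appName, r.status],
  };
}
```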

### Proto → DDL Type Mapping

| Proto Type | SQL Type | Column Default |
|------------|----------|----------------|
| `string` | `TEXT` | `''` |
| `bool` | `BOOLEAN` | `false` |
| `int32` | `INTEGER` | `0` |
| `int64` | `BIGINT` | `0` |
| `double` | `DOUBLE PRECISION` | `0.0` |
| `bytes` | `BYTEA` | `NULL` |
| `Timestamp` | `TIMESTAMPTZ` | `NOW()` |
| `repeated T` | `JSONB` | `'[]'::jsonb` |
| `map<K,V>` | `JSONB` | `'{}'::jsonb` |
| nested message | `JSONB` | `NULL` |
| `enum` | `TEXT` | First value name |
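Expressed as data, the table above becomes a lookup a DDL generator can consume. This is a sketch, not the plugin's actual generator (`repeated`, `map`, and nested messages are collapsed to single keys here, and the `enum` default is omitted since it depends on the message):

```typescript
// The type-mapping table above, as data. Scalar proto types map 1:1;
// repeated / map / nested message all collapse to JSONB.
const sqlType: Record<string, { type: string; default: string }> = {
  string: { type: "TEXT", default: "''" },
  bool: { type: "BOOLEAN", default: "false" },
  int32: { type: "INTEGER", default: "0" },
  int64: { type: "BIGINT", default: "0" },
  double: { type: "DOUBLE PRECISION", default: "0.0" },
  bytes: { type: "BYTEA", default: "NULL" },
  timestamp: { type: "TIMESTAMPTZ", default: "NOW()" },
  repeated: { type: "JSONB", default: "'[]'::jsonb" },
  map: { type: "JSONB", default: "'{}'::jsonb" },
  message: { type: "JSONB", default: "NULL" },
};

// One column of a generated CREATE TABLE statement.
function columnDdl(name: string, protoType: string): string {
  const m = sqlType[protoType];
  return `${name} ${m.type} DEFAULT ${m.default}`;
}
```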

### Migration Convention

```
migrations/
├── 001_create_runs.sql
├── 002_create_metrics.sql
├── 003_create_configs.sql
└── 004_add_metrics_index.sql
```

Each migration is idempotent (`CREATE TABLE IF NOT EXISTS`, `CREATE INDEX IF NOT EXISTS`).
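A minimal runner for this convention only needs to apply files in sorted order: the numeric prefixes make lexicographic order equal to application order, and idempotent SQL makes re-runs safe. The `Query` client shape is an assumption:

```typescript
// Assumed pg-style client: executes one SQL string.
type Query = (sql: string) => Promise<void>;

// Apply every migration file, in lexicographic (= numeric-prefix) order.
// Because each file is idempotent, there is no "already applied" bookkeeping:
// re-running the full sequence is a no-op for existing objects.
async function migrate(
  files: Map<string, string>, // filename -> SQL contents
  query: Query,
): Promise<string[]> {
  const applied: string[] = [];
  for (const name of [...files.keys()].sort()) {
    await query(files.get(name)!);
    applied.push(name);
  }
  return applied;
}
```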

### Config ↔ Proto Mapping

| manifest.json field | Proto usage | Notes |
|---------------------|-------------|-------|
| `resources.branch` | Not in proto | Infrastructure config |
| `resources.database` | Not in proto | Infrastructure config |
| `resources.host` (`PGHOST`) | Connection string | Runtime injection |
| `resources.databaseName` (`PGDATABASE`) | Database selection | Runtime injection |

---

## Jobs / Compute Contract

**No plugin manifest** — Jobs are invoked via `@databricks/sdk-experimental`
**Resource**: Databricks Jobs API
**Auth**: Workspace token or OAuth

### Boundary: What the jobs module owns

The jobs module owns compute execution. It receives typed task inputs, runs them
on Databricks clusters, and produces typed task outputs.

```
┌─────────────┐      JobConfig      ┌──────────────┐
│  api module │ ──────────────────→ │  jobs module │
│             │                     │              │
│             │     JobTaskInput    │  Databricks  │
│             │ ──────────────────→ │   Jobs API   │
│             │                     │              │
│             │    JobTaskOutput    │   Clusters   │
│             │ ←────────────────── │              │
└─────────────┘                     └──────────────┘
```

### Proto → Jobs SDK Mapping

| Proto Message | SDK Method | Direction |
|---------------|------------|-----------|
| `JobConfig` | `jobs.create(config)` | IN — defines the job |
| `TaskConfig` | Task within a job | IN — defines task deps |
| `JobTaskInput` | Task params (base64 proto) | IN — task receives |
| `JobTaskOutput` | Task output (written to Volume) | OUT — task produces |

### Task Parameter Convention

Job tasks receive their typed input via:

1. **Small payloads (<256KB)**: Base64-encoded proto in task params
2. **Large payloads**: Proto binary written to UC Volume, path passed as param

```typescript
// Producer (api module)
const input: JobTaskInput = { taskId, taskType, runId, inputPayload };
const encoded = Buffer.from(JobTaskInput.encode(input).finish()).toString('base64');
// Pass as notebook parameter: { "input": encoded }

// Consumer (job task code)
const decoded = JobTaskInput.decode(Buffer.from(params.input, 'base64'));
```

### Task Output Convention

Job tasks write their typed output to:

```
/Volumes/{catalog}/{schema}/{volume}/results/{run_id}/{task_id}.output.bin
```

The output is a serialized `JobTaskOutput` proto. The orchestrator reads it
back with the generated decoder.
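The read-back side can be sketched with the dependencies injected, so the orchestrator stays decoupled from the files plugin. The `JobTaskOutput` fields and the `read`/`decode` signatures are assumptions:

```typescript
// Illustrative subset of the JobTaskOutput proto.
interface JobTaskOutput {
  taskId: string;
  success: boolean;
}

// Build the conventional output path and decode the serialized proto.
// `read` stands in for whatever the files plugin exposes for reads;
// `decode` is the generated proto decoder mentioned above.
async function readTaskOutput(
  read: (path: string) => Promise<Uint8Array>,
  decode: (bytes: Uint8Array) => JobTaskOutput,
  volumeRoot: string,
  runId: string,
  taskId: string,
): Promise<JobTaskOutput> {
  const path = `${volumeRoot}/results/${runId}/${taskId}.output.bin`;
  return decode(await read(path));
}
```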

### Jobs API Patterns

```typescript
// Create a multi-task job from a JobConfig proto
const jobConfig: JobConfig = {
  jobName: `${appName}-${runId}`,
  clusterSpec: '{"num_workers": 1}',
  maxRetries: 2,
  timeoutSeconds: 3600,
  tasks: [
    { taskKey: 'generate', taskType: 'generate', dependsOn: [] },
    { taskKey: 'evaluate', taskType: 'evaluate', dependsOn: ['generate'] },
    { taskKey: 'aggregate', taskType: 'aggregate', dependsOn: ['evaluate'] },
  ],
};
```
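Before handing a `JobConfig` like this to `jobs.create(config)` (per the mapping table above), the `dependsOn` references can be sanity-checked locally. A sketch; the real Jobs API may also validate this server-side:

```typescript
// Illustrative subset of the TaskConfig proto.
interface TaskConfig {
  taskKey: string;
  taskType: string;
  dependsOn: string[];
}

// Check that every dependsOn entry names a task declared earlier in the
// list, i.e. the tasks form a valid topological order with no forward or
// dangling references.
function validateTaskOrder(tasks: TaskConfig[]): boolean {
  const seen = new Set<string>();
  for (const t of tasks) {
    if (!t.dependsOn.every((d) => seen.has(d))) return false;
    seen.add(t.taskKey);
  }
  return true;
}
```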