Improve Lakebase skill: synced tables, connectivity, and best-practice alignment #56
Draft
04635d9 to 67bc5f1
- Add Pattern 4 (Python Databricks App with platform-injected env vars) to connectivity.md
- Clarify connection patterns are Python-specific; JS/AppKit apps get full auto-injection
- Remove duplicate Data API / Reverse ETL pointers already covered in reference docs block
- Fix missing blank lines before section headers in SKILL.md
Co-authored-by: Isaac
- Rename reverse-etl.md to synced-tables.md (official Databricks terminology)
- Replace all `databricks database` commands with `databricks postgres` (old Provisioned API → new Autoscaling API)
- Add `new_pipeline_spec` as required field with warning: storage_catalog must be a regular UC catalog, not the Lakebase catalog (DLT can't create event logs in Postgres-backed schemas)
- Add Prerequisites section for `create-catalog`
- Add complete field reference table
- Add NYC taxi example (end-to-end lifecycle)
- Add synced tables section to apps lakebase.md (read-only pattern, permission grants, differences from CRUD tables)
- All commands verified against live Lakebase project
Co-authored-by: Isaac
- Fix broken relative path in appkit/lakebase.md (../../ → ../../../)
- Trim duplicate autoscaling section in SKILL.md (defer to computes-and-scaling.md)
- Restore Data API and Synced Tables links in Other Workflows section
- Add "Previously known as Reverse ETL" to synced-tables.md for discoverability
Co-authored-by: Isaac
…sync troubleshooting
- Expand SKILL.md description with trigger words (synced tables, reverse ETL, Data API, scale-to-zero, OAuth, connection pooling) for better skill activation
- Switch to 3rd person voice per create-skill best practices
- Replace feature status table with bulleted Capabilities section
- Add 3 synced-table troubleshooting entries (storage_catalog, CDF, permissions)
- Update manifest description to match
Co-authored-by: Isaac
- Add DNS resolution (macOS) workaround to connectivity.md (was referenced from SKILL.md but missing)
- Rewrite synced table read example as tRPC route in appkit/lakebase.md (was Express-style, contradicting tRPC guidance)
- Deduplicate create-catalog command in synced-tables.md NYC taxi example
- Remove redundant storage_catalog warning callout (already in field table)
- Clarify that delete-synced-table leaves the Postgres table behind
Co-authored-by: Isaac
Add use-case guidance to help AI assistants discover synced tables for the right patterns: operational consoles, user-facing apps on analytical data, online feature serving, hybrid read/write, and Postgres-specific capabilities. Include anti-patterns (OLAP, writes, huge tables, FGAC).
- Broaden Lakebase row in decision table to mention synced data
- Add callout below table pointing to synced tables section
- Add architecture pattern, use cases, and anti-patterns
Co-authored-by: Isaac
Add a bullet for "Read lakehouse data with low latency" that points agents to the Lakebase guide when apps need fast point lookups. Add a guidance note telling agents to ask about latency requirements rather than defaulting to DBSQL for all read-from-UC use cases.
Co-authored-by: Isaac
Agent testing showed the "When to Use What" section routes to analytics by default, even when the user explicitly asks for "fast" or "instant" reads. The synced tables bullet was buried after three analytics bullets that matched first.
- Add decision gate at top of section: if user mentions speed/latency, present both analytics and synced tables options before proceeding
- Rewrite guidance note to reference the gate instead of defaulting
- Remove redundant standalone synced tables bullet (covered by gate)
Co-authored-by: Isaac
…idance
From agent testing:
- Add psql connection workflow (generate-credential + psql) for running GRANT, CREATE INDEX, etc. against Lakebase
- Add post-deploy GRANT workflow for app SP access to synced tables
- Add source table guidance: don't run ad-hoc CREATE TABLE AS SELECT
- Add DAB caveat: synced_database_tables maps to Provisioned API, not Autoscaling. DAB support for postgres_synced_tables not yet available.
- Add creation + GRANT note to apps lakebase.md
Co-authored-by: Isaac
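For reviewers, the post-deploy GRANT workflow mentioned above can be sketched as a small statement generator. This is an illustration only, not the SQL that ships in SKILL.md: the helper names `quote_ident` and `read_only_grants`, the schema name, and the role name are all hypothetical, and the choice of read-only privileges is an assumption based on synced tables being described as a read-only pattern.

```python
def quote_ident(name: str) -> str:
    """Quote a Postgres identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def read_only_grants(schema: str, role: str) -> list[str]:
    """Build read-only GRANT statements for one schema and one role.

    Assumption: the app's service principal only needs to read synced tables.
    """
    s, r = quote_ident(schema), quote_ident(role)
    return [
        f"GRANT USAGE ON SCHEMA {s} TO {r};",
        f"GRANT SELECT ON ALL TABLES IN SCHEMA {s} TO {r};",
        # Without FOR ROLE, default privileges cover only tables that the
        # role running this statement creates later in the schema.
        f"ALTER DEFAULT PRIVILEGES IN SCHEMA {s} GRANT SELECT ON TABLES TO {r};",
    ]


if __name__ == "__main__":
    # Hypothetical schema and role names, for illustration only.
    for stmt in read_only_grants("public", "my-app-service-principal"):
        print(stmt)
```

The printed statements could then be run through psql by whoever owns the schema.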
- Fix psql sslmode syntax (connection string instead of broken positional arg)
- Add scriptable psql workflow for agent automation
- Add endpoint path note for generate-database-credential
- Align GRANT SQL between SKILL.md and apps/lakebase.md (add ALTER DEFAULT PRIVILEGES)
- Surface DABs caveat in SKILL.md troubleshooting table
- Broaden latency decision gate triggers in apps SKILL.md
- Add "sync gold tables, not raw" as top best practice
- Clarify 8 TB storage limit scope (per branch)
Co-authored-by: Isaac
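To make the sslmode fix concrete: libpq accepts `sslmode` as a query parameter of a connection URI, which avoids passing it as a stray positional argument. The sketch below builds such a URI with the standard library only. The function name `build_psql_uri`, the host, database, and user values are hypothetical, and it assumes the generated credential is supplied as the password.

```python
from urllib.parse import quote, urlencode


def build_psql_uri(host: str, database: str, user: str, password: str,
                   port: int = 5432, sslmode: str = "require") -> str:
    """Return a libpq connection URI with sslmode set as a query parameter."""
    # Percent-encode user and password so characters such as '@' or '/'
    # cannot be mistaken for URI delimiters.
    userinfo = f"{quote(user, safe='')}:{quote(password, safe='')}"
    query = urlencode({"sslmode": sslmode})
    return f"postgresql://{userinfo}@{host}:{port}/{quote(database, safe='')}?{query}"


if __name__ == "__main__":
    # Placeholder values; a real token would come from the credential step.
    print(build_psql_uri("example-host", "appdb", "me@example.com", "tok/en"))
```

The resulting string can be passed to psql as a single quoted argument, for example `psql "<uri>" -c "SELECT 1"`.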
cfe90bc to 63dee0b
Summary
- Replace `reverse-etl.md` with comprehensive `synced-tables.md`: sync modes, verified CLI syntax, prerequisites, data type mapping, capacity planning, DABs caveat, and a worked example