Commit 3c9b598

Merge remote-tracking branch 'origin/main' into contrib/alanshurafa/brain-backup
2 parents d265095 + f15cff6 commit 3c9b598

3 files changed

Lines changed: 90 additions & 22 deletions

schemas/enhanced-thoughts/README.md

Lines changed: 32 additions & 9 deletions
@@ -1,14 +1,23 @@
  # Enhanced Thoughts Columns and Utility RPCs

+ <div align="center">
+
+ ![Community Contribution](https://img.shields.io/badge/OB1_COMMUNITY-Approved_Contribution-2ea44f?style=for-the-badge&logo=github)
+
+ **Created by [@alanshurafa](https://github.com/alanshurafa)**
+
+ </div>
+
  > Adds structured columns and utility functions to the Open Brain thoughts table for richer classification, full-text search, statistics, and connection discovery.

  ## What It Does

- This schema extension adds six new columns to the `thoughts` table (`type`, `sensitivity_tier`, `importance`, `quality_score`, `source_type`, `enriched`) so thoughts can be classified, filtered, and ranked without parsing the metadata JSONB every time. It also upgrades `upsert_thought` so metadata-backed writes keep those structured columns in sync. It installs three utility RPC functions:
+ This schema extension adds six new columns to the `thoughts` table (`type`, `sensitivity_tier`, `importance`, `quality_score`, `source_type`, `enriched`) so thoughts can be classified, filtered, and ranked without parsing the metadata JSONB every time. It also installs four RPC functions:

  - **`search_thoughts_text`** -- Full-text search with boolean operators, ILIKE fallback, pagination, and result counts.
  - **`brain_stats_aggregate`** -- Returns total thought count, top types, and top topics as a single JSONB payload.
  - **`get_thought_connections`** -- Finds thoughts that share metadata topics or people with a given thought.
+ - **`backfill_thought_types(p_allowed_types TEXT[])`** -- Populates the new top-level `type` column from `metadata->>'type'`. The default allowlist covers the canonical eight values (`idea`, `task`, `person_note`, `reference`, `decision`, `lesson`, `meeting`, `journal`). Pass a custom array to accept additional values, or pass `NULL` to backfill whatever `metadata->>'type'` contains.

  ## Prerequisites

@@ -36,19 +45,33 @@ SUPABASE (from your Open Brain setup)
  2. Create a new query and paste the full contents of `schema.sql`
  3. Click **Run** to execute the migration
  4. Open **Table Editor** and select the `thoughts` table to confirm the new columns appear: `type`, `sensitivity_tier`, `importance`, `quality_score`, `source_type`, `enriched`
- 5. Navigate to **Database > Functions** and verify three new functions exist: `search_thoughts_text`, `brain_stats_aggregate`, `get_thought_connections`
- 6. Verify `upsert_thought` still exists. The enhanced version mirrors `metadata.type`, `metadata.source`, `metadata.importance`, `metadata.quality_score`, `metadata.sensitivity_tier`, and task/idea status into top-level columns.
- 7. If you have existing thoughts with `type` or `source` values stored in the metadata JSONB, the backfill statements at the bottom of the script will have populated the new columns automatically
+ 5. Navigate to **Database > Functions** and verify the new functions exist: `search_thoughts_text`, `brain_stats_aggregate`, `get_thought_connections`, `backfill_thought_types`
+ 6. If you have existing thoughts with `type` or `source` values stored in the metadata JSONB, the script automatically calls `backfill_thought_types()` with the default canonical allowlist. If your brain uses non-canonical `type` values, re-run `SELECT backfill_thought_types(ARRAY['your','custom','types']);` or `SELECT backfill_thought_types(NULL);` to accept any value
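
Before choosing between a custom allowlist and `NULL` in step 6, it may help to see which `metadata->>'type'` values are still uncopied. The following is a minimal sketch, assuming the migration has already run and the table lives in the `public` schema. The extra values `'article'` and `'quote'` are placeholders for illustration, not part of the canonical vocabulary.

```sql
-- List metadata type values that have not yet been copied into the
-- top-level `type` column, most frequent first.
SELECT metadata->>'type' AS metadata_type,
       count(*)          AS pending_rows
FROM public.thoughts
WHERE type IS NULL
  AND metadata->>'type' IS NOT NULL
GROUP BY metadata->>'type'
ORDER BY pending_rows DESC;

-- Re-run the backfill with the canonical values plus any extras you
-- decide to keep. The function returns the number of rows it updated.
SELECT backfill_thought_types(ARRAY[
  'idea','task','person_note','reference',
  'decision','lesson','meeting','journal',
  'article','quote'   -- placeholder values; replace with your own
]) AS rows_updated;
```

Because the function only touches rows where `type IS NULL`, re-running it with a wider allowlist leaves already-populated rows unchanged.
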

  ## Expected Outcome

  After running the migration:

- - The `thoughts` table has six new columns with dashboard-friendly defaults.
+ - The `thoughts` table has six new columns with sensible defaults:
+   - `sensitivity_tier TEXT DEFAULT 'standard'` (canonical values: `'standard'`, `'personal'`, `'restricted'`)
+   - `importance SMALLINT DEFAULT 3` (scale: 1-5, where 3 is the default)
+   - `quality_score NUMERIC(5,2) DEFAULT 50` (scale: 0-100, where 50 is the default)
+   - `enriched BOOLEAN DEFAULT false`
+   - `type TEXT` (nullable; populated by backfill or writers)
+   - `source_type TEXT` (nullable; populated by backfill or writers)
  - New indexes on `type`, `importance`, `source_type`, and a GIN tsvector index on `content` for fast full-text search.
- - Three new RPC functions callable via the Supabase client or REST API.
- - `upsert_thought` remains the canonical write path, but now keeps structured dashboard columns synchronized with metadata payloads.
- - Any existing thoughts with `type` or `source` in their metadata JSONB will have those values copied into the new top-level columns.
+ - Four new RPC functions callable via the Supabase client or REST API (`search_thoughts_text`, `brain_stats_aggregate`, `get_thought_connections`, `backfill_thought_types`).
+ - Any existing thoughts with `type` or `source` in their metadata JSONB will have those values copied into the new top-level columns (via `backfill_thought_types()` for `type` with the canonical allowlist, plus an inline `UPDATE` for `source_type`).
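
The outcome listed above can also be checked from the SQL editor rather than the dashboard. This is a sketch using the standard `information_schema.columns` and `pg_indexes` catalog views, assuming the table is `public.thoughts`. Index names are not given in this diff, so the second query simply lists whatever indexes exist on the table.

```sql
-- Confirm the six new columns, their types, and their defaults.
SELECT column_name, data_type, column_default, is_nullable
FROM information_schema.columns
WHERE table_schema = 'public'
  AND table_name   = 'thoughts'
  AND column_name IN ('type', 'sensitivity_tier', 'importance',
                      'quality_score', 'source_type', 'enriched')
ORDER BY column_name;

-- List the indexes on the table, including the GIN tsvector index.
SELECT indexname, indexdef
FROM pg_indexes
WHERE schemaname = 'public'
  AND tablename  = 'thoughts'
ORDER BY indexname;
```
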
+
+ ## Security
+
+ This schema follows stock Open Brain's "service_role only" posture:
+
+ - `brain_stats_aggregate` and `get_thought_connections` are `SECURITY DEFINER` with `SET search_path = public` (defense in depth against search-path hijacks). They can read the full `thoughts` table regardless of RLS.
+ - `search_thoughts_text` is `SECURITY INVOKER` and respects RLS.
+ - **None of the three RPCs are granted to `anon`.** Execute privilege is limited to `authenticated` and `service_role`. The publishable anon key cannot call them.
+
+ If you want to expose any of these to `anon` (for example, a public-read dashboard), add your own `GRANT EXECUTE ... TO anon;` in a follow-up migration and confirm that `p_exclude_restricted := true` (the default) plus your sensitivity-tier hygiene gives you the exposure surface you actually want. This is an explicit opt-in: the default stance is private.
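
One way to confirm the effective privileges after the migration is PostgreSQL's built-in `has_function_privilege`. This is a sketch that assumes the `anon` role exists (as it does in a standard Supabase project) and that the functions live in the `public` schema. The argument lists are taken from the `GRANT` statements in `schema.sql`.

```sql
-- Each column is true when `anon` holds EXECUTE on that function,
-- whether the privilege was granted directly or inherited via PUBLIC.
SELECT
  has_function_privilege('anon',
    'public.search_thoughts_text(text, integer, jsonb, integer)',
    'EXECUTE') AS anon_search,
  has_function_privilege('anon',
    'public.brain_stats_aggregate(integer, boolean)',
    'EXECUTE') AS anon_stats,
  has_function_privilege('anon',
    'public.get_thought_connections(uuid, integer, boolean)',
    'EXECUTE') AS anon_connections,
  has_function_privilege('anon',
    'public.backfill_thought_types(text[])',
    'EXECUTE') AS anon_backfill;

-- Explicit opt-in, intended for a follow-up migration only.
-- Left commented out so pasting this block does not widen access:
-- GRANT EXECUTE ON FUNCTION brain_stats_aggregate(INTEGER, BOOLEAN) TO anon;
```

If any column comes back true when you expected false, review the project's default function privileges before relying on the private-by-default stance.
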

  ## Troubleshooting

@@ -59,4 +82,4 @@ Solution: These are safe to ignore. The `ADD COLUMN IF NOT EXISTS` syntax preven
  Solution: Confirm your thoughts have content populated. Try a simple query first (single word, no operators). If using boolean operators, ensure the syntax matches websearch format ("quoted phrases", word AND word, -excluded).

  **Issue: brain_stats_aggregate returns empty types or topics**
- Solution: The function filters by `created_at`. Pass `p_since_days := 0` for all-time stats. Also confirm that your thoughts have the `type` column populated (run the backfill UPDATE if needed).
+ Solution: The function filters by `created_at`. Pass `p_since_days := 0` for all-time stats. Also confirm that your thoughts have the `type` column populated. If you use non-canonical type values in `metadata->>'type'` (anything outside `idea`, `task`, `person_note`, `reference`, `decision`, `lesson`, `meeting`, `journal`), call the backfill RPC with your own allowlist, e.g. `SELECT backfill_thought_types(ARRAY['idea','task','article','quote']);`, or `SELECT backfill_thought_types(NULL);` to accept whatever is present.
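
The two checks in this troubleshooting entry can be run together. This is a sketch that assumes the table is `public.thoughts` and that the second argument of `brain_stats_aggregate` has a default, as the Security section's mention of `p_exclude_restricted := true` suggests; if it does not, pass it explicitly.

```sql
-- How many rows still lack a top-level `type`?
SELECT count(*) FILTER (WHERE type IS NULL) AS missing_type,
       count(*)                             AS total_rows
FROM public.thoughts;

-- All-time stats: p_since_days := 0 requests stats without the
-- created_at window, per the troubleshooting note above.
SELECT brain_stats_aggregate(p_since_days := 0) AS stats;
```

A large `missing_type` count relative to `total_rows` points at the backfill rather than the date window as the cause of empty type statistics.
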

schemas/enhanced-thoughts/metadata.json

Lines changed: 1 addition & 1 deletion
@@ -14,5 +14,5 @@
  "difficulty": "beginner",
  "estimated_time": "15 minutes",
  "created": "2026-04-06",
- "updated": "2026-04-06"
+ "updated": "2026-04-17"
  }

schemas/enhanced-thoughts/schema.sql

Lines changed: 57 additions & 12 deletions
@@ -55,7 +55,7 @@ RETURNS TABLE (
  total_count BIGINT
  )
  LANGUAGE plpgsql
- VOLATILE
+ STABLE
  SET statement_timeout = '25s'
  AS $$
  BEGIN
@@ -84,7 +84,7 @@ BEGIN
  AND (SELECT count(*) FROM tsvector_hits) < (p_limit + p_offset)
  AND t.content ILIKE '%' || q.raw_query || '%'
  AND t.metadata @> coalesce(p_filter, '{}'::jsonb)
- AND t.id NOT IN (SELECT th.hit_id FROM tsvector_hits th)
+ AND NOT EXISTS (SELECT 1 FROM tsvector_hits th WHERE th.hit_id = t.id)
  LIMIT 500
  ),
  all_hits AS (
@@ -118,8 +118,10 @@ BEGIN
  ELSE 0
  END
  )
- + (coalesce(t.importance, 5) / 20.0)::real
- + (coalesce(t.quality_score, 0.50) / 500.0)::real
+ -- importance is 1..5; max bonus 5/20 = 0.25
+ + (coalesce(t.importance, 3) / 20.0)::real
+ -- quality_score is 0..100; max bonus 100/500 = 0.20
+ + (coalesce(t.quality_score, 50) / 500.0)::real
  )::real AS rank
  FROM public.thoughts t
  CROSS JOIN query_input q
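
The effect of the new fallbacks in this hunk can be worked through by hand. With the old fallbacks, a row with `NULL` values received 5/20 = 0.25 for importance (the maximum) and 0.50/500 = 0.001 for quality (a scale mismatch). With the new ones it receives 3/20 + 50/500 = 0.15 + 0.10 = 0.25 in total. The standalone query below is only an illustrative sketch of that arithmetic; the `label` column and the sample rows are hypothetical and not part of `schema.sql`.

```sql
-- Rank bonus contributed by importance (1..5) and quality_score (0..100).
-- Totals: defaults 0.15 + 0.10 = 0.25, maximum 0.25 + 0.20 = 0.45,
-- minimum 0.05 + 0.00 = 0.05.
SELECT label,
       importance / 20.0                         AS importance_bonus,
       quality_score / 500.0                     AS quality_bonus,
       importance / 20.0 + quality_score / 500.0 AS total_bonus
FROM (VALUES
  ('defaults (3, 50)', 3, 50),
  ('maximum (5, 100)', 5, 100),
  ('minimum (1, 0)',   1, 0)
) AS v(label, importance, quality_score);
```
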
@@ -138,8 +140,12 @@ BEGIN
  END;
  $$;

+ -- Do NOT grant to `anon`. Stock Open Brain keeps `thoughts` behind RLS
+ -- (service_role only). Broadening execution to the publishable anon key
+ -- would expose the entire brain to anyone who knows the project URL.
+ -- See README "Security" section.
  GRANT EXECUTE ON FUNCTION search_thoughts_text(TEXT, INTEGER, JSONB, INTEGER)
- TO authenticated, anon, service_role;
+ TO authenticated, service_role;

  -- ============================================================
  -- 3. BRAIN STATS AGGREGATE RPC
@@ -195,8 +201,10 @@ BEGIN
  END;
  $$;

+ -- Do NOT grant to `anon`. This RPC is SECURITY DEFINER and would bypass
+ -- RLS on the thoughts table. See README "Security" section.
  GRANT EXECUTE ON FUNCTION brain_stats_aggregate(INTEGER, BOOLEAN)
- TO authenticated, anon, service_role;
+ TO authenticated, service_role;

  -- ============================================================
  -- 4. THOUGHT CONNECTIONS RPC
@@ -220,6 +228,7 @@ RETURNS TABLE (
  overlap_count INT
  )
  LANGUAGE plpgsql
+ STABLE
  SECURITY DEFINER
  SET search_path = public
  AS $$
@@ -266,7 +275,7 @@ BEGIN
  ) AS shared_people
  FROM thoughts bt
  WHERE bt.id != p_thought_id
- AND (NOT p_exclude_restricted OR bt.sensitivity_tier != 'restricted')
+ AND (NOT p_exclude_restricted OR bt.sensitivity_tier IS DISTINCT FROM 'restricted')
  AND (
  EXISTS (
  SELECT 1 FROM jsonb_array_elements_text(bt.metadata->'topics') val
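
The switch from `!=` to `IS DISTINCT FROM` in this hunk changes how rows with a `NULL` `sensitivity_tier` are treated. Under SQL's three-valued logic, `NULL != 'restricted'` evaluates to `NULL`, so the old predicate silently dropped such rows when `p_exclude_restricted` was true. The standalone sketch below reproduces both predicates with `p_exclude_restricted` fixed to `true`; the inline `VALUES` list stands in for the table.

```sql
-- Compare the old and new filter for three sample tiers.
--   'standard'   -> old: true,  new: true
--   'restricted' -> old: false, new: false
--   NULL         -> old: NULL (row filtered out), new: true (row kept)
SELECT tier,
       (NOT true OR tier != 'restricted')               AS old_predicate,
       (NOT true OR tier IS DISTINCT FROM 'restricted') AS new_predicate
FROM (VALUES ('standard'), ('restricted'), (NULL)) AS v(tier);
```
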
@@ -288,19 +297,55 @@ BEGIN
  END;
  $$;

+ -- Do NOT grant to `anon`. This RPC is SECURITY DEFINER and exposes
+ -- a 200-char content preview plus metadata for any thought by UUID;
+ -- granting to anon would let anyone with the project URL pull content.
+ -- See README "Security" section.
  GRANT EXECUTE ON FUNCTION get_thought_connections(UUID, INT, BOOLEAN)
- TO authenticated, anon, service_role;
+ TO authenticated, service_role;

  -- ============================================================
  -- 5. BACKFILL EXISTING DATA
  -- Populates new columns from metadata for rows that already
  -- exist. Safe to run multiple times (WHERE ... IS NULL guard).
  -- ============================================================

- -- Backfill type from metadata
- UPDATE thoughts SET type = metadata->>'type'
- WHERE type IS NULL AND metadata->>'type' IS NOT NULL
- AND metadata->>'type' IN ('idea','task','person_note','reference','decision','lesson','meeting','journal');
+ -- Backfill `type` from metadata. Wrapped in an RPC so callers can
+ -- override the allowlist. Default allowlist matches the canonical
+ -- Open Brain type vocabulary; pass NULL to accept any string value
+ -- present in metadata->>'type'.
+ CREATE OR REPLACE FUNCTION backfill_thought_types(
+   p_allowed_types TEXT[] DEFAULT ARRAY[
+     'idea','task','person_note','reference',
+     'decision','lesson','meeting','journal'
+   ]
+ )
+ RETURNS BIGINT
+ LANGUAGE plpgsql
+ VOLATILE
+ SET search_path = public
+ AS $$
+ DECLARE
+   v_updated BIGINT;
+ BEGIN
+   UPDATE public.thoughts
+   SET type = metadata->>'type'
+   WHERE type IS NULL
+     AND metadata->>'type' IS NOT NULL
+     AND (p_allowed_types IS NULL OR metadata->>'type' = ANY(p_allowed_types));
+
+   GET DIAGNOSTICS v_updated = ROW_COUNT;
+   RETURN v_updated;
+ END;
+ $$;
+
+ -- Do NOT grant to `anon`. This RPC writes to the thoughts table.
+ GRANT EXECUTE ON FUNCTION backfill_thought_types(TEXT[])
+ TO authenticated, service_role;
+
+ -- Run the backfill with the default allowlist so the paste-and-run
+ -- flow still auto-populates `type` for canonical values.
+ SELECT backfill_thought_types();

  -- Backfill source_type from metadata
  UPDATE thoughts SET source_type = metadata->>'source'
