5. Each index's `meta_type` is mapped via `META_TYPE_MAP` to an `IndexType` enum
6. `_register_dri_translators()` discovers `DateRecurringIndex` instances and registers `IPGIndexTranslator` utilities
7. `_ensure_text_indexes()` creates GIN expression indexes for any dynamically discovered `TEXT`-type indexes with `idx_key is not None` (Title, Description, addon ZCTextIndex fields)
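
The `meta_type` lookup in step 5 can be pictured as a plain dictionary lookup. The sketch below is illustrative only: apart from `IndexType.TEXT`, the enum members and the contents of `META_TYPE_MAP` are assumptions rather than the project's actual definitions, and `classify` is a hypothetical helper.

```python
from enum import Enum
from typing import Optional


class IndexType(Enum):
    # Only TEXT is named in this document; the other members are assumed.
    FIELD = "field"
    KEYWORD = "keyword"
    DATE = "date"
    TEXT = "text"


# Assumed mapping from ZCatalog meta_type strings to IndexType members.
META_TYPE_MAP = {
    "FieldIndex": IndexType.FIELD,
    "KeywordIndex": IndexType.KEYWORD,
    "DateIndex": IndexType.DATE,
    "ZCTextIndex": IndexType.TEXT,
}


def classify(meta_type: str) -> Optional[IndexType]:
    """Return the IndexType for a meta_type, or None if it is not mapped."""
    return META_TYPE_MAP.get(meta_type)
```

Returning `None` for an unmapped `meta_type` is a choice made for this sketch; how the real code treats unknown index types is not stated here.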
**TextIndex**: `searchable_text @@ plainto_tsquery('simple', %(text)s)` for SearchableText; other text indexes treated as field match.

**TextIndex (SearchableText)**: `searchable_text @@ plainto_tsquery(pgcatalog_lang_to_regconfig(%(lang)s)::regconfig, %(text)s)` -- language-aware stemming via the per-object `Language` field. Falls back to `'simple'` when no language is set.

**TextIndex (Title/Description/addon)**: `to_tsvector('simple'::regconfig, COALESCE(idx->>'Title', '')) @@ plainto_tsquery('simple'::regconfig, %(text)s)` -- word-level matching on idx JSONB values, backed by GIN expression indexes. Uses `'simple'` config (no stemming) because expression indexes require a fixed regconfig.

## Transactional Writes
@@ -239,12 +242,46 @@ Includes:
- `ALTER TABLE object_state ADD COLUMN IF NOT EXISTS ...` for catalog columns (`path`, `idx`, `searchable_text`)
- `pgcatalog_to_timestamptz()` immutable wrapper for expression indexes
- `pgcatalog_lang_to_regconfig()` maps Plone language codes (ISO 639-1) to PG text search configurations (e.g. `'de'` → `'german'`). Used at both write time (`to_tsvector`) and query time (`plainto_tsquery`). Returns `'simple'` for NULL, empty, or unmapped languages.
- GIN index on `idx` JSONB
- B-tree expression indexes on `idx` JSONB for path queries (`path`, `path_parent`, `path_depth`)
- B-tree expression indexes for common sort/filter fields (modified, created, effective, expires, sortable_title, portal_type, review_state, UID)
- Dynamic GIN expression indexes for addon ZCTextIndex fields (created at startup by `_ensure_text_indexes()`)
- rrule_plpgsql schema and functions (for DateRecurringIndex)
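
The fallback behaviour of `pgcatalog_lang_to_regconfig()` listed above can be mirrored in a few lines of Python. This is a sketch for illustration, not the SQL function itself: the mapping contains only a handful of entries chosen as examples, and `lang_to_regconfig` is a hypothetical name.

```python
from typing import Optional

# Illustrative subset only; the SQL function covers more languages.
# The values are standard PostgreSQL text search configuration names.
LANG_TO_REGCONFIG = {
    "de": "german",
    "en": "english",
    "fr": "french",
    "es": "spanish",
    "it": "italian",
    "nl": "dutch",
}


def lang_to_regconfig(lang: Optional[str]) -> str:
    """Map an ISO 639-1 code to a regconfig name, defaulting to 'simple'."""
    if not lang:
        # Covers both None (SQL NULL) and the empty string.
        return "simple"
    return LANG_TO_REGCONFIG.get(lang, "simple")
```

With this sketch, `lang_to_regconfig("de")` yields `"german"`, while `None`, `""`, and an unmapped code all yield `"simple"`.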
## Full-Text Search
Three tiers of text search, each with different characteristics:
### SearchableText (Language-Aware)
Uses the dedicated `searchable_text` TSVECTOR column with per-object language stemming:
- **Write path**: `to_tsvector(pgcatalog_lang_to_regconfig(idx->>'Language')::regconfig, text)` -- language extracted from the object's `Language` field in idx JSONB
- **Query path**: `searchable_text @@ plainto_tsquery(pgcatalog_lang_to_regconfig(%(lang)s)::regconfig, %(text)s)` -- language from the query's `Language` filter
- **Index**: GIN on `searchable_text` column
- **Stemming**: Yes, for the 30 supported languages (falls back to `'simple'` for unknown/empty)
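
A minimal sketch of how the query path above might be assembled on the Python side, assuming psycopg-style named placeholders. `searchable_text_clause` is a hypothetical helper, not the project's actual function; the clause text is copied from the query path bullet.

```python
from typing import Dict, Optional, Tuple

# Clause text as given in the query path above; both values are bound
# parameters, so user input never reaches the SQL string itself.
SEARCHABLE_TEXT_CLAUSE = (
    "searchable_text @@ plainto_tsquery("
    "pgcatalog_lang_to_regconfig(%(lang)s)::regconfig, %(text)s)"
)


def searchable_text_clause(
    text: str, lang: Optional[str] = None
) -> Tuple[str, Dict[str, Optional[str]]]:
    """Return the WHERE fragment and its parameters.

    A lang of None is sent as SQL NULL, which the mapping function
    resolves to the 'simple' configuration.
    """
    return SEARCHABLE_TEXT_CLAUSE, {"lang": lang, "text": text}
```

The returned pair can be passed to a cursor as `cur.execute("SELECT count(*) FROM object_state WHERE " + clause, params)`.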
### Title / Description (Word-Level)
Uses tsvector expression matching on idx JSONB values:
- **Write path**: Values stored as plain text in `idx->>'Title'` / `idx->>'Description'`
- **Index**: GIN expression indexes (pre-created in DDL)
- **Stemming**: No (`'simple'` config) -- expression indexes require a fixed regconfig. Language-aware stemmed search for titles is available via SearchableText (which includes title text).
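
For comparison with SearchableText, the word-level match for a field stored in idx JSONB could be generated as sketched below. The expression follows the `Title` form documented under SQL generation; `text_field_clause` and its key validation are assumptions made for this example.

```python
import re
from typing import Dict, Tuple

# The key is interpolated into the SQL text, so the sketch restricts it
# to identifier-like names (an assumed rule, not a documented one).
_KEY_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def text_field_clause(idx_key: str, text: str) -> Tuple[str, Dict[str, str]]:
    """Build a 'simple'-config tsvector match against idx->>'<idx_key>'."""
    if not _KEY_RE.fullmatch(idx_key):
        raise ValueError(f"unsafe idx key: {idx_key!r}")
    sql = (
        f"to_tsvector('simple'::regconfig, COALESCE(idx->>'{idx_key}', '')) "
        "@@ plainto_tsquery('simple'::regconfig, %(text)s)"
    )
    return sql, {"text": text}
```

The search text stays a bound parameter, and the left-hand side is a fixed expression over `idx`, which is what lets a GIN expression index back the query.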
### Addon ZCTextIndex Fields
Any addon that registers a ZCTextIndex in ZCatalog (via `catalog.xml`) is automatically supported:
1. `sync_from_catalog()` discovers the index → registered as `(IndexType.TEXT, idx_key, source_attrs)`
2. `_ensure_text_indexes()` creates a GIN expression index at startup: `to_tsvector('simple', COALESCE(idx->>'{idx_key}', ''))`
3. Value extracted into idx JSONB during indexing (idx_key is not None)
4. `_handle_text()` generates tsvector expression matching -- zero addon code needed
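
Step 2 amounts to emitting one `CREATE INDEX` statement per discovered key. The following is a rough sketch of what such a statement could look like. The index expression and the `object_state` table come from this document, while the function name `text_index_ddl`, the index naming scheme, and the key validation are hypothetical.

```python
import re

# Assumed rule: only identifier-like keys are interpolated into DDL.
_KEY_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def text_index_ddl(idx_key: str) -> str:
    """Return DDL for a GIN expression index on idx->>'<idx_key>'."""
    if not _KEY_RE.fullmatch(idx_key):
        raise ValueError(f"unsafe idx key: {idx_key!r}")
    # Hypothetical naming scheme; the real index names may differ.
    index_name = f"idx_os_text_{idx_key.lower()}"
    return (
        f"CREATE INDEX IF NOT EXISTS {index_name} ON object_state "
        f"USING gin (to_tsvector('simple', COALESCE(idx->>'{idx_key}', '')))"
    )
```

`IF NOT EXISTS` keeps the statement safe to run on every startup, which fits an index that is created dynamically rather than in the static DDL.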
## Query Optimizations
1. **orjson**: Registered as psycopg's JSONB deserializer for faster JSON parsing