Implementation of on-demand denormalized projections #4937
mnonnenmacher
started this conversation in
Ideas
Replies: 1 comment 2 replies
-
|
When you have time, an example of the usage would be very helpful. I myself have noticed that for example, querying curated purls is incredibly complicated for the current normalized db. That could be one candidate of such an example. |
Beta Was this translation helpful? Give feedback.
2 replies
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
-
Querying complex data across our current normalized schema often requires very complex queries, tinkering with the schema and has become a performance bottleneck and slows down the development of new features for the UI. This proposal introduces a Request-Triggered Projection pattern to build specialized read models on demand.
This concrete proposal addresses similar issues as #1915.
Use Cases
The architecture described below was chosen to address several conflicting requirements:
1. Infrastructure Components
projection_registry):id: Primary Key (e.g.,run_id+projection_type)status:PENDING,BUILDING,READY,ERRORschema_version: Current version of the projection logic.created_at,last_accessed_at: To facilitate TTL/Cleanup.projected_vulnerabilities) using standard columns for filtered/sorted fields and aJSONBcolumns for additional payload.projection-builder-queueto decouple the API from the heavy lifting.2. The Request Flow (API)
schema_version.READY: Return data immediately (200 OK). Updatelast_accessed_at.BUILDING/PENDING: Return 202 Accepted with aRetry-After: 5header.MISSING/STALE: UseINSERT ... ON CONFLICT DO NOTHING. If the row is successfully inserted/claimed, dispatch a build message to RabbitMQ. Return 202 Accepted.3. The Builder Flow (Worker)
BUILDING.INSERTresults into the projection table.READYand setschema_version.ERRORfor visibility and dead-letter the message.4. Lifecycle & Maintenance
last_accessed_at < NOW() - INTERVAL '30 days'.SCHEMA_VERSIONconstant in the code will automatically invalidate existing projections, triggering a rebuild on the next user request.5. Edge Cases
ERRORstate by allowing a "Retry" which clears the registry entry.READYonce all data is fully committed.Beta Was this translation helpful? Give feedback.
All reactions