Get rid of the ContentData abstraction #10

@chriscarrollsmith

The reason I opted to use a separate ContentData table to store text rather than a Node with no tag is that I wanted the text content to be keyword-searchable.

But actually, it's still keyword-searchable if it's a Node, and having tagless text nodes is closer to the HTML data model on which I've based our database schema.

There's not a strong reason to migrate to tagless text nodes, except that: 1. it allows a container to have both HTML tag and text children, as the input HTML occasionally might, and 2. it simplifies the data model and speeds up certain queries to have one fewer table.
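To make the tagless-text-node idea concrete, here is a minimal in-memory sketch. The `Node` class, its field names (`sequence`, `tag`, `text`, `children`), and the `keyword_search` helper are all hypothetical and chosen for illustration only; the actual schema in this project may look different.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    # Hypothetical fields; the real nodes table may differ.
    sequence: int                       # position among siblings
    tag: Optional[str] = None           # None marks a tagless text node
    text: Optional[str] = None          # only set on text nodes
    children: List["Node"] = field(default_factory=list)


def keyword_search(root: Node, keyword: str) -> List[Node]:
    """Return text nodes under root whose text contains keyword (case-insensitive)."""
    needle = keyword.lower()
    hits = []
    stack = [root]
    while stack:
        node = stack.pop()
        # Pre-filter: only tagless nodes carry text, so tagged nodes are skipped.
        if node.tag is None and node.text and needle in node.text.lower():
            hits.append(node)
        stack.extend(node.children)
    return hits


# <p>Hello <a>link</a> world</p>: a container with both tag and text children.
paragraph = Node(sequence=0, tag="p", children=[
    Node(sequence=0, text="Hello "),
    Node(sequence=1, tag="a", children=[Node(sequence=0, text="link")]),
    Node(sequence=2, text=" world"),
])
```

Here the `<p>` container holds two text children and one tag child in a single ordered list, which a separate ContentData table cannot express directly.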

On the other hand, keyword search might go a bit slower if we have to pre-filter more records without text content before searching. The current pattern is also more space-efficient, and an HTML element having both tag and text children doesn't happen very often (except in the case of inline style tags, which I'm not treating as nodes anyway).

To migrate from ContentData to text nodes, I would need to:

  1. insert the content data into the nodes table as tagless text nodes,
  2. link embeddings to the inserted nodes rather than to the content data,
  3. bump the sequence for any children that come after the inserted text nodes (this almost certainly won't ever happen during this migration, but it would be good to have safeguards in place for this anyway), and
  4. drop the content data table.
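The four steps above could be sketched roughly as follows, using Python's built-in `sqlite3` module against an in-memory database. Everything about the schema here is an assumption made for illustration: the table names (`nodes`, `content_data`, `embeddings`), the column names, the choice to place the new text node first among its parent's children (sequence 0), and the assumption that each node has at most one content data row. The real database and migration tooling may differ.

```python
import sqlite3

# Hypothetical schema, for illustration only.
SCHEMA = """
CREATE TABLE nodes (
    id INTEGER PRIMARY KEY,
    parent_id INTEGER,
    sequence INTEGER NOT NULL,
    tag TEXT,
    text TEXT
);
CREATE TABLE content_data (
    id INTEGER PRIMARY KEY,
    node_id INTEGER NOT NULL,
    text TEXT NOT NULL
);
CREATE TABLE embeddings (
    id INTEGER PRIMARY KEY,
    content_data_id INTEGER,
    node_id INTEGER
);
"""


def migrate(conn: sqlite3.Connection) -> None:
    """Move every content_data row into nodes as a tagless text node."""
    with conn:  # commit on success, roll back on error
        rows = conn.execute(
            "SELECT id, node_id, text FROM content_data ORDER BY id"
        ).fetchall()
        for content_id, parent_id, text in rows:
            # Step 3 (safeguard): shift existing children to free sequence 0.
            conn.execute(
                "UPDATE nodes SET sequence = sequence + 1 WHERE parent_id = ?",
                (parent_id,),
            )
            # Step 1: insert the text as a tagless child node.
            cur = conn.execute(
                "INSERT INTO nodes (parent_id, sequence, tag, text) "
                "VALUES (?, 0, NULL, ?)",
                (parent_id, text),
            )
            # Step 2: point embeddings at the new node instead.
            conn.execute(
                "UPDATE embeddings SET node_id = ?, content_data_id = NULL "
                "WHERE content_data_id = ?",
                (cur.lastrowid, content_id),
            )
        # Step 4: the old table is no longer referenced.
        conn.execute("DROP TABLE content_data")


# Demo: a <p> node with one existing <a> child and one content_data row.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO nodes VALUES (1, NULL, 0, 'p', NULL)")
conn.execute("INSERT INTO nodes VALUES (2, 1, 0, 'a', NULL)")
conn.execute("INSERT INTO content_data VALUES (1, 1, 'Hello')")
conn.execute("INSERT INTO embeddings VALUES (1, 1, NULL)")
conn.commit()
migrate(conn)
```

Running the whole migration in one transaction means a failure partway through leaves the original tables untouched, which seems like a reasonable safeguard given that the sequence bump rewrites sibling rows.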
