Feature: retention time marker procedure #5518

Open
@dantengsky

Summary

Provides a way to mark historical snapshots as invisible, so that old snapshots (and possibly the data they reference) can be phased out gradually.


Basic description of the functionality:

Marks the latest visible snapshot of the given table.

  • A system configuration, say table_retention_time: Duration
  • A system procedure, which marks the latest visible snapshot of a table
    • by inserting or updating a designated key in the KV service
    • time travel over table data will respect this mark

NOTE: The query nodes work on their local clocks, which are NOT perfectly synchronized.


Basic idea of the implementation:

  • provide a system procedure, say
    call system$retention_mark([database_name,] table_name)
    • fetch the metadata of the specified table
    • check whether the key LATEST_VISBLE_SNAPHOST of the given table exists
      LATEST_VISBLE_SNAPHOST/<tid> -> timestamp
      • if it exists and its value is less than (now() + table_retention_time),
        try to update it to (now() + table_retention_time)
      • if it does not exist,
        try to insert the KV pair
    • the mutations should, of course, be executed in a KV transaction
      • the most important invariant of this operation:
        the value of LATEST_VISBLE_SNAPHOST/<tid> must only ever increase
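
The steps above could be sketched roughly as follows. This is only an illustration, not the proposed implementation: InMemoryKV, compare_and_swap and retention_mark are hypothetical names, the KV service is replaced by a versioned in-memory dict, and the target value follows the expression (now() + table_retention_time) exactly as written above.

```python
import threading
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional, Tuple

KEY_PREFIX = "LATEST_VISBLE_SNAPHOST"  # key name spelled as in the issue text


class InMemoryKV:
    """Hypothetical stand-in for the KV service: versioned values plus compare-and-swap."""

    def __init__(self) -> None:
        self._data: Dict[str, Tuple[int, float]] = {}  # key -> (version, value)
        self._lock = threading.Lock()

    def get(self, key: str) -> Optional[Tuple[int, float]]:
        with self._lock:
            return self._data.get(key)

    def compare_and_swap(self, key: str, expected_version: int, value: float) -> bool:
        """Write only if the key's version still equals expected_version (0 means absent)."""
        with self._lock:
            current = self._data.get(key)
            current_version = current[0] if current else 0
            if current_version != expected_version:
                return False  # someone else wrote first; the caller should re-read
            self._data[key] = (current_version + 1, value)
            return True


def retention_mark(
    kv: InMemoryKV,
    table_id: int,
    retention: timedelta,
    now: Optional[datetime] = None,
    max_retries: int = 8,
) -> float:
    """Insert or raise the mark for table_id; the stored value is never decreased."""
    if now is None:
        now = datetime.now(timezone.utc)  # local clock of this node, may be skewed
    target = (now + retention).timestamp()
    key = f"{KEY_PREFIX}/{table_id}"
    for _ in range(max_retries):
        existing = kv.get(key)
        if existing is None:
            # Key absent: try to insert it.
            if kv.compare_and_swap(key, 0, target):
                return target
        else:
            version, value = existing
            if value >= target:
                return value  # invariant: only ever increase, so nothing to do
            if kv.compare_and_swap(key, version, target):
                return target
    raise RuntimeError(f"could not update {key} after {max_retries} attempts")
```

A marker whose clock lags behind simply leaves the stored value untouched, which is what keeps the value monotonic across nodes with unsynchronized clocks.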

Notes:

  • if database_name is not provided, use the context's current database name
  • A "hasty" marker, i.e. a node whose clock is far ahead of real time, may set LATEST_VISBLE_SNAPHOST "incorrectly"
    • We have to live with it, hoping the skew is not too crazy : )

      e.g. if the clock is two months ahead, the history of the table may not be accessible for the next two months.

      To mitigate this situation:
      The value of LATEST_VISBLE_SNAPHOST/<tid> could instead be set to the timestamp of a snapshot, found by navigating to the snapshot S at (now() + table_retention_time).
      Snapshots generated after S would then be accessible once the clocks are back to normal.

    • The "current snapshot" referenced by the KV meta is always visible
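
A minimal sketch of the mitigation described above, under assumptions that go beyond the issue text: resolve_mark and is_visible are hypothetical helpers, snapshot timestamps are given as an ascending list of floats, "the snapshot S at time T" is read as the newest snapshot taken at or before T, and a snapshot counts as visible when its timestamp is not earlier than the mark (so S itself stays visible).

```python
from bisect import bisect_right
from typing import List, Optional


def resolve_mark(snapshot_timestamps: List[float], target: float) -> Optional[float]:
    """Return the timestamp of the newest snapshot at or before target.

    snapshot_timestamps must be sorted in ascending order.
    Returns None when every snapshot is newer than target.
    """
    idx = bisect_right(snapshot_timestamps, target)
    return snapshot_timestamps[idx - 1] if idx > 0 else None


def is_visible(snapshot_ts: float, mark: Optional[float], current_ts: float) -> bool:
    """Assumed time travel check: the current snapshot is always visible,
    other snapshots are visible only if they are not older than the mark."""
    if snapshot_ts == current_ts:
        return True
    return mark is None or snapshot_ts >= mark


# A marker with a badly skewed clock asks for a target far in the future.
snapshots = [100.0, 200.0, 300.0]
skewed_target = 10_000.0
mark = resolve_mark(snapshots, skewed_target)  # clamps to the newest real snapshot
```

Because the stored value is the timestamp of an existing snapshot rather than the skewed target, a snapshot written later (say at 400.0) is not hidden by the mark, whereas the raw target of 10_000.0 would hide it until it stops being the current snapshot's successor in time.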
