
Amend RFC 1000: Revision ids should be hashes of the revision contents #2

@tailhook


Motivation

Per the spec, each revision is a file in dbschema/migrations/*. These files are stored in revision control, so they can technically be edited after being registered in the database server. Even if we explicitly document that editing files is not supported, users can still make mistakes when resolving merge conflicts. Without hashes, an erroneous migration is hard to fix, because it's unclear which content is the source of truth once two branches have been merged: the id is opaque, and any revision could be the right one depending on which branch was applied first, and that order might differ across staging servers or even canary deployments.

Assumptions

It should be possible to hash revision contents based solely on tokenization of the file without understanding the semantics of the data.

Alternatively, if some statement forms are ambiguous, it might be possible to build a full or simplified AST (parse tree) of the revision file and hash that instead.
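
To make the token-based assumption concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the token grammar in `TOKEN_RE`, the function names `tokenize` and `revision_id`, the choice of SHA-256, and the DDL-like sample text are hypothetical and not part of RFC 1000. The point is only that whitespace and comments are dropped before hashing, so reformatting a file does not change its id while any change to a token does.

```python
import hashlib
import re

# Hypothetical token grammar, for illustration only. The real revision
# file syntax is richer; this just separates ignorable text from tokens.
TOKEN_RE = re.compile(
    r"(?P<skip>\s+|#[^\n]*)"                # whitespace and line comments
    r"|(?P<string>'(?:[^'\\]|\\.)*')"       # single-quoted string literal
    r"|(?P<word>[A-Za-z_][A-Za-z0-9_]*)"    # keyword or identifier
    r"|(?P<number>\d+)"                     # integer literal
    r"|(?P<punct>::|:=|->|[^\sA-Za-z0-9_])" # multi-char or single-char punctuation
)


def tokenize(source: str) -> list[str]:
    """Split source into tokens, discarding whitespace and comments."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}")
        if match.lastgroup != "skip":
            tokens.append(match.group())
        pos = match.end()
    return tokens


def revision_id(source: str) -> str:
    """Hash the token stream rather than the raw bytes of the file."""
    digest = hashlib.sha256()
    for token in tokenize(source):
        data = token.encode("utf-8")
        # Length-prefix each token so ["ab", "c"] and ["a", "bc"] differ.
        digest.update(len(data).to_bytes(4, "big"))
        digest.update(data)
    return digest.hexdigest()


if __name__ == "__main__":
    original = "CREATE TYPE User {\n    CREATE PROPERTY name -> str;\n};\n"
    reformatted = "# same content\nCREATE TYPE User { CREATE PROPERTY name -> str; };"
    edited = original.replace("name", "email")
    print(revision_id(original) == revision_id(reformatted))  # same tokens
    print(revision_id(original) == revision_id(edited))       # different tokens
```

With this scheme, a merge conflict resolution that alters a statement changes the id, so the server can detect that a file no longer matches the revision it registered. Whether keywords should also be case-normalized before hashing is left open here.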


Labels: enhancement (New feature or request)
