refactor: consolidate duplicate _INSERT_QUERY definitions for exported_summaries #193

@anfredette

Context

There are two independently maintained _INSERT_QUERY SQL strings that insert into the same exported_summaries table:

  • src/planner/knowledge_base/loader.py (lines ~123-158) — used by the benchmark loading flow (/api/v1/db/load, CLI scripts)
  • src/planner/knowledge_base/model_catalog_sync.py (lines ~208-216) — used by the model catalog sync flow

PR #190 fixed a bug where model_uri was present in model_catalog_sync.py's query but missing from loader.py's query. The root cause is that these two queries are maintained independently with no shared structure, so column additions can easily miss one path.

Proposal

Options (in order of preference):

  1. Extract a shared column list: Define the column list once (e.g., as a tuple or list in a shared module) and generate the INSERT query from it. Both loader.py and model_catalog_sync.py would import and use the same definition.

  2. Cross-reference with comments: If a shared definition is too disruptive, at minimum add comments in each file pointing to the other, e.g.:

    # NOTE: A similar INSERT query exists in model_catalog_sync.py.
    # If you add/remove columns here, update that file too.

  3. Structural test: A test that extracts %(...)s params from both queries and asserts they match (could complement option 1 or 2). See related issue for test coverage.
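
A rough sketch of how options 1 and 3 could fit together, for discussion only. Everything named here is an assumption: `EXPORTED_SUMMARIES_COLUMNS`, `build_insert_query`, and `extract_params` are hypothetical names, and of the listed columns only `model_uri` comes from this issue (the others are placeholders). The real queries may also carry clauses such as conflict handling that this sketch does not model.

```python
import re

# Hypothetical shared column list. Only model_uri is named in this issue;
# the other entries are placeholders for the real exported_summaries columns.
EXPORTED_SUMMARIES_COLUMNS = (
    "model_name",  # placeholder
    "hardware",    # placeholder
    "model_uri",
)


def build_insert_query(table, columns):
    """Build an INSERT statement using named %(col)s placeholders."""
    column_sql = ", ".join(columns)
    param_sql = ", ".join(f"%({col})s" for col in columns)
    return f"INSERT INTO {table} ({column_sql}) VALUES ({param_sql})"


# Both loader.py and model_catalog_sync.py would import this one definition.
INSERT_QUERY = build_insert_query("exported_summaries", EXPORTED_SUMMARIES_COLUMNS)

# Matches named placeholders such as %(model_uri)s and captures the name.
_PARAM_RE = re.compile(r"%\((\w+)\)s")


def extract_params(query):
    """Return the named parameters of a query, in order of appearance."""
    return _PARAM_RE.findall(query)
```

If a shared definition turns out to be too disruptive, `extract_params` alone could back the structural test from option 3: run it on both existing `_INSERT_QUERY` strings and assert that the two results are equal.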

Related
