
[Plugin Arch] Phase 1 — Extract Core Catalog Primitives into basecatalog #2527

@Al-Pragliola

Summary

Extract generic, reusable building blocks from the monolithic model catalog into catalog/internal/catalog/basecatalog/. These primitives (sources, config, loader, monitor/watcher, filter engine, entity registry, and validation) form the foundation that any catalog plugin can build on without duplicating infrastructure code.

Motivation

The current catalog implementation bundles domain-specific logic (models) with generic infrastructure (source loading, watching, filtering). This coupling prevents reuse by other catalog types (datasets, MCP tools, etc.) and makes the system harder to extend.

Scope

  • Extract the source abstractions: source types (ModelSource, MCPSource, CommonSourceFields), source status constants, and config loading/validation
  • Extract the BaseLoader: shared loader infrastructure with leader state management, inflight write tracking, file watcher lifecycle, and operation cancellation (exposes LoaderState interface for child loaders)
  • Extract the monitor/watcher: file change detection using CRC32 hashing, with Kubernetes ConfigMap symlink handling
  • Extract the filter engine: CatalogEntityRegistry for declarative entity-type-to-filter mapping, filterQuery conversion from named FieldFilter to query strings (supports =, !=, >, <, LIKE, IN, etc.), and FilterOptions conversion from DB PropertyOptions to API types
  • Extract the name filter: include/exclude glob pattern matching for entity names
  • Extract validation: artifact URI scheme validation (oci, http, https, s3, gs, az, file) and named query validation
  • Extract license transforms: SPDX license ID to human-readable name mapping
  • Place all extracted code in catalog/internal/catalog/basecatalog/ with clean interfaces
  • Ensure existing model catalog functionality continues to work after extraction

Known Coupling to Address

The following areas carry model-specific assumptions that need to be parameterized for full plugin generality (this work can be deferred to Phase 5/6):

  • License transforms: hardcoded overrides for model-specific licenses (llama2, llama3.1, nvidia-open-model-license)
  • Artifact URI validation: fixed scheme list — other catalog types may need different schemes
  • Config: ModelSource/MCPSource types are catalog-specific rather than fully generic
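
One possible way to remove the fixed-scheme coupling is to pass the allowed schemes into the validator, keeping the current list as the default. The sketch below is only an illustration under that assumption: URIValidator, NewURIValidator, and defaultSchemes are hypothetical names, and only the scheme list itself comes from this issue.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// defaultSchemes mirrors the fixed scheme list named in this issue.
var defaultSchemes = []string{"oci", "http", "https", "s3", "gs", "az", "file"}

// URIValidator checks artifact URIs against a configurable scheme allow-list.
// Hypothetical type, shown only to illustrate the parameterization.
type URIValidator struct {
	allowed map[string]struct{}
}

// NewURIValidator builds a validator for the given schemes.
// With no arguments it falls back to defaultSchemes.
func NewURIValidator(schemes ...string) *URIValidator {
	if len(schemes) == 0 {
		schemes = defaultSchemes
	}
	allowed := make(map[string]struct{}, len(schemes))
	for _, s := range schemes {
		allowed[strings.ToLower(s)] = struct{}{}
	}
	return &URIValidator{allowed: allowed}
}

// Validate returns an error if the URI cannot be parsed, has no scheme,
// or uses a scheme outside the allow-list.
func (v *URIValidator) Validate(raw string) error {
	u, err := url.Parse(raw)
	if err != nil {
		return fmt.Errorf("invalid artifact URI %q: %w", raw, err)
	}
	if u.Scheme == "" {
		return fmt.Errorf("artifact URI %q has no scheme", raw)
	}
	if _, ok := v.allowed[strings.ToLower(u.Scheme)]; !ok {
		return fmt.Errorf("artifact URI %q uses unsupported scheme %q", raw, u.Scheme)
	}
	return nil
}
```

Under this design the model catalog would keep calling NewURIValidator() with no arguments, while another catalog type could supply its own list, for example NewURIValidator("git", "https").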

Acceptance Criteria

  • catalog/internal/catalog/basecatalog/ contains well-defined interfaces and types for loader, monitor, entity registry, filter engine, source config, and validation
  • LoaderState interface decouples child loaders from BaseLoader internals
  • CatalogEntityRegistry enables declarative entity registration without hand-coded switch statements
  • Existing model catalog tests pass without modification (or with minimal adapter changes)
  • Unit tests cover the extracted primitives (config, entity registry, filter query, monitor, name filter, validation, license transforms)
  • make -C catalog build && make -C catalog test pass
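
As a rough illustration of the "declarative registration without hand-coded switch statements" criterion, the sketch below registers a per-entity field mapping and builds a query string by lookup. Everything here is an assumption made for illustration: EntityRegistry, EntityFilterSpec, the FieldFilter shape, and the output syntax are hypothetical and do not describe the actual CatalogEntityRegistry or filterQuery code, and only a subset of the operators is handled.

```go
package main

import (
	"fmt"
	"strings"
)

// FieldFilter is a hypothetical stand-in for the named filter in this issue.
type FieldFilter struct {
	Field    string
	Operator string
	Value    any
}

// EntityFilterSpec maps API filter field names to stored property names.
type EntityFilterSpec map[string]string

// EntityRegistry holds one spec per entity type, replacing a switch on type.
type EntityRegistry struct {
	specs map[string]EntityFilterSpec
}

func NewEntityRegistry() *EntityRegistry {
	return &EntityRegistry{specs: make(map[string]EntityFilterSpec)}
}

// Register declares which fields are filterable for an entity type.
func (r *EntityRegistry) Register(entityType string, spec EntityFilterSpec) {
	r.specs[entityType] = spec
}

// supportedOps is a subset of the operators listed in the issue.
var supportedOps = map[string]bool{"=": true, "!=": true, ">": true, "<": true, "LIKE": true}

// literal renders a value for the query string, quoting strings.
func literal(v any) string {
	if s, ok := v.(string); ok {
		return "'" + strings.ReplaceAll(s, "'", "''") + "'"
	}
	return fmt.Sprint(v)
}

// BuildQuery converts filters to an AND-joined query string using the
// registered spec for entityType.
func (r *EntityRegistry) BuildQuery(entityType string, filters []FieldFilter) (string, error) {
	spec, ok := r.specs[entityType]
	if !ok {
		return "", fmt.Errorf("unknown entity type %q", entityType)
	}
	clauses := make([]string, 0, len(filters))
	for _, f := range filters {
		prop, ok := spec[f.Field]
		if !ok {
			return "", fmt.Errorf("field %q is not filterable for %q", f.Field, entityType)
		}
		op := strings.ToUpper(f.Operator)
		if !supportedOps[op] {
			return "", fmt.Errorf("unsupported operator %q", f.Operator)
		}
		clauses = append(clauses, fmt.Sprintf("%s %s %s", prop, op, literal(f.Value)))
	}
	return strings.Join(clauses, " AND "), nil
}
```

With this shape, adding a new catalog type is a single Register call with its field mapping, and BuildQuery needs no changes.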
