
feat: Hook system for partition cleanup lifecycle #107

Draft
vmercierfr wants to merge 13 commits into qonto:main from vmercierfr:hooks-support


@vmercierfr

Collaborator

Summary

This PR introduces a requirements specification for a hook system that allows users to execute custom commands before and after partition archival operations (detach/drop).

The primary use case is archiving partition data (e.g., to S3) before dropping partitions, but the system is designed to be generic enough to support any pre/post operation workflow.

Specification

The full requirements document is available at: .kiro/specs/partition-hooks/requirements.md

Key Design Decisions

Hook Types (initial implementation)

  • shell — Execute arbitrary commands with template variables and optional credential propagation
  • postgresql — Execute SQL statements (e.g., VACUUM, ANALYZE) using PPM database connection

Hook Types (future)

  • s3 — PPM-managed extraction pipeline with two modes:
    • streaming (preferred): direct pipe to S3 multipart upload, no local disk usage
    • temp_dir (fallback): local write + upload with disk space validation using pg_table_size and configurable spare_percent
  • aws_s3 — Server-side export via PostgreSQL aws_s3 extension (recommended for large datasets)

Lifecycle Events

  • before-detach, after-detach, before-drop, after-drop

Configuration Scope

  • Global hooks (applied to all partitions)
  • Per-partition hooks (override global)

Production-Ready Features

  • Retry with backoff — Fixed or exponential, configurable per hook
  • Timeout — Per-hook execution timeout (default 300s)
  • Failure behavior: abort (stop everything) or continue, with safe defaults (before-hooks cancel the operation, after-hooks log and continue)
  • Transaction isolation — Hooks execute outside PostgreSQL transactions, no locks held
  • Template variables: {{.Schema}}, {{.Table}}, {{.ParentTable}}, {{.LowerBound}}, {{.UpperBound}}, {{.DatabaseName}}, {{.Hostname}}, etc.
  • Credential propagation — Optional injection of PGHOST, PGPORT, PGDATABASE, PGUSER, PGPASSWORD for shell hooks
  • Dry-run mode: --dry-run flag to preview resolved hooks without executing them
  • Structured metrics — Hook execution duration, success/failure counters, retry counters in JSON log output
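
To make the retry settings concrete, here is a minimal sketch of how a delay could be derived from `backoff`, `initial_delay` and `max_delay`. The `RetryConfig` struct and `DelayFor` method are hypothetical names for illustration, not code from this PR, and treating a zero `MaxDelay` as "no cap" is an assumption.

```go
package main

import "time"

// RetryConfig mirrors the retry keys from the example configuration
// (attempts, backoff, initial_delay, max_delay). The struct itself is
// hypothetical and not taken from the PR's code.
type RetryConfig struct {
	Attempts     int
	Backoff      string // "fixed" or "exponential"
	InitialDelay time.Duration
	MaxDelay     time.Duration // 0 means "no cap" in this sketch
}

// DelayFor returns how long to wait before retry n, where n=1 is the
// first retry after the initial attempt failed.
func (r RetryConfig) DelayFor(n int) time.Duration {
	d := r.InitialDelay
	if r.Backoff == "exponential" {
		for i := 1; i < n; i++ {
			d *= 2
			// Stop doubling once the cap is reached.
			if r.MaxDelay > 0 && d >= r.MaxDelay {
				return r.MaxDelay
			}
		}
	}
	if r.MaxDelay > 0 && d > r.MaxDelay {
		return r.MaxDelay
	}
	return d
}
```

With `initial_delay: 5s` and `max_delay: 60s`, exponential backoff in this sketch waits 5s, 10s, 20s, 40s, and then stays at 60s. Fixed backoff always waits 5s.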

What This PR Does NOT Include

  • No implementation code — this is a spec-only PR for review and discussion
  • Design document and task breakdown will follow once requirements are validated

Example Configuration

hooks:
  after-detach:
    - name: archive-to-s3
      type: shell
      command: /usr/local/bin/archive-partition.sh
      args:
        - "{{.Schema}}.{{.Table}}"
        - "{{.LowerBound}}"
        - "{{.UpperBound}}"
      timeout: 10m
      on_failure: abort
      retry:
        attempts: 3
        backoff: exponential
        initial_delay: 5s
        max_delay: 60s
    - name: vacuum-parent
      type: postgresql
      config:
        sql: "VACUUM ANALYZE {{.Schema}}.{{.ParentTable}}"
      timeout: 5m

partitions:
  logs:
    schema: public
    table: logs
    partitionKey: created_at
    interval: daily
    retention: 30
    preProvisioned: 7
    cleanupPolicy: drop

vmercierfr added the enhancement (New feature or request) label on May 28, 2026
vmercierfr force-pushed the hooks-support branch 4 times, most recently from fc2476c to db203e7, on June 8, 2026 20:42
vmercierfr added 13 commits June 8, 2026 23:20
Introduce the hook package with configuration model, validation logic,
and the Runner interface that all hook runners must implement.

Add pgregory.net/rapid for property-based testing and promote
spf13/pflag to a direct dependency.

Signed-off-by: Vincent Mercier <vmercier@gmail.com>
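
As an illustration of the validation step described in this commit, the following sketch checks a single hook entry. `HookConfig` and its fields are a hypothetical, trimmed-down model based on the example configuration; the rules shown (required command or SQL, known `on_failure` values) are assumptions about what the real validation might enforce.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// HookConfig is a hypothetical, trimmed-down model of one hook entry.
type HookConfig struct {
	Name      string
	Type      string        // "shell" or "postgresql"
	Command   string        // required for shell hooks
	SQL       string        // required for postgresql hooks
	Timeout   time.Duration // 0 means "use the default"
	OnFailure string        // "", "abort" or "continue"
}

// Validate reports every problem found, not only the first one.
func (h HookConfig) Validate() error {
	var errs []error
	if h.Name == "" {
		errs = append(errs, errors.New("name is required"))
	}
	switch h.Type {
	case "shell":
		if h.Command == "" {
			errs = append(errs, errors.New("shell hook needs a command"))
		}
	case "postgresql":
		if h.SQL == "" {
			errs = append(errs, errors.New("postgresql hook needs sql"))
		}
	default:
		errs = append(errs, fmt.Errorf("unknown hook type %q", h.Type))
	}
	switch h.OnFailure {
	case "", "abort", "continue":
		// Accepted values; "" falls back to the event's default.
	default:
		errs = append(errs, fmt.Errorf("unknown on_failure %q", h.OnFailure))
	}
	if h.Timeout < 0 {
		errs = append(errs, errors.New("timeout must not be negative"))
	}
	return errors.Join(errs...) // nil when errs is empty
}
```

Collecting all errors with `errors.Join` (Go 1.20+) lets a user fix a configuration file in one pass instead of one error per run.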
Add Go text/template based rendering for hook command arguments and
SQL queries, allowing partition context variables to be interpolated
into hook configurations at execution time.

Signed-off-by: Vincent Mercier <vmercier@gmail.com>
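
For readers unfamiliar with `text/template`, this is roughly what rendering a hook argument looks like. The `PartitionContext` struct and `Render` function are hypothetical; only the field names come from the template variables listed in the description.

```go
package main

import (
	"strings"
	"text/template"
)

// PartitionContext holds the values exposed to hook templates. The
// field names follow the PR description; the struct is illustrative.
type PartitionContext struct {
	Schema       string
	Table        string
	ParentTable  string
	LowerBound   string
	UpperBound   string
	DatabaseName string
	Hostname     string
}

// Render expands one templated string against the partition context.
// Referencing a field that does not exist fails at execution time.
func Render(tmpl string, pc PartitionContext) (string, error) {
	t, err := template.New("hook").Parse(tmpl)
	if err != nil {
		return "", err
	}
	var out strings.Builder
	if err := t.Execute(&out, pc); err != nil {
		return "", err
	}
	return out.String(), nil
}
```

Note that plain template expansion does no SQL identifier quoting, so a rendered statement such as `VACUUM ANALYZE {{.Schema}}.{{.ParentTable}}` is only as safe as the schema and table names in the configuration.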
Add resolver logic to merge global and partition-level hook
configurations, with partition-level hooks taking precedence
over global defaults.

Signed-off-by: Vincent Mercier <vmercier@gmail.com>
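
One possible reading of "partition-level hooks taking precedence" is a per-event override: if a partition defines hooks for a lifecycle event, they replace the global list for that event. The sketch below implements that reading. The override granularity is an assumption (the description does not say whether merging happens per event or per hook name), and `Hook`, `HookSet` and `Resolve` are hypothetical names.

```go
package main

// Hook is a minimal stand-in for a configured hook.
type Hook struct {
	Name string
	Type string
}

// HookSet maps a lifecycle event ("before-detach", "after-drop", ...)
// to the hooks configured for it.
type HookSet map[string][]Hook

// Resolve returns the effective hooks for one partition. Events set at
// partition level replace the global list for that event; all other
// events inherit the global hooks. Inputs are copied, never mutated.
func Resolve(global, partition HookSet) HookSet {
	out := make(HookSet, len(global)+len(partition))
	for event, hooks := range global {
		out[event] = append([]Hook(nil), hooks...)
	}
	for event, hooks := range partition {
		out[event] = append([]Hook(nil), hooks...)
	}
	return out
}
```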
Add credential extraction from PostgreSQL connection URLs, parsing
host, database, username, and password into environment variables
for use by shell hooks with propagate-credentials enabled.

Signed-off-by: Vincent Mercier <vmercier@gmail.com>
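
A minimal sketch of this extraction using only `net/url` is shown below. `CredentialEnv` is a hypothetical helper, and it only handles URL-style connection strings; keyword/value DSNs are out of scope here.

```go
package main

import (
	"errors"
	"net/url"
	"strings"
)

// CredentialEnv turns a postgres:// URL into PG* environment entries
// suitable for exec.Cmd.Env. Empty components are skipped.
func CredentialEnv(connURL string) ([]string, error) {
	u, err := url.Parse(connURL)
	if err != nil || (u.Scheme != "postgres" && u.Scheme != "postgresql") {
		// Deliberately generic: url.Parse errors embed the raw URL,
		// which may contain the password.
		return nil, errors.New("invalid PostgreSQL connection URL")
	}
	var env []string
	add := func(key, val string) {
		if val != "" {
			env = append(env, key+"="+val)
		}
	}
	add("PGHOST", u.Hostname())
	add("PGPORT", u.Port())
	add("PGDATABASE", strings.TrimPrefix(u.Path, "/"))
	if u.User != nil {
		add("PGUSER", u.User.Username())
		if pw, ok := u.User.Password(); ok {
			add("PGPASSWORD", pw) // already percent-decoded by net/url
		}
	}
	return env, nil
}
```

For `postgres://app:s3cr%40t@db.internal:5432/ppm`, this yields `PGHOST=db.internal`, `PGPORT=5432`, `PGDATABASE=ppm`, `PGUSER=app` and `PGPASSWORD=s3cr@t`.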
Add MetricsCollector to track hook execution outcomes (success, failure,
skip counts) and Registry that maps hook types to their Runner
implementations.

Signed-off-by: Vincent Mercier <vmercier@gmail.com>
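
The two pieces named in this commit could look roughly like the sketch below. The `Runner` method set is not shown in the PR description, so the single `Run() error` method here is a placeholder, and `RunnerFunc`, `Registry` and `Metrics` are hypothetical shapes.

```go
package main

import (
	"fmt"
	"sync"
)

// Runner is a stand-in for the PR's Runner interface, whose real
// method set is not shown in the description.
type Runner interface {
	Run() error
}

// RunnerFunc adapts a plain function to Runner.
type RunnerFunc func() error

func (f RunnerFunc) Run() error { return f() }

// Registry maps a hook type name such as "shell" to its Runner.
type Registry struct {
	runners map[string]Runner
}

func NewRegistry() *Registry {
	return &Registry{runners: make(map[string]Runner)}
}

// Register rejects duplicates so two runners cannot silently compete.
func (r *Registry) Register(hookType string, run Runner) error {
	if _, dup := r.runners[hookType]; dup {
		return fmt.Errorf("hook type %q already registered", hookType)
	}
	r.runners[hookType] = run
	return nil
}

func (r *Registry) Lookup(hookType string) (Runner, error) {
	run, ok := r.runners[hookType]
	if !ok {
		return nil, fmt.Errorf("no runner for hook type %q", hookType)
	}
	return run, nil
}

// Metrics counts outcomes; the mutex makes it safe for concurrent use.
type Metrics struct {
	mu     sync.Mutex
	counts map[string]int // keyed by "success", "failure", "skip"
}

func (m *Metrics) Record(outcome string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if m.counts == nil {
		m.counts = make(map[string]int)
	}
	m.counts[outcome]++
}

func (m *Metrics) Count(outcome string) int {
	m.mu.Lock()
	defer m.mu.Unlock()
	return m.counts[outcome]
}
```

A registry keyed by type name is what would let future types such as `s3` or `aws_s3` be added without touching the code that dispatches hooks.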
Implement the shell hook runner that executes external commands with
templated arguments, environment variable propagation, timeout handling,
and configurable working directory.

Signed-off-by: Vincent Mercier <vmercier@gmail.com>
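
The core of such a runner can be sketched with `exec.CommandContext`, which kills the process when the context deadline passes. `RunShell` and its signature are hypothetical; the PR's runner likely has a richer interface.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"os"
	"os/exec"
	"time"
)

// RunShell executes name with args, adds extraEnv on top of the parent
// environment, and kills the process once timeout elapses. It returns
// the combined stdout/stderr output.
func RunShell(timeout time.Duration, dir, name string, args, extraEnv []string) ([]byte, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	cmd := exec.CommandContext(ctx, name, args...)
	cmd.Dir = dir // "" keeps the current working directory
	cmd.Env = append(os.Environ(), extraEnv...)

	out, err := cmd.CombinedOutput()
	if err != nil && errors.Is(ctx.Err(), context.DeadlineExceeded) {
		return out, fmt.Errorf("hook %s timed out after %s", name, timeout)
	}
	if err != nil {
		return out, fmt.Errorf("hook %s failed: %w", name, err)
	}
	return out, nil
}
```

Arguments are passed as an argv slice and not through a shell, so rendered template values are not re-parsed for quoting or metacharacters. One caveat: if the command is itself a shell that spawns children, a surviving child can keep the output pipe open after the kill; Go 1.20+ offers `Cmd.WaitDelay` to bound that wait.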
Implement the PostgreSQL hook runner that executes SQL statements
directly against the database with templated queries, connection
management, and timeout support.

Signed-off-by: Vincent Mercier <vmercier@gmail.com>
Add the Executor that drives hook Runner implementations with retry
logic, on_failure policy enforcement (continue vs abort), and dry-run
mode that previews which hooks would run without executing them.

Signed-off-by: Vincent Mercier <vmercier@gmail.com>
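
The interplay of retries, `on_failure` and dry-run can be sketched as a single loop. Everything below (`Hook`, `ErrAbort`, `Execute`) is hypothetical, `Attempts` is assumed to count total tries, and the wait between attempts is omitted to keep the example short.

```go
package main

import (
	"errors"
	"fmt"
)

// Hook carries only the fields this sketch needs.
type Hook struct {
	Name      string
	Attempts  int    // total tries; values below 1 are treated as 1
	OnFailure string // "continue" keeps going; anything else aborts
}

// ErrAbort marks a failure that must stop the partition operation.
var ErrAbort = errors.New("hook aborted the operation")

// Execute runs hooks in order. In dry-run mode it only logs what would
// run. A hook is retried up to Attempts times; if it still fails, the
// on_failure policy decides whether the remaining hooks run.
func Execute(hooks []Hook, dryRun bool, run func(Hook) error, logf func(format string, args ...any)) error {
	for _, h := range hooks {
		if dryRun {
			logf("dry-run: would run hook %q", h.Name)
			continue
		}
		attempts := h.Attempts
		if attempts < 1 {
			attempts = 1
		}
		var err error
		for try := 1; try <= attempts; try++ {
			if err = run(h); err == nil {
				break
			}
			logf("hook %q attempt %d/%d failed: %v", h.Name, try, attempts, err)
		}
		if err == nil || h.OnFailure == "continue" {
			continue
		}
		return fmt.Errorf("%w: hook %q: %v", ErrAbort, h.Name, err)
	}
	return nil
}
```

Wrapping with a sentinel error lets the caller use `errors.Is(err, ErrAbort)` to decide whether to skip the detach or drop.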
Add the Orchestrator that coordinates hook lifecycle execution around
partition operations (pre/post hooks), manages template context
injection, and delegates to the Executor for actual hook runs.

Signed-off-by: Vincent Mercier <vmercier@gmail.com>
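
The default behavior from the description (a failing before-hook cancels the operation, a failing after-hook is only logged) can be illustrated as follows. `Step`, `ErrSkipped` and `Around` are hypothetical names, and per-hook `on_failure` overrides are left out of this sketch.

```go
package main

import (
	"errors"
	"fmt"
)

// Step is any unit of work: a list of hooks or the partition
// operation (detach or drop) itself.
type Step func() error

// ErrSkipped signals that the operation did not run because a
// before-hook failed.
var ErrSkipped = errors.New("operation skipped")

// Around runs before, op and after in order, applying the defaults
// from the PR description.
func Around(event string, before, op, after Step, logf func(string, ...any)) error {
	if err := before(); err != nil {
		return fmt.Errorf("%w: before-%s hook failed: %v", ErrSkipped, event, err)
	}
	if err := op(); err != nil {
		return fmt.Errorf("%s failed: %w", event, err)
	}
	if err := after(); err != nil {
		// The operation already happened, so there is nothing to cancel.
		logf("after-%s hook failed: %v", event, err)
	}
	return nil
}
```

This ordering is what makes the archive use case safe: if the archive step runs as a before-drop hook and fails, the partition is never dropped.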
Prepare the application layer for hooks integration:
- Add Hooks field to app config and partition configuration
- Add hook validation in config.Check()
- Extend PPM struct with connectionURL, globalHooks, and dryRun
- Add --dry-run flag to run and cleanup commands
- Update existing tests for new PPM constructor signature

Signed-off-by: Vincent Mercier <vmercier@gmail.com>
Wire hooks into the cleanup workflow:
- Add hook helper methods (runHook, newHookOrchestrator, buildPartitionContext)
- Execute pre/post hooks around partition detach and drop operations
- Support on_failure policies (abort vs continue) per partition
- Add comprehensive tests for hook execution and dry-run mode

Signed-off-by: Vincent Mercier <vmercier@gmail.com>
Add user-facing documentation for the hooks feature:
- Hook configuration reference (shell and postgresql types)
- Update mkdocs navigation and CLI reference
- Update README with hooks feature mention

Signed-off-by: Vincent Mercier <vmercier@gmail.com>
Add developer documentation explaining how to implement new hook types,
with architecture overview and step-by-step guide.

Signed-off-by: Vincent Mercier <vmercier@gmail.com>
