Feature Request: Add TiCDCChangefeed CRD for declarative changefeed lifecycle management #6771

@TailinLyu

Motivation

TiCDC changefeed lifecycle management (create, pause, resume, update config, delete) can only be done imperatively via the TiCDC HTTP API or cdc cli. This is a gap in Kubernetes environments:

  • Changefeeds are invisible to Kubernetes — no resource to kubectl get, no status to watch, no RBAC to control.
  • Changefeed state can drift from the intended configuration, and there is no reconciliation loop to correct it.
  • There is no GitOps-friendly way to manage changefeeds alongside the rest of the TiDB deployment.

Proposal

Introduce a TiCDCChangefeed CRD for declarative changefeed lifecycle management.

Create a changefeed

apiVersion: pingcap.com/v1alpha1
kind: TiCDCChangefeed
metadata:
  name: my-changefeed
  namespace: tidb-cluster
spec:
  clusterRef: my-tidb-cluster
  sinkURI: "kafka://kafka:9092/my-topic?protocol=canal-json"
  startTs: "438102428640256000"
  config: |
    {
      "filter": {
        "rules": ["mydb.*"]
      }
    }

$ kubectl apply -f changefeed.yaml
ticdcchangefeed.pingcap.com/my-changefeed created

$ kubectl get tcf
NAME             STATE    CHANGEFEEDID    CHECKPOINTTS          AGE
my-changefeed    Normal   my-changefeed   438102428640256512    30s
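
For readers who want to picture the API types behind this manifest, here is a minimal Go sketch. The JSON keys mirror the YAML fields shown in this issue; the Go type names, the status field names, and the `validateSpec` helper are assumptions for illustration and are not taken from the actual implementation. `startTs` is kept as a string because a TSO is a 64-bit integer that the manifest already quotes.

```go
package main

import (
	"encoding/json"
	"errors"
	"strconv"
)

// TiCDCChangefeedSpec mirrors the spec fields used in the manifest above.
// Hypothetical type: only the JSON keys come from the issue text.
type TiCDCChangefeedSpec struct {
	ClusterRef string `json:"clusterRef"`
	SinkURI    string `json:"sinkURI"`
	StartTs    string `json:"startTs,omitempty"`
	Paused     bool   `json:"paused,omitempty"`
	Config     string `json:"config,omitempty"` // raw JSON replica config
}

// TiCDCChangefeedStatus holds the values the controller syncs back.
// Field names are assumptions based on the printed columns.
type TiCDCChangefeedStatus struct {
	State        string `json:"state,omitempty"`
	ChangefeedID string `json:"changefeedID,omitempty"`
	CheckpointTs string `json:"checkpointTs,omitempty"`
	ResolvedTs   string `json:"resolvedTs,omitempty"`
	Error        string `json:"error,omitempty"`
}

// decodeSpec parses the JSON form of a spec.
func decodeSpec(raw []byte) (TiCDCChangefeedSpec, error) {
	var s TiCDCChangefeedSpec
	err := json.Unmarshal(raw, &s)
	return s, err
}

// validateSpec performs the cheap checks a webhook or controller could run
// before calling TiCDC.
func validateSpec(s TiCDCChangefeedSpec) error {
	if s.ClusterRef == "" || s.SinkURI == "" {
		return errors.New("clusterRef and sinkURI are required")
	}
	if s.StartTs != "" {
		if _, err := strconv.ParseUint(s.StartTs, 10, 64); err != nil {
			return errors.New("startTs must be an unsigned 64-bit integer")
		}
	}
	if s.Config != "" && !json.Valid([]byte(s.Config)) {
		return errors.New("config must be valid JSON")
	}
	return nil
}
```

Validating `config` as JSON up front would let a malformed manifest fail fast instead of surfacing as a TiCDC API error during reconciliation.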

Pause, update config, resume

TiCDC only allows config updates on paused changefeeds, so a change is applied as pause, update, resume:

kubectl patch tcf my-changefeed --type=merge -p '{"spec":{"paused":true}}'
kubectl patch tcf my-changefeed --type=merge -p '{"spec":{"sinkURI":"kafka://kafka:9092/new-topic"}}'
kubectl patch tcf my-changefeed --type=merge -p '{"spec":{"paused":false}}'
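
The same sequence could be generated programmatically, for example by a tool that submits merge patches through a Kubernetes client. The sketch below only builds the patch bodies in order; `specPatch` and `planSinkUpdate` are hypothetical helper names, and actually sending the patches is left out.

```go
package main

import "encoding/json"

// specPatch renders a JSON merge patch that sets a single field under .spec.
func specPatch(field string, value interface{}) (string, error) {
	body, err := json.Marshal(map[string]interface{}{
		"spec": map[string]interface{}{field: value},
	})
	return string(body), err
}

// planSinkUpdate returns the ordered merge patches for changing sinkURI.
// A running changefeed is paused first and resumed afterwards; one that is
// already paused only receives the update and stays paused.
func planSinkUpdate(newSinkURI string, alreadyPaused bool) ([]string, error) {
	type step struct {
		field string
		value interface{}
	}
	steps := []step{{"sinkURI", newSinkURI}}
	if !alreadyPaused {
		steps = []step{{"paused", true}, {"sinkURI", newSinkURI}, {"paused", false}}
	}
	patches := make([]string, 0, len(steps))
	for _, s := range steps {
		p, err := specPatch(s.field, s.value)
		if err != nil {
			return nil, err
		}
		patches = append(patches, p)
	}
	return patches, nil
}
```

Calling `planSinkUpdate("kafka://kafka:9092/new-topic", false)` yields the three patch bodies used in the `kubectl patch` commands above, in the same order.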

Delete with cleanup

$ kubectl delete tcf my-changefeed

A finalizer ensures the controller calls the TiCDC API to delete the changefeed before removing the Kubernetes resource.
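
As a rough illustration of that ordering, the sketch below removes the finalizer only after the remote delete succeeds, so a failed TiCDC call leaves the Kubernetes resource in place for a retry. The finalizer name, the `changefeedDeleter` interface, and the `fakeTiCDC` stand-in are all assumptions made for this example; the real controller would go through its TiCDC client and the Kubernetes API.

```go
package main

import "errors"

// changefeedFinalizer is a hypothetical finalizer name.
const changefeedFinalizer = "ticdcchangefeed.pingcap.com/cleanup"

// changefeedDeleter is the only part of the TiCDC client this sketch needs.
type changefeedDeleter interface {
	DeleteChangefeed(id string) error
}

// fakeTiCDC is an in-memory stand-in used to exercise the logic.
type fakeTiCDC struct {
	fail    bool
	deleted []string
}

func (f *fakeTiCDC) DeleteChangefeed(id string) error {
	if f.fail {
		return errors.New("ticdc unreachable")
	}
	f.deleted = append(f.deleted, id)
	return nil
}

func containsString(items []string, target string) bool {
	for _, it := range items {
		if it == target {
			return true
		}
	}
	return false
}

func removeString(items []string, target string) []string {
	out := make([]string, 0, len(items))
	for _, it := range items {
		if it != target {
			out = append(out, it)
		}
	}
	return out
}

// finalize runs when the CR has a deletion timestamp. It returns the
// finalizer list to persist: unchanged on error, without our finalizer
// once the changefeed has been deleted from TiCDC.
func finalize(finalizers []string, id string, api changefeedDeleter) ([]string, error) {
	if !containsString(finalizers, changefeedFinalizer) {
		return finalizers, nil // nothing left for this controller to clean up
	}
	if err := api.DeleteChangefeed(id); err != nil {
		return finalizers, err // keep the finalizer so the delete is retried
	}
	return removeString(finalizers, changefeedFinalizer), nil
}
```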

Reconciliation behavior

| Scenario | Action |
|---|---|
| CR created, changefeed doesn't exist in TiCDC | Create via TiCDC API v2 |
| spec.paused: true, changefeed running | Pause |
| spec.paused: false, changefeed paused/stopped/failed | Resume |
| Spec generation changed while paused | Update config (sinkURI, targetTs, replica_config) |
| CR deleted | Delete from TiCDC, remove finalizer |
| Every reconcile | Sync status: state, checkpointTs, resolvedTs, error, Ready condition |
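
The decision part of this behavior can be sketched as a pure function that maps desired and observed state to one action per reconcile. This is an illustration under assumptions, not the implementation: the type names are hypothetical, the state strings `normal`, `stopped`, and `failed` are assumed spellings of TiCDC's states, and the precedence between rules is a guess. Status sync happens on every pass and is not modeled here.

```go
package main

type action string

const (
	actionNone   action = "none"
	actionCreate action = "create"
	actionPause  action = "pause"
	actionResume action = "resume"
	actionUpdate action = "update"
	actionDelete action = "delete"
)

// desired is what the CR asks for (hypothetical shape).
type desired struct {
	Deleting           bool  // CR has a deletion timestamp
	Paused             bool  // spec.paused
	Generation         int64 // metadata.generation
	ObservedGeneration int64 // last generation applied to TiCDC
}

// observed is what TiCDC reports for the changefeed.
type observed struct {
	Exists bool
	State  string // assumed values: "normal", "stopped", "failed"
}

func isRunning(state string) bool { return state == "normal" }

func isResumable(state string) bool {
	return state == "stopped" || state == "failed"
}

// decide picks the single action for this reconcile pass.
func decide(d desired, o observed) action {
	switch {
	case d.Deleting:
		return actionDelete
	case !o.Exists:
		return actionCreate
	case d.Paused && isRunning(o.State):
		return actionPause
	case d.Paused && d.Generation != d.ObservedGeneration:
		return actionUpdate // config updates only apply while paused
	case !d.Paused && isResumable(o.State):
		return actionResume
	default:
		return actionNone
	}
}
```

Returning one action per pass keeps each reconcile idempotent: after a pause, the next pass sees the paused changefeed with a stale observed generation and applies the update.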

Implementation

I have a working implementation on TiDB Operator v1.5.5 (release-1.5) that includes:

  • CRD types with deepcopy/openapi generation
  • Controller using the standard v1 workqueue pattern
  • TiCDC HTTP API v2 client (/api/v2/changefeeds/...) — full CRUD + pause/resume
  • Finalizer-based cleanup on deletion
  • Unit tests (~2200 lines covering controller, API client, reconciliation)
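
To give a sense of the API client's surface, here is a small sketch of how requests against `/api/v2/changefeeds/...` might be built with the Go standard library. The helper names and the base URL in the usage note are hypothetical. The method and path pairs for get, delete, pause, and resume follow my reading of the TiCDC OpenAPI v2 documentation and should be checked against it. Create and update are omitted because they carry request bodies.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

// changefeedURL builds <base>/api/v2/changefeeds[/<id>[/<verb>]].
func changefeedURL(base, id, verb string) string {
	u := strings.TrimRight(base, "/") + "/api/v2/changefeeds"
	if id != "" {
		u += "/" + url.PathEscape(id)
	}
	if verb != "" {
		u += "/" + verb
	}
	return u
}

// newChangefeedRequest maps a body-less operation to an HTTP request.
func newChangefeedRequest(op, base, id string) (*http.Request, error) {
	switch op {
	case "get":
		return http.NewRequest(http.MethodGet, changefeedURL(base, id, ""), nil)
	case "delete":
		return http.NewRequest(http.MethodDelete, changefeedURL(base, id, ""), nil)
	case "pause", "resume":
		return http.NewRequest(http.MethodPost, changefeedURL(base, id, op), nil)
	default:
		return nil, fmt.Errorf("unsupported changefeed operation %q", op)
	}
}
```

For example, `newChangefeedRequest("pause", "http://ticdc:8300", "my-changefeed")` produces a POST to `http://ticdc:8300/api/v2/changefeeds/my-changefeed/pause`, which can then be sent with any `http.Client`.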

This has been tested and is running in our environment. I have also verified it cherry-picks to v1.6.5 with only 5 trivial context conflicts (from CompactBackup additions / AutoScaler removals — no logic changes).

**I would like to contribute this to v1, targeting release-1.5 with a cherry-pick to release-1.6. Please let me know your thoughts; I can raise a PR once we reach consensus.**

On v2 compatibility

I understand v1 is frozen and v2 is the active development target. My v1 implementation cannot be directly ported to v2 due to the architectural differences (controller-runtime, split API module, task-based reconciliation, etc.).

I also note that v2.0.0's TiCDC API client (pkg/ticdcapi/v1/) only covers node-level operations (GetStatus, DrainCapture, ResignOwner, IsHealthy) and has no changefeed support at all — no types, no endpoints, no references anywhere in the codebase.

Do you plan to close this gap for TiCDC changefeed management in v2? If not, I am willing to contribute a v2-native implementation after the v1 contribution is settled.
