Description
Motivation
TiCDC changefeed lifecycle management (create, pause, resume, update config, delete) can only be done imperatively via the TiCDC HTTP API or cdc cli. This is a gap in Kubernetes environments:
- Changefeeds are invisible to Kubernetes: there is no resource to `kubectl get`, no status to watch, and no RBAC to control.
- Changefeed state drifts with no reconciliation loop to correct it.
- There is no GitOps-friendly way to manage changefeeds alongside the rest of the TiDB deployment.
Proposal
Introduce a TiCDCChangefeed CRD for declarative changefeed lifecycle management.
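
To make the shape of the resource concrete, here is a minimal sketch of Go types that could back the CRD. The field names and JSON tags are inferred from the example manifest and the status fields listed in this proposal; they are assumptions, not the types from the actual implementation. Conditions such as Ready are left out for brevity, and `validateSpec` is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// TiCDCChangefeedSpec is an assumed spec layout, inferred from the example
// manifest in this proposal (clusterRef, sinkURI, startTs, config, paused).
type TiCDCChangefeedSpec struct {
	ClusterRef string `json:"clusterRef"`
	SinkURI    string `json:"sinkURI"`
	StartTs    string `json:"startTs,omitempty"`
	TargetTs   string `json:"targetTs,omitempty"`
	Paused     bool   `json:"paused,omitempty"`
	// Config carries the changefeed replica config as a raw JSON string.
	Config string `json:"config,omitempty"`
}

// TiCDCChangefeedStatus is an assumed status layout, based on the fields
// the proposal says are synced on every reconcile.
type TiCDCChangefeedStatus struct {
	State        string `json:"state,omitempty"`
	ChangefeedID string `json:"changefeedID,omitempty"`
	CheckpointTs string `json:"checkpointTs,omitempty"`
	ResolvedTs   string `json:"resolvedTs,omitempty"`
	Error        string `json:"error,omitempty"`
}

// validateSpec is a hypothetical helper that rejects obviously broken specs
// before anything is sent to TiCDC.
func validateSpec(s TiCDCChangefeedSpec) error {
	if s.ClusterRef == "" {
		return errors.New("spec.clusterRef is required")
	}
	if s.SinkURI == "" {
		return errors.New("spec.sinkURI is required")
	}
	if s.Config != "" && !json.Valid([]byte(s.Config)) {
		return errors.New("spec.config must be valid JSON")
	}
	return nil
}

func main() {
	spec := TiCDCChangefeedSpec{
		ClusterRef: "my-tidb-cluster",
		SinkURI:    "kafka://kafka:9092/my-topic?protocol=canal-json",
		StartTs:    "438102428640256000",
		Config:     `{"filter": {"rules": ["mydb.*"]}}`,
	}
	fmt.Println("validation error:", validateSpec(spec))
}
```

Keeping the timestamps as strings matches the quoted `startTs` in the manifest; whether the real types use strings or integers is not stated here.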
Create a changefeed

    apiVersion: pingcap.com/v1alpha1
    kind: TiCDCChangefeed
    metadata:
      name: my-changefeed
      namespace: tidb-cluster
    spec:
      clusterRef: my-tidb-cluster
      sinkURI: "kafka://kafka:9092/my-topic?protocol=canal-json"
      startTs: "438102428640256000"
      config: |
        {
          "filter": {
            "rules": ["mydb.*"]
          }
        }

    $ kubectl apply -f changefeed.yaml
    ticdcchangefeed.pingcap.com/my-changefeed created
    $ kubectl get tcf
    NAME            STATE    CHANGEFEEDID    CHECKPOINTTS         AGE
    my-changefeed   Normal   my-changefeed   438102428640256512   30s

Pause, update config, resume
TiCDC only allows config updates on paused changefeeds:

    kubectl patch tcf my-changefeed --type=merge -p '{"spec":{"paused":true}}'
    kubectl patch tcf my-changefeed --type=merge -p '{"spec":{"sinkURI":"kafka://kafka:9092/new-topic"}}'
    kubectl patch tcf my-changefeed --type=merge -p '{"spec":{"paused":false}}'

Delete with cleanup

    $ kubectl delete tcf my-changefeed

A finalizer ensures the controller calls the TiCDC API to delete the changefeed before removing the Kubernetes resource.
Reconciliation behavior
| Scenario | Action |
|---|---|
| CR created, changefeed doesn't exist in TiCDC | Create via TiCDC API v2 |
| `spec.paused: true`, changefeed running | Pause |
| `spec.paused: false`, changefeed paused/stopped/failed | Resume |
| Spec generation changed while paused | Update config (sinkURI, targetTs, replica_config) |
| CR deleted | Delete from TiCDC, remove finalizer |
| Every reconcile | Sync status: state, checkpointTs, resolvedTs, error, Ready condition |
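
The table above can be read as a pure decision function. The sketch below is one possible encoding, not the controller's actual logic: the type names, the lowercase state strings, and the choice to check for a pending config update before resuming are all assumptions made for illustration.

```go
package main

import "fmt"

// Action is the single step the reconciler would take on this pass.
type Action string

const (
	ActionCreate Action = "Create"
	ActionPause  Action = "Pause"
	ActionResume Action = "Resume"
	ActionUpdate Action = "UpdateConfig"
	ActionDelete Action = "Delete"
	ActionNone   Action = "None" // only sync status
)

// Desired is an assumed summary of the CR: spec.paused, the spec
// generation, and whether a deletion timestamp is set.
type Desired struct {
	Deleting   bool
	Paused     bool
	Generation int64
}

// Observed is an assumed summary of what TiCDC reports, plus the last
// spec generation the controller successfully applied.
type Observed struct {
	Exists             bool
	State              string // assumed values: "normal", "paused", "stopped", "failed"
	ObservedGeneration int64
}

// notRunning reports whether the changefeed is in one of the states the
// table treats as paused/stopped/failed.
func notRunning(state string) bool {
	return state == "paused" || state == "stopped" || state == "failed"
}

// decideAction maps desired and observed state to one action, following
// the scenarios in the table. Status sync happens on every pass regardless.
func decideAction(d Desired, o Observed) Action {
	switch {
	case d.Deleting:
		return ActionDelete
	case !o.Exists:
		return ActionCreate
	case d.Paused && o.State == "normal":
		return ActionPause
	case notRunning(o.State) && d.Generation != o.ObservedGeneration:
		// Config can only be updated while the changefeed is not running.
		return ActionUpdate
	case !d.Paused && notRunning(o.State):
		return ActionResume
	default:
		return ActionNone
	}
}

func main() {
	d := Desired{Paused: true, Generation: 2}
	o := Observed{Exists: true, State: "normal", ObservedGeneration: 1}
	fmt.Println(decideAction(d, o))
}
```

Returning a single action per pass keeps each reconcile small; the caller would record the observed generation after a successful update and requeue to pick up the next step.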
Implementation
I have a working implementation on TiDB Operator v1.5.5 (release-1.5) that includes:
- CRD types with deepcopy/openapi generation
- Controller using the standard v1 workqueue pattern
- TiCDC HTTP API v2 client (`/api/v2/changefeeds/...`): full CRUD plus pause/resume
- Finalizer-based cleanup on deletion
- Unit tests (~2200 lines covering controller, API client, reconciliation)
This has been tested and is running in our environment. I have also verified it cherry-picks to v1.6.5 with only 5 trivial context conflicts (from CompactBackup additions / AutoScaler removals — no logic changes).
**I would like to contribute this to v1, targeting release-1.5 with a cherry-pick to release-1.6. Please let me know your thoughts; I can raise a PR if we reach consensus.**
On v2 compatibility
I understand v1 is frozen and v2 is the active development target. My v1 implementation cannot be ported directly to v2 because of the architectural differences (controller-runtime, split API module, task-based reconciliation, etc.).
I also note that v2.0.0's TiCDC API client (pkg/ticdcapi/v1/) only covers node-level operations (GetStatus, DrainCapture, ResignOwner, IsHealthy) and has no changefeed support at all — no types, no endpoints, no references anywhere in the codebase.
Do you plan to close this gap for TiCDC changefeed management in v2? If not, I am willing to contribute a v2-native implementation after the v1 contribution is settled.