
Replace Fluentd with Vector for Tekton log collection and Results API integration #3381

Description

@vdemeester

Context

The dogfooding cluster has a working log pipeline:

  1. Fluent Bit (DaemonSet) → collects pod logs
  2. Fluentd (StatefulSet) → forwards to OCI Object Storage (S3-compatible)
  3. tekton-logs-server (Node.js) → serves logs at logs.infra.tekton.dev
  4. Dashboard → configured with --external-logs=https://logs.infra.tekton.dev/logs

This works for the Dashboard, but the Tekton Results API cannot serve these logs because the S3 path format does not match what the Results Blob plugin expects.

The path format mismatch

Fluentd writes:

<namespace>/<pod-name>/<container-name>/YYYYMMDDHHMI_N.log
# e.g. tekton-ci/request-pr-docs-reviewer-zrqpg-clone-repo-pod/step-clone/202605130930_0.log

Results Blob plugin expects:

<LOGS_PATH>/<parent>/<resultName>/<recordName>/*.log
# e.g. tekton-ci/results/<pipelinerun-uid>/records/<taskrun-uid>/*.log

Results identifies records by k8s UIDs (PipelineRun UID, TaskRun UID), not by pod/container names. The Blob plugin (LOGS_TYPE=Blob) lists S3 objects under a prefix derived from these UIDs.
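The mismatch can be illustrated with a small sketch that mimics a prefix-based S3 listing. This is only a model of the behaviour described above, not the plugin's code: `list_objects` is a hypothetical stand-in for an S3 list call, the UIDs are placeholders, and the prefix layout follows the example path shown earlier.

```python
from typing import List


def list_objects(keys: List[str], prefix: str) -> List[str]:
    """Hypothetical stand-in for an S3 list call: keep keys under a prefix."""
    return sorted(k for k in keys if k.startswith(prefix))


# Key as Fluentd writes it today (example taken from this issue).
fluentd_keys = [
    "tekton-ci/request-pr-docs-reviewer-zrqpg-clone-repo-pod/"
    "step-clone/202605130930_0.log",
]

# Placeholder UIDs, invented for illustration only.
pipelinerun_uid = "0a1b2c3d-0000-4000-8000-000000000001"
taskrun_uid = "0a1b2c3d-0000-4000-8000-000000000002"

# Prefix in the UID-based layout from the example above.
results_prefix = f"tekton-ci/results/{pipelinerun_uid}/records/{taskrun_uid}/"

# A key that a Results-compatible writer would produce (file name is assumed).
results_keys = [results_prefix + "step-clone.log"]

found_today = list_objects(fluentd_keys, results_prefix)
found_after = list_objects(fluentd_keys + results_keys, results_prefix)
print(found_today)  # nothing: pod-name based keys never share the UID prefix
print(found_after)
```

Because the Fluentd keys are built from pod and container names, no UID-derived prefix can ever match them, which is why the Results log endpoint finds nothing in the current bucket layout.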

Proposal: Replace Fluentd with Vector

Vector is better suited because it has native Kubernetes metadata enrichment — it can:

  1. Collect pod logs (same as Fluent Bit → Fluentd today)
  2. Enrich with k8s labels/annotations (tekton.dev/pipelineRun, tekton.dev/taskRun, UIDs)
  3. Transform S3 output paths to match the Results Blob convention
  4. Write directly to OCI Object Storage (S3-compatible) — no Fluentd intermediary needed

Vector was already proven in the tekton-experiments PoC for artifact archival (SRVKP-10766), where it successfully collected Tekton pod logs into MinIO with custom path formatting.

Migration plan

Phase 1: Keep current setup working

  • Dashboard continues using --external-logs → tekton-logs-server → S3
  • Results logs API remains disabled
  • No user-facing changes

Phase 2: Deploy Vector with Results-compatible S3 paths

  • Deploy Vector (Helm, DaemonSet) alongside or replacing Fluentd
  • Configure Vector to:
    • Collect logs from pods with app.kubernetes.io/managed-by=tekton-pipelines
    • Enrich with k8s metadata (pod labels carry the PipelineRun/TaskRun names; the UIDs have to be resolved, see Technical notes)
    • Write to S3 with paths matching Results Blob convention
  • Enable Results Blob plugin (LOGS_TYPE=Blob, LOGGING_PLUGIN_API_URL=s3://...)
  • Verify Results API serves logs correctly via gRPC/REST
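The selection and enrichment step in Phase 2 can be sketched in Python to make the intended logic concrete. This is a model of what a Vector transform would need to do, not Vector configuration. The event layout (`kubernetes.pod_labels`, `kubernetes.pod_namespace`, `kubernetes.container_name`) is an assumption about how the log event is shaped, and `select_tekton_event` is a hypothetical helper name.

```python
from typing import Any, Dict, Optional

MANAGED_BY_LABEL = "app.kubernetes.io/managed-by"
MANAGED_BY_VALUE = "tekton-pipelines"


def select_tekton_event(event: Dict[str, Any]) -> Optional[Dict[str, str]]:
    """Return the fields needed downstream, or None for non-Tekton pods."""
    k8s = event.get("kubernetes", {})
    labels = k8s.get("pod_labels", {})
    if labels.get(MANAGED_BY_LABEL) != MANAGED_BY_VALUE:
        return None  # drop logs from pods not managed by Tekton Pipelines
    return {
        "namespace": k8s.get("pod_namespace", ""),
        # Labels carry names only; UIDs are resolved in a later step.
        "pipeline_run": labels.get("tekton.dev/pipelineRun", ""),
        "task_run": labels.get("tekton.dev/taskRun", ""),
        "container": k8s.get("container_name", ""),
        "message": event.get("message", ""),
    }


# Sample event with an assumed layout and invented values.
sample = {
    "message": "Cloning into '/workspace/source'",
    "kubernetes": {
        "pod_namespace": "tekton-ci",
        "container_name": "step-clone",
        "pod_labels": {
            MANAGED_BY_LABEL: MANAGED_BY_VALUE,
            "tekton.dev/pipelineRun": "request-pr-docs-reviewer-zrqpg",
            "tekton.dev/taskRun": "request-pr-docs-reviewer-zrqpg-clone-repo",
        },
    },
}

selected = select_tekton_event(sample)
ignored = select_tekton_event({"message": "x", "kubernetes": {"pod_labels": {}}})
print(selected)
print(ignored)
```

Keeping the filter separate from path construction makes it easy to verify that only Tekton-managed pods reach the S3 sink before the path format is changed.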

Phase 3: Dashboard uses Results API for logs

  • Investigate/implement Results API log support in tektoncd/dashboard (may need upstream work)
  • Switch Dashboard from --external-logs to Results API integration
  • This gives both Dashboard and Results API a single source of truth

Phase 4: Retire legacy components

  • Remove tekton-logs-server deployment
  • Remove Fluentd StatefulSet
  • Optionally keep Fluent Bit as the collector feeding Vector, or replace Fluent Bit with Vector's own kubernetes_logs source

Benefits

  • Single log store read by both Results API and Dashboard — no duplication
  • Vector is more capable than Fluentd for k8s metadata enrichment and path transformation
  • Results API integration enables programmatic log access, retention policies, and future TEP-0164 artifact archival
  • Simpler stack: Vector replaces Fluentd (forwarder), and the Results API replaces tekton-logs-server (reader proxy)

Technical notes

Vector k8s metadata enrichment

Vector's kubernetes_logs source automatically enriches log events with pod labels, annotations, namespace, and container name. The owning PipelineRun/TaskRun UIDs still need to be resolved. Candidate sources:

  • Pod labels (Tekton sets the tekton.dev/pipelineRun name, but not the UID)
  • Pod ownerReferences → TaskRun UID → look up the PipelineRun UID from the TaskRun's ownerReferences
  • A Vector transform that queries the k8s API for UIDs based on names
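The ownerReferences route can be sketched as a pure function over already-fetched manifests. This is a minimal illustration under assumptions: objects are plain dicts in the usual Kubernetes metadata shape, `taskruns_by_uid` is a hypothetical cache of TaskRun objects, and all UIDs and names are placeholders. How the objects are fetched (a k8s API call, a cache, a sidecar) is left open.

```python
from typing import Any, Dict, Optional, Tuple


def owner_uid(obj: Dict[str, Any], kind: str) -> Optional[str]:
    """Return the UID of the first ownerReference of the given kind."""
    for ref in obj.get("metadata", {}).get("ownerReferences", []):
        if ref.get("kind") == kind:
            return ref.get("uid")
    return None


def resolve_uids(
    pod: Dict[str, Any], taskruns_by_uid: Dict[str, Dict[str, Any]]
) -> Tuple[Optional[str], Optional[str]]:
    """Walk Pod -> TaskRun -> PipelineRun; return (pipelinerun_uid, taskrun_uid)."""
    taskrun_uid = owner_uid(pod, "TaskRun")
    if taskrun_uid is None:
        return None, None
    taskrun = taskruns_by_uid.get(taskrun_uid, {})
    # A TaskRun started on its own has no PipelineRun owner, so this may be None.
    return owner_uid(taskrun, "PipelineRun"), taskrun_uid


# Placeholder objects, invented for illustration.
pod = {"metadata": {"ownerReferences": [
    {"kind": "TaskRun", "name": "clone-repo", "uid": "tr-uid-1"}]}}
taskruns = {"tr-uid-1": {"metadata": {"ownerReferences": [
    {"kind": "PipelineRun", "name": "reviewer", "uid": "pr-uid-1"}]}}}

resolved = resolve_uids(pod, taskruns)
unowned = resolve_uids({"metadata": {}}, taskruns)
print(resolved)  # ('pr-uid-1', 'tr-uid-1')
print(unowned)   # (None, None)
```

The second hop requires the TaskRun object itself, so whichever component performs this lookup needs read access to TaskRuns, not just pod metadata.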

Results Blob plugin path derivation

The Blob plugin uses defaultBlobPathParams = "/%s/%s/%s/" (parent/resultName/recordName) to construct S3 prefixes. The parent is the namespace, resultName is the PipelineRun UID (for owned TaskRuns), and recordName is the TaskRun UID.
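As a rough illustration of the derivation described above, the same format string can be applied in Python. This is a sketch, not the plugin's code: how the prefix is joined to LOGS_PATH is an assumption, `blob_prefix` is a hypothetical helper, and the UIDs are placeholders. The exact layout should be checked against the plugin before the Vector key prefix is finalised.

```python
# Format string quoted in this issue; the %s slots are
# parent / resultName / recordName.
DEFAULT_BLOB_PATH_PARAMS = "/%s/%s/%s/"


def blob_prefix(logs_path: str, parent: str, result_name: str,
                record_name: str) -> str:
    """Build the S3 prefix under which log objects would be listed.

    Joining logs_path by simple concatenation is an assumption.
    """
    suffix = DEFAULT_BLOB_PATH_PARAMS % (parent, result_name, record_name)
    return logs_path.rstrip("/") + suffix


# parent = namespace, resultName = PipelineRun UID, recordName = TaskRun UID.
prefix = blob_prefix("logs", "tekton-ci", "pr-uid-1", "tr-uid-1")
print(prefix)  # logs/tekton-ci/pr-uid-1/tr-uid-1/
```

Vector would then need to write each object under this prefix, for example by templating its S3 key prefix from the namespace and the two resolved UIDs.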

Dashboard Results API integration

Dashboard currently supports --external-logs (HTTP endpoint) and native k8s log streaming. Using Results API for logs may need upstream Dashboard work — investigate whether Dashboard can query Results gRPC/REST for log records.

Related

/kind feature

Metadata

Labels

  • area/dogfooding: Indicates an issue on dogfooding (aka using Pipeline to test Pipeline)
  • kind/feature: Categorizes issue or PR as related to a new feature.
