
ScanJobs fail in Trivy Operator Server Mode — all ScanJob pods in Error state, no vulnerabilities detected #2815

@hichem-belhocine

Description

What steps did you take and what happened:

When running Trivy Operator in server mode, all ScanJob pods end up in an Error state and no vulnerability reports are generated. The operator keeps attempting to create ScanJobs, but each one fails immediately. However, when switching to standalone mode, vulnerability scanning works correctly and reports are created as expected.
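
To pinpoint the failure, the scan pods and their container logs can be inspected directly (a sketch; the pod hash and container name vary per run and are placeholders here):

# describe a failing scan pod, then pull the scan container's stderr
kubectl -n trivy-system describe pod scan-vulnerabilityreport-<hash>
kubectl -n trivy-system logs scan-vulnerabilityreport-<hash> -c <scan-container>

Since status.message in the operator log below is empty, the scan container's own stderr is usually the only place the real error shows up.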

What did you expect to happen:

  • ScanJobs should run successfully.
  • Trivy Operator should retrieve vulnerability results from the Trivy server.
  • VulnerabilityReports should be created in the scanned namespaces.
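
A quick check for the last point above (assuming the chart's CRDs are installed): listing reports across all namespaces returns nothing in server mode.

# no VulnerabilityReports exist after the scan jobs error out
kubectl get vulnerabilityreports -A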

Environment:

  • Trivy-Operator version (use trivy-operator version): 0.31.0
  • Kubernetes version (use kubectl version): v1.33.5

Values.yaml

global:
  image:
    registry: "private-registry"

operator:
  replicas: 2
  scanJobsConcurrentLimit: 5
  scanJobTTL: "30s"
  # We use Server mode  
  builtInTrivyServer: true

trivyOperator:
  scanJobTolerations:
  scanJobCustomVolumesMount:
    - name: trusted-ca
      mountPath: /etc/ssl/certs/trusted-ca-bundle.crt
      subPath: trusted-ca-bundle.crt
      readOnly: true
    - name: tmp
      mountPath: /tmp
  scanJobCustomVolumes:
    - name: trusted-ca
      secret:
        secretName: trusted-ca
    - emptyDir: {}
      name: tmp
  scanJobPodTemplatePodSecurityContext:
    runAsUser: 0
    runAsNonRoot: false
    seccompProfile:
      type: RuntimeDefault
  scanJobPodTemplateContainerSecurityContext:
    allowPrivilegeEscalation: false
    capabilities:
      drop:
        - ALL
    privileged: false
    readOnlyRootFilesystem: true
    # -- For filesystem scanning, Trivy needs to run as the root user
    # runAsUser: 0

trivy:
  mode: ClientServer # Standalone by default.
  storageSize: "30Gi"
  resources:
    requests:
      cpu: 10m
      memory: 100M
      # ephemeralStorage: "2Gi"
    limits:
      cpu: 500m
      memory: 1000M
      # ephemeralStorage: "2Gi"
  httpProxy: xxxxxxxxxxxxx
  httpsProxy: xxxxxxxxxxxxx
  noProxy: xxxxxxxxxxxxxx
  slow: true
  ignoreUnfixed: true
  dbRegistry: "private-registry"
  javaDbRegistry: "private-registry"
  command: image
  server:
    # -- resources set trivy-server resource
    resources:
      requests:
        cpu: 10m
        memory: 512Mi
        # ephemeral-storage: "2Gi"
      limits:
        cpu: 1
        memory: 1Gi
        # ephemeral-storage: "2Gi" 
    replicas: 2
    podSecurityContext:
      runAsUser: 65534
      runAsNonRoot: true
      fsGroup: 65534
      seccompProfile:
        type: RuntimeDefault
    securityContext:
      runAsNonRoot: true
      privileged: false
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      seccompProfile:
        type: RuntimeDefault
      capabilities:
        drop:
        - ALL 
    extraServerVolumes:
      volumeMounts:
        - name: trusted-ca
          mountPath: /etc/ssl/certs/trusted-ca-bundle.crt
          subPath: trusted-ca-bundle.crt
          readOnly: true
      volumes:
        - name: trusted-ca
          secret:
            secretName: trusted-ca

resources:
  limits:
    cpu: 500m
    memory: 2048Mi
  requests:
    cpu: 100m
    memory: 512Mi

podAnnotations:
  prometheus.io/port: '8080'
  prometheus.io/scrape: 'true'
  prometheus.io/honor_labels: "true"

podSecurityContext:
  runAsNonRoot: true
  seccompProfile:
    type: RuntimeDefault

# -- securityContext security context
securityContext:
  runAsNonRoot: true
  privileged: false
  allowPrivilegeEscalation: false
  readOnlyRootFilesystem: true
  capabilities:
    drop:
      - ALL
  seccompProfile:
    type: RuntimeDefault
volumeMounts:
  - name: trusted-ca
    mountPath: /etc/ssl/certs/trusted-ca-bundle.crt
    subPath: trusted-ca-bundle.crt
    readOnly: true
  - name: tmp
    mountPath: /tmp
volumes:
  - name: trusted-ca
    secret:
      secretName: trusted-ca
  - emptyDir: {}
    name: tmp
tolerations:
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app.kubernetes.io/name: trivy-operator
        topologyKey: topology.kubernetes.io/zone
      - labelSelector:
          matchLabels:
            app.kubernetes.io/name: trivy-operator
        topologyKey: kubernetes.io/hostname
policiesBundle:
  registry: private-registry
nodeCollector:
  registry: private-registry
> k get po
NAME                                        READY   STATUS    RESTARTS   AGE
scan-vulnerabilityreport-66d6c58f66-8vtlm   1/2     Error     0          3s
scan-vulnerabilityreport-67dd5bcdbb-fpl7r   0/1     Error     0          33s
scan-vulnerabilityreport-698c55c4f9-hvbsr   0/1     Error     0          3s
scan-vulnerabilityreport-84656d9585-s2dqj   0/1     Error     0          33s
scan-vulnerabilityreport-86fbbf6f5-s6g2w    0/1     Error     0          3s
trivy-operator-55b44b8579-8g8kk             1/1     Running   0          15m
trivy-server-0                              1/1     Running   0          22m

The trivy-operator pod logs contain hundreds of entries like this one:

{"level":"error","ts":"2025-11-17T14:33:50Z","logger":"reconciler.scan job","msg":"Scan job container","job":"trivy-system/scan-vulnerabilityreport-59599f5f8d","container":"xxxxx","status.reason":"Error","status.message":"","stacktrace":"github.com/aquasecurity/trivy-operator/pkg/vulnerabilityreport/controller.(*ScanJobController).completedContainers\n\t/home/runner/work/trivy-operator/trivy-operator/pkg/vulnerabilityreport/controller/scanjob.go:441\ngithub.com/aquasecurity/trivy-operator/pkg/vulnerabilityreport/controller.(*ScanJobController).SetupWithManager.(*ScanJobController).reconcileJobs.func1\n\t/home/runner/work/trivy-operator/trivy-operator/pkg/vulnerabilityreport/controller/scanjob.go:103\nsigs.k8s.io/controller-runtime/pkg/reconcile.TypedFunc[...].Reconcile\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.22.1/pkg/reconcile/reconcile.go:134\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Reconcile\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.22.1/pkg/internal/controller/controller.go:216\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).reconcileHandler\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.22.1/pkg/internal/controller/controller.go:461\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).processNextWorkItem\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.22.1/pkg/internal/controller/controller.go:421\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Start.func1.1\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.22.1/pkg/internal/controller/controller.go:296"}

Removing the whole namespace, as suggested in #1325 (comment), does not help.
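
For completeness, the fallback that works is switching the same install back to standalone; the release and repo names below are assumptions:

# revert to standalone mode, which scans and produces reports correctly
helm upgrade trivy-operator aqua/trivy-operator -n trivy-system \
  --set operator.builtInTrivyServer=false \
  --set trivy.mode=Standalone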

Labels: kind/bug