EKS fails to pull images from ECR after upgrading to Knative Serving 1.17.0 #15778

Open
@ssagi118

What version of Knative?

Knative Serving v1.17.0 on Kubernetes 1.32 (EKS)

Expected Behavior

I have an image in ECR containing a simple application that exposes a REST endpoint.
I use the following YAML to deploy it on EKS:

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: dummy-model
  namespace: default
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/min-scale: "1"
    spec:
      containers:
      - image: 111111111111.dkr.ecr.eu-west-1.amazonaws.com/dummy-model:1.5
        ports:
        - containerPort: 4000

This works perfectly with Knative Serving v1.16.2: the image is pulled, the revision, deployment, and pod are created, and the application responds to REST calls.

Actual Behavior

After upgrading to Knative Serving v1.17.0 and deploying the same application on the same EKS cluster, the output of kn revisions list is:

NAME                SERVICE       TRAFFIC   TAGS   GENERATION   AGE   CONDITIONS   READY   REASON
dummy-model-00001   dummy-model                    1            3s    0 OK / 3     False   ContainerMissing : Unable to fetch image "1111 ...

The controller log (kubectl logs controller-cc7d86698-p648q -n knative-serving) shows the following:

"severity": "ERROR",
    "timestamp": "2025-02-14T09:17:33.629925989Z",
    "logger": "controller",
    "caller": "controller/controller.go:564",
    "message": "Reconcile error",
    "commit": "6265a8e",
    "knative.dev/pod": "controller-cc7d86698-p648q",
    "knative.dev/controller": "knative.dev.serving.pkg.reconciler.revision.Reconciler",
    "knative.dev/kind": "serving.knative.dev.Revision",
    "knative.dev/traceid": "544e9974-8628-4fd9-a08c-2822eca4a357",
    "knative.dev/key": "default/dummy-model-00001",
    "duration": "253.88µs",
    "error": "Unable to fetch image \"111111111111.dkr.ecr.eu-west-1.amazonaws.com/dummy-model:1.5\": failed to resolve image to digest: HEAD https://111111111111.dkr.ecr.eu-west-1.amazonaws.com/v2/dummy-model/manifests/1.5: unexpected status code 401 Unauthorized (HEAD responses have no body, use GET for details)",
    "stacktrace": "knative.dev/pkg/controller.(*Impl).handleErr\n\tknative.dev/[email protected]/controller/controller.go:564\nknative.dev/pkg/controller.(*Impl).processNextWorkItem\n\tknative.dev/[email protected]/controller/controller.go:541\nknative.dev/pkg/controller.(*Impl).RunContext.func3\n\tknative.dev/[email protected]/controller/controller.go:489"

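For readers matching the error to the manifest, here is a minimal sketch of how the image reference maps to the URL in the failing HEAD request. It assumes the standard registry v2 layout (`/v2/<repository>/manifests/<tag>`); the helper name `manifest_url` is hypothetical and is not part of Knative. It handles only plain `registry/repository:tag` references, not digests.

```python
def manifest_url(image: str) -> str:
    """Build the registry v2 manifest URL for a `registry/repository:tag` reference.

    Simplified on purpose: digest references (`@sha256:...`) are not handled.
    """
    # Everything before the first "/" is treated as the registry host.
    registry, _, remainder = image.partition("/")
    # The tag, if present, follows the last ":" in the rest of the reference.
    repository, sep, tag = remainder.rpartition(":")
    if not sep:
        # No tag given: fall back to the conventional default tag.
        repository, tag = remainder, "latest"
    return f"https://{registry}/v2/{repository}/manifests/{tag}"


image = "111111111111.dkr.ecr.eu-west-1.amazonaws.com/dummy-model:1.5"
print(manifest_url(image))
# https://111111111111.dkr.ecr.eu-west-1.amazonaws.com/v2/dummy-model/manifests/1.5
```

The printed URL is the one in the "error" field above, so the 401 is returned by ECR while the controller resolves the tag to a digest, before any pod tries to pull the image.
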
Steps to Reproduce the Problem

Create an EKS cluster running Kubernetes v1.32 using the standard HashiCorp Terraform script.
Build an image with a dummy application, push it to ECR, and deploy it as a Knative Service on Knative Serving v1.17.0.

Labels: kind/bug, triage/accepted