Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add re-try mechanims within signing manifest when using recursive #11339

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

see-quick
Copy link
Member

@see-quick see-quick commented Apr 9, 2025

Type of change

  • Bugfix
  • Enhancement / new feature
  • Refactoring

Description

Why

Fixes [1]:

Error: signing [quay.io/strimzi/kaniko-executor@sha256:ef1a39c363e145041d80103c3c12da9429ce06cf21dff6fb1fb75d0c0ed9c35b]: recursively signing: signing digest:  context deadline exceeded
main.go:74: error during command execution: signing [quay.io/strimzi/kaniko-executor@sha256:ef1a39c363e145041d80103c3c12da9429ce06cf21dff6fb1fb75d0c0ed9c35b]: recursively signing: signing digest: context deadline exceeded
make[1]: *** [../../Makefile.docker:77: docker_sign_manifest_default] Error 1
make: *** [Makefile:185: docker-images/kaniko-executor] Error 2
make[1]: Leaving directory '/home/vsts/work/1/s/docker-images/kaniko-executor'

This PR adds a retry mechanism because of the dozens of problems we hit during signing manifests. I have done only change to

cosign sign --recursive --tlog-upload=false -a author=StrimziCI -a BuildID=$(BUILD_ID) -a Commit=$(BUILD_COMMIT) --key cosign.key $(DOCKER_REGISTRY)/$(DOCKER_ORG)/$(PROJECT_NAME)@$$MANIFEST_DIGEST
, because I only saw it fail in that part.

These parts I thin,k should be safe:

cosign sign --tlog-upload=false -a author=StrimziCI -a BuildID=$(BUILD_ID) -a Commit=$(BUILD_COMMIT) --key cosign.key --attachment sbom $(DOCKER_REGISTRY)/$(DOCKER_ORG)/$(PROJECT_NAME)@$$MANIFEST_DIGEST

and

cosign sign-blob --tlog-upload=false --key cosign.key --bundle $(SBOM_DIR)/$(DOCKER_REGISTRY)/$(DOCKER_ORG)/$(PROJECT_NAME)/$(DOCKER_TAG)/$$MANIFEST_DIGEST.txt.bundle $(SBOM_DIR)/$(DOCKER_REGISTRY)/$(DOCKER_ORG)/$(PROJECT_NAME)/$(DOCKER_TAG)/$$MANIFEST_DIGEST.txt
MANIFEST_DIGEST=$(shell docker buildx imagetools inspect $(DOCKER_REGISTRY)/$(DOCKER_ORG)/$(PROJECT_NAME):$(DOCKER_TAG)$(DOCKER_PLATFORM_TAG_SUFFIX) --format '{{ json . }}' | jq -r .manifest.digest); \
cosign sign-blob --tlog-upload=false --key cosign.key --bundle $(SBOM_DIR)/$(DOCKER_REGISTRY)/$(DOCKER_ORG)/$(PROJECT_NAME)/$(DOCKER_TAG)/$$MANIFEST_DIGEST.json.bundle $(SBOM_DIR)/$(DOCKER_REGISTRY)/$(DOCKER_ORG)/$(PROJECT_NAME)/$(DOCKER_TAG)/$$MANIFEST_DIGEST.json
@rm cosign.key

[1] - https://dev.azure.com/cncf/strimzi/_build/results?buildId=183956&view=logs&j=3d72a2f4-aa53-5c85-1963-3e9abd2e3bb0&t=16fddc62-1491-5038-9cc9-c79f7f3fde48

Checklist

  • Make sure all tests pass

@see-quick see-quick added this to the 0.46.0 milestone Apr 9, 2025
@see-quick see-quick requested a review from a team April 9, 2025 09:33
@see-quick see-quick self-assigned this Apr 9, 2025
Copy link
Member

@im-konge im-konge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just one question.

MANIFEST_DIGEST=$(shell docker buildx imagetools inspect $(DOCKER_REGISTRY)/$(DOCKER_ORG)/$(PROJECT_NAME):$(DOCKER_TAG) --format '{{ json . }}' | jq -r .manifest.digest); \
cosign sign --recursive --tlog-upload=false -a author=StrimziCI -a BuildID=$(BUILD_ID) -a Commit=$(BUILD_COMMIT) --key cosign.key $(DOCKER_REGISTRY)/$(DOCKER_ORG)/$(PROJECT_NAME)@$$MANIFEST_DIGEST
@rm cosign.key
RETRIES=5; \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a suggestion -> should 3 be enough in this case?

@ppatierno
Copy link
Member

ppatierno commented Apr 10, 2025

While I think the proposed change could work, I was wondering if just increasing the timeout for the command, by using --timeout (default is 3 minutes, doc here https://github.com/sigstore/cosign/blob/main/doc/cosign_sign.md) would be enough instead of making our code more complex.
I also think the issue is with --recursive signing, for this reason we don't need a fix on the other commands because the issue doesn't happen.
@see-quick would you mind to try with a separate PR?

@see-quick
Copy link
Member Author

While I think the proposed change could work, I was wondering if just increasing the timeout for the command, by using --timeout (default is 3 minutes, doc here https://github.com/sigstore/cosign/blob/main/doc/cosign_sign.md) would be enough instead of making our code more complex. I also think the issue is with --recursive signing, for this reason we don't need a fix on the other commands because the issue doesn't happen. @see-quick would you mind to try with a separate PR?

Yeah, we can try increasing the timeout as a first step. That's a good point. I will create a separate PR.

@see-quick
Copy link
Member Author

see-quick commented Apr 10, 2025

While I think the proposed change could work, I was wondering if just increasing the timeout for the command, by using --timeout (default is 3 minutes, doc here https://github.com/sigstore/cosign/blob/main/doc/cosign_sign.md) would be enough instead of making our code more complex. I also think the issue is with --recursive signing, for this reason we don't need a fix on the other commands because the issue doesn't happen. @see-quick would you mind to try with a separate PR?

Yeah, we can try increasing the timeout as a first step. That's a good point. I will create a separate PR.

Created #11342.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants