pipelines/git,fetch: obscure half of found git commit and sha to avoid copy pasting #2458

Open
vishal-chdhry wants to merge 1 commit into chainguard-dev:main from vishal-chdhry:obscure-sha-commit

@vishal-chdhry
Member

Currently, the fetch and git-checkout pipelines return the entire commit SHA and the fetched object's SHA.

This encourages copy-pasting the SHA on package updates instead of verifying the right SHA from the source. If a human or an agent gets the URL wrong, or if the URL is bypassed, copy-pasting can lead to malicious builds.

…d copy pasting

Signed-off-by: Vishal Choudhary <vishal.choudhary@chainguard.dev>
Member

@EyeCantCU left a comment

While I agree the source needs to be verified, I believe this makes things harder for humans in favor of improving agentic behavior, and I don't really like that. I'm open to hearing other arguments, though.

@vishal-chdhry
Member Author

@EyeCantCU humans should not copy-paste that SHA either; that is also a security risk.

@EyeCantCU
Member

As I said, I agree, but it provides immediate visibility into the SHAs that were used, making it, even if just a bit, easier to track things down

@vishal-chdhry
Member Author

> but it provides immediate visibility into the SHAs that were used, making it, even if just a bit, easier to track things down

I am obscuring only the second half of the SHA to ensure it's possible to check it against the right value once a human has computed it.

I agree that this makes it harder by requiring the computation of the SHA, but it makes our pipelines more secure in my opinion
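
The half-obscuring idea described in this comment can be sketched as follows. This is a minimal Python illustration, not the change made in this PR; the function names `obscure_sha` and `matches_visible_prefix`, the `*` mask character, and the sample SHA are all assumptions made for the example.

```python
def obscure_sha(sha: str, mask: str = "*") -> str:
    """Keep the first half of a hex digest and mask the second half."""
    keep = len(sha) // 2
    return sha[:keep] + mask * (len(sha) - keep)


def matches_visible_prefix(obscured: str, candidate: str, mask: str = "*") -> bool:
    """Check an independently obtained SHA against the visible half.

    The candidate must have the same length as the obscured value and
    start with the characters that were left visible.
    """
    visible = obscured.rstrip(mask)
    return (
        len(candidate) == len(obscured)
        and candidate.lower().startswith(visible.lower())
    )


# Made-up 40-character commit SHA, used only for illustration.
found = "0123456789abcdef0123456789abcdef01234567"
shown = obscure_sha(found)
print(shown)
print(matches_visible_prefix(shown, found))
```

The visible half is enough to confirm a SHA obtained elsewhere, but not enough to paste into a config as the expected value.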

@EyeCantCU
Member

EyeCantCU commented Apr 2, 2026

> I am obscuring only the second half of the SHA to ensure it's possible to check it against the right value once a human has computed it.

Alternatively, we could adjust the output to link to the tree of the found commit for validation. This is the kind of assessment I don't expect is easy for agents anyway (at least right now), and I'd almost expect a human to do this themselves.

> I agree that this makes it harder by requiring the computation of the SHA, but it makes our pipelines more secure in my opinion

I'm really interested in knowing how. We embed the commit we've built from in both pkginfo and the compiled melange manifest packaged with every APK. This is also fronted before a package is ever built or published anywhere (as the found commit would have to be wrong, likely because it moved)

@vishal-chdhry
Member Author

> Alternatively, we could adjust the output to link to the tree of the found commit for validation. This is the kind of assessment I don't expect is easy for agents anyway (at least right now), and I'd almost expect a human to do this themselves.

Wouldn't the link to the GitHub tree at the commit contain the commit in the URL? 🤔

> I'm really interested in knowing how.

If our build pipelines and update infra get compromised, or the upstream registry gets compromised, updating using the SHA computed from the pulled malicious source will put us at risk. Even if there is a human in the loop, it's very unlikely they will question the commit hash if it builds. If we rely on a trusted source instead, for example the SHAs in the NVIDIA distribution manifests or the GitHub SHA, that's more secure.
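
The distinction drawn here, between a SHA computed from whatever was fetched and a SHA published by a trusted source, can be sketched as below. This is a hedged illustration only: `verify_against_trusted` is a hypothetical helper, and the "trusted" digest is simply the well-known SHA-256 of the bytes `abc`, standing in for a value read from an upstream manifest.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


def verify_against_trusted(data: bytes, trusted_sha256: str) -> bool:
    """Compare fetched bytes against a digest obtained out of band.

    trusted_sha256 is assumed to come from the upstream's published
    manifest or release page, never from the build output itself.
    """
    expected = trusted_sha256.strip().lower()
    return hmac.compare_digest(sha256_hex(data), expected)


# Stand-in for a digest published by a trusted source (SHA-256 of b"abc").
published = "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"

print(verify_against_trusted(b"abc", published))  # genuine artifact
print(verify_against_trusted(b"abd", published))  # tampered artifact
```

If the expected value were instead copied from the digest of the fetched bytes, the check would pass for the tampered artifact as well, which is the failure mode this comment describes.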

@EyeCantCU
Member

I just want to say thank you for dealing with me here. I appreciate you

> Wouldn't the link to the GitHub tree at the commit contain the commit in the URL? 🤔

It would, but it would provide a mechanism for validation that's helpful and drives people in the right direction.

> If our build pipelines and update infra get compromised, or the upstream registry gets compromised, updating using the SHA computed from the pulled malicious source will put us at risk. Even if there is a human in the loop, it's very unlikely they will question the commit hash if it builds. If we rely on a trusted source instead, for example the SHAs in the NVIDIA distribution manifests or the GitHub SHA, that's more secure.

I want to be wrong, but as far as I understand, this PR doesn't address that problem. It obscures the found SHA. It doesn't enforce verification of the source.

@vishal-chdhry
Member Author

Thank you for taking the time to review this! It is really helpful in understanding what the right solution should be here.

> It would, but it would provide a mechanism for validation that's helpful and drives people in the right direction.

Yes, I agree, but the agents would parse the URL in this case and short-circuit the verification process.
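
To make the concern concrete: a GitHub tree link has the form `https://github.com/<owner>/<repo>/tree/<commit>`, so the full commit is recoverable from the link with a one-line pattern. The sketch below assumes a made-up owner, repository, and SHA; `tree_url` and `extract_sha` are hypothetical helper names.

```python
import re
from typing import Optional

# Matches a full 40-character hex commit SHA in a GitHub tree URL.
SHA_IN_TREE_URL = re.compile(r"/tree/([0-9a-f]{40})\b")


def tree_url(owner: str, repo: str, sha: str) -> str:
    """Build a GitHub tree URL for a given commit."""
    return f"https://github.com/{owner}/{repo}/tree/{sha}"


def extract_sha(url: str) -> Optional[str]:
    """Pull the commit SHA back out of a tree URL, if present."""
    match = SHA_IN_TREE_URL.search(url)
    return match.group(1) if match else None


sha = "0123456789abcdef0123456789abcdef01234567"  # made-up SHA
url = tree_url("example-org", "example-repo", sha)
print(url)
print(extract_sha(url) == sha)
```

Printing the link therefore exposes the same copy-pasteable value that obscuring the SHA is meant to withhold.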

> I want to be wrong, but as far as I understand, this PR doesn't address that problem. It obscures the found SHA. It doesn't enforce verification of the source.

My understanding here is that obscuring the SHA makes it effectively unusable, so the user will have to check the correct source (GitHub commits or NVIDIA distribution manifests) to get the SHA. If the SHA provided by the manifest is different from the one the build pipeline found, the build pipeline has issues. Using the SHA printed in the error to update the expected commit/SHA check bypasses some of the benefits of that check.

2 participants