pipelines/git,fetch: obscure half of found git commit and sha to avoid copy pasting #2458
vishal-chdhry wants to merge 1 commit into chainguard-dev:main
…d copy pasting Signed-off-by: Vishal Choudhary <vishal.choudhary@chainguard.dev>
EyeCantCU
left a comment
While I agree the source needs to be verified, I believe this makes things harder for humans in favor of improving agentic behavior and I don't really like that. I'm open to hearing other arguments though
@EyeCantCU humans should not copy-paste that SHA either; that is also a security risk.
As I said, I agree, but it provides immediate visibility into the SHAs that were used, making it, even if just a bit, easier to track things down.
I am obscuring only the second half of the SHA so that it can still be checked against the right value once a human has computed it. I agree that this makes things harder by requiring the SHA to be computed, but in my opinion it makes our pipelines more secure.
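For illustration, the masking described here could look roughly like the sketch below. This is not the code from the PR: the function name `obscure_second_half`, the `*` mask character, and the sample SHA are all made up for the example.

```python
def obscure_second_half(sha: str, mask: str = "*") -> str:
    """Keep the first half of a hex digest and mask the remainder.

    Hypothetical helper for illustration; not taken from melange.
    """
    keep = len(sha) // 2
    return sha[:keep] + mask * (len(sha) - keep)


# Made-up 40-character commit SHA, used only as sample input.
full = "0123456789abcdef0123456789abcdef01234567"
print(obscure_second_half(full))
```

The visible half is still enough to confirm that an independently obtained SHA is the same one the pipeline found, while the printed value cannot be pasted back into the config as-is.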
Alternatively, we could adjust the output to link to the tree of the found commit for validation. This is the kind of assessment I don't expect is easy for agents anyway (at least right now), and I'd almost expect a human to do this themselves.
I'm really interested in knowing how. We embed the commit we've built from in both pkginfo and the compiled melange manifest packaged with every APK. This is also fronted before a package is ever built or published anywhere (as the found commit would have to be wrong, likely because it moved).
Wouldn't the link to the GitHub tree at the commit contain the commit in the URL 🤔
If our build pipelines and update infra get compromised, or the upstream registry gets compromised, updating with the SHA computed from the pulled malicious source puts us at risk. Even with a human in the loop, it's very unlikely they will question the commit hash if it builds. Relying on a trusted source, for example the SHAs in the NVIDIA distribution manifests or the GitHub SHA, is more secure.
I just want to say thank you for dealing with me here. I appreciate you
It would, but would provide a mechanism for validation that's helpful and drives people in the right direction
I want to be wrong, but as far as I understand, this PR doesn't address that problem. It obscures the found SHA. It doesn't enforce verification of the source.
Thank you for taking the time to review this! It is really helpful in understanding what the right solution should be here
Yes, I agree, but the agents would parse the URL in this case, and short-circuit the verification process
My understanding is that obscuring the sha makes it effectively unusable: the user has to check the correct source (GitHub commits or NVIDIA distribution manifests) to get the sha. If the sha provided by the manifest differs from the one the build found, the pipeline has issues. Using the sha from the error to update the expected commit/sha check bypasses some of the benefits of that check.
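As a rough sketch of that workflow: a SHA taken from a trusted source can be compared against the visible half of the obscured value in the error. The helper names and the `*` mask below are assumptions for illustration, not melange's actual output format or API; only `hashlib.sha256` is a real library call.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 hex digest of some bytes."""
    return hashlib.sha256(data).hexdigest()


def matches_visible_half(trusted: str, obscured: str, mask: str = "*") -> bool:
    """Check a trusted digest against the unmasked prefix of an obscured one.

    Hypothetical helper: assumes the obscured value keeps a leading prefix
    and replaces the rest with `mask` characters.
    """
    visible = obscured.rstrip(mask)
    return len(trusted) == len(obscured) and trusted.startswith(visible)


# Stand-in for a digest published by a trusted source (sample bytes only).
trusted = sha256_hex(b"example artifact bytes")
# Stand-in for what an error message might show after obscuring.
obscured = trusted[:32] + "*" * 32

print(matches_visible_half(trusted, obscured))
```

A match only tells the user that the trusted value and the fetched one agree on the visible half; the full value written into the config still has to come from the trusted source.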
Currently, the fetch and git-checkout pipelines return the entire commit sha and the fetched object sha. This encourages copy-pasting the sha on package updates instead of verifying the right sha against the source. If a human or an agent gets the URL wrong, or if the URL is bypassed, copy-pasting can lead to malicious builds.