Skip to content

Support DigestFunction in the Remote Asset API#57

Merged
mortenmj merged 3 commits intomasterfrom
coldtom/digest-function
Mar 14, 2025
Merged

Support DigestFunction in the Remote Asset API#57
mortenmj merged 3 commits intomasterfrom
coldtom/digest-function

Conversation

@tomcoldrick-ct
Copy link
Collaborator

The Remote Asset API now includes parameters for the digest function used everywhere, which allows us to correct a lot of our usage. This PR handles all of that correction. I've made some API changes to the Asset store, so that we can pass digest.Function around instead of digest.InstanceName, which simplifies computing digests everywhere. As part of that, I've also removed a load of cruft we had using the REAPI objects directly instead of the nice digest/buffer APIs from bb-storage.

The HTTP fetcher required an additional bit of work to extricate our handling of the checksum.sri validator. Before the API allowed specifying an digest function, we used this to work out which digest function to use. This is now incorrect, and will try to do unexpected things, especially in the case you're using an Action Cache Asset Store. The behaviour is now corrected to only use checksum.sri to validate. In future we should handle the checksum generically on the returned Digest, rather than only in the HTTP fetcher.

Fixes #54

Depends on #56

@tomcoldrick-ct tomcoldrick-ct force-pushed the coldtom/digest-function branch from 54b0ff1 to c9a4410 Compare March 12, 2025 16:13
@mortenmj mortenmj self-requested a review March 14, 2025 12:29
The Remote Asset API now supports specifying the Digest Function to use in all
messages.  Let's implement support for that, so that we can avoid breaking users
that want to use a specific digest function.

In doing this, I've made some other modifications, which also improve
correctness.

1. We now pass digest.Functions into the Asset Store, rather than
   digest.InstanceNames.  This lets us use the passed digest function instead of
   guessing/just using SHA256.
2. We use the bb-storage digest/buffer APIs everywhere, instead of using
   REAPI objects and converting
3. Thanks to (2), we now only need a single ProtoSerialise method, that converts
   a proto to a Buffer and Digest.

In addition to being more correct, this has hugely simplified things in the
Remote Execution fetcher and Action Cache asset store.
When we implemented HTTP fetching, there was no digest_function field in the
requests, so we used checksum.sri to choose one.  The API has now changed to
explicitly include a digest_function, so using the checksum.sri is actually
incorrect behaviour -- if a client sets digest_function to SHA256, and specifies
a SHA512 checksum, we should validate using the SHA512 checksum, but return a
SHA256 digest.

This change in API may need a wider rework of how checksum validation works.  We
should probably add a decorator fetcher that does this, pulling the object from
CAS based on the inner fetcher's response, and validating that.  This would be
more correct, but at the expense of additional network I/O.
These changes should make the tests less brittle, because it stops us needlessly
mocking the io.ReadCloser we pass around.  Instead we pass a buffer of the
actual test data.
@tomcoldrick-ct tomcoldrick-ct force-pushed the coldtom/digest-function branch from c9a4410 to 1f8b78b Compare March 14, 2025 12:35
Copy link
Collaborator

@mortenmj mortenmj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@mortenmj mortenmj merged commit 739510e into master Mar 14, 2025
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Fetches fail when bazel specifies digests with anything other than sha256

2 participants

Comments