@@ -440,13 +440,21 @@ service ContentAddressableStorage {
440440 option (google.api.http) = { get: "/v2/{instance_name=**}/blobs/{root_digest.hash}/{root_digest.size_bytes}:getTree" };
441441 }
442442
443- // Split a blob into chunks.
444- //
445- // This call splits a blob into chunks, stores the chunks in the CAS, and
446- // returns a list of the chunk digests. Using this list, a client can check
447- // which chunks are locally available and just fetch the missing ones. The
448- // desired blob can be assembled by concatenating the fetched chunks in the
449- // order of the digests in the list.
443+ // Split retrieves information about how a blob is split into chunks.
444+ //
445+ // This call returns a list of chunk digests that describes how a blob is
446+ // split into chunks. The server returns a known manifest
447+ // if one exists for the requested blob digest. If no manifest exists, the
448+ // server MAY compute a new one by performing chunking on-demand if the blob
449+ // exists in the CAS, though this should be rare as manifests are typically
450+ // created when blobs are written. The original blob digest does not need to
451+ // be present in the CAS for this call to succeed if a manifest already exists.
452+ //
453+ // Using the returned list of chunk digests, a client can check which chunks
454+ // are locally available and just fetch the missing ones. The desired blob can
455+ // be assembled by concatenating the fetched chunks in the order of the
456+ // digests in the list. ZSTD-compressed chunks can be concatenated without
457+ // decompression.
450458 //
451459 // This rpc can be used to reduce the required data to download a large blob
452460 // from CAS if chunks from earlier downloads of a different version of this
@@ -459,6 +467,8 @@ service ContentAddressableStorage {
459467 // 1. The blob chunks are stored in CAS.
460468 // 2. Concatenating the blob chunks in the order of the digest list returned
461469 // by the server results in the original blob.
470+ //
471+ // A client SHOULD NOT expect that the original blob is available.
462472 //
463473 // Servers which implement this functionality MUST declare that they support
464474 // it by setting the
@@ -491,26 +501,43 @@ service ContentAddressableStorage {
491501 //
492502 // Errors:
493503 //
494- // * `NOT_FOUND`: The requested blob is not present in the CAS.
504+ // * `NOT_FOUND`: No manifest exists for the blob and the blob is not present
505+ // in the CAS.
495506 // * `RESOURCE_EXHAUSTED`: There is insufficient disk quota to store the blob
496507 // chunks.
497508 rpc SplitBlob(SplitBlobRequest) returns (SplitBlobResponse) {
498509 option (google.api.http) = { get: "/v2/{instance_name=**}/blobs/{blob_digest.hash}/{blob_digest.size_bytes}:splitBlob" };
499510 }
500511
501- // Splice a blob from chunks.
512+ // Splice tells the CAS how chunks compose a blob.
502513 //
503514 // This is the complementary operation to the
504515 // [ContentAddressableStorage.SplitBlob][build.bazel.remote.execution.v2.ContentAddressableStorage.SplitBlob]
505516 // function to handle the chunked upload of large blobs to save upload
506517 // traffic.
507518 //
519+ // When uploading a large blob using chunked upload, clients MUST first upload
520+ // all chunks to the CAS, then call this RPC to store a manifest that describes
521+ // how those chunks compose the original blob. The chunks referenced in the
522+ // manifest must be available in the CAS before calling this RPC. The original
523+ // blob does not need to be available: the manifest can be validated by
524+ // verifying that the referenced chunks exist in the CAS and match
525+ // their specified digests.
526+ //
527+ // This RPC stores or updates a manifest that describes how a blob is split
528+ // into chunks and contains the digests of those chunks. The server is not
529+ // expected to actually stitch the blob together when this RPC is called;
530+ // instead, it stores and validates the manifest. The blob is not actually
531+ // spliced together until ByteStream.Read is called on the original blob digest.
532+ // The server SHOULD also communicate the chunking configuration used, so that
533+ // it can be changed or overwritten to a more optimized chunking in the future.
534+ //
508535 // If a client needs to upload a large blob and is able to split a blob into
509536 // chunks in such a way that reusable chunks are obtained, e.g., by means of
510537 // content-defined chunking, it can first determine which parts of the blob
511538 // are already available in the remote CAS and upload the missing chunks, and
512- // then use this API to instruct the server to splice the original blob from
513- // the remotely available blob chunks.
539+ // then use this API to store the manifest describing how the chunks compose
540+ // the original blob.
514541 //
515542 // Servers which implement this functionality MUST declare that they support
516543 // it by setting the
@@ -522,11 +549,13 @@ service ContentAddressableStorage {
522549 //
523550 // In order to ensure data consistency of the CAS, the server MUST only add
524551 // blobs to the CAS after verifying their digests. In particular, servers MUST NOT
525- // trust digests provided by the client. The server MAY accept a request as no-op
526- // if the client-specified blob is already in CAS; the lifetime of that blob SHOULD
527- // be extended as usual. If the client-specified blob is not already in the CAS,
528- // the server SHOULD verify that the digest of the newly created blob matches the
529- // digest specified by the client, and reject the request if they differ.
552+ // trust digests provided by the client. The server MUST validate the manifest
553+ // and verify that all referenced chunks exist in the CAS and match their
554+ // specified digests. The server MAY verify that concatenating the
555+ // chunks results in a blob matching the original blob digest, particularly if
556+ // the client is not trusted. The server MAY accept a request as a no-op if a
557+ // manifest for the client-specified blob already exists; the lifetime of that
558+ // manifest and its chunks SHOULD be extended as usual.
530559 //
531560 // When blob splitting and splicing is used at the same time, the clients and
532561 // the server SHOULD agree out-of-band upon a chunking algorithm used by both
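To make the semantics described in the new comments concrete, here is a minimal Python sketch of both sides. It is not part of the proto or of any real server: the `Digest` tuple, `validate_manifest`, `assemble_blob`, the dict-based CAS, and the choice of SHA-256 are all assumptions made for illustration. The check that chunk sizes sum to the blob size is likewise an assumed interpretation of "validate the manifest", not something the diff spells out.

```python
import hashlib
from typing import Callable, Dict, List, NamedTuple


class Digest(NamedTuple):
    """Hypothetical stand-in for the Digest message (hash + size_bytes)."""
    hash: str
    size_bytes: int


def digest_of(data: bytes) -> Digest:
    # Assumption: SHA-256 is the digest function in use.
    return Digest(hashlib.sha256(data).hexdigest(), len(data))


def validate_manifest(
    cas: Dict[Digest, bytes],
    blob_digest: Digest,
    chunk_digests: List[Digest],
    verify_blob: bool = False,
) -> bool:
    """Server-side sketch of the checks a splice request could trigger."""
    total_size = 0
    hasher = hashlib.sha256()
    for chunk_digest in chunk_digests:
        chunk = cas.get(chunk_digest)
        # Every referenced chunk must exist and match its stated digest.
        if chunk is None or digest_of(chunk) != chunk_digest:
            return False
        total_size += len(chunk)
        if verify_blob:
            hasher.update(chunk)
    # Assumed structural check: chunk sizes must add up to the blob size.
    if total_size != blob_digest.size_bytes:
        return False
    # Optional full check, e.g. when the client is not trusted.
    if verify_blob and hasher.hexdigest() != blob_digest.hash:
        return False
    return True


def assemble_blob(
    local_chunks: Dict[Digest, bytes],
    chunk_digests: List[Digest],
    fetch: Callable[[Digest], bytes],
) -> bytes:
    """Client-side sketch: reuse local chunks and fetch only the missing ones."""
    parts = []
    for chunk_digest in chunk_digests:
        chunk = local_chunks.get(chunk_digest)
        if chunk is None:
            chunk = fetch(chunk_digest)
            local_chunks[chunk_digest] = chunk
        parts.append(chunk)
    # Concatenation in manifest order yields the original blob.
    return b"".join(parts)
```

Note that `validate_manifest` never reads the original blob, which mirrors the statement that the blob itself does not need to be present in the CAS when a manifest is stored.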