Commit 9541618

update split and splice docs
1 parent 3860ca2 commit 9541618

1 file changed: +45 -16 lines changed

build/bazel/remote/execution/v2/remote_execution.proto

Lines changed: 45 additions & 16 deletions
@@ -440,13 +440,21 @@ service ContentAddressableStorage {
  option (google.api.http) = { get: "/v2/{instance_name=**}/blobs/{root_digest.hash}/{root_digest.size_bytes}:getTree" };
  }
  
- // Split a blob into chunks.
- //
- // This call splits a blob into chunks, stores the chunks in the CAS, and
- // returns a list of the chunk digests. Using this list, a client can check
- // which chunks are locally available and just fetch the missing ones. The
- // desired blob can be assembled by concatenating the fetched chunks in the
- // order of the digests in the list.
+ // Split retrieves information about how a blob is split into chunks.
+ //
+ // This call returns information about how a blob is split into chunks, and
+ // returns a list of the chunk digests. The server returns a known manifest
+ // if one exists for the requested blob digest. If no manifest exists, the
+ // server MAY compute a new one by performing chunking on-demand if the blob
+ // exists in the CAS, though this should be rare as manifests are typically
+ // created when blobs are written. The original blob digest does not need to
+ // be present in the CAS for this call to succeed if a manifest already exists.
+ //
+ // Using the returned list of chunk digests, a client can check which chunks
+ // are locally available and just fetch the missing ones. The desired blob can
+ // be assembled by concatenating the fetched chunks in the order of the
+ // digests in the list. ZSTD-compressed chunks can be concatenated without
+ // decompression.
  //
  // This rpc can be used to reduce the required data to download a large blob
  // from CAS if chunks from earlier downloads of a different version of this
@@ -459,6 +467,8 @@ service ContentAddressableStorage {
  // 1. The blob chunks are stored in CAS.
  // 2. Concatenating the blob chunks in the order of the digest list returned
  // by the server results in the original blob.
+ //
+ // A client should NOT expect that the original blob is available.
  //
  // Servers which implement this functionality MUST declare that they support
  // it by setting the
@@ -491,26 +501,43 @@ service ContentAddressableStorage {
  //
  // Errors:
  //
- // * `NOT_FOUND`: The requested blob is not present in the CAS.
+ // * `NOT_FOUND`: No manifest exists for the blob and the blob is not present
+ // in the CAS.
  // * `RESOURCE_EXHAUSTED`: There is insufficient disk quota to store the blob
  // chunks.
  rpc SplitBlob(SplitBlobRequest) returns (SplitBlobResponse) {
  option (google.api.http) = { get: "/v2/{instance_name=**}/blobs/{blob_digest.hash}/{blob_digest.size_bytes}:splitBlob" };
  }
  
- // Splice a blob from chunks.
+ // Splice tells the CAS how chunks compose a blob.
  //
  // This is the complementary operation to the
  // [ContentAddressableStorage.SplitBlob][build.bazel.remote.execution.v2.ContentAddressableStorage.SplitBlob]
  // function to handle the chunked upload of large blobs to save upload
  // traffic.
  //
+ // When uploading a large blob using chunked upload, clients MUST first upload
+ // all chunks to the CAS, then call this RPC to store a manifest that describes
+ // how those chunks compose the original blob. The chunks referenced in the
+ // manifest must be available in the CAS before calling this RPC. The original
+ // blob does not need to be available: the correctness of the manifest can be
+ // validated from manifest correctness and by verifying that the chunks match
+ // their specified digests.
+ //
+ // This RPC stores or updates a manifest that describes how a blob is split
+ // into chunks and contains the digests of those chunks. The server is not
+ // expected to actually stitch the blob together when this RPC is called;
+ // instead, it stores and validates the manifest. The blob is not actually
+ // spliced together until BS/Read is called on the original blob digest.
+ // The server SHOULD also communicate the chunking configuration used, so that
+ // it can be changed or overwritten to a more optimized chunking in the future.
+ //
  // If a client needs to upload a large blob and is able to split a blob into
  // chunks in such a way that reusable chunks are obtained, e.g., by means of
  // content-defined chunking, it can first determine which parts of the blob
  // are already available in the remote CAS and upload the missing chunks, and
- // then use this API to instruct the server to splice the original blob from
- // the remotely available blob chunks.
+ // then use this API to store the manifest describing how the chunks compose
+ // the original blob.
  //
  // Servers which implement this functionality MUST declare that they support
  // it by setting the
@@ -522,11 +549,13 @@ service ContentAddressableStorage {
  //
  // In order to ensure data consistency of the CAS, the server MUST only add
  // blobs to the CAS after verifying their digests. In particular, servers MUST NOT
- // trust digests provided by the client. The server MAY accept a request as no-op
- // if the client-specified blob is already in CAS; the lifetime of that blob SHOULD
- // be extended as usual. If the client-specified blob is not already in the CAS,
- // the server SHOULD verify that the digest of the newly created blob matches the
- // digest specified by the client, and reject the request if they differ.
+ // trust digests provided by the client. The server MUST validate the manifest
+ // and verify that all referenced chunks exist in the CAS and match their
+ // specified digests. The server MAY optionally verify that concatenating the
+ // chunks results in a blob matching the original blob digest, particularly if
+ // the client is not trusted. The server MAY accept a request as no-op if a
+ // manifest for the client-specified blob already exists; the lifetime of that
+ // manifest and its chunks SHOULD be extended as usual.
  //
  // When blob splitting and splicing is used at the same time, the clients and
  // the server SHOULD agree out-of-band upon a chunking algorithm used by both
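
The client-side download flow described in the updated SplitBlob comment (check which chunks are available locally, fetch only the missing ones, concatenate in list order) could look roughly like the sketch below. It is not part of the commit. The `Digest` tuple, `digest_of`, `assemble_blob`, and the `fetch_missing` callback are hypothetical names, SHA-256 is assumed as the digest function, and chunks are assumed to be uncompressed.

```python
import hashlib
from typing import Callable, Dict, List, NamedTuple


class Digest(NamedTuple):
    """Simplified stand-in for the Digest message (hash, size_bytes)."""
    hash: str
    size_bytes: int


def digest_of(data: bytes) -> Digest:
    # Assumption: the server instance uses SHA-256 as its digest function.
    return Digest(hashlib.sha256(data).hexdigest(), len(data))


def assemble_blob(
    blob_digest: Digest,
    chunk_digests: List[Digest],
    local_chunks: Dict[Digest, bytes],
    fetch_missing: Callable[[List[Digest]], Dict[Digest, bytes]],
) -> bytes:
    """Rebuild a blob from the chunk digest list returned by SplitBlob.

    `fetch_missing` is a hypothetical callback standing in for whatever
    download call the client uses to read chunks from the CAS.
    """
    # Fetch each missing chunk once, even if it appears several times.
    missing = [d for d in dict.fromkeys(chunk_digests) if d not in local_chunks]
    if missing:
        local_chunks.update(fetch_missing(missing))
    # Concatenate in the order of the digests in the list.
    blob = b"".join(local_chunks[d] for d in chunk_digests)
    # The client verifies the result against the blob digest it asked for.
    if digest_of(blob) != blob_digest:
        raise ValueError("assembled blob does not match the requested digest")
    return blob
```

Because the client checks the assembled bytes against the blob digest it requested, this flow works even when the server holds only the manifest and the chunks, not the original blob.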

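
On the server side, the validation rules in the updated SpliceBlob comment (every referenced chunk MUST exist in the CAS and match its digest; checking the concatenation against the blob digest is optional) might be sketched as follows. This is an illustration under assumptions, not the commit's implementation: `Digest`, `digest_of`, and `validate_manifest` are hypothetical names, the CAS is modeled as an in-memory dict, SHA-256 is assumed, and the size-sum check is an extra consistency check added by this sketch.

```python
import hashlib
from typing import Dict, List, NamedTuple


class Digest(NamedTuple):
    """Simplified stand-in for the Digest message (hash, size_bytes)."""
    hash: str
    size_bytes: int


def digest_of(data: bytes) -> Digest:
    # Assumption: SHA-256 is the digest function in use.
    return Digest(hashlib.sha256(data).hexdigest(), len(data))


def validate_manifest(
    blob_digest: Digest,
    chunk_digests: List[Digest],
    cas: Dict[Digest, bytes],
    verify_blob: bool = False,
) -> None:
    """Raise if the manifest cannot be accepted; return None otherwise."""
    for d in chunk_digests:
        data = cas.get(d)
        if data is None:
            raise KeyError(f"chunk {d.hash} is not present in the CAS")
        # Never trust client-provided digests: recompute from stored bytes.
        if digest_of(data) != d:
            raise ValueError(f"chunk {d.hash} does not match its digest")
    # Sketch-only check: uncompressed chunk sizes must add up to the blob size.
    if sum(d.size_bytes for d in chunk_digests) != blob_digest.size_bytes:
        raise ValueError("chunk sizes do not add up to the blob size")
    if verify_blob:
        # Optional: hash the concatenation without materializing the blob.
        h = hashlib.sha256()
        for d in chunk_digests:
            h.update(cas[d])
        if h.hexdigest() != blob_digest.hash:
            raise ValueError("concatenated chunks do not match the blob digest")
```

Hashing the chunks incrementally keeps the optional full verification from stitching the blob together in memory, which fits the comment's point that the server only stores and validates the manifest at splice time.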