Umbrella for the plan and stacked PRs that close out #349.
#349 asked for a blessed way to compose stores. Surveying the real use cases, zarrita actually needs two composition points, not one, and the current `AsyncReadable<Options>` generic is a symptom of only having the wrong half.
This is the plan:
- `zarr.extendStore` for the transport layer (bytes in, bytes out)
- `zarr.extendArray` for the data layer (coords in, chunks out)
```js
import * as zarr from "zarrita";

// Transport layer: wrap a store with caching, batching, consolidated metadata.
let store = await zarr.extendStore(
  new zarr.FetchStore("https://example.com/data.zarr"),
  (s) => zarr.withConsolidation(s),
  (s) => zarr.withRangeBatching(s, { cacheSize: 512 }),
);

// Data layer: wrap an array with chunk caching, prefetch, observability.
let arr = await zarr.extendArray(
  await zarr.open(store, { kind: "array" }),
  (a) => withChunkCache(a, { cache: new Map() }),
);

await zarr.get(arr, [null, zarr.slice(0, 10)]);
```
Both extension points are built from the same factory primitive:
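A sketch of what that shared primitive could look like. The names `defineStoreMiddleware` and `extendStore` come from the plan, but every signature below is an assumption, written against a minimal local stand-in for zarrita's `AsyncReadable`:

```typescript
// Minimal readable-store shape, modeled on zarrita's AsyncReadable.
interface Readable {
  get(key: string): Promise<Uint8Array | undefined>;
}

// A store middleware is just a function from store to store.
type StoreMiddleware = (inner: Readable) => Readable;

// defineStoreMiddleware only pins down the wrapper shape...
function defineStoreMiddleware(wrap: StoreMiddleware): StoreMiddleware {
  return wrap;
}

// ...and extendStore folds the wrappers over a base store, left to
// right, so the last wrapper listed becomes the outermost layer.
function extendStore(base: Readable, ...layers: StoreMiddleware[]): Readable {
  return layers.reduce((store, layer) => layer(store), base);
}
```

The array-side pair (`defineArrayMiddleware`/`extendArray`) would be the same fold over a different interface, which is what makes one factory primitive enough for both layers.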
Before this refactor, zarrita only had a transport extension point, and even that was unofficial (subclass `FetchStore` or hand-roll an `AsyncReadable`). The data layer had no extension point at all, so chunk-level concerns were smuggled through `AsyncReadable<Options>`, a generic that threaded opaque per-call state from the call site all the way down to the store.
Once the data layer has its own extension point, the `Options` generic has no job. It goes away, and `signal` (the one thing anyone ever actually threaded) becomes a plain field.
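A before/after sketch of that type change; the `AsyncReadable` interfaces here are local stand-ins and the exact option-bag shape is an assumption:

```typescript
// Before: an Options generic threads opaque per-call state through
// every layer, whether or not any layer reads it.
interface AsyncReadableWithOptions<Options = unknown> {
  get(key: string, options?: Options): Promise<Uint8Array | undefined>;
}

// After: no generic. The one thing callers actually threaded is a
// named, well-known field.
interface RequestOptions {
  signal?: AbortSignal;
}
interface AsyncReadable {
  get(key: string, options?: RequestOptions): Promise<Uint8Array | undefined>;
}
```

With a named field, a wrapper can forward or inspect `signal` without needing to be generic over state it does not understand.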
## Why two layers
Every custom store, wrapper, cache, and hack in the zarrita ecosystem falls into one of two buckets.
Transport concerns operate on `(key, range) -> Uint8Array` and don't care about zarr's logical model:

- the custom `fetch` option from #388 ("Add custom `fetch` option to `FetchStore`")
- the `lru(store)` wrapper

Data concerns operate on `(chunkCoords) -> Chunk<T>` and don't care about paths or bytes:

- chunk caching for the `get()` function (#296, closed), which tried to add `cache?: ChunkCache` to `GetOptions` because there was no chunk-level extension point
- vole-core's `wrapArray`, which bypasses the generic entirely with a bare `Proxy` over the Array
- every consumer threading `RequestInit` through layers that never read it

## Stacked PRs
- #384 Add composable store middleware system (`defineStoreMiddleware`, `extendStore`, `withConsolidation`, `withRangeBatching`)
- #391 Drop the `Options` generic and thread `signal` directly
- #392 Add `zarr.defineArrayMiddleware` and `zarr.extendArray`

## What this unblocks
- Chunk caching for the `get()` function (#296) becomes a `withChunkCache` middleware, roughly 15 lines.
- vole-core's `wrapArray` scheduler migrates to `defineArrayMiddleware` + `extendArray`, dropping the bare-`Proxy` workaround.
- Every `AsyncReadable<RequestInit>` consumer (idetik, orkestrator, carbonplan/zarr-layer, keller-mark virtual stores) collapses to a plain `AsyncReadable`.
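To make the "roughly 15 lines" claim concrete, here is a hypothetical `withChunkCache` written against a minimal stand-in for the proposed data-layer interface; the names and shapes are assumptions about the planned API, not shipped code:

```typescript
// Local stand-ins for the data-layer shape: coords in, chunks out.
interface Chunk {
  data: Uint8Array;
}
interface ReadableArray {
  getChunk(coords: number[]): Promise<Chunk>;
}

// Cache chunks by coordinate key; misses forward to the inner array.
function withChunkCache(
  inner: ReadableArray,
  options: { cache: Map<string, Chunk> },
): ReadableArray {
  return {
    async getChunk(coords) {
      const key = coords.join(".");
      let chunk = options.cache.get(key);
      if (chunk === undefined) {
        chunk = await inner.getChunk(coords);
        options.cache.set(key, chunk);
      }
      return chunk;
    },
  };
}
```

Because the middleware sees chunk coordinates rather than keys and byte ranges, it never touches paths, codecs, or `RequestInit` — exactly the separation the two-layer split is meant to enforce.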