Icechunk allows repos to contain [virtual chunks](./virtual.md). To allow for referencing these virtual chunks, you can configure the `virtual_chunk_containers` parameter to specify the storage locations and configurations for any virtual chunks. Each virtual chunk container is specified by a [`VirtualChunkContainer`](./reference.md#icechunk.VirtualChunkContainer) object which contains a name, a url prefix, and a storage configuration. When a container is added to the settings, any virtual chunks with a url that starts with the configured prefix will use the storage configuration for that matching container.
+
Icechunk allows repos to contain [virtual chunks](./virtual.md). To reference these virtual chunks, you must configure the `virtual_chunk_containers` parameter to specify the storage locations and configurations for any virtual chunks. Each virtual chunk container is specified by a [`VirtualChunkContainer`](./reference.md#icechunk.VirtualChunkContainer) object, which contains a URL prefix and a storage configuration. When a container is added to the settings, any virtual chunk whose URL starts with the configured prefix will use the storage configuration of the matching container.
!!! note
-
Currently only `s3` compatible storage and `local_filesystem` storage are supported for virtual chunk containers. Other storage backends such as `gcs`, `azure`, and `https` are on the roadmap.
+
Currently only `s3` compatible storage, `gcs`, `local_filesystem`, and `http[s]` storage are supported for virtual chunk containers. Other storage backends such as `azure` are on the roadmap.
#### Example
@@ -82,7 +82,6 @@ For example, if we wanted to configure an icechunk repo to be able to contain vi
Now at read time, if icechunk encounters a virtual chunk url that starts with `s3://my-other-s3-bucket/`, it will use the storage configuration for the `my-other-s3-bucket` container.
+
Now at read time, if Icechunk encounters a virtual chunk url that starts with `s3://my-other-s3-bucket/`, it will use the storage configuration for the `my-other-s3-bucket` container.
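The prefix lookup described above can be illustrated with a small, library-free sketch. It does not use Icechunk itself: `ContainerSketch` and `find_container` are hypothetical names, the `store` field is a plain string standing in for a real storage configuration, and the choice of the longest matching prefix when several match is an assumption made only for this illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ContainerSketch:
    """Hypothetical stand-in for a virtual chunk container (not the Icechunk class)."""

    url_prefix: str
    store: str  # placeholder for a real storage configuration


def find_container(
    url: str, containers: list[ContainerSketch]
) -> Optional[ContainerSketch]:
    """Return the container whose URL prefix matches `url`, or None.

    Assumption for this sketch: if several prefixes match, the longest wins.
    """
    matches = [c for c in containers if url.startswith(c.url_prefix)]
    return max(matches, key=lambda c: len(c.url_prefix), default=None)


containers = [
    ContainerSketch("s3://my-other-s3-bucket/", store="s3 config (placeholder)"),
    ContainerSketch("file:///data/", store="local filesystem config (placeholder)"),
]

# A URL under the configured prefix resolves to that container.
hit = find_container("s3://my-other-s3-bucket/path/to/chunk.nc", containers)
# A URL with no configured prefix resolves to nothing.
miss = find_container("s3://unknown-bucket/chunk.nc", containers)
```

Here `hit` is the container registered for `s3://my-other-s3-bucket/`, while `miss` is `None` because no configured prefix matches.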
!!! note
-
While virtual chunk containers specify the storage configuration for any virtual chunks, they do not contain any authentication information. The credentials must also be specified when opening the repository using the [`virtual_chunk_credentials`](./reference.md#icechunk.Repository.open) parameter. See the [Virtual Chunk Credentials](#virtual-chunk-credentials) section for more information.
+
While virtual chunk containers specify the storage configuration for any virtual chunks, they do not contain any authentication information. The credentials must also be specified when opening the repository using the [`authorize_virtual_chunk_access`](./reference.md#icechunk.Repository.open) parameter. This parameter also serves as a way for the user to authorize access to the virtual chunk containers: containers that are not explicitly allowed with `authorize_virtual_chunk_access` won't be able to fetch their chunks. See the [Virtual Chunk Credentials](#virtual-chunk-credentials) section for more information.
@@ -269,21 +267,22 @@ The next time this repo is opened, the persisted config will be loaded by defaul
## Virtual Chunk Credentials
-
When using virtual chunk containers, the credentials for the storage backend must also be specified. This is done using the [`virtual_chunk_credentials`](./reference.md#icechunk.Repository.open) parameter when creating or opening the repo. Credentials are specified as a dictionary of container names mapping to credential objects. A helper function, [`containers_credentials`](./reference.md#icechunk.containers_credentials), is provided to make it easier to specify credentials for multiple containers.
+
When using virtual chunk containers, the containers must be authorized by the repo user, and the credentials for the storage backend must be specified. This is done using the [`authorize_virtual_chunk_access`](./reference.md#icechunk.Repository.open) parameter when creating or opening the repo. Credentials are specified as a dictionary of container URL prefixes mapping to credential objects or `None`. With a `None` credential, Icechunk fetches credentials from the process environment, or uses anonymous credentials if the container allows it. A helper function, [`containers_credentials`](./reference.md#icechunk.containers_credentials), is provided to make it easier to specify credentials for multiple containers.
### Example
Expanding on the example from the [Virtual Chunk Containers](#virtual-chunk-containers) section, we can configure the repo to use the credentials for the `my-s3-bucket` and `my-other-s3-bucket` containers.
Users of the repo will need to enable the virtual chunk container by passing the `credentials` argument to `Repository.open`. This way, the repo user flags the container as authorized. The `credentials` argument must be a dict using URL prefixes as keys and optional credentials as values. If the container requires no credentials, `None` can be used as the value in the map. Failing to authorize a container will generate an error when a chunk is fetched from it.
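As a rough illustration of the authorization rule, the following library-free sketch models the map from URL prefixes to optional credentials. It is not Icechunk's implementation: `credentials_for` and `UnauthorizedContainerError` are hypothetical names, and the credential values are placeholders.

```python
from typing import Any, Optional

# Hypothetical authorization map: URL prefix -> credentials, or None when the
# container needs no explicit credentials.
authorized: dict[str, Optional[Any]] = {
    "s3://my-s3-bucket/": {
        "access_key_id": "placeholder",
        "secret_access_key": "placeholder",
    },
    "s3://my-other-s3-bucket/": None,
}


class UnauthorizedContainerError(Exception):
    """Hypothetical error for a chunk URL that no authorized container covers."""


def credentials_for(url: str, authorized: dict[str, Optional[Any]]) -> Optional[Any]:
    """Return the credentials for the authorized prefix that matches `url`.

    Raises UnauthorizedContainerError when no authorized prefix matches,
    mirroring the "error when a chunk is fetched" behaviour described above.
    """
    for prefix, creds in authorized.items():
        if url.startswith(prefix):
            return creds
    raise UnauthorizedContainerError(f"no authorized container for {url}")
```

A `None` return value and a raised error mean different things in this sketch: `None` says the container is authorized without explicit credentials, while the error says the container was never authorized.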
## Virtual Reference API
While `VirtualiZarr` is the easiest way to create virtual datasets with Icechunk, the Store API that it uses to create the datasets in Icechunk is public. `IcechunkStore` contains a [`set_virtual_ref`](./reference.md#icechunk.IcechunkStore.set_virtual_ref) method that specifies a virtual ref for a specified chunk.
### Virtual Reference Storage Support
-
Currently, Icechunk supports two types of storage for virtual references:
+
Currently, Icechunk supports four types of storage for virtual references:
#### S3 Compatible
@@ -167,13 +171,48 @@ References to files accessible via S3 compatible storage.
Here is how we can set the chunk at key `c/0` to point to a file on an S3 bucket, `mybucket`, with the prefix `my/data/file.nc`:
S3 virtual references require configuring credentials for the store to be able to access the specified S3 bucket. See [the configuration docs](./configuration.md#virtual-reference-storage-config) for instructions.
+
#### GCS
+
References to files accessible on Google Cloud Storage.
+
##### Example
+
Here is how we can set the chunk at key `c/0` to point to a file on a GCS bucket, `mybucket`, with the prefix `my/data/file.nc`: