Description
What happened?
My issue is similar to #1431, but I could not apply the workaround described there because I am using local_filesystem_storage.
I am attempting to process a set of repos containing virtual chunks, essentially the same NEX-GDDP-CMIP6 data on S3 referenced in this example from the virtualizarr docs.
To mitigate the intermittent network issues, I attempted to add retry settings to the RepositoryConfig:
storage = icechunk.local_filesystem_storage(vds_name)
config = icechunk.RepositoryConfig(
    storage=icechunk.StorageSettings(
        retries=icechunk.StorageRetriesSettings(
            max_tries=10,
            initial_backoff_ms=500,
            max_backoff_ms=5000,
        ),
    )
)
config.set_virtual_chunk_container(
    icechunk.VirtualChunkContainer(f"{bucket_url}/", icechunk.s3_store(region="us-west-2"))
)
credentials = icechunk.containers_credentials(
    {bucket_url: icechunk.s3_credentials(anonymous=True)}
)
This works great when I open a read-only session to an existing icechunk store. However, since I am using local filesystem storage for my initial buildout, I get an error when creating a new one:
# This works
repo = icechunk.Repository.open(storage, config, credentials)
# This throws an error
repo = icechunk.Repository.open_or_create(storage, config, credentials)
PyRepository.open_or_create(
~~~~~~~~~~~~~~~~~~~~~~~~~~~^
storage,
^^^^^^^^
config=config,
^^^^^^^^^^^^^^
authorize_virtual_chunk_access=authorize_virtual_chunk_access,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
)
^
icechunk.IcechunkError: × object store error Operation `put_opts` with `opts.attributes` specified not yet implemented by LocalFileSystem(file:///foo/bar).
│
│ context:
│ 0: icechunk::storage::object_store::update_config
│ with previous_version=VersionInfo { etag: None, generation: None }
│ at icechunk/src/storage/object_store.rs:320
│ 1: icechunk::repository::store_config
│ with previous_version=VersionInfo { etag: None, generation: None }
│ at icechunk/src/repository.rs:418
│ 2: icechunk::repository::create
│ at icechunk/src/repository.rs:172
│
├─▶ object store error Operation `put_opts` with `opts.attributes` specified not yet implemented by LocalFileSystem(file:///foo/bar).
╰─▶ Operation `put_opts` with `opts.attributes` specified not yet implemented by LocalFileSystem(file:///foo/bar).
What did you expect to happen?
It makes some sense that setting retry options on a local filesystem icechunk store would raise a not-implemented error, but the root of the issue is that I am passing these StorageSettings only so that they get copied to the VirtualChunkResolver.
I think RepositoryConfig should either allow StorageSettings to be set specifically for the VirtualChunkResolver, or ignore StorageRetriesSettings when creating a Repository that uses local filesystem storage (or in-memory storage).
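To make the two options concrete, here is a minimal sketch of the behavior I have in mind. It does not use icechunk at all: `RetrySettings`, `StorageSettings`, `RepoConfig`, the `virtual_chunk_storage` field, and both helper functions are hypothetical stand-ins I made up for illustration, and the set of backends that skip retries is an assumption, not a description of how icechunk is implemented.

```python
from dataclasses import dataclass, field, replace
from typing import Optional


# Hypothetical stand-ins for illustration only; these are NOT icechunk classes.
@dataclass(frozen=True)
class RetrySettings:
    max_tries: int
    initial_backoff_ms: int
    max_backoff_ms: int


@dataclass(frozen=True)
class StorageSettings:
    retries: Optional[RetrySettings] = None


@dataclass(frozen=True)
class RepoConfig:
    # Settings for the repository's own storage backend.
    storage: StorageSettings = field(default_factory=StorageSettings)
    # Option 1 (hypothetical field): settings used only for virtual chunk access.
    virtual_chunk_storage: Optional[StorageSettings] = None


# Assumption for this sketch: these backends have no use for retry settings.
BACKENDS_WITHOUT_RETRIES = {"local_filesystem", "in_memory"}


def settings_for_repo_storage(config: RepoConfig, backend: str) -> StorageSettings:
    """Option 2: silently drop retries for backends that cannot use them."""
    if backend in BACKENDS_WITHOUT_RETRIES:
        return replace(config.storage, retries=None)
    return config.storage


def settings_for_virtual_chunks(config: RepoConfig) -> StorageSettings:
    """Option 1: prefer dedicated virtual chunk settings, else fall back."""
    if config.virtual_chunk_storage is not None:
        return config.virtual_chunk_storage
    return config.storage


# Usage: the same retry values as in my configuration above.
retries = RetrySettings(max_tries=10, initial_backoff_ms=500, max_backoff_ms=5000)
config = RepoConfig(storage=StorageSettings(retries=retries))

local_settings = settings_for_repo_storage(config, "local_filesystem")
virtual_settings = settings_for_virtual_chunks(config)
```

With either option, the local repository storage would never see the retry settings, while virtual chunk reads from S3 would still get them.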
Minimal Complete Verifiable Example
# /// script
# requires-python = ">=3.11"
# dependencies = [
# "icechunk<2",
# "zarr",
# ]
#
# [[tool.uv.index]]
# name = "scientific-python-nightly-wheels"
# url = "https://pypi.anaconda.org/scientific-python-nightly-wheels/simple/"
#
# [tool.uv.sources]
# icechunk = { index = "scientific-python-nightly-wheels" }
# zarr = { index = "scientific-python-nightly-wheels" }
#
# [tool.uv]
# prerelease = "allow"
# ///
#
# This script automatically imports the development branch of icechunk to check for issues.
# Please delete this header if you have _not_ tested this script with `uv run`!
import icechunk
icechunk.print_debug_info()
bucket = "nex-gddp-cmip6"
bucket_url = f"s3://{bucket}"
vds_name = "inm-cm5-0-ssp370-tas"
storage = icechunk.local_filesystem_storage(vds_name)
config = icechunk.RepositoryConfig(
    storage=icechunk.StorageSettings(
        retries=icechunk.StorageRetriesSettings(
            max_tries=10,
            initial_backoff_ms=500,
            max_backoff_ms=5000,
        ),
    )
)
config.set_virtual_chunk_container(
    icechunk.VirtualChunkContainer(
        f"{bucket_url}/",
        icechunk.s3_store(
            region="us-west-2",
            anonymous=True,
            allow_http=True
        )
    )
)
credentials = icechunk.containers_credentials(
    {bucket_url: icechunk.s3_credentials(anonymous=True)}
)
repo = icechunk.Repository.open_or_create(storage, config, credentials)
MVCE confirmation
- Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in icechunk.
- Complete example — the example is self-contained, including all data and the text of any traceback.
- Verifiable example — the example copy & pastes into an IPython prompt, returning the result.
- New issue — a search of GitHub Issues suggests this is not a duplicate.
- Recent environment — the issue occurs with the latest version of icechunk and its dependencies.
Relevant log output
Anything else we need to know?
No response