🪣 Support for OBject Storage by Kezzsim · Pull Request #1021 · bluesky/tiled

Kezzsim · 2025-07-17T15:12:57Z

Resolves #905

danielballan · 2025-10-16T17:30:53Z

The only remaining big thing here is to run a MinIO container on GHA so we can test against it.

danielballan · 2025-10-22T10:28:18Z

tiled/storage.py

-def get_storage(uri: str) -> Storage:
-    "Look up Storage by URI."
-    return _STORAGE[uri]
+def get_storage(uri: str) -> Storage | Tuple[Storage, str]:


This ambiguity in return type could be a signal that the design isn't quite right.

If the problem is, "How can we get the path back to the user?" what are our other options? What would the consequences be of adding a path attribute to ObjectStorage?

danielballan · 2025-10-22T10:46:32Z

tiled/storage.py

+    elif scheme in {"sqlite", "duckdb"}:
+        return EmbeddedSQLStorage(uri)
+    elif scheme == "http":
+        # Split on the first single '/' that is not part of '://'


This works but it's a lot to digest:

regex

string emptiness check

lstrip

replace

Above, we have the parsed URL urlparse(uri), which we use to get the scheme. If we keep a reference to that in some local variable, parsed_uri, we can easily grab the path:

    full_path = parsed_uri.path  # includes bucket and the rest
    # Strip the leading '/' first; otherwise the first element of the split is ''.
    bucket_name, blob_path = full_path.lstrip("/").split("/", 1)

Aside: If you haven't seen it before, the second argument to split works like this:

    >>> 'a/b/c'.split('/')
    ['a', 'b', 'c']
    >>> 'a/b/c'.split('/', 1)  # split on the first instance of '/' only
    ['a', 'b/c']

Now, to get the "bucket only" version of the URL:

    from urllib.parse import urlunparse

    # Get the URL without the blob_path, just the bucket.
    bucket_uri = urlunparse(parsed_uri._replace(path='/' + bucket_name))
    # Look up storage by bucket URI.
    storage = _STORAGE[bucket_uri]
    # Return a copy encapsulating the blob_path.
    return storage.with_blob_path(blob_path)  # Note: This method would need to be added to ObjectStorage.

Example:

    >>> uri = 'https://example.com/bucket/a/b/c'
    >>> parsed_uri = urlparse(uri)
    >>> parsed_uri
    ParseResult(scheme='https', netloc='example.com', path='/bucket/a/b/c', params='', query='', fragment='')
    >>> full_path = parsed_uri.path
    >>> full_path
    '/bucket/a/b/c'
    >>> full_path.lstrip('/')
    'bucket/a/b/c'
    >>> bucket_name, blob_path = full_path.lstrip('/').split('/', 1)
    >>> bucket_name
    'bucket'
    >>> blob_path
    'a/b/c'
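
The steps in this comment can be gathered into one helper. The following is a sketch under assumptions, not the function that was merged: the name `split_bucket_uri` is hypothetical, and it uses `str.partition` instead of `split('/', 1)` so that a URI naming only a bucket, with no blob path, does not fail on tuple unpacking.

```python
from typing import Tuple
from urllib.parse import urlparse, urlunparse


def split_bucket_uri(uri: str) -> Tuple[str, str]:
    """Split an object-store URI into (bucket_uri, blob_path).

    Hypothetical helper for illustration; not part of tiled.
    """
    parsed_uri = urlparse(uri)
    # partition() always returns three parts, so a bucket-only URI
    # simply produces an empty blob_path.
    bucket_name, _, blob_path = parsed_uri.path.lstrip("/").partition("/")
    if not bucket_name:
        raise ValueError(f"No bucket name in URI: {uri!r}")
    # Rebuild the URI with the path truncated to the bucket.
    bucket_uri = urlunparse(parsed_uri._replace(path="/" + bucket_name))
    return bucket_uri, blob_path
```

With the URI from the example, `split_bucket_uri('https://example.com/bucket/a/b/c')` returns `('https://example.com/bucket', 'a/b/c')`, and the first element is the key that would be used to look up the registered storage.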

Kezzsim · 2025-10-23T19:45:49Z

Pertaining to the minio testing container:

Specify Object fixture in test_writing.py
Set fixture to xfail when minio url isn't available
Add GitHub Actions to setup the minio container
Add code to reset the state of the bucket after each test
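
As a rough illustration of the second and fourth items, the sketch below shows the two pieces of logic independently of any test framework. Everything here is an assumption: `InMemoryBucket` is a hypothetical stand-in for a real MinIO client, `bucket_uri_from_env` and `clean_bucket` are invented names, and the value format of `TILED_TEST_BUCKET` is not specified in this thread. In a pytest suite, a fixture could call `pytest.xfail(...)` when the first helper returns `None` and wrap the test body in the second.

```python
import os
from contextlib import contextmanager
from typing import Dict, Iterator, List, Mapping, Optional


class InMemoryBucket:
    """Hypothetical stand-in for a MinIO bucket client (not a real API)."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def list_keys(self) -> List[str]:
        # Returns a new list, so callers may delete while iterating over it.
        return sorted(self._blobs)

    def delete(self, key: str) -> None:
        del self._blobs[key]


def bucket_uri_from_env(environ: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the configured bucket URI, or None when no bucket is available."""
    return environ.get("TILED_TEST_BUCKET") or None


@contextmanager
def clean_bucket(bucket: InMemoryBucket) -> Iterator[InMemoryBucket]:
    """Yield the bucket, then delete every blob so the next test starts clean."""
    try:
        yield bucket
    finally:
        for key in bucket.list_keys():
            bucket.delete(key)
```

Putting the cleanup in a `finally` clause means the bucket is emptied even when the test body raises, which keeps one failing test from contaminating the next.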

danielballan

I like that we leave Azure and Google off now, until they can be tested, but we have a clear path for adding support soon.

https://github.com/bluesky/tiled/pull/1021/files#diff-38e7f2525a2fda64425a32d8b78379eedfbb2a2d3bd147e5c6f0571be0210201R29

And I like the clarity of this:

https://github.com/bluesky/tiled/pull/1021/files#diff-1fb9039deb7e0ac14eb3afa15f93579ebeef9858ca8275a7ccef55ad6b52277eR218-R219

I hope to test drive this before clicking the green merge button, but this will go in in time for maint, barring any surprises. Well done!

* 🎈 *writes data to get teh party started*
* 🌫️ *anxiously adds more cloud providers*
* Resolve mypy errors
* 👍️ Resolve minio https error preventing us from writing `zarr.json`
* 🚮 Experiment with writing (sloppy) data
* 🪲 DEBUG: problems with `write`
* 🕶️ Review : Add missing prefix

  Co-authored-by: Eugene <ymatviych@bnl.gov>
* ✍️ Write regex helper function
* 🧽 refactor to clean up repeated code
* ✍️ Add Blobs to writing tests
* ✍️ Rewrite `get_storage` to be a router for buckets
* refactor ObjectStorage
* 🐋 Add minio container to CI for testing
* 🧪 Make `TILED_TEST_BUCKET` env var for advanced testing
* More refactoring of Storage
* FIX: look up registered storages instead of recreating them
* Simplify test config
* TST: fix test_writing + more refactoring
* MNT: add minio dependency for server
* ENH: generalize asset deletion

---------

Co-authored-by: Eugene <ymatviych@bnl.gov>

Kezzsim changed the title ~~🎈 *writes data to get teh party started*~~ 🪣 Support for Bucket Storage Jul 17, 2025

Kezzsim marked this pull request as draft July 17, 2025 15:14

Kezzsim changed the title ~~🪣 Support for Bucket Storage~~ 🪣 Support for OBject Storage Jul 17, 2025

danielballan added this to the v0.1.0 release milestone Aug 5, 2025

danielballan removed this from the v0.1.0 release milestone Aug 28, 2025

genematx reviewed Sep 10, 2025

tiled/adapters/zarr.py (outdated, resolved)

Kezzsim and others added 9 commits October 16, 2025 11:56

🎈 *writes data to get teh party started*

0d436e3

🌫️ *anxiously adds more cloud providers*

152b886

Resolve mypy errors

4e42fb0

👍️ Resolve minio https error preventing us from writing zarr.json

1c0164b

🚮 Experiment with writing (sloppy) data

6f8ab0f

🪲 DEBUG: problems with write

9db8cca

🕶️ Review : Add missing prefix

3a4e81c

Co-authored-by: Eugene <ymatviych@bnl.gov>

✍️ Write regex helper function

99480e2

🧽 refactor to clean up repeated code

f54cc7f

danielballan force-pushed the obtsor branch from 9f12ed1 to f54cc7f October 16, 2025 17:02

Kezzsim added 3 commits October 17, 2025 15:49

✍️ Add Blobs to writing tests

d102d11

Merge branch 'bluesky:main' into obtsor

a4d56c7

✍️ Rewrite get_storage to be a router for buckets

76b8dfa

danielballan reviewed Oct 22, 2025

Kezzsim and others added 4 commits October 23, 2025 13:15

Merge branch 'bluesky:main' into obtsor

6df1ecb

refactor ObjectStorage

3be96c0

Merge pull request #1 from genematx/obstore-eugene

d61e11a

refactor `ObjectStorage` and clean up code

🐋 Add minio container to CI for testing

65bebe8

genematx reviewed Oct 23, 2025

tiled/adapters/zarr.py (outdated, resolved)

Kezzsim and others added 3 commits October 23, 2025 17:33

🧪 Make TILED_TEST_BUCKET env var for advanced testing

0e200a9

More refactoring of Storage

bd7007b

FIX: look up registered storages instead of recreating them

401e432

Simplify test config

ef5e07d

genematx reviewed Oct 24, 2025

tiled/_tests/test_writing.py (outdated, resolved)

genematx reviewed Oct 24, 2025

tiled/catalog/adapter.py (outdated, resolved)

genematx reviewed Oct 24, 2025

tiled/catalog/adapter.py (resolved)

genematx and others added 4 commits October 24, 2025 17:25

TST: fix test_writing + more refactoring

4d8afb9

MNT: add minio dependency for server

aae6b13

ENH: generalize asset deletion

5a270f5

Merge pull request #2 from genematx/obstore-eugene

8e71719

More refactoring of Storage

Kezzsim marked this pull request as ready for review October 27, 2025 18:49

danielballan approved these changes Oct 27, 2025

danielballan approved these changes Oct 28, 2025

danielballan merged commit c4b1693 into bluesky:main Oct 28, 2025
11 checks passed

danielballan merged 24 commits into bluesky:main from Kezzsim:obtsor
