[pull] trunk from spiceai:trunk #790

Merged
pull[bot] merged 4 commits into TheRakeshPurohit:trunk from spiceai:trunk on Apr 30, 2026
@pull pull Bot commented Apr 30, 2026

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

phillipleblanc and others added 4 commits April 30, 2026 16:56
…nferred (#10601)

When a Hadoop catalog's warehouse root scheme (e.g. `s3://`) does not match
the scheme used in the table metadata locations (e.g. `s3a://`), the existing
scheme inference logic in `HadoopCatalogBuilder::inner_build` rebuilds the
catalog with the inferred warehouse root. However, since the iceberg-rust v0.9
upgrade (#9917), the `OpenDalStorageFactory::S3` variant validates that paths
match its `configured_scheme`. If the factory was constructed with
`configured_scheme: "s3"` but the rebuilt catalog tries to operate on
`s3a://...` paths, it errors with:

    DataInvalid => Invalid s3 url: s3a://hadoop/, should start with s3://hadoop/

This caused the `catalogs/iceberg-hadoop` cookbook recipe to fail to register
its catalog (the recipe writes table metadata via Spark using `s3a://`).
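The failure mode can be illustrated with a minimal sketch of a scheme-pinned path check. `SchemeCheckedStorage` and `relative_path` are hypothetical names invented for this illustration and are not the iceberg-rust `OpenDalStorageFactory` API; only the shape of the error message is taken from the commit description.

```rust
/// Hypothetical stand-in for a storage backend pinned to one URL scheme.
/// This is an illustration only, not the iceberg-rust implementation.
struct SchemeCheckedStorage {
    configured_scheme: String,
    bucket: String,
}

impl SchemeCheckedStorage {
    /// Returns the path relative to the bucket root, or an error when the
    /// URL does not start with `<configured_scheme>://<bucket>/`.
    fn relative_path<'a>(&self, url: &'a str) -> Result<&'a str, String> {
        let prefix = format!("{}://{}/", self.configured_scheme, self.bucket);
        url.strip_prefix(prefix.as_str())
            .ok_or_else(|| format!("Invalid s3 url: {url}, should start with {prefix}"))
    }
}
```

With `configured_scheme` set to `s3`, a `s3://hadoop/...` path resolves normally, while any `s3a://hadoop/...` path is rejected even though both refer to the same bucket. That is why rebuilding only the catalog, and not the factory, was not enough.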

Fix:

- Add `HadoopCatalogBuilder::with_storage_factory_builder(Fn(&str) -> Arc<...>)`,
  a closure that produces a `StorageFactory` for a given URL scheme. When set,
  the factory is materialized in `inner_build` from the current warehouse root
  scheme, which allows scheme inference to rebuild the factory with the
  inferred scheme on the recursive `inner_build(false)` call.

- Update the iceberg catalog connector to use the new builder closure for the
  S3 case, so the `CustomAwsCredentialLoader` is captured and the factory is
  rebuilt with the correct scheme when inferred.

- Add a regression test `get_s3_to_s3a_inferred_hadoop_catalog` that
  configures the warehouse root as `s3://hadoop/` (while the underlying
  metadata uses `s3a://hadoop/`) and verifies the catalog builds successfully
  via scheme inference + factory rebuild.
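
The first fix item can be sketched roughly as follows. This is a simplified model under stated assumptions, not the actual Spice code: `StorageFactory`, `S3Factory`, `CatalogBuilder`, `Catalog`, and the `metadata_scheme` field (which stands in for the scheme discovered from table metadata) are all hypothetical. Only the names `with_storage_factory_builder` and `inner_build`, and the idea of a closure keyed by URL scheme, come from the commit description.

```rust
use std::sync::Arc;

/// Hypothetical trait standing in for the real storage factory abstraction.
trait StorageFactory {
    fn scheme(&self) -> &str;
}

/// Hypothetical S3 factory that remembers the scheme it was built for.
struct S3Factory {
    configured_scheme: String,
}

impl StorageFactory for S3Factory {
    fn scheme(&self) -> &str {
        &self.configured_scheme
    }
}

type FactoryBuilder = Arc<dyn Fn(&str) -> Arc<dyn StorageFactory> + Send + Sync>;

struct CatalogBuilder {
    warehouse_root: String,
    /// Assumption: stands in for the scheme found in table metadata locations.
    metadata_scheme: String,
    factory_builder: Option<FactoryBuilder>,
}

struct Catalog {
    warehouse_root: String,
    factory: Option<Arc<dyn StorageFactory>>,
}

impl CatalogBuilder {
    fn new(warehouse_root: &str, metadata_scheme: &str) -> Self {
        Self {
            warehouse_root: warehouse_root.to_string(),
            metadata_scheme: metadata_scheme.to_string(),
            factory_builder: None,
        }
    }

    /// Stores a closure instead of a ready-made factory, so the factory can
    /// be materialized later for whichever scheme ends up being used.
    fn with_storage_factory_builder<F>(mut self, f: F) -> Self
    where
        F: Fn(&str) -> Arc<dyn StorageFactory> + Send + Sync + 'static,
    {
        self.factory_builder = Some(Arc::new(f));
        self
    }

    fn build(self) -> Catalog {
        self.inner_build(true)
    }

    fn inner_build(mut self, infer_scheme: bool) -> Catalog {
        let (scheme, rest) = match self.warehouse_root.split_once("://") {
            Some((s, r)) => (s.to_string(), r.to_string()),
            None => (String::new(), self.warehouse_root.clone()),
        };
        if infer_scheme && scheme != self.metadata_scheme {
            // Rewrite the warehouse root with the inferred scheme and rebuild
            // once more with inference disabled.
            self.warehouse_root = format!("{}://{}", self.metadata_scheme, rest);
            return self.inner_build(false);
        }
        // The factory is created here, from the current warehouse root scheme.
        let factory = self
            .factory_builder
            .as_ref()
            .map(|build| build(scheme.as_str()));
        Catalog {
            warehouse_root: self.warehouse_root,
            factory,
        }
    }
}
```

Because the closure is only invoked at the end of `inner_build`, the recursive call after inference produces a factory for `s3a` rather than reusing one pinned to `s3`. Anything the closure captures (in the real connector, the credential loader) is carried over to the rebuilt factory.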
* v2.0.0-rc.4 release notes

* Fix

* Fix

* PM edits

Co-authored-by: Copilot <copilot@github.com>

---------

Co-authored-by: Luke Kim <80174+lukekim@users.noreply.github.com>
Co-authored-by: Copilot <copilot@github.com>
Replace license.workspace = true with license-file.workspace = true across all workspace crates, and update [workspace.package] to use license-file = "LICENSE" instead of license = "Apache-2.0".
@pull pull Bot locked and limited conversation to collaborators Apr 30, 2026
@pull pull Bot added the ⤵️ pull label Apr 30, 2026
@pull pull Bot merged commit cbd47ba into TheRakeshPurohit:trunk Apr 30, 2026
2 of 15 checks passed
4 participants