Version
- Nessie server: 0.107.5 (Quarkus image ghcr.io/projectnessie/nessie:0.107.5)
- Iceberg client: 1.7.0
- Spark: 3.5.x
- pyiceberg: 0.7.x (also affected when using HadoopFileIO via RestCatalog Python client)
Description
When a Spark client writes to a Nessie REST catalog using HadoopFileIO (the default classical Iceberg FileIO via hadoop-aws), the resulting metadata.location uses the s3a:// scheme, while the Nessie server registers the warehouse with the s3:// scheme. The writeable derivation in IcebergConfigurer.icebergConfigPerTable() appears to apply S3Utils.normalizeS3Scheme only to the warehouse path, not to the metadata.location it is compared against. As a result, the writeable[] array is empty and all subsequent PUT/DELETE signing requests are rejected with HTTP 403 unauthorized signing request.
End result: tables created via HadoopFileIO against a Nessie REST catalog backed by S3-compatible storage (MinIO, AWS S3) become silently unwriteable. Reads work fine.
Reproduction
Minimal Nessie config
nessie:
  image: ghcr.io/projectnessie/nessie:0.107.5
  environment:
    nessie.catalog.default-warehouse: dev
    nessie.catalog.warehouses.dev.location: s3://my-bucket/ # scheme s3://
    nessie.catalog.service.s3.default-options.endpoint: http://minio:9000
    nessie.catalog.service.s3.default-options.path-style-access: "true"
    nessie.catalog.service.s3.default-options.region: us-east-1
Spark client config that triggers the bug
# io-impl is OMITTED below, so the client defaults to HadoopFileIO, which triggers the bug.
spark-submit \
  --conf spark.sql.catalog.demo=org.apache.iceberg.spark.SparkCatalog \
  --conf spark.sql.catalog.demo.catalog-impl=org.apache.iceberg.rest.RESTCatalog \
  --conf spark.sql.catalog.demo.uri=http://nessie:19120/iceberg \
  --conf spark.sql.catalog.demo.warehouse=dev \
  --conf spark.hadoop.fs.s3a.endpoint=http://minio:9000 \
  ...
Operation that fails
df = spark.createDataFrame([(1, "x")], ["id", "name"])
df.writeTo("demo.test_ns.test_table").create()
Observed error
org.apache.iceberg.exceptions.CommitFailedException:
Unauthorized signing request
caused by: org.apache.iceberg.aws.s3.signer.S3V4RestSignerClient:
HTTP 403 — signing rejected, writeable=[]
Inspecting GET /iceberg/v1/{prefix}/config?warehouse=dev:
{
  "overrides": {
    "s3.signer.uri": "http://nessie:19120/iceberg/v1/aws/s3/sign",
    "writeable": []
  }
}
With HadoopFileIO, Spark records metadata.location: s3a://my-bucket/... in the commit, while the warehouse is registered as s3://my-bucket/. Tables written with io-impl=S3FileIO (AWS SDK v2 direct) have metadata.location: s3://... and work correctly.
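The check on the config response can be scripted. This is a minimal sketch, assuming the response has the shape shown above (an "overrides" object containing a "writeable" list). The helper name writeable_is_empty is hypothetical, and fetching the JSON from the server is left to the caller.

```python
import json


def writeable_is_empty(config_json: str) -> bool:
    """Return True when the config response lists no writeable locations."""
    overrides = json.loads(config_json).get("overrides", {})
    # Assumption: a missing "writeable" key is treated the same as an empty list.
    return not overrides.get("writeable")


# Sample payload copied from the response observed in this report.
sample = (
    '{ "overrides": { "s3.signer.uri": '
    '"http://nessie:19120/iceberg/v1/aws/s3/sign", "writeable": [] } }'
)

if writeable_is_empty(sample):
    print("writeable is empty: signing requests will be rejected")
```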
Root cause hypothesis
In IcebergConfigurer.icebergConfigPerTable() (servers/quarkus-server/src/main/java/.../catalog/IcebergConfigurer.java or equivalent; approximate lines 225-320 of 0.107.5), the writeable derivation appears to be roughly:
String warehouseNormalized = S3Utils.normalizeS3Scheme(warehouse.getLocation());
// "s3://my-bucket/"
String metadataLoc = table.metadataLocation();
// "s3a://my-bucket/.../00000-...metadata.json"
if (metadataLoc.startsWith(warehouseNormalized)) {
    // "s3a://...".startsWith("s3://...") -> FALSE
    writeable.add(metadataLoc);
}
// -> writeable[] stays empty
S3Utils.normalizeS3Scheme() is called on the warehouse but not on the metadataLoc before the startsWith check. The scheme mismatch (s3:// vs s3a://) makes the predicate always false, so the writeable array stays empty and the signing endpoint then rejects every PUT/DELETE.
Suggested fix
Normalize both sides before the comparison:
String warehouseNormalized = S3Utils.normalizeS3Scheme(warehouse.getLocation());
String metadataLocNormalized = S3Utils.normalizeS3Scheme(metadataLoc);
if (metadataLocNormalized.startsWith(warehouseNormalized)) {
    writeable.add(metadataLoc);
}
A more defensive variant would normalize the warehouse scheme at storage time and normalize all metadata locations once at startup.
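To make the difference concrete, here is a small standalone model of the prefix check before and after the suggested fix. It is a sketch of the hypothesis only, not the Nessie code: normalize_s3_scheme is a hypothetical stand-in for S3Utils.normalizeS3Scheme (assumed to map s3a:// and s3n:// to s3://), and the table path is made up for illustration.

```python
def normalize_s3_scheme(uri: str) -> str:
    """Hypothetical stand-in: rewrite s3a:// and s3n:// URIs to s3://."""
    for scheme in ("s3a://", "s3n://"):
        if uri.startswith(scheme):
            return "s3://" + uri[len(scheme):]
    return uri


def is_writeable_current(warehouse: str, metadata_loc: str) -> bool:
    # Suspected current behaviour: only the warehouse side is normalized.
    return metadata_loc.startswith(normalize_s3_scheme(warehouse))


def is_writeable_fixed(warehouse: str, metadata_loc: str) -> bool:
    # Suggested fix: normalize both sides before the prefix comparison.
    return normalize_s3_scheme(metadata_loc).startswith(
        normalize_s3_scheme(warehouse)
    )


warehouse = "s3://my-bucket/"
# Illustrative paths; only the scheme differs between the two.
hadoop_loc = "s3a://my-bucket/test_ns/test_table/metadata/example.metadata.json"
s3fileio_loc = "s3://my-bucket/test_ns/test_table/metadata/example.metadata.json"

for loc in (hadoop_loc, s3fileio_loc):
    print(loc, is_writeable_current(warehouse, loc), is_writeable_fixed(warehouse, loc))
```

In this model the s3a:// location fails the current check and passes the fixed one, while the s3:// location passes both, which matches the behaviour observed with HadoopFileIO versus S3FileIO.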
Workaround
Force io-impl=org.apache.iceberg.aws.s3.S3FileIO on every Spark client. S3FileIO writes s3:// natively, eliminating the mismatch.
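For clients that build their Spark configuration programmatically, the workaround amounts to one extra catalog property. The sketch below assembles the same settings as the spark-submit example plus io-impl. The helper name rest_catalog_conf is hypothetical, and the catalog name, URI and warehouse are the values used in this report.

```python
def rest_catalog_conf(catalog: str, uri: str, warehouse: str) -> dict:
    """Build Spark conf entries for an Iceberg REST catalog with S3FileIO forced."""
    prefix = f"spark.sql.catalog.{catalog}"
    return {
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        f"{prefix}.catalog-impl": "org.apache.iceberg.rest.RESTCatalog",
        f"{prefix}.uri": uri,
        f"{prefix}.warehouse": warehouse,
        # The workaround: set io-impl explicitly instead of relying on the default.
        f"{prefix}.io-impl": "org.apache.iceberg.aws.s3.S3FileIO",
    }


conf = rest_catalog_conf("demo", "http://nessie:19120/iceberg", "dev")
for key, value in conf.items():
    print(f"--conf {key}={value}")
```

With PySpark, each pair can be applied through SparkSession.builder.config(key, value) before the session is created.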
Impact
- Severity: high for any user combining Spark + Nessie REST + S3-compatible storage (MinIO, AWS S3) with the default HadoopFileIO.
- Workaround known: yes (io-impl=S3FileIO), but it requires Iceberg 1.7+ and iceberg-aws-bundle.
- Misleading failure mode: the failure surfaces in the signing endpoint, not in the scheme check itself, so users typically spend hours debugging credentials before reaching the actual cause.
Happy to help test a fix against this scenario if useful.