IcebergConfigurer.writeable check fails for HadoopFileIO tables (scheme s3a:// vs warehouse s3://) — signing rejects all commits

## Version

- Nessie server: **0.107.5** (Quarkus image `ghcr.io/projectnessie/nessie:0.107.5`)
- Iceberg client: 1.7.0
- Spark: 3.5.x
- pyiceberg: 0.7.x (also affected when using HadoopFileIO via RestCatalog Python client)

## Description

When a Spark client writes to a Nessie REST catalog using `HadoopFileIO` (the Hadoop-based Iceberg FileIO, backed by `hadoop-aws`), the resulting `metadata.location` carries the scheme `s3a://`. The Nessie server registers the warehouse with the scheme `s3://`. The internal writeable derivation in `IcebergConfigurer.icebergConfigPerTable()` appears to apply `S3Utils.normalizeS3Scheme` only to the warehouse path, not to the `metadata.location` it compares against. As a result, the `writeable[]` array is empty and **all PUT/DELETE signing requests are subsequently rejected** with `HTTP 403 unauthorized signing request`.

End result: tables created via `HadoopFileIO` against a Nessie REST catalog backed by S3-compatible storage (MinIO, AWS S3) become unwriteable, and nothing in the error points to a scheme mismatch. Reads work fine.

## Reproduction

### Minimal Nessie config

```yaml
nessie:
  image: ghcr.io/projectnessie/nessie:0.107.5
  environment:
    nessie.catalog.default-warehouse: dev
    nessie.catalog.warehouses.dev.location: s3://my-bucket/    # scheme s3://
    nessie.catalog.service.s3.default-options.endpoint: http://minio:9000
    nessie.catalog.service.s3.default-options.path-style-access: "true"
    nessie.catalog.service.s3.default-options.region: us-east-1
```

### Spark client config that triggers the bug

```bash
# io-impl is deliberately OMITTED -> defaults to HadoopFileIO, which triggers the bug.
# (The comment sits above the command: a comment between continued lines would end it early.)
spark-submit \
  --conf spark.sql.catalog.demo=org.apache.iceberg.spark.SparkCatalog \
  --conf spark.sql.catalog.demo.catalog-impl=org.apache.iceberg.rest.RESTCatalog \
  --conf spark.sql.catalog.demo.uri=http://nessie:19120/iceberg \
  --conf spark.sql.catalog.demo.warehouse=dev \
  --conf spark.hadoop.fs.s3a.endpoint=http://minio:9000 \
  ...
```

### Operation that fails

```python
df = spark.createDataFrame([(1, "x")], ["id", "name"])
df.writeTo("demo.test_ns.test_table").create()
```

### Observed error

```
org.apache.iceberg.exceptions.CommitFailedException:
  Unauthorized signing request
  caused by: org.apache.iceberg.aws.s3.signer.S3V4RestSignerClient:
    HTTP 403 — signing rejected, writeable=[]
```

Inspecting `GET /iceberg/v1/{prefix}/config?warehouse=dev`:

```json
{
  "overrides": {
    "s3.signer.uri": "http://nessie:19120/iceberg/v1/aws/s3/sign",
    "writeable": []
  }
}
```

Spark's `HadoopFileIO` records `metadata.location: s3a://my-bucket/...` in the commit, while the warehouse is registered as `s3://my-bucket/`. Tables written with `io-impl=S3FileIO` (AWS SDK v2 direct) record `metadata.location: s3://...` and work correctly.

## Root cause hypothesis

In `IcebergConfigurer.icebergConfigPerTable()` (`servers/quarkus-server/src/main/java/.../catalog/IcebergConfigurer.java` or equivalent; approximate lines 225-320 of 0.107.5), the writeable derivation appears to be roughly:

```java
String warehouseNormalized = S3Utils.normalizeS3Scheme(warehouse.getLocation());
// "s3://my-bucket/"

String metadataLoc = table.metadataLocation();
// "s3a://my-bucket/.../00000-...metadata.json"

if (metadataLoc.startsWith(warehouseNormalized)) {
  // "s3a://...".startsWith("s3://...") -> FALSE
  writeable.add(metadataLoc);
}
// -> writeable[] stays empty
```

`S3Utils.normalizeS3Scheme()` is called on the warehouse but **not** on the `metadataLoc` before the `startsWith` check. The scheme mismatch (`s3://` vs `s3a://`) makes the predicate always false, so the writeable array stays empty and the signing endpoint then rejects every PUT/DELETE.

## Suggested fix

Normalize both sides before the comparison:

```java
String warehouseNormalized = S3Utils.normalizeS3Scheme(warehouse.getLocation());
String metadataLocNormalized = S3Utils.normalizeS3Scheme(metadataLoc);

if (metadataLocNormalized.startsWith(warehouseNormalized)) {
  writeable.add(metadataLoc);
}
```

A more defensive variant would normalize the warehouse scheme at storage time and normalize all metadata locations once at startup.
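
To make the two-sided normalization easy to check in isolation, here is a minimal, self-contained sketch. It is not the Nessie implementation: `normalizeS3Scheme` below is a hypothetical stand-in for `S3Utils.normalizeS3Scheme` (assumed to map `s3a://` and `s3n://` to `s3://`), the class and method names are invented for illustration, and the metadata path is a made-up example.

```java
import java.util.ArrayList;
import java.util.List;

final class SchemeNormalizationSketch {

  // Hypothetical stand-in for S3Utils.normalizeS3Scheme; the real helper may behave differently.
  static String normalizeS3Scheme(String location) {
    for (String alias : new String[] {"s3a://", "s3n://"}) {
      if (location.startsWith(alias)) {
        return "s3://" + location.substring(alias.length());
      }
    }
    return location;
  }

  // Mirrors the suspected current behavior: only the warehouse side is normalized.
  static List<String> writeableOneSided(String warehouse, String metadataLoc) {
    List<String> writeable = new ArrayList<>();
    if (metadataLoc.startsWith(normalizeS3Scheme(warehouse))) {
      writeable.add(metadataLoc);
    }
    return writeable;
  }

  // Mirrors the suggested fix: both sides are normalized before the prefix check.
  static List<String> writeableTwoSided(String warehouse, String metadataLoc) {
    List<String> writeable = new ArrayList<>();
    if (normalizeS3Scheme(metadataLoc).startsWith(normalizeS3Scheme(warehouse))) {
      writeable.add(metadataLoc); // the original, un-normalized location is kept
    }
    return writeable;
  }

  public static void main(String[] args) {
    String warehouse = "s3://my-bucket/";
    // Made-up example path; only the s3a:// scheme matters here.
    String metadataLoc = "s3a://my-bucket/test_ns/test_table/metadata/00000-example.metadata.json";

    System.out.println("one-sided: " + writeableOneSided(warehouse, metadataLoc)); // empty list
    System.out.println("two-sided: " + writeableTwoSided(warehouse, metadataLoc)); // contains metadataLoc
  }
}
```

A regression test along these lines, run against the real helper, would cover both `s3a://` metadata under an `s3://` warehouse and the reverse case.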

## Workaround

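Force `io-impl=org.apache.iceberg.aws.s3.S3FileIO` on every Spark client (for the catalog above: `--conf spark.sql.catalog.demo.io-impl=org.apache.iceberg.aws.s3.S3FileIO`). S3FileIO writes `s3://` natively, eliminating the mismatch.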

## Impact

- **Severity:** high for any user mixing Spark + Nessie REST + S3-compatible storage (MinIO, AWS S3) with default `HadoopFileIO`.
- **Workaround known:** yes (`io-impl=S3FileIO`) but requires Iceberg 1.7+ and `iceberg-aws-bundle`.
- **Misleading failure mode:** the failure surfaces in the signing endpoint, not in the scheme check itself, so users typically spend hours debugging credentials before reaching the actual cause.

Happy to help test a fix against this scenario if useful.

