Support STAC Alternate Assets in TiTiler #137

Open
@j08lue

Description

Some STAC catalogs keep links to cloud-readable assets (S3) in assets > {asset} > alternate > {key} > href rather than in the main asset's href, i.e. they use the Alternate Assets STAC extension.

We considered adding a mechanism to TiTiler-PgSTAC to use these alternate asset hrefs, but the preferred solution was to implement this at the application level for now.

Examples

  • https://landsatlook.usgs.gov/stac-server/collections/landsat-c2ard-sr/items/LC09_AK_001011_20240818_20240822_02_SR?.asset=asset-reduced_resolution_browse

    "assets": {
      "blue": {
        "title": "Blue Band (B2)",
        "description": "Collection 2 ARD Blue Band (B2) Surface Reflectance",
        "type": "image/vnd.stac.geotiff; cloud-optimized=true",
        "roles": [
          "data"
        ],
        "eo:bands": [
          {
            "name": "B2",
            "common_name": "blue",
            "gsd": 30,
            "center_wavelength": 0.48
          }
        ],
        "href": "https://landsatlook.usgs.gov/tile/collection02/oli-tirs/2024/AK/001/011/LC09_AK_001011_20240818_20240822_02/LC09_AK_001011_20240818_20240822_02_SR_B2.TIF",
        "alternate": {
          "s3": {
            "storage:platform": "AWS",
            "storage:requester_pays": true,
            "href": "s3://usgs-landsat-ard/collection02/oli-tirs/2024/AK/001/011/LC09_AK_001011_20240818_20240822_02/LC09_AK_001011_20240818_20240822_02_SR_B2.TIF"
          }
        }
      }
    }
  • https://pgstac.demo.cloudferro.com/collections/sentinel-2-l2a/items

    "assets": {
      "AOT_10m": {
        "gsd": 10,
        "href": "https://zipper.dataspace.copernicus.eu/odata/v1/Products(fa1f904e-15d2-40da-a4df-8231de3975ea)/Nodes(S2A_MSIL2A_20240328T100631_N0510_R022_T33UVR_20240328T142447.SAFE)/Nodes(GRANULE)/Nodes(L2A_T33UVR_A045779_20240328T100900)/Nodes(IMG_DATA)/Nodes(R10m)/Nodes(T33UVR_20240328T100631_AOT_10m.jp2)/$value",
        "type": "image/jp2",
        "roles": [
          "data",
          "sampling:original",
          "gsd:10m"
        ],
        "title": "Aerosol optical thickness (AOT) - 10m",
        "alternate": {
          "s3": {
            "href": "s3://eodata/Sentinel-2/MSI/L2A/2024/03/28/S2A_MSIL2A_20240328T100631_N0510_R022_T33UVR_20240328T142447.SAFE/GRANULE/L2A_T33UVR_A045779_20240328T100900/IMG_DATA/R10m/T33UVR_20240328T100631_AOT_10m.jp2",
            "storage:tier": "hot",
            "storage:region": "waw",
            "storage:platform": "CREODIAS",
            "storage:requester_pays": false
          }
        }
      }
    }
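
For both items above, the cloud-readable href can be looked up with plain dictionary access. A minimal sketch, assuming the asset has been loaded as a Python dict; `alternate_href` is a hypothetical helper name, not part of TiTiler or pystac:

```python
from typing import Any, Dict, Optional


def alternate_href(asset: Dict[str, Any], key: str = "s3") -> Optional[str]:
    """Return asset["alternate"][key]["href"], or None if it is absent."""
    return asset.get("alternate", {}).get(key, {}).get("href")


# Trimmed copy of the Landsat "blue" asset from the first example.
blue = {
    "href": "https://landsatlook.usgs.gov/tile/collection02/oli-tirs/2024/AK/001/011/LC09_AK_001011_20240818_20240822_02/LC09_AK_001011_20240818_20240822_02_SR_B2.TIF",
    "alternate": {
        "s3": {
            "storage:platform": "AWS",
            "storage:requester_pays": True,
            "href": "s3://usgs-landsat-ard/collection02/oli-tirs/2024/AK/001/011/LC09_AK_001011_20240818_20240822_02/LC09_AK_001011_20240818_20240822_02_SR_B2.TIF",
        }
    },
}

print(alternate_href(blue))          # the s3:// href
print(alternate_href(blue, "gcs"))   # None: no such alternate key
```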

Possible solution

Since we were facing collections like this for VEDA, we implemented a dedicated reader in veda-backend, class PgSTACReaderAlt(PgSTACReader), which reads the alternate asset keyed s3:

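# Import paths assume the current rio-tiler and titiler-pgstac package layouts.
import attr
from rio_tiler.errors import InvalidAssetName, MissingAssets
from rio_tiler.types import AssetInfo
from titiler.pgstac.reader import PgSTACReader


@attr.s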
class PgSTACReaderAlt(PgSTACReader):
    """Custom STAC Reader for the alternate asset format used widely by NASA.

    Only accepts `pystac.Item` as input (while rio_tiler.io.STACReader accepts url or pystac.Item)

    """

    def _get_asset_info(self, asset: str) -> AssetInfo:
        """Validate asset names and return the asset's info.
        Args:
            asset (str): STAC asset name.
        Returns:
            AssetInfo: url (the alternate s3 href), metadata and GDAL env for the asset.
        """
        if asset not in self.assets:
            raise InvalidAssetName(f"{asset} is not valid")

        asset_info = self.input.assets[asset]
        extras = asset_info.extra_fields

        if ("alternate" not in extras) or ("s3" not in extras["alternate"]):
            raise MissingAssets("No alternate asset found")

        info = AssetInfo(url=extras["alternate"]["s3"]["href"], metadata=extras)

        info["env"] = {}

        if "file:header_size" in asset_info.extra_fields:
            h = asset_info.extra_fields["file:header_size"]
            info["env"].update({"GDAL_INGESTED_BYTES_AT_OPEN": h})

        # Requester-pays buckets need AWS_REQUEST_PAYER set before GDAL can read them.
        if extras["alternate"]["s3"].get("storage:requester_pays"):
            info["env"].update({"AWS_REQUEST_PAYER": "requester"})

        if bands := extras.get("raster:bands"):
            stats = [
                (b["statistics"]["minimum"], b["statistics"]["maximum"])
                for b in bands
                if {"minimum", "maximum"}.issubset(b.get("statistics", {}))
            ]
            if len(stats) == len(bands):
                info["dataset_statistics"] = stats

        return info

This alternative tiler is then exposed at a new endpoint on the same application: /alt/collections/{collection_id}/items/{item_id}.

It is up to the user to pick the right URL pattern for a given collection.

The new base route /alt, as in /alt/collections, is non-invasive on the existing routes.
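
Because the caller has to choose the prefix per collection, a client might keep a small lookup of the collections that need the alternate reader. A sketch only: `ALT_COLLECTIONS`, `item_base_url` and the example API root are hypothetical; the /alt prefix and the path template are the ones described above.

```python
# Hypothetical set of collections whose assets are read via the alternate s3 href.
ALT_COLLECTIONS = {"landsat-c2ard-sr"}


def item_base_url(api_root: str, collection_id: str, item_id: str) -> str:
    """Build the item route, adding the /alt prefix for selected collections."""
    prefix = "/alt" if collection_id in ALT_COLLECTIONS else ""
    return f"{api_root.rstrip('/')}{prefix}/collections/{collection_id}/items/{item_id}"


print(item_base_url("https://example.com/api/raster", "landsat-c2ard-sr", "item-1"))
print(item_base_url("https://example.com/api/raster", "sentinel-2-l2a", "item-2"))
```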

An alternative could be to add a query parameter, such as &alternate_assets_key=s3, to endpoints like /collections/{collection_id}/items/{item_id}/tiles/{tileMatrixSetId}/{z}/{x}/{y}.{format}. Or, instead of specifying the key, pass a flag, since s3 seems to be a commonly used alternate asset key?
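
To make the two variants concrete, here is a rough sketch of how a reader could resolve the href from either a key parameter or a flag. Nothing here exists in TiTiler today: `resolve_href`, `use_alternate` and the placeholder hrefs are assumptions for illustration, and only `alternate_assets_key` is taken from the suggestion above.

```python
from typing import Any, Dict, Optional


def resolve_href(
    asset: Dict[str, Any],
    alternate_assets_key: Optional[str] = None,
    use_alternate: bool = False,
) -> str:
    """Return the href to open for one STAC asset.

    alternate_assets_key mirrors the proposed query parameter; use_alternate
    is the flag variant and assumes "s3" as the default key.
    """
    key = alternate_assets_key or ("s3" if use_alternate else None)
    if key is None:
        return asset["href"]  # default behaviour: main asset href
    alternates = asset.get("alternate", {})
    if key not in alternates:
        raise KeyError(f"No alternate asset '{key}' found")
    return alternates[key]["href"]


asset = {
    "href": "https://example.com/data/B2.TIF",  # placeholder values
    "alternate": {"s3": {"href": "s3://example-bucket/data/B2.TIF"}},
}
print(resolve_href(asset))                             # main href
print(resolve_href(asset, alternate_assets_key="s3"))  # alternate href
print(resolve_href(asset, use_alternate=True))         # alternate href via flag
```

With either variant the existing routes keep their current behaviour unless the caller opts in, which is the same non-invasive property the /alt prefix has.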

Acceptance criteria

  • For selected collections, the TiTiler tiles endpoints can read each asset from its s3 alternate asset href instead of the main asset href

Labels

  • enhancement
  • discussion
  • help wanted
