Possible datasets for benchmarks #1

Open
@brendan-ward

A few data sources to consider for bigger benchmarks:

U.S. high resolution hydrography data

These are served per 4-digit hydrologic unit code (HUC4) watershed: download a *_gdb.zip, which contains the data in an ESRI File Geodatabase.

  • NHDFlowline: river / stream center lines
  • NHDWaterbody: lakes / rivers
  • (several other layers with other geometry types but generally smaller in volume)

We use some of these in pyogrio

Useful for testing intersection of waterbodies and flowlines, clipping, etc.
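A minimal sketch of what such an intersection benchmark could look like, assuming Shapely 2.x and NumPy are installed. The synthetic squares and lines are only stand-ins for NHDWaterbody polygons and NHDFlowline lines (real data would be read from the geodatabase first), and `time_intersections` is a made-up helper name, not part of any library.

```python
import time

import numpy as np
import shapely


def time_intersections(lines, polygons):
    """Find (line, polygon) index pairs that intersect and time the query.

    Hypothetical helper for illustration only.
    """
    start = time.perf_counter()
    tree = shapely.STRtree(polygons)
    # For array input, query returns a (2, n) array:
    # row 0 = indices into `lines`, row 1 = indices into `polygons`
    pairs = tree.query(lines, predicate="intersects")
    elapsed = time.perf_counter() - start
    return pairs, elapsed


# Stand-in "waterbodies": a 10 x 10 grid of unit squares spaced 2 units apart
offsets = np.arange(10) * 2.0
xs, ys = np.meshgrid(offsets, offsets)
polygons = shapely.box(xs.ravel(), ys.ravel(), xs.ravel() + 1, ys.ravel() + 1)

# Stand-in "flowlines": one horizontal line through the middle of each row
lines = shapely.linestrings(
    [[(0.0, y + 0.5), (19.0, y + 0.5)] for y in offsets]
)

pairs, elapsed = time_intersections(lines, polygons)
print(f"{pairs.shape[1]} intersecting pairs in {elapsed:.4f}s")
```

With real hydrography layers the same function could be called on the geometry arrays of the two layers; the spatial index build is included in the timing on purpose, since it is part of the cost.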

World Database on Protected Areas
(see the download button)

A 3 GB dataset covering terrestrial and marine protected areas.

One of the "advantages" of benchmarking with some of these is that the geometries are not always clean, so they could be good for benchmarking operations like making them valid, unioning them together, or intersecting them with admin boundaries like countries or EEZs (below).
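To illustrate the kind of cleanup this refers to, here is a small sketch assuming Shapely 2.x. The self-intersecting "bowtie" polygon is a toy stand-in for an unclean protected-area geometry, and the neighbouring square is invented for the example; neither comes from the actual dataset.

```python
import shapely
from shapely import Polygon

# A self-intersecting "bowtie" ring: a typical kind of unclean geometry
bowtie = Polygon([(0, 0), (2, 2), (2, 0), (0, 2)])

# Toy stand-in for a clean, overlapping neighbouring area
square = shapely.box(1, 0, 3, 2)

print("valid before repair:", shapely.is_valid(bowtie))

# Repair the invalid polygon, then dissolve it with its neighbour
repaired = shapely.make_valid(bowtie)
merged = shapely.union_all([repaired, square])

print("valid after repair:", shapely.is_valid(repaired))
print("repaired geometry type:", repaired.geom_type)
print("merged area:", merged.area)
```

A benchmark would apply the same two calls to whole arrays of geometries and time each step separately, since repair and union can have very different costs.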

Marine regions

For example, the World EEZ (Exclusive Economic Zones) dataset would be a useful one to intersect with the marine protected areas above.
