Description
We have a use case in which geometry needs to be processed using one of the GIS plugins / transforms, but arrives in another representation: a PostGIS table with a bytea field containing well-known binary (WKB), or, more generally, any source (including CSV) with a string field containing well-known text (WKT).
As far as I know, there is currently no way to convert between these representations. This raises the question of where and how such conversions should be supported within Apache Hop. When reading from or writing to PostgreSQL/PostGIS or other sources, geometry data may appear as follows:
- a PostgreSQL field of type varchar/text with WKT or EWKT (with SRID)
- a PostgreSQL field of type "bytea", appearing as PostGIS "geometry" type, with values stored as PostGIS WKB or EWKB (with SRID)
- a PostgreSQL field of type "bytea" with values stored in other binary forms (not supported here).
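To make the difference between plain WKB and EWKB concrete, here is a minimal Python sketch that inspects only the header of a binary value. It assumes the PostGIS EWKB layout (a byte-order marker, a 32-bit type word whose high bits flag Z, M and SRID, and an optional 32-bit SRID). The names `WkbHeader` and `read_wkb_header` are hypothetical and not part of Hop or the GIS plugin.

```python
import struct
from typing import NamedTuple, Optional

# Flag bits PostGIS sets in the EWKB geometry-type word.
EWKB_Z_FLAG = 0x80000000
EWKB_M_FLAG = 0x40000000
EWKB_SRID_FLAG = 0x20000000
EWKB_FLAG_MASK = EWKB_Z_FLAG | EWKB_M_FLAG | EWKB_SRID_FLAG


class WkbHeader(NamedTuple):
    """Hypothetical container for the fields found in a WKB/EWKB header."""
    little_endian: bool
    geometry_type: int    # type code with the EWKB flag bits cleared, e.g. 1 = Point
    has_z: bool
    has_m: bool
    srid: Optional[int]   # None when no SRID is embedded (plain WKB)


def read_wkb_header(data: bytes) -> WkbHeader:
    """Parse the byte order, type word and optional SRID; coordinates are not read."""
    if len(data) < 5:
        raise ValueError("value is too short to be WKB")
    if data[0] not in (0, 1):
        raise ValueError("invalid byte-order marker")
    little_endian = data[0] == 1
    prefix = "<" if little_endian else ">"

    # The unsigned 32-bit type word follows the byte-order marker.
    (type_word,) = struct.unpack_from(prefix + "I", data, 1)

    srid = None
    if type_word & EWKB_SRID_FLAG:
        if len(data) < 9:
            raise ValueError("SRID flag set but no SRID present")
        # The signed 32-bit SRID sits directly after the type word.
        (srid,) = struct.unpack_from(prefix + "i", data, 5)

    return WkbHeader(
        little_endian=little_endian,
        geometry_type=type_word & ~EWKB_FLAG_MASK,
        has_z=bool(type_word & EWKB_Z_FLAG),
        has_m=bool(type_word & EWKB_M_FLAG),
        srid=srid,
    )
```

For example, `read_wkb_header(bytes.fromhex("0101000020E6100000000000000000F03F0000000000000040"))` reports a little-endian Point with SRID 4326, while the same point without the SRID flag yields `srid=None`. A real conversion would hand the bytes to a geometry library rather than parse them by hand.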
In order to use GIS operations in Hop, such as Geometry Information, Geoprocessing, and CRS transforms, we need an explicit Geometry value type.
Currently, as far as I know, there is no straightforward transform to cast back and forth between a string (as WKT/EWKT) or bytea (as WKB/EWKB) and an explicit Hop geometry type recognized by the Hop GIS transforms.
Options considered:
- Extend the core "Table Input" and "Table Output" transforms. Add PostGIS geometry handling directly in Hop’s built-in database transforms.
- ✅ Transparent to users
- ❌ Touches core Hop code and all DB backends
- ❌ Harder to maintain; DB-specific logic in generic transforms
- Extend the “Select Values” transform. Enable the existing “Select Values” transform to cast from/to the geometry type using the GIS plugin’s “Geometry” value type.
- ✅ Reuses existing, well-known transform
- ✅ Works with Spark/Flink/Dataflow engines
- ⚠️ Requires extending the Geometry ValueMeta to implement conversions (WKT↔Geometry, WKB↔Geometry)
- Add a new GIS transform. Create a transform that handles all conversions between WKT, EWKT, WKB, PostGIS geometry, and the Hop GIS geometry type. That could include SRID handling and format options.
- ✅ Keeps DB and GIS code separate
- ✅ Easier to discover for users working with GIS data
- ⚠️ Functionally overlaps with "Select Values"
- ⚠️ Likely Hop-engine only
I would lean towards option 3, which adds a transform called, e.g., "Select GIS Value".
- This option keeps both core DB transforms and existing GIS transforms untouched.
- It provides a clean, GIS-specific interface for converting between geometry representations.
- It can also reuse the conversion logic from the GIS plugin's ValueMetaGeometry implementation internally.
Example use case pipeline:
Read WKT field from CSV or read from PostgreSQL/PostGIS table.
→ Use "Select GIS Value" transform to cast either WKB to Geometry or WKT to Geometry.
→ Apply GIS transforms (e.g., Geoprocessing).
→ Optionally cast Geometry to WKB or Geometry to WKT for output
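The text leg of this pipeline depends on separating the SRID from the geometry text when the input is EWKT. As a rough illustration, the sketch below assumes the PostGIS convention of an `SRID=<n>;` prefix in front of ordinary WKT. The helpers `split_ewkt` and `to_ewkt` are hypothetical names, not existing Hop or GIS plugin functions.

```python
import re
from typing import Optional, Tuple

# Matches an optional leading "SRID=<integer>;" prefix, as used by PostGIS EWKT.
_EWKT_PREFIX = re.compile(r"^\s*SRID\s*=\s*(-?\d+)\s*;\s*", re.IGNORECASE)


def split_ewkt(text: str) -> Tuple[Optional[int], str]:
    """Return (srid, wkt); srid is None when the input is plain WKT."""
    match = _EWKT_PREFIX.match(text)
    if match is None:
        return None, text.strip()
    return int(match.group(1)), text[match.end():].strip()


def to_ewkt(wkt: str, srid: Optional[int]) -> str:
    """Re-attach an SRID prefix for output; plain WKT is returned unchanged."""
    return wkt if srid is None else f"SRID={srid};{wkt}"
```

Here `split_ewkt("SRID=4326;POINT(1 2)")` gives `(4326, "POINT(1 2)")`, and `to_ewkt("POINT(1 2)", 4326)` restores the original string. The WKT part would still need to be parsed into a geometry by the GIS plugin's own conversion logic.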
Alternatives:
No alternatives known.
Has the feature been requested before?
Not AFAIK.
If the feature request is approved, would you be willing to submit a PR?
Depends on the option chosen: If it's about extending the existing transforms, then probably not. If it's about option 3 and a new transform, then probably yes.