TRT-635: Migrate GDAL commands to Python bindings.#42

Merged
owenlittlejohns merged 6 commits into main from TRT-635-remove-shell-command-execution
Oct 7, 2025

@owenlittlejohns owenlittlejohns commented Oct 2, 2025

Description

Alright then. This is the big one! (Or at least one of the things we've been talking about for the longest)

This PR converts requests to GDAL over to using the available Python bindings, rather than executing shell commands directly. I'll leave some comments on the PR, but I think a couple of really helpful links:

Also, I must admit, I initially leant on the Google AI generated answer for how to convert ogr2ogr to Python, but the options in VectorTranslateOptions gave me a bit of confidence.

I think we're also possibly missing some regression tests - looks like we don't really test reprojection or shape file workflows.

Jira Issue ID

TRT-635 - the neverending story

Local Test Steps

  • Pull this branch. Execute the unit tests: ./bin/build-image && ./bin/build-test && ./bin/run-test. All tests should pass.
  • Fire up an instance of Harmony in a Box, with LOCALLY_DEPLOYED_SERVICES=harmony-gdal-adapter
  • Run through the HGA Regression tests, by:
    • Creating the papermill-hga conda environment: conda env create -f hga/environment.yml.
    • Activating that environment: conda activate papermill-hga.
    • Starting a Jupyter notebook server: jupyter notebook.
    • Navigating in your browser to the HGA regression test notebook.
    • Setting harmony_host_url='http://localhost:3000'.
    • Running all tests. They should pass.
  • Confirm the service version has been added to the output file:
    • Go to http://localhost:3000/jobs and pick a job with a GeoTIFF output.
    • Download the output file.
    • Run gdalinfo on that output.
    • Look at the gdalinfo output in your terminal to find a line:
gdal_subsetter_version=3.0.5

(This will be in the Metadata section, a little bit above the corner points.)
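If you would rather not scan the terminal by eye, the check in the last step can be scripted. This is only a minimal sketch: `find_metadata_value` is a hypothetical helper (it is not part of HGA), and the sample text is a made-up excerpt in the usual `KEY=VALUE` layout of the gdalinfo Metadata section.

```python
from typing import Optional


def find_metadata_value(gdalinfo_text: str, key: str) -> Optional[str]:
    """Return the value of a KEY=VALUE metadata line in gdalinfo text output."""
    prefix = f"{key}="
    for line in gdalinfo_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(prefix):
            return stripped[len(prefix):]
    # The key was not found anywhere in the output.
    return None


# Made-up excerpt for illustration only; real gdalinfo output is much longer.
sample_output = """Metadata:
  AREA_OR_POINT=Area
  gdal_subsetter_version=3.0.5
Corner Coordinates:
"""

print(find_metadata_value(sample_output, "gdal_subsetter_version"))  # 3.0.5
```

To use it on a real file, save the gdalinfo output to a text file and pass its contents in place of `sample_output`.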

PR Acceptance Checklist

  • Jira ticket acceptance criteria met.
  • version.txt and CHANGELOG.md updated if any service code is changed.
  • Tests added/updated and passing.
  • Documentation updated (if needed).

Comment thread gdal_subsetter/transform.py
@flamingbear flamingbear left a comment

This is great, so much nicer. But I see at least one typo and had a question about using process from the Format.

Comment on lines -339 to -345
def execute_gdal_command(self, *args) -> list[str]:
    """This is used to execute gdal* commands."""
    self.logger.info(
        args[0] + " " + " ".join(["'{}'".format(arg) for arg in args[1:]])
    )
    result_str = check_output(args).decode("utf-8")
    return result_str.split("\n")
Member

🎉

Comment thread gdal_subsetter/transform.py Outdated
def resize(self, layerid: str, srcfile: str, dstdir: str) -> str:
    """Resizes the input layer and returns the name of the output file.

    Uses a call to gdal_translate using information in message.format.
Member

It doesn't do this anymore.

Member Author

Well, it kind of does - osgeo.gdal.Translate is just a binding around gdal_translate - but I can improve the documentation string.

Member Author

Updated in 62f49ef.

Member

I definitely thought gdal_translate was the executable. I didn't realize osgeo.gdal.Translate actually called it.
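For anyone else reading along, here is a rough sketch of what swapping the shell call for the binding can look like, assuming the GDAL Python bindings (`osgeo`) are installed. `build_resize_options` and `resize_with_bindings` are hypothetical names for illustration, not the functions in this PR. The keyword names `width`, `height` and `resampleAlg` are the ones `gdal.TranslateOptions` accepts. As far as I understand it, the binding calls the same library routine that the gdal_translate utility wraps, so no executable is spawned.

```python
from typing import Any


def build_resize_options(
    width: int, height: int, resample_algorithm: str = "nearest"
) -> dict[str, Any]:
    """Collect keyword arguments accepted by gdal.TranslateOptions."""
    return {
        "width": width,
        "height": height,
        "resampleAlg": resample_algorithm,
    }


def resize_with_bindings(src_file: str, dst_file: str, **options: Any) -> str:
    """Resize a raster via the bindings rather than check_output([...])."""
    # Imported lazily so the option-building helper works without GDAL.
    from osgeo import gdal

    gdal.UseExceptions()  # raise Python exceptions instead of returning None
    # Roughly equivalent to: gdal_translate -outsize W H -r ALG src dst
    dataset = gdal.Translate(dst_file, src_file, **options)
    dataset = None  # drop the reference so GDAL flushes and closes the file
    return dst_file
```

A call would then look something like `resize_with_bindings("in.tif", "out.tif", **build_resize_options(512, 256, "bilinear"))`, where the file names are placeholders.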

    Uses a call to gdal_translate using information in message.format.

    """
    interpolation = self.message.format.process("interpolation")
Member

Doesn't process mark this as being "used" or something? I thought process was a way to remove things from a message before downstream services use it. I only have a vague idea about this and I'm sure you told me. But just looking at this change and want to make sure we're not breaking something.

Member Author

You're right - process is meant to do that. From what I can tell:

So:

  1. My first concern was that the message wouldn't be able to reuse the same property within the service, and some of the message properties are used multiple times (CRS, for example). This is irrelevant. Message.output_data won't preclude parameter reuse in the same service.
  2. Do we want to strip things out of the message? I'm honestly not sure. I could see times when it's good (preventing repeat processing) and times when it's bad (the same parameter being needed in multiple services).

I guess I don't have an answer for what we should do, but definitely acknowledge this is now doing something different. What are your thoughts?

Member

I don't have strong opinions. This used to remove an "interpolation" value so it wouldn't be available to downstream services, but this is the service that should do interpolation probably. I don't think a lot of services are using the process mechanism. But I also don't know if they should be.

How is that for a non-answer?

Member Author

😆

I added a disclaimer in the method and function documentation strings that makes it clear the parameters aren't being removed in b37c762.
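To make the behaviour difference in this thread concrete, here is a toy model. It is not the harmony-service-lib implementation: `ToyMessage`, its `get` method and its internals are invented for illustration, and only the names `process` and `output_data` come from the discussion above. It assumes that processed parameters are the ones left out of what downstream services receive.

```python
class ToyMessage:
    """Invented stand-in for a message that tracks processed parameters."""

    def __init__(self, **fields):
        self._fields = dict(fields)
        self._processed = set()

    def get(self, name):
        """Plain read: the parameter stays in the outgoing message."""
        return self._fields.get(name)

    def process(self, name):
        """Read the parameter and mark it as handled by this service."""
        self._processed.add(name)
        return self._fields.get(name)

    @property
    def output_data(self):
        """Parameters a downstream service would still see."""
        return {
            key: value
            for key, value in self._fields.items()
            if key not in self._processed
        }


message = ToyMessage(interpolation="bilinear", crs="EPSG:4326")

message.get("interpolation")  # new behaviour: read without marking
still_present = "interpolation" in message.output_data

first = message.process("interpolation")  # old behaviour: read and mark
second = message.process("interpolation")  # reuse in the same service works
removed = "interpolation" not in message.output_data
```

In this model, reading a parameter twice inside one service is fine either way; the only difference is whether the parameter survives into `output_data`.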

    resampleAlg=resample_algorithm,
    height=output_height,
    width=output_width,
)
Member

I think you're missing the unscale option from L476? (Never mind, we stopped supporting PNGs.)

Comment thread gdal_subsetter/transform.py Outdated

interpolation = rgetattr(harmony_message, "format.interpolation", None)

if interpolation in get_args(ResampleAlgorithms):
Member

I definitely had to google get_args

Member Author

Yeah - I think we might have used it in xarray when we were doing the DataTree migration, but it's not my most familiar go-to.
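For future readers who also need to look it up: `typing.get_args` returns the values listed inside a `Literal` type, so the type alias can double as the list of allowed strings. A small self-contained sketch follows. The algorithm names, the `rgetattr` body, the `"nearest"` fallback and `get_resample_algorithm` are all assumptions for illustration; the real `ResampleAlgorithms` alias and helper in this PR may differ.

```python
from types import SimpleNamespace
from typing import Any, Literal, get_args

# Hypothetical subset of names; the alias in the PR may list other values.
ResampleAlgorithms = Literal["nearest", "bilinear", "cubic"]


def rgetattr(obj: Any, path: str, default: Any = None) -> Any:
    """Follow a dotted attribute path, returning default if any step is missing."""
    for name in path.split("."):
        obj = getattr(obj, name, None)
        if obj is None:
            return default
    return obj


def get_resample_algorithm(harmony_message: Any, fallback: str = "nearest") -> str:
    """Use the requested interpolation only if it is an allowed value."""
    interpolation = rgetattr(harmony_message, "format.interpolation", None)

    if interpolation in get_args(ResampleAlgorithms):
        return interpolation

    # Unknown or missing values fall back (the fallback here is an assumption).
    return fallback


# SimpleNamespace stands in for a real Harmony message in this sketch.
requested = SimpleNamespace(format=SimpleNamespace(interpolation="bilinear"))
unsupported = SimpleNamespace(format=SimpleNamespace(interpolation="lanczos"))
empty = SimpleNamespace()

print(get_args(ResampleAlgorithms))  # ('nearest', 'bilinear', 'cubic')
```

With these stand-ins, `requested` yields `"bilinear"`, while `unsupported` and `empty` both fall back to `"nearest"`.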

owenlittlejohns commented Oct 6, 2025

The unit tests failing are... interesting. The failures are imports of _gdal from inside the osgeo dependency. I'd not changed that, and I can't see a failure in the installation of GDAL in the CI/CD logs. Given that I hadn't changed anything in the environment between the last time the tests passed and this commit, I'm a bit surprised, if I'm honest.

Update: The conda install command in service.Dockerfile was still trying to get gdal from the defaults channel. It now retrieves it from conda-forge and that seems to have made things work again. I'll not pretend to understand the difference between the two versions, though.

@flamingbear flamingbear left a comment

I don't have opinions on the change to the message format behavior. But if you leave it maybe note it as a difference in behavior? Thanks for fixing the other things. I do think keeping this smaller and adding tests and refactoring should be done in another PR.

Comment thread docker/service.Dockerfile

# Install GDAL
-RUN conda run --name hga conda install gdal=3.6.2
+RUN conda run --name hga conda install gdal=3.6.2 --channel conda-forge --override-channels
Member

That's a head scratcher that I won't get pulled into. But good catch.

@owenlittlejohns
Member Author

> I do think keeping this smaller and adding tests and refactoring should be done in another PR.

Yup. I think that would be the best target for the next PR. (Not sure whether that comes before or after adding regression tests for projection. But these feel like the two next things to do)

@owenlittlejohns owenlittlejohns merged commit 2d097ea into main Oct 7, 2025
4 checks passed
@owenlittlejohns owenlittlejohns deleted the TRT-635-remove-shell-command-execution branch October 7, 2025 19:25