Add saint.tech (IMG Saxony-Anhalt) image provider DAG #5576
Open
wprashed wants to merge 26 commits into
Resolves #5571
This PR adds a new Provider API DAG for SAiNT (IMG Saxony-Anhalt) to ingest CC-licensed images of touristic points of interest, tours, and events in Saxony-Anhalt.
Changes Made
- Added `saint_tech` as a default provider in `catalog/dags/common/loader/provider_details.py` and set its default image category to `PHOTOGRAPH`.
- Added `catalog/dags/providers/provider_api_scripts/saint.py` containing the `SaintDataIngester` class. The script paginates via the `page` and `pageSize` parameters and passes an API key fetched from the Airflow Variable `API_KEY_SAINT`. It extracts the `PrimaryImage` attributes (URL, width, height), the POI title, and the foreign identifier, and evaluates the attached licenses to ensure compatibility.
- Registered `SaintDataIngester` in `catalog/dags/providers/provider_workflows.py` to run on a monthly schedule.
- Added unit tests in `catalog/tests/dags/providers/provider_api_scripts/test_saint.py` for fetching parameters, batch data extraction, and correct parsing of records.
Testing Instructions
1. Set the API key as an Airflow Variable: `airflow variables set API_KEY_SAINT "<your-saint-tech-api-key>"`
2. Run the unit tests: `just catalog/test tests/dags/providers/provider_api_scripts/test_saint.py`
3. Trigger the `saint_tech_workflow` DAG to verify real-world data ingestion works as expected.
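
For reviewers who want a mental model before opening the diff, the pagination and record extraction described under Changes Made can be sketched roughly as below. This is a standalone illustration, not the code in `saint.py`: the function names, the page size, the 1-based starting page, and the response keys `Id`, `Title`, `Url`, `Width`, `Height`, and `Licenses` are all assumptions. Only `page`, `pageSize`, and `PrimaryImage` come from the description above. API key handling is left out because the description does not say how the key is sent, and the license check here is a placeholder substring test rather than the real compatibility logic.

```python
from typing import Any, Optional

PAGE_SIZE = 100  # assumed value; the PR description does not state the page size


def next_query_params(prev: Optional[dict[str, Any]]) -> dict[str, Any]:
    """Return parameters for the first page, or advance `page` by one."""
    if prev is None:
        # Assumption: the API uses 1-based page numbers.
        return {"page": 1, "pageSize": PAGE_SIZE}
    return {**prev, "page": prev["page"] + 1}


def pick_cc_license(licenses: list[str]) -> Optional[str]:
    """Placeholder check: return the first Creative Commons URL, if any."""
    for url in licenses:
        if "creativecommons.org/" in url:
            return url
    return None


def extract_record(poi: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Build one image record from a POI, or return None to skip it."""
    image = poi.get("PrimaryImage") or {}
    url = image.get("Url")                    # hypothetical key name
    foreign_id = poi.get("Id")                # hypothetical key name
    license_url = pick_cc_license(poi.get("Licenses") or [])  # hypothetical key
    if not (url and foreign_id and license_url):
        return None  # skip records without an image, an id, or a usable license
    return {
        "foreign_identifier": str(foreign_id),
        "url": url,
        "width": image.get("Width"),
        "height": image.get("Height"),
        "title": poi.get("Title"),
        "license_url": license_url,
    }
```

A record with no `PrimaryImage` or no Creative Commons license URL is skipped by returning `None`, which keeps the filtering decision in one place.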