Further modularize #42

@tcmetzger

Description

Refactor the code into three functional areas:

  • The core functionality, processing the data into a cloud-warehouse-compatible format, lives in a core module (io.py). This core module always takes a NumPy array as input and has only minimal dependencies.
  • All handling of concrete input and output data happens in handler modules. For version 0.1, we will have one handler module for local TIF file input and one for writing data to BigQuery. In future versions, we could add handler modules that read input from an S3 bucket, store data in Redshift, etc. Each handler module has its own dependencies (such as rasterio for reading TIF files or google.cloud.bigquery for writing to BigQuery), and those dependencies are installed only when the handler module is actually used.
  • Interaction with end users happens through the CLI module, which wraps all functionality of the handler modules and the core module. The CLI will also handle installing additional dependencies (e.g. when the TIF reading handler module is used for the first time) and setting up credentials where necessary (e.g. asking for the location of the credentials file for BigQuery).
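
One possible way to keep handler dependencies out of the core module is to resolve them only when a handler is first requested. The following is a minimal sketch of that idea, not the project's actual design: the registry `OPTIONAL_DEPENDENCIES` and the helpers `is_available` and `require_handler` are hypothetical names introduced here for illustration. Only the import names (`rasterio`, `google.cloud.bigquery`) come from the description above.

```python
import importlib
import importlib.util

# Hypothetical registry (an assumption for this sketch):
# handler name -> (import name, PyPI package name).
OPTIONAL_DEPENDENCIES = {
    "tif": ("rasterio", "rasterio"),
    "bigquery": ("google.cloud.bigquery", "google-cloud-bigquery"),
}


def is_available(import_name: str) -> bool:
    """Report whether a module can be found, without importing the module itself."""
    try:
        return importlib.util.find_spec(import_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (e.g. "google") is itself missing.
        return False


def require_handler(name: str):
    """Import and return the third-party module behind a handler on first use."""
    import_name, package = OPTIONAL_DEPENDENCIES[name]
    if not is_available(import_name):
        # The CLI layer could catch this and offer to install the package.
        raise RuntimeError(
            f"The '{name}' handler needs '{package}'. "
            f"Install it with: pip install {package}"
        )
    return importlib.import_module(import_name)
```

With this layout, the core module never imports rasterio or google.cloud.bigquery. A handler would call `require_handler("tif")` inside its own functions, and the CLI could turn the resulting error into an install prompt.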

Metadata

Labels

enhancement: New feature or request

Type

No type

Projects

Status

Todo
