Releases: The-Academic-Observatory/observatory-platform

0.6.0

06 Dec 03:42
2b34ebf

Full Changelog: 0.5.0...0.6.0

0.5.0

23 Jun 04:17
814cb28

Full Changelog: 0.4.0...0.5.0

0.4.0

20 May 03:38
353f7a7

Full Changelog: 0.3.0...0.4.0

0.3.0

18 Mar 00:29
fbbc382

Full Changelog: 0.2.1...0.3.0

0.3.0-dev

06 Jan 06:33

Pre-release
INF-73: Minor updates to unit tests

0.2.1

30 Sep 01:09
64b62cb

This release includes the following bugfix in the Dockerfile:

  • Install apache-airflow-providers-google==5.1.0 with --no-deps so that pip doesn't spend forever trying to resolve dependencies for the package, which we only use for remote logging and secret manager backend in the cloud deployment. The google-cloud-secret-manager Python package is added as a dependency in requirements.txt.

0.2.0

29 Sep 21:28
b8c9cac

This release includes the following changes / new features:

  • Upgrade to Airflow 2.1.4.
  • Stream Telescope: remove use of XComs so that it is easier to maintain.
  • Updated documentation.
  • download_files: uses DownloadInfo class and prefix_dir parameter to allow prefixing the filename paths.
  • Remove third party get_file and _hash_file functions as they are replaced by get_file_hash and download_file.
  • Command line interface: added generate workflow and project commands.
  • Added OrganisationTelescope.
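
The notes above mention get_file_hash as the replacement for the third-party _hash_file helper. As a rough illustration of what a helper of that kind typically does, here is a minimal sketch that hashes a file in fixed-size blocks so that large downloads are never read into memory at once. The function name file_hash_sketch, its parameters, and the sha256 default are assumptions made for this example; they are not taken from the observatory-platform API.

```python
import hashlib


def file_hash_sketch(file_path: str, algorithm: str = "sha256", chunk_size: int = 8192) -> str:
    """Return the hex digest of a file, reading it in fixed-size blocks.

    Hypothetical helper: the name, parameters and defaults are illustrative only.
    """
    hasher = hashlib.new(algorithm)
    with open(file_path, "rb") as f:
        # iter() with a sentinel keeps calling f.read until it returns b"" (end of file)
        for block in iter(lambda: f.read(chunk_size), b""):
            hasher.update(block)
    return hasher.hexdigest()
```

A caller could compare the returned digest with a checksum published alongside the file to decide whether a download needs to be repeated.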

And the following bugfixes:

  • Docker Compose file: rename deprecated Airflow config environment variables.
  • Docker Compose file: change AIRFLOW__SECRETS__BACKEND to use the class installed from the apache-airflow-providers-google package and remove the airflow subpackage, as it is no longer required.
  • Fix typo in config.yaml.jinja2.
  • Fix on_failure_callback function.

0.1.1

20 Sep 22:33
ce5893b

This release includes the following bugfixes:

  • Sdist building:
    • added missing data_files in config.cfg.
  • Docker Compose / Airflow 2:
    • Tasks that use multiprocessing.Pool failed with the error "daemonic processes are not allowed to have children". To address this, added AIRFLOW__CORE__EXECUTE_TASKS_NEW_PYTHON_INTERPRETER to the Docker Compose file. This is the same error described in this Stack Overflow post.
    • Set Docker Compose volume paths correctly for editable workflows packages when deployed to Terraform.
  • Terraform:
    • Fix the Terraform config, where the Google Cloud Secrets that were created had their value set to the secret key instead of the secret value.
    • Update TerraformBuilder so that it builds with the latest changes.
    • Observatory API: the postgres connection prefix is deprecated in SQLAlchemy 1.4, so it was changed to postgresql in the Terraform file.
  • Address inconsistent use of dates:
    • Change type hints pendulum.datetime to pendulum.DateTime (the class, not function).
    • Change datetime.datetime calls to pendulum.datetime.
    • Make select_table_shard_dates return List[pendulum.Date]
    • Add a make_release_date function, which returns a pendulum.DateTime instance, which is required for some of the downstream functions that use it.
  • get_airflow_connection_url: call get_uri to get the uri.

And the following new features:

  • Added black to precommit config.
  • load_dags.py:
    • When DagBag has import errors, raise an exception that has a message with all of the errors so that the Dag import errors are visible in the
  • Testing:
    • Add simple threaded httpserver for testing use
  • Utilities:
    • Add get_observatory_http_header to create simple header dict using custom user agent
    • Add get_filename_from_url to get a filename from an HTTP URL.
    • Add get_chunks function to split lists into constant size (unless last chunk) chunks.
    • Add get_airflow_connection_url to pull a url from an airflow connection, validate it, and add trailing "/" if necessary.
    • Add converter function for csv to jsonl files.
    • Add HTTP GET response functions that provide simple, standardised interfaces for getting a raw text response, XML -> dict, and JSON -> dict.
    • Add AsyncHttpFileDownloader with download_file and download_files interfaces for downloading files using http. Allows custom headers to be used in http connection.
    • download_files allows concurrent downloading through asyncio and aiohttp. Supports retry on failure with exponential backoff.
    • download_file piggybacks off download_files. No speed benefit from asyncio, but provides a simpler interface.
    • Add get_airflow_connection_password.
    • Add unzip_files function.
    • Add find_replace_file (a replacement for the sed CLI).
    • Add a function to wrap shell command calls. It treats a non-zero exit code as an error.
  • Snapshot telescope:
    • Add upload_downloaded as a snapshot telescope task. Many of the upload_downloaded tasks in snapshot telescopes are identical in implementation: they upload the download_files list of files from the release object to the download_bucket in the cloud. Since this is a standard pattern, it is now part of the snapshot telescope implementation.
    • Add download, extract, transform tasks to template.
  • Stream telescope:
    • Add download, upload_downloaded, extract, transform tasks to template.
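
Two of the utility behaviours listed above, splitting a list into constant-size chunks and retrying with exponential backoff, can be sketched with the standard library alone. This is an illustration under stated assumptions, not the platform's implementation: the names chunk_list and retry_with_backoff, their parameters, and the default delays are hypothetical, and the real download_files uses aiohttp, which is left out here.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence, TypeVar

T = TypeVar("T")


def chunk_list(items: Sequence[T], chunk_size: int) -> List[Sequence[T]]:
    """Split items into chunks of chunk_size; only the last chunk may be shorter."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [items[i : i + chunk_size] for i in range(0, len(items), chunk_size)]


async def retry_with_backoff(
    make_attempt: Callable[[], Awaitable[T]],
    *,
    max_attempts: int = 3,
    base_delay: float = 0.5,
    factor: float = 2.0,
) -> T:
    """Await make_attempt(), retrying on any exception with exponentially growing delays.

    Hypothetical helper: make_attempt stands in for a single download attempt.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return await make_attempt()
        except Exception:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the last error to the caller
            await asyncio.sleep(delay)  # Wait without blocking other downloads
            delay *= factor  # Each wait is `factor` times longer than the previous one
    raise AssertionError("unreachable")
```

With these defaults, a download that keeps failing is attempted three times, with waits of 0.5 s and then 1 s between attempts, before the last exception is re-raised.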

0.1.0

10 Sep 05:02
434cbfc

First observatory-platform release.