v2.0 - Parquet dataset ETL architecture

@ghukill released this 24 Feb 14:07
52cb699

This v2.0 release switches TIMDEX ETL to Parquet datasets as its primary storage architecture.

What's Changed

  • Install timdex-dataset-api library from main by @ghukill in #223
  • TIMX 454 - parse BS4 in isolated thread to avoid dangling memory pointers by @ghukill in #231
  • TIMX 406 - add provenance data to transformed records by @ghukill in #233
  • TIMX 459 - update logging by @ghukill in #244

Full Changelog: v1.6.0...v2.0