Background
Currently, the track.days option makes Pramen check the record counts of previous days to ensure that data at the source hasn't changed. There is now a new requirement:
- If data has already been loaded for a specific day, never re-check the record count.
- If data hasn't been loaded for a specific tracked day, re-check the record count.
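To make the difference between the two behaviors concrete, here is a minimal sketch in Python. It is not Pramen code: the function names are hypothetical, `None` stands for "nothing has been loaded for this day", and it assumes that a record count mismatch is what triggers a reload under track.days.

```python
from typing import Optional


def reload_when_tracking(loaded_count: Optional[int], source_count: int) -> bool:
    """Current track.days behavior as described above.

    Assumption: a day is reloaded when it is missing or its count differs.
    """
    return loaded_count is None or loaded_count != source_count


def reload_when_backfilling(loaded_count: Optional[int], source_count: int) -> bool:
    """Requested behavior: only days that were never loaded are re-checked."""
    if loaded_count is not None:
        return False  # already loaded: never re-check the record count
    return source_count > 0  # not loaded yet: load once data appears at the source
```

With these definitions, a day loaded with 100 records whose source now has 120 would be reloaded under tracking but left alone under backfilling, while a day that was never loaded would be picked up as soon as the source has records for it.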
Feature
Add an option that lets Pramen look back and run ingestion for previous days for which there was no data at the source before.
Example [Optional]
--
Proposed Solution [Optional]
Option 1
Add a new option that enables this behavior, for example:
pramen {
  backfill.days = 5
  track.days = 0
}
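One possible reading of backfill.days, sketched in Python for discussion only. The helper `days_to_backfill` and the `loaded_days` set are hypothetical, and the sketch assumes the window covers the N days before the current information date, excluding that date itself. The exact boundaries would need to be decided as part of this feature.

```python
from datetime import date, timedelta
from typing import List, Set


def days_to_backfill(info_date: date, backfill_days: int, loaded_days: Set[date]) -> List[date]:
    """Return the days in the look-back window that have no loaded data yet.

    Assumption: the window is the `backfill_days` days preceding `info_date`.
    """
    window = [info_date - timedelta(days=n) for n in range(1, backfill_days + 1)]
    # Days that are already loaded are skipped and never re-checked.
    return sorted(d for d in window if d not in loaded_days)


# Usage: with backfill.days = 5, only the two missing days are candidates.
loaded = {date(2024, 1, 6), date(2024, 1, 8), date(2024, 1, 9)}
missing = days_to_backfill(date(2024, 1, 10), 5, loaded)
```

Here `missing` contains 2024-01-05 and 2024-01-07, the only days in the window without loaded data.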
Option 2
Re-purpose track.changes for metastore tables. If it is not set, treat track.days as backfill.days.