Open
Labels: good first issue, hacktoberfest, help wanted
Description:
Currently, stats are not available on resumed syncs because the state object does not persist stat-related information for each stream. As a result, we lose visibility into sync progress and estimated time after a restart.
Problem:
- On resuming a sync, metrics like total records to process and progress percentage are not available
- This impacts user experience and observability
- Estimations (e.g., time remaining) cannot be calculated accurately without the initial stats
Proposed Solution:
- Update the state schema to include stat information for each stream (e.g., total record count, synced count)
- Store necessary values in state so that they can be reloaded on resume
- Add a function that, during resume, calls pool.AddRecordsToSync(totalCount) with the saved stats
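The proposal above could be sketched roughly as follows. Everything here is an assumption for illustration: `StreamStats`, `StreamState`, `recordPool`, and `restoreStats` are hypothetical names, the JSON field names are made up, and the real olake state struct and the signature of `AddRecordsToSync` may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// StreamStats holds the per-stream counters the issue proposes to persist.
// Hypothetical shape; field names are assumptions.
type StreamStats struct {
	TotalCount  int64 `json:"total_count"`
	SyncedCount int64 `json:"synced_count"`
}

// StreamState is a stand-in for a per-stream state entry.
type StreamState struct {
	Stream string       `json:"stream"`
	Stats  *StreamStats `json:"stats,omitempty"` // nil for state written before this change
}

// recordPool stands in for the pool mentioned in the issue; the
// AddRecordsToSync signature is assumed.
type recordPool struct{ toSync int64 }

func (p *recordPool) AddRecordsToSync(n int64) { p.toSync += n }

// restoreStats loads a saved state entry and, if stats were persisted,
// registers the saved total with the pool again.
func restoreStats(pool *recordPool, saved []byte) (StreamState, error) {
	var st StreamState
	if err := json.Unmarshal(saved, &st); err != nil {
		return st, err
	}
	if st.Stats != nil {
		pool.AddRecordsToSync(st.Stats.TotalCount)
	}
	return st, nil
}

func main() {
	// First session: write stats into state.
	first := StreamState{Stream: "users", Stats: &StreamStats{TotalCount: 1000, SyncedCount: 400}}
	saved, err := json.Marshal(first)
	if err != nil {
		panic(err)
	}

	// Resumed session: read state back and restore the pool counter.
	pool := &recordPool{}
	resumed, err := restoreStats(pool, saved)
	if err != nil {
		panic(err)
	}
	fmt.Println(pool.toSync, resumed.Stats.SyncedCount) // 1000 400
}
```

Keeping `Stats` as an optional pointer lets state files written before the change load without error. Whether the pool should receive the total or only the remaining count depends on how it tracks progress, which this sketch does not assume.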
Action Items:
- Extend state struct to include relevant stats per stream
- Modify sync/resume flow to read/write stats to state
- Add tests to validate stat persistence across sync sessions
olake/drivers/mongodb/internal/backfill.go
Lines 72 to 77 in 2efdf81
// TODO: to get estimated time need to update pool.AddRecordsToSync(totalCount) (Can be done via storing some vars in state)
rawChunkArray := chunks.Array()
// convert to premitive.ObjectID
for _, chunk := range rawChunkArray {
	premitiveMinID, _ := primitive.ObjectIDFromHex(chunk.Min.(string))
	premitiveMaxID, _ := primitive.ObjectIDFromHex(chunk.Max.(string))