[MAPPS-4702] Fixing the Data loss in Dataset Query Input #149
Jira Link: https://sentinelone.atlassian.net/browse/MAPPS-4702
🥅 Goal
Fix the data loss occurring in the Dataset query input.
Issue
While debugging the app on my Splunk box, I found that the data loss came from improper handling of the checkpoint, start time, and end time. The previous code accepted an event only if its splunk_dt was greater than the checkpoint_date. That design effectively treated the timestamp as a primary key, so when several events shared a timestamp, any event whose timestamp equaled the checkpoint was skipped and lost.
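The failure mode can be reproduced in isolation. The sketch below is not the app's code: the `filter_strictly_newer` helper, the `id` field, and the sample events are hypothetical, and only the `splunk_dt > checkpoint_date` comparison comes from the description above.

```python
from datetime import datetime

def filter_strictly_newer(events, checkpoint_date):
    """Keep only events whose timestamp is strictly greater than the checkpoint."""
    return [e for e in events if e["splunk_dt"] > checkpoint_date]

t0 = datetime(2024, 1, 1, 12, 0, 0)
t1 = datetime(2024, 1, 1, 12, 0, 1)

# First poll returns one event at t0, so the checkpoint advances to t0.
first_poll = [{"id": "a", "splunk_dt": t0}]
checkpoint_date = max(e["splunk_dt"] for e in first_poll)

# Second poll contains another event that shares the t0 timestamp.
second_poll = [{"id": "b", "splunk_dt": t0}, {"id": "c", "splunk_dt": t1}]
kept_ids = [e["id"] for e in filter_strictly_newer(second_poll, checkpoint_date)]

# Event "b" is silently dropped because its timestamp equals the checkpoint.
print(kept_ids)
```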
🛠️ Solution
To prevent the data loss, I updated the checkpointing logic. The app now validates events against the checkpoint correctly, so events that share a timestamp are all ingested instead of being dropped.
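The description does not spell out the new logic, so the following is only one possible approach, not necessarily what this PR implements. It assumes each event carries a stable unique identifier (the hypothetical `id` field) and that the query window starts inclusively at the checkpoint time. The checkpoint stores the latest timestamp together with the ids already ingested at that timestamp; `select_new_events` is a hypothetical name.

```python
from datetime import datetime

def select_new_events(events, checkpoint):
    """Return (new_events, updated_checkpoint).

    checkpoint is {"ts": datetime or None, "ids": ids already ingested at ts}.
    Assumption: every event has a unique "id" and a "splunk_dt" timestamp.
    """
    ts, seen = checkpoint["ts"], set(checkpoint["ids"])
    new_events = []
    for event in sorted(events, key=lambda e: e["splunk_dt"]):
        dt = event["splunk_dt"]
        if ts is not None and dt < ts:
            continue  # older than the checkpoint, already handled
        if ts is not None and dt == ts and event["id"] in seen:
            continue  # same timestamp and already ingested
        if ts is None or dt > ts:
            ts, seen = dt, set()  # moved to a newer timestamp, reset the id set
        seen.add(event["id"])
        new_events.append(event)
    return new_events, {"ts": ts, "ids": seen}

t0 = datetime(2024, 1, 1, 12, 0, 0)
t1 = datetime(2024, 1, 1, 12, 0, 1)

first, cp = select_new_events([{"id": "a", "splunk_dt": t0}],
                              {"ts": None, "ids": set()})
# The inclusive window returns "a" again, plus "b" with the same timestamp.
second, cp = select_new_events(
    [{"id": "b", "splunk_dt": t0}, {"id": "a", "splunk_dt": t0},
     {"id": "c", "splunk_dt": t1}],
    cp,
)
second_ids = [e["id"] for e in second]
```

With this shape, re-reading the boundary timestamp is safe: already ingested events are skipped by id, and new events with the same timestamp are still picked up.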
🏫 Testing
Tested the changes on my Splunk box by validating the payload sent with the API call. I also verified that the event counts match.



