-
Notifications
You must be signed in to change notification settings - Fork 636
fix: (cli) make export more stable and introduce stream in sketch.search() as optional #3662
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
…n the cli client. Also integrate sketch.export in the cli client as optional.
|
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a valuable streaming feature for sketch exports to improve stability with large datasets. The implementation across the API client and CLI is well-executed, with corresponding updates to documentation and the addition of end-to-end tests. I've included one suggestion to refactor the new test case for better maintainability by reducing code duplication. Overall, this is a great enhancement.
|
/gemini description |
|
It looks like you might have intended to use |
|
/gemini summary |
Summary of ChangesThis pull request significantly improves the export functionality within Timesketch by addressing memory efficiency for large datasets and offering more granular control over export content. It introduces a streaming mechanism for search results, allowing users to export extensive data without encountering memory issues, and provides options to perform either a focused event export or a complete sketch export directly from the command-line interface. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Changelog
Activity
|
| search_obj = search.Search(sketch=sketch) | ||
|
|
||
| click.echo(f"Number of events in that sketch: {search_obj.expected_size}") | ||
| if use_sketch_export: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if I run the CLI with both flags --stream and use_sketch_export at the same time? If I understand this correctly, my stream flag would be ignored?
Maybe we can change the export method in the API client to also support a stream flag?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sgtm
Summary
This pull request significantly improves the export functionality within Timesketch by addressing offering more granular control over export content. It introduces a streaming mechanism for search results, allowing users to export extensive data without encountering memory issues, and provides options to perform either a focused event export or a complete sketch export directly from the command-line interface.
Highlights
sketch.search().to_file()to enable streaming of large search results directly to a file.timesketch sketch exportcommand now supports a--streamflag to leverage the new streaming capability and a--use_sketch_exportflag to choose between event-only export (default) and a comprehensive sketch export (including stories, aggregations, views, and metadata).streamparameter and the enhancedsketch exportcommand options.sketch exportcommand, covering default, streaming, and full sketch export scenarios to ensure stability and correctness.Also:
For large results, you can use the optional
streamparameter to avoid loading the whole result into memory:Also introducing test for it and updating docs