Update update_config.py - update log files in background #3354
tpd-opitz wants to merge 11 commits into
Conversation
improve boot up time by

- [x] fixing today's log(s) synchronously
- [x] fixing older logs asynchronously
Pull request overview
This PR aims to reduce boot time during datastore upgrade 122 by converting the newest log files synchronously and deferring conversion of older logs to “background” processing.
Changes:
- Refactors `upgrade_datastore_122()` to process the most recently modified log file first, then schedule the remaining log conversions asynchronously (with a synchronous fallback).
- Updates the unit test for datastore upgrade 122 (adds a delay and imports `time`).
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| packages/helpermodules/update_config.py | Changes datastore upgrade 122 to prioritize the newest log file and attempt background processing for older files. |
| packages/helpermodules/update_config_test.py | Modifies the datastore 122 test to wait after running the upgrade. |
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (1)
packages/helpermodules/update_config.py:3143
- The background thread ultimately calls `self._append_datastore_version(122)`, which mutates `self.all_received_topics` and publishes to MQTT via `__update_topic`. Since `update()` continues with other steps that iterate over `self.all_received_topics` (e.g., `__remove_outdated_topics`), this introduces a real risk of concurrent mutation during iteration (`RuntimeError`) and hard-to-reproduce startup behavior. Suggestion: keep the background thread strictly file-I/O only (no interaction with `self` / MQTT), and perform any datastore_version update on the main thread (or protect `all_received_topics` with a lock and only append once after all background work is done).
```python
# Process remaining files in background to avoid blocking startup
if len(path_list) > 1:
    remaining_files = path_list[1:]
    log.debug(f"Starting background thread to process {len(remaining_files)} remaining files")
    threading.Thread(target=process_files, args=(remaining_files,), daemon=True).start()
else:
```
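One way to act on the review comment above is to route every access to the shared topic dict through a lock, so a background append can never race an iteration on the main thread. The class and attribute names below loosely mirror the PR but are assumptions for illustration, not the real `UpdateConfig` implementation.

```python
import threading


class UpdateConfig:
    def __init__(self) -> None:
        self.all_received_topics = {}
        self._topics_lock = threading.Lock()

    def _append_datastore_version(self, version: int) -> None:
        # every mutation of the shared dict happens under the lock,
        # so it is safe to call from a background thread
        with self._topics_lock:
            self.all_received_topics["openWB/system/datastore_version"] = version

    def _remove_outdated_topics(self) -> None:
        # iterate over a snapshot so a concurrent append cannot raise
        # "dictionary changed size during iteration"
        with self._topics_lock:
            topics = list(self.all_received_topics)
        for topic in topics:
            pass  # real cleanup logic would go here
```

Alternatively, as the comment suggests, the worker could stay file-I/O only and signal completion (e.g., via a `threading.Event`), with the version append done once on the main thread.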
Force-pushed from ef13dcb to 1d42379
```python
def process_files(file_paths: List[Path]) -> None:
    """Process remaining files and append datastore version afterwards."""
    for file_path in file_paths:
        process_file(file_path)
    self._append_datastore_version(122)
```

```python
shuffled_paths_by_folder = {}
for folder_name, file_names in folders.items():
    shuffled_names = file_names[:]
    random.shuffle(shuffled_names)
    shuffled_paths_by_folder[folder_name] = [
        Path(f"/mock_base/data/{folder_name}/{file_name}")
        for file_name in shuffled_names
    ]
```
There is no flakiness caused by the random file order.
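Even so, if a reproducible order were ever wanted in the test, the shuffle could be pinned with a seeded `random.Random` instance. This is a hedged sketch: the `folders` mapping and `rng` below are illustrative stand-ins, not the fixtures from `update_config_test.py`.

```python
import random
from pathlib import Path

# illustrative test fixture, not the real one
folders = {"daily": ["a.json", "b.json", "c.json"]}

rng = random.Random(42)  # fixed seed: identical order on every run
shuffled_paths_by_folder = {}
for folder_name, file_names in folders.items():
    shuffled_names = file_names[:]
    rng.shuffle(shuffled_names)
    shuffled_paths_by_folder[folder_name] = [
        Path(f"/mock_base/data/{folder_name}/{file_name}")
        for file_name in shuffled_names
    ]
```

A seeded shuffle still exercises the "unsorted input" case while making any future failure reproducible.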
The log files get adjusted in format and data every now and then, as you can see in update_config. So far this has always been done serially.
I think the log-file updates should be extracted into a central function, so that for future structure changes only the actual update routine is passed in as a lambda. That central function could then also make sure there is only one background process, into which it enqueues all updates...