Fix test history cron job crash and optimize performance#4942
Open
DanielRyanSmith wants to merge 1 commit into
Open
Fix test history cron job crash and optimize performance#4942DanielRyanSmith wants to merge 1 commit into
DanielRyanSmith wants to merge 1 commit into
Conversation
- Fix ValueError crash in get_aligned_run_info by supporting datetime strings both with and without microseconds (fallback to %Y-%m-%dT%H:%M:%SZ). - Implement in-memory caching for GCS recent statuses to avoid redundant downloads/uploads, reducing GCS traffic by ~90%. - Parallelize browser processing (Chrome, Edge, Firefox, Safari) using ThreadPoolExecutor with thread-local NDB contexts. - Implement deterministic key names for TestHistoryEntry using SHA-256 hashes of test names to ensure idempotency and prevent duplicate entries on retries. - Implement checkpointing to commit processed date to Datastore and flush cached GCS statuses only every 20 revisions. - Increase Datastore batch write size from 200 to 500 to optimize write throughput. - Add --force CLI flag to allow manual start date override when Datastore is not empty. - Pre-compile regular expressions to optimize whitespace substitution in loops. - Print main() return value on exit to show timeout/completion status. TAG=agy CONV=9096c6ce-d7f3-4a97-aa8d-31e76c7337c5
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Overview
This PR fixes a ValueError crash in the test history cron job and implements major performance optimizations (caching, parallelization, and checkpointing) to speed up processing and catch up on months of missing history.
Root Cause / Motivation
2025-10-02T15:34:39Z), because the script strictly expected%Y-%m-%dT%H:%M:%S.%fZ.Detailed Changelog
process_test_history.py:_parse_datetimehelper to support datetime parsing with/without microseconds._prev_test_statuses_cachefor GCS status files.process_single_runto run within thread-local NDB contexts and use the GCS cache.ThreadPoolExecutor(4 workers).TestHistoryEntrybased on SHA-256 hashes of test names to ensure idempotency.mainloop to commit date and flush GCS cache every 20 revisions.--forceCLI flag to bypass empty Datastore check when manually setting start date.main()return value on exit.