fix(dashboard-api): async hygiene in routers/extensions.py #1022
Merged
Lightheartdevs merged 1 commit into Light-Heart-Labs:main on Apr 27, 2026
Three async-hygiene defects in routers/extensions.py:

- extension_logs and extensions_catalog called blocking urllib.request.urlopen directly on the event-loop thread. With the Console modal polling every 2s and a 30s agent timeout, one slow host-agent response could stall the dashboard-api for up to 30s at a time.
- _call_agent, _call_agent_invalidate_compose_cache, and _call_agent_compose_rename caught `except Exception`, swallowing non-network programmer errors with a misleading "host agent unreachable" log. Narrow to (URLError, HTTPError, OSError, TimeoutError) and log the actual exception.
- _cleanup_stale_progress was dispatched via run_in_executor with a discarded Future, so failures surfaced only as "Future exception was never retrieved" warnings in stderr.

Offload blocking urllib calls via asyncio.to_thread (matching the existing pattern in main.py's api_settings_env_save). Attach a log-on-exception done-callback to the cleanup future.

Tests cover the URLError swallow path, the new re-raise behaviour on non-network errors, and the cleanup callback logging.
What
Fixes three async-hygiene defects in `routers/extensions.py`: blocking urllib calls on the event-loop thread, over-broad `except Exception` catches that swallow programmer errors, and a fire-and-forget executor future that silently discards cleanup failures.

Why
- `extension_logs` and `extensions_catalog` called `urllib.request.urlopen` directly on the async event loop. The Console modal polls every 2 s with a 30 s agent timeout, so a single slow host-agent response could block the entire dashboard-api event loop for up to 30 s.
- `_call_agent`, `_call_agent_invalidate_compose_cache`, and `_call_agent_compose_rename` caught `except Exception`, so `AttributeError`, `TypeError`, and other programmer errors were silently swallowed and logged as "host agent unreachable", masking real bugs.
- `_cleanup_stale_progress` was dispatched via `run_in_executor(None, ...)` with the returned Future immediately discarded, so any unhandled exception inside the cleanup only produced an opaque "Future exception was never retrieved" warning in stderr with no context.

How
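To make the stall concrete, here is a minimal, self-contained sketch of the offload pattern. It does not reproduce the project's code: `fetch_agent_logs_blocking`, `heartbeat`, and `max_gap` are hypothetical names, and `time.sleep` stands in for a slow `urlopen` call. Only the use of `asyncio.to_thread` is taken from this PR.

```python
import asyncio
import time


def fetch_agent_logs_blocking(delay: float) -> str:
    """Hypothetical stand-in for a blocking urllib call to the host agent."""
    time.sleep(delay)  # simulates a slow host-agent response
    return "log line"


async def heartbeat(ticks: list[float], count: int, interval: float) -> None:
    """Record timestamps; a large gap between two of them means the loop stalled."""
    for _ in range(count):
        ticks.append(time.monotonic())
        await asyncio.sleep(interval)


async def max_gap(offload: bool) -> float:
    """Run a slow fetch next to a heartbeat and return the largest heartbeat gap."""
    ticks: list[float] = []
    beat = asyncio.create_task(heartbeat(ticks, count=6, interval=0.05))
    await asyncio.sleep(0)  # yield once so the heartbeat records its first tick
    if offload:
        # The blocking call runs in a worker thread; the loop keeps serving.
        await asyncio.to_thread(fetch_agent_logs_blocking, 0.3)
    else:
        # The blocking call runs on the event-loop thread; nothing else can run.
        fetch_agent_logs_blocking(0.3)
    await beat
    return max(later - earlier for earlier, later in zip(ticks, ticks[1:]))


if __name__ == "__main__":
    print("largest gap, inline call:  ", asyncio.run(max_gap(offload=False)))
    print("largest gap, to_thread call:", asyncio.run(max_gap(offload=True)))
```

With the inline call the heartbeat pauses for roughly the full duration of the fetch; with `asyncio.to_thread` it keeps ticking at its normal interval while the fetch runs.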
- `_fetch_agent_logs(url, headers, data, timeout) -> str`: a plain synchronous function that can be safely offloaded via `asyncio.to_thread`.
- `extension_logs`: replaced the inline `urllib.request.urlopen` block with `await asyncio.to_thread(_fetch_agent_logs, ...)`, matching the `asyncio.to_thread` pattern already used in `main.py`'s `api_settings_env_save`.
- `extensions_catalog`: replaced the direct `_check_agent_health()` call with `await asyncio.to_thread(_check_agent_health)`.
- `_call_agent`, `_call_agent_invalidate_compose_cache`, `_call_agent_compose_rename`, `_check_agent_health`: narrowed `except Exception` → `except (urllib.error.URLError, urllib.error.HTTPError, OSError, TimeoutError)`. Non-network exceptions now propagate with a full stack trace instead of a misleading warning.
- `extensions_catalog`: retained the `run_in_executor` Future in `_cleanup_future` and attached `_log_cleanup_error` as a `done_callback` to log any exception at ERROR level with full context.

Testing
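The behaviour that the narrowed `except` clause is meant to have can be sketched in isolation. This is an illustration under assumptions, not the project's helper: `call_agent`, `NETWORK_ERRORS`, `raise_urlerror`, and `raise_attribute_error` are hypothetical names, and only the exception tuple is taken from this PR.

```python
import logging
import urllib.error

logger = logging.getLogger("extensions_sketch")

# The tuple named in the PR. HTTPError and TimeoutError are already covered by
# their base classes, but listing them keeps the intent explicit.
NETWORK_ERRORS = (urllib.error.URLError, urllib.error.HTTPError, OSError, TimeoutError)


def call_agent(send) -> bool:
    """Hypothetical stand-in for an agent helper; `send` performs the request."""
    try:
        send()
        return True
    except NETWORK_ERRORS as exc:
        # Only transport-level failures count as "agent unreachable".
        logger.warning("host agent unreachable: %r", exc)
        return False
    # Anything else (AttributeError, TypeError, ...) propagates to the caller.


def raise_urlerror() -> None:
    raise urllib.error.URLError("connection refused")


def raise_attribute_error() -> None:
    raise AttributeError("programmer bug")
```

Here `call_agent(raise_urlerror)` returns `False` after logging a warning, while `call_agent(raise_attribute_error)` raises `AttributeError` to the caller instead of being reported as a network failure.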
- `tests/test_extensions.py`:
  - `test_call_agent_returns_false_on_urlerror`: network errors return `False` and log a warning.
  - `test_call_agent_reraises_non_network_errors`: `AttributeError` propagates (behaviour change from the old broad catch).
  - `test_catalog_logs_when_cleanup_future_fails`: a cleanup `RuntimeError` is logged at ERROR level, and the catalog endpoint still returns 200.
- `/api/extensions/catalog` and `/api/extensions/{id}/logs` return promptly without blocking other requests.

Platform Impact
Known Considerations
Non-blocking follow-ups:
- `_call_agent_invalidate_compose_cache` and `_call_agent_compose_rename` (the other two narrowed helpers).
- `_log_cleanup_error` out of the `extensions_catalog` function body to module level for reuse.
- `except Exception:` blocks in `main.py`.
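For the follow-up concerning `_log_cleanup_error`, a module-level done-callback could take roughly the shape below. This is a sketch under assumptions rather than the code in this PR: `log_cleanup_error`, `failing_cleanup`, and `catalog` are hypothetical names, and the returned dictionary is a placeholder for the real catalog response.

```python
import asyncio
import logging

logger = logging.getLogger("extensions_sketch")


def log_cleanup_error(future: asyncio.Future) -> None:
    """Module-level done-callback: log any exception from a background cleanup."""
    if future.cancelled():
        return
    # Retrieving the exception also stops asyncio from emitting
    # "Future exception was never retrieved" later on.
    exc = future.exception()
    if exc is not None:
        logger.error("stale-progress cleanup failed", exc_info=exc)


def failing_cleanup() -> None:
    """Hypothetical cleanup that fails, to exercise the callback."""
    raise RuntimeError("disk full")


async def catalog() -> dict:
    """Hypothetical endpoint body: start cleanup in the background, then respond."""
    loop = asyncio.get_running_loop()
    cleanup_future = loop.run_in_executor(None, failing_cleanup)
    cleanup_future.add_done_callback(log_cleanup_error)
    # The response does not wait for the cleanup to finish.
    return {"status": 200}
```

Because the callback takes only the future as its argument, it can be attached to any executor future in the module, which is the reuse the follow-up has in mind.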