fix: use create_api_app in OrchestrationClient to prevent read-only container failures #20905

Open
Br1an67 wants to merge 1 commit into PrefectHQ:main from Br1an67:fix/19317-ha-readonly-automation

@Br1an67
Contributor

Br1an67 commented Mar 1, 2026

Closes #19317

Background services (e.g. the actions service) use OrchestrationClient, which previously called create_app(). That call triggers UI static directory creation via create_ui_static_subpath(), which fails with PermissionError in read-only containers (common in rootless/secure deployments). The exception is swallowed, so automation actions fail silently and their messages go to the dead-letter queue (DLQ) with no error logs.

Changes

1. Use create_api_app for OrchestrationClient (src/prefect/server/api/clients.py)

OrchestrationClient only needs API routes for in-process ASGI calls. Using create_api_app() instead of create_app() avoids UI creation and background services startup entirely.
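
To illustrate why the narrower factory sidesteps the failure, here is a minimal, self-contained sketch. It does not import Prefect: `build_full_app`, `build_api_app`, and `make_ui_static_dir` are hypothetical stand-ins for `create_app()`, `create_api_app()`, and `create_ui_static_subpath()`, and the read-only filesystem is simulated by raising `PermissionError` directly.

```python
class ReadOnlyFilesystem:
    """Simulates a read-only container: any directory creation fails."""

    def make_ui_static_dir(self) -> None:
        # Hypothetical stand-in for create_ui_static_subpath() on a
        # read-only mount.
        raise PermissionError("read-only file system")


def build_api_app() -> dict:
    """Stand-in for create_api_app(): API routes only, no filesystem access."""
    return {"routes": ["/work_pools/filter"], "ui": False}


def build_full_app(fs: ReadOnlyFilesystem) -> dict:
    """Stand-in for create_app(): API routes plus UI static files."""
    fs.make_ui_static_dir()  # side effect that fails when read-only
    app = build_api_app()
    app["ui"] = True
    return app


def try_build(factory, *args):
    """Return (app, error) so both outcomes can be compared side by side."""
    try:
        return factory(*args), None
    except PermissionError as exc:
        return None, exc


fs = ReadOnlyFilesystem()
full_app, full_error = try_build(build_full_app, fs)
api_app, api_error = try_build(build_api_app)
```

In this sketch the full-app path fails before any routes are built, while the API-only path never touches the filesystem, which is the property the client relies on.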

2. Add error logging for unexpected action failures (src/prefect/server/events/actions.py)

Catch and log unexpected exceptions in the action consumer before re-raising. This provides visibility when actions fail for reasons other than ActionFailed, instead of messages silently going to DLQ.
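
The logging pattern described here can be sketched roughly as follows. This is not the code from actions.py: `handle_message`, `on_failed`, and the local `ActionFailed` class are hypothetical stand-ins, and the way the expected `ActionFailed` path is handled is an assumption made only to contrast it with the unexpected path.

```python
import logging

logger = logging.getLogger("actions_sketch")


class ActionFailed(Exception):
    """Stand-in for the expected failure type (hypothetical local class)."""


def handle_message(act, message, on_failed):
    """Run act(message); report expected failures, log unexpected ones.

    Hypothetical handler shape, not the actual Prefect consumer.
    """
    try:
        act(message)
    except ActionFailed as exc:
        # Expected path (assumed): record the failure reason and move on.
        on_failed(message, str(exc))
    except Exception:
        # Unexpected path: log with the traceback, then re-raise so the
        # messaging layer can still route the message to the DLQ.
        logger.exception("Unexpected error while handling %r", message)
        raise
```

Re-raising after logging keeps the existing delivery semantics intact; the only change in behavior is that the failure now leaves a trace in the logs.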

3. Regression test (tests/server/api/test_clients.py)

Verifies that OrchestrationClient uses create_api_app.
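
One way such a regression check can be structured is sketched below using only the standard library. The `server` namespace and `ClientSketch` class are hypothetical stand-ins for the real server module and OrchestrationClient; the actual test in tests/server/api/test_clients.py may patch different targets.

```python
from types import SimpleNamespace
from unittest import mock

# Hypothetical stand-in for the server module that exposes both factories.
server = SimpleNamespace(
    create_app=lambda: "full-app",
    create_api_app=lambda: "api-app",
)


class ClientSketch:
    """Stand-in client that builds its in-process app when constructed."""

    def __init__(self) -> None:
        # Looked up at call time so that patches on `server` take effect.
        self.app = server.create_api_app()


def check_client_uses_api_app() -> bool:
    """Return True if constructing the client calls only create_api_app."""
    with mock.patch.object(server, "create_app") as full, \
            mock.patch.object(server, "create_api_app") as api:
        ClientSketch()
    return api.call_count == 1 and full.call_count == 0
```

Asserting that the full-app factory is never called guards against a later change quietly reintroducing the UI side effect.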

@codspeed-hq

codspeed-hq bot commented Mar 1, 2026

Merging this PR will not alter performance

✅ 2 untouched benchmarks


Comparing Br1an67:fix/19317-ha-readonly-automation (79ce72e) with main (62b415b)

Br1an67 force-pushed the fix/19317-ha-readonly-automation branch from 03a04e6 to f5c8c29 on March 1, 2026 06:48

# will point it to the currently running server instance.
# ephemeral=True skips UI static file creation, which would fail in
# read-only containers (e.g. rootless/secure deployments).
api_app = create_app(ephemeral=True)
Member

I think a better solution to this issue would be to use create_api_app instead of create_app. That will fix the UI issue and also avoid adding a lifecycle that starts background services when we don't want them.

@Br1an67
Contributor Author

Br1an67 commented Mar 2, 2026

Thanks @desertaxle, that's a much cleaner approach — create_api_app avoids both the UI issue and the unnecessary lifespan/background services. I've updated the PR to use it instead.

Br1an67 force-pushed the fix/19317-ha-readonly-automation branch from f31b03f to 2c1c421 on March 2, 2026 16:33
@Br1an67
Contributor Author

Br1an67 commented Mar 2, 2026

Reverted back to create_app(ephemeral=True) — the switch to create_api_app() caused 394 test failures for two reasons:

  1. 404 errors: create_api_app() returns the API sub-app with routes at the root (/work_pools/filter), but BaseClient uses base_url='/api', so requests go to /api/work_pools/filter which doesn't match any routes on the bare sub-app.

  2. AttributeError on PrefectRouter.routes: create_api_app() with different kwargs (no exception handlers) creates a separate cached instance. When create_app(final=True) has already run, del router.routes makes the global routers unusable for any new include_router() calls.
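
The path mismatch in the first point can be reproduced with a toy router. This is a simplified sketch, not how the ASGI framework resolves routes: `resolve` is a hypothetical helper, and the only values taken from the comment above are the route `/work_pools/filter` and the client base URL `/api`.

```python
def resolve(routes: set, mount_prefix: str, request_path: str) -> int:
    """Return 200 if request_path matches a route under mount_prefix, else 404."""
    if not request_path.startswith(mount_prefix):
        return 404
    # Strip the mount prefix to get the path the sub-app itself would see.
    inner = request_path[len(mount_prefix):] or "/"
    return 200 if inner in routes else 404


API_ROUTES = {"/work_pools/filter"}
BASE_URL = "/api"  # prefix the client prepends to every request
request_path = BASE_URL + "/work_pools/filter"

# Full app: the API sub-app is mounted under /api, so the prefix is stripped.
status_mounted = resolve(API_ROUTES, "/api", request_path)

# Bare sub-app: routes live at the root, so the /api prefix never matches.
status_bare = resolve(API_ROUTES, "", request_path)
```

Under these assumptions the mounted app answers the request while the bare sub-app returns 404, so either the client's base URL or the mount point has to change when switching factories.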

create_app(ephemeral=True) is the correct fix because:

…ontainer failures

OrchestrationClient only needs API routes for in-process ASGI calls.
Using create_api_app() instead of create_app() avoids UI static file
creation (PermissionError in read-only containers) and background
services startup.

Also adds error logging for unexpected action failures in the message
handler, so they appear in logs instead of silently going to DLQ.

Closes PrefectHQ#19317
Br1an67 force-pushed the fix/19317-ha-readonly-automation branch from 2c1c421 to 79ce72e on March 3, 2026 14:56
@Br1an67
Contributor Author

Br1an67 commented Mar 3, 2026

Updated to use create_api_app() directly as suggested. It provides exactly what OrchestrationClient needs — just the API routes for in-process ASGI calls — without UI creation or background services.

Br1an67 changed the title from "fix: use ephemeral mode in OrchestrationClient to prevent read-only container failures" to "fix: use create_api_app in OrchestrationClient to prevent read-only container failures" on Mar 3, 2026

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Automations 'run-deployment' action silently fails in HA when UI static dir is read-only

2 participants