fix(signals): evict stale db connections in temporal activities#58709
Draft
posthog[bot] wants to merge 1 commit into
Long-running Temporal workers don't go through Django's request cycle, so the request_started / request_finished signals that normally call close_old_connections() never fire. Connections that exceed CONN_MAX_AGE or get killed by the database stay in the per-thread pool until the next ORM call fails, which is what surfaced as an OperationalError: the connection is closed inside run_signal_semantic_search_activity.

Apply the existing @close_db_connections decorator from posthog.temporal.common.utils across every signals Temporal activity that touches the Django ORM, so connections are evicted before and after each invocation. Skipped activities that only talk to ClickHouse, Kafka, object storage, or the Temporal client.

Generated-By: PostHog Code
Task-Id: a526a5ec-c960-4d99-b513-3b0f0cbe87a7
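As a rough illustration of the "evict before and after each invocation" behaviour described in the commit message, here is a minimal sketch of such a decorator. It is not the implementation in posthog.temporal.common.utils, which is not shown in this PR. The names `evict_stale_connections`, `cleanup`, `fake_close_old_connections`, and `example_activity` are hypothetical. In a Django worker the `cleanup` callable would presumably be `django.db.close_old_connections`; a logging stand-in is used here so the sketch runs without Django.

```python
import asyncio
import functools
import inspect
from typing import Any, Callable


def evict_stale_connections(cleanup: Callable[[], None]):
    """Hypothetical decorator factory: call `cleanup` before and after the wrapped function.

    Assumption: in a real worker `cleanup` would be django.db.close_old_connections.
    """

    def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
        if inspect.iscoroutinefunction(fn):

            @functools.wraps(fn)
            async def async_wrapper(*args: Any, **kwargs: Any) -> Any:
                cleanup()  # drop connections that went stale while the worker was idle
                try:
                    return await fn(*args, **kwargs)
                finally:
                    cleanup()  # runs even if the activity raises

            return async_wrapper

        @functools.wraps(fn)
        def sync_wrapper(*args: Any, **kwargs: Any) -> Any:
            cleanup()
            try:
                return fn(*args, **kwargs)
            finally:
                cleanup()

        return sync_wrapper

    return decorator


calls: list[str] = []


def fake_close_old_connections() -> None:
    # Stand-in for the real cleanup; it only records that it was called.
    calls.append("cleanup")


@evict_stale_connections(fake_close_old_connections)
async def example_activity() -> str:
    calls.append("run")  # a real activity would issue ORM queries here
    return "ok"


result = asyncio.run(example_activity())
```

The sketch ignores one detail that matters in practice: Django database connections are thread-local, so a real decorator has to make sure the cleanup runs on the thread that actually holds the connection.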
Problem
A new `OperationalError: the connection is closed` exception surfaced in the signals semantic-search Temporal activity. The traceback lands inside `run_signal_semantic_search_activity` in `products/signals/backend/temporal/signal_queries.py`, where `await Team.objects.aget(pk=input.team_id)` calls into psycopg's `_check_connection_ok` and finds the cached Django connection closed before it can create a cursor.

Long-running Temporal workers don't go through Django's request cycle, so the `request_started`/`request_finished` signals that normally invoke `close_old_connections()` never fire. Connections that exceed `CONN_MAX_AGE` or are killed by Postgres/middleboxes stay in the per-thread pool until the next ORM call fails. The same unguarded pattern exists across the rest of the signals Temporal activities, even though `posthog/temporal/common/utils.py` already exposes the canonical `close_db_connections` decorator and other products (tasks) already apply it.

Changes
Stack `@close_db_connections` from `posthog.temporal.common.utils` beneath `@temporalio.activity.defn` and `@scoped_temporal()` on every signals Temporal activity that touches the Django ORM:

- `signal_queries.py`: `fetch_signal_type_examples_activity`, `run_signal_semantic_search_activity`, `wait_for_signal_in_clickhouse_activity`, `fetch_signals_for_report_activity`
- `summary.py`: `mark_report_in_progress_activity`, `mark_report_ready_activity`, `mark_report_failed_activity`, `mark_report_pending_input_activity`, `reset_report_to_potential_activity`
- `reingestion.py`: `soft_delete_report_signals_activity`, `delete_report_activity`, `reingest_signals_activity`, `process_team_signals_batch_activity`, `delete_team_reports_activity`
- `grouping.py`: `get_embedding_activity`, `fetch_report_contexts_activity`, `assign_and_emit_signal_activity`
- `backfill_error_tracking.py`: `fetch_error_tracking_issues_activity`, `emit_backfill_signal_activity`
- `emit_eval_signal.py`: `emit_eval_signal_activity`
- `report_safety_judge.py`: `report_safety_judge_activity`
- `agentic/select_repository.py`: `select_repository_activity`
- `agentic/report.py`: `run_agentic_report_activity`

Activities that only talk to ClickHouse, Kafka, object storage, or the Temporal client (`publish_report_completed_activity`, `pause_grouping_until_activity`, `get_grouping_paused_state_activity`, `restore_grouping_pause_activity`, `read_signals_from_s3_activity`) are left untouched.

The decorator no-ops under `settings.TEST`, so existing pytest fixtures that rely on `transaction=True` are unaffected.

How did you test this code?
Agent-authored change.

- `ast.parse`.
- `ruff check products/signals/backend/temporal/`: all checks passed.
- `ruff format --check products/signals/backend/temporal/`: all files already formatted.
- `products/tasks/backend/temporal/process_task/activities/*.py`, where the contract is documented in `posthog/temporal/common/utils.py` and exercised by `posthog/temporal/tests/common/test_utils.py`.

Publish to changelog?
no
🤖 Agent context
PostHog Code agent worked from a signal report flagging a single new `OperationalError: the connection is closed` issue in `run_signal_semantic_search_activity`. The report identified the canonical `close_db_connections` decorator as the existing fix pattern (already used in the tasks product) and recommended a sweep across the rest of the signals activities. The agent verified the decorator's stacking expectations (`@activity.defn` on top, `@close_db_connections` innermost), then applied it across every signals activity that calls the Django ORM, skipping ones that only touch ClickHouse, Kafka, object storage, or the Temporal client to avoid noise.

Created with PostHog Code
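The stacking order mentioned in the agent context (registration decorator on top, connection cleanup innermost) can be illustrated with stand-in decorators. This is only a sketch of how Python applies stacked decorators, not the Temporal or PostHog code: `register_activity`, `with_cleanup`, `REGISTRY`, and `fetch_team` are all hypothetical names, and no database or Temporal worker is involved.

```python
import functools
from typing import Any, Callable

# Hypothetical stand-in for wherever a worker looks up its activities.
REGISTRY: dict[str, Callable[..., Any]] = {}


def register_activity(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Stand-in for a registration decorator: records the callable under its name."""
    REGISTRY[fn.__name__] = fn
    return fn


def with_cleanup(log: list[str]):
    """Stand-in for a connection-cleanup decorator; it appends to `log` instead of closing connections."""

    def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(fn)  # keeps __name__, so the outer decorator registers the original name
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            log.append("cleanup-before")
            try:
                return fn(*args, **kwargs)
            finally:
                log.append("cleanup-after")

        return wrapper

    return decorator


events: list[str] = []


@register_activity      # applied last: receives the already-wrapped function
@with_cleanup(events)   # applied first: wraps the raw function
def fetch_team(team_id: int) -> int:
    events.append(f"query team {team_id}")  # a real activity would hit the ORM here
    return team_id


# Calling the registered callable goes through the cleanup wrapper.
result = REGISTRY["fetch_team"](7)
```

Because decorators apply bottom-up, the registered callable in this sketch is the cleanup-wrapped function. With the two stand-ins reversed, the registry would hold the raw function and the cleanup would be bypassed whenever the function is invoked through the registry.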