Skip to content

fix: prevent WorkManager APPEND chain poisoning in background worker#402

Closed
imnexerio wants to merge 2 commits into
ABausG:mainfrom
imnexerio:fix/workmanager-append-chain-poisoning
Closed

fix: prevent WorkManager APPEND chain poisoning in background worker#402
imnexerio wants to merge 2 commits into
ABausG:mainfrom
imnexerio:fix/workmanager-append-chain-poisoning

Conversation

@imnexerio
Copy link
Copy Markdown

Problem

HomeWidgetBackgroundWorker uses ExistingWorkPolicy.APPEND with enqueueUniqueWork(). If initializeFlutterEngine() throws (e.g., stale callback handle, lookup failure), doWork() returns Result.failure().

With APPEND, a failed work item cancels all subsequently appended work items. This failure state is persisted in WorkManager's SQLite database and survives process death, force stop, and device reboots — permanently breaking all background callbacks until the user manually clears app data.

Root Cause

The combination of ExistingWorkPolicy.APPEND + Result.failure() creates an unrecoverable state:

1=Engine init fails → Result.failure() returned
2=WorkManager marks the chain as failed
3=All future enqueueUniqueWork(APPEND) calls append to the failed chain → immediately cancelled
4=No background callbacks execute until WorkManager's DB is wiped (clear app data)

Fix
1=APPEND → APPEND_OR_REPLACE — If the chain is failed/cancelled, new work replaces it instead of being appended to a dead chain
2=Result.failure() → Result.success() on engine init failure — prevents chain poisoning in the first place
Added resetEngine() — properly destroys a partially initialized FlutterEngine and resets static state so the next work item retries from clean state
3=Added null check for engine after init block — graceful handling if engine is null

- Change ExistingWorkPolicy from APPEND to APPEND_OR_REPLACE so a failed
  or cancelled chain is replaced instead of permanently blocking future work.

- Return Result.success() with failure output data instead of Result.failure()
  when engine initialization fails, preventing chain poisoning while preserving
  observability through output data (engine_init_failed flag + error message).

- Add coroutine Mutex to synchronize engine initialization, preventing
  concurrent workers from racing on the init check.

- Destroy FlutterEngine on the main thread via mainHandler.post() as
  required by the FlutterEngine API.

- Preserve pending callback queue across engine resets so earlier workers'
  callbacks can still be dispatched after a successful re-initialization.

Fixes: WorkManager chain state is persisted in its SQLite database and
survives process death, force stop, and device reboots. Once any doWork()
returned Result.failure(), all subsequently appended items were immediately
cancelled, permanently breaking background callbacks until the user manually
cleared app data.
Copilot AI review requested due to automatic review settings February 25, 2026 03:47
@docs-page
Copy link
Copy Markdown

docs-page Bot commented Feb 25, 2026

To view this pull requests documentation preview, visit the following URL:

docs.page/abausg/home_widget~402

Documentation is deployed and generated using docs.page.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Feb 25, 2026

Note

Currently processing new changes in this PR. This may take a few minutes, please wait...

📥 Commits

Reviewing files that changed from the base of the PR and between 0e7f102 and a301e4f.

📒 Files selected for processing (1)
  • packages/home_widget/android/src/main/kotlin/es/antonborri/home_widget/HomeWidgetBackgroundWorker.kt

✏️ Tip: You can disable in-progress messages and the fortune message in your review settings.

Tip

CodeRabbit can enforce grammar and style rules using `languagetool`.

Configure the reviews.tools.languagetool setting to enable/disable rules and categories. Refer to the LanguageTool Community to learn more.

Walkthrough

Introduces thread-safe Flutter engine initialization using Mutex in HomeWidgetBackgroundWorker. Adds CancellationException handling with engine reset, enhanced error recovery that queues retry work, defensive null checks post-initialization, and a resetEngine() helper method. No public API changes.

Changes

Cohort / File(s) Summary
Engine Initialization & Error Handling
packages/home_widget/android/src/main/kotlin/es/antonborri/home_widget/HomeWidgetBackgroundWorker.kt
Added mutex-based synchronization for engine creation. Introduced CancellationException handling that triggers engine reset and cooperative cancellation. Enhanced error recovery: when engine initialization fails, resets state, queues callback for later dispatch, and returns success with error details (avoiding chain poisoning). Added defensive null-check fallback post-initialization. Implemented resetEngine() helper method to safely destroy engine and reset service state on main thread. Updated work queueing to use APPEND_OR_REPLACE strategy with explanatory comments about failure recovery. Added defensive synchronization around pending callbacks. Expanded imports for cancellation and coroutine primitives (Mutex, withLock).

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The PR title clearly and specifically identifies the main change: preventing WorkManager APPEND chain poisoning in the background worker, which is the core issue addressed throughout the changeset.
Description check ✅ Passed The PR description comprehensively covers the problem statement, root cause analysis, and solution details. However, the required checklist items for testing, documentation, and examples are not explicitly marked/confirmed.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Tip

Try Coding Plans. Let us write the prompt for your AI agent so you can ship faster (with fewer bugs).
Share your feedback on Discord.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a critical issue where background widget callbacks could become permanently broken after Flutter engine initialization failures. The issue stemmed from using ExistingWorkPolicy.APPEND with WorkManager, which caused failed work items to poison the entire work chain persistently.

Changes:

  • Changed WorkManager policy from APPEND to APPEND_OR_REPLACE to prevent work chain poisoning
  • Modified error handling to return Result.success() with metadata instead of Result.failure() when engine initialization fails
  • Added resetEngine() function to properly clean up failed engine initialization state while preserving queued callbacks
  • Added mutex-based synchronization to protect engine initialization from concurrent access
  • Added special handling for CancellationException to respect cooperative coroutine cancellation

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

channel = MethodChannel(it.dartExecutor.binaryMessenger, CHANNEL_NAME)
channel.setMethodCallHandler(this)
} ?: run {
Log.w(TAG, "Flutter engine is null after initialization, skipping callback")
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the engine is null at this point (which the comment acknowledges is a theoretical scenario that shouldn't occur), the work item's data is silently lost - not queued for later processing like in the error path at lines 48-55. While the Log.w provides some visibility, the lost callback data could make debugging difficult.

Consider either queuing the data (similar to the error path) or using Log.e instead of Log.w to make this unexpected condition more visible in logs.

Suggested change
Log.w(TAG, "Flutter engine is null after initialization, skipping callback")
Log.e(TAG, "Flutter engine is null after initialization, queuing callback data and skipping immediate callback")
val data = inputData.getString(DATA_KEY) ?: ""
val args = listOf(HomeWidgetPlugin.getHandle(context), data)
synchronized(serviceStarted) {
queue.add(args)
}

Copilot uses AI. Check for mistakes.
Comment on lines +35 to +36
try {
initializeFlutterEngine()
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The engine?.dartExecutor?.executeDartCallback(callback) call on line 117 (outside the shown diff but in the initializeFlutterEngine function) uses the nullable engine field. Between the assignment at line 110 (engine = FlutterEngine(context)) and line 117, another thread could call resetEngine() and set engine to null, causing the callback execution to silently fail.

Since this entire block runs on Dispatchers.Main inside a withContext, and resetEngine() posts to mainHandler (which also runs on the main thread), this is actually safe from concurrent modification. However, the code would be clearer if a local variable was used to hold the engine reference throughout the initialization.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@packages/home_widget/android/src/main/kotlin/es/antonborri/home_widget/HomeWidgetBackgroundWorker.kt`:
- Around line 73-80: The defensive branch that logs "Flutter engine is null..."
currently returns Result.success(), dropping the callback payload (args); change
it to return Result.retry() (or re-enqueue the work with the original args) so
the WorkManager preserves and retries the work when the engine is available;
update the same behavior in the similar null-check later (the engine null branch
around lines 83-84) and ensure references to engine, channel, TAG, and Result
are handled consistently so callbacks (args) are not lost.
- Around line 178-187: Pin the WorkManager dependency to a version that defines
ExistingWorkPolicy.APPEND_OR_REPLACE by updating the
androidx.work:work-runtime-ktx dependency in the module's Gradle configuration
to 2.4.0 or later (recommend 2.8.1 to match the example app); this ensures the
call to WorkManager.getInstance(context).enqueueUniqueWork(...,
ExistingWorkPolicy.APPEND_OR_REPLACE, workRequest) will compile and run reliably
across devices.

ℹ️ Review info

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fa7453a and 0e7f102.

📒 Files selected for processing (1)
  • packages/home_widget/android/src/main/kotlin/es/antonborri/home_widget/HomeWidgetBackgroundWorker.kt

Comment on lines +73 to +80
// Defensive null check: engine could theoretically become null if
// resetEngine() is called between the mutex release above and here.
engine?.let {
channel = MethodChannel(it.dartExecutor.binaryMessenger, CHANNEL_NAME)
channel.setMethodCallHandler(this)
} ?: run {
Log.w(TAG, "Flutter engine is null after initialization, skipping callback")
return Result.success()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Do not drop the callback payload in the defensive engine == null branch.

If this fallback is hit, the current work item returns success without enqueuing args, so that callback is lost.

Proposed fix
-    // Defensive null check: engine could theoretically become null if
-    // resetEngine() is called between the mutex release above and here.
-    engine?.let {
-      channel = MethodChannel(it.dartExecutor.binaryMessenger, CHANNEL_NAME)
-      channel.setMethodCallHandler(this)
-    } ?: run {
-      Log.w(TAG, "Flutter engine is null after initialization, skipping callback")
-      return Result.success()
-    }
-
     val data = inputData.getString(DATA_KEY) ?: ""
     val args = listOf(HomeWidgetPlugin.getHandle(context), data)
+
+    // Defensive null check: engine could theoretically become null if
+    // resetEngine() is called between the mutex release above and here.
+    engine?.let {
+      channel = MethodChannel(it.dartExecutor.binaryMessenger, CHANNEL_NAME)
+      channel.setMethodCallHandler(this)
+    } ?: run {
+      Log.w(TAG, "Flutter engine is null after initialization, queueing callback")
+      synchronized(serviceStarted) {
+        queue.add(args)
+      }
+      return Result.success(
+          Data.Builder()
+              .putBoolean("engine_init_failed", true)
+              .putString("error", "Flutter engine unavailable after initialization")
+              .build()
+      )
+    }

Also applies to: 83-84

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/home_widget/android/src/main/kotlin/es/antonborri/home_widget/HomeWidgetBackgroundWorker.kt`
around lines 73 - 80, The defensive branch that logs "Flutter engine is null..."
currently returns Result.success(), dropping the callback payload (args); change
it to return Result.retry() (or re-enqueue the work with the original args) so
the WorkManager preserves and retries the work when the engine is available;
update the same behavior in the similar null-check later (the engine null branch
around lines 83-84) and ensure references to engine, channel, TAG, and Result
are handled consistently so callbacks (args) are not lost.

Comment on lines +178 to +187
// Use APPEND_OR_REPLACE instead of APPEND to prevent work chain
// poisoning. With APPEND, if any work item in the chain fails, all
// subsequently appended items are immediately cancelled by WorkManager.
// This failure state is persisted in WorkManager's SQLite database and
// survives process death, force stop, and device reboots — permanently
// breaking background callbacks until the user clears app data.
// APPEND_OR_REPLACE replaces the chain if it has failed or been
// cancelled, allowing recovery without user intervention.
WorkManager.getInstance(context)
.enqueueUniqueWork(UNIQUE_WORK_NAME, ExistingWorkPolicy.APPEND, workRequest)
.enqueueUniqueWork(UNIQUE_WORK_NAME, ExistingWorkPolicy.APPEND_OR_REPLACE, workRequest)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

fd -t f 'build\.gradle(\.kts)?$|libs\.versions\.toml$' | head -20

Repository: ABausG/home_widget

Length of output: 207


🏁 Script executed:

rg -i 'androidx\.work|work-runtime|workmanager' --type gradle --type toml -n | head -50

Repository: ABausG/home_widget

Length of output: 277


🌐 Web query:

AndroidX WorkManager ExistingWorkPolicy.APPEND_OR_REPLACE introduction version history

💡 Result:

ExistingWorkPolicy.APPEND_OR_REPLACE was introduced in AndroidX WorkManager 2.4.0 (stable release date July 22, 2020), where it’s listed as a new API change (“Add ExistingWorkPolicy.APPEND_OR_REPLACE…”) in the 2.4.0 release notes. [1]

Version history (high level):

  • 2.4.0 (2020-07-22): ExistingWorkPolicy.APPEND_OR_REPLACE added. [1]
  • Later versions (2.5+ through current): still present; current API docs describe its behavior (append like APPEND, but if prerequisites are FAILED/CANCELLED, they’re dropped and the new work starts a new sequence). [2]

Implication: to use APPEND_OR_REPLACE, depend on androidx.work:work-runtime >= 2.4.0. [1]

Sources

  1. WorkManager release notes (Android Developers) — https://developer.android.com/jetpack/androidx/releases/work
  2. ExistingWorkPolicy API reference (Android Developers) — https://developer.android.com/reference/androidx/work/ExistingWorkPolicy

Pin WorkManager version constraint to at least 2.4.0 where APPEND_OR_REPLACE was introduced.

The main package uses androidx.work:work-runtime-ktx:2.+ which resolves to any 2.x version, including 2.0–2.3 that do not define ExistingWorkPolicy.APPEND_OR_REPLACE. Update packages/home_widget/android/build.gradle line 55 to 2.4.0 or higher (e.g., 2.8.1 to match the example app), ensuring the policy is available at compile and runtime.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/home_widget/android/src/main/kotlin/es/antonborri/home_widget/HomeWidgetBackgroundWorker.kt`
around lines 178 - 187, Pin the WorkManager dependency to a version that defines
ExistingWorkPolicy.APPEND_OR_REPLACE by updating the
androidx.work:work-runtime-ktx dependency in the module's Gradle configuration
to 2.4.0 or later (recommend 2.8.1 to match the example app); this ensures the
call to WorkManager.getInstance(context).enqueueUniqueWork(...,
ExistingWorkPolicy.APPEND_OR_REPLACE, workRequest) will compile and run reliably
across devices.

@imnexerio imnexerio closed this Feb 27, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants