Is your feature request related to a problem? Please describe.
Right now, when a mass upgrade hits an offline device, the operation goes from in-progress straight to failed once Celery's 4 auto-retries (~10 minutes total) run out. The operation is dead, and an admin has to chase down and re-trigger every failed device by hand — fine for one device, painful at 30, unmanageable at 300.
There's nothing in the data model that lets the operator say "keep trying", no counter for how many attempts have already happened, and no timestamp telling a future task when to try again. The first thing to build is the schema — until those fields exist, the failure handler has nowhere to mark "we're waiting", the Beat scanner has nowhere to look, and the admin/API has nothing to display or accept.
Describe the solution I would implement
I would like to introduce the model layer that the rest of the persistence work depends on.
- Add a persistent boolean to AbstractUpgradeOperation:
  - Default False, so today's standalone single-device upgrades keep their fail-fast behavior unless the caller explicitly opts in.
  - Putting the flag on the per-device operation means the failure handler can just read self.persistent, with no FK lookup needed. Single-device REST API and Python-scripting callers opt in by writing the field on the serializer (sub-issue 08).
- Add a persistent boolean to AbstractBatchUpgradeOperation:
  - Default True, so creating a mass upgrade defaults to "retry everything": both the admin form checkbox and the REST API field come pre-checked.
  - From there the value carries over to each child operation at creation time (bullet 5).
- Add two new fields to AbstractUpgradeOperation:
  - retry_count: PositiveIntegerField, default 0. Bumped every time an operation transitions in-progress → pending.
  - next_retry_at: nullable DateTimeField, db_index=True. The Beat scanner from sub-issue 04 queries WHERE status='pending' AND next_retry_at <= NOW() every tick. The column is nullable and only ever gets set on ops that have entered pending, so historical successes never carry a value and never match the scan. One caveat: PostgreSQL B-tree indexes still store NULL entries, so if the index itself has to stay sparse, a partial index on next_retry_at IS NOT NULL would be needed instead of plain db_index=True.
- Extend UpgradeOperation.STATUS_CHOICES with ("pending", _("pending")). I'd append it after aborted so the admin filter dropdown order for the existing five values doesn't shuffle. BatchUpgradeOperation doesn't get a new status: sub-issue 03's calculate_and_update_status() update keeps a batch with pending children sitting at in-progress.
- Propagate persistent from batch to child inside DeviceFirmware.create_upgrade_operation() (base/models.py:436–448):
  - When a batch is supplied, set operation.persistent = batch.persistent before full_clean().
  - That's the only place I need to touch, since every batch-driven child creation flows through here.
- Lock persistent after launch via clean() on both models. On an already-saved instance, I'd pull the stored value (type(self).objects.values_list("persistent", flat=True).get(pk=self.pk)) and raise ValidationError if the in-memory self.persistent doesn't match:
  - AbstractBatchUpgradeOperation: only fire the rejection once status != "idle" (idle is STATUS_CHOICES[0][0], the pre-launch state), so editing is still allowed while the batch is queued.
  - AbstractUpgradeOperation: no state check needed. The default status is already in-progress, so any saved row is by definition post-launch.
  - clean() runs via full_clean() (from ModelForm.is_valid() and DRF ValidatedModelSerializer.is_valid()); a bare instance.save() skips it. This is a form/API guard, not a DB constraint.
- Give every new field a verbose_name=_(...) and help_text=_(...) so the admin form, the REST OpenAPI schema, and inline help all read meaningfully instead of showing raw attribute names.
- Ship migration 0018_persistent_fields.py plus the matching tests/openwisp2/sample_firmware_upgrader/migrations/ entry. Purely additive: 4× AddField (persistent on UpgradeOperation, persistent on BatchUpgradeOperation, retry_count, next_retry_at; the last one gets its btree index from db_index=True automatically) plus AlterField for the extended STATUS_CHOICES. Every new field has a default, so existing rows stay valid and no data migration is needed.
- Unit tests covering: field defaults; propagation behaviour (with batch / without batch / persistent=False batch); immutability on both models (post-save() for UpgradeOperation, post-idle for BatchUpgradeOperation, including the ValidationError message); that db_index=True on next_retry_at produces an index in the generated migration SQL; and that the migration applies cleanly forward and backward on both empty and seeded databases.