[feature:gsoc26] Add persistence model fields, pending status and batch to child propagation for persistent mass upgrades #417

@Eeshu-Yadav

Is your feature request related to a problem? Please describe.

Right now, when a mass upgrade hits an offline device, the operation goes from in-progress straight to failed once Celery's 4 auto-retries (~10 minutes total) run out. The operation is dead, and an admin has to chase down and re-trigger every failed device by hand — fine for one device, painful at 30, unmanageable at 300.

There's nothing in the data model that lets the operator say "keep trying", no counter for how many attempts have already happened, and no timestamp telling a future task when to try again. The first thing to build is the schema — until those fields exist, the failure handler has nowhere to mark "we're waiting", the Beat scanner has nowhere to look, and the admin/API has nothing to display or accept.

Describe the solution I would implement

I would like to introduce the model layer that the rest of the persistence work depends on.

  1. Add a persistent boolean to AbstractUpgradeOperation:

    • Default False so today's standalone single-device upgrades keep their fail-fast behavior unless the caller explicitly opts in.
    • Putting the flag on the per-device operation means the failure handler can just read self.persistent — no FK lookup needed. Single-device REST API and Python-scripting callers opt in by writing the field on the serializer (sub-issue 08).
  2. Add a persistent boolean to AbstractBatchUpgradeOperation:

    • Default True so creating a mass upgrade defaults to "retry everything" — both the admin form checkbox and the REST API field come pre-checked.
    • From there the value carries over to each child operation at creation time (bullet 5).
  3. Add two new fields to AbstractUpgradeOperation:

    • retry_count: PositiveIntegerField, default 0. Bumped every time an operation transitions in-progress → pending.
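    • next_retry_at: nullable DateTimeField, db_index=True. The Beat scanner from sub-issue 04 queries WHERE status='pending' AND next_retry_at <= NOW() every tick. The column is nullable and only ever gets set on ops that have entered pending, so historical successes keep it NULL. One caveat: a plain btree on PostgreSQL or SQLite still stores entries for NULL values, so keeping those rows out of the index entirely would need a partial index (for example a Meta.indexes entry using models.Index with condition=Q(next_retry_at__isnull=False)) rather than db_index=True alone.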
  4. Extend UpgradeOperation.STATUS_CHOICES with ("pending", _("pending")). I'd append it after aborted so the admin filter dropdown order for the existing five values doesn't shuffle. BatchUpgradeOperation doesn't get a new status — sub-issue 03's calculate_and_update_status() update keeps a batch with pending children sitting at in-progress.

  5. Propagate persistent from batch to child inside DeviceFirmware.create_upgrade_operation() (base/models.py:436–448):

    • When a batch is supplied, set operation.persistent = batch.persistent before full_clean().
    • That's the only place I need to touch — every batch-driven child creation flows through here.
  6. Lock persistent after launch via clean() on both models. On an already-saved instance, I'd pull the stored value (type(self).objects.values_list("persistent", flat=True).get(pk=self.pk)) and raise ValidationError if the in-memory self.persistent doesn't match:

    • AbstractBatchUpgradeOperation: only fire the rejection once status != "idle" (idle is STATUS_CHOICES[0][0], the pre-launch state), so editing while the batch is still queued is still allowed.
    • AbstractUpgradeOperation: no state check needed — the default status is already in-progress, so any saved row is by definition post-launch.
    • clean() runs via full_clean() (from ModelForm.is_valid() and DRF ValidatedModelSerializer.is_valid()); a bare instance.save() skips it. This is a form/API guard, not a DB constraint.
  7. Give every new field a verbose_name=_(...) and help_text=_(...) so the admin form, the REST OpenAPI schema, and inline help all read meaningfully instead of showing raw attribute names.

  8. Ship migration 0018_persistent_fields.py plus the matching tests/openwisp2/sample_firmware_upgrader/migrations/ entry. Purely additive: 4× AddField (persistent on UpgradeOperation, persistent on BatchUpgradeOperation, retry_count, next_retry_at — the last one gets its btree index from db_index=True automatically) plus AlterField for the extended STATUS_CHOICES. Every new field has a default, so existing rows stay valid and no data migration is needed.

  9. Unit tests covering: field defaults; propagation behaviour (with batch / without batch / persistent=False batch); immutability on both models (post-save() for UpgradeOperation, post-idle for BatchUpgradeOperation, including the ValidationError message); that db_index=True on next_retry_at produces an index in the generated migration SQL; and that the migration applies cleanly forward and backward on both empty and seeded databases.
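
To make the propagation and locking rules from bullets 1, 2, 5 and 6 concrete, here is a minimal framework-free sketch. It is not the proposed Django code: the classes are plain dataclasses standing in for the real models, `ValidationError` is a local stand-in for Django's exception, `_stored_persistent` is a hypothetical attribute that mimics the value read back from the database, and the error messages are made up for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


class ValidationError(Exception):
    """Local stand-in for django.core.exceptions.ValidationError."""


@dataclass
class BatchUpgradeOperation:
    """Stand-in for AbstractBatchUpgradeOperation (not a Django model)."""
    persistent: bool = True   # mass upgrades default to "keep retrying"
    status: str = "idle"      # idle is the pre-launch state
    # Hypothetical: mimics the value stored in the DB; None means unsaved.
    _stored_persistent: Optional[bool] = field(default=None, init=False, repr=False)

    def clean(self):
        saved = self._stored_persistent
        # Reject changes only once the batch has left the idle state.
        if saved is not None and self.status != "idle" and saved != self.persistent:
            raise ValidationError("persistent cannot be changed after launch")

    def save(self):
        self._stored_persistent = self.persistent


@dataclass
class UpgradeOperation:
    """Stand-in for AbstractUpgradeOperation (not a Django model)."""
    persistent: bool = False          # standalone upgrades stay fail-fast
    status: str = "in-progress"       # any saved row is already post-launch
    retry_count: int = 0
    next_retry_at: Optional[datetime] = None
    _stored_persistent: Optional[bool] = field(default=None, init=False, repr=False)

    def clean(self):
        saved = self._stored_persistent
        # No status check: a saved operation has always been launched.
        if saved is not None and saved != self.persistent:
            raise ValidationError("persistent cannot be changed after creation")

    def save(self):
        self._stored_persistent = self.persistent


def create_upgrade_operation(batch=None):
    """Sketch of the single propagation point described in bullet 5."""
    operation = UpgradeOperation()
    if batch is not None:
        # Copy the flag before validation, as the proposal describes.
        operation.persistent = batch.persistent
    operation.clean()
    operation.save()
    return operation
```

With these stand-ins, `create_upgrade_operation()` yields an operation with `persistent=False`, while passing a default `BatchUpgradeOperation()` yields `persistent=True`. Flipping the flag on a saved operation and calling `clean()` raises the stand-in `ValidationError`, whereas a bare `save()` would not, which mirrors the form/API-guard caveat in bullet 6.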

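The scanner query from bullet 3 can be tried out against an in-memory SQLite table. This is only a sketch under assumptions: the table name `upgrade_operation`, the index name, the helper `due_operation_ids` and the sample rows are all hypothetical, timestamps are stored as ISO 8601 UTC strings so that string comparison matches chronological order, and the real schema would come from the Django migration rather than hand-written SQL.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

conn = sqlite3.connect(":memory:")
# Hypothetical table mirroring the proposed columns and their defaults.
conn.execute(
    """
    CREATE TABLE upgrade_operation (
        id INTEGER PRIMARY KEY,
        status TEXT NOT NULL DEFAULT 'in-progress',
        persistent INTEGER NOT NULL DEFAULT 0,
        retry_count INTEGER NOT NULL DEFAULT 0,
        next_retry_at TEXT
    )
    """
)
# Plain index, comparable to what db_index=True would request.
conn.execute(
    "CREATE INDEX upgrade_operation_next_retry_at "
    "ON upgrade_operation (next_retry_at)"
)

now = datetime(2026, 1, 1, 12, 0, tzinfo=timezone.utc)
sample_rows = [
    ("success", None),                       # finished, never scheduled
    ("pending", now - timedelta(minutes=5)), # due for a retry
    ("pending", now + timedelta(minutes=5)), # scheduled, not due yet
    ("failed", None),                        # gave up, never scheduled
]
conn.executemany(
    "INSERT INTO upgrade_operation (status, next_retry_at) VALUES (?, ?)",
    [(status, ts.isoformat() if ts else None) for status, ts in sample_rows],
)


def due_operation_ids(connection, moment):
    """Return ids of pending operations whose retry time has passed."""
    cursor = connection.execute(
        "SELECT id FROM upgrade_operation "
        "WHERE status = 'pending' AND next_retry_at <= ? "
        "ORDER BY id",
        (moment.isoformat(),),
    )
    return [row[0] for row in cursor]


due = due_operation_ids(conn, now)
```

Only the second sample row comes back: rows whose `next_retry_at` is NULL never satisfy the `<=` comparison, and the pending row scheduled in the future is skipped until a later tick.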
Metadata

Labels: enhancement, gsoc-idea

Status: ToDo

Milestone: No milestone

Relationships: None yet

Development: No branches or pull requests