Firmware validation #169

Closed
@trarbr

Post-boot firmware validation plays a vital role in a robust firmware upgrade
strategy. The nerves_runtime project offers suggestions for how users can
perform firmware validation, but leaves implementation to its users. Some
Nerves systems provide out-of-the-box support for firmware validation, but
others do not, and this discrepancy increases the amount of work that must be
done to properly implement firmware validation. This is a proposal to add
standardized procedures around firmware validation to the Nerves framework, in
order to make this powerful feature more accessible to Nerves users.

Firmware validation is an input to a decision that must be made at boot time:
which target (kernel and rootfs) to boot. This decision must be made
as early as possible in the boot process, in order to protect
against as many firmware faults as possible. The best place to make
the decision is in the bootloader, but bootloaders vary between systems, and
not all are equally capable. This means the method of marking firmware valid
must be provided by the system itself. The most obvious way of doing this is
through fwup scripts.

All the official Nerves systems provide a revert.fw file, and
Nerves.Runtime exposes a revert function to invoke fwup with that
configuration file. In a similar vein, a mark_valid.fw file could be added
to all systems, and a mark_valid function could be added to
Nerves.Runtime. When users have checked that their firmware is valid,
they will invoke the Nerves.Runtime.mark_valid function. If they reboot
without calling this function first, the system will automatically revert
to the previous version of the firmware.

Automatic reverts of bad firmware can be confusing to new users. Thus,
firmware should probably validate itself automatically by default. Users then
need a way to disable firmware autovalidation. I suggest letting users
control this by setting an environment variable (e.g.
NERVES_FW_AUTOVALIDATE) at firmware creation time.
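
As a rough sketch of how a build step might interpret such a variable: the helper name, the accepted spellings, and the default below are assumptions made for illustration, not something the Nerves tooling defines. It only shows the "autovalidate unless explicitly disabled" behaviour suggested above.

```python
import os

# Hypothetical spellings treated as "autovalidation disabled".
_FALSE_VALUES = {"0", "false", "no", "off"}


def autovalidate_enabled(env=None):
    """Return True unless NERVES_FW_AUTOVALIDATE is set to a false-like value."""
    env = os.environ if env is None else env
    raw = env.get("NERVES_FW_AUTOVALIDATE")
    if raw is None:
        # Default suggested in the proposal: firmware validates itself.
        return True
    return raw.strip().lower() not in _FALSE_VALUES
```

With this sketch, leaving the variable unset keeps the beginner-friendly default, and a user who wants real validation checks sets it to `0` when creating the firmware.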

In order to make the decision about which target to boot, a few facts must be
known, and their state must be kept across reboots as persistent variables:

  • What is the intended target? nerves_fw_active
  • Has the intended target been booted? nerves_fw_booted
  • Has the intended target been validated? nerves_fw_validated

The variables will be modified at different points during the firmware lifecycle:

  1. When firmware is burned onto the device for the first time, it will be assumed valid:
    • nerves_fw_active = a
    • nerves_fw_booted = 0
    • nerves_fw_validated = 1
  2. When firmware is upgraded, the active target will change and validation state reset:
    • nerves_fw_active = b
    • nerves_fw_booted = 0
    • nerves_fw_validated = 0
  3. When the system is rebooted for the first time after upgrading, the boot attempt is recorded:
    • nerves_fw_booted = 1
  4. Once booted, the user's Nerves application will perform validation checks:
    1. If the firmware validation checks succeed, the firmware is marked as valid:
      • nerves_fw_validated = 1
    2. If the firmware validation checks fail, the system will reboot without marking the firmware valid, and the boot process will revert to the previous target:
      • nerves_fw_active = a
      • nerves_fw_validated = 1

How exactly this is done varies from system to system, but the methodology is
the same for all systems.
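
The lifecycle above can be modelled as a small state machine. The following is only an illustrative sketch: the `FwState` class and the function names are hypothetical, and a real system would keep these values in the bootloader's persistent environment rather than in a Python object. The behaviour of `nerves_fw_booted` after a revert is not specified in the steps above, so the sketch leaves it untouched.

```python
from dataclasses import dataclass


@dataclass
class FwState:
    # Step 1: freshly burned firmware is assumed valid.
    active: str = "a"
    booted: int = 0
    validated: int = 1


def other(slot):
    """Return the opposite target slot."""
    return "b" if slot == "a" else "a"


def upgrade(state, autovalidate=False):
    """Step 2: switch the active target and reset the validation state."""
    state.active = other(state.active)
    state.booted = 0
    state.validated = 1 if autovalidate else 0


def bootloader_select(state):
    """Decide which target to boot, updating the persistent variables."""
    if state.validated == 0 and state.booted == 1:
        # Step 4.2: the target was tried once and never validated, so revert.
        state.active = other(state.active)
        state.validated = 1
    elif state.booted == 0:
        # Step 3: record the first boot attempt.
        state.booted = 1
    return state.active


def mark_valid(state):
    """Step 4.1: called by the application once its checks succeed."""
    state.validated = 1
```

In this model, an upgrade followed by two boots without `mark_valid` ends up back on the previous target, while calling `mark_valid` after the first boot keeps the new target active.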

If users have not disabled automatic firmware validation, the firmware will
validate itself. This can be done by setting nerves_fw_validated = 1 in the
firmware upgrade task or on the first boot.

Also, note that firmware upgrade scripts should require current firmware to be
marked valid before performing an upgrade. This is to ensure a safe fallback
in case of a bad upgrade.
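
A minimal sketch of that guard, assuming the persistent variables are available as a string-to-string mapping (as bootloader environments typically store them). The function and exception names are hypothetical; in practice this check would live in the system's fwup upgrade script.

```python
class UnvalidatedFirmwareError(Exception):
    """Raised when an upgrade is attempted from unvalidated firmware."""


def begin_upgrade(variables):
    """Return the variables for the upgraded system, or refuse to upgrade."""
    if variables.get("nerves_fw_validated") != "1":
        # Upgrading now would overwrite the only known-good fallback target.
        raise UnvalidatedFirmwareError("current firmware is not marked valid")
    updated = dict(variables)
    current = variables.get("nerves_fw_active")
    updated["nerves_fw_active"] = "b" if current == "a" else "a"
    updated["nerves_fw_booted"] = "0"
    updated["nerves_fw_validated"] = "0"
    return updated
```

Refusing the upgrade here means the inactive target always holds firmware that was validated at some point, which is what makes the revert path trustworthy.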

Bootloader logic for checking validation is already implemented in
nerves_system_bbb.
Here, firmware is marked valid by setting a U-Boot variable. However, this
approach to marking firmware valid is not usable for all Nerves systems (see
Grub2 below). Also, it is not possible for users to disable autovalidation at
firmware creation time. Thus I suggest adding a fwup file to mark firmware
valid, and adding an environment variable to disable autovalidation at
firmware creation time.

Very similar logic for validation checks can be performed in systems which use
the Grub2 bootloader (e.g.
nerves_system_x86_64).
However, Grub2 uses the Grub environment block (a file with a certain
structure) rather than the U-Boot environment. This means implementation will
be different. In fact, it requires jumping through a few hoops since fwup
can't modify the Grub environment block. But it can be done, and the interface
can be the same: a fwup file that marks firmware valid, and an environment
variable to disable autovalidation at firmware creation time.
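
To illustrate what "a file with a certain structure" means here, the sketch below builds and parses a Grub environment block, assuming the commonly documented layout: a fixed 1024-byte file that starts with a signature line, followed by `key=value` lines, padded to full size with `#` characters. The helper names are hypothetical, and this is not the workaround used in the partial implementation mentioned below; it only shows the format.

```python
GRUBENV_SIZE = 1024
GRUBENV_HEADER = b"# GRUB Environment Block\n"


def build_grubenv(variables):
    """Serialize variables into a fixed-size Grub environment block."""
    body = b"".join(
        f"{key}={value}\n".encode("ascii")
        for key, value in sorted(variables.items())
    )
    content = GRUBENV_HEADER + body
    if len(content) > GRUBENV_SIZE:
        raise ValueError("variables do not fit in the environment block")
    # Pad with '#' so the file is always exactly GRUBENV_SIZE bytes.
    return content + b"#" * (GRUBENV_SIZE - len(content))


def parse_grubenv(block):
    """Read key=value pairs back out of an environment block."""
    if not block.startswith(GRUBENV_HEADER):
        raise ValueError("missing Grub environment block signature")
    variables = {}
    for line in block[len(GRUBENV_HEADER):].split(b"\n"):
        if line.startswith(b"#") or b"=" not in line:
            continue  # skip padding and empty lines
        key, value = line.decode("ascii").split("=", 1)
        variables[key] = value
    return variables
```

Because the block has a fixed size, a complete block can be generated ahead of time and written as raw bytes, which is one conceivable way to work around a tool that cannot edit individual entries.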

The Raspberry Pi bootloader is, unfortunately, unable to modify its boot
environment. Because of this, the decision about which firmware to boot can
only be made after booting. This can be done with a shell script evaluated by
erlinit prior to booting the Nerves application. There is also another reason
why this is necessary: On RPi systems we need to rewrite the MBR in order to
revert the firmware. This means the Raspberry Pi will not be as capable in
terms of reverting bad firmware: if the system crashes before erlinit can
invoke the shell script, the system will be caught in an endless reboot loop.
Unfortunately, I don't see a way out of this due to the design of its
bootloader. Apart from this deviation, the interface to firmware validation can
be the same: a fwup file that marks firmware valid, and an environment
variable to disable autovalidation at firmware creation time.

I would like to hear your opinions on this proposal. To check that the idea is
viable, I have partially complete implementations of this for Grub2 and
Raspberry Pi based systems. If you want to see the changes, I can submit draft
PRs for you to check out. If there's a consensus that this is a good idea, I'd be
happy to move forward with the implementations.
