A bootloader for STM32
With this bootloader I want to achieve 4 key goals:
- DFU - Ability to update device firmware using IO peripherals instead of a debugger module
- Security - Only run trusted application binaries that have been verified
- Portability - The ability to port this across different boards and use different IO peripherals to download the firmware
- Safe Rollback - If a loaded firmware fails or crashes, there must be the ability to revert to the previous version
Iteration 1 of the bootloader will provide the following:
- Dual image flash architecture
- CRC image validation
- Automatic rollback on update failure
Iteration 2:
- Watchdog app rollback
A single image architecture can support larger application sizes and has a simpler flash layout. However, it cannot roll back when an app update fails, because the sectors that store the working app image will already have been erased by the time the failure is detected. For example, if a power loss, reset, or transmission failure occurs during the update, the device is left without a working app and cannot function until it is updated again. A dual image architecture retains the previous image in case the update fails. This comes at the cost of reduced app sizes and added architectural complexity, but it provides a significantly more stable and robust DFU mechanism.
With the chosen dual image mechanism, I wanted the applications compiled for DFU to be unaware of which slot they will occupy. The issue is the addresses in the vector table, i.e. the vector table contains absolute addresses that are only valid for the image slot the binary was compiled for. To overcome this, there are 2 options:
- Relocate the vector table to RAM (done by the bootloader) and update the addresses in the vector table with the proper offset.
- Build 2 application binaries, 1 for each slot, then query the bootloader for which one needs to be updated and send the appropriate one.
Although option 1 enables compiling a single binary, I anticipate issues with things like function pointers not working. From reading more about PIC/PIE (Position Independent Code/Position Independent Executable), getting it to function properly seems to be unreliable. The downside of option 2 is that two application binaries must be 'shipped' for DFU, but it provides more stability than option 1. Having to 'ship' 2 images does not, in most cases, come at much additional cost, as most new host devices (the devices which upload the image to the target) have enough storage capacity to store both. For this reason I chose option 2.
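To make option 2 concrete, here is a minimal host-side sketch of picking the right binary once the bootloader has said which slot needs updating. The one-byte reply codes, the file names, and the `build` directory are all hypothetical; the design above does not fix any of them.

```python
from pathlib import Path

# Hypothetical reply codes and file names: 0 means "update slot A", 1 means "update slot B".
SLOT_BINARIES = {0: "app_slot_a.bin", 1: "app_slot_b.bin"}


def binary_for_reply(reply: bytes, build_dir: str = "build") -> Path:
    """Map the bootloader's (assumed) one-byte slot reply to the binary linked for that slot."""
    if len(reply) != 1 or reply[0] not in SLOT_BINARIES:
        raise ValueError(f"unexpected slot reply: {reply!r}")
    return Path(build_dir) / SLOT_BINARIES[reply[0]]
```

The host keeps both binaries on disk and only decides which one to send after the query, so the application build itself never needs to know which slot is currently active.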
CRC validation is needed to detect any corruption of the image that was written to flash. It helps prevent jumping to an image that was corrupted during transmission, by flash memory bit flips, or in some other way. If a CRC mismatch is detected, the bootloader will jump to the previously working image. The host must generate a CRC32 for the app binary and add it to an 'app header' along with the image size (used to detect when transmission is complete). The STM32's CRC peripheral can be used to generate a new CRC32 for the downloaded image, which is then checked against the CRC32 in the app's header.
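One thing to watch for on the host side: the CRC unit on STM32F4 parts uses the polynomial 0x04C11DB7 with an initial value of 0xFFFFFFFF, consumes whole 32-bit words, and applies no bit reflection or final XOR. This is not the same variant that `zlib.crc32` computes. The sketch below is a host-side model of that behaviour, not the bootloader's actual code. Padding a partial last word with 0xFF is my own assumption and would have to match whatever the firmware does.

```python
POLY = 0x04C11DB7  # fixed polynomial of the STM32F4 CRC unit


def crc32_mpeg2(data: bytes, crc: int = 0xFFFFFFFF) -> int:
    """Bitwise CRC-32/MPEG-2: MSB first, no reflection, no final XOR."""
    for byte in data:
        crc ^= byte << 24
        for _ in range(8):
            if crc & 0x80000000:
                crc = ((crc << 1) ^ POLY) & 0xFFFFFFFF
            else:
                crc = (crc << 1) & 0xFFFFFFFF
    return crc


def stm32_word_crc(image: bytes) -> int:
    """Model feeding the image to the CRC data register one little-endian word at a time."""
    # Assumption: a partial last word is padded with 0xFF (the erased-flash value).
    padded = image + b"\xFF" * (-len(image) % 4)
    # Each word is processed most significant byte first, so reverse every 4-byte group.
    swapped = b"".join(padded[i:i + 4][::-1] for i in range(0, len(padded), 4))
    return crc32_mpeg2(swapped)
```

If the host and the target disagree on any of these details (byte order, padding, final XOR), every image will fail validation, so it is worth checking the two sides against a known buffer before debugging anything else.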
STM32 chips provide 2 watchdogs: Independent (IWDG) and Windowed (WWDG). The IWDG uses its own clock source (the LSI oscillator) as opposed to the sysclk-derived clock used by the WWDG, which gives it the ability to detect clock misconfigurations. The WWDG can additionally ensure that feeds only occur within a defined time window, but this is not needed for this project. Since the IWDG keeps running through sysclk failures, the IWDG is used. Once the image is marked as valid and ready to boot, the IWDG must be enabled before the app starts, and the app must then feed it. This detects application, clock, or CPU failures and resets the board. The bootloader must check whether the board was reset by the watchdog (by reading the IWDGRSTF flag, bit 29 of RCC_CSR; bit 30 is the WWDG flag; see sec. 6.3.18 of the reference manual), and if a certain number of such resets have occurred, it must roll back to the previous image so that an infinite boot loop does not occur.
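The rollback decision can be modelled in a few lines. This is only a Python sketch of the logic; the real version would be C in the bootloader. The threshold of 3, the choice to clear the counter on any non-watchdog reset, and the idea that the counter lives somewhere persistent (e.g. the metadata sector) are all assumptions of mine. The bit position follows the CMSIS headers for STM32F4, where IWDGRSTF is bit 29.

```python
from typing import Tuple

RCC_CSR_IWDGRSTF = 1 << 29  # independent watchdog reset flag (CMSIS bit position)
MAX_WATCHDOG_RESETS = 3     # hypothetical threshold before rolling back


def next_boot_action(rcc_csr: int, wdg_reset_count: int) -> Tuple[str, int]:
    """Return (action, updated_count) for one pass through the bootloader."""
    if not rcc_csr & RCC_CSR_IWDGRSTF:
        # Assumption: any other reset cause (power-on, pin, software) clears the streak.
        return "boot_active", 0
    wdg_reset_count += 1
    if wdg_reset_count >= MAX_WATCHDOG_RESETS:
        # Too many watchdog resets in a row: give up on this image.
        return "rollback", 0
    return "boot_active", wdg_reset_count
```

On real hardware the reset flags are sticky, so the bootloader would also need to clear them (by setting RMVF in RCC_CSR) after reading, otherwise a later unrelated reset would be counted as another watchdog reset.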
A host application (e.g. a Python script) makes loading firmware onto the board easier. It adds some app header bytes to the compiled binary. For the first iteration, only the application size and CRC are needed. The script coordinates with the target board through the COM port, first transmitting the header bytes, followed by the app image bytes. After the download, the bootloader validates the app on the target side and responds to the host accordingly.
For the first iteration, the bootloader checks whether the button is pressed during reset and, if so, goes into DFU mode; if the button is not pressed, the existing "active" app image is run.
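As a rough idea of what the host script does before anything goes over the COM port, the sketch below builds the byte stream: header first, then the image. The header layout (two little-endian 32-bit fields, size then CRC) is a guess on my part, and `crc_fn` defaults to `zlib.crc32` purely as a placeholder; it has to be swapped for whatever variant the bootloader actually computes.

```python
import struct
import zlib

# Hypothetical header layout: little-endian uint32 size, then little-endian uint32 CRC32.
HEADER_FORMAT = "<II"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 8 bytes


def build_dfu_stream(image: bytes, crc_fn=zlib.crc32) -> bytes:
    """Return header + image, in the order the host transmits them."""
    # crc_fn is a placeholder; it must match the CRC the target computes.
    crc = crc_fn(image) & 0xFFFFFFFF
    return struct.pack(HEADER_FORMAT, len(image), crc) + image


def parse_header(stream: bytes) -> tuple:
    """Recover (size, crc) from the front of a stream, as the target would."""
    return struct.unpack_from(HEADER_FORMAT, stream)
```

Because the size arrives before the image, the target knows exactly how many bytes to expect and can tell when the transfer is complete without a separate end marker.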
Figure 1: Bootloader flow chart
Figure 2: DFU State Machine
On this board, there is 512KB of flash memory, split into 8 sectors.
Figure 3: Memory layout. STM32F401 Reference manual, p45
Each sector can be erased independently, so the bootloader, the metadata, and the applications must each start and end on sector boundaries to allow for DFU. The bootloader takes up sectors 0 and 1 (16KB + 16KB), the metadata takes up sector 2 (16KB), and the applications use the rest. This allows application sizes of up to 208KB, which is the size of the smaller of the two app slots (App A).
----------------------------------
| S0 (16KB)  | Bootloader        |
----------------------------------
| S1 (16KB)  | Bootloader        |
----------------------------------
| S2 (16KB)  | Metadata          |
----------------------------------
| S3 (16KB)  | App A (start)     |
----------------------------------
| S4 (64KB)  | App A (continued) |
----------------------------------
| S5 (128KB) | App A (continued) |
----------------------------------
| S6 (128KB) | App B (start)     |
----------------------------------
| S7 (128KB) | App B (continued) |
----------------------------------
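The start addresses and slot sizes follow directly from the sector sizes in the table above. The sketch below works them out, assuming the usual STM32 flash base address of 0x08000000; the region names are just labels I picked.

```python
FLASH_BASE = 0x08000000
KB = 1024
SECTOR_SIZES_KB = [16, 16, 16, 16, 64, 128, 128, 128]  # sectors 0..7

# Sector ownership taken from the table above: (first_sector, last_sector).
REGIONS = {"bootloader": (0, 1), "metadata": (2, 2), "app_a": (3, 5), "app_b": (6, 7)}


def region_span(first: int, last: int) -> tuple:
    """Return (start_address, size_in_bytes) of sectors first..last inclusive."""
    start = FLASH_BASE + sum(SECTOR_SIZES_KB[:first]) * KB
    size = sum(SECTOR_SIZES_KB[first:last + 1]) * KB
    return start, size


def max_app_size() -> int:
    """An app must fit in either slot, so the smaller slot sets the limit."""
    return min(region_span(*REGIONS[name])[1] for name in ("app_a", "app_b"))
```

With this layout App A starts at 0x0800C000 and spans 208KB, while App B starts at 0x08040000 and spans 256KB, so 208KB is the usable application size.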
- Integrate the firmware size and CRC into the app binary and generate these during the build. The current method of sending these as part of the handshake can only detect failures during the DFU download process. Integrating them into the image binary itself will expand the CRC coverage to a larger part of the DFU pipeline.
[1] STM32F401 Reference manual
[2] From Zero to main(): How to Write a Bootloader from Scratch, François Baldassari
