05-11-2020, 05:34 AM
Most of the feedback is stuff I've mentioned before in other threads (but let's get it all in one place):
Ideally the active slot would start at 0x00001000 in order to permit SoftDevice firmware to run. This implies linking mcuboot at the end of FLASH (rather than the beginning) and providing a simple binary to load in the first 4k of FLASH. The initial version of this simple binary really can be *very* simple... just set up the vector table to point to mcuboot and jump to the mcuboot reset vector. It might eventually need a couple of extra tweaks but for early development that is all that is required.
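To make the layout concrete, here is a rough sketch in C of what that first-4k binary would have to know. Only the 0x00001000 start address comes from the discussion above; the 512 KiB FLASH size, the 32 KiB mcuboot region and all of the names are assumptions for illustration. The helper functions build on a host so the layout can be sanity-checked; the actual jump only compiles for an ARM Thumb target and assumes a Cortex-M3/M4-style core with a VTOR register.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sizes, for illustration only. */
#define FLASH_SIZE        0x00080000u  /* assumption: 512 KiB internal FLASH */
#define STUB_BASE         0x00000000u
#define STUB_SIZE         0x00001000u  /* the "first 4k of FLASH" */
#define ACTIVE_SLOT_BASE  0x00001000u  /* from the post */
#define MCUBOOT_SIZE      0x00008000u  /* assumption: 32 KiB for mcuboot */
#define MCUBOOT_BASE      (FLASH_SIZE - MCUBOOT_SIZE)

/* The stub must end exactly where the active slot begins, and mcuboot
 * must sit above the active slot's start address. */
static_assert(STUB_BASE + STUB_SIZE == ACTIVE_SLOT_BASE, "stub must fill the first 4k");
static_assert(MCUBOOT_BASE > ACTIVE_SLOT_BASE, "mcuboot must live above the active slot");

/* On Cortex-M, word 0 of a vector table is the initial stack pointer
 * and word 1 is the reset handler address (with the Thumb bit set). */
typedef struct {
    uint32_t initial_sp;
    uint32_t reset_handler;
} vector_head_t;

vector_head_t read_vector_head(const uint32_t *table)
{
    vector_head_t v = { table[0], table[1] };
    return v;
}

/* Returns 1 if the reset handler is a Thumb address inside the
 * (assumed) mcuboot region, 0 otherwise. */
int reset_handler_in_mcuboot(const vector_head_t *v)
{
    uint32_t addr = v->reset_handler & ~1u;  /* strip the Thumb bit */
    return (v->reset_handler & 1u) != 0u
        && addr >= MCUBOOT_BASE
        && addr < FLASH_SIZE;
}

#if defined(__arm__) && defined(__thumb__)
/* Target-only part: point the vector table at mcuboot, load its
 * stack pointer and branch to its reset handler. Does not return. */
void jump_to_mcuboot(void)
{
    const uint32_t *table = (const uint32_t *)MCUBOOT_BASE;
    *(volatile uint32_t *)0xE000ED08u = MCUBOOT_BASE;  /* SCB->VTOR */
    __asm volatile ("msr msp, %0\n\t"
                    "bx  %1"
                    :: "r"(table[0]), "r"(table[1]) : "memory");
}
#endif
```

The range check is optional, but it is cheap and stops the stub from jumping into erased FLASH if mcuboot was never programmed.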
Additionally when you talk about "support for DFU" I think there really are *two* use-cases hiding behind that:
1. An application that, when operating normally, permits DFU by providing BLE interfaces that allow the standby slot to be modified (alongside its other duties running the watch). This is the normal case for production devices and works well when the images are reasonably well controlled (e.g. tested prior to update, etc) and where updates are secure (signed, no promiscuous pairing, etc).
2. A recovery mode, triggered by some special action by the user of the device, which is used when the active application is no longer capable of booting to the point that DFU works.
In the last post you describe a recovery mode that works by swapping the active and standby apps. This assumes that the DFU definitely works in the standby app and I remain worried about that. Given that the standby app in this case is a real smartwatch app that performs duties beyond offering OTA DFU, there is too much risk that other changes on the system could render it inoperable (maybe it has a bug where it crashes at boot when it tries to decode a corrupted or missing image from the external FLASH). It is for these reasons that I advocate implementing recovery mode as a special case that is fully independent of the active/standby swapping code.
How the recovery code works is secondary... I have proposed the recovery slot idea but what really matters is the use-case. In other words I don't think it is sufficient to rely on two closely related smartwatch images (e.g. two slightly different versions of the same Zephyr app) *not* to contain the same latent bug that will brick the watch if it is ever triggered.
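Whatever mechanism ends up being used, the property I care about can be sketched as a boot-time decision in which the recovery check comes first and never looks at the state of the active/standby slots. The sketch below is only an illustration of that ordering: the names, the idea of a "recovery requested" flag (say, a button held at reset) and the three outcomes are all assumptions of mine, not anything mcuboot provides.

```c
#include <stdbool.h>

/* Hypothetical outcomes of the boot decision. */
typedef enum {
    BOOT_ACTIVE,            /* run the active slot as-is */
    BOOT_SWAP_THEN_ACTIVE,  /* perform the pending active/standby swap first */
    BOOT_RECOVERY           /* enter the independent recovery mode */
} boot_target_t;

/* Hypothetical inputs sampled early at boot. */
typedef struct {
    bool recovery_requested;  /* assumption: special user action detected */
    bool swap_pending;        /* assumption: a DFU marked standby for swap */
} boot_inputs_t;

boot_target_t choose_boot_target(const boot_inputs_t *in)
{
    /* Recovery is decided before, and independently of, any swap
     * logic, so a broken image in either slot cannot block it. */
    if (in->recovery_requested) {
        return BOOT_RECOVERY;
    }
    if (in->swap_pending) {
        return BOOT_SWAP_THEN_ACTIVE;
    }
    return BOOT_ACTIVE;
}
```

The point of the ordering is that a user who triggers recovery always gets it, even if a swap is pending and even if both smartwatch images share the same latent bug.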