-
-
Couldn't load subscription status.
- Fork 1.6k
Description
While working on #4395 I discovered a serious fundamental problem with how LFS is used in framework-arduinoadafruitnrf52. The LFS used on this project is configured as follows:
#define FLASH_NRF52_PAGE_SIZE 4096
#define LFS_FLASH_TOTAL_SIZE (7*FLASH_NRF52_PAGE_SIZE)
#define LFS_BLOCK_SIZE 128
static struct lfs_config _InternalFSConfig =
{
...
.sync = _internal_flash_sync,
.read_size = LFS_BLOCK_SIZE,
.prog_size = LFS_BLOCK_SIZE,
.block_size = LFS_BLOCK_SIZE,
.block_count = LFS_FLASH_TOTAL_SIZE / LFS_BLOCK_SIZE,
.lookahead = 128,
}
The key thing is that the nrf52 flash pages are 4096 bytes long. But the LFS block size is only 128 bytes long. This is super unfortunate because this breaks the LFS guarantee of being resistent (in fact, normally, immune) to corruption due to CPU reset/loss-of-power while writing files. LFS is designed with the assumption that if flash can be corrupted only the current blocks it has 'in flight' that it hasn't yet called _internal_flash_sync() upon are at risk.
I understand why the adafruit devs probably had to accept this limitation because on the existing nrf52840
memory map there is only 7 pages (about 28KB) available. If they had done the 'normal' LFS thing of one BLOCK to one PAGE, there would only be seven pages - which isn't much of a filesystem (i.e. an absolute max of threeish 4KB files, and their associated metadata).
relationship between block size (128) and flash page size (4096)
Why putting 32 128 byte BLOCKS into a single 4096 byte PAGE causes problems:
The LFS sync operation is supposed to guarantee that all previous writes have committed to disk. However if we fail during this operation in our flash_cache routines (because of reset/loss-of-power) we will lose NOT JUST the LFS blocks that were just written but potentially/likely other blocks which were unlucky enough to be on the same FLASH page. This breaks the LFS assumptions about how reliable the underlying blocks are. LFS can't cope with random other blocks (which as far as it's concerned we weren't even writing to) getting corrupted during sync().
The current adafruit flash_cache has all reads/writes go to a 4K scratch RAM buffer. The actual programming of FLASH only happens in _internal_flash_sync() - by erasing a 4K flash page (slow) and then two 2K writes (slow).
So fundamentally: if we get reset while doing any portion of that page erase/write operation we will lose not just the 1 (ish) LFS block which was being synced, but up to 31ish random filesystem blocks who were unlucky to be nearby.
Ouch.
Possible fix we could apply now
I considered the idea of minimizing the window of risk to those 31 innocent bystander blocks. Instead of waiting till sync() to do the flash page erase and write, what if we do the erase earlier - once we have pulled the page into our 4KB ram cache (and now consider it dirty). Rough description of what this change would mean:
- _internal_flash_read/flash_nrf5x_read - no change to current logic.
- _internal_flash_prog/flash_nrf5x_write/flash_cache_write - after reading from cache, immediately start the block erase and rewrite all of the non 0xff bytes which are not in the current 128 byte block.
- sync/flash_cache_flush - write the modified bytes for the current 128 byte LFS block (we can assume that the erase has already occurred)
Alas - I rejected doing this work now because: we could still toast a bunch unrelated 128 byte LFS blocks if we lost power during cache_write. Since the risk of loss is still high IMO not worth doing this improvement.
Possible fixes in the 3.0 timeframe
Given these limitations (and the small number of files involved - about 5 files from meshtastic and one file (BLE pairing info) from adafruit). I propose the following:
- Encourage nrf52 based hw vendors to always include a cheap (and large) SPI FLASH chip. If such a chip is available, use that with LFS (where LFS can work great - see our esp32 boards for example)
- Make (or find in open-source land) a new SimpleBlockFS class - which maps from names to blocks. This would allow us to have up to 7 files inside the internal nrf52 FS. And we'd be guaranteed to lose only a single file if losing power while writing that file.
- Store the device config in a separate file from the (non critical and large) nodeDB file
- Have SafeFile (or similar) do standard 'reliable embedded system' 'flippy-flop' A-B failover between two device-config files, so that if writing fails to one the other is guaranteed to be good.
Given the size of this change (and non backwards compatible flash representation) I suspect ya'll will want this in 3.0/not-sooner. With the fixes in #4395/#4184 the flash on nrf52 (I think) is now quite reliable EXCEPT: If you cycle the power while writing to it (fortunately we write to the flash very rarely).