drivers/mtd/w25n: implement bad block management

Use the chip's built-in 20-entry non-volatile Bad Block Management
Look-Up Table (datasheet section 8.2.7) to transparently route around
bad blocks.

Init:
- Reserve the top 24 blocks of the array as a spare pool
- Clamp the MTD geometry to W25N_USER_BLOCKS = 1000 so upper layers
  never see the spare area (125 MiB usable instead of 128 MiB)
- Force BUF=1 alongside enabling ECC. The W25N01GVxxIT variant
  powers up with BUF=0 (Continuous Read mode), in which Read Data
  ignores the column address and always starts at byte 0. This
  silently broke any read targeting a non-zero column (OOB markers,
  sub-page reads in w25n_read).
- Scan all 1024 blocks for factory bad markers (non-FFh at byte 0 of
  the spare area of page 0) and remap any user-area bad blocks via the
  A1h BBM command. Idempotent across reboots: blocks already present
  in the LUT are skipped, so repeated scans don't consume LUT slots.
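
The per-block decision made by the init scan can be sketched as below.
This is an illustration only, not the driver's code: the names
lut_entry, scan_action and scan_decide are hypothetical, the cached LUT
layout is assumed, and the block size assumes the W25N01GV geometry of
64 pages of 2048 bytes per block.

```c
#include <stdbool.h>
#include <stdint.h>

#define W25N_TOTAL_BLOCKS 1024u          /* from the commit text */
#define W25N_USER_BLOCKS  1000u          /* from the commit text */
#define W25N_LUT_ENTRIES  20u            /* from the commit text */
#define W25N_BLOCK_BYTES  (64u * 2048u)  /* assumed: 64 pages x 2048 B */

/* Hypothetical cached view of one LUT link. */
struct lut_entry
{
  bool     used;
  uint16_t lba;
  uint16_t pba;
};

enum scan_action
{
  SCAN_SKIP_GOOD,            /* marker is FFh, nothing to do */
  SCAN_SKIP_ALREADY_MAPPED,  /* LBA already in the LUT, keep the slot */
  SCAN_SKIP_SPARE,           /* bad block inside the spare pool */
  SCAN_REMAP                 /* bad user block, needs an A1h remap */
};

static bool lut_contains_lba(const struct lut_entry *lut, uint16_t lba)
{
  for (unsigned i = 0; i < W25N_LUT_ENTRIES; i++)
    {
      if (lut[i].used && lut[i].lba == lba)
        {
          return true;
        }
    }

  return false;
}

/* Decide what the scan does with one block, given the marker byte
 * read from byte 0 of the spare area of page 0.
 */

static enum scan_action scan_decide(const struct lut_entry *lut,
                                    uint16_t block, uint8_t marker)
{
  if (lut_contains_lba(lut, block))
    {
      return SCAN_SKIP_ALREADY_MAPPED;  /* idempotent across reboots */
    }

  if (marker == 0xff)
    {
      return SCAN_SKIP_GOOD;
    }

  if (block >= W25N_USER_BLOCKS)
    {
      return SCAN_SKIP_SPARE;  /* never remap the spare pool itself */
    }

  return SCAN_REMAP;
}
```

Checking the LUT before the marker is what keeps a repeated scan from
spending a second slot on the same LBA.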

Runtime:
- On E-FAIL from w25n_block_erase or P-FAIL from w25n_program_execute,
  allocate a spare and issue A1h, then retry the operation once. The
  chip routes the retry to the spare PBA transparently. The data
  buffer is reloaded before a program retry.
- Uncorrectable read ECC is left as -EIO (soft errors shouldn't trigger
  permanent remap, and remapping discards data we may still recover).
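
The retry-once policy above can be sketched as follows. This is a
minimal illustration under stated assumptions: OP_CHIP_FAIL, block_op_t,
remap_fn_t, op_with_remap_retry and the fake_* helpers are hypothetical
names, and the fake device only stands in for real hardware so the flow
can be exercised on a host.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical status codes: 0 = success, OP_CHIP_FAIL = the status
 * register reported E-FAIL or P-FAIL, negative errno = bus error.
 */

#define OP_CHIP_FAIL 1

typedef int (*block_op_t)(void *ctx, uint16_t lba);
typedef int (*remap_fn_t)(void *ctx, uint16_t lba);

/* Run op once; on a chip-reported failure remap the LBA to a spare
 * and retry exactly once. A second failure is returned as -EIO.
 */

static int op_with_remap_retry(void *ctx, uint16_t lba,
                               block_op_t op, remap_fn_t remap)
{
  int ret = op(ctx, lba);

  if (ret != OP_CHIP_FAIL)
    {
      return ret;  /* success, or an unrelated bus error */
    }

  ret = remap(ctx, lba);
  if (ret < 0)
    {
      return ret;  /* e.g. no free spare or no free LUT slot */
    }

  ret = op(ctx, lba);
  return (ret == OP_CHIP_FAIL) ? -EIO : ret;
}

/* Host-side stand-in for the chip, for illustration only. */

struct fake_dev
{
  bool block_bad;
  bool remapped;
  int  remap_calls;
};

static int fake_erase(void *ctx, uint16_t lba)
{
  struct fake_dev *dev = ctx;

  (void)lba;
  return (dev->block_bad && !dev->remapped) ? OP_CHIP_FAIL : 0;
}

static int fake_remap(void *ctx, uint16_t lba)
{
  struct fake_dev *dev = ctx;

  (void)lba;
  dev->remapped = true;
  dev->remap_calls++;
  return 0;
}
```

For the program path, the op callback would also have to reload the
data buffer before issuing program execute, as noted above.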

Safeguards against burning LUT slots on bogus bad blocks:
- w25n_pick_free_spare erases each candidate spare as an active proof
  of life before returning it - the factory OOB marker alone isn't
  trusted.
- w25n_bbm_swap rejects A1h with LBA outside the user area or PBA
  outside the spare pool.
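
A sketch of the range check, assuming the A1h command is framed as the
opcode followed by a 16-bit LBA and a 16-bit PBA, most significant byte
first (my reading of the datasheet, worth verifying). The function name
bbm_build_swap_cmd is hypothetical and is not the driver's
w25n_bbm_swap.

```c
#include <errno.h>
#include <stdint.h>

#define W25N_TOTAL_BLOCKS 1024u  /* from the commit text */
#define W25N_USER_BLOCKS  1000u  /* from the commit text */
#define W25N_CMD_BBM_SWAP 0xa1u  /* A1h, from the commit text */

/* Validate an LBA to PBA link and, if acceptable, fill in the command
 * bytes. Returns 0 on success or -EINVAL if the link would waste a
 * LUT slot on a nonsensical mapping.
 */

static int bbm_build_swap_cmd(uint16_t lba, uint16_t pba, uint8_t cmd[5])
{
  if (lba >= W25N_USER_BLOCKS)
    {
      return -EINVAL;  /* only user-area blocks may be remapped */
    }

  if (pba < W25N_USER_BLOCKS || pba >= W25N_TOTAL_BLOCKS)
    {
      return -EINVAL;  /* the target must be inside the spare pool */
    }

  cmd[0] = W25N_CMD_BBM_SWAP;
  cmd[1] = (uint8_t)(lba >> 8);    /* assumed byte order: MSB first */
  cmd[2] = (uint8_t)(lba & 0xff);
  cmd[3] = (uint8_t)(pba >> 8);
  cmd[4] = (uint8_t)(pba & 0xff);
  return 0;
}
```

Rejecting the link before anything goes on the bus matters because LUT
entries are non-volatile and there are only 20 of them.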

Stack discipline for the logger-thread hot path:
- The 20-entry cached LUT lives in the device struct, not on the stack.
- w25n_read_bbm_lut decodes 4 bytes at a time instead of reading the
  full 80-byte LUT dump into a local buffer.
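
Decoding one entry at a time might look like the sketch below. The
entry layout is an assumption from my reading of the datasheet and
should be checked: each entry is a 16-bit LBA word followed by a 16-bit
PBA word, MSB first, with bit 15 of the LBA word as "enabled" and bit 14
as "invalid". The names lut_entry, read4_fn_t, lut_load and mem_read4
are hypothetical, and mem_read4 is only a host-side stand-in for the
SPI transfer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define W25N_LUT_ENTRIES 20u  /* from the commit text */

struct lut_entry
{
  bool     enabled;
  bool     invalid;
  uint16_t lba;
  uint16_t pba;
};

/* Callback that delivers the next 4 raw bytes of the LUT stream. */

typedef int (*read4_fn_t)(void *ctx, uint8_t out[4]);

static void lut_decode_entry(const uint8_t raw[4], struct lut_entry *out)
{
  uint16_t lba_word = (uint16_t)((raw[0] << 8) | raw[1]);
  uint16_t pba_word = (uint16_t)((raw[2] << 8) | raw[3]);

  out->enabled = (lba_word & 0x8000u) != 0;  /* assumed bit position */
  out->invalid = (lba_word & 0x4000u) != 0;  /* assumed bit position */
  out->lba     = lba_word & 0x03ffu;         /* 10 bits: 1024 blocks */
  out->pba     = pba_word & 0x03ffu;
}

/* Fill the cache (kept in the device struct, per the commit text)
 * using only a 4-byte scratch buffer on the stack. Returns the number
 * of enabled entries, or a negative value from the reader.
 */

static int lut_load(struct lut_entry *cache, read4_fn_t read4, void *ctx)
{
  uint8_t raw[4];
  int used = 0;

  for (unsigned i = 0; i < W25N_LUT_ENTRIES; i++)
    {
      int ret = read4(ctx, raw);

      if (ret < 0)
        {
          return ret;
        }

      lut_decode_entry(raw, &cache[i]);
      if (cache[i].enabled)
        {
          used++;
        }
    }

  return used;
}

/* Memory-backed reader, for illustration only. */

struct mem_src
{
  const uint8_t *buf;
  size_t         pos;
};

static int mem_read4(void *ctx, uint8_t out[4])
{
  struct mem_src *src = ctx;

  memcpy(out, src->buf + src->pos, 4);
  src->pos += 4;
  return 0;
}
```

The scratch buffer is 4 bytes instead of 80, which is the point of the
change on a thread with a small stack.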

Boot diagnostics are emitted via syslog so they appear unconditionally:
- [w25n] BBM scan summary (new/remapped/unremapped/previously-remapped/
  LUT slots used)
- [w25n] W25N01GV ready line with user blocks, spare count, geometry,
  and actual SPI frequency
- [w25n] per-remap info and warnings on runtime E-FAIL/P-FAIL paths

Note: existing littlefs filesystems become unmountable because the
block count shrinks from 1024 to 1000; both PX4 board init.c paths
already mount with autoformat so they reformat on first boot after
this change.

Signed-off-by: Julian Oes <julian@oes.ch>
Julian Oes
2026-04-14 09:22:25 +12:00
committed by Michal Lenc
parent f51052d780
commit 0c287d6f45
+443 -13