Lup Yuen Lee 12e8f92a28 CI: Retry build upon failure
In Jan-Feb 2026: NuttX CI hit a [record high usage of GitHub Runners](https://github.com/apache/nuttx/issues/17914), exceeding the limit enforced by ASF Infrastructure Team. We analysed the PRs and discovered that most GitHub Runners were wasted on __(1) Failure to Download the Build Dependencies__ for DTC Device Tree, OpenAMP Messaging, MicroADB Debugger, MCUBoot Bootloader, NimBLE Bluetooth, etc __(2) Resubmitting PR Commits__:

- [Video: Analysing the Most Expensive PR](https://youtu.be/swFaxaTCEQg)
- [Video: Second Most Expensive PR](https://youtu.be/uSpQkzBogEw)
- [Video: Third Most Expensive PR](https://youtu.be/J7w1gyjwZ1w)
- [Video: Most Expensive Apps PR](https://youtu.be/182h8cRpfvI)
- [Spreadsheet: Most Expensive PRs](https://docs.google.com/spreadsheets/d/1HY7fIZzd_fs3QPyA0TX7vsYOjL86m1fNOf1Wls93luI/edit?gid=70515654#gid=70515654)

Why would __Download Failures__ waste GitHub Runners? That's because Download Failures will terminate the Entire CI Build (across All CI Jobs), requiring a restart of the CI Build. And the CI Build isn't terminated immediately upon failure: NuttX CI waits for the CI Job to complete (e.g. arm-01), before terminating the CI Build. Which means that CI Builds can get terminated 2.5 hours into the CI Build, wasting 2.5 elapsed hours x [7.4 parallel processes](https://lupyuen.org/articles/ci3#live-metric-for-full-time-runners) of GitHub Runners.

This PR proposes to __Retry the Build for Each CI Target__. NuttX CI shall rebuild each CI Target (e.g. `sim:nsh`), upon failure, up to 3 times (total 4 builds). Each rebuild will be attempted after a Randomised Delay with Exponential
Backoff, initially set to 60 seconds, then 120 seconds, 240 seconds. The rebuilds will mitigate the effects of Intermittent Download Failures that occur in GitHub Actions. (And eliminate developer frustration)

If the build fails after 3 retries: Subsequent CI Targets will __not be allowed to rebuild__ upon failure. This is to prevent cascading build failures from overloading GitHub Actions, and consuming too many GitHub Runners.

Note that NuttX CI shall retry the build for __Any Kind of Build Failure__, including Download Failures, Compile Errors and Config Errors. We designed it simplistically due to our current constraints: (1) Lack of CI Expertise (2) NuttX CI is Mission Critical (3) Legacy CI Scripts are Highly Complex. To prevent Compile Errors and Config Errors: We expect NuttX Devs to [Build and Test PRs in Our Own Repos](https://github.com/apache/nuttx/issues/18568), before submitting to NuttX.

What about __Resubmitting PR Commits__ and its wastage of GitHub Runners? We also require NuttX Devs to [Build and Test PRs in Our Own Repos](https://github.com/apache/nuttx/issues/18568), before resubmitting to NuttX. GitHub Runners will then be charged to the developer's quota, without affecting the GitHub Runners quota for Apache NuttX Project. We plan to [Kill All CI Jobs](https://youtu.be/182h8cRpfvI?si=MmAuwLISZPPMoqDq&t=1479) for PRs that have been switched to Draft Mode. We'll monitor this through the [NuttX Build Monitor](https://github.com/apache/nuttx/issues/18659).

Modified Files:

`tools/testbuild.sh`: We introduce a New Wrapper Function `retrytest` that will call the Existing Function `dotest`, to build the CI Target and retry on error.

`Documentation/components/tools/testbuild.rst`: Updated the `testbuild.sh` doc with the Retry Logic.

Signed-off-by: Lup Yuen Lee <luppy@appkaki.com>
2026-04-15 12:30:17 +02:00
2026-03-23 18:16:46 -04:00
2026-04-15 12:30:17 +02:00
2026-02-25 10:17:40 +01:00

POSIX Badge License Issues Tracking Badge Contributors GitHub Build Badge Documentation Badge

Apache NuttX is a real-time operating system (RTOS) with an emphasis on standards compliance and small footprint. Scalable from 8-bit to 64-bit microcontroller environments, the primary governing standards in NuttX are POSIX and ANSI standards. Additional standard APIs from Unix and other common RTOSs (such as VxWorks) are adopted for functionality not available under these standards, or for functionality that is not appropriate for deeply-embedded environments (such as fork()).

For brevity, many parts of the documentation will refer to Apache NuttX as simply NuttX.

Getting Started

First time on NuttX? Read the Getting Started guide! If you don't have a board available, NuttX has its own simulator that you can run on terminal.

Documentation

You can find the current NuttX documentation on the Documentation Page.

Alternatively, you can build the documentation yourself by following the Documentation Build Instructions.

The old NuttX documentation is still available in the Apache wiki.

Supported Boards

NuttX supports a wide variety of platforms. See the full list on the Supported Platforms page.

Contributing

If you wish to contribute to the NuttX project, read the Contributing guidelines for information on Git usage, coding standard, workflow and the NuttX principles.

License

The code in this repository is under either the Apache 2 license, or a license compatible with the Apache 2 license. See the License Page for more information.

S
Description
Apache NuttX is a mature, real-time embedded operating system (RTOS)
Readme Multiple Licenses 510 MiB
Languages
C 95.2%
Linker Script 1.4%
Assembly 1.1%
CMake 1%
Makefile 0.6%
Other 0.6%