Documentation: update the gprof usage

Update the usage part to demonstrate two easily accessible environments,
one QEMU and one real board.

Signed-off-by: chenxiaoyi <chenxiaoyi@xiaomi.com>
chenxiaoyi
2025-12-12 17:08:31 +08:00
committed by Xiang Xiao
parent 0d757d0134
commit 43405d34c4
@@ -17,65 +17,98 @@ gprof can be used to:
Usage
=====
Build
-----
QEMU example
------------
For this example, we're using **QEMU** and **aarch64-none-elf-gcc** with the **qemu-armv8a** board.
Enable the following configuration in NuttX::
1. Configure with ``./tools/configure.sh -E qemu-armv8a:nsh`` and make sure ``CONFIG_SYSTEM_GPROF`` and ``CONFIG_PROFILE_MINI`` are enabled
2. Build with ``make -j``
3. Launch QEMU::
CONFIG_SYSTEM_GPROF
qemu-system-aarch64 -cpu cortex-a53 -smp 4 -nographic \
-machine virt,virtualization=on,gic-version=3 \
-chardev stdio,id=con,mux=on -serial chardev:con \
-mon chardev=con,mode=readline -semihosting -kernel ./nuttx
Using in NuttX
--------------
4. Mount hostfs so the profiling data can be saved to the host later::
1. Start profiling::
nsh> mount -t hostfs -o fs=. /mnt
nsh> gprof start
5. Start profiling::
2. Stop profiling::
nsh> gprof start
nsh> gprof stop
6. Run some tests, then stop profiling::
3. Dump profiling data::
nsh> gprof stop
nsh> gprof dump /tmp/gmon.out
7. Dump profiling data::
Analyzing on Host
-----------------
nsh> gprof dump /mnt/gmon.out
1. Pull the profiling data to host::
8. Analyze the data on the host using the gprof tool::
adb pull /tmp/gmon.out ./gmon.out
$ aarch64-none-elf-gprof nuttx gmon.out -b
2. Analyze the data using gprof tool::
.. note:: The saved file format complies with the standard gprof format.
For detailed instructions on gprof command usage, please refer to the GNU gprof manual:
https://ftp.gnu.org/old-gnu/Manuals/gprof-2.9.1/html_mono/gprof.html
The saved file format complies with the standard gprof format.
For detailed instructions on gprof command usage, please refer to the GNU gprof manual:
https://ftp.gnu.org/old-gnu/Manuals/gprof-2.9.1/html_mono/gprof.html
Example output::
arm-none-eabi-gprof ./nuttx/nuttx gmon.out -b
$ aarch64-none-elf-gprof nuttx gmon.out -b
Flat profile:
Example output:
Each sample counts as 0.001 seconds.
% cumulative self self total
time seconds seconds calls s/call s/call name
75.58 12.44 12.44 12462 0.00 0.00 up_idle
24.30 16.44 4.00 4 1.00 1.00 up_ndelay
0.05 16.45 0.01 177 0.00 0.00 pl011_txint
0.02 16.45 0.00 35 0.00 0.00 uart_readv
```
arm-none-eabi-gprof nuttx/nuttx gmon.out -b
Flat profile:
This output shows the performance profile of the program,
including execution time and call counts for each function.
The flat profile table provides a quick overview of where the program spends most of its time.
This information can be used to identify performance bottlenecks and optimize critical parts of the code.
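To make the bottleneck search concrete, the following is a minimal Python sketch that parses a flat profile and ranks functions by self time. It is not part of NuttX or gprof; the helper names ``parse_flat_profile`` and ``hottest`` are hypothetical, and the sample text is the QEMU output shown above. It assumes the per-call columns are either all present or all blank, as in the examples in this document.

```python
import re

# One data row of a gprof flat profile. The "calls" and per-call columns are
# optional because they are blank for functions that were only sampled.
ROW = re.compile(
    r"^\s*(?P<pct>\d+\.\d+)\s+(?P<cumulative>\d+\.\d+)\s+(?P<self_s>\d+\.\d+)"
    r"(?:\s+(?P<calls>\d+)\s+\d+\.\d+\s+\d+\.\d+)?"
    r"\s+(?P<name>\S.*?)\s*$"
)

# Sample taken from the QEMU example output in this document.
SAMPLE = """\
Each sample counts as 0.001 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 75.58     12.44    12.44    12462     0.00     0.00  up_idle
 24.30     16.44     4.00        4     1.00     1.00  up_ndelay
  0.05     16.45     0.01      177     0.00     0.00  pl011_txint
  0.02     16.45     0.00       35     0.00     0.00  uart_readv
"""


def parse_flat_profile(text):
    """Return one dict per data row; header and blank lines are skipped."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m is None:
            continue
        calls = m.group("calls")
        rows.append({
            "name": m.group("name"),
            "percent": float(m.group("pct")),
            "self_s": float(m.group("self_s")),
            "calls": int(calls) if calls is not None else None,
        })
    return rows


def hottest(rows, n=3):
    """Return the n rows with the largest self time, largest first."""
    return sorted(rows, key=lambda r: r["self_s"], reverse=True)[:n]


if __name__ == "__main__":
    for row in hottest(parse_flat_profile(SAMPLE)):
        print(f'{row["name"]:<12} {row["self_s"]:>8.2f} s  {row["percent"]:>6.2f} %')
```

In practice the text would come from the saved output of the ``gprof`` command rather than from an embedded string.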
Each sample counts as 0.001 seconds.
% cumulative self self total
time seconds seconds calls s/call s/call name
66.41 3.55 3.55 43 0.08 0.08 sdelay
33.44 5.34 1.79 44 0.04 0.04 delay
0.07 5.34 0.00 up_idle
0.04 5.34 0.00 nx_start
0.02 5.34 0.00 fdtdump_main
0.02 5.34 0.00 nxsem_wait
0.00 5.34 0.00 1 0.00 5.34 hello_main
0.00 5.34 0.00 1 0.00 0.00 singal_handler
Real board example
------------------
Let's take **esp32s3-devkit** as an example.
```
Test the flat profile
~~~~~~~~~~~~~~~~~~~~~
1. Configure with ``./tools/configure.sh -E esp32s3-devkit:nsh`` and make sure these items are enabled::
This output shows the performance profile of the program,
including execution time and call counts for each function.
The flat profile table provides a quick overview of where the program spends most of its time.
In this example, `sdelay` and `delay` functions consume the majority of execution time.
This information can be used to identify performance bottlenecks and optimize critical parts of the code.
# for gprof
CONFIG_PROFILE_MINI=y
CONFIG_SYSTEM_GPROF=y
# save and transfer data
CONFIG_FS_TMPFS=y
CONFIG_SYSTEM_YMODEM=y
2. Build and flash with ``make flash ESPTOOL_PORT=/dev/ttyUSB0 -j``
3. Run ``minicom -D /dev/ttyUSB0 -b 115200`` to connect to the board
4. Profile a workload, then dump the data and send it to the PC::
nsh> gprof start
# run some tests here, such as ostest
nsh> gprof stop
nsh> gprof dump /tmp/gmon.out
nsh> sb /tmp/gmon.out
5. Receive the file on the PC and analyze the data on the host::
$ cp nuttx nuttx_prof
$ xtensa-esp32s3-elf-objcopy -I elf32-xtensa-le --rename-section .flash.text=.text nuttx_prof
$ xtensa-esp32s3-elf-gprof nuttx_prof gmon.out
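The host-side commands in the last step can be wrapped in a small script so they are repeated identically after every run. The following is a minimal Python sketch, not an official tool; the helper names ``host_commands`` and ``analyze`` are hypothetical, and the tool names and flags are copied from the step above. It assumes the Xtensa toolchain is on ``PATH``.

```python
import shutil
import subprocess


def host_commands(elf="nuttx", gmon="gmon.out", prefix="xtensa-esp32s3-elf-"):
    """Return the copy target and the argv lists for objcopy and gprof."""
    prof = elf + "_prof"
    commands = [
        # Rename .flash.text to .text in the copy, as in the step above.
        [prefix + "objcopy", "-I", "elf32-xtensa-le",
         "--rename-section", ".flash.text=.text", prof],
        # Analyze the dumped data against the modified copy.
        [prefix + "gprof", prof, gmon],
    ]
    return prof, commands


def analyze(elf="nuttx", gmon="gmon.out"):
    """Copy the ELF, then run objcopy and gprof; raise if either tool fails."""
    prof, commands = host_commands(elf, gmon)
    shutil.copyfile(elf, prof)  # keep the original nuttx image untouched
    for argv in commands:
        subprocess.run(argv, check=True)


if __name__ == "__main__":
    analyze()
```

Working on a copy means the image that was flashed to the board is never modified.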
Test the call graph profile
~~~~~~~~~~~~~~~~~~~~~~~~~~~
1. Add the compiler option ``-pg`` to the component to be profiled, for example ``CFLAGS += -pg`` in the ostest Makefile
2. Enable the configuration item ``CONFIG_FRAME_POINTER``
The other steps are the same as the flat profile.
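Before building, it can help to confirm that the options named in this document are actually enabled. The following is a minimal Python sketch; the helper names are hypothetical, and it assumes the generated configuration is a Kconfig-style ``.config`` file in the NuttX top-level directory with lines such as ``CONFIG_PROFILE_MINI=y``.

```python
import re

# Options named in this document for each profiling mode.
FLAT_PROFILE = ("CONFIG_PROFILE_MINI", "CONFIG_SYSTEM_GPROF")
CALL_GRAPH = FLAT_PROFILE + ("CONFIG_FRAME_POINTER",)


def enabled_options(config_text):
    """Return the set of options set to 'y' in Kconfig-style text."""
    return set(re.findall(r"^(CONFIG_\w+)=y\s*$", config_text, flags=re.MULTILINE))


def missing_options(config_text, required):
    """Return the required options that are not enabled, in the given order."""
    enabled = enabled_options(config_text)
    return [opt for opt in required if opt not in enabled]


if __name__ == "__main__":
    # Assumption: the script is run from the NuttX top-level directory.
    with open(".config", encoding="utf-8") as f:
        text = f.read()
    for label, required in (("flat profile", FLAT_PROFILE), ("call graph", CALL_GRAPH)):
        missing = missing_options(text, required)
        print(f"{label}: {'ok' if not missing else 'missing ' + ', '.join(missing)}")
```

This only checks the configuration. The ``-pg`` compiler option still has to be added to the component's Makefile by hand.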