mirror of
https://github.com/apache/nuttx.git
synced 2026-05-21 21:34:07 +08:00
Documentation: update the gprof usage
Update the usage section to demonstrate two easily accessible environments: QEMU and a real board. Signed-off-by: chenxiaoyi <chenxiaoyi@xiaomi.com>
@@ -17,65 +17,98 @@ gprof can be used to:
|
||||
Usage
|
||||
=====
|
||||
|
||||
Build
|
||||
-----
|
||||
QEMU example
|
||||
------------
|
||||
For this example, we're using **QEMU** and **aarch64-none-elf-gcc** with the **qemu-armv8a** board.
|
||||
|
||||
Enable the following configuration in NuttX::
|
||||
1. Configure with ``./tools/configure.sh -E qemu-armv8a:nsh`` and make sure ``CONFIG_SYSTEM_GPROF`` and ``CONFIG_PROFILE_MINI`` are enabled
|
||||
2. Build with ``make -j``
|
||||
3. Launch qemu::
|
||||
|
||||
CONFIG_SYSTEM_GPROF
|
||||
qemu-system-aarch64 -cpu cortex-a53 -smp 4 -nographic \
|
||||
-machine virt,virtualization=on,gic-version=3 \
|
||||
-chardev stdio,id=con,mux=on -serial chardev:con \
|
||||
-mon chardev=con,mode=readline -semihosting -kernel ./nuttx
|
||||
|
||||
Using in NuttX
|
||||
--------------
|
||||
4. Mount hostfs for saving data later::
|
||||
|
||||
1. Start profiling::
|
||||
nsh> mount -t hostfs -o fs=. /mnt
|
||||
|
||||
nsh> gprof start
|
||||
5. Start profiling::
|
||||
|
||||
2. Stop profiling::
|
||||
nsh> gprof start
|
||||
|
||||
nsh> gprof stop
|
||||
6. Run some tests, then stop profiling::
|
||||
|
||||
3. Dump profiling data::
|
||||
nsh> gprof stop
|
||||
|
||||
nsh> gprof dump /tmp/gmon.out
|
||||
7. Dump profiling data::
|
||||
|
||||
Analyzing on Host
|
||||
-----------------
|
||||
nsh> gprof dump /mnt/gmon.out
|
||||
|
||||
1. Pull the profiling data to host::
|
||||
8. Analyze the data on the host using the gprof tool::
|
||||
|
||||
adb pull /tmp/gmon.out ./gmon.out
|
||||
$ aarch64-none-elf-gprof nuttx gmon.out -b
|
||||
|
||||
2. Analyze the data using gprof tool::
|
||||
.. note:: The saved file format complies with the standard gprof format.
|
||||
For detailed instructions on gprof command usage, please refer to the GNU gprof manual:
|
||||
https://ftp.gnu.org/old-gnu/Manuals/gprof-2.9.1/html_mono/gprof.html
|
||||
|
||||
The saved file format complies with the standard gprof format.
|
||||
For detailed instructions on gprof command usage, please refer to the GNU gprof manual:
|
||||
https://ftp.gnu.org/old-gnu/Manuals/gprof-2.9.1/html_mono/gprof.html
|
||||
Example output::
|
||||
|
||||
arm-none-eabi-gprof ./nuttx/nuttx gmon.out -b
|
||||
$ aarch64-none-elf-gprof nuttx gmon.out -b
|
||||
Flat profile:
|
||||
|
||||
Example output:
|
||||
Each sample counts as 0.001 seconds.
|
||||
% cumulative self self total
|
||||
time seconds seconds calls s/call s/call name
|
||||
75.58 12.44 12.44 12462 0.00 0.00 up_idle
|
||||
24.30 16.44 4.00 4 1.00 1.00 up_ndelay
|
||||
0.05 16.45 0.01 177 0.00 0.00 pl011_txint
|
||||
0.02 16.45 0.00 35 0.00 0.00 uart_readv
|
||||
|
||||
```
|
||||
arm-none-eabi-gprof nuttx/nuttx gmon.out -b
|
||||
Flat profile:
|
||||
This output shows the performance profile of the program,
|
||||
including execution time and call counts for each function.
|
||||
The flat profile table provides a quick overview of where the program spends most of its time.
|
||||
This information can be used to identify performance bottlenecks and optimize critical parts of the code.
|
||||
|
||||
Each sample counts as 0.001 seconds.
|
||||
% cumulative self self total
|
||||
time seconds seconds calls s/call s/call name
|
||||
66.41 3.55 3.55 43 0.08 0.08 sdelay
|
||||
33.44 5.34 1.79 44 0.04 0.04 delay
|
||||
0.07 5.34 0.00 up_idle
|
||||
0.04 5.34 0.00 nx_start
|
||||
0.02 5.34 0.00 fdtdump_main
|
||||
0.02 5.34 0.00 nxsem_wait
|
||||
0.00 5.34 0.00 1 0.00 5.34 hello_main
|
||||
0.00 5.34 0.00 1 0.00 0.00 singal_handler
|
||||
Real board example
|
||||
------------------
|
||||
Let's take **esp32s3-devkit** as an example.
|
||||
|
||||
```
|
||||
Test the flat profile
|
||||
~~~~~~~~~~~~~~~~~~~~~
|
||||
1. Configure with ``./tools/configure.sh -E esp32s3-devkit:nsh`` and make sure the following options are enabled::
|
||||
|
||||
This output shows the performance profile of the program,
|
||||
including execution time and call counts for each function.
|
||||
The flat profile table provides a quick overview of where the program spends most of its time.
|
||||
In this example, `sdelay` and `delay` functions consume the majority of execution time.
|
||||
This information can be used to identify performance bottlenecks and optimize critical parts of the code.
|
||||
# for gprof
|
||||
CONFIG_PROFILE_MINI=y
|
||||
CONFIG_SYSTEM_GPROF=y
|
||||
|
||||
# save and transfer data
|
||||
CONFIG_FS_TMPFS=y
|
||||
CONFIG_SYSTEM_YMODEM=y
|
||||
|
||||
2. Build and flash with ``make flash ESPTOOL_PORT=/dev/ttyUSB0 -j``
|
||||
3. Run ``minicom -D /dev/ttyUSB0 -b 115200`` to connect to the board
|
||||
4. Start profiling::
|
||||
|
||||
nsh> gprof start
|
||||
|
||||
# run some tests here, such as ostest
|
||||
|
||||
nsh> gprof stop
|
||||
nsh> gprof dump /tmp/gmon.out
|
||||
nsh> sb /tmp/gmon.out
|
||||
|
||||
5. Receive the file on the PC and analyze the data on the host::
|
||||
|
||||
$ cp nuttx nuttx_prof
|
||||
$ xtensa-esp32s3-elf-objcopy -I elf32-xtensa-le --rename-section .flash.text=.text nuttx_prof
|
||||
$ xtensa-esp32s3-elf-gprof nuttx_prof gmon.out
|
||||
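The flat profile printed by gprof is plain text, so it can be post-processed on the host. The following is a minimal sketch, not part of NuttX or gprof: the helper names ``parse_flat_profile`` and ``hotspots`` are hypothetical, and the sample rows are copied from the QEMU example output earlier in this document. It assumes function names without spaces and the default column layout, where the call-count columns may be blank.

```python
import re

# One flat-profile row: "% time", "cumulative seconds", "self seconds",
# then optional "calls", "self s/call", "total s/call", then the name.
# gprof leaves the three call columns blank when no call counts exist.
ROW = re.compile(
    r"^\s*(?P<pct>\d+\.\d+)\s+(?P<cum>\d+\.\d+)\s+(?P<self>\d+\.\d+)"
    r"(?:\s+(?P<calls>\d+)\s+\d+\.\d+\s+\d+\.\d+)?"
    r"\s+(?P<name>\S+)\s*$"
)


def parse_flat_profile(text):
    """Return one dict per data row; header and blank lines are skipped."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            rows.append({
                "name": m["name"],
                "percent": float(m["pct"]),
                "self_seconds": float(m["self"]),
                "calls": int(m["calls"]) if m["calls"] else None,
            })
    return rows


def hotspots(rows, min_percent=1.0):
    """Names of functions at or above min_percent, highest share first."""
    ranked = sorted(rows, key=lambda r: r["percent"], reverse=True)
    return [r["name"] for r in ranked if r["percent"] >= min_percent]


# Sample rows taken from the QEMU example output in this document.
SAMPLE = """\
Each sample counts as 0.001 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 75.58     12.44    12.44    12462     0.00     0.00  up_idle
 24.30     16.44     4.00        4     1.00     1.00  up_ndelay
  0.05     16.45     0.01      177     0.00     0.00  pl011_txint
  0.02     16.45     0.00       35     0.00     0.00  uart_readv
"""

rows = parse_flat_profile(SAMPLE)
print(hotspots(rows))  # ['up_idle', 'up_ndelay']
```

In practice the text would come from redirecting the gprof output to a file and reading that file instead of ``SAMPLE``. The 1% threshold is an arbitrary choice for illustration.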
|
||||
Test the call graph profile
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
1. Add the ``-pg`` compiler option to the component you want to profile, for example by adding ``CFLAGS += -pg`` to the ostest Makefile
|
||||
2. Enable the configuration item ``CONFIG_FRAME_POINTER``
|
||||
|
||||
The remaining steps are the same as for the flat profile.
|
||||
|
||||