Implement Linux-compatible TFD_TIMER_CANCEL_ON_SET flag for timerfd to
allow applications to detect discontinuous changes to CLOCK_REALTIME.
Background:
According to Linux timerfd_create(2) man page, when a timerfd is created
with CLOCK_REALTIME and TFD_TIMER_CANCEL_ON_SET flag is specified, the
read() operation should fail with ECANCELED if the real-time clock
undergoes a discontinuous change. This allows applications to detect
and respond to system time changes.
Implementation:
1. Add clock notifier infrastructure (clock_notifier.h/c) to notify
interested parties when system time changes
2. Implement TFD_TIMER_CANCEL_ON_SET flag support in timerfd
3. Register timerfd with clock notifier when flag is set
4. Cancel timer and return ECANCELED when clock change is detected
5. Notify clock changes in:
- clock_settime()
- clock_initialize()
- clock_timekeeping_set_wall_time()
- rpmsg_rtc time synchronization
Changes include:
- New files: include/nuttx/clock_notifier.h, sched/clock/clock_notifier.c
- Modified: fs/vfs/fs_timerfd.c to handle TFD_TIMER_CANCEL_ON_SET
- Modified: clock subsystem to call notifier chain on time changes
- Modified: rpmsg_rtc to notify time changes during sync
Use case example:
Applications using timerfd with absolute time can now detect when
system time is adjusted (e.g., NTP sync, manual time change) and
take appropriate action such as recalculating timeouts or updating
scheduled events.
Reference: https://man7.org/linux/man-pages/man2/timerfd_create.2.html
Signed-off-by: dongjiuzhu1 <dongjiuzhu1@xiaomi.com>
For re-enter case, the inode information may unlock with not filled,
and get by other user. that lead to the error behavior.
Especially SMP scene.
Signed-off-by: buxiasen <buxiasen@xiaomi.com>
Fix MISRA Rule 10.3 exist clock to sclock cause wide type implicit conversion to narrow type, NSEC_PER_USEC to l
Signed-off-by: jiangtao16 <jiangtao16@xiaomi.com>
Signals in NuttX serve two primary purposes:
1. Synchronization and wake-up:
Signals can be used to block threads on specific signal sets and later
wake them up by delivering the corresponding signals to those threads.
2. Asynchronous notification:
Signals can also be used to install callback handlers for specific signals, allowing threads to
asynchronously invoke those handlers when the signals are delivered.
This change introduces the ability to disable all signal functionality to reduce footprint for NuttX.
Signed-off-by: Chengdong Wang wangchengdong@lixiang.com
issue description:
task A: NSH:
1.open-> reboot->sync->task_fsfsync
2.nx_vopen-> context switch
3.fdlist_allocate: ----> 4.fsync->file_sync->assert(inode or priv is empty)
(new fd with empty filep)
5.file_vopen:
(init empty filep)
6.return fd
Task A allocates a new fd with an empty filep in fdlist_allocate. Before
it can fully initialize the filep in file_vopen, the NSH task triggers a
file - system sync operation. The sync operation encounters the empty
filep associated with the newly allocated fd, causing the assertion to
fail and the system to crash.
To resolve this race condition, we should modify the fd allocation
process. Instead of allocating a new fd with an empty filep first and
then initializing it later, we should use the file_allocate_from_inode
function. This function allows us to initialize the file structure first
and then bind it to the new filep when allocating the fd. By doing so,
we ensure that the filep is always properly initialized before it is
used in any file - system operations, thus preventing the assertion
failure and the subsequent system crash.
Signed-off-by: dongjiuzhu1 <dongjiuzhu1@xiaomi.com>
Issue:
When using a stack-allocated file structure, the sequence:
1. file_open() initializes the stack file structure
2. file_mmap() creates memory mapping and increments reference count
3. file_munmap() decrements reference count and may free the file structure
4. file_close() attempts to close already freed structure → crash
Root cause:
The memory mapping operations (fs_reffilep/fs_putfilep) manage reference counts
independently and can free the stack-allocated file structure prematurely.
Solution:
- Add reference count protection during file_open() for stack-allocated files
- Clear reference count appropriately during file_close()
- This ensures the file structure remains valid throughout its lifetime
Signed-off-by: dongjiuzhu1 <dongjiuzhu1@xiaomi.com>
When a thread is terminated via pthread_exit() while blocked in epoll_wait(),
the file reference taken at the beginning of epoll_wait() is not properly
released, leading to resource leaks.
Problem scenario found during libuv test:
1. Echo server thread is blocked in epoll_wait()
2. Main task sends pthread_kill signal to the server thread
3. Signal handler calls pthread_exit() to terminate the thread
4. epoll_wait() is interrupted before reaching the file_put() call
5. The epoll fd reference count remains elevated
6. epoll_do_close() is never called, leaving fds in internal queues
7. File descriptors leak
Solution:
Register a TLS (Thread Local Storage) cleanup handler using tls_cleanup_push()
at the beginning of epoll_wait() blocking section. This ensures that if the
thread exits abnormally (via pthread_exit, pthread_cancel, etc.), the cleanup
handler (epoll_cleanup) will be called automatically to release the file
reference via file_put().
The cleanup handler is properly paired with tls_cleanup_pop() when epoll_wait()
completes normally, ensuring the handler is only invoked on abnormal exit.
This fix is applied to both epoll_wait() code paths (with and without extended
mode) to ensure consistent behavior.
Impact:
- Prevents epoll fd reference count leaks on thread cancellation
- Ensures proper cleanup even when epoll_wait() is interrupted by pthread_exit
- Critical for multi-threaded applications using signals and thread termination
- Works together with previous fix for teardown/oneshot list cleanup
Signed-off-by: dongjiuzhu1 <dongjiuzhu1@xiaomi.com>
When an epoll fd is closed, the file descriptors in the teardown and
oneshot lists were not being properly dereferenced, leading to fd leaks.
This fix ensures that all fds in the teardown and oneshot lists are
properly released via file_put() during epoll_do_close(), matching the
behavior for fds in the setup list.
Impact:
- Prevents fd leaks when epoll fd is closed
- Ensures proper cleanup of all tracked file descriptors
- Critical for applications using EPOLLONESHOT or fd removal operations
Signed-off-by: dongjiuzhu1 <dongjiuzhu1@xiaomi.com>
when i is zero and file_get is failed, num is -1,
it's uint32_max for nfds.
so remove num and simplify code logic.
Signed-off-by: dongjiuzhu1 <dongjiuzhu1@xiaomi.com>
nxsig_ismember() has a return type of int, but the current
implementation returns a boolean value, which is incorrect.
All callers should determine membership by checking whether
the return value is 1 or 0, which is also consistent with the POSIX sigismember() API.
Signed-off-by: Chengdong Wang wangchengdong@lixiang.com
It acts as register_driver but also populates the inode size.
This allows to return a non-zero size when calling stat() on an eeprom
driver.
The conditions (CONFIG_PSEUDOFS_FILE or CONFIG_FS_SHMFS) for the
declaration of the inode size field have also been removed so that other
drivers can populate this field in the future.
Signed-off-by: Antoine Juckler <6445757+ajuckler@users.noreply.github.com>
* Recurrency is removed from filesystem directory rename.
* Fixes use after free in buffer that was used as output and argument.
Signed-off-by: Tomasz 'CeDeROM' CEDRO <tomek@cedro.info>
To save more space (equivalent to the size of one erase sector of
MTD device) and to achieve faster read and write speeds, a method
for direct writing was introduced at the FTL layer.
This can be accomplished simply by using the following oflags during
the open operation:
1. O_DIRECT. when this flag is passed in, ftl internally uses
the direct write strategy and no read cache is used in ftl;
otherwise, each write will be executed with the minimum
granularity of flash erase sector size which means a
"sector read back - erase sector - write sector" operation
is performed by using a read cache buffer in heap.
2. O_SYNC. When this flag is passed in, we assume that the
flash has been erased in advance and no erasure operation
will be performed internally within ftl. O_SYNC will take
effect only when both O_DIRECT and O_SYNC are passed in
simultaneously.
3. For uniformity, we remapped the mount flag in mount.h and
unified it with the open flag in fcntl.h. The repetitive
parts of their definitions were reused, and the remaining
part of the mount flag redefine to the unused bit of open
flags.
Signed-off-by: jingfei <jingfei@xiaomi.com>
Double free occurred in lib_put_pathbuffer if CONFIG_FS_NOTIFY option
was enabled. The second if statement has to be called only if the
close operation returned error. The bug was introduced in 14f5c48
and was causing misc/lib_tempbuffer.c:141 debug assertion.
Signed-off-by: Michal Lenc <michallenc@seznam.cz>
these command FIOGCLEX/FIOCLEX/FIONCLEX are related to struct fd,
so need to use ioctl to implement.
Signed-off-by: dongjiuzhu1 <dongjiuzhu1@xiaomi.com>
This patch is a rework of the NuttX file descriptor implementation. The
goal is two-fold:
1. Improve POSIX compliance. The old implementation tied file description
to inode only, not the file struct. POSIX however dictates otherwise.
2. Fix a bug with descriptor duplication (dup2() and dup3()). There is
an existing race condition with this POSIX API that currently results
in a kernel side crash.
The crash occurs when a partially open / closed file descriptor is
duplicated. The reason for the crash is that even if the descriptor is
closed, the file might still be in use by the kernel (due to e.g. ongoing
write to file). The open file data is changed by file_dup3() and this
causes a crash in the device / drivers themselves as they lose access to
the inode and private data.
The fix is done by separating struct file into file and file descriptor
structs. The file struct can live on even if the descriptor is closed,
fixing the crash. This also fixes the POSIX issue, as two descriptors
can now point to the same file.
Signed-off-by: Ville Juven <ville.juven@unikie.com>
Signed-off-by: dongjiuzhu1 <dongjiuzhu1@xiaomi.com>
1. The call to file_close_without_clear in file_dup3 does not clear
the tag information, so there is no need to back it up.
2. file_dup3 don't need to copy tag information, tag is only valid for fd.
Signed-off-by: dongjiuzhu1 <dongjiuzhu1@xiaomi.com>
Currently the code is dumped into one massive file; fs_files. Move the
different logical parts into their own files.
Signed-off-by: Ville Juven <ville.juven@unikie.com>
Signed-off-by: dongjiuzhu1 <dongjiuzhu1@xiaomi.com>
When a new pseudofile is created, the inode reference count needs to
be bumped to protect the node.
Signed-off-by: Ville Juven <ville.juven@unikie.com>
Allow users to operate poll in the kernel using the file_poll
approach, as file is protected with reference counting,
making it more secure.
Signed-off-by: dongjiuzhu1 <dongjiuzhu1@xiaomi.com>
When we use fcntl for dup, an fd is directly passed. If we have opened FDCHECK. we need to restore this file descriptor.
open FDCHECK and test this:
`
int main(int ac, char **av)
{
int fd1= open("./1.txt", O_WRONLY | O_CREAT, 0666);
if (fd1 < 0)
{
printf("open err\n");
return fd1;
}
int fd2= open("./2.txt", O_WRONLY | O_CREAT, 0666);
if (fd2 < 0)
{
printf("open err\n");
close(fd1);
return fd2;
}
//close(fd2);
int fd3 = fcntl(fd1, F_DUPFD, fd2);
printf("fd3 = %d\n", fd3);
close(fd1);
close(fd3);
return 0;
}
`
Signed-off-by: zhangshoukui <zhangshoukui@xiaomi.com>
Add the noinstrument_function attribute to the poll_notify function
to avoid it being looped if -finstrument-functions is set to the
fs/vfs files.
Signed-off-by: Tiago Medicci Serrano <tiago.medicci@espressif.com>
* Make readv/writev implementations update struct uio
This can simplify partial result handling.
* change the error number on the overflow from EOVERFLOW to EINVAL
to match NetBSD
* add a commented out uio_offset field. I used "#if 0" here as
C comments can't nest.
* add a few helper functions
Note on uio_copyfrom/uio_copyto:
although i'm not quite happy with the "offset" functionality,
it's necessary to simplify the adaptation of some drivers like
drivers/serial/serial.c, which (ab)uses the user-supplied buffer
as a line-buffer.
Modify the kernel to use only atomic_xx and atomic64_xx interfaces,
avoiding the use of sizeof or typeof to determine the type of
atomic operations, thereby simplifying the kernel's atomic
interface operations.
Signed-off-by: zhangyuan29 <zhangyuan29@xiaomi.com>
Most tools used for compliance and SBOM generation use SPDX identifiers
This change brings us a step closer to an easy SBOM generation.
Signed-off-by: Alin Jerpelea <alin.jerpelea@sony.com>
The problem has been inherited from the original libc readv/writev
implementation. However, now it's exposed in more situations because
this implemenation is used to back read/write as well.
I expect this fixes the regressions observed on the Espressif CI.
https://github.com/apache/nuttx/pull/13498#issuecomment-2448031197
Note that, even with this fix, these "compat" readv/writev
implementations are still inheritedly broken. (E.g. consider that
a data boundary happens to match one of iovec boundaries) However,
this fix is enough for read/write, where iovcnt is always 1.
This patch fixed userspace headers conflict. Architecture-related definition and API should not be exposed to users.
Signed-off-by: ouyangxiangzhen <ouyangxiangzhen@xiaomi.com>
currently, nuttx implements readv/writev on the top of read/write.
while it might work for the simplest cases, it's broken by design.
for example, it's impossible to make it work correctly for files
which need to preserve data boundaries without allocating a single
contiguous buffer. (udp socket, some character devices, etc)
this change is a start of the migration to a better design.
that is, implement read/write on the top of readv/writev.
to avoid a single huge change, following things will NOT be done in
this commit:
* fix actual bugs caused by the original readv-based-on-read design.
(cf. https://github.com/apache/nuttx/pull/12674)
* adapt filesystems/drivers to actually benefit from the new interface.
(except a few trivial examples)
* eventually retire the old interface.
* retire read/write syscalls. implement them in libc instead.
* pread/pwrite/preadv/pwritev (except the introduction of struct uio,
which is a preparation to back these variations with the new
interface.)