There can be a failure in IOB allocation due to some asynchronous behavior caused by the use of sem_post(). Consider this scenario:

Task A holds an IOB.  There are no further IOBs.  The value of semcount is zero.
Task B calls iob_alloc().  Since there are no IOBs, it calls sem_wait().  The value of semcount is now -1.

Task A frees the IOB.  iob_free() adds the IOB to the free list and calls sem_post().  This makes Task B ready to run and sets semcount to zero, NOT 1.  There is one IOB in the free list and semcount is zero.  When Task B wakes up it would increment the semcount back to the correct value.

But suppose an interrupt occurs or another task runs before Task B executes.  The interrupt or other task takes the IOB off of the free list and decrements the semcount.  But since semcount is then < 0, this causes the assertion because that is an invalid state in the interrupt handler.
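The sequence above can be replayed with a small, single-threaded model.  This is only a sketch: the struct, the enum, and the model_*() functions are hypothetical names invented for illustration, not NuttX code, and plain integers stand in for the real semaphore and free list.

```c
#include <assert.h>

/* Hypothetical model state; these are NOT the NuttX data structures */

struct iob_model_s
{
  int semcount;   /* Stands in for the semaphore count */
  int nfree;      /* Stands in for the number of IOBs on the free list */
};

enum model_result_e
{
  MODEL_NOIOB,          /* The free list was empty */
  MODEL_OK,             /* An IOB was taken and the count is still valid */
  MODEL_INCONSISTENT    /* An IOB was taken but the count went negative */
};

/* Task B finds no IOB and waits:  The count goes from 0 to -1 */

void model_wait(struct iob_model_s *m)
{
  m->semcount--;
}

/* Task A frees its IOB:  One IOB is on the free list and the post raises
 * the count from -1 to 0.  Task B is ready to run but has not run yet.
 */

void model_free(struct iob_model_s *m)
{
  m->nfree++;
  m->semcount++;
}

/* An interrupt or another task allocates before Task B runs */

enum model_result_e model_tryalloc(struct iob_model_s *m)
{
  if (m->nfree == 0)
    {
      return MODEL_NOIOB;
    }

  m->nfree--;
  m->semcount--;
  return (m->semcount < 0) ? MODEL_INCONSISTENT : MODEL_OK;
}

/* Replay the scenario from the commit message in order */

enum model_result_e model_race(void)
{
  struct iob_model_s m = { 0, 0 };  /* Task A holds the only IOB */

  model_wait(&m);                   /* semcount = -1, nfree = 0 */
  model_free(&m);                   /* semcount =  0, nfree = 1 */
  return model_tryalloc(&m);        /* semcount = -1, nfree = 0 */
}
```

In this model, model_race() ends with the count at -1 in the allocating context, which is the state the assertion rejects.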

So I think that the root cause is the asynchrony between incrementing the semcount and the awakened task actually running.  This change separates the list of IOBs:  Currently there is only a free list of IOBs.  The problem, I believe, is that asynchronies due to sem_post() cause the semcount and the list content to become out of sync.  This change adds a new 'committed' list:  When there is a task waiting for an IOB, a freed IOB will go into the committed list rather than the free list before the semaphore is posted.  On the waiting side, when awakened from the semaphore wait, the task will expect to find its IOB in the committed list, rather than the free list.

In this way, the content of the free list and the value of the semaphore count always remain in sync.
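A minimal sketch of the committed-list idea follows.  It is a simplified, single-threaded model and not the code in this commit: struct buf_s, struct pool_s, and the pool_*() functions are hypothetical names, and the test for a negative count as the sign of a waiting task is an assumption taken from the description above.

```c
#include <stddef.h>

/* Hypothetical buffer and pool types used only for this illustration */

struct buf_s
{
  struct buf_s *flink;       /* Link to the next buffer in a list */
};

struct pool_s
{
  struct buf_s *freelist;    /* Buffers that anyone may allocate */
  struct buf_s *committed;   /* Buffers reserved for awakened waiters */
  int semcount;              /* Stands in for the semaphore count */
};

/* Free a buffer.  If a task is waiting (count < 0), the buffer goes to the
 * committed list so that no other context can take it.  The increment
 * stands in for sem_post().
 */

void pool_free(struct pool_s *p, struct buf_s *b)
{
  if (p->semcount < 0)
    {
      b->flink     = p->committed;
      p->committed = b;
    }
  else
    {
      b->flink    = p->freelist;
      p->freelist = b;
    }

  p->semcount++;
}

/* Non-blocking allocation:  Looks at the free list only */

struct buf_s *pool_tryalloc(struct pool_s *p)
{
  struct buf_s *b = p->freelist;
  if (b != NULL)
    {
      p->freelist = b->flink;
      b->flink    = NULL;
      p->semcount--;
    }

  return b;
}

/* Called by a waiter after it wakes up:  Takes its reserved buffer */

struct buf_s *pool_alloc_committed(struct pool_s *p)
{
  struct buf_s *b = p->committed;
  if (b != NULL)
    {
      p->committed = b->flink;
      b->flink     = NULL;
    }

  return b;
}

/* Replay the scenario.  Returns 1 if the interrupt-level allocation fails
 * cleanly and the waiter still receives the freed buffer.
 */

int pool_demo(void)
{
  struct pool_s pool = { NULL, NULL, 0 };
  struct buf_s buf = { NULL };
  struct buf_s *from_irq;
  struct buf_s *from_waiter;

  pool.semcount--;                            /* Task B waits: 0 -> -1 */
  pool_free(&pool, &buf);                     /* Task A frees: -1 -> 0 */
  from_irq    = pool_tryalloc(&pool);         /* Free list is empty */
  from_waiter = pool_alloc_committed(&pool);  /* Task B gets the buffer */

  return from_irq == NULL && from_waiter == &buf && pool.semcount == 0;
}
```

In this model the free list stays empty while the count is zero, so the interfering allocation simply fails instead of driving the count negative.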
Gregory Nutt
2017-05-16 11:03:35 -06:00
parent a6e556d31c
commit 6a3800f611
7 changed files with 271 additions and 113 deletions
+90 -47
@@ -55,6 +55,44 @@
* Private Functions
****************************************************************************/
/****************************************************************************
* Name: iob_alloc_qcommitted
*
* Description:
* Allocate an I/O buffer by taking the buffer at the head of the committed
* list.
*
****************************************************************************/
FAR struct iob_qentry_s *iob_alloc_qcommitted(void)
{
FAR struct iob_qentry_s *iobq = NULL;
irqstate_t flags;
/* We don't know what context we are called from so we use extreme measures
* to protect the committed list: We disable interrupts very briefly.
*/
flags = enter_critical_section();
/* Take the I/O buffer from the head of the committed list */
iobq = g_iob_qcommitted;
if (iobq != NULL)
{
/* Remove the I/O buffer from the committed list */
g_iob_qcommitted = iobq->io_flink;
/* Put the I/O buffer in a known state */
iobq->qe_head = NULL; /* Nothing is contained */
}
leave_critical_section(flags);
return iobq;
}
/****************************************************************************
* Name: iob_allocwait_qentry
*
@@ -78,73 +116,78 @@ static FAR struct iob_qentry_s *iob_allocwait_qentry(void)
*/
flags = enter_critical_section();
do
/* Try to get an I/O buffer chain container. If successful, the semaphore
* count will be decremented atomically.
*/
qentry = iob_tryalloc_qentry();
while (ret == OK && qentry == NULL)
{
/* Try to get an I/O buffer chain container. If successful, the
* semaphore count will be decremented atomically.
/* If not successful, then the semaphore count was less than or equal
* to zero (meaning that there are no free buffers). We need to wait
* for an I/O buffer chain container to be released when the
* semaphore count will be incremented.
*/
qentry = iob_tryalloc_qentry();
if (!qentry)
ret = sem_wait(&g_qentry_sem);
if (ret < 0)
{
/* If not successful, then the semaphore count was less than or
* equal to zero (meaning that there are no free buffers). We
* need to wait for an I/O buffer chain container to be released
* when the semaphore count will be incremented.
int errcode = get_errno();
/* EINTR is not an error! EINTR simply means that we were
* awakened by a signal and we should try again.
*
* REVISIT: Many end-user interfaces are required to return
* with an error if EINTR is set. Most uses of this function
* is in internal, non-user logic. But are there cases where
* the error should be returned.
*/
ret = sem_wait(&g_qentry_sem);
if (ret < 0)
if (errcode == EINTR)
{
int errcode = get_errno();
/* EINTR is not an error! EINTR simply means that we were
* awakened by a signal and we should try again.
*
* REVISIT: Many end-user interfaces are required to return
* with an error if EINTR is set. Most uses of this function
* is in internal, non-user logic. But are there cases where
* the error should be returned.
/* Force a success indication so that we will continue
* looping.
*/
if (errcode == EINTR)
{
/* Force a success indication so that we will continue
* looping.
*/
ret = 0;
}
else
{
/* Stop the loop and return a error */
DEBUGASSERT(errcode > 0);
ret = -errcode;
}
ret = 0;
}
else
{
/* When we wake up from wait successfully, an I/O buffer chain
* container was returned to the free list. However, if there
* are concurrent allocations from interrupt handling, then I
* suspect that there is a race condition. But no harm, we
* will just wait again in that case.
/* Stop the loop and return a error */
DEBUGASSERT(errcode > 0);
ret = -errcode;
}
}
else
{
/* When we wake up from wait successfully, an I/O buffer chain container was
* freed and we hold a count for one IOB. Unless something
* failed, we should have an IOB waiting for us in the
* committed list.
*/
qentry = iob_alloc_qcommitted();
DEBUGASSERT(qentry != NULL);
if (qentry == NULL)
{
/* This should not fail, but we allow for that possibility to
* handle any potential, non-obvious race condition. Perhaps
* the free IOB ended up in the g_iob_free list?
*
* We need release our count so that it is available to
* iob_tryalloc_qentry(), perhaps allowing another thread to
* take our count. In that event, iob_tryalloc_qentry() will
* fail above and we will have to wait again.
*
* TODO: Consider a design modification to permit us to
* complete the allocation without losing our count.
* iob_tryalloc(), perhaps allowing another thread to take our
* count. In that event, iob_tryalloc() will fail above and
* we will have to wait again.
*/
sem_post(&g_qentry_sem);
qentry = iob_tryalloc_qentry();
}
}
}
while (ret == OK && !qentry);
leave_critical_section(flags);
return qentry;