net/l2/l3/l4: add support of iob offload

1. Add new config CONFIG_NET_LL_GUARDSIZE to isolate the l2 stack,
   which benefits the l3 (IP) layer in multi-MAC (l2) implementations,
   especially for NICs such as cellular net drivers.

New configuration option: CONFIG_NET_LL_GUARDSIZE

CONFIG_NET_LL_GUARDSIZE reserves room for the l2 header at the start
of the network buffer to isolate the L2/L3 (MAC/IP) data in the
network layer, which benefits transparent transmission and
forwarding of L3 network layer protocols.

------------------------------------------------------------
Layout of first iob entry:

        iob_data (aligned by CONFIG_IOB_ALIGNMENT)
            |
            |                  io_offset(CONFIG_NET_LL_GUARDSIZE)
            |                                |
            -------------------------------------------------
      iob   |            Reserved            |    io_len    |
            -------------------------------------------------

-------------------------------------------------------------
Layout of different NIC implementations:

        iob_data (aligned by CONFIG_IOB_ALIGNMENT)
            |
            |                 io_offset(CONFIG_NET_LL_GUARDSIZE)
            |                                |
            -------------------------------------------------
 Ethernet   |       Reserved    | ETH_HDRLEN |    io_len    |
            ---------------------------------|---------------
 8021Q      |   Reserved  | ETH_8021Q_HDRLEN |    io_len    |
            ---------------------------------|---------------
 ipforward  |            Reserved            |    io_len    |
            -------------------------------------------------
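
The layouts above can be sketched as a small offset calculation. This
is only an illustration: the guard size and header lengths below are
assumed values, and the sketch_* names are hypothetical helpers that
are not part of this change. The real code uses
CONFIG_NET_LL_GUARDSIZE and NET_LL_HDRLEN(dev).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values for illustration only. In the real tree the guard size
 * comes from Kconfig (CONFIG_NET_LL_GUARDSIZE) and the header length from
 * NET_LL_HDRLEN(dev).
 */

#define SKETCH_LL_GUARDSIZE  18  /* assumed: large enough for 802.1Q */
#define SKETCH_ETH_HDRLEN    14  /* assumed Ethernet header length */
#define SKETCH_8021Q_HDRLEN  18  /* assumed Ethernet + VLAN tag length */
#define SKETCH_NO_HDRLEN     0   /* ipforward: no l2 header at all */

/* Offset of the l2 header from the start of the first iob's data area.
 * The l3 data always starts at the guard size, so the l2 header is
 * placed right in front of it and the rest stays reserved.
 */

static size_t sketch_l2_offset(size_t guardsize, size_t hdrlen)
{
  assert(hdrlen <= guardsize);
  return guardsize - hdrlen;
}

/* Pointer to the first byte of the l2 header inside io_data */

static uint8_t *sketch_l2_header(uint8_t *io_data, size_t guardsize,
                                 size_t hdrlen)
{
  return io_data + sketch_l2_offset(guardsize, hdrlen);
}
```

With these assumed sizes, Ethernet leaves 4 reserved bytes in front of
the header, 802.1Q leaves none, and ipforward leaves the whole guard
area reserved.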

--------------------------------------------------------------------

2. Support iob offload to l2 driver to avoid unnecessary memory copy

Support sending/receiving iob vectors directly between the NICs and the
l3/l4 stack to avoid unnecessary memory copies, especially on hardware
that supports scatter/gather, which can greatly improve performance.
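
As a rough sketch of what handing an iob chain to scatter/gather
hardware could look like, the code below walks a chain and fills an
io vector without copying any payload. struct sg_iob is a simplified,
hypothetical stand-in for the real iob structure, and
sketch_iob_to_iov() is an illustrative helper, not an interface added
by this change.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/* Simplified, hypothetical stand-in for one entry of an iob chain */

struct sg_iob
{
  struct sg_iob *io_flink;   /* next entry in the chain, or NULL */
  uint16_t       io_len;     /* valid bytes in this entry */
  uint16_t       io_offset;  /* offset of the valid bytes in io_data */
  uint8_t        io_data[64];
};

/* Describe every non-empty entry of the chain with one iovec element.
 * Returns the number of elements used, or -1 if maxiov is too small.
 * No payload bytes are copied; only pointers and lengths are recorded.
 */

static int sketch_iob_to_iov(struct sg_iob *head, struct iovec *iov,
                             int maxiov)
{
  struct sg_iob *iob;
  int n = 0;

  assert(iov != NULL && maxiov > 0);

  for (iob = head; iob != NULL; iob = iob->io_flink)
    {
      if (iob->io_len == 0)
        {
          continue;
        }

      if (n >= maxiov)
        {
          return -1;
        }

      iov[n].iov_base = &iob->io_data[iob->io_offset];
      iov[n].iov_len  = iob->io_len;
      n++;
    }

  return n;
}
```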

new interface to support iob offload:

  ------------------------------------------
  |    IOB version     |     original      |
  |----------------------------------------|
  |  devif_iob_poll()  |   devif_poll()    |
  |       ...          |       ...         |
  ------------------------------------------

--------------------------------------------------------------------

1> NIC hardware supports scatter/gather transfer

TX:

                tcp_poll()/udp_poll()/pkt_poll()/...(l3|l4)
                           /              \
                          /                \
devif_poll_[l3|l4]_connections()     devif_iob_send() (nocopy:udp/icmp/...)
           /                                   \      (copy:tcp)
          /                                     \
  devif_iob_poll("NIC"_txpoll)                callback() // "NIC"_txpoll
                                                  |
                            dev->d_iob:           |
                                                ---------------         ---------------
                             io_data       iob1 |  |          |    iob3 |  |          |
                                    \           ---------------         ---------------
                                  ---------------  |       --------------- |
                             iob0 |  |          |  |  iob2 |  |          | |
                                  ---------------  |       --------------- |
                                     \             |          /           /
                                        \          |       /           /
                                   ----------------------------------------------
                    NICs io vector |    |    |    |    |    |    |    |    |    |
                                   ----------------------------------------------

RX:

  [tcp|udp|icmp|...]ipv[4|6]_data_handler()(iob_concat/append to readahead)
                    |
                    |
      [tcp|udp|icmp|...]_ipv[4|6]_in()/...
                    |
                    |
          pkt/ipv[4/6]_input()/...
                    |
                    |
     NICs io vector receive(iov_base to each iobs)

--------------------------------------------------------------------

2> CONFIG_IOB_BUFSIZE is greater than MTU:

TX:

"(CONFIG_IOB_BUFSIZE) > (MAX_NETDEV_PKTSIZE + CONFIG_NET_GUARDSIZE + CONFIG_NET_LL_GUARDSIZE)"

                tcp_poll()/udp_poll()/pkt_poll()/...(l3|l4)
                           /              \
                          /                \
devif_poll_[l3|l4]_connections()     devif_iob_send() (nocopy:udp/icmp/...)
           /                                   \      (copy:tcp)
          /                                     \
  devif_iob_poll("NIC"_txpoll)                callback() // "NIC"_txpoll
                                                  |
                                             "NIC"_send()
                          (dev->d_iob->io_data[CONFIG_NET_LL_GUARDSIZE - NET_LL_HDRLEN(dev)])

RX:

  [tcp|udp|icmp|...]ipv[4|6]_data_handler()(iob_concat/append to readahead)
                    |
                    |
      [tcp|udp|icmp|...]_ipv[4|6]_in()/...
                    |
                    |
          pkt/ipv[4/6]_input()/...
                    |
                    |
     NICs io vector receive(iov_base to io_data)
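
The single-iob case can be sketched as follows. The first helper
restates the size condition quoted above; the second computes where a
driver would start sending and how many bytes are contiguous. All
sketch_* names are hypothetical, and pktlen is assumed to be the l3
length without the l2 header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Contiguous frame as a driver would see it in the single-iob case */

struct sketch_frame
{
  const uint8_t *start;  /* first byte of the l2 header */
  size_t         len;    /* l2 header plus l3 packet length */
};

/* True when a whole packet plus both guard areas fits into one iob, so
 * the first iob can be treated as a linear buffer.
 */

static bool sketch_fits_one_iob(size_t iob_bufsize, size_t max_pktsize,
                                size_t guardsize, size_t ll_guardsize)
{
  return iob_bufsize > max_pktsize + guardsize + ll_guardsize;
}

/* Start of transmission is io_data[ll_guardsize - hdrlen]; the l3 data
 * follows directly behind the l2 header.
 */

static struct sketch_frame sketch_linear_frame(const uint8_t *io_data,
                                               size_t ll_guardsize,
                                               size_t hdrlen,
                                               size_t pktlen)
{
  struct sketch_frame frame;

  assert(hdrlen <= ll_guardsize);
  frame.start = io_data + (ll_guardsize - hdrlen);
  frame.len   = hdrlen + pktlen;
  return frame;
}
```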

--------------------------------------------------------------------

3> Compatible with all old flat buffer NICs

TX:
                tcp_poll()/udp_poll()/pkt_poll()/...(l3|l4)
                           /              \
                          /                \
devif_poll_[l3|l4]_connections()     devif_iob_send() (nocopy:udp/icmp/...)
           /                                   \      (copy:tcp)
          /                                     \
  devif_iob_poll(devif_poll_callback())  devif_poll_callback() /* new interface, gather iobs to flat buffer */
       /                                           \
      /                                             \
 devif_poll("NIC"_txpoll)                     "NIC"_send()(dev->d_buf)

RX:

  [tcp|udp|icmp|...]ipv[4|6]_data_handler()(iob_concat/append to readahead)
                    |
                    |
      [tcp|udp|icmp|...]_ipv[4|6]_in()/...
                    |
                    |
               netdev_input()  /* new interface, Scatter/gather flat/iob buffer */
                    |
                    |
          pkt/ipv[4|6]_input()/...
                    |
                    |
    NICs io vector receive(Original flat buffer)
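
For flat buffer NICs, the gather step amounts to copying the l2 header
and then the chained payload into one linear buffer. The sketch below
illustrates that idea with a simplified, hypothetical struct flat_iob
and sketch_* helpers. It assumes the first entry's io_offset equals the
guard size, and it is not the code used by devif_poll().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified, hypothetical stand-in for one entry of an iob chain */

struct flat_iob
{
  struct flat_iob *io_flink;   /* next entry in the chain, or NULL */
  uint16_t         io_len;     /* valid bytes in this entry */
  uint16_t         io_offset;  /* offset of the valid bytes in io_data */
  uint8_t          io_data[64];
};

/* Copy up to len payload bytes from the chain into dest.
 * Returns the number of bytes actually copied.
 */

static size_t sketch_gather(uint8_t *dest, size_t len,
                            const struct flat_iob *head)
{
  const struct flat_iob *iob;
  size_t copied = 0;

  for (iob = head; iob != NULL && copied < len; iob = iob->io_flink)
    {
      size_t n = iob->io_len;

      if (n > len - copied)
        {
          n = len - copied;
        }

      memcpy(dest + copied, &iob->io_data[iob->io_offset], n);
      copied += n;
    }

  return copied;
}

/* Build [l2 header][l3 packet] in a flat buffer. The l2 header sits in
 * the guard area of the first entry, directly in front of the payload.
 */

static size_t sketch_flatten(uint8_t *flat, size_t flatlen,
                             const struct flat_iob *head,
                             size_t guardsize, size_t hdrlen,
                             size_t pktlen)
{
  assert(hdrlen <= guardsize && hdrlen + pktlen <= flatlen);

  memcpy(flat, &head->io_data[guardsize - hdrlen], hdrlen);
  return hdrlen + sketch_gather(flat + hdrlen, pktlen, head);
}
```

This is the extra copy that scatter/gather capable hardware avoids.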

3. Iperf throughput on NuttX simulator:

  -------------------------------------------------
  |  Protocol      | Server | Client |    Unit    |
  |-----------------------------------------------|
  |  TCP           |  813   |   834  |  Mbits/sec |
  |  TCP(Offload)  | 1720   |  1100  |  Mbits/sec |
  |  UDP           |   22   |   757  |  Mbits/sec |
  |  UDP(Offload)  |   25   |  1250  |  Mbits/sec |
  -------------------------------------------------

Signed-off-by: chao an <anchao@xiaomi.com>
chao an
2022-11-23 19:39:40 +08:00
committed by Xiang Xiao
parent c26fd6565d
commit 34d2cde8a8
59 changed files with 1766 additions and 913 deletions
+215 -7
@@ -618,11 +618,7 @@ static inline int devif_poll_tcp_connections(FAR struct net_driver_s *dev,
#endif
/****************************************************************************
* Public Functions
****************************************************************************/
/****************************************************************************
* Name: devif_poll
* Name: devif_poll_connections
*
* Description:
* This function will traverse each active network connection structure and
@@ -646,9 +642,10 @@ static inline int devif_poll_tcp_connections(FAR struct net_driver_s *dev,
*
****************************************************************************/
int devif_poll(FAR struct net_driver_s *dev, devif_poll_callback_t callback)
static int devif_poll_connections(FAR struct net_driver_s *dev,
devif_poll_callback_t callback)
{
int bstop = false;
int bstop;
/* Traverse all of the active packet connections and perform the poll
* action.
@@ -772,6 +769,217 @@ int devif_poll(FAR struct net_driver_s *dev, devif_poll_callback_t callback)
return bstop;
}
/****************************************************************************
* Name: devif_iob_poll
*
* Description:
* This function will traverse each active network connection structure and
* will perform network polling operations. devif_poll() may be called
* asynchronously when the network driver can accept another outgoing
* packet.
*
* This function will call the provided callback function for every active
* connection. Polling will continue until all connections have been polled
* or until the user-supplied function returns a non-zero value (which it
* should do only if it cannot accept further write data).
*
* When the callback function is called, there may be an outbound packet
* waiting for service in the device packet buffer, and if so the d_len
* field is set to a value larger than zero. The device driver should then
* send out the packet.
*
* This is the iob buffer version of devif_poll().
* This function supports sending/receiving iob vectors directly between
* the driver and the l3/l4 stack to avoid unnecessary memory copies,
* especially on hardware that supports scatter/gather, which can
* greatly improve performance.
* This function uses d_iob for packet input, which is used by some
* NICs such as the cellular net driver.
*
* If NIC hardware supports scatter/gather transfer
*
* tcp_poll()/udp_poll()/pkt_poll()/...(l3/l4)
* / \
* / \
* devif_poll_[l3|l4]_connections() devif_iob_send() (nocopy:udp/icmp/...)
* / \ (copy:tcp)
* / \
* devif_iob_poll("NIC"_txpoll) callback() // "NIC"_txpoll
*
*
* Assumptions:
* This function is called from the MAC device driver with the network
* locked.
*
****************************************************************************/
static int devif_iob_poll(FAR struct net_driver_s *dev,
devif_poll_callback_t callback)
{
int bstop;
/* Device polling, prepare iob buffer */
if (netdev_iob_prepare(dev, false, 0) != OK)
{
return true;
}
/* Perform all connections poll */
bstop = devif_poll_connections(dev, callback);
/* Device polling completed, release iob */
netdev_iob_release(dev);
dev->d_buf = NULL;
return bstop;
}
/****************************************************************************
* Name: devif_poll_callback
*
* Description:
* This function helps to gather multiple iob memory slices into a
* linear device buffer. On devices with small memory, this function
* triggers a memory copy when the net device starts to transmit the
* iob slices to the flat buffer.
*
****************************************************************************/
static int devif_poll_callback(FAR struct net_driver_s *dev)
{
if (dev->d_len > 0)
{
return true;
}
return false;
}
/****************************************************************************
* Public Functions
****************************************************************************/
/****************************************************************************
* Name: devif_poll
*
* Description:
* This function will traverse each active network connection structure and
* will perform network polling operations. devif_poll() may be called
* asynchronously when the network driver can accept another outgoing
* packet.
*
* This function will call the provided callback function for every active
* connection. Polling will continue until all connections have been polled
* or until the user-supplied function returns a non-zero value (which it
* should do only if it cannot accept further write data).
*
* When the callback function is called, there may be an outbound packet
* waiting for service in the device packet buffer, and if so the d_len
* field is set to a value larger than zero. The device driver should then
* send out the packet.
*
* Compatible with all old flat buffer NICs
*
* tcp_poll()/udp_poll()/pkt_poll()/...(l3|l4)
* / \
* / \
* devif_poll_[l3|l4]_connections() devif_iob_send() (nocopy:udp/icmp/..)
* / \ (copy:tcp)
* / \
* devif_iob_poll(devif_poll_callback()) devif_poll_callback()
* / \
* / \
* devif_poll("NIC"_txpoll) "NIC"_send()(dev->d_buf)
*
*
* Assumptions:
* This function is called from the MAC device driver with the network
* locked.
*
****************************************************************************/
int devif_poll(FAR struct net_driver_s *dev, devif_poll_callback_t callback)
{
uint16_t llhdrlen;
int bstop = false;
FAR uint8_t *buf;
if (dev->d_buf == NULL)
{
return devif_iob_poll(dev, callback);
}
buf = dev->d_buf;
/* Device polling, prepare iob buffer */
if (netdev_iob_prepare(dev, false, 0) != OK)
{
return true;
}
llhdrlen = NET_LL_HDRLEN(dev);
do
{
/* Perform all connections poll */
bstop = devif_poll_connections(dev, devif_poll_callback);
if (dev->d_len > 0)
{
/* Copy iob to flat buffer */
iob_copyout(buf + llhdrlen,
dev->d_iob, dev->d_len, 0);
/* Copy l2 header (arp out) */
memcpy(buf, dev->d_iob->io_data +
(CONFIG_NET_LL_GUARDSIZE - llhdrlen), llhdrlen);
/* Restore flat buffer pointer */
dev->d_buf = buf;
/* Call the real device callback */
bstop = callback(dev);
/* Flat buffer changed by NIC ? */
if (dev->d_buf != buf)
{
if (dev->d_buf == NULL)
{
break;
}
buf = dev->d_buf;
}
/* Finish copy, reset iob */
netdev_iob_prepare(dev, false, 0);
iob_update_pktlen(dev->d_iob, 0);
}
}
while (bstop);
/* Device polling completed, release iob */
netdev_iob_release(dev);
/* Restore the flat buffer */
dev->d_buf = buf;
return bstop;
}
/****************************************************************************
* Name: devif_out
*