[v2] net/af_packet: remove timestamp from packet status

Message ID: 1631553801-75072-1-git-send-email-tudor.cornea@gmail.com
State: Superseded, archived
Delegated to: Ferruh Yigit
Series: [v2] net/af_packet: remove timestamp from packet status

Checks

Context                         Check    Description
ci/checkpatch                   success  coding style OK
ci/github-robot: build          success  github build: passed
ci/iol-x86_64-unit-testing      fail     Testing issues
ci/iol-x86_64-compile-testing   success  Testing PASS
ci/iol-mellanox-Performance     success  Performance Testing PASS
ci/iol-aarch64-compile-testing  success  Testing PASS
ci/Intel-compilation            success  Compilation OK
ci/intel-Testing                fail     Testing issues
ci/iol-intel-Performance        success  Performance Testing PASS

Commit Message

Tudor Cornea Sept. 13, 2021, 5:23 p.m. UTC
Mask the timestamp flags out of the packet status before checking
whether a TX ring slot is available. The flags should only be set if
timestamping is enabled on the socket, but kernels affected by a bug
that is fixed in newer releases may set them regardless.

For interfaces of type 'veth', the sent skb is forwarded
to the peer and back into the network stack which timestamps
it on the RX path if timestamping is enabled globally
(which happens if any socket enables timestamping).

When the skb is destructed, tpacket_destruct_skb() is called
and it calls __packet_set_timestamp() which doesn't check
the flags on the socket and returns the timestamp if it is
set in the skb (and for veth it is, as mentioned above).

See the following kernel commit for reference [1]:

net: packetmmap: fix only tx timestamp on request

The packetmmap tx ring should only return timestamps if requested
via setsockopt PACKET_TIMESTAMP, as documented. This allows
compatibility with non-timestamp aware user-space code which checks
tp_status == TP_STATUS_AVAILABLE; not expecting additional timestamp
flags to be set in tp_status.

[1] https://www.spinics.net/lists/kernel/msg3959391.html

Signed-off-by: Mihai Pogonaru <pogonarumihai@gmail.com>
Signed-off-by: Tudor Cornea <tudor.cornea@gmail.com>

---
v2:
* Remove compile-time check for kernel version
---
 drivers/net/af_packet/rte_eth_af_packet.c | 20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)
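
The failure mode described in the commit message can be shown in isolation.
The following is a minimal sketch, not code from the patch: the function names
slot_free_naive() and slot_free_masked() are hypothetical, and the fallback
TP_STATUS_* definitions are only there so the sketch builds where
linux/if_packet.h is unavailable. It compares a plain equality check against
one that ignores the two timestamp flags the patch masks.

```c
#include <stdbool.h>
#include <stdint.h>

#ifdef __linux__
#include <linux/if_packet.h>	/* real TP_STATUS_* definitions */
#endif

/* Fallback values mirroring linux/if_packet.h, used only off Linux. */
#ifndef TP_STATUS_AVAILABLE
#define TP_STATUS_AVAILABLE		0
#define TP_STATUS_SEND_REQUEST		(1 << 0)
#define TP_STATUS_TS_SOFTWARE		(1 << 29)
#define TP_STATUS_TS_RAW_HARDWARE	(1U << 31)
#endif

/* Hypothetical name: the comparison used before the patch. */
static bool
slot_free_naive(uint32_t tp_status)
{
	return tp_status == TP_STATUS_AVAILABLE;
}

/* Hypothetical name: the same comparison with the timestamp bits ignored. */
static bool
slot_free_masked(uint32_t tp_status)
{
	tp_status &= ~(TP_STATUS_TS_SOFTWARE | TP_STATUS_TS_RAW_HARDWARE);

	return tp_status == TP_STATUS_AVAILABLE;
}
```

With a status of TP_STATUS_AVAILABLE | TP_STATUS_TS_SOFTWARE, the naive check
reports the slot as busy while the masked check reports it as free. A slot
that still carries TP_STATUS_SEND_REQUEST is reported as busy by both.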
  

Comments

Ferruh Yigit Sept. 20, 2021, 5:48 p.m. UTC | #1
On 9/13/2021 6:23 PM, Tudor Cornea wrote:
> We should eliminate the timestamp status from the packet
> status. This should only matter if timestamping is enabled
> on the socket, but we might hit a kernel bug, which is fixed
> in newer releases.
> 
> For interfaces of type 'veth', the sent skb is forwarded
> to the peer and back into the network stack which timestamps
> it on the RX path if timestamping is enabled globally
> (which happens if any socket enables timestamping).
> 
> When the skb is destructed, tpacket_destruct_skb() is called
> and it calls __packet_set_timestamp() which doesn't check
> the flags on the socket and returns the timestamp if it is
> set in the skb (and for veth it is, as mentioned above).
> 
> See the following kernel commit for reference [1]:
> 
> net: packetmmap: fix only tx timestamp on request
> 
> The packetmmap tx ring should only return timestamps if requested
> via setsockopt PACKET_TIMESTAMP, as documented. This allows
> compatibility with non-timestamp aware user-space code which checks
> tp_status == TP_STATUS_AVAILABLE; not expecting additional timestamp
> flags to be set in tp_status.
> 
> [1] https://www.spinics.net/lists/kernel/msg3959391.html
> 
> Signed-off-by: Mihai Pogonaru <pogonarumihai@gmail.com>
> Signed-off-by: Tudor Cornea <tudor.cornea@gmail.com>
> 
> ---
> v2:
> * Remove compile-time check for kernel version

OK, Stephen's comment makes sense.

> ---
>  drivers/net/af_packet/rte_eth_af_packet.c | 20 ++++++++++++++++++--
>  1 file changed, 18 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/af_packet/rte_eth_af_packet.c b/drivers/net/af_packet/rte_eth_af_packet.c
> index b73b211..7ecea4e 100644
> --- a/drivers/net/af_packet/rte_eth_af_packet.c
> +++ b/drivers/net/af_packet/rte_eth_af_packet.c
> @@ -167,6 +167,22 @@ eth_af_packet_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
>  	return num_rx;
>  }
>  
> +static inline bool tx_ring_status_unavailable(uint32_t tp_status)
> +{

Minor style comment: can you put the 'static inline bool' part on a separate
line? A basic function comment would also be good.

Thanks,
ferruh

> +	/*
> +	 * We eliminate the timestamp status from the packet status.
> +	 * This should only matter if timestamping is enabled on the socket,
> +	 * but there is a bug in the kernel which is fixed in newer releases.
> +	 *
> +	 * See the following kernel commit for reference:
> +	 *     commit 171c3b151118a2fe0fc1e2a9d1b5a1570cfe82d2
> +	 *     net: packetmmap: fix only tx timestamp on request
> +	 */
> +	tp_status &= ~(TP_STATUS_TS_SOFTWARE | TP_STATUS_TS_RAW_HARDWARE);
> +
> +	return tp_status != TP_STATUS_AVAILABLE;
> +}
> +
>  /*
>   * Callback to handle sending packets through a real NIC.
>   */
> @@ -212,8 +228,8 @@ eth_af_packet_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
>  		}
>  
>  		/* point at the next incoming frame */
> -		if ((ppd->tp_status != TP_STATUS_AVAILABLE) &&
> -		    (poll(&pfd, 1, -1) < 0))
> +		if (tx_ring_status_unavailable(ppd->tp_status) &&
> +		    poll(&pfd, 1, -1) < 0)
>  			break;
>  
>  		/* copy the tx frame data */
>
  
Tudor Cornea Sept. 21, 2021, 9:02 p.m. UTC | #2
Thanks for the suggestion. I will send a new version of the patch with the
required changes.

Tudor

  

Patch

diff --git a/drivers/net/af_packet/rte_eth_af_packet.c b/drivers/net/af_packet/rte_eth_af_packet.c
index b73b211..7ecea4e 100644
--- a/drivers/net/af_packet/rte_eth_af_packet.c
+++ b/drivers/net/af_packet/rte_eth_af_packet.c
@@ -167,6 +167,22 @@  eth_af_packet_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
 	return num_rx;
 }
 
+static inline bool tx_ring_status_unavailable(uint32_t tp_status)
+{
+	/*
+	 * We eliminate the timestamp status from the packet status.
+	 * This should only matter if timestamping is enabled on the socket,
+	 * but there is a bug in the kernel which is fixed in newer releases.
+	 *
+	 * See the following kernel commit for reference:
+	 *     commit 171c3b151118a2fe0fc1e2a9d1b5a1570cfe82d2
+	 *     net: packetmmap: fix only tx timestamp on request
+	 */
+	tp_status &= ~(TP_STATUS_TS_SOFTWARE | TP_STATUS_TS_RAW_HARDWARE);
+
+	return tp_status != TP_STATUS_AVAILABLE;
+}
+
 /*
  * Callback to handle sending packets through a real NIC.
  */
@@ -212,8 +228,8 @@  eth_af_packet_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
 		}
 
 		/* point at the next incoming frame */
-		if ((ppd->tp_status != TP_STATUS_AVAILABLE) &&
-		    (poll(&pfd, 1, -1) < 0))
+		if (tx_ring_status_unavailable(ppd->tp_status) &&
+		    poll(&pfd, 1, -1) < 0)
 			break;
 
 		/* copy the tx frame data */
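
For illustration only, here is a standalone sketch of the helper laid out the
way the review asks for (return type on its own line, plus a function
comment). It is not the follow-up version of the patch. The wrapper
wait_for_tx_slot() is a hypothetical name that mirrors the check-then-poll
condition from the second hunk; the POLLOUT setup of the pollfd and the
fallback TP_STATUS_* definitions are assumptions made so the sketch builds on
its own.

```c
#include <poll.h>
#include <stdbool.h>
#include <stdint.h>

#ifdef __linux__
#include <linux/if_packet.h>	/* real TP_STATUS_* definitions */
#endif

/* Fallback values mirroring linux/if_packet.h, used only off Linux. */
#ifndef TP_STATUS_AVAILABLE
#define TP_STATUS_AVAILABLE		0
#define TP_STATUS_SEND_REQUEST		(1 << 0)
#define TP_STATUS_TS_SOFTWARE		(1 << 29)
#define TP_STATUS_TS_RAW_HARDWARE	(1U << 31)
#endif

/*
 * Return true if the TX ring slot cannot be reused yet.
 * Timestamp flags are ignored, because affected kernels may set them
 * even when timestamping was not requested on the socket.
 */
static inline bool
tx_ring_status_unavailable(uint32_t tp_status)
{
	tp_status &= ~(TP_STATUS_TS_SOFTWARE | TP_STATUS_TS_RAW_HARDWARE);

	return tp_status != TP_STATUS_AVAILABLE;
}

/*
 * Hypothetical wrapper, not part of the patch: if the slot is busy,
 * block in poll() once until the socket reports it is writable.
 * Returns 0 if the caller may go on, -1 if poll() failed.
 */
static int
wait_for_tx_slot(int sockfd, uint32_t tp_status)
{
	struct pollfd pfd = { .fd = sockfd, .events = POLLOUT, .revents = 0 };

	if (tx_ring_status_unavailable(tp_status) && poll(&pfd, 1, -1) < 0)
		return -1;

	return 0;
}
```

Because of the short-circuit in the condition, poll() is only reached when
the slot is reported busy after masking, so a slot whose status carries
nothing but timestamp flags no longer blocks the TX path.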