[dpdk-dev,v1] gso: fix marking TCP checksum flag in TCP segments

Message ID 1524406859-29585-1-git-send-email-ophirmu@mellanox.com (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Headers

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation success Compilation OK

Commit Message

Ophir Munk April 22, 2018, 2:20 p.m. UTC
  Large TCP packets which are marked with PKT_TX_TCP_SEG flag are
segmented and the flag is cleared in the resulting segments, however,
the segments checksum is not updated. It is therefore required to set
the PKT_TX_TCP_CKSUM flag in each TCP segment in order to mark for the
sending driver the need to update the TCP checksum before transmitting
the segment.

Fixes: 119583797b6a ("gso: support TCP/IPv4 GSO")
Cc: stable@dpdk.org

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
---
 lib/librte_gso/rte_gso.c | 2 ++
 1 file changed, 2 insertions(+)
  

Comments

Ophir Munk April 22, 2018, 2:47 p.m. UTC | #1
Hi Ferruh, Thomas,
Please note that this patch should be merged before the net/tap TSO patches:

https://dpdk.org/dev/patchwork/patch/38666/
https://dpdk.org/dev/patchwork/patch/38667/

Regards,
Ophir

> -----Original Message-----
> From: Ophir Munk
> Sent: Sunday, April 22, 2018 5:21 PM
> To: dev@dpdk.org; Jiayu Hu <jiayu.hu@intel.com>
> Cc: Thomas Monjalon <thomas@monjalon.net>; Olga Shern
> <olgas@mellanox.com>; Pascal Mazon <pascal.mazon@6wind.com>; Ophir
> Munk <ophirmu@mellanox.com>; stable@dpdk.org
> Subject: [PATCH v1] gso: fix marking TCP checksum flag in TCP segments
> 
> Large TCP packets which are marked with PKT_TX_TCP_SEG flag are
> segmented and the flag is cleared in the resulting segments, however, the
> segments checksum is not updated. It is therefore required to set the
> PKT_TX_TCP_CKSUM flag in each TCP segment in order to mark for the
> sending driver the need to update the TCP checksum before transmitting the
> segment.
> 
> Fixes: 119583797b6a ("gso: support TCP/IPv4 GSO")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
> ---
>  lib/librte_gso/rte_gso.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c index
> a44e3d4..e9ce9ce 100644
> --- a/lib/librte_gso/rte_gso.c
> +++ b/lib/librte_gso/rte_gso.c
> @@ -50,12 +50,14 @@ rte_gso_segment(struct rte_mbuf *pkt,
>  			((IS_IPV4_GRE_TCP4(pkt->ol_flags) &&
>  			 (gso_ctx->gso_types &
> DEV_TX_OFFLOAD_GRE_TNL_TSO)))) {
>  		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
> +		pkt->ol_flags |= PKT_TX_TCP_CKSUM;
>  		ret = gso_tunnel_tcp4_segment(pkt, gso_size, ipid_delta,
>  				direct_pool, indirect_pool,
>  				pkts_out, nb_pkts_out);
>  	} else if (IS_IPV4_TCP(pkt->ol_flags) &&
>  			(gso_ctx->gso_types &
> DEV_TX_OFFLOAD_TCP_TSO)) {
>  		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
> +		pkt->ol_flags |= PKT_TX_TCP_CKSUM;
>  		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
>  				direct_pool, indirect_pool,
>  				pkts_out, nb_pkts_out);
> --
> 2.7.4
  
Hu, Jiayu April 23, 2018, 4:13 a.m. UTC | #2
Hi Ophir,

In the GSO design, the GSO library doesn't care about checksums, which
means it doesn't check if input packets have correct checksums, and it
doesn't do any checksum related work for the output GSO segments. It
depends on the callers to use HW or SW checksum calculation for output
packets. This is why the GSO library doesn't set PKT_TX_TCP_CKSUM. So
I don't think it's a bug.

In my opinion, it's not a good idea to enable HW TCP checksum calculation
silently, and without the aware of the caller. In fact, the caller always know it
does SW TSO (i.e. GSO), instead of real HW TSO. If the caller wants HW checksum
calculation, it can add PKT_TX_TCP_CKSUM to ol_flags before or after calling the
GSO library.

Add Konstantin for more suggestions.

Thanks,
Jiayu

> -----Original Message-----
> From: Ophir Munk [mailto:ophirmu@mellanox.com]
> Sent: Sunday, April 22, 2018 10:21 PM
> To: dev@dpdk.org; Hu, Jiayu <jiayu.hu@intel.com>
> Cc: Thomas Monjalon <thomas@monjalon.net>; Olga Shern
> <olgas@mellanox.com>; Pascal Mazon <pascal.mazon@6wind.com>; Ophir
> Munk <ophirmu@mellanox.com>; stable@dpdk.org
> Subject: [PATCH v1] gso: fix marking TCP checksum flag in TCP segments
> 
> Large TCP packets which are marked with PKT_TX_TCP_SEG flag are
> segmented and the flag is cleared in the resulting segments, however,
> the segments checksum is not updated. It is therefore required to set
> the PKT_TX_TCP_CKSUM flag in each TCP segment in order to mark for the
> sending driver the need to update the TCP checksum before transmitting
> the segment.
> 
> Fixes: 119583797b6a ("gso: support TCP/IPv4 GSO")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
> ---
>  lib/librte_gso/rte_gso.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
> index a44e3d4..e9ce9ce 100644
> --- a/lib/librte_gso/rte_gso.c
> +++ b/lib/librte_gso/rte_gso.c
> @@ -50,12 +50,14 @@ rte_gso_segment(struct rte_mbuf *pkt,
>  			((IS_IPV4_GRE_TCP4(pkt->ol_flags) &&
>  			 (gso_ctx->gso_types &
> DEV_TX_OFFLOAD_GRE_TNL_TSO)))) {
>  		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
> +		pkt->ol_flags |= PKT_TX_TCP_CKSUM;
>  		ret = gso_tunnel_tcp4_segment(pkt, gso_size, ipid_delta,
>  				direct_pool, indirect_pool,
>  				pkts_out, nb_pkts_out);
>  	} else if (IS_IPV4_TCP(pkt->ol_flags) &&
>  			(gso_ctx->gso_types &
> DEV_TX_OFFLOAD_TCP_TSO)) {
>  		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
> +		pkt->ol_flags |= PKT_TX_TCP_CKSUM;
>  		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
>  				direct_pool, indirect_pool,
>  				pkts_out, nb_pkts_out);
> --
> 2.7.4
  
Ophir Munk April 24, 2018, 9:44 a.m. UTC | #3
Hi Jiayu,
Please find comments inline

> -----Original Message-----
> From: Hu, Jiayu [mailto:jiayu.hu@intel.com]
> Sent: Monday, April 23, 2018 7:14 AM
> To: Ophir Munk <ophirmu@mellanox.com>; dev@dpdk.org; Ananyev,
> Konstantin <konstantin.ananyev@intel.com>
> Cc: Thomas Monjalon <thomas@monjalon.net>; Olga Shern
> <olgas@mellanox.com>; Pascal Mazon <pascal.mazon@6wind.com>;
> stable@dpdk.org
> Subject: RE: [PATCH v1] gso: fix marking TCP checksum flag in TCP segments
> 
> Hi Ophir,
> 
> In the GSO design, the GSO library doesn't care about checksums, which
> means it doesn't check if input packets have correct checksums, and it
> doesn't do any checksum related work for the output GSO segments. It
> depends on the callers to use HW or SW checksum calculation for output
> packets. This is why the GSO library doesn't set PKT_TX_TCP_CKSUM. So I
> don't think it's a bug.
> 

Can you please reconsider this design? I think the GSO library should imitate the HW behavior where TCP segments checksum is automatically calculated without explicitly requesting it. I am not saying that GSO library itself should calculate the checksums - but at least it should mark each segment as requiring this calculation.

> In my opinion, it's not a good idea to enable HW TCP checksum calculation
> silently, and without the aware of the caller. In fact, the caller always know it
> does SW TSO (i.e. GSO), instead of real HW TSO. 

This is not correct. Consider net_failsafe with 2 sub-devices: one is a HW PCI device, the other one is a SW TAP device. Failsafe must work transparently with these two sub-devices and the caller cannot tell if TSO is done in SW or HW. 

> If the caller wants HW
> checksum calculation, it can add PKT_TX_TCP_CKSUM to ol_flags before or
> after calling the GSO library.
> 

FYI - TAP TSO patches were submitted to dpdk.org mailing list. These patches use the GSO library.
https://dpdk.org/dev/patchwork/patch/38666/
https://dpdk.org/dev/patchwork/patch/38667/

Running testpmd with TAP TSO is currently broken without the suggested librte_gso patch. 
Please note testpmd implementation (app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c) in case *both* TSO and TCP CKSUM are configured:

  if (tso_segsz)
      ol_flags |= PKT_TX_TCP_SEG;    // *** if TSO is applicable - the packet flags are only marked with PKT_TX_TCP_SEG and no PKT_TX_TCP_CKSUM ***
   else if (tx_offloads & DEV_TX_OFFLOAD_TCP_CKSUM) 
       ol_flags |= PKT_TX_TCP_CKSUM;     // *** PKT_TX_TCP_CKSUM is marked only if TSO is not applicable ***
  else {
      tcp_hdr->cksum =
         get_udptcp_checksum(l3_hdr, tcp_hdr,

In other words - testpmd does not set TCP_CKSUM along with TCP_SEG therefore using testpmd with TAP/TSO will result in TCP segments with 0 (incorrect) TCP checksums.

In addition - please note the comments in lib/librte_mbuf/rte_mbuf.h which specify that PKT_TX_TCP_SEG flag implies the PKT_TX_TCP_CKSUM (hence it is not required to be explicitly set by the caller)

/**
* TCP segmentation offload. To enable this offload feature for a
* packet to be transmitted on hardware supporting TSO:
*  - set the PKT_TX_TCP_SEG flag in mbuf->ol_flags (this flag implies
*    PKT_TX_TCP_CKSUM)
...

> Add Konstantin for more suggestions.
> 
> Thanks,
> Jiayu
> 
> > -----Original Message-----
> > From: Ophir Munk [mailto:ophirmu@mellanox.com]
> > Sent: Sunday, April 22, 2018 10:21 PM
> > To: dev@dpdk.org; Hu, Jiayu <jiayu.hu@intel.com>
> > Cc: Thomas Monjalon <thomas@monjalon.net>; Olga Shern
> > <olgas@mellanox.com>; Pascal Mazon <pascal.mazon@6wind.com>;
> Ophir
> > Munk <ophirmu@mellanox.com>; stable@dpdk.org
> > Subject: [PATCH v1] gso: fix marking TCP checksum flag in TCP segments
> >
> > Large TCP packets which are marked with PKT_TX_TCP_SEG flag are
> > segmented and the flag is cleared in the resulting segments, however,
> > the segments checksum is not updated. It is therefore required to set
> > the PKT_TX_TCP_CKSUM flag in each TCP segment in order to mark for the
> > sending driver the need to update the TCP checksum before transmitting
> > the segment.
> >
> > Fixes: 119583797b6a ("gso: support TCP/IPv4 GSO")
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
> > ---
> >  lib/librte_gso/rte_gso.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c index
> > a44e3d4..e9ce9ce 100644
> > --- a/lib/librte_gso/rte_gso.c
> > +++ b/lib/librte_gso/rte_gso.c
> > @@ -50,12 +50,14 @@ rte_gso_segment(struct rte_mbuf *pkt,
> >  			((IS_IPV4_GRE_TCP4(pkt->ol_flags) &&
> >  			 (gso_ctx->gso_types &
> > DEV_TX_OFFLOAD_GRE_TNL_TSO)))) {
> >  		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
> > +		pkt->ol_flags |= PKT_TX_TCP_CKSUM;
> >  		ret = gso_tunnel_tcp4_segment(pkt, gso_size, ipid_delta,
> >  				direct_pool, indirect_pool,
> >  				pkts_out, nb_pkts_out);
> >  	} else if (IS_IPV4_TCP(pkt->ol_flags) &&
> >  			(gso_ctx->gso_types &
> > DEV_TX_OFFLOAD_TCP_TSO)) {
> >  		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
> > +		pkt->ol_flags |= PKT_TX_TCP_CKSUM;
> >  		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
> >  				direct_pool, indirect_pool,
> >  				pkts_out, nb_pkts_out);
> > --
> > 2.7.4
  
Ananyev, Konstantin April 24, 2018, 10:56 a.m. UTC | #4
> -----Original Message-----
> From: Ophir Munk [mailto:ophirmu@mellanox.com]
> Sent: Tuesday, April 24, 2018 10:44 AM
> To: Hu, Jiayu <jiayu.hu@intel.com>; dev@dpdk.org; Ananyev, Konstantin <konstantin.ananyev@intel.com>
> Cc: Thomas Monjalon <thomas@monjalon.net>; Olga Shern <olgas@mellanox.com>; Pascal Mazon <pascal.mazon@6wind.com>;
> stable@dpdk.org
> Subject: RE: [PATCH v1] gso: fix marking TCP checksum flag in TCP segments
> 
> Hi Jiayu,
> Please find comments inline
> 
> > -----Original Message-----
> > From: Hu, Jiayu [mailto:jiayu.hu@intel.com]
> > Sent: Monday, April 23, 2018 7:14 AM
> > To: Ophir Munk <ophirmu@mellanox.com>; dev@dpdk.org; Ananyev,
> > Konstantin <konstantin.ananyev@intel.com>
> > Cc: Thomas Monjalon <thomas@monjalon.net>; Olga Shern
> > <olgas@mellanox.com>; Pascal Mazon <pascal.mazon@6wind.com>;
> > stable@dpdk.org
> > Subject: RE: [PATCH v1] gso: fix marking TCP checksum flag in TCP segments
> >
> > Hi Ophir,
> >
> > In the GSO design, the GSO library doesn't care about checksums, which
> > means it doesn't check if input packets have correct checksums, and it
> > doesn't do any checksum related work for the output GSO segments. It
> > depends on the callers to use HW or SW checksum calculation for output
> > packets. This is why the GSO library doesn't set PKT_TX_TCP_CKSUM. So I
> > don't think it's a bug.
> >
> 
> Can you please reconsider this design? I think the GSO library should imitate the HW behavior where TCP segments checksum is
> automatically calculated without explicitly requesting it. I am not saying that GSO library itself should calculate the checksums - but at least
> it should mark each segment as requiring this calculation.

But gso has no idea how this packet will be processed after it.
Caller can choose to calculate L3/L4 cksum in SW or might be going to use HW offloads.
In later case nothing stops the caller to update mbuf->ol_flags in a way he likes (TCP_CKSUM, IP_CKSUM, etc.).
Konstantin

> 
> > In my opinion, it's not a good idea to enable HW TCP checksum calculation
> > silently, and without the aware of the caller. In fact, the caller always know it
> > does SW TSO (i.e. GSO), instead of real HW TSO.
> 
> This is not correct. Consider net_failsafe with 2 sub-devices: one is a HW PCI device, the other one is a SW TAP device. Failsafe must work
> transparently with these two sub-devices and the caller cannot tell if TSO is done in SW or HW.
> 
> > If the caller wants HW
> > checksum calculation, it can add PKT_TX_TCP_CKSUM to ol_flags before or
> > after calling the GSO library.
> >
> 
> FYI - TAP TSO patches were submitted to dpdk.org mailing list. These patches use the GSO library.
> https://dpdk.org/dev/patchwork/patch/38666/
> https://dpdk.org/dev/patchwork/patch/38667/
> 
> Running testpmd with TAP TSO is currently broken without the suggested librte_gso patch.
> Please note testpmd implementation (app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c) in case *both* TSO and TCP CKSUM are
> configured:
> 
>   if (tso_segsz)
>       ol_flags |= PKT_TX_TCP_SEG;    // *** if TSO is applicable - the packet flags are only marked with PKT_TX_TCP_SEG and no
> PKT_TX_TCP_CKSUM ***
>    else if (tx_offloads & DEV_TX_OFFLOAD_TCP_CKSUM)
>        ol_flags |= PKT_TX_TCP_CKSUM;     // *** PKT_TX_TCP_CKSUM is marked only if TSO is not applicable ***
>   else {
>       tcp_hdr->cksum =
>          get_udptcp_checksum(l3_hdr, tcp_hdr,
> 
> In other words - testpmd does not set TCP_CKSUM along with TCP_SEG therefore using testpmd with TAP/TSO will result in TCP segments
> with 0 (incorrect) TCP checksums.
> 
> In addition - please note the comments in lib/librte_mbuf/rte_mbuf.h which specify that PKT_TX_TCP_SEG flag implies the
> PKT_TX_TCP_CKSUM (hence it is not required to be explicitly set by the caller)
> 
> /**
> * TCP segmentation offload. To enable this offload feature for a
> * packet to be transmitted on hardware supporting TSO:
> *  - set the PKT_TX_TCP_SEG flag in mbuf->ol_flags (this flag implies
> *    PKT_TX_TCP_CKSUM)
> ...
> 
> > Add Konstantin for more suggestions.
> >
> > Thanks,
> > Jiayu
> >
> > > -----Original Message-----
> > > From: Ophir Munk [mailto:ophirmu@mellanox.com]
> > > Sent: Sunday, April 22, 2018 10:21 PM
> > > To: dev@dpdk.org; Hu, Jiayu <jiayu.hu@intel.com>
> > > Cc: Thomas Monjalon <thomas@monjalon.net>; Olga Shern
> > > <olgas@mellanox.com>; Pascal Mazon <pascal.mazon@6wind.com>;
> > Ophir
> > > Munk <ophirmu@mellanox.com>; stable@dpdk.org
> > > Subject: [PATCH v1] gso: fix marking TCP checksum flag in TCP segments
> > >
> > > Large TCP packets which are marked with PKT_TX_TCP_SEG flag are
> > > segmented and the flag is cleared in the resulting segments, however,
> > > the segments checksum is not updated. It is therefore required to set
> > > the PKT_TX_TCP_CKSUM flag in each TCP segment in order to mark for the
> > > sending driver the need to update the TCP checksum before transmitting
> > > the segment.
> > >
> > > Fixes: 119583797b6a ("gso: support TCP/IPv4 GSO")
> > > Cc: stable@dpdk.org
> > >
> > > Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
> > > ---
> > >  lib/librte_gso/rte_gso.c | 2 ++
> > >  1 file changed, 2 insertions(+)
> > >
> > > diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c index
> > > a44e3d4..e9ce9ce 100644
> > > --- a/lib/librte_gso/rte_gso.c
> > > +++ b/lib/librte_gso/rte_gso.c
> > > @@ -50,12 +50,14 @@ rte_gso_segment(struct rte_mbuf *pkt,
> > >  			((IS_IPV4_GRE_TCP4(pkt->ol_flags) &&
> > >  			 (gso_ctx->gso_types &
> > > DEV_TX_OFFLOAD_GRE_TNL_TSO)))) {
> > >  		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
> > > +		pkt->ol_flags |= PKT_TX_TCP_CKSUM;
> > >  		ret = gso_tunnel_tcp4_segment(pkt, gso_size, ipid_delta,
> > >  				direct_pool, indirect_pool,
> > >  				pkts_out, nb_pkts_out);
> > >  	} else if (IS_IPV4_TCP(pkt->ol_flags) &&
> > >  			(gso_ctx->gso_types &
> > > DEV_TX_OFFLOAD_TCP_TSO)) {
> > >  		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
> > > +		pkt->ol_flags |= PKT_TX_TCP_CKSUM;
> > >  		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
> > >  				direct_pool, indirect_pool,
> > >  				pkts_out, nb_pkts_out);
> > > --
> > > 2.7.4
  
Ophir Munk April 24, 2018, 11:45 a.m. UTC | #5
Hi Konstantin,
Please see inline

> -----Original Message-----
> From: Ananyev, Konstantin [mailto:konstantin.ananyev@intel.com]
> Sent: Tuesday, April 24, 2018 1:56 PM
> To: Ophir Munk <ophirmu@mellanox.com>; Hu, Jiayu <jiayu.hu@intel.com>;
> dev@dpdk.org
> Cc: Thomas Monjalon <thomas@monjalon.net>; Olga Shern
> <olgas@mellanox.com>; Pascal Mazon <pascal.mazon@6wind.com>;
> stable@dpdk.org
> Subject: RE: [PATCH v1] gso: fix marking TCP checksum flag in TCP segments
> 
> 
> 
> > -----Original Message-----
> > From: Ophir Munk [mailto:ophirmu@mellanox.com]
> > Sent: Tuesday, April 24, 2018 10:44 AM
> > To: Hu, Jiayu <jiayu.hu@intel.com>; dev@dpdk.org; Ananyev, Konstantin
> > <konstantin.ananyev@intel.com>
> > Cc: Thomas Monjalon <thomas@monjalon.net>; Olga Shern
> > <olgas@mellanox.com>; Pascal Mazon <pascal.mazon@6wind.com>;
> > stable@dpdk.org
> > Subject: RE: [PATCH v1] gso: fix marking TCP checksum flag in TCP
> > segments
> >
> > Hi Jiayu,
> > Please find comments inline
> >
> > > -----Original Message-----
> > > From: Hu, Jiayu [mailto:jiayu.hu@intel.com]
> > > Sent: Monday, April 23, 2018 7:14 AM
> > > To: Ophir Munk <ophirmu@mellanox.com>; dev@dpdk.org; Ananyev,
> > > Konstantin <konstantin.ananyev@intel.com>
> > > Cc: Thomas Monjalon <thomas@monjalon.net>; Olga Shern
> > > <olgas@mellanox.com>; Pascal Mazon <pascal.mazon@6wind.com>;
> > > stable@dpdk.org
> > > Subject: RE: [PATCH v1] gso: fix marking TCP checksum flag in TCP
> > > segments
> > >
> > > Hi Ophir,
> > >
> > > In the GSO design, the GSO library doesn't care about checksums,
> > > which means it doesn't check if input packets have correct
> > > checksums, and it doesn't do any checksum related work for the
> > > output GSO segments. It depends on the callers to use HW or SW
> > > checksum calculation for output packets. This is why the GSO library
> > > doesn't set PKT_TX_TCP_CKSUM. So I don't think it's a bug.
> > >
> >
> > Can you please reconsider this design? I think the GSO library should
> > imitate the HW behavior where TCP segments checksum is automatically
> > calculated without explicitly requesting it. I am not saying that GSO library
> itself should calculate the checksums - but at least it should mark each
> segment as requiring this calculation.
> 
> But gso has no idea how this packet will be processed after it.

GSO shouldn't know. It should only mark the fact that a new TCP segment was created without a TCP checksum. 

> Caller can choose to calculate L3/L4 cksum in SW or might be going to use
> HW offloads.

Assuming TSO is configured then you suggest that TAP PMD will mark by itself the TCP_CKSUM flag for all packets returned from GSO library?

> In later case nothing stops the caller to update mbuf->ol_flags in a way he
> likes (TCP_CKSUM, IP_CKSUM, etc.).
> Konstantin
> 

Please note that TCP_SEG flag is cleared by GSO library in 2 different cases:
1. Packet length equals to or is bigger than GSO size. In this case new TCP segments are created with no TCP checksum. 
2. Packet length is smaller than GSO size. In this case no TCP segmentation is required. The original packet is returned and its existing TCP checksum is OK.

In the latter case TAP PMD will always calculate TCP checksum in SW (performance concerns) where this could have been saved. 
I am thinking of a practical scenario where TSO is configured but all packets are smaller than GSO size, however TAP PMD unnecessarily recalculates their checksums.

How do you suggest to avoid this scenario?

> >
> > > In my opinion, it's not a good idea to enable HW TCP checksum
> > > calculation silently, and without the aware of the caller. In fact,
> > > the caller always know it does SW TSO (i.e. GSO), instead of real HW TSO.
> >
> > This is not correct. Consider net_failsafe with 2 sub-devices: one is
> > a HW PCI device, the other one is a SW TAP device. Failsafe must work
> transparently with these two sub-devices and the caller cannot tell if TSO is
> done in SW or HW.
> >
> > > If the caller wants HW
> > > checksum calculation, it can add PKT_TX_TCP_CKSUM to ol_flags before
> > > or after calling the GSO library.
> > >
> >
> > FYI - TAP TSO patches were submitted to dpdk.org mailing list. These
> patches use the GSO library.
> >
> https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdpd
> >
> k.org%2Fdev%2Fpatchwork%2Fpatch%2F38666%2F&data=02%7C01%7Coph
> irmu%40me
> >
> llanox.com%7C7455c8e31c7a4364bc7108d5a9d20008%7Ca652971c7d2e4d
> 9ba6a4d1
> >
> 49256f461b%7C0%7C0%7C636601641779974217&sdata=CF7EvhXG%2BrH%
> 2BPiQEbvM0
> > mC%2FSpqobneKaoV03j5VrSDw%3D&reserved=0
> >
> https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdpd
> >
> k.org%2Fdev%2Fpatchwork%2Fpatch%2F38667%2F&data=02%7C01%7Coph
> irmu%40me
> >
> llanox.com%7C7455c8e31c7a4364bc7108d5a9d20008%7Ca652971c7d2e4d
> 9ba6a4d1
> >
> 49256f461b%7C0%7C0%7C636601641779974217&sdata=j9WVIj%2FKq6EN
> WXu3mr5By1
> > toSowU8bqJRGZ19SxiGoI%3D&reserved=0
> >
> > Running testpmd with TAP TSO is currently broken without the suggested
> librte_gso patch.
> > Please note testpmd implementation (app/test-pmd/csumonly.c
> > b/app/test-pmd/csumonly.c) in case *both* TSO and TCP CKSUM are
> > configured:
> >
> >   if (tso_segsz)
> >       ol_flags |= PKT_TX_TCP_SEG;    // *** if TSO is applicable - the packet
> flags are only marked with PKT_TX_TCP_SEG and no
> > PKT_TX_TCP_CKSUM ***
> >    else if (tx_offloads & DEV_TX_OFFLOAD_TCP_CKSUM)
> >        ol_flags |= PKT_TX_TCP_CKSUM;     // *** PKT_TX_TCP_CKSUM is
> marked only if TSO is not applicable ***
> >   else {
> >       tcp_hdr->cksum =
> >          get_udptcp_checksum(l3_hdr, tcp_hdr,
> >
> > In other words - testpmd does not set TCP_CKSUM along with TCP_SEG
> > therefore using testpmd with TAP/TSO will result in TCP segments with 0
> (incorrect) TCP checksums.
> >
> > In addition - please note the comments in lib/librte_mbuf/rte_mbuf.h
> > which specify that PKT_TX_TCP_SEG flag implies the PKT_TX_TCP_CKSUM
> > (hence it is not required to be explicitly set by the caller)
> >
> > /**
> > * TCP segmentation offload. To enable this offload feature for a
> > * packet to be transmitted on hardware supporting TSO:
> > *  - set the PKT_TX_TCP_SEG flag in mbuf->ol_flags (this flag implies
> > *    PKT_TX_TCP_CKSUM)
> > ...
> >
> > > Add Konstantin for more suggestions.
> > >
> > > Thanks,
> > > Jiayu
> > >
> > > > -----Original Message-----
> > > > From: Ophir Munk [mailto:ophirmu@mellanox.com]
> > > > Sent: Sunday, April 22, 2018 10:21 PM
> > > > To: dev@dpdk.org; Hu, Jiayu <jiayu.hu@intel.com>
> > > > Cc: Thomas Monjalon <thomas@monjalon.net>; Olga Shern
> > > > <olgas@mellanox.com>; Pascal Mazon <pascal.mazon@6wind.com>;
> > > Ophir
> > > > Munk <ophirmu@mellanox.com>; stable@dpdk.org
> > > > Subject: [PATCH v1] gso: fix marking TCP checksum flag in TCP
> > > > segments
> > > >
> > > > Large TCP packets which are marked with PKT_TX_TCP_SEG flag are
> > > > segmented and the flag is cleared in the resulting segments,
> > > > however, the segments checksum is not updated. It is therefore
> > > > required to set the PKT_TX_TCP_CKSUM flag in each TCP segment in
> > > > order to mark for the sending driver the need to update the TCP
> > > > checksum before transmitting the segment.
> > > >
> > > > Fixes: 119583797b6a ("gso: support TCP/IPv4 GSO")
> > > > Cc: stable@dpdk.org
> > > >
> > > > Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
> > > > ---
> > > >  lib/librte_gso/rte_gso.c | 2 ++
> > > >  1 file changed, 2 insertions(+)
> > > >
> > > > diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
> > > > index a44e3d4..e9ce9ce 100644
> > > > --- a/lib/librte_gso/rte_gso.c
> > > > +++ b/lib/librte_gso/rte_gso.c
> > > > @@ -50,12 +50,14 @@ rte_gso_segment(struct rte_mbuf *pkt,
> > > >  			((IS_IPV4_GRE_TCP4(pkt->ol_flags) &&
> > > >  			 (gso_ctx->gso_types &
> > > > DEV_TX_OFFLOAD_GRE_TNL_TSO)))) {
> > > >  		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
> > > > +		pkt->ol_flags |= PKT_TX_TCP_CKSUM;
> > > >  		ret = gso_tunnel_tcp4_segment(pkt, gso_size, ipid_delta,
> > > >  				direct_pool, indirect_pool,
> > > >  				pkts_out, nb_pkts_out);
> > > >  	} else if (IS_IPV4_TCP(pkt->ol_flags) &&
> > > >  			(gso_ctx->gso_types &
> > > > DEV_TX_OFFLOAD_TCP_TSO)) {
> > > >  		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
> > > > +		pkt->ol_flags |= PKT_TX_TCP_CKSUM;
> > > >  		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
> > > >  				direct_pool, indirect_pool,
> > > >  				pkts_out, nb_pkts_out);
> > > > --
> > > > 2.7.4
  
Ananyev, Konstantin April 24, 2018, 12:31 p.m. UTC | #6
Hi Ophir,

> 
> Hi Konstantin,
> Please see inline
> 
> > -----Original Message-----
> > From: Ananyev, Konstantin [mailto:konstantin.ananyev@intel.com]
> > Sent: Tuesday, April 24, 2018 1:56 PM
> > To: Ophir Munk <ophirmu@mellanox.com>; Hu, Jiayu <jiayu.hu@intel.com>;
> > dev@dpdk.org
> > Cc: Thomas Monjalon <thomas@monjalon.net>; Olga Shern
> > <olgas@mellanox.com>; Pascal Mazon <pascal.mazon@6wind.com>;
> > stable@dpdk.org
> > Subject: RE: [PATCH v1] gso: fix marking TCP checksum flag in TCP segments
> >
> >
> >
> > > -----Original Message-----
> > > From: Ophir Munk [mailto:ophirmu@mellanox.com]
> > > Sent: Tuesday, April 24, 2018 10:44 AM
> > > To: Hu, Jiayu <jiayu.hu@intel.com>; dev@dpdk.org; Ananyev, Konstantin
> > > <konstantin.ananyev@intel.com>
> > > Cc: Thomas Monjalon <thomas@monjalon.net>; Olga Shern
> > > <olgas@mellanox.com>; Pascal Mazon <pascal.mazon@6wind.com>;
> > > stable@dpdk.org
> > > Subject: RE: [PATCH v1] gso: fix marking TCP checksum flag in TCP
> > > segments
> > >
> > > Hi Jiayu,
> > > Please find comments inline
> > >
> > > > -----Original Message-----
> > > > From: Hu, Jiayu [mailto:jiayu.hu@intel.com]
> > > > Sent: Monday, April 23, 2018 7:14 AM
> > > > To: Ophir Munk <ophirmu@mellanox.com>; dev@dpdk.org; Ananyev,
> > > > Konstantin <konstantin.ananyev@intel.com>
> > > > Cc: Thomas Monjalon <thomas@monjalon.net>; Olga Shern
> > > > <olgas@mellanox.com>; Pascal Mazon <pascal.mazon@6wind.com>;
> > > > stable@dpdk.org
> > > > Subject: RE: [PATCH v1] gso: fix marking TCP checksum flag in TCP
> > > > segments
> > > >
> > > > Hi Ophir,
> > > >
> > > > In the GSO design, the GSO library doesn't care about checksums,
> > > > which means it doesn't check if input packets have correct
> > > > checksums, and it doesn't do any checksum related work for the
> > > > output GSO segments. It depends on the callers to use HW or SW
> > > > checksum calculation for output packets. This is why the GSO library
> > > > doesn't set PKT_TX_TCP_CKSUM. So I don't think it's a bug.
> > > >
> > >
> > > Can you please reconsider this design? I think the GSO library should
> > > imitate the HW behavior where TCP segments checksum is automatically
> > > calculated without explicitly requesting it. I am not saying that GSO library
> > itself should calculate the checksums - but at least it should mark each
> > segment as requiring this calculation.
> >
> > But gso has no idea how this packet will be processed after it.
> 
> GSO shouldn't know. It should only mark the fact that a new TCP segment was created without a TCP checksum.
> 

Ok, but new IP header was also created. And might be outer ip/udp (in case of tunnel).
If we go that way we'll have to set flags for each them.

> > Caller can choose to calculate L3/L4 cksum in SW or might be going to use
> > HW offloads.
> 
> Assuming TSO is configured then you suggest that TAP PMD will mark by itself the TCP_CKSUM flag for all packets returned from GSO
> library?

Yes.

> 
> > In later case nothing stops the caller to update mbuf->ol_flags in a way he
> > likes (TCP_CKSUM, IP_CKSUM, etc.).
> > Konstantin
> >
> 
> Please note that TCP_SEG flag is cleared by GSO library in 2 different cases:
> 1. Packet length equals to or is bigger than GSO size. In this case new TCP segments are created with no TCP checksum.
> 2. Packet length is smaller than GSO size. In this case no TCP segmentation is required. The original packet is returned and its existing TCP
> checksum is OK.
> 
> In the latter case TAP PMD will always calculate TCP checksum in SW (performance concerns) where this could have been saved.
> I am thinking of a practical scenario where TSO is configured but all packets are smaller than GSO size, however TAP PMD unnecessarily
> recalculates their checksums.
> 
> How do you suggest to avoid this scenario?

Probably something like that:

rc = rte_gso_segment(pkt_in, gso_ctx, pkts_out, nb_pkts_out);
if (rc == 1 && pkt_in == pkts_out[0] == pkt_in) {
     /* no gso was performed */
} else {
   /* new packets, update ol_flags if needed */
}

?

Another possibility - might be make chages in librte_gso to allow user to
specify what flags to set for the output packets (probably via  rte_gso_ctx.flag) 

Konstantin

> >
> > > > In my opinion, it's not a good idea to enable HW TCP checksum
> > > > calculation silently, and without the aware of the caller. In fact,
> > > > the caller always know it does SW TSO (i.e. GSO), instead of real HW TSO.
> > >
> > > This is not correct. Consider net_failsafe with 2 sub-devices: one is
> > > a HW PCI device, the other one is a SW TAP device. Failsafe must work
> > transparently with these two sub-devices and the caller cannot tell if TSO is
> > done in SW or HW.
> > >
> > > > If the caller wants HW
> > > > checksum calculation, it can add PKT_TX_TCP_CKSUM to ol_flags before
> > > > or after calling the GSO library.
> > > >
> > >
> > > FYI - TAP TSO patches were submitted to dpdk.org mailing list. These
> > patches use the GSO library.
> > >
> > https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdpd
> > >
> > k.org%2Fdev%2Fpatchwork%2Fpatch%2F38666%2F&data=02%7C01%7Coph
> > irmu%40me
> > >
> > llanox.com%7C7455c8e31c7a4364bc7108d5a9d20008%7Ca652971c7d2e4d
> > 9ba6a4d1
> > >
> > 49256f461b%7C0%7C0%7C636601641779974217&sdata=CF7EvhXG%2BrH%
> > 2BPiQEbvM0
> > > mC%2FSpqobneKaoV03j5VrSDw%3D&reserved=0
> > >
> > https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdpd
> > >
> > k.org%2Fdev%2Fpatchwork%2Fpatch%2F38667%2F&data=02%7C01%7Coph
> > irmu%40me
> > >
> > llanox.com%7C7455c8e31c7a4364bc7108d5a9d20008%7Ca652971c7d2e4d
> > 9ba6a4d1
> > >
> > 49256f461b%7C0%7C0%7C636601641779974217&sdata=j9WVIj%2FKq6EN
> > WXu3mr5By1
> > > toSowU8bqJRGZ19SxiGoI%3D&reserved=0
> > >
> > > Running testpmd with TAP TSO is currently broken without the suggested
> > librte_gso patch.
> > > Please note testpmd implementation (app/test-pmd/csumonly.c
> > > b/app/test-pmd/csumonly.c) in case *both* TSO and TCP CKSUM are
> > > configured:
> > >
> > >   if (tso_segsz)
> > >       ol_flags |= PKT_TX_TCP_SEG;    // *** if TSO is applicable - the packet
> > flags are only marked with PKT_TX_TCP_SEG and no
> > > PKT_TX_TCP_CKSUM ***
> > >    else if (tx_offloads & DEV_TX_OFFLOAD_TCP_CKSUM)
> > >        ol_flags |= PKT_TX_TCP_CKSUM;     // *** PKT_TX_TCP_CKSUM is
> > marked only if TSO is not applicable ***
> > >   else {
> > >       tcp_hdr->cksum =
> > >          get_udptcp_checksum(l3_hdr, tcp_hdr,
> > >
> > > In other words - testpmd does not set TCP_CKSUM along with TCP_SEG
> > > therefore using testpmd with TAP/TSO will result in TCP segments with 0
> > (incorrect) TCP checksums.
> > >
> > > In addition - please note the comments in lib/librte_mbuf/rte_mbuf.h
> > > which specify that PKT_TX_TCP_SEG flag implies the PKT_TX_TCP_CKSUM
> > > (hence it is not required to be explicitly set by the caller)
> > >
> > > /**
> > > * TCP segmentation offload. To enable this offload feature for a
> > > * packet to be transmitted on hardware supporting TSO:
> > > *  - set the PKT_TX_TCP_SEG flag in mbuf->ol_flags (this flag implies
> > > *    PKT_TX_TCP_CKSUM)
> > > ...
> > >
> > > > Add Konstantin for more suggestions.
> > > >
> > > > Thanks,
> > > > Jiayu
> > > >
> > > > > -----Original Message-----
> > > > > From: Ophir Munk [mailto:ophirmu@mellanox.com]
> > > > > Sent: Sunday, April 22, 2018 10:21 PM
> > > > > To: dev@dpdk.org; Hu, Jiayu <jiayu.hu@intel.com>
> > > > > Cc: Thomas Monjalon <thomas@monjalon.net>; Olga Shern
> > > > > <olgas@mellanox.com>; Pascal Mazon <pascal.mazon@6wind.com>;
> > > > Ophir
> > > > > Munk <ophirmu@mellanox.com>; stable@dpdk.org
> > > > > Subject: [PATCH v1] gso: fix marking TCP checksum flag in TCP
> > > > > segments
> > > > >
> > > > > Large TCP packets which are marked with PKT_TX_TCP_SEG flag are
> > > > > segmented and the flag is cleared in the resulting segments,
> > > > > however, the segments checksum is not updated. It is therefore
> > > > > required to set the PKT_TX_TCP_CKSUM flag in each TCP segment in
> > > > > order to mark for the sending driver the need to update the TCP
> > > > > checksum before transmitting the segment.
> > > > >
> > > > > Fixes: 119583797b6a ("gso: support TCP/IPv4 GSO")
> > > > > Cc: stable@dpdk.org
> > > > >
> > > > > Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
> > > > > ---
> > > > >  lib/librte_gso/rte_gso.c | 2 ++
> > > > >  1 file changed, 2 insertions(+)
> > > > >
> > > > > diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
> > > > > index a44e3d4..e9ce9ce 100644
> > > > > --- a/lib/librte_gso/rte_gso.c
> > > > > +++ b/lib/librte_gso/rte_gso.c
> > > > > @@ -50,12 +50,14 @@ rte_gso_segment(struct rte_mbuf *pkt,
> > > > >  			((IS_IPV4_GRE_TCP4(pkt->ol_flags) &&
> > > > >  			 (gso_ctx->gso_types &
> > > > > DEV_TX_OFFLOAD_GRE_TNL_TSO)))) {
> > > > >  		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
> > > > > +		pkt->ol_flags |= PKT_TX_TCP_CKSUM;
> > > > >  		ret = gso_tunnel_tcp4_segment(pkt, gso_size, ipid_delta,
> > > > >  				direct_pool, indirect_pool,
> > > > >  				pkts_out, nb_pkts_out);
> > > > >  	} else if (IS_IPV4_TCP(pkt->ol_flags) &&
> > > > >  			(gso_ctx->gso_types &
> > > > > DEV_TX_OFFLOAD_TCP_TSO)) {
> > > > >  		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
> > > > > +		pkt->ol_flags |= PKT_TX_TCP_CKSUM;
> > > > >  		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
> > > > >  				direct_pool, indirect_pool,
> > > > >  				pkts_out, nb_pkts_out);
> > > > > --
> > > > > 2.7.4
  
Hu, Jiayu April 24, 2018, 12:55 p.m. UTC | #7
Hi Konstantin and Ophir,

> -----Original Message-----
> From: Ananyev, Konstantin
> Sent: Tuesday, April 24, 2018 8:31 PM
> To: Ophir Munk <ophirmu@mellanox.com>; Hu, Jiayu <jiayu.hu@intel.com>;
> dev@dpdk.org
> Cc: Thomas Monjalon <thomas@monjalon.net>; Olga Shern
> <olgas@mellanox.com>; Pascal Mazon <pascal.mazon@6wind.com>;
> stable@dpdk.org
> Subject: RE: [PATCH v1] gso: fix marking TCP checksum flag in TCP
> segments
> 
> 
> Hi Ophir,
> 
> >
> > Hi Konstantin,
> > Please see inline
> >
> > > -----Original Message-----
> > > From: Ananyev, Konstantin [mailto:konstantin.ananyev@intel.com]
> > > Sent: Tuesday, April 24, 2018 1:56 PM
> > > To: Ophir Munk <ophirmu@mellanox.com>; Hu, Jiayu
> <jiayu.hu@intel.com>;
> > > dev@dpdk.org
> > > Cc: Thomas Monjalon <thomas@monjalon.net>; Olga Shern
> > > <olgas@mellanox.com>; Pascal Mazon <pascal.mazon@6wind.com>;
> > > stable@dpdk.org
> > > Subject: RE: [PATCH v1] gso: fix marking TCP checksum flag in TCP
> segments
> > >
> > >
> > >
> > > > -----Original Message-----
> > > > From: Ophir Munk [mailto:ophirmu@mellanox.com]
> > > > Sent: Tuesday, April 24, 2018 10:44 AM
> > > > To: Hu, Jiayu <jiayu.hu@intel.com>; dev@dpdk.org; Ananyev,
> Konstantin
> > > > <konstantin.ananyev@intel.com>
> > > > Cc: Thomas Monjalon <thomas@monjalon.net>; Olga Shern
> > > > <olgas@mellanox.com>; Pascal Mazon <pascal.mazon@6wind.com>;
> > > > stable@dpdk.org
> > > > Subject: RE: [PATCH v1] gso: fix marking TCP checksum flag in TCP
> > > > segments
> > > >
> > > > Hi Jiayu,
> > > > Please find comments inline
> > > >
> > > > > -----Original Message-----
> > > > > From: Hu, Jiayu [mailto:jiayu.hu@intel.com]
> > > > > Sent: Monday, April 23, 2018 7:14 AM
> > > > > To: Ophir Munk <ophirmu@mellanox.com>; dev@dpdk.org;
> Ananyev,
> > > > > Konstantin <konstantin.ananyev@intel.com>
> > > > > Cc: Thomas Monjalon <thomas@monjalon.net>; Olga Shern
> > > > > <olgas@mellanox.com>; Pascal Mazon
> <pascal.mazon@6wind.com>;
> > > > > stable@dpdk.org
> > > > > Subject: RE: [PATCH v1] gso: fix marking TCP checksum flag in TCP
> > > > > segments
> > > > >
> > > > > Hi Ophir,
> > > > >
> > > > > In the GSO design, the GSO library doesn't care about checksums,
> > > > > which means it doesn't check if input packets have correct
> > > > > checksums, and it doesn't do any checksum related work for the
> > > > > output GSO segments. It depends on the callers to use HW or SW
> > > > > checksum calculation for output packets. This is why the GSO library
> > > > > doesn't set PKT_TX_TCP_CKSUM. So I don't think it's a bug.
> > > > >
> > > >
> > > > Can you please reconsider this design? I think the GSO library should
> > > > imitate the HW behavior where TCP segments checksum is
> automatically
> > > > calculated without explicitly requesting it. I am not saying that GSO
> library
> > > itself should calculate the checksums - but at least it should mark each
> > > segment as requiring this calculation.
> > >
> > > But gso has no idea how this packet will be processed after it.
> >
> > GSO shouldn't know. It should only mark the fact that a new TCP segment
> was created without a TCP checksum.
> >
> 
> Ok, but new IP header was also created. And might be outer ip/udp (in case
> of tunnel).
> If we go that way we'll have to set flags for each them.
> 
> > > Caller can choose to calculate L3/L4 cksum in SW or might be going to
> use
> > > HW offloads.
> >
> > Assuming TSO is configured then you suggest that TAP PMD will mark by
> itself the TCP_CKSUM flag for all packets returned from GSO
> > library?
> 
> Yes.
> 
> >
> > > In later case nothing stops the caller to update mbuf->ol_flags in a way
> he
> > > likes (TCP_CKSUM, IP_CKSUM, etc.).
> > > Konstantin
> > >
> >
> > Please note that TCP_SEG flag is cleared by GSO library in 2 different cases:
> > 1. Packet length equals to or is bigger than GSO size. In this case new TCP
> segments are created with no TCP checksum.
> > 2. Packet length is smaller than GSO size. In this case no TCP
> segmentation is required. The original packet is returned and its existing
> TCP
> > checksum is OK.
> >
> > In the latter case TAP PMD will always calculate TCP checksum in SW
> (performance concerns) where this could have been saved.
> > I am thinking of a practical scenario where TSO is configured but all
> packets are smaller than GSO size, however TAP PMD unnecessarily
> > recalculates their checksums.
> >
> > How do you suggest to avoid this scenario?
> 
> Probably something like that:
> 
> rc = rte_gso_segment(pkt_in, gso_ctx, pkts_out, nb_pkts_out);
> if (rc == 1 && pkt_in == pkts_out[0] == pkt_in) {
>      /* no gso was performed */
> } else {
>    /* new packets, update ol_flags if needed */
> }
> 
> ?
> 
> Another possibility - might be make chages in librte_gso to allow user to
> specify what flags to set for the output packets (probably via
> rte_gso_ctx.flag)

It would be a solution. We can add flags to rte_gso_ctx.flag and the GSO library
can do some extra work (like checksum) according to the flags.

Thanks,
Jiayu
> 
> Konstantin
> 
> > >
> > > > > In my opinion, it's not a good idea to enable HW TCP checksum
> > > > > calculation silently, and without the aware of the caller. In fact,
> > > > > the caller always know it does SW TSO (i.e. GSO), instead of real HW
> TSO.
> > > >
> > > > This is not correct. Consider net_failsafe with 2 sub-devices: one is
> > > > a HW PCI device, the other one is a SW TAP device. Failsafe must work
> > > transparently with these two sub-devices and the caller cannot tell if
> TSO is
> > > done in SW or HW.
> > > >
> > > > > If the caller wants HW
> > > > > checksum calculation, it can add PKT_TX_TCP_CKSUM to ol_flags
> before
> > > > > or after calling the GSO library.
> > > > >
> > > >
> > > > FYI - TAP TSO patches were submitted to dpdk.org mailing list. These
> > > patches use the GSO library.
> > > >
> > >
> https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdp
> d
> > > >
> > >
> k.org%2Fdev%2Fpatchwork%2Fpatch%2F38666%2F&data=02%7C01%7Cop
> h
> > > irmu%40me
> > > >
> > >
> llanox.com%7C7455c8e31c7a4364bc7108d5a9d20008%7Ca652971c7d2e4
> d
> > > 9ba6a4d1
> > > >
> > >
> 49256f461b%7C0%7C0%7C636601641779974217&sdata=CF7EvhXG%2BrH%
> > > 2BPiQEbvM0
> > > > mC%2FSpqobneKaoV03j5VrSDw%3D&reserved=0
> > > >
> > >
> https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdp
> d
> > > >
> > >
> k.org%2Fdev%2Fpatchwork%2Fpatch%2F38667%2F&data=02%7C01%7Cop
> h
> > > irmu%40me
> > > >
> > >
> llanox.com%7C7455c8e31c7a4364bc7108d5a9d20008%7Ca652971c7d2e4
> d
> > > 9ba6a4d1
> > > >
> > >
> 49256f461b%7C0%7C0%7C636601641779974217&sdata=j9WVIj%2FKq6EN
> > > WXu3mr5By1
> > > > toSowU8bqJRGZ19SxiGoI%3D&reserved=0
> > > >
> > > > Running testpmd with TAP TSO is currently broken without the
> suggested
> > > librte_gso patch.
> > > > Please note testpmd implementation (app/test-pmd/csumonly.c
> > > > b/app/test-pmd/csumonly.c) in case *both* TSO and TCP CKSUM are
> > > > configured:
> > > >
> > > >   if (tso_segsz)
> > > >       ol_flags |= PKT_TX_TCP_SEG;    // *** if TSO is applicable - the
> packet
> > > flags are only marked with PKT_TX_TCP_SEG and no
> > > > PKT_TX_TCP_CKSUM ***
> > > >    else if (tx_offloads & DEV_TX_OFFLOAD_TCP_CKSUM)
> > > >        ol_flags |= PKT_TX_TCP_CKSUM;     // *** PKT_TX_TCP_CKSUM is
> > > marked only if TSO is not applicable ***
> > > >   else {
> > > >       tcp_hdr->cksum =
> > > >          get_udptcp_checksum(l3_hdr, tcp_hdr,
> > > >
> > > > In other words - testpmd does not set TCP_CKSUM along with
> TCP_SEG
> > > > therefore using testpmd with TAP/TSO will result in TCP segments with
> 0
> > > (incorrect) TCP checksums.
> > > >
> > > > In addition - please note the comments in lib/librte_mbuf/rte_mbuf.h
> > > > which specify that PKT_TX_TCP_SEG flag implies the
> PKT_TX_TCP_CKSUM
> > > > (hence it is not required to be explicitly set by the caller)
> > > >
> > > > /**
> > > > * TCP segmentation offload. To enable this offload feature for a
> > > > * packet to be transmitted on hardware supporting TSO:
> > > > *  - set the PKT_TX_TCP_SEG flag in mbuf->ol_flags (this flag implies
> > > > *    PKT_TX_TCP_CKSUM)
> > > > ...
> > > >
> > > > > Add Konstantin for more suggestions.
> > > > >
> > > > > Thanks,
> > > > > Jiayu
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Ophir Munk [mailto:ophirmu@mellanox.com]
> > > > > > Sent: Sunday, April 22, 2018 10:21 PM
> > > > > > To: dev@dpdk.org; Hu, Jiayu <jiayu.hu@intel.com>
> > > > > > Cc: Thomas Monjalon <thomas@monjalon.net>; Olga Shern
> > > > > > <olgas@mellanox.com>; Pascal Mazon
> <pascal.mazon@6wind.com>;
> > > > > Ophir
> > > > > > Munk <ophirmu@mellanox.com>; stable@dpdk.org
> > > > > > Subject: [PATCH v1] gso: fix marking TCP checksum flag in TCP
> > > > > > segments
> > > > > >
> > > > > > Large TCP packets which are marked with PKT_TX_TCP_SEG flag
> are
> > > > > > segmented and the flag is cleared in the resulting segments,
> > > > > > however, the segments checksum is not updated. It is therefore
> > > > > > required to set the PKT_TX_TCP_CKSUM flag in each TCP segment
> in
> > > > > > order to mark for the sending driver the need to update the TCP
> > > > > > checksum before transmitting the segment.
> > > > > >
> > > > > > Fixes: 119583797b6a ("gso: support TCP/IPv4 GSO")
> > > > > > Cc: stable@dpdk.org
> > > > > >
> > > > > > Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
> > > > > > ---
> > > > > >  lib/librte_gso/rte_gso.c | 2 ++
> > > > > >  1 file changed, 2 insertions(+)
> > > > > >
> > > > > > diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
> > > > > > index a44e3d4..e9ce9ce 100644
> > > > > > --- a/lib/librte_gso/rte_gso.c
> > > > > > +++ b/lib/librte_gso/rte_gso.c
> > > > > > @@ -50,12 +50,14 @@ rte_gso_segment(struct rte_mbuf *pkt,
> > > > > >  			((IS_IPV4_GRE_TCP4(pkt->ol_flags) &&
> > > > > >  			 (gso_ctx->gso_types &
> > > > > > DEV_TX_OFFLOAD_GRE_TNL_TSO)))) {
> > > > > >  		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
> > > > > > +		pkt->ol_flags |= PKT_TX_TCP_CKSUM;
> > > > > >  		ret = gso_tunnel_tcp4_segment(pkt, gso_size,
> ipid_delta,
> > > > > >  				direct_pool, indirect_pool,
> > > > > >  				pkts_out, nb_pkts_out);
> > > > > >  	} else if (IS_IPV4_TCP(pkt->ol_flags) &&
> > > > > >  			(gso_ctx->gso_types &
> > > > > > DEV_TX_OFFLOAD_TCP_TSO)) {
> > > > > >  		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
> > > > > > +		pkt->ol_flags |= PKT_TX_TCP_CKSUM;
> > > > > >  		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
> > > > > >  				direct_pool, indirect_pool,
> > > > > >  				pkts_out, nb_pkts_out);
> > > > > > --
> > > > > > 2.7.4
  
Ophir Munk April 24, 2018, 1:41 p.m. UTC | #8
Hi Konstantin,

> -----Original Message-----
> From: Ananyev, Konstantin [mailto:konstantin.ananyev@intel.com]
> Sent: Tuesday, April 24, 2018 3:31 PM
> To: Ophir Munk <ophirmu@mellanox.com>; Hu, Jiayu <jiayu.hu@intel.com>;
> dev@dpdk.org
> Cc: Thomas Monjalon <thomas@monjalon.net>; Olga Shern
> <olgas@mellanox.com>; Pascal Mazon <pascal.mazon@6wind.com>;
> stable@dpdk.org
> Subject: RE: [PATCH v1] gso: fix marking TCP checksum flag in TCP segments
> 
> 
> Hi Ophir,
> 
> >
> > Hi Konstantin,
> > Please see inline
> >
> > > -----Original Message-----
> > > From: Ananyev, Konstantin [mailto:konstantin.ananyev@intel.com]
> > > Sent: Tuesday, April 24, 2018 1:56 PM
> > > To: Ophir Munk <ophirmu@mellanox.com>; Hu, Jiayu
> > > <jiayu.hu@intel.com>; dev@dpdk.org
> > > Cc: Thomas Monjalon <thomas@monjalon.net>; Olga Shern
> > > <olgas@mellanox.com>; Pascal Mazon <pascal.mazon@6wind.com>;
> > > stable@dpdk.org
> > > Subject: RE: [PATCH v1] gso: fix marking TCP checksum flag in TCP
> > > segments
> > >
> > >
> > >
> > > > -----Original Message-----
> > > > From: Ophir Munk [mailto:ophirmu@mellanox.com]
> > > > Sent: Tuesday, April 24, 2018 10:44 AM
> > > > To: Hu, Jiayu <jiayu.hu@intel.com>; dev@dpdk.org; Ananyev,
> > > > Konstantin <konstantin.ananyev@intel.com>
> > > > Cc: Thomas Monjalon <thomas@monjalon.net>; Olga Shern
> > > > <olgas@mellanox.com>; Pascal Mazon <pascal.mazon@6wind.com>;
> > > > stable@dpdk.org
> > > > Subject: RE: [PATCH v1] gso: fix marking TCP checksum flag in TCP
> > > > segments
> > > >
> > > > Hi Jiayu,
> > > > Please find comments inline
> > > >
> > > > > -----Original Message-----
> > > > > From: Hu, Jiayu [mailto:jiayu.hu@intel.com]
> > > > > Sent: Monday, April 23, 2018 7:14 AM
> > > > > To: Ophir Munk <ophirmu@mellanox.com>; dev@dpdk.org; Ananyev,
> > > > > Konstantin <konstantin.ananyev@intel.com>
> > > > > Cc: Thomas Monjalon <thomas@monjalon.net>; Olga Shern
> > > > > <olgas@mellanox.com>; Pascal Mazon <pascal.mazon@6wind.com>;
> > > > > stable@dpdk.org
> > > > > Subject: RE: [PATCH v1] gso: fix marking TCP checksum flag in
> > > > > TCP segments
> > > > >
> > > > > Hi Ophir,
> > > > >
> > > > > In the GSO design, the GSO library doesn't care about checksums,
> > > > > which means it doesn't check if input packets have correct
> > > > > checksums, and it doesn't do any checksum related work for the
> > > > > output GSO segments. It depends on the callers to use HW or SW
> > > > > checksum calculation for output packets. This is why the GSO
> > > > > library doesn't set PKT_TX_TCP_CKSUM. So I don't think it's a bug.
> > > > >
> > > >
> > > > Can you please reconsider this design? I think the GSO library
> > > > should imitate the HW behavior where TCP segments checksum is
> > > > automatically calculated without explicitly requesting it. I am
> > > > not saying that GSO library
> > > itself should calculate the checksums - but at least it should mark
> > > each segment as requiring this calculation.
> > >
> > > But gso has no idea how this packet will be processed after it.
> >
> > GSO shouldn't know. It should only mark the fact that a new TCP segment
> was created without a TCP checksum.
> >
> 
> Ok, but new IP header was also created. And might be outer ip/udp (in case
> of tunnel).
> If we go that way we'll have to set flags for each them.
> 

No. The application should set the PKT_TX_IP_CKSUM or other flags which will be forwarded by TAP PMD to GSO library which will clone them in the new TCP segments. The only concern is about PKT_TX_TCP_SEG versus PTK_TX_TCP_CKSUM flags. 
By dpdk design the PKT_TX_TCP_SEGS and PKT_TX_TCP_CKSUM flags are mutual exclusive. They do not co-exist, but they are replaceable. You can verify it in testpmd implementation, other PMDs and code documentation (please note previous code snippets examples in this thread).

> > > Caller can choose to calculate L3/L4 cksum in SW or might be going
> > > to use HW offloads.
> >
> > Assuming TSO is configured then you suggest that TAP PMD will mark by
> > itself the TCP_CKSUM flag for all packets returned from GSO library?
> 
> Yes.
> 
> >
> > > In later case nothing stops the caller to update mbuf->ol_flags in a
> > > way he likes (TCP_CKSUM, IP_CKSUM, etc.).
> > > Konstantin
> > >
> >
> > Please note that TCP_SEG flag is cleared by GSO library in 2 different cases:
> > 1. Packet length equals to or is bigger than GSO size. In this case new TCP
> segments are created with no TCP checksum.
> > 2. Packet length is smaller than GSO size. In this case no TCP
> > segmentation is required. The original packet is returned and its existing
> TCP checksum is OK.
> >
> > In the latter case TAP PMD will always calculate TCP checksum in SW
> (performance concerns) where this could have been saved.
> > I am thinking of a practical scenario where TSO is configured but all
> > packets are smaller than GSO size, however TAP PMD unnecessarily
> recalculates their checksums.
> >
> > How do you suggest to avoid this scenario?
> 
> Probably something like that:
> 
> rc = rte_gso_segment(pkt_in, gso_ctx, pkts_out, nb_pkts_out); if (rc == 1 &&
> pkt_in == pkts_out[0] == pkt_in) {
>      /* no gso was performed */
> } else {
>    /* new packets, update ol_flags if needed */ }
> 
> ?
> 

Regarding the check: "pkts_out[0] == pkt_in " -  I would prefer an "API approach" rather than relying on internal code specifics.

> Another possibility - might be make chages in librte_gso to allow user to
> specify what flags to set for the output packets (probably via
> rte_gso_ctx.flag)
> 

I will gladly review any new enhancement. However, for 18.05, can we please accept this patch as a practical solution for now?

> Konstantin
> 
> > >
> > > > > In my opinion, it's not a good idea to enable HW TCP checksum
> > > > > calculation silently, and without the aware of the caller. In
> > > > > fact, the caller always know it does SW TSO (i.e. GSO), instead of real
> HW TSO.
> > > >
> > > > This is not correct. Consider net_failsafe with 2 sub-devices: one
> > > > is a HW PCI device, the other one is a SW TAP device. Failsafe
> > > > must work
> > > transparently with these two sub-devices and the caller cannot tell
> > > if TSO is done in SW or HW.
> > > >
> > > > > If the caller wants HW
> > > > > checksum calculation, it can add PKT_TX_TCP_CKSUM to ol_flags
> > > > > before or after calling the GSO library.
> > > > >
> > > >
> > > > FYI - TAP TSO patches were submitted to dpdk.org mailing list.
> > > > These
> > > patches use the GSO library.
> > > >
> > >
> https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fd
> > > pd
> > > >
> > >
> k.org%2Fdev%2Fpatchwork%2Fpatch%2F38666%2F&data=02%7C01%7Coph
> > > irmu%40me
> > > >
> > >
> llanox.com%7C7455c8e31c7a4364bc7108d5a9d20008%7Ca652971c7d2e4d
> > > 9ba6a4d1
> > > >
> > >
> 49256f461b%7C0%7C0%7C636601641779974217&sdata=CF7EvhXG%2BrH%
> > > 2BPiQEbvM0
> > > > mC%2FSpqobneKaoV03j5VrSDw%3D&reserved=0
> > > >
> > >
> https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fd
> > > pd
> > > >
> > >
> k.org%2Fdev%2Fpatchwork%2Fpatch%2F38667%2F&data=02%7C01%7Coph
> > > irmu%40me
> > > >
> > >
> llanox.com%7C7455c8e31c7a4364bc7108d5a9d20008%7Ca652971c7d2e4d
> > > 9ba6a4d1
> > > >
> > >
> 49256f461b%7C0%7C0%7C636601641779974217&sdata=j9WVIj%2FKq6EN
> > > WXu3mr5By1
> > > > toSowU8bqJRGZ19SxiGoI%3D&reserved=0
> > > >
> > > > Running testpmd with TAP TSO is currently broken without the
> > > > suggested
> > > librte_gso patch.
> > > > Please note testpmd implementation (app/test-pmd/csumonly.c
> > > > b/app/test-pmd/csumonly.c) in case *both* TSO and TCP CKSUM are
> > > > configured:
> > > >
> > > >   if (tso_segsz)
> > > >       ol_flags |= PKT_TX_TCP_SEG;    // *** if TSO is applicable - the
> packet
> > > flags are only marked with PKT_TX_TCP_SEG and no
> > > > PKT_TX_TCP_CKSUM ***
> > > >    else if (tx_offloads & DEV_TX_OFFLOAD_TCP_CKSUM)
> > > >        ol_flags |= PKT_TX_TCP_CKSUM;     // *** PKT_TX_TCP_CKSUM is
> > > marked only if TSO is not applicable ***
> > > >   else {
> > > >       tcp_hdr->cksum =
> > > >          get_udptcp_checksum(l3_hdr, tcp_hdr,
> > > >
> > > > In other words - testpmd does not set TCP_CKSUM along with TCP_SEG
> > > > therefore using testpmd with TAP/TSO will result in TCP segments
> > > > with 0
> > > (incorrect) TCP checksums.
> > > >
> > > > In addition - please note the comments in
> > > > lib/librte_mbuf/rte_mbuf.h which specify that PKT_TX_TCP_SEG flag
> > > > implies the PKT_TX_TCP_CKSUM (hence it is not required to be
> > > > explicitly set by the caller)
> > > >
> > > > /**
> > > > * TCP segmentation offload. To enable this offload feature for a
> > > > * packet to be transmitted on hardware supporting TSO:
> > > > *  - set the PKT_TX_TCP_SEG flag in mbuf->ol_flags (this flag implies
> > > > *    PKT_TX_TCP_CKSUM)
> > > > ...
> > > >
> > > > > Add Konstantin for more suggestions.
> > > > >
> > > > > Thanks,
> > > > > Jiayu
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Ophir Munk [mailto:ophirmu@mellanox.com]
> > > > > > Sent: Sunday, April 22, 2018 10:21 PM
> > > > > > To: dev@dpdk.org; Hu, Jiayu <jiayu.hu@intel.com>
> > > > > > Cc: Thomas Monjalon <thomas@monjalon.net>; Olga Shern
> > > > > > <olgas@mellanox.com>; Pascal Mazon
> <pascal.mazon@6wind.com>;
> > > > > Ophir
> > > > > > Munk <ophirmu@mellanox.com>; stable@dpdk.org
> > > > > > Subject: [PATCH v1] gso: fix marking TCP checksum flag in TCP
> > > > > > segments
> > > > > >
> > > > > > Large TCP packets which are marked with PKT_TX_TCP_SEG flag
> > > > > > are segmented and the flag is cleared in the resulting
> > > > > > segments, however, the segments checksum is not updated. It is
> > > > > > therefore required to set the PKT_TX_TCP_CKSUM flag in each
> > > > > > TCP segment in order to mark for the sending driver the need
> > > > > > to update the TCP checksum before transmitting the segment.
> > > > > >
> > > > > > Fixes: 119583797b6a ("gso: support TCP/IPv4 GSO")
> > > > > > Cc: stable@dpdk.org
> > > > > >
> > > > > > Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
> > > > > > ---
> > > > > >  lib/librte_gso/rte_gso.c | 2 ++
> > > > > >  1 file changed, 2 insertions(+)
> > > > > >
> > > > > > diff --git a/lib/librte_gso/rte_gso.c
> > > > > > b/lib/librte_gso/rte_gso.c index a44e3d4..e9ce9ce 100644
> > > > > > --- a/lib/librte_gso/rte_gso.c
> > > > > > +++ b/lib/librte_gso/rte_gso.c
> > > > > > @@ -50,12 +50,14 @@ rte_gso_segment(struct rte_mbuf *pkt,
> > > > > >  			((IS_IPV4_GRE_TCP4(pkt->ol_flags) &&
> > > > > >  			 (gso_ctx->gso_types &
> > > > > > DEV_TX_OFFLOAD_GRE_TNL_TSO)))) {
> > > > > >  		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
> > > > > > +		pkt->ol_flags |= PKT_TX_TCP_CKSUM;
> > > > > >  		ret = gso_tunnel_tcp4_segment(pkt, gso_size,
> ipid_delta,
> > > > > >  				direct_pool, indirect_pool,
> > > > > >  				pkts_out, nb_pkts_out);
> > > > > >  	} else if (IS_IPV4_TCP(pkt->ol_flags) &&
> > > > > >  			(gso_ctx->gso_types &
> > > > > > DEV_TX_OFFLOAD_TCP_TSO)) {
> > > > > >  		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
> > > > > > +		pkt->ol_flags |= PKT_TX_TCP_CKSUM;
> > > > > >  		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
> > > > > >  				direct_pool, indirect_pool,
> > > > > >  				pkts_out, nb_pkts_out);
> > > > > > --
> > > > > > 2.7.4
  
Ophir Munk April 24, 2018, 1:53 p.m. UTC | #9
Hi Jiayu,

> -----Original Message-----
> From: Hu, Jiayu [mailto:jiayu.hu@intel.com]
> Sent: Tuesday, April 24, 2018 3:56 PM
> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Ophir Munk
> <ophirmu@mellanox.com>; dev@dpdk.org
> Cc: Thomas Monjalon <thomas@monjalon.net>; Olga Shern
> <olgas@mellanox.com>; Pascal Mazon <pascal.mazon@6wind.com>;
> stable@dpdk.org
> Subject: RE: [PATCH v1] gso: fix marking TCP checksum flag in TCP segments
> 
> Hi Konstantin and Ophir,
> 
... 
> > Another possibility - might be make chages in librte_gso to allow user
> > to specify what flags to set for the output packets (probably via
> > rte_gso_ctx.flag)
> 
> It would be a solution. We can add flags to rte_gso_ctx.flag and the GSO
> library can do some extra work (like checksum) according to the flags.
> 

I think that calculating TCP checksums by GSO library should be optional (if at all). 
In my case I prefer TAP PMD to do the work which is TAP-specific.  
My original intention in this patch was only to mark PKT_TX_TCP_CKSUM flag.

> Thanks,
> Jiayu
> >
> > Konstantin
> >
...
  
Ananyev, Konstantin April 24, 2018, 2:26 p.m. UTC | #10
> -----Original Message-----
> From: Ophir Munk [mailto:ophirmu@mellanox.com]
> Sent: Tuesday, April 24, 2018 2:42 PM
> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Hu, Jiayu <jiayu.hu@intel.com>; dev@dpdk.org
> Cc: Thomas Monjalon <thomas@monjalon.net>; Olga Shern <olgas@mellanox.com>; Pascal Mazon <pascal.mazon@6wind.com>;
> stable@dpdk.org
> Subject: RE: [PATCH v1] gso: fix marking TCP checksum flag in TCP segments
> 
> Hi Konstantin,
> 
> > -----Original Message-----
> > From: Ananyev, Konstantin [mailto:konstantin.ananyev@intel.com]
> > Sent: Tuesday, April 24, 2018 3:31 PM
> > To: Ophir Munk <ophirmu@mellanox.com>; Hu, Jiayu <jiayu.hu@intel.com>;
> > dev@dpdk.org
> > Cc: Thomas Monjalon <thomas@monjalon.net>; Olga Shern
> > <olgas@mellanox.com>; Pascal Mazon <pascal.mazon@6wind.com>;
> > stable@dpdk.org
> > Subject: RE: [PATCH v1] gso: fix marking TCP checksum flag in TCP segments
> >
> >
> > Hi Ophir,
> >
> > >
> > > Hi Konstantin,
> > > Please see inline
> > >
> > > > -----Original Message-----
> > > > From: Ananyev, Konstantin [mailto:konstantin.ananyev@intel.com]
> > > > Sent: Tuesday, April 24, 2018 1:56 PM
> > > > To: Ophir Munk <ophirmu@mellanox.com>; Hu, Jiayu
> > > > <jiayu.hu@intel.com>; dev@dpdk.org
> > > > Cc: Thomas Monjalon <thomas@monjalon.net>; Olga Shern
> > > > <olgas@mellanox.com>; Pascal Mazon <pascal.mazon@6wind.com>;
> > > > stable@dpdk.org
> > > > Subject: RE: [PATCH v1] gso: fix marking TCP checksum flag in TCP
> > > > segments
> > > >
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Ophir Munk [mailto:ophirmu@mellanox.com]
> > > > > Sent: Tuesday, April 24, 2018 10:44 AM
> > > > > To: Hu, Jiayu <jiayu.hu@intel.com>; dev@dpdk.org; Ananyev,
> > > > > Konstantin <konstantin.ananyev@intel.com>
> > > > > Cc: Thomas Monjalon <thomas@monjalon.net>; Olga Shern
> > > > > <olgas@mellanox.com>; Pascal Mazon <pascal.mazon@6wind.com>;
> > > > > stable@dpdk.org
> > > > > Subject: RE: [PATCH v1] gso: fix marking TCP checksum flag in TCP
> > > > > segments
> > > > >
> > > > > Hi Jiayu,
> > > > > Please find comments inline
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Hu, Jiayu [mailto:jiayu.hu@intel.com]
> > > > > > Sent: Monday, April 23, 2018 7:14 AM
> > > > > > To: Ophir Munk <ophirmu@mellanox.com>; dev@dpdk.org; Ananyev,
> > > > > > Konstantin <konstantin.ananyev@intel.com>
> > > > > > Cc: Thomas Monjalon <thomas@monjalon.net>; Olga Shern
> > > > > > <olgas@mellanox.com>; Pascal Mazon <pascal.mazon@6wind.com>;
> > > > > > stable@dpdk.org
> > > > > > Subject: RE: [PATCH v1] gso: fix marking TCP checksum flag in
> > > > > > TCP segments
> > > > > >
> > > > > > Hi Ophir,
> > > > > >
> > > > > > In the GSO design, the GSO library doesn't care about checksums,
> > > > > > which means it doesn't check if input packets have correct
> > > > > > checksums, and it doesn't do any checksum related work for the
> > > > > > output GSO segments. It depends on the callers to use HW or SW
> > > > > > checksum calculation for output packets. This is why the GSO
> > > > > > library doesn't set PKT_TX_TCP_CKSUM. So I don't think it's a bug.
> > > > > >
> > > > >
> > > > > Can you please reconsider this design? I think the GSO library
> > > > > should imitate the HW behavior where TCP segments checksum is
> > > > > automatically calculated without explicitly requesting it. I am
> > > > > not saying that GSO library
> > > > itself should calculate the checksums - but at least it should mark
> > > > each segment as requiring this calculation.
> > > >
> > > > But gso has no idea how this packet will be processed after it.
> > >
> > > GSO shouldn't know. It should only mark the fact that a new TCP segment
> > was created without a TCP checksum.
> > >
> >
> > Ok, but new IP header was also created. And might be outer ip/udp (in case
> > of tunnel).
> > If we go that way we'll have to set flags for each them.
> >
> 
> No. The application should set the PKT_TX_IP_CKSUM or other flags which will be forwarded by TAP PMD to GSO library which will clone
> them in the new TCP segments. The only concern is about PKT_TX_TCP_SEG versus PTK_TX_TCP_CKSUM flags.
> By dpdk design the PKT_TX_TCP_SEGS and PKT_TX_TCP_CKSUM flags are mutual exclusive. They do not co-exist, but they are replaceable.
> You can verify it in testpmd implementation, other PMDs and code documentation (please note previous code snippets examples in this
> thread).

I still don't see how it differs.
Let say user has a packet with already calculated ip nd tcp cksum.
He passes it to rte_gso_segment().
If the segmentation did happen - he would need to recalculate both ip and tcp cksum
for each segment.
Same for tunneling.
So if we'll decide follow your logic - gso() would have to set all related flags.
From other side - if application doesn't have cksums calculated, and application plans to
use HW offload for TCP cksum - nothing prevents it from setting PKT_TX_TCP_CKSUM
before calling  rte_gso_segment().

> 
> > > > Caller can choose to calculate L3/L4 cksum in SW or might be going
> > > > to use HW offloads.
> > >
> > > Assuming TSO is configured then you suggest that TAP PMD will mark by
> > > itself the TCP_CKSUM flag for all packets returned from GSO library?
> >
> > Yes.
> >
> > >
> > > > In later case nothing stops the caller to update mbuf->ol_flags in a
> > > > way he likes (TCP_CKSUM, IP_CKSUM, etc.).
> > > > Konstantin
> > > >
> > >
> > > Please note that TCP_SEG flag is cleared by GSO library in 2 different cases:
> > > 1. Packet length equals to or is bigger than GSO size. In this case new TCP
> > segments are created with no TCP checksum.
> > > 2. Packet length is smaller than GSO size. In this case no TCP
> > > segmentation is required. The original packet is returned and its existing
> > TCP checksum is OK.
> > >
> > > In the latter case TAP PMD will always calculate TCP checksum in SW
> > (performance concerns) where this could have been saved.
> > > I am thinking of a practical scenario where TSO is configured but all
> > > packets are smaller than GSO size, however TAP PMD unnecessarily
> > recalculates their checksums.

Actually the question is why it happens that TAP PMD receives a packets
with TCP_SEG flag on and valid TCP checksum already calculated?
As you mentioned above TCP_SEG implies TCP_CKSUM for each segment?

> > >
> > > How do you suggest to avoid this scenario?
> >
> > Probably something like that:
> >
> > rc = rte_gso_segment(pkt_in, gso_ctx, pkts_out, nb_pkts_out); if (rc == 1 &&
> > pkt_in == pkts_out[0] == pkt_in) {
> >      /* no gso was performed */
> > } else {
> >    /* new packets, update ol_flags if needed */ }
> >
> > ?
> >
> 
> Regarding the check: "pkts_out[0] == pkt_in " -  I would prefer an "API approach" rather than relying on internal code specifics.

> 
> > Another possibility - might be make chages in librte_gso to allow user to
> > specify what flags to set for the output packets (probably via
> > rte_gso_ctx.flag)
> >
> 
> I will gladly review any new enhancement. However, for 18.05, can we please accept this patch as a practical solution for now?

I am not sure that it is ok to do things that way - we'll introduce behavior change here.
If there is no better way - it seems safer for 18.05 would be to use the workaround suggested above.

> 
> > Konstantin
> >
> > > >
> > > > > > In my opinion, it's not a good idea to enable HW TCP checksum
> > > > > > calculation silently, and without the aware of the caller. In
> > > > > > fact, the caller always know it does SW TSO (i.e. GSO), instead of real
> > HW TSO.
> > > > >
> > > > > This is not correct. Consider net_failsafe with 2 sub-devices: one
> > > > > is a HW PCI device, the other one is a SW TAP device. Failsafe
> > > > > must work
> > > > transparently with these two sub-devices and the caller cannot tell
> > > > if TSO is done in SW or HW.
> > > > >
> > > > > > If the caller wants HW
> > > > > > checksum calculation, it can add PKT_TX_TCP_CKSUM to ol_flags
> > > > > > before or after calling the GSO library.
> > > > > >
> > > > >
> > > > > FYI - TAP TSO patches were submitted to dpdk.org mailing list.
> > > > > These
> > > > patches use the GSO library.
> > > > >
> > > >
> > https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fd
> > > > pd
> > > > >
> > > >
> > k.org%2Fdev%2Fpatchwork%2Fpatch%2F38666%2F&data=02%7C01%7Coph
> > > > irmu%40me
> > > > >
> > > >
> > llanox.com%7C7455c8e31c7a4364bc7108d5a9d20008%7Ca652971c7d2e4d
> > > > 9ba6a4d1
> > > > >
> > > >
> > 49256f461b%7C0%7C0%7C636601641779974217&sdata=CF7EvhXG%2BrH%
> > > > 2BPiQEbvM0
> > > > > mC%2FSpqobneKaoV03j5VrSDw%3D&reserved=0
> > > > >
> > > >
> > https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fd
> > > > pd
> > > > >
> > > >
> > k.org%2Fdev%2Fpatchwork%2Fpatch%2F38667%2F&data=02%7C01%7Coph
> > > > irmu%40me
> > > > >
> > > >
> > llanox.com%7C7455c8e31c7a4364bc7108d5a9d20008%7Ca652971c7d2e4d
> > > > 9ba6a4d1
> > > > >
> > > >
> > 49256f461b%7C0%7C0%7C636601641779974217&sdata=j9WVIj%2FKq6EN
> > > > WXu3mr5By1
> > > > > toSowU8bqJRGZ19SxiGoI%3D&reserved=0
> > > > >
> > > > > Running testpmd with TAP TSO is currently broken without the
> > > > > suggested
> > > > librte_gso patch.
> > > > > Please note testpmd implementation (app/test-pmd/csumonly.c
> > > > > b/app/test-pmd/csumonly.c) in case *both* TSO and TCP CKSUM are
> > > > > configured:
> > > > >
> > > > >   if (tso_segsz)
> > > > >       ol_flags |= PKT_TX_TCP_SEG;    // *** if TSO is applicable - the
> > packet
> > > > flags are only marked with PKT_TX_TCP_SEG and no
> > > > > PKT_TX_TCP_CKSUM ***
> > > > >    else if (tx_offloads & DEV_TX_OFFLOAD_TCP_CKSUM)
> > > > >        ol_flags |= PKT_TX_TCP_CKSUM;     // *** PKT_TX_TCP_CKSUM is
> > > > marked only if TSO is not applicable ***
> > > > >   else {
> > > > >       tcp_hdr->cksum =
> > > > >          get_udptcp_checksum(l3_hdr, tcp_hdr,
> > > > >
> > > > > In other words - testpmd does not set TCP_CKSUM along with TCP_SEG
> > > > > therefore using testpmd with TAP/TSO will result in TCP segments
> > > > > with 0
> > > > (incorrect) TCP checksums.
> > > > >
> > > > > In addition - please note the comments in
> > > > > lib/librte_mbuf/rte_mbuf.h which specify that PKT_TX_TCP_SEG flag
> > > > > implies the PKT_TX_TCP_CKSUM (hence it is not required to be
> > > > > explicitly set by the caller)
> > > > >
> > > > > /**
> > > > > * TCP segmentation offload. To enable this offload feature for a
> > > > > * packet to be transmitted on hardware supporting TSO:
> > > > > *  - set the PKT_TX_TCP_SEG flag in mbuf->ol_flags (this flag implies
> > > > > *    PKT_TX_TCP_CKSUM)
> > > > > ...
> > > > >
> > > > > > Add Konstantin for more suggestions.
> > > > > >
> > > > > > Thanks,
> > > > > > Jiayu
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Ophir Munk [mailto:ophirmu@mellanox.com]
> > > > > > > Sent: Sunday, April 22, 2018 10:21 PM
> > > > > > > To: dev@dpdk.org; Hu, Jiayu <jiayu.hu@intel.com>
> > > > > > > Cc: Thomas Monjalon <thomas@monjalon.net>; Olga Shern
> > > > > > > <olgas@mellanox.com>; Pascal Mazon
> > <pascal.mazon@6wind.com>;
> > > > > > Ophir
> > > > > > > Munk <ophirmu@mellanox.com>; stable@dpdk.org
> > > > > > > Subject: [PATCH v1] gso: fix marking TCP checksum flag in TCP
> > > > > > > segments
> > > > > > >
> > > > > > > Large TCP packets which are marked with PKT_TX_TCP_SEG flag
> > > > > > > are segmented and the flag is cleared in the resulting
> > > > > > > segments, however, the segments checksum is not updated. It is
> > > > > > > therefore required to set the PKT_TX_TCP_CKSUM flag in each
> > > > > > > TCP segment in order to mark for the sending driver the need
> > > > > > > to update the TCP checksum before transmitting the segment.
> > > > > > >
> > > > > > > Fixes: 119583797b6a ("gso: support TCP/IPv4 GSO")
> > > > > > > Cc: stable@dpdk.org
> > > > > > >
> > > > > > > Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
> > > > > > > ---
> > > > > > >  lib/librte_gso/rte_gso.c | 2 ++
> > > > > > >  1 file changed, 2 insertions(+)
> > > > > > >
> > > > > > > diff --git a/lib/librte_gso/rte_gso.c
> > > > > > > b/lib/librte_gso/rte_gso.c index a44e3d4..e9ce9ce 100644
> > > > > > > --- a/lib/librte_gso/rte_gso.c
> > > > > > > +++ b/lib/librte_gso/rte_gso.c
> > > > > > > @@ -50,12 +50,14 @@ rte_gso_segment(struct rte_mbuf *pkt,
> > > > > > >  			((IS_IPV4_GRE_TCP4(pkt->ol_flags) &&
> > > > > > >  			 (gso_ctx->gso_types &
> > > > > > > DEV_TX_OFFLOAD_GRE_TNL_TSO)))) {
> > > > > > >  		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
> > > > > > > +		pkt->ol_flags |= PKT_TX_TCP_CKSUM;
> > > > > > >  		ret = gso_tunnel_tcp4_segment(pkt, gso_size,
> > ipid_delta,
> > > > > > >  				direct_pool, indirect_pool,
> > > > > > >  				pkts_out, nb_pkts_out);
> > > > > > >  	} else if (IS_IPV4_TCP(pkt->ol_flags) &&
> > > > > > >  			(gso_ctx->gso_types &
> > > > > > > DEV_TX_OFFLOAD_TCP_TSO)) {
> > > > > > >  		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
> > > > > > > +		pkt->ol_flags |= PKT_TX_TCP_CKSUM;
> > > > > > >  		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
> > > > > > >  				direct_pool, indirect_pool,
> > > > > > >  				pkts_out, nb_pkts_out);
> > > > > > > --
> > > > > > > 2.7.4
  
Hu, Jiayu April 25, 2018, 1:51 a.m. UTC | #11
Hi Ophir,

> -----Original Message-----
> From: Ophir Munk [mailto:ophirmu@mellanox.com]
> Sent: Tuesday, April 24, 2018 9:53 PM
> To: Hu, Jiayu <jiayu.hu@intel.com>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>; dev@dpdk.org
> Cc: Thomas Monjalon <thomas@monjalon.net>; Olga Shern
> <olgas@mellanox.com>; Pascal Mazon <pascal.mazon@6wind.com>;
> stable@dpdk.org
> Subject: RE: [PATCH v1] gso: fix marking TCP checksum flag in TCP
> segments
> 
> Hi Jiayu,
> 
> > -----Original Message-----
> > From: Hu, Jiayu [mailto:jiayu.hu@intel.com]
> > Sent: Tuesday, April 24, 2018 3:56 PM
> > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Ophir Munk
> > <ophirmu@mellanox.com>; dev@dpdk.org
> > Cc: Thomas Monjalon <thomas@monjalon.net>; Olga Shern
> > <olgas@mellanox.com>; Pascal Mazon <pascal.mazon@6wind.com>;
> > stable@dpdk.org
> > Subject: RE: [PATCH v1] gso: fix marking TCP checksum flag in TCP
> segments
> >
> > Hi Konstantin and Ophir,
> >
> ...
> > > Another possibility - might be make chages in librte_gso to allow user
> > > to specify what flags to set for the output packets (probably via
> > > rte_gso_ctx.flag)
> >
> > It would be a solution. We can add flags to rte_gso_ctx.flag and the GSO
> > library can do some extra work (like checksum) according to the flags.
> >
> 
> I think that calculating TCP checksums by GSO library should be optional (if
> at all).
> In my case I prefer TAP PMD to do the work which is TAP-specific.
> My original intention in this patch was only to mark PKT_TX_TCP_CKSUM
> flag.

One possible workaround is:
We can have many flags. Some flags are for real checksum calculation, but
some flags can be used to tell the GSO library to prepare HW TCP checksum
offloading. When preparing HW TCP checksum offloading, the GSO library can
only set TCP_CKSUM flag.

Since some the HW drivers have supported rte_eth_tx_prepare(), which
will calculate the pseudo IP header checksum, applications can call
rte_eth_tx_prepare() to do real checksum offloading preparation after calling
the GSO library.

Thanks,
Jiayu
> 
> > Thanks,
> > Jiayu
> > >
> > > Konstantin
> > >
> ...
  

Patch

diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
index a44e3d4..e9ce9ce 100644
--- a/lib/librte_gso/rte_gso.c
+++ b/lib/librte_gso/rte_gso.c
@@ -50,12 +50,14 @@  rte_gso_segment(struct rte_mbuf *pkt,
 			((IS_IPV4_GRE_TCP4(pkt->ol_flags) &&
 			 (gso_ctx->gso_types & DEV_TX_OFFLOAD_GRE_TNL_TSO)))) {
 		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
+		pkt->ol_flags |= PKT_TX_TCP_CKSUM;
 		ret = gso_tunnel_tcp4_segment(pkt, gso_size, ipid_delta,
 				direct_pool, indirect_pool,
 				pkts_out, nb_pkts_out);
 	} else if (IS_IPV4_TCP(pkt->ol_flags) &&
 			(gso_ctx->gso_types & DEV_TX_OFFLOAD_TCP_TSO)) {
 		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
+		pkt->ol_flags |= PKT_TX_TCP_CKSUM;
 		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
 				direct_pool, indirect_pool,
 				pkts_out, nb_pkts_out);