[dpdk-dev] [PATCH v1] gso: fix marking TCP checksum flag in TCP segments

Ananyev, Konstantin konstantin.ananyev at intel.com
Tue Apr 24 16:26:58 CEST 2018



> -----Original Message-----
> From: Ophir Munk [mailto:ophirmu at mellanox.com]
> Sent: Tuesday, April 24, 2018 2:42 PM
> To: Ananyev, Konstantin <konstantin.ananyev at intel.com>; Hu, Jiayu <jiayu.hu at intel.com>; dev at dpdk.org
> Cc: Thomas Monjalon <thomas at monjalon.net>; Olga Shern <olgas at mellanox.com>; Pascal Mazon <pascal.mazon at 6wind.com>;
> stable at dpdk.org
> Subject: RE: [PATCH v1] gso: fix marking TCP checksum flag in TCP segments
> 
> Hi Konstantin,
> 
> > -----Original Message-----
> > From: Ananyev, Konstantin [mailto:konstantin.ananyev at intel.com]
> > Sent: Tuesday, April 24, 2018 3:31 PM
> > To: Ophir Munk <ophirmu at mellanox.com>; Hu, Jiayu <jiayu.hu at intel.com>;
> > dev at dpdk.org
> > Cc: Thomas Monjalon <thomas at monjalon.net>; Olga Shern
> > <olgas at mellanox.com>; Pascal Mazon <pascal.mazon at 6wind.com>;
> > stable at dpdk.org
> > Subject: RE: [PATCH v1] gso: fix marking TCP checksum flag in TCP segments
> >
> >
> > Hi Ophir,
> >
> > >
> > > Hi Konstantin,
> > > Please see inline
> > >
> > > > -----Original Message-----
> > > > From: Ananyev, Konstantin [mailto:konstantin.ananyev at intel.com]
> > > > Sent: Tuesday, April 24, 2018 1:56 PM
> > > > To: Ophir Munk <ophirmu at mellanox.com>; Hu, Jiayu
> > > > <jiayu.hu at intel.com>; dev at dpdk.org
> > > > Cc: Thomas Monjalon <thomas at monjalon.net>; Olga Shern
> > > > <olgas at mellanox.com>; Pascal Mazon <pascal.mazon at 6wind.com>;
> > > > stable at dpdk.org
> > > > Subject: RE: [PATCH v1] gso: fix marking TCP checksum flag in TCP
> > > > segments
> > > >
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Ophir Munk [mailto:ophirmu at mellanox.com]
> > > > > Sent: Tuesday, April 24, 2018 10:44 AM
> > > > > To: Hu, Jiayu <jiayu.hu at intel.com>; dev at dpdk.org; Ananyev,
> > > > > Konstantin <konstantin.ananyev at intel.com>
> > > > > Cc: Thomas Monjalon <thomas at monjalon.net>; Olga Shern
> > > > > <olgas at mellanox.com>; Pascal Mazon <pascal.mazon at 6wind.com>;
> > > > > stable at dpdk.org
> > > > > Subject: RE: [PATCH v1] gso: fix marking TCP checksum flag in TCP
> > > > > segments
> > > > >
> > > > > Hi Jiayu,
> > > > > Please find comments inline
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Hu, Jiayu [mailto:jiayu.hu at intel.com]
> > > > > > Sent: Monday, April 23, 2018 7:14 AM
> > > > > > To: Ophir Munk <ophirmu at mellanox.com>; dev at dpdk.org; Ananyev,
> > > > > > Konstantin <konstantin.ananyev at intel.com>
> > > > > > Cc: Thomas Monjalon <thomas at monjalon.net>; Olga Shern
> > > > > > <olgas at mellanox.com>; Pascal Mazon <pascal.mazon at 6wind.com>;
> > > > > > stable at dpdk.org
> > > > > > Subject: RE: [PATCH v1] gso: fix marking TCP checksum flag in
> > > > > > TCP segments
> > > > > >
> > > > > > Hi Ophir,
> > > > > >
> > > > > > In the GSO design, the GSO library doesn't care about checksums,
> > > > > > which means it doesn't check if input packets have correct
> > > > > > checksums, and it doesn't do any checksum related work for the
> > > > > > output GSO segments. It depends on the callers to use HW or SW
> > > > > > checksum calculation for output packets. This is why the GSO
> > > > > > library doesn't set PKT_TX_TCP_CKSUM. So I don't think it's a bug.
> > > > > >
> > > > >
> > > > > Can you please reconsider this design? I think the GSO library
> > > > > should imitate the HW behavior where TCP segments checksum is
> > > > > automatically calculated without explicitly requesting it. I am
> > > > > not saying that GSO library
> > > > itself should calculate the checksums - but at least it should mark
> > > > each segment as requiring this calculation.
> > > >
> > > > But gso has no idea how this packet will be processed after it.
> > >
> > > GSO shouldn't know. It should only mark the fact that a new TCP segment
> > was created without a TCP checksum.
> > >
> >
> > Ok, but new IP header was also created. And might be outer ip/udp (in case
> > of tunnel).
> > If we go that way we'll have to set flags for each them.
> >
> 
> No. The application should set the PKT_TX_IP_CKSUM or other flags which will be forwarded by TAP PMD to GSO library which will clone
> them in the new TCP segments. The only concern is about PKT_TX_TCP_SEG versus PTK_TX_TCP_CKSUM flags.
> By dpdk design the PKT_TX_TCP_SEGS and PKT_TX_TCP_CKSUM flags are mutual exclusive. They do not co-exist, but they are replaceable.
> You can verify it in testpmd implementation, other PMDs and code documentation (please note previous code snippets examples in this
> thread).

I still don't see how it differs.
Let say user has a packet with already calculated ip nd tcp cksum.
He passes it to rte_gso_segment().
If the segmentation did happen - he would need to recalculate both ip and tcp cksum
for each segment.
Same for tunneling.
So if we'll decide follow your logic - gso() would have to set all related flags.
>From other side - if application doesn't have cksums calculated, and application plans to
use HW offload for TCP cksum - nothing prevents it from setting PKT_TX_TCP_CKSUM
before calling  rte_gso_segment().

> 
> > > > Caller can choose to calculate L3/L4 cksum in SW or might be going
> > > > to use HW offloads.
> > >
> > > Assuming TSO is configured then you suggest that TAP PMD will mark by
> > > itself the TCP_CKSUM flag for all packets returned from GSO library?
> >
> > Yes.
> >
> > >
> > > > In later case nothing stops the caller to update mbuf->ol_flags in a
> > > > way he likes (TCP_CKSUM, IP_CKSUM, etc.).
> > > > Konstantin
> > > >
> > >
> > > Please note that TCP_SEG flag is cleared by GSO library in 2 different cases:
> > > 1. Packet length equals to or is bigger than GSO size. In this case new TCP
> > segments are created with no TCP checksum.
> > > 2. Packet length is smaller than GSO size. In this case no TCP
> > > segmentation is required. The original packet is returned and its existing
> > TCP checksum is OK.
> > >
> > > In the latter case TAP PMD will always calculate TCP checksum in SW
> > (performance concerns) where this could have been saved.
> > > I am thinking of a practical scenario where TSO is configured but all
> > > packets are smaller than GSO size, however TAP PMD unnecessarily
> > recalculates their checksums.

Actually the question is why it happens that TAP PMD receives a packets
with TCP_SEG flag on and valid TCP checksum already calculated?
As you mentioned above TCP_SEG implies TCP_CKSUM for each segment?

> > >
> > > How do you suggest to avoid this scenario?
> >
> > Probably something like that:
> >
> > rc = rte_gso_segment(pkt_in, gso_ctx, pkts_out, nb_pkts_out); if (rc == 1 &&
> > pkt_in == pkts_out[0] == pkt_in) {
> >      /* no gso was performed */
> > } else {
> >    /* new packets, update ol_flags if needed */ }
> >
> > ?
> >
> 
> Regarding the check: "pkts_out[0] == pkt_in " -  I would prefer an "API approach" rather than relying on internal code specifics.

> 
> > Another possibility - might be make chages in librte_gso to allow user to
> > specify what flags to set for the output packets (probably via
> > rte_gso_ctx.flag)
> >
> 
> I will gladly review any new enhancement. However, for 18.05, can we please accept this patch as a practical solution for now?

I am not sure that it is ok to do things that way - we'll introduce behavior change here.
If there is no better way - it seems safer for 18.05 would be to use the workaround suggested above.

> 
> > Konstantin
> >
> > > >
> > > > > > In my opinion, it's not a good idea to enable HW TCP checksum
> > > > > > calculation silently, and without the aware of the caller. In
> > > > > > fact, the caller always know it does SW TSO (i.e. GSO), instead of real
> > HW TSO.
> > > > >
> > > > > This is not correct. Consider net_failsafe with 2 sub-devices: one
> > > > > is a HW PCI device, the other one is a SW TAP device. Failsafe
> > > > > must work
> > > > transparently with these two sub-devices and the caller cannot tell
> > > > if TSO is done in SW or HW.
> > > > >
> > > > > > If the caller wants HW
> > > > > > checksum calculation, it can add PKT_TX_TCP_CKSUM to ol_flags
> > > > > > before or after calling the GSO library.
> > > > > >
> > > > >
> > > > > FYI - TAP TSO patches were submitted to dpdk.org mailing list.
> > > > > These
> > > > patches use the GSO library.
> > > > >
> > > >
> > https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fd
> > > > pd
> > > > >
> > > >
> > k.org%2Fdev%2Fpatchwork%2Fpatch%2F38666%2F&data=02%7C01%7Coph
> > > > irmu%40me
> > > > >
> > > >
> > llanox.com%7C7455c8e31c7a4364bc7108d5a9d20008%7Ca652971c7d2e4d
> > > > 9ba6a4d1
> > > > >
> > > >
> > 49256f461b%7C0%7C0%7C636601641779974217&sdata=CF7EvhXG%2BrH%
> > > > 2BPiQEbvM0
> > > > > mC%2FSpqobneKaoV03j5VrSDw%3D&reserved=0
> > > > >
> > > >
> > https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fd
> > > > pd
> > > > >
> > > >
> > k.org%2Fdev%2Fpatchwork%2Fpatch%2F38667%2F&data=02%7C01%7Coph
> > > > irmu%40me
> > > > >
> > > >
> > llanox.com%7C7455c8e31c7a4364bc7108d5a9d20008%7Ca652971c7d2e4d
> > > > 9ba6a4d1
> > > > >
> > > >
> > 49256f461b%7C0%7C0%7C636601641779974217&sdata=j9WVIj%2FKq6EN
> > > > WXu3mr5By1
> > > > > toSowU8bqJRGZ19SxiGoI%3D&reserved=0
> > > > >
> > > > > Running testpmd with TAP TSO is currently broken without the
> > > > > suggested
> > > > librte_gso patch.
> > > > > Please note testpmd implementation (app/test-pmd/csumonly.c
> > > > > b/app/test-pmd/csumonly.c) in case *both* TSO and TCP CKSUM are
> > > > > configured:
> > > > >
> > > > >   if (tso_segsz)
> > > > >       ol_flags |= PKT_TX_TCP_SEG;    // *** if TSO is applicable - the
> > packet
> > > > flags are only marked with PKT_TX_TCP_SEG and no
> > > > > PKT_TX_TCP_CKSUM ***
> > > > >    else if (tx_offloads & DEV_TX_OFFLOAD_TCP_CKSUM)
> > > > >        ol_flags |= PKT_TX_TCP_CKSUM;     // *** PKT_TX_TCP_CKSUM is
> > > > marked only if TSO is not applicable ***
> > > > >   else {
> > > > >       tcp_hdr->cksum =
> > > > >          get_udptcp_checksum(l3_hdr, tcp_hdr,
> > > > >
> > > > > In other words - testpmd does not set TCP_CKSUM along with TCP_SEG
> > > > > therefore using testpmd with TAP/TSO will result in TCP segments
> > > > > with 0
> > > > (incorrect) TCP checksums.
> > > > >
> > > > > In addition - please note the comments in
> > > > > lib/librte_mbuf/rte_mbuf.h which specify that PKT_TX_TCP_SEG flag
> > > > > implies the PKT_TX_TCP_CKSUM (hence it is not required to be
> > > > > explicitly set by the caller)
> > > > >
> > > > > /**
> > > > > * TCP segmentation offload. To enable this offload feature for a
> > > > > * packet to be transmitted on hardware supporting TSO:
> > > > > *  - set the PKT_TX_TCP_SEG flag in mbuf->ol_flags (this flag implies
> > > > > *    PKT_TX_TCP_CKSUM)
> > > > > ...
> > > > >
> > > > > > Add Konstantin for more suggestions.
> > > > > >
> > > > > > Thanks,
> > > > > > Jiayu
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Ophir Munk [mailto:ophirmu at mellanox.com]
> > > > > > > Sent: Sunday, April 22, 2018 10:21 PM
> > > > > > > To: dev at dpdk.org; Hu, Jiayu <jiayu.hu at intel.com>
> > > > > > > Cc: Thomas Monjalon <thomas at monjalon.net>; Olga Shern
> > > > > > > <olgas at mellanox.com>; Pascal Mazon
> > <pascal.mazon at 6wind.com>;
> > > > > > Ophir
> > > > > > > Munk <ophirmu at mellanox.com>; stable at dpdk.org
> > > > > > > Subject: [PATCH v1] gso: fix marking TCP checksum flag in TCP
> > > > > > > segments
> > > > > > >
> > > > > > > Large TCP packets which are marked with PKT_TX_TCP_SEG flag
> > > > > > > are segmented and the flag is cleared in the resulting
> > > > > > > segments, however, the segments checksum is not updated. It is
> > > > > > > therefore required to set the PKT_TX_TCP_CKSUM flag in each
> > > > > > > TCP segment in order to mark for the sending driver the need
> > > > > > > to update the TCP checksum before transmitting the segment.
> > > > > > >
> > > > > > > Fixes: 119583797b6a ("gso: support TCP/IPv4 GSO")
> > > > > > > Cc: stable at dpdk.org
> > > > > > >
> > > > > > > Signed-off-by: Ophir Munk <ophirmu at mellanox.com>
> > > > > > > ---
> > > > > > >  lib/librte_gso/rte_gso.c | 2 ++
> > > > > > >  1 file changed, 2 insertions(+)
> > > > > > >
> > > > > > > diff --git a/lib/librte_gso/rte_gso.c
> > > > > > > b/lib/librte_gso/rte_gso.c index a44e3d4..e9ce9ce 100644
> > > > > > > --- a/lib/librte_gso/rte_gso.c
> > > > > > > +++ b/lib/librte_gso/rte_gso.c
> > > > > > > @@ -50,12 +50,14 @@ rte_gso_segment(struct rte_mbuf *pkt,
> > > > > > >  			((IS_IPV4_GRE_TCP4(pkt->ol_flags) &&
> > > > > > >  			 (gso_ctx->gso_types &
> > > > > > > DEV_TX_OFFLOAD_GRE_TNL_TSO)))) {
> > > > > > >  		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
> > > > > > > +		pkt->ol_flags |= PKT_TX_TCP_CKSUM;
> > > > > > >  		ret = gso_tunnel_tcp4_segment(pkt, gso_size,
> > ipid_delta,
> > > > > > >  				direct_pool, indirect_pool,
> > > > > > >  				pkts_out, nb_pkts_out);
> > > > > > >  	} else if (IS_IPV4_TCP(pkt->ol_flags) &&
> > > > > > >  			(gso_ctx->gso_types &
> > > > > > > DEV_TX_OFFLOAD_TCP_TSO)) {
> > > > > > >  		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
> > > > > > > +		pkt->ol_flags |= PKT_TX_TCP_CKSUM;
> > > > > > >  		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
> > > > > > >  				direct_pool, indirect_pool,
> > > > > > >  				pkts_out, nb_pkts_out);
> > > > > > > --
> > > > > > > 2.7.4



More information about the dev mailing list