[dpdk-stable] [PATCH v2] app/testpmd: fix TX checksum calculation for tunnel

Olivier Matz olivier.matz at 6wind.com
Thu Jul 29 10:25:30 CEST 2021


On Wed, Jul 28, 2021 at 04:07:51PM +0000, Gregory Etelson wrote:
> Hello Olivier,
> 
> Please see my comments below
> 
> > On Tue, Jul 27, 2021 at 04:07:57PM +0300, Gregory Etelson wrote:
> > > TX checksum of a tunnelled packet can be calculated for outer headers
> > > only or for both outer and inner parts. The calculation method is
> > > determined by application.
> > > If TX checksum calculation can be offloaded, hardware ignores existing
> > > checksum value and replaces it with an updated result.
> > 
> > This is not always true. Actually, the checksum value is optionally set by
> > software to the value that is expected by the hardware to offload the
> > checksum correctly. This is done through rte_eth_tx_prepare(), which is called
> > in csumonly test engine.
> > 
> > For instance, on an ixgbe NIC, it does:
> > 
> >   rte_eth_tx_prepare()
> >     eth_dev->tx_pkt_prepare()
> >       ixgbe_prep_pkts()
> >         rte_net_intel_cksum_flags_prepare()
> >           if packet is IP, set IP checksum to 0
> >           if packet is TCP or UDP, set L4 checksum to the phdr csum
> > 
> > This driver-specific rte_eth_tx_prepare() can indeed do nothing and let the
> > hardware ignore the checksum in the packet.
> >
> 
> You are right. I'll update the patch comment in v3.
>  
> > > If TX checksum is calculated by software, the existing value must be
> > > zeroed first.
> > > The testpmd checksum forwarding engine always zeroed inner checksums.
> > > If inner checksum calculation was offloaded, that header was left with
> > > 0 checksum value.
> > > Following outer software checksum calculation produced wrong value.
> > > The patch zeroes inner IPv4 checksum only before software calculation.
> > 
> > Sorry, I think I don't understand the issue. Are you trying to compute the inner
> > checksum by hardware and the outer checksum by software?
> > 
> 
> Correct. Inner checksum is offloaded and outer computed in software.

I think this approach is not sane: the outer checksum covers the inner
headers, so its value depends on the inner checksum and it has to be
calculated after the inner one. There is a comment in the code about this:

	/* Then process outer headers if any. Note that the software
	 * checksum will be wrong if one of the inner checksums is
	 * processed in hardware. */
	if (info.is_tunnel == 1) {
		tx_ol_flags |= process_outer_cksums(outer_l3_hdr, &info,
				tx_offloads,
				!!(tx_ol_flags & PKT_TX_TCP_SEG));
	}
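
To make the ordering problem concrete, here is a minimal standalone sketch.
It assumes a toy 8-byte layout (hypothetical, not a real tunnel header) and a
plain RFC 1071 one's complement sum; the toy_* helpers are illustrative names
and are not part of testpmd or DPDK. It only shows that an outer checksum
computed while the inner field is still zero differs from one computed after
the inner field is filled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* RFC 1071 style 16-bit one's complement checksum over a byte buffer. */
static uint16_t csum16(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	assert(buf != NULL);
	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	if (len & 1)
		sum += (uint32_t)buf[len - 1] << 8;
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/* Toy layout (assumption for illustration only):
 *   bytes 0..3  data covered by the outer checksum only
 *   bytes 4..5  inner data
 *   bytes 6..7  inner checksum field
 */
#define TOY_PKT_LEN        8
#define TOY_INNER_OFF      4
#define TOY_INNER_CSUM_OFF 6

/* Zero the inner checksum field, then fill it with the inner checksum. */
static void toy_set_inner_csum(uint8_t *pkt)
{
	uint16_t c;

	pkt[TOY_INNER_CSUM_OFF] = 0;
	pkt[TOY_INNER_CSUM_OFF + 1] = 0;
	c = csum16(pkt + TOY_INNER_OFF, TOY_PKT_LEN - TOY_INNER_OFF);
	pkt[TOY_INNER_CSUM_OFF] = (uint8_t)(c >> 8);
	pkt[TOY_INNER_CSUM_OFF + 1] = (uint8_t)(c & 0xff);
}

/* Outer checksum computed in software while the inner field is zero;
 * the inner field is filled afterwards, standing in for the hardware. */
static uint16_t toy_outer_csum_before_inner(uint8_t *pkt)
{
	uint16_t outer;

	pkt[TOY_INNER_CSUM_OFF] = 0;
	pkt[TOY_INNER_CSUM_OFF + 1] = 0;
	outer = csum16(pkt, TOY_PKT_LEN);
	toy_set_inner_csum(pkt);
	return outer;
}

/* Outer checksum computed after the inner field holds its final value. */
static uint16_t toy_outer_csum_after_inner(uint8_t *pkt)
{
	toy_set_inner_csum(pkt);
	return csum16(pkt, TOY_PKT_LEN);
}
```

Calling both helpers on two copies of the same packet leaves identical bytes
on the wire, but only the value returned by toy_outer_csum_after_inner()
matches a checksum recomputed over those final bytes.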


> Consider this example:
> A tunneled packet arrives at port A and is forwarded through port B.
> The packet arrived at port A with correct inner checksums - L3 and L4.
> Port B TX offloads inner L3 only.
> 
> process_inner_cksums() sets "ipv4_hdr->hdr_checksum = 0;" unconditionally.
> Inner L3 checksum value will be restored by port B TX checksum offload, but when 
> process_outer_cksums() runs software calculation on outer L4 it will use 0 and produce wrong result.
> 
> Therefore, the patch zeros inner checksum values only before actual software calculations.

I better understand your use case, thanks.

However, with your patch, if the inner L4 checksum is wrong when the packet
arrives on port A, I think the result will be a packet with a wrong outer
L4 checksum and a correct inner L4 checksum. Is that what you expect?

I don't argue against the patch itself. What you suggest better matches
the offload API than what we have today. Can you please send another
version that better explains the use-case?

One more suggestion, maybe for later. Currently, the csumonly engine can
be configured to do the checksum in sw or in hw. Maybe we could add a
"dont-touch" option, to keep the value in the packet. Would it help for
your use-case?
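
A rough sketch of how such a three-way setting could map to actions, purely
as an illustration: the enum, the struct and the function below are
hypothetical names and do not exist in testpmd, and the fallback to software
when the port lacks the offload is an assumption.

```c
#include <stdbool.h>

/* Hypothetical per-checksum setting; csumonly only has sw and hw today. */
enum csum_mode {
	CSUM_MODE_SW,
	CSUM_MODE_HW,
	CSUM_MODE_DONT_TOUCH,
};

/* What the forwarding engine would do with one checksum field. */
struct csum_action {
	bool compute_in_sw;   /* zero the field and recompute it in software */
	bool request_offload; /* leave the field to the NIC, set the TX flag */
};

static struct csum_action
csum_mode_to_action(enum csum_mode mode, bool hw_capable)
{
	struct csum_action act = { false, false };

	switch (mode) {
	case CSUM_MODE_HW:
		/* Assumption: fall back to software if the offload is missing. */
		if (hw_capable)
			act.request_offload = true;
		else
			act.compute_in_sw = true;
		break;
	case CSUM_MODE_SW:
		act.compute_in_sw = true;
		break;
	case CSUM_MODE_DONT_TOUCH:
		/* Keep whatever value arrived in the packet. */
		break;
	}
	return act;
}
```

With the last mode, neither flag is set, so the received value is forwarded
unchanged and a later software checksum over it sees the original bytes.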

> 
> > > Fixes: 51f694dd40f5 ("app/testpmd: rework checksum forward engine")
> > 
> > I'm not sure the problem origin is this commit (however, I may have
> > misunderstood your issue).
> > 
> > At the time this commit was done, it was required to set the TCP/UDP
> > checksum to the pseudo header checksum to offload an L4 checksum. See:
> > https://git.dpdk.org/dpdk/tree/lib/librte_mbuf/rte_mbuf.h?id=51f694dd40f5#n107
> > 
> > The introduction of rte_eth_tx_prepare() API removed this need, see:
> > https://git.dpdk.org/dpdk/commit/?id=6b520d54ebfe

Just a reminder for this one.

Thanks,
Olivier


> > Thanks,
> > Olivier
> > 
> > > Cc: stable at dpdk.org
> > >
> > > Signed-off-by: Gregory Etelson <getelson at nvidia.com>
> > > ---
> > > v2:
> > >  remove blank line between Fixes and Cc
> > >  explicitly compare with 0 value in `if ()`
> > > ---
> > >  app/test-pmd/csumonly.c | 23 ++++++++++++-----------
> > >  1 file changed, 12 insertions(+), 11 deletions(-)
> > >
> > > diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
> > > index 0161f72175..bd5ad64a57 100644
> > > --- a/app/test-pmd/csumonly.c
> > > +++ b/app/test-pmd/csumonly.c
> > > @@ -480,17 +480,18 @@ process_inner_cksums(void *l3_hdr, const struct testpmd_offload_info *info,
> > >
> > >       if (info->ethertype == _htons(RTE_ETHER_TYPE_IPV4)) {
> > >               ipv4_hdr = l3_hdr;
> > > -             ipv4_hdr->hdr_checksum = 0;
> > >
> > >               ol_flags |= PKT_TX_IPV4;
> > >               if (info->l4_proto == IPPROTO_TCP && tso_segsz) {
> > >                       ol_flags |= PKT_TX_IP_CKSUM;
> > >               } else {
> > > -                     if (tx_offloads & DEV_TX_OFFLOAD_IPV4_CKSUM)
> > > +                     if (tx_offloads & DEV_TX_OFFLOAD_IPV4_CKSUM) {
> > >                               ol_flags |= PKT_TX_IP_CKSUM;
> > > -                     else
> > > +                     } else if (ipv4_hdr->hdr_checksum != 0) {
> > > +                             ipv4_hdr->hdr_checksum = 0;
> > >                               ipv4_hdr->hdr_checksum =
> > >                                       rte_ipv4_cksum(ipv4_hdr);
> > > +                     }
> > >               }
> > >       } else if (info->ethertype == _htons(RTE_ETHER_TYPE_IPV6))
> > >               ol_flags |= PKT_TX_IPV6;
> > > @@ -501,10 +502,10 @@ process_inner_cksums(void *l3_hdr, const struct testpmd_offload_info *info,
> > >               udp_hdr = (struct rte_udp_hdr *)((char *)l3_hdr + info->l3_len);
> > >               /* do not recalculate udp cksum if it was 0 */
> > >               if (udp_hdr->dgram_cksum != 0) {
> > > -                     udp_hdr->dgram_cksum = 0;
> > > -                     if (tx_offloads & DEV_TX_OFFLOAD_UDP_CKSUM)
> > > +                     if (tx_offloads & DEV_TX_OFFLOAD_UDP_CKSUM) {
> > >                               ol_flags |= PKT_TX_UDP_CKSUM;
> > > -                     else {
> > > +                     } else {
> > > +                             udp_hdr->dgram_cksum = 0;
> > >                               udp_hdr->dgram_cksum =
> > >                                       get_udptcp_checksum(l3_hdr, udp_hdr,
> > >                                               info->ethertype);
> > > @@ -514,12 +515,12 @@ process_inner_cksums(void *l3_hdr, const struct testpmd_offload_info *info,
> > >                       ol_flags |= PKT_TX_UDP_SEG;
> > >       } else if (info->l4_proto == IPPROTO_TCP) {
> > >               tcp_hdr = (struct rte_tcp_hdr *)((char *)l3_hdr + info->l3_len);
> > > -             tcp_hdr->cksum = 0;
> > >               if (tso_segsz)
> > >                       ol_flags |= PKT_TX_TCP_SEG;
> > > -             else if (tx_offloads & DEV_TX_OFFLOAD_TCP_CKSUM)
> > > +             else if (tx_offloads & DEV_TX_OFFLOAD_TCP_CKSUM) {
> > >                       ol_flags |= PKT_TX_TCP_CKSUM;
> > > -             else {
> > > +             } else if (tcp_hdr->cksum != 0) {
> > > +                     tcp_hdr->cksum = 0;
> > >                       tcp_hdr->cksum =
> > >                               get_udptcp_checksum(l3_hdr, tcp_hdr,
> > >                                       info->ethertype);
> > > @@ -529,13 +530,13 @@ process_inner_cksums(void *l3_hdr, const struct testpmd_offload_info *info,
> > >       } else if (info->l4_proto == IPPROTO_SCTP) {
> > >               sctp_hdr = (struct rte_sctp_hdr *)
> > >                       ((char *)l3_hdr + info->l3_len);
> > > -             sctp_hdr->cksum = 0;
> > >               /* sctp payload must be a multiple of 4 to be
> > >                * offloaded */
> > >               if ((tx_offloads & DEV_TX_OFFLOAD_SCTP_CKSUM) &&
> > >                       ((ipv4_hdr->total_length & 0x3) == 0)) {
> > >                       ol_flags |= PKT_TX_SCTP_CKSUM;
> > > -             } else {
> > > +             } else if (sctp_hdr->cksum != 0) {
> > > +                     sctp_hdr->cksum = 0;
> > >                       /* XXX implement CRC32c, example available in
> > >                        * RFC3309 */
> > >               }
> > > --
> > > 2.32.0
> > >

