[dpdk-dev] [PATCH 0/3] i40e VXLAN TX checksum rework

Liu, Jijiang jijiang.liu at intel.com
Wed Dec 3 09:02:01 CET 2014


Hi Thomas,

> -----Original Message-----
> From: Liu, Jijiang
> Sent: Friday, November 28, 2014 12:32 AM
> To: Olivier MATZ
> Cc: Ananyev, Konstantin; dev at dpdk.org
> Subject: RE: [dpdk-dev] [PATCH 0/3] i40e VXLAN TX checksum rework
> 
> 
> 
> > -----Original Message-----
> > From: Ananyev, Konstantin
> > Sent: Thursday, November 27, 2014 11:30 PM
> > To: Olivier MATZ; Liu, Jijiang; dev at dpdk.org
> > Subject: RE: [dpdk-dev] [PATCH 0/3] i40e VXLAN TX checksum rework
> >
> > Hi Olivier,
> >
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Olivier MATZ
> > > Sent: Thursday, November 27, 2014 9:45 AM
> > > To: Liu, Jijiang; dev at dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH 0/3] i40e VXLAN TX checksum rework
> > >
> > > Hi Jijiang,
> > >
> > > Please find below some comments about the specifications. The global
> > > picture looks fine to me.
> > >
> > > I've not reviewed the patch right now, but it's in the pipe.
> > >
> > > On 11/27/2014 09:18 AM, Jijiang Liu wrote:
> > > > We have received some feedback about the backward compatibility of
> > > > the VXLAN TX checksum offload API with 1G/10G NICs after the i40e
> > > > VXLAN TX checksum code was applied, so we have to rework the APIs on
> > > > i40e, including the changes to mbuf, the i40e PMD and the csum engine.
> > > >
> > > > The main changes in mbuf are as follows. In place of
> > > > PKT_TX_VXLAN_CKSUM, which is removed, we are introducing 2 new flags,
> > > > PKT_TX_OUT_IP_CKSUM and PKT_TX_UDP_TUNNEL_PKT, and a new field,
> > > > l4_tun_len.
> > >
> > > What about PKT_TX_OUT_UDP_CKSUM instead of PKT_TX_UDP_TUNNEL_PKT?
> > > It's maybe more coherent with the other names.
> >
> > FVL HW doesn't support outer L4 checksum offload.
> > But to calculate inner checksums correctly, it needs a hint from SW
> > about the L4 tunnelling type.
> >
> > >
> > >
> > > > Replace the inner_l2_len and inner_l3_len fields with the
> > > > outer_l2_len and outer_l3_len fields.
> > > >
> > > > The existing flags are listed below:
> > > > PKT_TX_IP_CKSUM:   HW IPv4 checksum for non-tunnelling packet /
> > > >                    HW inner IPv4 checksum for tunnelling packet
> > > > PKT_TX_TCP_CKSUM:  HW TCP checksum for non-tunnelling packet /
> > > >                    HW inner TCP checksum for tunnelling packet
> > > > PKT_TX_SCTP_CKSUM: HW SCTP checksum for non-tunnelling packet /
> > > >                    HW inner SCTP checksum for tunnelling packet
> > > > PKT_TX_UDP_CKSUM:  HW UDP checksum for non-tunnelling packet /
> > > >                    HW inner UDP checksum for tunnelling packet
> > > > PKT_TX_IPV4:       IPv4 with no HW checksum offload for non-tunnelling packet /
> > > >                    inner IPv4 with no HW checksum offload for tunnelling packet
> > > > PKT_TX_IPV6:       IPv6 non-tunnelling packet /
> > > >                    inner IPv6 with no HW checksum offload for tunnelling packet
> > >
> > > As I suggested in the TSO thread, I think the following semantics are
> > > easier for the user to understand:
> > >
> > >    - PKT_TX_IP_CKSUM: tell the NIC to compute IP cksum
> > >
> > >    - PKT_TX_IPV4: tell the NIC it's an IPv4 packet. Required for L4
> > >      checksum offload or TSO.
> > >
> > >    - PKT_TX_IPV6: tell the NIC it's an IPv6 packet. Required for L4
> > >      checksum offload or TSO.
> > >
> > > I think it won't make a big difference in the FVL driver.
> >
> > No, no big difference here, but I still think it would be a bit cleaner
> > if all 3 flags were mutually exclusive.
> > In fact, we can unite all 3 of them into 2 bits, the same as we do for
> > the L4 checksum flags.
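
A rough sketch of how such a 2-bit encoding could look, purely for
illustration. The DEMO_TX_L3_* names, the bit position and the helper
functions are hypothetical and not part of the patch or of rte_mbuf.h; the
point is only that one masked field makes the three states mutually exclusive.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 2-bit L3 type field; names and bit position are made up. */
#define DEMO_TX_L3_SHIFT    4
#define DEMO_TX_L3_MASK     (3ULL << DEMO_TX_L3_SHIFT)
#define DEMO_TX_L3_NONE     (0ULL << DEMO_TX_L3_SHIFT) /* no L3 hint */
#define DEMO_TX_L3_IP_CKSUM (1ULL << DEMO_TX_L3_SHIFT) /* IPv4, HW checksum */
#define DEMO_TX_L3_IPV4     (2ULL << DEMO_TX_L3_SHIFT) /* IPv4, no HW checksum */
#define DEMO_TX_L3_IPV6     (3ULL << DEMO_TX_L3_SHIFT) /* IPv6 */

/* Replace the L3 type in ol_flags, leaving all other flag bits untouched. */
static uint64_t demo_set_l3_type(uint64_t ol_flags, uint64_t l3_type)
{
    /* The caller must pass one of the DEMO_TX_L3_* values. */
    assert((l3_type & ~DEMO_TX_L3_MASK) == 0);
    return (ol_flags & ~DEMO_TX_L3_MASK) | l3_type;
}

/* Extract the L3 type; exactly one of the four values is returned. */
static uint64_t demo_get_l3_type(uint64_t ol_flags)
{
    return ol_flags & DEMO_TX_L3_MASK;
}
```

With this layout a driver would switch on demo_get_l3_type() instead of
testing three independent bits, so contradictory combinations cannot be set.
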
> >
> > >
> > >
> > > > Let's use a few examples to demonstrate how to use these flags.
> > > > Let's say we have a tunnel packet:
> > > > eth_hdr_out/ipv4_hdr_out/udp_hdr_out/vxlan_hdr/eth_hdr_in/ipv4_hdr_in/tcp_hdr_in.
> > > > There could be several scenarios:
> > > >
> > > > A) User requests HW offload for ipv4_hdr_out checksum.
> > > > He doesn't care whether it is a tunnelled packet or not.
> > > > So he sets:
> > > >
> > > > mb->l2_len = eth_hdr_out;
> > > > mb->l3_len = ipv4_hdr_out;
> > > > mb->ol_flags |= PKT_TX_IP_CKSUM;
> > > >
> > > > B) User is aware that it is a tunnelled packet and requests HW
> > > > offload for ipv4_hdr_in and tcp_hdr_in *only*.
> > > > He doesn't care about outer IP checksum offload.
> > > > In that case, for FVL he has 2 choices:
> > > >     1. Treat that packet as a 'proper' tunnelled packet, and fill all the fields:
> > > >       mb->l2_len = eth_hdr_in;
> > > >       mb->l3_len = ipv4_hdr_in;
> > > >       mb->outer_l2_len = eth_hdr_out;
> > > >       mb->outer_l3_len = ipv4_hdr_out;
> > > >       mb->l4_tun_len = vxlan_hdr;
> > > >       mb->ol_flags |= PKT_TX_UDP_TUNNEL_PKT | PKT_TX_IP_CKSUM |
> > > >                       PKT_TX_TCP_CKSUM;
> > > >
> > > >     2. As the user doesn't care about the outer IP hdr checksum, he can
> > > >     treat everything before ipv4_hdr_in as the L2 header.
> > > >     So he knows that it is a tunnelled packet, but makes HW treat it
> > > >     as an ordinary (non-tunnelled) packet:
> > > >       mb->l2_len = eth_hdr_out + ipv4_hdr_out + udp_hdr_out +
> > > >                    vxlan_hdr + eth_hdr_in;
> > > >       mb->l3_len = ipv4_hdr_in;
> > > >       mb->ol_flags |= PKT_TX_IP_CKSUM | PKT_TX_TCP_CKSUM;
> > > >
> > > > i40e PMD will support both B.1 and B.2.
> > > > ixgbe/igb/em PMD supports only B.2.
> > > > If HW supports both, it will be up to the user app which method to choose.
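
For reference, scenarios A, B.1 and B.2 can be sketched in plain C roughly as
below. This is only an illustration under stated assumptions, not the patch
itself: struct demo_mbuf is a hypothetical stand-in for the relevant rte_mbuf
fields, the flag bit values are made up, and the header sizes assume untagged
Ethernet, IPv4 without options, and 8-byte UDP and VXLAN headers.

```c
#include <stdint.h>

/* Hypothetical flag bits for illustration; real values live in rte_mbuf.h. */
#define PKT_TX_IP_CKSUM       (1ULL << 0)
#define PKT_TX_TCP_CKSUM      (1ULL << 1)
#define PKT_TX_UDP_TUNNEL_PKT (1ULL << 2)

/* Assumed header sizes in bytes (no VLAN tag, no IPv4 options). */
enum { ETH_HDR_LEN = 14, IPV4_HDR_LEN = 20, UDP_HDR_LEN = 8, VXLAN_HDR_LEN = 8 };

/* Hypothetical stand-in for the TX offload fields of struct rte_mbuf. */
struct demo_mbuf {
    uint64_t ol_flags;
    uint16_t l2_len, l3_len;
    uint16_t outer_l2_len, outer_l3_len;
    uint16_t l4_tun_len;
};

/* Scenario A: offload only the outer IPv4 checksum. */
static void demo_setup_a(struct demo_mbuf *mb)
{
    mb->l2_len = ETH_HDR_LEN;
    mb->l3_len = IPV4_HDR_LEN;
    mb->ol_flags |= PKT_TX_IP_CKSUM;
}

/* Scenario B.1: describe the packet to HW as a real tunnelled packet. */
static void demo_setup_b1(struct demo_mbuf *mb)
{
    mb->l2_len = ETH_HDR_LEN;        /* inner Ethernet header */
    mb->l3_len = IPV4_HDR_LEN;       /* inner IPv4 header */
    mb->outer_l2_len = ETH_HDR_LEN;
    mb->outer_l3_len = IPV4_HDR_LEN;
    mb->l4_tun_len = VXLAN_HDR_LEN;
    mb->ol_flags |= PKT_TX_UDP_TUNNEL_PKT | PKT_TX_IP_CKSUM | PKT_TX_TCP_CKSUM;
}

/* Scenario B.2: fold everything before the inner IPv4 header into l2_len. */
static void demo_setup_b2(struct demo_mbuf *mb)
{
    mb->l2_len = ETH_HDR_LEN + IPV4_HDR_LEN + UDP_HDR_LEN +
                 VXLAN_HDR_LEN + ETH_HDR_LEN;
    mb->l3_len = IPV4_HDR_LEN;
    mb->ol_flags |= PKT_TX_IP_CKSUM | PKT_TX_TCP_CKSUM;
}
```

Under these size assumptions B.2 ends up with l2_len = 64, and the outer_*
and l4_tun_len fields stay zero because HW is told nothing about the tunnel.
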
> > >
> > > I think we should have a flag to advertise outer ip and outer udp
> > > checksum offload support, so the application knows which mode can be
> > > used.
> >
> > You mean a new DEV_TX_OFFLOAD_* value, right?
> > Something like:  DEV_TX_OFFLOAD_UDP_TUNNEL?
> > And make i40e_dev_info_get() return it?
> > Yes, forgot about it, sounds like a proper thing to do.

> Yes, makes sense. I will send a separate patch (bug fix) to do this. Thanks.

I'm preparing this patch and will send it out soon. I hope it can also be included in DPDK 1.8.
Thanks.
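
As a rough idea of how an application might use such a capability, here is a
small sketch. It assumes the flag ends up being called
DEV_TX_OFFLOAD_UDP_TUNNEL as Konstantin suggested above; the bit value and the
demo_* names are hypothetical, and the argument stands for the
tx_offload_capa value reported in rte_eth_dev_info.

```c
#include <stdint.h>

/* Hypothetical capability bit; name as suggested above, value made up. */
#define DEV_TX_OFFLOAD_UDP_TUNNEL (1U << 7)

/* Which of the two methods from scenario B the application will use. */
enum demo_tunnel_mode {
    DEMO_TUNNEL_AS_L2, /* B.2: fold the outer headers into l2_len */
    DEMO_TUNNEL_AWARE  /* B.1: fill outer_l2_len/outer_l3_len/l4_tun_len */
};

/* Pick B.1 only when the port advertises UDP tunnel TX offload. */
static enum demo_tunnel_mode demo_pick_tunnel_mode(uint32_t tx_offload_capa)
{
    if (tx_offload_capa & DEV_TX_OFFLOAD_UDP_TUNNEL)
        return DEMO_TUNNEL_AWARE;
    return DEMO_TUNNEL_AS_L2;
}
```

Falling back to B.2 keeps the same application code working on ixgbe/igb/em,
which support only that method.
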

 

> > Konstantin
> >
> > >
> > >
> > > Regards,
> > > Olivier


