[dpdk-dev] [PATCH v3 2/5] gso: add TCP/IPv4 GSO support

Kavanagh, Mark B mark.b.kavanagh at intel.com
Thu Sep 14 11:00:36 CEST 2017


>From: Ananyev, Konstantin
>Sent: Thursday, September 14, 2017 9:40 AM
>To: Kavanagh, Mark B <mark.b.kavanagh at intel.com>; Hu, Jiayu
><jiayu.hu at intel.com>
>Cc: dev at dpdk.org; Tan, Jianfeng <jianfeng.tan at intel.com>
>Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
>
>
>
>> -----Original Message-----
>> From: Kavanagh, Mark B
>> Sent: Thursday, September 14, 2017 9:35 AM
>> To: Hu, Jiayu <jiayu.hu at intel.com>; Ananyev, Konstantin
><konstantin.ananyev at intel.com>
>> Cc: dev at dpdk.org; Tan, Jianfeng <jianfeng.tan at intel.com>
>> Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
>>
>> >From: Hu, Jiayu
>> >Sent: Thursday, September 14, 2017 2:00 AM
>> >To: Ananyev, Konstantin <konstantin.ananyev at intel.com>; Kavanagh, Mark B
>> ><mark.b.kavanagh at intel.com>
>> >Cc: dev at dpdk.org; Tan, Jianfeng <jianfeng.tan at intel.com>
>> >Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
>> >
>> >Hi Konstantin,
>> >
>> >> -----Original Message-----
>> >> From: Ananyev, Konstantin
>> >> Sent: Wednesday, September 13, 2017 11:13 PM
>> >> To: Kavanagh, Mark B <mark.b.kavanagh at intel.com>; Hu, Jiayu
>> >> <jiayu.hu at intel.com>
>> >> Cc: dev at dpdk.org; Tan, Jianfeng <jianfeng.tan at intel.com>
>> >> Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
>> >>
>> >> Hi Mark,
>> >>
>> >> > -----Original Message-----
>> >> > From: Kavanagh, Mark B
>> >> > Sent: Wednesday, September 13, 2017 3:52 PM
>> >> > To: Ananyev, Konstantin <konstantin.ananyev at intel.com>; Hu, Jiayu
>> >> <jiayu.hu at intel.com>
>> >> > Cc: dev at dpdk.org; Tan, Jianfeng <jianfeng.tan at intel.com>
>> >> > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
>> >> >
>> >> > >From: Ananyev, Konstantin
>> >> > >Sent: Wednesday, September 13, 2017 10:38 AM
>> >> > >To: Hu, Jiayu <jiayu.hu at intel.com>
>> >> > >Cc: dev at dpdk.org; Kavanagh, Mark B <mark.b.kavanagh at intel.com>;
>> >> Tan, Jianfeng
>> >> > ><jianfeng.tan at intel.com>
>> >> > >Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
>> >> > >
>> >> > >
>> >> > >
>> >> > >> > > +
>> >> > >> > > +int
>> >> > >> > > +gso_tcp4_segment(struct rte_mbuf *pkt,
>> >> > >> > > +		uint16_t gso_size,
>> >> > >> > > +		uint8_t ipid_delta,
>> >> > >> > > +		struct rte_mempool *direct_pool,
>> >> > >> > > +		struct rte_mempool *indirect_pool,
>> >> > >> > > +		struct rte_mbuf **pkts_out,
>> >> > >> > > +		uint16_t nb_pkts_out)
>> >> > >> > > +{
>> >> > >> > > +	struct ipv4_hdr *ipv4_hdr;
>> >> > >> > > +	uint16_t tcp_dl;
>> >> > >> > > +	uint16_t pyld_unit_size;
>> >> > >> > > +	uint16_t hdr_offset;
>> >> > >> > > +	int ret = 1;
>> >> > >> > > +
>> >> > >> > > +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *)
>> >> +
>> >> > >> > > +			pkt->l2_len);
>> >> > >> > > +	/* Don't process the fragmented packet */
>> >> > >> > > +	if (unlikely((ipv4_hdr->fragment_offset & rte_cpu_to_be_16(
>> >> > >> > > +
>> >> 	IPV4_HDR_DF_MASK)) == 0)) {
>> >> > >> >
>> >> > >> >
>> >> > >> > It is not a check for fragmented packet - it is a check that
>> >> fragmentation
>> >> > >is allowed for that packet.
>> >> > >> > Should be IPV4_HDR_DF_MASK - 1,  I think.
>> >> > >
>> >> > >DF bit doesn't indicate is packet fragmented or not.
>> >> > >It forbids to fragment packet any further.
>> >> > >To check is packet already fragmented or not, you have to check MF bit
>> >> and
>> >> > >frag_offset.
>> >> > >Both have to be zero for un-fragmented packets.
>> >> > >
>> >> > >>
>> >> > >> IMO, IPV4_HDR_DF_MASK whose value is (1 << 14) is used to get DF
>bit.
>> >> It's a
>> >> > >> little-endian value. But ipv4_hdr->fragment_offset is big-endian
>order.
>> >> > >> So the value of DF bit should be "ipv4_hdr->fragment_offset &
>> >> > >rte_cpu_to_be_16(
>> >> > >> IPV4_HDR_DF_MASK)". If this value is 0, the input packet is
>fragmented.
>> >> > >>
>> >> > >> >
>> >> > >> > > +		pkts_out[0] = pkt;
>> >> > >> > > +		return ret;
>> >> > >> > > +	}
>> >> > >> > > +
>> >> > >> > > +	tcp_dl = rte_be_to_cpu_16(ipv4_hdr->total_length) - pkt-
>> >> >l3_len -
>> >> > >> > > +		pkt->l4_len;
>> >> > >> >
>> >> > >> > Why not use pkt->pkt_len - pkt->l2_len -pkt_l3_len - pkt_l4_len?
>> >> > >>
>> >> > >> Yes, we can use pkt->pkt_len - pkt->l2_len -pkt_l3_len - pkt_l4_len
>> >here.
>> >> > >>
>> >> > >> >
>> >> > >> > > +	/* Don't process the packet without data */
>> >> > >> > > +	if (unlikely(tcp_dl == 0)) {
>> >> > >> > > +		pkts_out[0] = pkt;
>> >> > >> > > +		return ret;
>> >> > >> > > +	}
>> >> > >> > > +
>> >> > >> > > +	hdr_offset = pkt->l2_len + pkt->l3_len + pkt->l4_len;
>> >> > >> > > +	pyld_unit_size = gso_size - hdr_offset - ETHER_CRC_LEN;
>> >> > >> >
>> >> > >> > Hmm, why do we need to count CRC_LEN here?
>> >> > >>
>> >> > >> Yes, we shouldn't count ETHER_CRC_LEN here. Its length should be
>> >> > >> included in gso_size.
>> >> > >
>> >> > >Why?
>> >> > >What is the point to account crc len into this computation?
>> >> > >Why not just assume that gso_size is already a max_frame_size -
>crc_len
>> >> > >As I remember, when we RX packet crc bytes will be already stripped,
>> >> > >when user populates the packet, he doesn't care about crc bytes too.
>> >> >
>> >> > Hi Konstantin,
>> >> >
>> >> > When packet is tx'd, the 4B for CRC are added back into the packet; if
>the
>> >> payload is already at max capacity, then the actual segment size
>> >> > will be 4B larger than expected (e.g. 1522B, as opposed to 1518B).
>> >> > To prevent that from happening, we account for the CRC len in this
>> >> calculation.
>> >>
>> >>
>> >> Ok, and what prevents you to set gso_ctx.gso_size = 1514;  /*ether frame
>> >> size without crc bytes */
>> >> ?
>>
>> Hey Konstantin,
>>
>> If the user sets the gso_size to 1514, the resultant output segments' size
>should be 1514, and not 1518.

Just to clarify - I meant here that the final output segment, including CRC len, should be 1514. I think this is where we're crossing wires ;)

>
>Yes and then NIC HW will add CRC bytes for you.
>You are not filling CRC bytes in HW, and when providing to the HW size to send
>- it is a payload size
>(CRC bytes are not accounted).
>Konstantin

Yes, exactly - in that case though, the gso_size specified by the user is not the actual final output segment size, but (segment size - 4B), right?

We can set that expectation in documentation, but from an application's/user's perspective, do you think that this might be confusing/misleading?

Thanks again,
Mark  

>
> Consequently, the payload capacity
>> of each segment would be reduced accordingly.
>> The user only cares about the output segment size (i.e. gso_ctx.gso_size);
>we need to ensure that the size of the segments that are
>> produced is consistent with that. As a result, we need to ensure that any
>packet overhead is accounted for in the segment size, before we
>> can determine how much space remains for data.
>>
>> Hope this makes sense.
>>
>> Thanks,
>> Mark
>>
>> >
>> >Exactly, applications can set 1514 to gso_segsz instead of 1518, if the
>lower
>> >layer
>> >will add CRC to the packet.
>> >
>> >Jiayu
>> >
>> >> Konstantin
>> >>
>> >> >
>> >> > If I've missed anything, please do let me know!
>> >> >
>> >> > Thanks,
>> >> > Mark
>> >> >
>> >> > >
>> >> > >Konstantin


More information about the dev mailing list