[dpdk-dev] [PATCH v2 12/12] virtio: add Tso support

Olivier Matz olivier.matz at 6wind.com
Thu Oct 13 17:45:21 CEST 2016



Le 13 octobre 2016 17:29:35 CEST, Yuanhan Liu <yuanhan.liu at linux.intel.com> a écrit :
>On Thu, Oct 13, 2016 at 05:15:24PM +0200, Olivier MATZ wrote:
>> 
>> 
>> On 10/13/2016 05:01 PM, Yuanhan Liu wrote:
>> >On Thu, Oct 13, 2016 at 04:52:25PM +0200, Olivier MATZ wrote:
>> >>
>> >>
>> >>On 10/13/2016 04:16 PM, Yuanhan Liu wrote:
>> >>>On Thu, Oct 13, 2016 at 04:02:49PM +0200, Olivier MATZ wrote:
>> >>>>
>> >>>>
>> >>>>On 10/13/2016 10:18 AM, Yuanhan Liu wrote:
>> >>>>>On Mon, Oct 03, 2016 at 11:00:23AM +0200, Olivier Matz wrote:
>> >>>>>>+/* When doing TSO, the IP length is not included in the pseudo
>header
>> >>>>>>+ * checksum of the packet given to the PMD, but for virtio it
>is
>> >>>>>>+ * expected.
>> >>>>>>+ */
>> >>>>>>+static void
>> >>>>>>+virtio_tso_fix_cksum(struct rte_mbuf *m)
>> >>>>>>+{
>> >>>>>>+	/* common case: header is not fragmented */
>> >>>>>>+	if (likely(rte_pktmbuf_data_len(m) >= m->l2_len + m->l3_len +
>> >>>>>>+			m->l4_len)) {
>> >>>>>...
>> >>>>>>+		/* replace it in the packet */
>> >>>>>>+		th->cksum = new_cksum;
>> >>>>>>+	} else {
>> >>>>>...
>> >>>>>>+		/* replace it in the packet */
>> >>>>>>+		*rte_pktmbuf_mtod_offset(m, uint8_t *,
>> >>>>>>+			m->l2_len + m->l3_len + 16) = new_cksum.u8[0];
>> >>>>>>+		*rte_pktmbuf_mtod_offset(m, uint8_t *,
>> >>>>>>+			m->l2_len + m->l3_len + 17) = new_cksum.u8[1];
>> >>>>>>+	}
>> >>>>>
>> >>>>>The tcp header will always be in the mbuf, right? Otherwise, you
>can't
>> >>>>>update the cksum field here. What's the point of introducing the
>"else
>> >>>>>clause" then?
>> >>>>
>> >>>>Sorry, I don't see the problem you're pointing out here.
>> >>>>
>> >>>>What I want to solve here is to support the cases where the mbuf
>is
>> >>>>segmented in the middle of the network header (which is probably
>a rare
>> >>>>case).
>> >>>
>> >>>How it's gonna segmented?
>> >>
>> >>The mbuf is given by the application. So if the application
>generates a
>> >>segmented mbuf, it should work.
>> >>
>> >>This could happen for instance if the application uses mbuf clones
>to share
>> >>the IP/TCP/data part of the mbuf and prepend a specific
>Ethernet/vlan for
>> >>different destination.
>> >>
>> >>
>> >>>>In the "else" part, I only access the mbuf byte by byte using the
>> >>>>rte_pktmbuf_mtod_offset() accessor. An alternative would have
>been to copy
>> >>>>the header in a linear buffer, fix the checksum, then copy it
>again in the
>> >>>>packet, but there is no mbuf helpers to do these copies for now.
>> >>>
>> >>>In the "else" clause, the ip header is still in the mbuf, right?
>> >>>Why do you have to access it the way like:
>> >>>
>> >>>	ip_version = *rte_pktmbuf_mtod_offset(m, uint8_t *,
>> >>>		m->l2_len) >> 4;
>> >>>
>> >>>Why can't you just use
>> >>>
>> >>>	iph = rte_pktmbuf_mtod_offset(m, struct ipv4_hdr *, m->l2_len);
>> >>>	iph->version_ihl ....;
>> >>
>> >>AFAIK, there is no requirement that each network header has to be
>contiguous
>> >>in a mbuf segment.
>> >>
>> >>Of course, a split in the middle of a network header probably never
>> >>happens... but we never knows, as it is not forbidden. I think the
>code
>> >>should be robust enough to avoid accesses to wrong addresses.
>> >>
>> >>Hope it's clear enough :)
>> >
>> >Thanks, but not really. Maybe let me ask this way: what wrong would
>> >happen if we use
>> >	iph = rte_pktmbuf_mtod_offset(m, struct ipv4_hdr *, m->l2_len);
>> >to access the IP header? Is it about the endian?
>> 
>> If you have a packet split like this:
>> 
>> mbuf segment 1                     mbuf segment 2
>> ----------------------------       ------------------------------
>> | Ethernet header |  IP hea|       |der | TCP header | data
>> ----------------------------       ------------------------------
>>                    ^
>>                    iph
>
>Thanks, that's clear. How could you be able to access the tcp header
>from the first mbuf then? I mean, how is the following code supposed
>to work?
>
>    prev_cksum.u8[0] = *rte_pktmbuf_mtod_offset(m, uint8_t *,
>			m->l2_len + m->l3_len + 16);
>

Oh I see... Sorry there was a confusion on my side with another (internal) macro that browses the segments if the offset ils not in the first one.

If you agree, let's add the code without the else part, I'll fix it for the rc2.


>> The IP header is not contiguous. So accessing to the end of the
>structure
>> will access to a wrong location.
>> 
>> >One more question is do you have any case to trigger the "else"
>clause?
>> 
>> No, but I think it may happen.
>
>A piece of untest code is not trusted though ...
>
>	--yliu



More information about the dev mailing list