[dpdk-dev] [RFC 0/8] mbuf: structure reorganization

Ananyev, Konstantin konstantin.ananyev at intel.com
Fri Feb 17 19:42:01 CET 2017


Hi guys,

> -----Original Message-----
> From: Olivier Matz [mailto:olivier.matz at 6wind.com]
> Sent: Friday, February 17, 2017 2:17 PM
> To: Jan Blunck <jblunck at infradead.org>
> Cc: Ananyev, Konstantin <konstantin.ananyev at intel.com>; dev at dpdk.org
> Subject: Re: [dpdk-dev] [RFC 0/8] mbuf: structure reorganization
> 
> Hi Jan,
> 
> On Fri, 17 Feb 2017 14:38:32 +0100, Jan Blunck <jblunck at infradead.org>
> wrote:
> > On Fri, Feb 17, 2017 at 11:51 AM, Olivier Matz
> > <olivier.matz at 6wind.com> wrote:
> > > Hi Jan,
> > >
> > > On Thu, 16 Feb 2017 18:26:39 +0100, Jan Blunck
> > > <jblunck at infradead.org> wrote:
> > >> On Thu, Feb 16, 2017 at 2:48 PM, Olivier Matz
> > >> <olivier.matz at 6wind.com> wrote:
> > >> > On Mon, 6 Feb 2017 18:41:27 +0000, "Ananyev, Konstantin"
> > >> > <konstantin.ananyev at intel.com> wrote:
> > >> >> >
> > >> >> > The main changes are:
> > >> >> > - reorder structure to increase vector performance on some
> > >> >> > non-ia platforms.
> > >> >> > - add a 64bits timestamp field in the 1st cache line
> > >> >>
> > > >> >> I wonder why it deserves to be in the first cache line.
> > > >> >> How does it differ from seqn below (pure SW stuff right now)?
> > >> >
> > >> > In case the timestamp is set from a NIC value, it is set in the
> > > >> > Rx path. So that's why I think it deserves to be located in the
> > >> > 1st cache line.
> > >> >
> > > >> > As you said, the seqn is pure sw stuff right now: it is set in a
> > >> > lib, not in a PMD rx path.
> > >> >
> > >>
> > >> If we talk about setting the timestamp value in the RX path this
> > >> implicitly means software timestamps. Hardware timestamping usually
> > >> works by letting the hardware inject sync events for coarse time
> > > >> tracking and additionally injecting fine-grained per-packet ticks
> > > >> at a specific offset in the packet. For performance reasons I
> > >> don't think it makes sense to extract this during the burst and
> > >> write it into the mbuf again.
> > >
> > > > From what I've understood, at least it does not work like this for
> > > > Mellanox NICs: the timestamp is metadata attached to an rx packet. But
> > > > maybe they (and other NIC vendors interested in the feature) can
> > > confirm or not.
> > >
> >
> > Mellanox NICs use a 48bit cycle counter split into a high and low
> > part. To convert the cycle values into a timestamp you need to
> > initialize and maintain a timecounter that shifts the cycle count to
> > e.g. nanoseconds. IIRC Mellanox doesn't generate explicit clock events
> > but the cycle counter is large enough so that the user can easily
> > maintain the timecounter by manually updating it.
> >
> > >>
> > >> The problem with timestamps is to get the abstraction right wrt the
> > >> correction factors and the size of the tick vs. the timestamp in
> > >> the events injected. From my perspective it would be better to
> > >> extract the handling of timestamp data into a library with PMD
> > >> specific implementation of the conversions. That way the
> > >> normalized timestamp values can get extracted if they are present.
> > >> The mbuf itself would only indicate the presence of timestamp
> > >> metadata in that case.
> > >
> > > I agree however that we need to properly define the meaning of this
> > > field. My idea is:
> > >
> > > > - the timestamp is in nanoseconds
> > > - the reference is always the same for a given path: if the
> > > timestamp is set in a PMD, all the packets for this PMD will have
> > > the same reference, but for 2 different PMDs (or a sw lib), the
> > > reference would not be the same.
> > >
> > > I think it's enough for many use cases.
> > > We can later add helpers to compare timestamps with different
> > > references.
> >
> > My point is that I still doubt that it belongs in the first
> > cacheline. It requires accessing other structures for converting into
> > nanoseconds anyway. Ideally I would like to see this happen on
> > access instead, but if that isn't achievable, at least in a second step.
> 
> Sorry, I don't really get your point. My understanding of the timestamp
> usage in a PMD is as follows:
> 
> rx_burst(struct rxq *rxq, ...)
> {
> 	unsigned long factor = rxq->timestamp_factor;
> 	unsigned port = rxq->port;
> 
> 	for each hw_desc {
> 		m = rte_pktmbuf_alloc(rxq->pool);
> 		m->len = hw_desc->len;
> 		m->port = port;
> 		m->ol_flags =
> 		...
> 		m->timestamp = hw_desc->timestamp * factor;
> 	}
> 	...
> }
> 
> In that case, I think it deserves to be in the 1st cache line.

So you are saying that:
- for some HW that DPDK supports (mlx?), timestamp information
is available in the HW RX descriptor
- and as soon as a timestamp field is available in the mbuf, you plan
to populate it using this HW RXD field.
Is that so?
Konstantin
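
To make the two positions in the thread concrete, here is a minimal sketch
in plain C of how a wrapping 48-bit cycle counter (as Jan describes for
Mellanox) could be turned into nanoseconds, either per packet in the Rx
path or in a periodic update. Everything here is an assumption for
illustration: struct timecounter, struct hw_desc, struct pkt and the
mult/shift factors are hypothetical names, not DPDK or driver APIs. The
mult/shift scaling is similar in spirit to what the Linux kernel
timecounter does, but it is only one possible way to express the
"factor" in Olivier's pseudocode.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names throughout: none of these are DPDK or driver APIs. */

#define HW_CYCLE_BITS 48
#define HW_CYCLE_MASK ((UINT64_C(1) << HW_CYCLE_BITS) - 1)

/* Software state needed to turn raw device cycles into nanoseconds. */
struct timecounter {
	uint64_t last_cycles; /* last raw counter value seen (48 bits) */
	uint64_t nsec;        /* nanoseconds accumulated up to last_cycles */
	uint32_t mult;        /* ns = (cycles * mult) >> shift */
	uint32_t shift;
};

/* Stand-ins for a hardware Rx descriptor and an mbuf. */
struct hw_desc { uint64_t timestamp; uint32_t len; };
struct pkt { uint64_t timestamp; uint32_t len; };

/*
 * Scale a cycle delta to nanoseconds. The multiplication can overflow
 * 64 bits for very large deltas, and the fractional part is dropped;
 * a real implementation would have to handle both.
 */
static inline uint64_t
cycles_to_ns(const struct timecounter *tc, uint64_t cycles)
{
	assert(tc->shift < 64);
	return (cycles * tc->mult) >> tc->shift;
}

/*
 * Advance the timecounter to the raw value 'raw'. Must be called at
 * least once per counter wrap so that the masked delta stays correct.
 */
static inline uint64_t
timecounter_update(struct timecounter *tc, uint64_t raw)
{
	uint64_t delta = (raw - tc->last_cycles) & HW_CYCLE_MASK;

	tc->nsec += cycles_to_ns(tc, delta);
	tc->last_cycles = raw & HW_CYCLE_MASK;
	return tc->nsec;
}

/*
 * Per-packet conversion as it might appear in an Rx burst loop. It only
 * reads the timecounter, so the extra cost is one load of shared state
 * plus a subtract, mask, multiply and shift per packet.
 */
static inline void
rx_fill(struct pkt *m, const struct hw_desc *d, const struct timecounter *tc)
{
	uint64_t delta = (d->timestamp - tc->last_cycles) & HW_CYCLE_MASK;

	m->len = d->len;
	m->timestamp = tc->nsec + cycles_to_ns(tc, delta);
}
```

With this split, rx_fill() corresponds to converting in the Rx path, while
storing the raw hw_desc timestamp and calling cycles_to_ns() later would
correspond to converting on access. Which one is cheaper depends on how
many consumers actually read the field.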

