[dpdk-dev] mbuf changes

Adrien Mazarguil adrien.mazarguil at 6wind.com
Tue Oct 25 15:48:17 CEST 2016


On Tue, Oct 25, 2016 at 02:16:29PM +0200, Morten Brørup wrote:
> Comments inline.

I'm only replying to the nb_segs bits here.

> > -----Original Message-----
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bruce Richardson
> > Sent: Tuesday, October 25, 2016 1:14 PM
> > To: Adrien Mazarguil
> > Cc: Morten Brørup; Wiles, Keith; dev at dpdk.org; Olivier Matz; Oleg
> > Kuporosov
> > Subject: Re: [dpdk-dev] mbuf changes
> > 
> > On Tue, Oct 25, 2016 at 01:04:44PM +0200, Adrien Mazarguil wrote:
> > > On Tue, Oct 25, 2016 at 12:11:04PM +0200, Morten Brørup wrote:
> > > > Comments inline.
> > > >
> > > > Med venlig hilsen / kind regards
> > > > - Morten Brørup
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Adrien Mazarguil [mailto:adrien.mazarguil at 6wind.com]
> > > > > Sent: Tuesday, October 25, 2016 11:39 AM
> > > > > To: Bruce Richardson
> > > > > Cc: Wiles, Keith; Morten Brørup; dev at dpdk.org; Olivier Matz; Oleg
> > > > > Kuporosov
> > > > > Subject: Re: [dpdk-dev] mbuf changes
> > > > >
> > > > > On Mon, Oct 24, 2016 at 05:25:38PM +0100, Bruce Richardson wrote:
> > > > > > On Mon, Oct 24, 2016 at 04:11:33PM +0000, Wiles, Keith wrote:
> > > > > [...]
> > > > > > > > On Oct 24, 2016, at 10:49 AM, Morten Brørup
> > > > > <mb at smartsharesystems.com> wrote:
> > > > > [...]
> > > > > > > > 5.
> > > > > > > >
> > > > > > > > And here’s something new to think about:
> > > > > > > >
> > > > > > > > m->next already reveals if there are more segments to a packet.
> > > > > > > > Which purpose does m->nb_segs serve that is not already covered
> > > > > > > > by m->next?
> > > > > >
> > > > > > It is duplicate info, but nb_segs can be used to check the
> > > > > > validity of the next pointer without having to read the second
> > > > > > mbuf cacheline.
> > > > > >
> > > > > > Whether it's worth having is something I'm happy enough to
> > > > > > discuss, though.
> > > > >
> > > > > Although slower in some cases than a full blown "next packet"
> > > > > pointer, nb_segs can also be conveniently abused to link several
> > > > > packets and their segments in the same list without wasting space.
> > > >
> > > > I don’t understand that; can you please elaborate? Are you abusing m->nb_segs as an index into an array in your application? If that is the case, and it is endorsed by the community, we should get rid of m->nb_segs and add a member for application specific use instead.
> > >
> > > Well, that's just an idea, I'm not aware of any application using
> > > this, however the ability to link several packets with segments seems
> > > useful to me (e.g. buffering packets). Here's a diagram:
> > >
> > >  .-----------.   .-----------.   .-----------.   .-----------.   .------
> > >  | pkt 0     |   | seg 1     |   | seg 2     |   | pkt 1     |   | pkt 2
> > >  |      next --->|      next --->|      next --->|      next --->| ...
> > >  | nb_segs 3 |   | nb_segs 1 |   | nb_segs 1 |   | nb_segs 1 |   |
> > >  `-----------'   `-----------'   `-----------'   `-----------'   `------
> 
> I see. It makes it possible to refer to a burst of packets (with segments or not) by a single mbuf reference, as an alternative to the current design pattern of using an array and length (struct rte_mbuf **mbufs, unsigned count).
> 
> This would require implementation in the PMDs etc.
> 
> And even in this case, m->nb_segs does not need to be an integer, but could be replaced by a single bit indicating if the segment is a continuation of a packet or the beginning (alternatively the end) of a packet, i.e. the bit can be set for either the first or the last segment in the packet.

Sure, however if we keep the current definition, a single bit would not be
enough as nb_segs must be nonzero for the buffer to be valid. I think an 8-bit
field is not that expensive for a counter.

> It is an almost equivalent alternative to the fundamental design pattern of using an array of mbuf with count, which is widely implemented in DPDK. And m->next still lives in the second cache line, so I don't see any gain by this.

That's right, it does not have to live in the first cache line; my only
concern was its complete removal.

> I still don't get how m->nb_segs can be abused without m->next.

By "abused" I mean that applications are not supposed to pass this kind of
mbuf list directly to existing mbuf-handling functions (TX burst,
rte_pktmbuf_free() and so on). However, these same applications (even PMDs)
can build such lists internally and temporarily because it's so simple.

The next pointer of the last segment of a packet must still be set to NULL
every time a packet is retrieved from such a list to be processed.
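
To illustrate, here is a rough sketch of how a packet could be taken out of
such a list. Note that struct mbuf below is a made-up stand-in holding only
next and nb_segs, not the real struct rte_mbuf, and pkt_list_pop() is a
hypothetical helper, not an existing DPDK function:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rte_mbuf: only the two fields discussed. */
struct mbuf {
	struct mbuf *next; /* next segment or, in the shared list, next packet */
	uint8_t nb_segs;   /* segment count, meaningful in a first segment */
};

/*
 * Detach the first packet from a list in which packets and their segments
 * share a single next chain. The last segment's next pointer is reset to
 * NULL so the result is a regular packet, and *list is advanced to the
 * following packet. Returns NULL if the list is empty, if nb_segs is zero,
 * or if the chain is shorter than nb_segs claims.
 */
static struct mbuf *
pkt_list_pop(struct mbuf **list)
{
	struct mbuf *pkt;
	struct mbuf *last;
	uint8_t i;

	assert(list != NULL);
	pkt = *list;
	if (pkt == NULL || pkt->nb_segs == 0)
		return NULL;
	/* Walk to the last segment of this packet. */
	last = pkt;
	for (i = 1; i < pkt->nb_segs; i++) {
		last = last->next;
		if (last == NULL)
			return NULL;
	}
	*list = last->next; /* first segment of the following packet, if any */
	last->next = NULL;  /* packet is now safe for regular mbuf functions */
	return pkt;
}
```

Only the first segment's nb_segs is read, so the cost is one walk over the
segments of the packet being retrieved.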

> > However, nb_segs may be a good candidate for demotion, along with
> > possibly the port value, or the reference count.

Yes, I think that's fine as long as it's kept somewhere.

-- 
Adrien Mazarguil
6WIND

