[dpdk-dev] [PATCH v5 1/2] mbuf: support attaching external buffer to mbuf

Yongseok Koh yskoh at mellanox.com
Wed Apr 25 20:22:22 CEST 2018


On Wed, Apr 25, 2018 at 11:02:36AM -0700, Yongseok Koh wrote:
> On Wed, Apr 25, 2018 at 05:23:20PM +0000, Ananyev, Konstantin wrote:
> > 
> > 
> > > -----Original Message-----
> > > From: Yongseok Koh [mailto:yskoh at mellanox.com]
> > > Sent: Wednesday, April 25, 2018 6:07 PM
> > > To: Ananyev, Konstantin <konstantin.ananyev at intel.com>
> > > Cc: Lu, Wenzhuo <wenzhuo.lu at intel.com>; Wu, Jingjing <jingjing.wu at intel.com>; olivier.matz at 6wind.com; dev at dpdk.org;
> > > arybchenko at solarflare.com; stephen at networkplumber.org; thomas at monjalon.net; adrien.mazarguil at 6wind.com;
> > > nelio.laranjeiro at 6wind.com
> > > Subject: Re: [PATCH v5 1/2] mbuf: support attaching external buffer to mbuf
> > > 
> > > On Wed, Apr 25, 2018 at 01:31:42PM +0000, Ananyev, Konstantin wrote:
> > > [...]
> > > > >  /** Mbuf prefetch */
> > > > >  #define RTE_MBUF_PREFETCH_TO_FREE(m) do {       \
> > > > >  	if ((m) != NULL)                        \
> > > > > @@ -1213,11 +1306,127 @@ static inline int rte_pktmbuf_alloc_bulk(struct rte_mempool *pool,
> > > > >  }
> > > > >
> > > > >  /**
> > > > > + * Attach an external buffer to a mbuf.
> > > > > + *
> > > > > + * User-managed anonymous buffer can be attached to an mbuf. When attaching
> > > > > + * it, corresponding free callback function and its argument should be
> > > > > + * provided. This callback function will be called once all the mbufs are
> > > > > + * detached from the buffer.
> > > > > + *
> > > > > + * The headroom for the attaching mbuf will be set to zero and this can be
> > > > > + * properly adjusted after attachment. For example, ``rte_pktmbuf_adj()``
> > > > > + * or ``rte_pktmbuf_reset_headroom()`` can be used.
> > > > > + *
> > > > > + * More mbufs can be attached to the same external buffer by
> > > > > + * ``rte_pktmbuf_attach()`` once the external buffer has been attached by
> > > > > + * this API.
> > > > > + *
> > > > > + * Detachment can be done by either ``rte_pktmbuf_detach_extbuf()`` or
> > > > > + * ``rte_pktmbuf_detach()``.
> > > > > + *
> > > > > + * Attaching an external buffer is quite similar to mbuf indirection in
> > > > > + * replacing buffer addresses and length of a mbuf, but a few differences:
> > > > > + * - When an indirect mbuf is attached, refcnt of the direct mbuf would be
> > > > > + *   2 as long as the direct mbuf itself isn't freed after the attachment.
> > > > > + *   In such cases, the buffer area of a direct mbuf must be read-only. But
> > > > > + *   external buffer has its own refcnt and it starts from 1. Unless
> > > > > + *   multiple mbufs are attached to a mbuf having an external buffer, the
> > > > > + *   external buffer is writable.
> > > > > + * - There's no need to allocate buffer from a mempool. Any buffer can be
> > > > > + *   attached with appropriate free callback and its IO address.
> > > > > + * - Smaller metadata is required to maintain shared data such as refcnt.
> > > > > + *
> > > > > + * @warning
> > > > > + * @b EXPERIMENTAL: This API may change without prior notice.
> > > > > + * Once external buffer is enabled by allowing experimental API,
> > > > > + * ``RTE_MBUF_DIRECT()`` and ``RTE_MBUF_INDIRECT()`` are no longer
> > > > > + * exclusive. A mbuf can be considered direct if it is neither indirect nor
> > > > > + * having external buffer.
> > > > > + *
> > > > > + * @param m
> > > > > + *   The pointer to the mbuf.
> > > > > + * @param buf_addr
> > > > > + *   The pointer to the external buffer we're attaching to.
> > > > > + * @param buf_iova
> > > > > + *   IO address of the external buffer we're attaching to.
> > > > > + * @param buf_len
> > > > > + *   The size of the external buffer we're attaching to. If memory for
> > > > > + *   shared data is not provided, buf_len must be larger than the size of
> > > > > + *   ``struct rte_mbuf_ext_shared_info`` and padding for alignment. If not
> > > > > + *   enough, this function will return NULL.
> > > > > + * @param shinfo
> > > > > + *   User-provided memory for shared data. If NULL, a few bytes in the
> > > > > + *   trailer of the provided buffer will be dedicated for shared data and
> > > > > + *   the shared data will be properly initialized. Otherwise, user must
> > > > > + *   initialize the content except for free callback and its argument. The
> > > > > + *   pointer of shared data will be stored in m->shinfo.
> > > > > + * @param free_cb
> > > > > + *   Free callback function to call when the external buffer needs to be
> > > > > + *   freed.
> > > > > + * @param fcb_opaque
> > > > > + *   Argument for the free callback function.
> > > > > + *
> > > > > + * @return
> > > > > + *   A pointer to the new start of the data on success, return NULL
> > > > > + *   otherwise.
> > > > > + */
> > > > > +static inline char * __rte_experimental
> > > > > +rte_pktmbuf_attach_extbuf(struct rte_mbuf *m, void *buf_addr,
> > > > > +	rte_iova_t buf_iova, uint16_t buf_len,
> > > > > +	struct rte_mbuf_ext_shared_info *shinfo,
> > > > > +	rte_mbuf_extbuf_free_callback_t free_cb, void *fcb_opaque)
> > > > > +{
> > > > > +	/* Additional attachment should be done by rte_pktmbuf_attach() */
> > > > > +	RTE_ASSERT(!RTE_MBUF_HAS_EXTBUF(m));
> > > >
> > > > Shouldn't we have here something like:
> > > > RTE_ASSERT(RTE_MBUF_DIRECT(m) && rte_mbuf_refcnt_read(m) == 1);
> > > > ?
> > > 
> > > Right. That's better. Attaching mbuf should be direct and writable.
> > > 
> > > > > +
> > > > > +	m->buf_addr = buf_addr;
> > > > > +	m->buf_iova = buf_iova;
> > > > > +
> > > > > +	if (shinfo == NULL) {
> > > >
> > > > Instead of allocating shinfo ourselves - wound's it be better to rely
> > > > on caller always allocating afeeling it for us (he can do that at the end/start of buffer,
> > > > or whenever he likes to.
> > > 
> > > It is just for convenience. For some users, external attachment could be
> > > occasional and casual, e.g. punt control traffic from kernel/hv. For such
> > > non-serious cases, it is good to provide this small utility.
> > 
> > For such users that small utility could be a separate function then:
> > shinfo_inside_buf() or so.
> 
> I like this idea! As this is an inline function and can be called in a datapath,
> shorter code is better if it isn't expected to be used frequently.
> 
> Will take this idea for the new version. Thanks.

However, if this API is called with shinfo=NULL (builtin constant), this code
block won't get included in compile time because it is an inline function.

What is disadvantage to keep this block here? More intuitive?

Advantage of keeping it here could be simplicity. No need to call the utility in
advance.

Or separating this code to another inline function could make the API prototype
simpler because free_cb and its arg should be passed via shinfo.

static inline char * __rte_experimental
rte_pktmbuf_attach_extbuf(struct rte_mbuf *m, void *buf_addr,
		rte_iova_t buf_iova, uint16_t buf_len,
		struct rte_mbuf_ext_shared_info *shinfo)

I'm still inclined to write the utility function like you suggested.
Thoughts?

Thanks,
Yongseok

> > > > Again in that case - caller can provide one shinfo to several mbufs (with different buf_addrs)
> > > > and would know for sure that free_cb wouldn't be overwritten by mistake.
> > > > I.E. mbuf code will only update refcnt inside shinfo.
> > > 
> > > I think you missed the discussion with other people yesterday. This change is
> > > exactly for that purpose. Like I documented above, if this API is called with
> > > shinfo being provided, it will use the user-provided shinfo instead of sparing a
> > > few byte in the trailer and won't touch the shinfo.
> > 
> > As I can see your current code always update free_cb and fcb_opaque.
> > Which is kind of strange these fields shold be the same for all instances of the shinfo.


More information about the dev mailing list