[dpdk-dev] [PATCH v4 1/2] mbuf: support attaching external buffer to mbuf
Olivier Matz
olivier.matz at 6wind.com
Tue Apr 24 18:02:44 CEST 2018
Hi Andrew, Yongseok,
On Tue, Apr 24, 2018 at 03:28:33PM +0300, Andrew Rybchenko wrote:
> On 04/24/2018 04:38 AM, Yongseok Koh wrote:
> > This patch introduces a new way of attaching an external buffer to a mbuf.
> >
> > Attaching an external buffer is quite similar to mbuf indirection in
> > replacing buffer addresses and length of a mbuf, but a few differences:
> > - When an indirect mbuf is attached, refcnt of the direct mbuf would be
> > 2 as long as the direct mbuf itself isn't freed after the attachment.
> > In such cases, the buffer area of a direct mbuf must be read-only. But
> > external buffer has its own refcnt and it starts from 1. Unless
> > multiple mbufs are attached to a mbuf having an external buffer, the
> > external buffer is writable.
> > - There's no need to allocate buffer from a mempool. Any buffer can be
> > attached with appropriate free callback.
> > - Smaller metadata is required to maintain shared data such as refcnt.
>
> Really useful. Many thanks. See my notes below.
>
> It worries me that detach is more expensive than it really required since it
> requires to restore mbuf as direct. If mbuf mempool is used for mbufs
> as headers for external buffers only all these actions are absolutely
> useless.
I agree on the principle. And we have the same issue with indirect mbuf.
Currently, the assumption is that a free mbuf (inside a mempool) is
initialized as a direct mbuf. We can think about optimizations here,
but I'm not sure it should be in this patchset.
[...]
> > @@ -688,14 +704,33 @@ rte_mbuf_to_baddr(struct rte_mbuf *md)
> > }
> > /**
> > + * Returns TRUE if given mbuf is cloned by mbuf indirection, or FALSE
> > + * otherwise.
> > + *
> > + * If a mbuf has its data in another mbuf and references it by mbuf
> > + * indirection, this mbuf can be defined as a cloned mbuf.
> > + */
> > +#define RTE_MBUF_CLONED(mb) ((mb)->ol_flags & IND_ATTACHED_MBUF)
> > +
> > +/**
> > * Returns TRUE if given mbuf is indirect, or FALSE otherwise.
> > */
> > -#define RTE_MBUF_INDIRECT(mb) ((mb)->ol_flags & IND_ATTACHED_MBUF)
> > +#define RTE_MBUF_INDIRECT(mb) RTE_MBUF_CLONED(mb)
>
> It is still confusing that INDIRECT != !DIRECT.
> May be we have no good options right now, but I'd suggest to at least
> deprecate
> RTE_MBUF_INDIRECT() and completely remove it in the next release.
Agree. I may have missed something, but is my previous suggestion
not doable?
- direct = embeds its own data (and indirect = !direct)
- clone (or another name) = data is another mbuf
- extbuf = data is in an external buffer
Deprecating the macro is a good idea.
> > + m->buf_addr = buf_addr;
> > + m->buf_iova = buf_iova;
> > +
> > + if (shinfo == NULL) {
> > + shinfo = RTE_PTR_ALIGN_FLOOR(RTE_PTR_SUB(buf_end,
> > + sizeof(*shinfo)), sizeof(uintptr_t));
> > + if ((void *)shinfo <= buf_addr)
> > + return NULL;
> > +
> > + m->buf_len = RTE_PTR_DIFF(shinfo, buf_addr);
> > + } else {
> > + m->buf_len = buf_len;
> > + }
> > +
> > + m->data_len = 0;
> > +
> > + rte_pktmbuf_reset_headroom(m);
>
> I would suggest to make data_off one more parameter.
> If I have a buffer with data which I'd like to attach to an mbuf, I'd like
> to control data_off.
Another option is to set the headroom to 0.
Because the after attaching the mbuf to an external buffer, we will
still require to set the length.
A user can do something like this:
rte_pktmbuf_attach_extbuf(m, buf_va, buf_iova, buf_len, shinfo,
free_cb, free_cb_arg);
rte_pktmbuf_append(m, data_len + headroom);
rte_pktmbuf_adj(m, headroom);
>
> > + m->ol_flags |= EXT_ATTACHED_MBUF;
> > + m->shinfo = shinfo;
> > +
> > + rte_mbuf_ext_refcnt_set(shinfo, 1);
>
> Why is assignment used here? Cannot we attach extbuf already attached to
> other mbuf?
In rte_pktmbuf_attach(), this is true. That's not illogical to
keep the same approach here. Maybe an assert could be added?
> May be shinfo should be initialized only if it is not provided (shinfo ==
> NULL on input)?
I don't get why, can you explain please?
Thanks,
Olivier
More information about the dev
mailing list