[dpdk-dev] [PATCH v3 2/2] mempool: pktmbuf pool default fallback for mempool ops error
Hemant Agrawal
hemant.agrawal at nxp.com
Thu Dec 8 09:57:49 CET 2016
Hi Olivier,
Apology for a delayed response.
> -----Original Message-----
> From: Olivier Matz [mailto:olivier.matz at 6wind.com]
> Sent: Tuesday, November 22, 2016 2:55 PM
> To: Hemant Agrawal <hemant.agrawal at nxp.com>
> Cc: dev at dpdk.org; jerin.jacob at caviumnetworks.com; david.hunt at intel.com
> Subject: Re: [PATCH v3 2/2] mempool: pktmbuf pool default fallback for
> mempool ops error
>
> Hi Hemant,
>
> Back on this topic, please see some comments below.
>
> On 11/07/2016 01:30 PM, Hemant Agrawal wrote:
> > Hi Olivier,
> >
> >> -----Original Message-----
> >> From: Olivier Matz [mailto:olivier.matz at 6wind.com]
> >> Sent: Friday, October 14, 2016 5:41 PM
> >>> On 9/22/2016 6:42 PM, Hemant Agrawal wrote:
> >>>> Hi Olivier
> >>>>
> >>>> On 9/19/2016 7:27 PM, Olivier Matz wrote:
> >>>>> Hi Hemant,
> >>>>>
> >>>>> On 09/16/2016 06:46 PM, Hemant Agrawal wrote:
> >>>>>> In the rte_pktmbuf_pool_create, if the default external mempool
> >>>>>> is not available, the implementation can default to "ring_mp_mc",
> >>>>>> which is an software implementation.
> >>>>>>
> >>>>>> Signed-off-by: Hemant Agrawal <hemant.agrawal at nxp.com>
> >>>>>> ---
> >>>>>> Changes in V3:
> >>>>>> * adding warning message to say that falling back to default sw
> >>>>>> pool
> >>>>>> ---
> >>>>>> lib/librte_mbuf/rte_mbuf.c | 8 ++++++++
> >>>>>> 1 file changed, 8 insertions(+)
> >>>>>>
> >>>>>> diff --git a/lib/librte_mbuf/rte_mbuf.c
> >>>>>> b/lib/librte_mbuf/rte_mbuf.c index 4846b89..8ab0eb1 100644
> >>>>>> --- a/lib/librte_mbuf/rte_mbuf.c
> >>>>>> +++ b/lib/librte_mbuf/rte_mbuf.c
> >>>>>> @@ -176,6 +176,14 @@ rte_pktmbuf_pool_create(const char *name,
> >>>>>> unsigned n,
> >>>>>>
> >>>>>> rte_errno = rte_mempool_set_ops_byname(mp,
> >>>>>> RTE_MBUF_DEFAULT_MEMPOOL_OPS, NULL);
> >>>>>> +
> >>>>>> + /* on error, try falling back to the software based default
> >>>>>> pool */
> >>>>>> + if (rte_errno == -EOPNOTSUPP) {
> >>>>>> + RTE_LOG(WARNING, MBUF, "Default HW Mempool not
> supported. "
> >>>>>> + "falling back to sw mempool \"ring_mp_mc\"");
> >>>>>> + rte_errno = rte_mempool_set_ops_byname(mp, "ring_mp_mc",
> >>>>>> NULL);
> >>>>>> + }
> >>>>>> +
> >>>>>> if (rte_errno != 0) {
> >>>>>> RTE_LOG(ERR, MBUF, "error setting mempool handler\n");
> >>>>>> return NULL;
> >>>>>>
> >>>>>
> >>>>> Without adding a new method ".supported()", the first call to
> >>>>> rte_mempool_populate() could return the same error ENOTSUP. In
> >>>>> this case, it is still possible to fallback.
> >>>>>
> >>>> It will be bit late.
> >>>>
> >>>> On failure, than we have to set the default ops and do a goto
> >>>> before rte_pktmbuf_pool_init(mp, &mbp_priv);
> >>
> >> I still think we can do the job without adding the .supported() method.
> >> The following code is just an (untested) example:
> >>
> >> struct rte_mempool *
> >> rte_pktmbuf_pool_create(const char *name, unsigned n,
> >> unsigned cache_size, uint16_t priv_size, uint16_t data_room_size,
> >> int socket_id)
> >> {
> >> struct rte_mempool *mp;
> >> struct rte_pktmbuf_pool_private mbp_priv;
> >> unsigned elt_size;
> >> int ret;
> >> const char *ops[] = {
> >> RTE_MBUF_DEFAULT_MEMPOOL_OPS, "ring_mp_mc", NULL,
> >> };
> >> const char **op;
> >>
> >> if (RTE_ALIGN(priv_size, RTE_MBUF_PRIV_ALIGN) != priv_size) {
> >> RTE_LOG(ERR, MBUF, "mbuf priv_size=%u is not aligned\n",
> >> priv_size);
> >> rte_errno = EINVAL;
> >> return NULL;
> >> }
> >> elt_size = sizeof(struct rte_mbuf) + (unsigned)priv_size +
> >> (unsigned)data_room_size;
> >> mbp_priv.mbuf_data_room_size = data_room_size;
> >> mbp_priv.mbuf_priv_size = priv_size;
> >>
> >> for (op = &ops[0]; *op != NULL; op++) {
> >> mp = rte_mempool_create_empty(name, n, elt_size, cache_size,
> >> sizeof(struct rte_pktmbuf_pool_private), socket_id, 0);
> >> if (mp == NULL)
> >> return NULL;
> >>
> >> ret = rte_mempool_set_ops_byname(mp, *op, NULL);
> >> if (ret != 0) {
> >> RTE_LOG(ERR, MBUF, "error setting mempool handler\n");
> >> rte_mempool_free(mp);
> >> if (ret == -ENOTSUP)
> >> continue;
> >> rte_errno = -ret;
> >> return NULL;
> >> }
> >> rte_pktmbuf_pool_init(mp, &mbp_priv);
> >>
> >> ret = rte_mempool_populate_default(mp);
> >> if (ret < 0) {
> >> rte_mempool_free(mp);
> >> if (ret == -ENOTSUP)
> >> continue;
> >> rte_errno = -ret;
> >> return NULL;
> >> }
> >> }
> >>
> >> rte_mempool_obj_iter(mp, rte_pktmbuf_init, NULL);
> >>
> >> return mp;
> >> }
> >>
> >>
> > [Hemant] This look fine to me. Please submit a patch for the same.
> >
> >>>>> I've just submitted an RFC, which I think is quite linked:
> >>>>> http://dpdk.org/ml/archives/dev/2016-September/046974.html
> >>>>> Assuming a new parameter "mempool_ops" is added to
> >>>>> rte_pktmbuf_pool_create(), would it make sense to fallback to
> >>>>> "ring_mp_mc"? What about just returning ENOTSUP? The application
> >>>>> could do the job and decide which sw fallback to use.
> >>>>
> >>>> We ran into this issue when trying to run the standard DPDK
> >>>> examples
> >>>> (l3fwd) in VM. Do you think, is it practical to add fallback
> >>>> handling in each of the DPDK examples?
> >>
> >> OK. What is still unclear for me, is how the software is aware of the
> >> different hardware-assisted handlers. Moreover, we could imagine more
> >> software handlers, which could be used depending on the use case.
> >>
> >> I think this choice has to be made by the user or the application:
> >>
> >> - the application may want to use a specific (sw or hw) handler: in
> >> this case, it want to be notified if it fails, instead of having
> >> a quiet fallback to ring_mp_mc
> >> - if several handlers are available, the application may want to
> >> try them in a specific order
> >> - maybe some handlers will have some limitations with some
> >> configurations or driver? The application could decide to use
> >> a different handler according the configuration.
> >>
> >> So that's why I think this is an application decision.
> >>
> > [Hemant] We should simplify it: if the application has supplied the handler, it
> is application's responsibility to take care of failure. Only if the application want
> to use the default handler, the implementation can fallback. The fallback
> handler can again be configurable.
>
> Honestly, I'm not very convinced that having a quiet fallback is the proper
> solution: the application won't be notified that the fallback occured.
>
> I understand your patch solves your use-case, but my fear is that we just
> integrate this minimal patch and never do the harder work of solving all the use-
> cases. I think we need to give the tools to the applications to control what
> occurs because we also need to solve these issues:
> - you have a hardware handler, but you want to use the software
> handler
> - you have several software handlers, and you (as an administrator) or
> the application knows which one is the best to use in your case.
>
> An alternative (which I don't like either) to solve your issue without modifying
> the mbuf code is to have your own handler fallbacking to ring_mp_mc.
>
[Hemant] are you suggesting that we also configure a fallback handler in addition to default?
I agree that we need to provide full flexibility to the applications, who are willing to implement the fallback mechanism.
>
> >>>> Typically when someone is writing a application on host, he need
> >>>> not worry non-availability of the hw offloaded mempool. He may also
> >>>> want to run the same binary in virtual machine. In VM, it is not
> >>>> guaranteed that hw offloaded mempools will be available.
> >>
> >> Running the same binary is of course a need. But if your VM does not
> >> provide the same virtualized hardware than the host, I think the
> >> command line option makes sense.
> >>
> >> AFAIU, on the host, you can use a hw mempool handler or a sw one, and
> >> on the guest, only a sw one. So you need an option to select the
> >> behavior you want on the host, without recompiling.
> >>
> >> I understand that modifying all the applications is not a good option
> >> either. I'm thinking a a EAL parameter that would allow to configure
> >> a library (mbuf in this case). Something like kernel boot options.
> >> Example: testpmd -l 2,4 --opt mbuf.handler="ring_mp_mc" -- [app args]
> >>
> >> I don't know if this is feasible, this has to be discussed first, but
> >> what do you think on the principle? The value could also be an
> >> ordered list if we want a fallback, and this option could be
> >> overriden by the application before creating the mbuf pool. There was
> >> some discussions about a kind of dpdk global configuration some time ago.
> >>
> > [Hemant] I agree that command line option provide a better control in this
> case.
> > On the flipside, We need to be careful that we do not end up having too many
> command line options.
>
> Yes, we should avoid having too many command line arguments.
> I think we should go in the direction of having a sort of key/value database in
> EAL. It could be set by:
> - a generic command line argument
> - a config file (maybe later)
> - the application through an API
>
> Then, it would be used by:
> - dpdk libraries
> - the application (maybe)
>
> This has been discussed some times, the latest was probably:
> http://dpdk.org/ml/archives/dev/2016-June/040079.html
>
> I think it would be a good tool for dpdk configurability, and it would help to
> remove some compile-time options and maybe some eal options.
> Unfortunately, I don't have the time to work on this at the moment.
>
> Regards,
> Olivier
More information about the dev
mailing list