[dpdk-dev] [PATCH v3 2/2] mempool: pktmbuf pool default fallback for mempool ops error

Olivier Matz olivier.matz at 6wind.com
Fri Oct 14 14:10:54 CEST 2016


Hi Hemant,

Sorry for the late answer. Please see some comments inline.

On 10/13/2016 03:15 PM, Hemant Agrawal wrote:
> Hi Olivier,
>     Any updates w.r.t this patch set?
> 
> Regards
> Hemant
> On 9/22/2016 6:42 PM, Hemant Agrawal wrote:
>> Hi Olivier
>>
>> On 9/19/2016 7:27 PM, Olivier Matz wrote:
>>> Hi Hemant,
>>>
>>> On 09/16/2016 06:46 PM, Hemant Agrawal wrote:
>>>> In the rte_pktmbuf_pool_create, if the default external mempool is
>>>> not available, the implementation can default to "ring_mp_mc", which
>>>> is an software implementation.
>>>>
>>>> Signed-off-by: Hemant Agrawal <hemant.agrawal at nxp.com>
>>>> ---
>>>> Changes in V3:
>>>> * adding warning message to say that falling back to default sw pool
>>>> ---
>>>>  lib/librte_mbuf/rte_mbuf.c | 8 ++++++++
>>>>  1 file changed, 8 insertions(+)
>>>>
>>>> diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
>>>> index 4846b89..8ab0eb1 100644
>>>> --- a/lib/librte_mbuf/rte_mbuf.c
>>>> +++ b/lib/librte_mbuf/rte_mbuf.c
>>>> @@ -176,6 +176,14 @@ rte_pktmbuf_pool_create(const char *name,
>>>> unsigned n,
>>>>
>>>>      rte_errno = rte_mempool_set_ops_byname(mp,
>>>>              RTE_MBUF_DEFAULT_MEMPOOL_OPS, NULL);
>>>> +
>>>> +    /* on error, try falling back to the software based default
>>>> pool */
>>>> +    if (rte_errno == -EOPNOTSUPP) {
>>>> +        RTE_LOG(WARNING, MBUF, "Default HW Mempool not supported. "
>>>> +            "falling back to sw mempool \"ring_mp_mc\"");
>>>> +        rte_errno = rte_mempool_set_ops_byname(mp, "ring_mp_mc",
>>>> NULL);
>>>> +    }
>>>> +
>>>>      if (rte_errno != 0) {
>>>>          RTE_LOG(ERR, MBUF, "error setting mempool handler\n");
>>>>          return NULL;
>>>>
>>>
>>> Without adding a new method ".supported()", the first call to
>>> rte_mempool_populate() could return the same error ENOTSUP. In this
>>> case, it is still possible to fallback.
>>>
>> It will be bit late.
>>
>> On failure, than we have to set the default ops and do a goto before
>> rte_pktmbuf_pool_init(mp, &mbp_priv);

I still think we can do the job without adding the .supported() method.
The following code is just an (untested) example:

struct rte_mempool *
rte_pktmbuf_pool_create(const char *name, unsigned n,
    unsigned cache_size, uint16_t priv_size, uint16_t data_room_size,
    int socket_id)
{
    struct rte_mempool *mp;
    struct rte_pktmbuf_pool_private mbp_priv;
    unsigned elt_size;
    int ret;
    const char *ops[] = {
        RTE_MBUF_DEFAULT_MEMPOOL_OPS, "ring_mp_mc", NULL,
    };
    const char **op;

    if (RTE_ALIGN(priv_size, RTE_MBUF_PRIV_ALIGN) != priv_size) {
        RTE_LOG(ERR, MBUF, "mbuf priv_size=%u is not aligned\n",
            priv_size);
        rte_errno = EINVAL;
        return NULL;
    }
    elt_size = sizeof(struct rte_mbuf) + (unsigned)priv_size +
        (unsigned)data_room_size;
    mbp_priv.mbuf_data_room_size = data_room_size;
    mbp_priv.mbuf_priv_size = priv_size;

    for (op = &ops[0]; *op != NULL; op++) {
        mp = rte_mempool_create_empty(name, n, elt_size, cache_size,
            sizeof(struct rte_pktmbuf_pool_private), socket_id, 0);
        if (mp == NULL)
            return NULL;

        ret = rte_mempool_set_ops_byname(mp, *op, NULL);
        if (ret != 0) {
            RTE_LOG(ERR, MBUF, "error setting mempool handler\n");
            rte_mempool_free(mp);
            if (ret == -ENOTSUP)
                continue;
            rte_errno = -ret;
            return NULL;
        }
        rte_pktmbuf_pool_init(mp, &mbp_priv);

        ret = rte_mempool_populate_default(mp);
        if (ret < 0) {
            rte_mempool_free(mp);
            if (ret == -ENOTSUP)
                continue;
            rte_errno = -ret;
            return NULL;
        }
    }

    rte_mempool_obj_iter(mp, rte_pktmbuf_init, NULL);

    return mp;
}


>>> I've just submitted an RFC, which I think is quite linked:
>>> http://dpdk.org/ml/archives/dev/2016-September/046974.html
>>> Assuming a new parameter "mempool_ops" is added to
>>> rte_pktmbuf_pool_create(), would it make sense to fallback to
>>> "ring_mp_mc"? What about just returning ENOTSUP? The application could
>>> do the job and decide which sw fallback to use.
>>
>> We ran into this issue when trying to run the standard DPDK examples
>> (l3fwd) in VM. Do you think, is it practical to add fallback handling in
>> each of the DPDK examples?

OK. What is still unclear for me, is how the software is aware of the
different hardware-assisted handlers. Moreover, we could imagine more
software handlers, which could be used depending on the use case.

I think this choice has to be made by the user or the application:

- the application may want to use a specific (sw or hw) handler: in
  this case, it want to be notified if it fails, instead of having
  a quiet fallback to ring_mp_mc
- if several handlers are available, the application may want to
  try them in a specific order
- maybe some handlers will have some limitations with some
  configurations or driver? The application could decide to use
  a different handler according the configuration.

So that's why I think this is an application decision.

>> Typically when someone is writing a application on host, he need not
>> worry non-availability of the hw offloaded mempool. He may also want to
>> run the same binary in virtual machine. In VM, it is not guaranteed that
>> hw offloaded mempools will be available.

Running the same binary is of course a need. But if your VM does not
provide the same virtualized hardware than the host, I think the command
line option makes sense.

AFAIU, on the host, you can use a hw mempool handler or a sw one, and
on the guest, only a sw one. So you need an option to select the
behavior you want on the host, without recompiling.

I understand that modifying all the applications is not a good option
either. I'm thinking a a EAL parameter that would allow to configure a
library (mbuf in this case). Something like kernel boot options.
Example: testpmd -l 2,4 --opt mbuf.handler="ring_mp_mc" -- [app args]

I don't know if this is feasible, this has to be discussed first, but
what do you think on the principle? The value could also be an
ordered list if we want a fallback, and this option could be overriden
by the application before creating the mbuf pool. There was some
discussions about a kind of dpdk global configuration some time ago.


Regards,
Olivier


More information about the dev mailing list