[dpdk-dev] [PATCH v2 1/3] mempool: add stack (lifo) mempool handler
Hunt, David
david.hunt at intel.com
Fri Jun 17 16:18:26 CEST 2016
Hi Olivier,
On 23/5/2016 1:55 PM, Olivier Matz wrote:
> Hi David,
>
> Please find some comments below.
>
> On 05/19/2016 04:48 PM, David Hunt wrote:
>> [...]
>> +++ b/lib/librte_mempool/rte_mempool_stack.c
>> @@ -0,0 +1,145 @@
>> +/*-
>> + * BSD LICENSE
>> + *
>> + * Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
>> + * All rights reserved.
> Should be 2016?
Yes, fixed.
>> ...
>> +
>> +static void *
>> +common_stack_alloc(struct rte_mempool *mp)
>> +{
>> + struct rte_mempool_common_stack *s;
>> + unsigned n = mp->size;
>> + int size = sizeof(*s) + (n+16)*sizeof(void *);
>> +
>> + /* Allocate our local memory structure */
>> + s = rte_zmalloc_socket("common-stack",
> "mempool-stack" ?
Done
>> + size,
>> + RTE_CACHE_LINE_SIZE,
>> + mp->socket_id);
>> + if (s == NULL) {
>> + RTE_LOG(ERR, MEMPOOL, "Cannot allocate stack!\n");
>> + return NULL;
>> + }
>> +
>> + rte_spinlock_init(&s->sl);
>> +
>> + s->size = n;
>> + mp->pool = s;
>> + rte_mempool_set_handler(mp, "stack");
> rte_mempool_set_handler() is a user function, it should be called here
Removed.
>> +
>> + return s;
>> +}
>> +
>> +static int common_stack_put(void *p, void * const *obj_table,
>> + unsigned n)
>> +{
>> + struct rte_mempool_common_stack *s = p;
>> + void **cache_objs;
>> + unsigned index;
>> +
>> + rte_spinlock_lock(&s->sl);
>> + cache_objs = &s->objs[s->len];
>> +
>> + /* Is there sufficient space in the stack ? */
>> + if ((s->len + n) > s->size) {
>> + rte_spinlock_unlock(&s->sl);
>> + return -ENOENT;
>> + }
> The usual return value for a failing put() is ENOBUFS (see in rte_ring).
Done.
>
> After reading it, I realize that it's nearly exactly the same code than
> in "app/test: test external mempool handler".
> http://patchwork.dpdk.org/dev/patchwork/patch/12896/
>
> We should drop one of them. If this stack handler is really useful for
> a performance use-case, it could go in librte_mempool. At the first
> read, the code looks like a demo example : it uses a simple spinlock for
> concurrent accesses to the common pool. Maybe the mempool cache hides
> this cost, in this case we could also consider removing the use of the
> rte_ring.
While I agree that the code is similar, the handler in the test is a
ring-based handler, whereas this patch adds an array-based handler.
I think the case for leaving it in as a test of the standard handler,
as part of the previous mempool handler patch, is still valid, but there
may be a case for removing it if we add the stack handler. Maybe a
future patch?
> Do you have some some performance numbers? Do you know if it scales
> with the number of cores?
For mempool_perf_autotest, I'm seeing a 30% increase in performance for
the local-cache use case across 1-36 cores (results vary between a
10-45% gain within those tests, but average a 30% gain overall).
However, for the tests with no local cache configured, enqueue/dequeue
throughput drops by about 30%, with the 36-core case yielding the
largest drop of 40%. So this handler would not be recommended for
no-cache applications.
> If we can identify the conditions where this mempool handler
> overperforms the default handler, it would be valuable to have them
> in the documentation.
>
Regards,
Dave.