[dpdk-dev] [PATCH v2 1/3] mempool: add stack (lifo) mempool handler

Hunt, David david.hunt at intel.com
Fri Jun 17 16:18:26 CEST 2016


Hi Olivier,

On 23/5/2016 1:55 PM, Olivier Matz wrote:
> Hi David,
>
> Please find some comments below.
>
> On 05/19/2016 04:48 PM, David Hunt wrote:
>> [...]
>> +++ b/lib/librte_mempool/rte_mempool_stack.c
>> @@ -0,0 +1,145 @@
>> +/*-
>> + *   BSD LICENSE
>> + *
>> + *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
>> + *   All rights reserved.
> Should be 2016?

Yes, fixed.

>> ...
>> +
>> +static void *
>> +common_stack_alloc(struct rte_mempool *mp)
>> +{
>> +	struct rte_mempool_common_stack *s;
>> +	unsigned n = mp->size;
>> +	int size = sizeof(*s) + (n+16)*sizeof(void *);
>> +
>> +	/* Allocate our local memory structure */
>> +	s = rte_zmalloc_socket("common-stack",
> "mempool-stack" ?

Done

>> +			size,
>> +			RTE_CACHE_LINE_SIZE,
>> +			mp->socket_id);
>> +	if (s == NULL) {
>> +		RTE_LOG(ERR, MEMPOOL, "Cannot allocate stack!\n");
>> +		return NULL;
>> +	}
>> +
>> +	rte_spinlock_init(&s->sl);
>> +
>> +	s->size = n;
>> +	mp->pool = s;
>> +	rte_mempool_set_handler(mp, "stack");
> rte_mempool_set_handler() is a user function, it should not be called here

Removed.

>> +
>> +	return s;
>> +}
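
For reference, with the two changes above (the "mempool-stack" name and the
rte_mempool_set_handler() call dropped), the state struct and the alloc come
out roughly like this; a sketch only, the exact field types are my assumption:

struct rte_mempool_common_stack {
	rte_spinlock_t sl;	/* protects objs[] and len */
	uint32_t size;		/* maximum number of objects */
	uint32_t len;		/* current number of objects */
	void *objs[];		/* LIFO array of object pointers */
};

static void *
common_stack_alloc(struct rte_mempool *mp)
{
	struct rte_mempool_common_stack *s;
	unsigned int n = mp->size;
	int size = sizeof(*s) + (n + 16) * sizeof(void *);

	/* Allocate our local memory structure */
	s = rte_zmalloc_socket("mempool-stack", size,
			RTE_CACHE_LINE_SIZE, mp->socket_id);
	if (s == NULL) {
		RTE_LOG(ERR, MEMPOOL, "Cannot allocate stack!\n");
		return NULL;
	}

	rte_spinlock_init(&s->sl);
	s->size = n;
	mp->pool = s;

	return s;
}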
>> +
>> +static int common_stack_put(void *p, void * const *obj_table,
>> +		unsigned n)
>> +{
>> +	struct rte_mempool_common_stack *s = p;
>> +	void **cache_objs;
>> +	unsigned index;
>> +
>> +	rte_spinlock_lock(&s->sl);
>> +	cache_objs = &s->objs[s->len];
>> +
>> +	/* Is there sufficient space in the stack ? */
>> +	if ((s->len + n) > s->size) {
>> +		rte_spinlock_unlock(&s->sl);
>> +		return -ENOENT;
>> +	}
> The usual return value for a failing put() is ENOBUFS (see in rte_ring).

Done.
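
So the failure path now returns -ENOBUFS. For completeness, the full put
looks roughly like this (a sketch; the push loop is illustrative and the
final code may differ slightly):

static int
common_stack_put(void *p, void * const *obj_table, unsigned int n)
{
	struct rte_mempool_common_stack *s = p;
	void **cache_objs;
	unsigned int index;

	rte_spinlock_lock(&s->sl);
	cache_objs = &s->objs[s->len];

	/* Is there sufficient space in the stack? */
	if ((s->len + n) > s->size) {
		rte_spinlock_unlock(&s->sl);
		return -ENOBUFS;
	}

	/* Push the objects onto the top of the stack */
	for (index = 0; index < n; ++index, obj_table++)
		cache_objs[index] = *obj_table;

	s->len += n;
	rte_spinlock_unlock(&s->sl);

	return 0;
}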

>
> After reading it, I realize that it's nearly exactly the same code as
> in "app/test: test external mempool handler".
> http://patchwork.dpdk.org/dev/patchwork/patch/12896/
>
> We should drop one of them. If this stack handler is really useful for
> a performance use-case, it could go in librte_mempool. At the first
> read, the code looks like a demo example : it uses a simple spinlock for
> concurrent accesses to the common pool. Maybe the mempool cache hides
> this cost, in this case we could also consider removing the use of the
> rte_ring.

While I agree that the code is similar, the handler in the test is a
ring-based handler, whereas this patch adds an array-based handler.

I think the case for leaving it in, as a test of the handler framework added
by the previous mempool handler patch, is still valid, but there may be a
case for removing it once we add this stack handler. Maybe a future patch?
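
To illustrate what I mean by array-based: the get() side just pops from the
top of the same array under the lock, returning the most recently freed
objects first. A rough sketch of the idea (not necessarily the exact code in
the patch):

static int
common_stack_get(void *p, void **obj_table, unsigned int n)
{
	struct rte_mempool_common_stack *s = p;
	void **cache_objs;
	unsigned int index, len;

	rte_spinlock_lock(&s->sl);

	/* Not enough objects left in the stack */
	if (unlikely(n > s->len)) {
		rte_spinlock_unlock(&s->sl);
		return -ENOENT;
	}

	cache_objs = s->objs;

	/* Pop the most recently pushed objects first (LIFO order) */
	for (index = 0, len = s->len - 1; index < n; ++index, len--, obj_table++)
		*obj_table = cache_objs[len];

	s->len -= n;
	rte_spinlock_unlock(&s->sl);

	return 0;
}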

> Do you have some performance numbers? Do you know if it scales
> with the number of cores?

For mempool_perf_autotest, I'm seeing a 30% increase in performance for the
local-cache use case across 1 to 36 cores (the gain varies between 10% and
45% within those tests, averaging around 30% across all of them).

However, for the tests with no local cache configured, enqueue/dequeue
throughput drops by about 30%, with the 36-core case yielding the largest
drop of 40%. So this handler would not be recommended for no-cache
applications.
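
In other words, the intended use is with a per-lcore cache in front of the
spinlock-protected common pool, something like the sketch below. This assumes
the rte_mempool_create_empty()/rte_mempool_populate_default() APIs from the
mempool rework plus rte_mempool_set_handler() from earlier in this series;
the sizes are arbitrary examples:

#include <rte_mempool.h>

#define POOL_SIZE	8192
#define ELT_SIZE	2048
#define CACHE_SIZE	256	/* per-lcore cache; most ops never hit the lock */

static struct rte_mempool *
create_stack_pool(int socket_id)
{
	struct rte_mempool *mp;

	mp = rte_mempool_create_empty("stack_pool", POOL_SIZE, ELT_SIZE,
			CACHE_SIZE, 0, socket_id, 0);
	if (mp == NULL)
		return NULL;

	/* Select the stack (LIFO) handler before populating the pool */
	rte_mempool_set_handler(mp, "stack");

	if (rte_mempool_populate_default(mp) < 0) {
		rte_mempool_free(mp);
		return NULL;
	}

	return mp;
}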

> If we can identify the conditions where this mempool handler
> overperforms the default handler, it would be valuable to have them
> in the documentation.
>


Regards,
Dave.




