[dpdk-dev] [PATCH v2] mempool: replace c memcpy code semantics with optimized rte_memcpy

Jerin Jacob jerin.jacob at caviumnetworks.com
Thu Jun 2 11:39:37 CEST 2016


On Thu, Jun 02, 2016 at 09:36:34AM +0200, Olivier MATZ wrote:
> Hi Jerin,
> 
> On 06/01/2016 09:00 AM, Jerin Jacob wrote:
> > On Tue, May 31, 2016 at 11:05:30PM +0200, Olivier MATZ wrote:
> >> Today, the object pointers are reversed only in the get(). It means
> >> that this code:
> >>
> >> 	rte_mempool_get_bulk(mp, table, 4);
> >> 	for (i = 0; i < 4; i++)
> >> 		printf("obj = %p\n", table[i]);
> >> 	rte_mempool_put_bulk(mp, table, 4);
> >>
> >>
> >> 	printf("-----\n");
> >> 	rte_mempool_get_bulk(mp, table, 4);
> >> 	for (i = 0; i < 4; i++)
> >> 		printf("obj = %p\n", table[i]);
> >> 	rte_mempool_put_bulk(mp, table, 4);
> >>
> >> prints:
> >>
> >> 	addr1
> >> 	addr2
> >> 	addr3
> >> 	addr4
> >> 	-----
> >> 	addr4
> >> 	addr3
> >> 	addr2
> >> 	addr1
> >>
> >> Which is quite strange.
> > 
> > IMO, it is the expected LIFO behavior. Right?
> > 
> > What is not expected is the following, which is the case after the change. Or am I
> > missing something here?
> > 
> > addr1
> > addr2
> > addr3
> > addr4
> > -----
> > addr1
> > addr2
> > addr3
> > addr4
> > 
> >>
> >> I don't think it would be an issue to replace the loop with
> >> rte_memcpy(); it may increase the copy speed and it will be
> >> more consistent with the put().
> >>
> 
> I think the LIFO behavior should occur on a per-bulk basis. I mean,
> it should behave like in the examples below:
> 
>   // pool cache is in state X
>   obj1 = mempool_get(mp)
>   obj2 = mempool_get(mp)
>   mempool_put(mp, obj2)
>   mempool_put(mp, obj1)
>   // pool cache is back in state X
> 
>   // pool cache is in state X
>   bulk1 = mempool_get_bulk(mp, 16)
>   bulk2 = mempool_get_bulk(mp, 16)
>   mempool_put_bulk(mp, bulk2, 16)
>   mempool_put_bulk(mp, bulk1, 16)
>   // pool cache is back in state X
> 

Per-entry LIFO behavior makes more sense in the _bulk_ case: the most recently
enqueued buffer comes out on the next "get", which makes it more likely that
the buffer is still in the last-level cache.

> Note that today it's not the case for bulks: since object addresses
> are reversed only in get(), we are not back in the original state.
> I don't really see the advantage of this.
> 
> Removing the reversing may accelerate the cache in case of bulk get,
> I think.

I tried it in my setup, and it reduced performance. Have you seen an
improvement in any setup?

Jerin

> 
> Regards,
> Olivier

