[dpdk-dev] [PATCH] mbuf: replace c memcpy code semantics with optimized rte_memcpy
Hunt, David
david.hunt at intel.com
Fri Jun 24 17:56:39 CEST 2016
Hi Jerin,
I just ran a couple of tests on this patch on the latest master head on
two machines: an older quad-socket E5-4650 and a quad-socket
E5-2699 v3.
E5-4650:
I'm seeing a gain of 2% for un-cached tests and a gain of 9% on the
cached tests.
E5-2699 v3:
I'm seeing a loss of 0.1% for un-cached tests and a gain of 11% on the
cached tests.
This is purely the autotest comparison; I don't have traffic generator
results. But based on the above, I don't think there are any performance
issues with the patch.
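
For reference, the change under test swaps a per-pointer loop for a single
bulk copy. Below is a minimal standalone sketch of the two forms. It assumes
plain memcpy as a stand-in for rte_memcpy so that it builds without DPDK, and
the names put_bulk_loop, put_bulk_memcpy, copies_match and CACHE_SLOTS are
hypothetical, not taken from rte_mempool.h. It only checks that the two forms
produce the same result; it says nothing about their relative speed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical size for the stand-in cache array (not a DPDK constant). */
#define CACHE_SLOTS 8

/* Original form from the patch context: copy one pointer per iteration. */
static void put_bulk_loop(void **cache_objs, void * const *obj_table,
			  unsigned n)
{
	unsigned index;

	for (index = 0; index < n; ++index, obj_table++)
		cache_objs[index] = *obj_table;
}

/* Patched form: one bulk copy. Assumption: memcpy stands in for rte_memcpy. */
static void put_bulk_memcpy(void **cache_objs, void * const *obj_table,
			    unsigned n)
{
	memcpy(&cache_objs[0], obj_table, sizeof(void *) * n);
}

/* Returns 1 if both forms leave identical cache contents, 0 otherwise. */
static int copies_match(void)
{
	int objs[CACHE_SLOTS];
	void *table[CACHE_SLOTS];
	void *a[CACHE_SLOTS] = { NULL };
	void *b[CACHE_SLOTS] = { NULL };
	unsigned i, n = 5;

	assert(n <= CACHE_SLOTS);

	/* Fill the source table with distinct object addresses. */
	for (i = 0; i < CACHE_SLOTS; i++)
		table[i] = &objs[i];

	put_bulk_loop(a, table, n);
	put_bulk_memcpy(b, table, n);

	return memcmp(a, b, sizeof(a)) == 0;
}
```

Calling copies_match() from a small test main should return 1, since both
forms copy the same n pointers into the first n slots of the destination.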
Regards,
Dave.
On 24/5/2016 4:17 PM, Jerin Jacob wrote:
> On Tue, May 24, 2016 at 04:59:47PM +0200, Olivier Matz wrote:
>> Hi Jerin,
>>
>>
>> On 05/24/2016 04:50 PM, Jerin Jacob wrote:
>>> Signed-off-by: Jerin Jacob <jerin.jacob at caviumnetworks.com>
>>> ---
>>> lib/librte_mempool/rte_mempool.h | 5 ++---
>>> 1 file changed, 2 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
>>> index ed2c110..ebe399a 100644
>>> --- a/lib/librte_mempool/rte_mempool.h
>>> +++ b/lib/librte_mempool/rte_mempool.h
>>> @@ -74,6 +74,7 @@
>>> #include <rte_memory.h>
>>> #include <rte_branch_prediction.h>
>>> #include <rte_ring.h>
>>> +#include <rte_memcpy.h>
>>>
>>> #ifdef __cplusplus
>>> extern "C" {
>>> @@ -917,7 +918,6 @@ __mempool_put_bulk(struct rte_mempool *mp, void * const *obj_table,
>>> unsigned n, __rte_unused int is_mp)
>>> {
>>> struct rte_mempool_cache *cache;
>>> - uint32_t index;
>>> void **cache_objs;
>>> unsigned lcore_id = rte_lcore_id();
>>> uint32_t cache_size = mp->cache_size;
>>> @@ -946,8 +946,7 @@ __mempool_put_bulk(struct rte_mempool *mp, void * const *obj_table,
>>> */
>>>
>>> /* Add elements back into the cache */
>>> - for (index = 0; index < n; ++index, obj_table++)
>>> - cache_objs[index] = *obj_table;
>>> + rte_memcpy(&cache_objs[0], obj_table, sizeof(void *) * n);
>>>
>>> cache->len += n;
>>>
>>>
>> The commit title should be "mempool" instead of "mbuf".
> I will fix it.
>
>> Are you seeing some performance improvement by using rte_memcpy()?
> Yes, in some cases. In the default case, the loop was replaced with memcpy by
> the compiler itself (gcc 5.3). But when I tried the external mempool manager
> patch, performance dropped by almost 800Kpps. Debugging further, it turned out
> that an unrelated change in the external mempool manager patch was knocking
> out the memcpy. An explicit rte_memcpy brought back 500Kpps. The remaining
> 300Kpps drop is still unexplained (in my test setup, packets are in the local
> cache, so it must be something to do with a __mempool_put_bulk text alignment
> change or similar).
>
> Has anyone else observed a performance drop with the external pool manager?
>
> Jerin
>
>> Regards
>> Olivier