[dpdk-dev] [PATCH v2] Implement memcmp using AVX/SSE instructions.

Matt Laswell laswell at infiniteio.com
Sat May 9 01:25:10 CEST 2015


On Fri, May 8, 2015 at 5:54 PM, Ravi Kerur <rkerur at gmail.com> wrote:

>
>
> On Fri, May 8, 2015 at 3:29 PM, Matt Laswell <laswell at infiniteio.com>
> wrote:
>
>>
>>
>> On Fri, May 8, 2015 at 4:19 PM, Ravi Kerur <rkerur at gmail.com> wrote:
>>
>>> This patch replaces memcmp in librte_hash with rte_memcmp which is
>>> implemented with AVX/SSE instructions.
>>>
>>> +static inline int
>>> +rte_memcmp(const void *_src_1, const void *_src_2, size_t n)
>>> +{
>>> +       const uint8_t *src_1 = (const uint8_t *)_src_1;
>>> +       const uint8_t *src_2 = (const uint8_t *)_src_2;
>>> +       int ret = 0;
>>> +
>>> +       if (n & 0x80)
>>> +               return rte_cmp128(src_1, src_2);
>>> +
>>> +       if (n & 0x40)
>>> +               return rte_cmp64(src_1, src_2);
>>> +
>>> +       if (n & 0x20) {
>>> +               ret = rte_cmp32(src_1, src_2);
>>> +               n -= 0x20;
>>> +               src_1 += 0x20;
>>> +               src_2 += 0x20;
>>> +       }
>>>
>>>
>> Pardon me for butting in, but this seems incorrect for the first two
>> cases listed above, as the function as written will only compare the first
>> 128 or 64 bytes of each source and return the result.  The pattern
>> expressed in the 32 byte case appears more correct, as it compares the
>> first 32 bytes and then lets later pieces of the function handle the
>> smaller remaining bits of the sources. Also, if this function is to handle
>> arbitrarily large source data, the 128 byte case needs to be in a loop.
>>
>> What am I missing?
>>
>
> Current max hash key length supported is 64 bytes, hence no comparison is
> done after 64 bytes. 128 bytes comparison is added to measure performance
> only and there is no use-case as of now. With the current use-cases its not
> required but if there is a need to handle large arbitrary data upto 128
> bytes it can be modified.
>

Ah, gotcha.  I misunderstood and thought that this was meant to be a
generic AVX/SSE enabled memcmp() replacement, and that the use of it in
rte_hash was meant merely as a test case.   If it's more limited than that,
carry on, though you might want to make a note of it in the documentation.
I suspect others will misinterpret the name as I did.

--
Matt Laswell
infinite io, inc.
laswell at infiniteio.com


More information about the dev mailing list