[dpdk-dev] [PATCH 0/2] rewritten rte_hash_crc() call

Neil Horman nhorman at tuxdriver.com
Fri Nov 14 01:52:11 CET 2014


On Thu, Nov 13, 2014 at 06:33:14PM +0100, Thomas Monjalon wrote:
> Any comment on these patches?
> 
> 2014-09-03 12:05, Yerden Zhumabekov:
> > As SSE4.2 provides CRC32 instructions with either 32 or 64 bit operands,
> > a new rte_hash_crc_8byte() call backed by the _mm_crc32_u64 intrinsic may
> > be useful.
> > 
> > Then, rte_hash_crc() function is redesigned to take advantage of both 32
> > and 64 bit operands. This improves the function's performance significantly.
> > 
> > Results of my test run on a single CPU core are below.
> > 
> > CPU: Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz
> > Number of iterations/chunks: 52428800
> > Chunk size: 24
> >   rte_hash_crc:            0.379 sec, hash: 0x14c64e11
> >   rte_hash_crc_new:        0.253 sec, hash: 0x14c64e11
> > Chunk size: 25
> >   rte_hash_crc:            0.442 sec, hash: 0xa9afc779
> >   rte_hash_crc_new:        0.316 sec, hash: 0xa9afc779
> > Chunk size: 26
> >   rte_hash_crc:            0.442 sec, hash: 0x92f2284b
> >   rte_hash_crc_new:        0.316 sec, hash: 0x92f2284b
> > Chunk size: 27
> >   rte_hash_crc:            0.442 sec, hash: 0x7c4655ff
> >   rte_hash_crc_new:        0.316 sec, hash: 0x7c4655ff
> > Chunk size: 28
> >   rte_hash_crc:            0.442 sec, hash: 0xf577c6b4
> >   rte_hash_crc_new:        0.316 sec, hash: 0xf577c6b4
> > Chunk size: 29
> >   rte_hash_crc:            0.505 sec, hash: 0x6e18ba55
> >   rte_hash_crc_new:        0.337 sec, hash: 0x6e18ba55
> > Chunk size: 30
> >   rte_hash_crc:            0.505 sec, hash: 0x35f07dbb
> >   rte_hash_crc_new:        0.337 sec, hash: 0x35f07dbb
> > Chunk size: 31
> >   rte_hash_crc:            0.505 sec, hash: 0x1bf2ee8c
> >   rte_hash_crc_new:        0.337 sec, hash: 0x1bf2ee8c
> > 
> > Yerden Zhumabekov (2):
> >   hash: add new rte_hash_crc_8byte call
> >   hash: rte_hash_crc uses 8- and 4-byte CRC32 intrinsics
> > 
> >  lib/librte_hash/rte_hash_crc.h |   47 +++++++++++++++++++++++++++++++++-------
> >  1 file changed, 39 insertions(+), 8 deletions(-)
> 
> 
Yeah, sorry I didn't speak up earlier.  I meant to ask: will the _mm_crc32_u64
intrinsic emit a software-emulated version of the SSE4.2 instruction if you
build with a config that doesn't enable SSE4.2?  If not, then NAK, since this
will break on the default build.  In that case you'll have to modify the new
function to do a runtime cpu flags check and then either use the instruction
(inlined with some asm) or emulate it in software.
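
To illustrate the runtime check described above, here is a rough sketch, not
the patch itself.  It assumes GCC or clang on x86_64 and uses the compiler's
__builtin_cpu_supports() plus a per-function target attribute instead of
hand-written asm.  The names crc32c_sw, crc32c_hw and crc32c are hypothetical
and are not part of DPDK; a real implementation would more likely query DPDK's
own cpu flags API.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#include <nmmintrin.h>
#define HAVE_X86_DISPATCH 1 /* hypothetical macro, for this sketch only */
#endif

/* Software fallback: bitwise CRC32C (reflected polynomial 0x82F63B78).
 * Like the crc32 instruction, it applies no initial or final inversion. */
static uint32_t crc32c_sw(const void *data, size_t len, uint32_t crc)
{
    const uint8_t *p = data;

    while (len--) {
        crc ^= *p++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return crc;
}

#ifdef HAVE_X86_DISPATCH
/* Hardware path: only this function is compiled with SSE4.2 enabled, so the
 * rest of the build can keep the default (non-SSE4.2) machine flags. */
__attribute__((target("sse4.2")))
static uint32_t crc32c_hw(const void *data, size_t len, uint32_t crc)
{
    const uint8_t *p = data;
    uint64_t c = crc;

    /* Consume 8 bytes per instruction while possible. */
    for (; len >= 8; len -= 8, p += 8) {
        uint64_t v;
        memcpy(&v, p, sizeof(v));
        c = _mm_crc32_u64(c, v);
    }
    crc = (uint32_t)c;

    /* Then at most one 4-byte step, then the remaining single bytes. */
    if (len >= 4) {
        uint32_t v;
        memcpy(&v, p, sizeof(v));
        crc = _mm_crc32_u32(crc, v);
        p += 4;
        len -= 4;
    }
    while (len--)
        crc = _mm_crc32_u8(crc, *p++);
    return crc;
}
#endif

/* Runtime dispatch: use the instruction only if the CPU reports SSE4.2. */
static uint32_t crc32c(const void *data, size_t len, uint32_t init)
{
#ifdef HAVE_X86_DISPATCH
    if (__builtin_cpu_supports("sse4.2"))
        return crc32c_hw(data, len, init);
#endif
    return crc32c_sw(data, len, init);
}
```

Both paths should return the same value for the same input, so a caller only
ever uses crc32c().  In a hot path the flag check would probably be done once
at init time and cached (for example in a function pointer) rather than on
every call.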

Neil


