net/i40e: fix clang build error with 16B descriptors

Message ID 20191112134023.52623-1-bruce.richardson@intel.com (mailing list archive)
State Accepted, archived
Delegated to: xiaolong ye
Headers
Series net/i40e: fix clang build error with 16B descriptors |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-compilation success Compile Testing PASS
ci/travis-robot success Travis build: passed
ci/iol-mellanox-Performance success Performance Testing PASS
ci/Intel-compilation success Compilation OK

Commit Message

Bruce Richardson Nov. 12, 2019, 1:40 p.m. UTC
  When compiling with 16B descriptor support enabled, clang compiles gave an
error, complaining that the final parameter of _mm256_blend_epi32() had to
be an immediate value (i.e. compile-time constant). While it appears that
GCC was able to convert the constant variable value "fdir_blend_mask" into
the blend call, clang was not doing so. To guarantee the use of an
immediate we convert the variable value to a #define.

Fixes: 7d087a0a8b8e ("net/i40e: support flow director on AVX Rx")
Cc: harry.van.haaren@intel.com

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 drivers/net/i40e/i40e_rxtx_vec_avx2.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)
  

Comments

Xiaolong Ye Nov. 13, 2019, 2:31 a.m. UTC | #1
On 11/12, Bruce Richardson wrote:
>When compiling with 16B descriptor support enabled, clang compiles gave an
>error, complaining that the final parameter of _mm256_blend_epi32() had to
>be an immediate value (i.e. compile-time constant). While it appears that
>GCC was able to convert the constant variable value "fdir_blend_mask" into
>the blend call, clang was not doing so. To guarantee the use of an
>immediate we convert the variable value to a #define.
>
>Fixes: 7d087a0a8b8e ("net/i40e: support flow director on AVX Rx")
>Cc: harry.van.haaren@intel.com
>
>Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
>---
> drivers/net/i40e/i40e_rxtx_vec_avx2.c | 10 +++++-----
> 1 file changed, 5 insertions(+), 5 deletions(-)
>
>diff --git a/drivers/net/i40e/i40e_rxtx_vec_avx2.c b/drivers/net/i40e/i40e_rxtx_vec_avx2.c
>index b9f1a240c..3bcef1363 100644
>--- a/drivers/net/i40e/i40e_rxtx_vec_avx2.c
>+++ b/drivers/net/i40e/i40e_rxtx_vec_avx2.c
>@@ -529,6 +529,7 @@ _recv_raw_pkts_vec_avx2(struct i40e_rx_queue *rxq, struct rte_mbuf **rx_pkts,
> 			 * identifies an FDIR ID match, and zeros the RSS value
> 			 * in the mbuf on FDIR match to keep mbuf data clean.
> 			 */
>+#define FDIR_BLEND_MASK ((1 << 3) | (1 << 7))
> 
> 			/* Flags:
> 			 * - Take flags, shift bits to null out
>@@ -557,9 +558,8 @@ _recv_raw_pkts_vec_avx2(struct i40e_rx_queue *rxq, struct rte_mbuf **rx_pkts,
> 			 * otherwise the mb0_1 register RSS field is zeroed.
> 			 */
> 			const __m256i fdir_zero_mask = _mm256_setzero_si256();
>-			const uint32_t fdir_blend_mask = (1 << 3) | (1 << 7);
> 			__m256i tmp0_1 = _mm256_blend_epi32(fdir_zero_mask,
>-						fdir_mask, fdir_blend_mask);
>+						fdir_mask, FDIR_BLEND_MASK);
> 			__m256i fdir_mb0_1 = _mm256_and_si256(mb0_1, fdir_mask);
> 			mb0_1 = _mm256_andnot_si256(tmp0_1, mb0_1);
> 
>@@ -575,7 +575,7 @@ _recv_raw_pkts_vec_avx2(struct i40e_rx_queue *rxq, struct rte_mbuf **rx_pkts,
> 			__m256i tmp2_3 = _mm256_alignr_epi8(fdir_mask, fdir_mask, 12);
> 			__m256i fdir_mb2_3 = _mm256_and_si256(mb2_3, tmp2_3);
> 			tmp2_3 = _mm256_blend_epi32(fdir_zero_mask, tmp2_3,
>-						    fdir_blend_mask);
>+						    FDIR_BLEND_MASK);
> 			mb2_3 = _mm256_andnot_si256(tmp2_3, mb2_3);
> 			rx_pkts[i + 2]->hash.fdir.hi = _mm256_extract_epi32(fdir_mb2_3, 3);
> 			rx_pkts[i + 3]->hash.fdir.hi = _mm256_extract_epi32(fdir_mb2_3, 7);
>@@ -583,7 +583,7 @@ _recv_raw_pkts_vec_avx2(struct i40e_rx_queue *rxq, struct rte_mbuf **rx_pkts,
> 			__m256i tmp4_5 = _mm256_alignr_epi8(fdir_mask, fdir_mask, 8);
> 			__m256i fdir_mb4_5 = _mm256_and_si256(mb4_5, tmp4_5);
> 			tmp4_5 = _mm256_blend_epi32(fdir_zero_mask, tmp4_5,
>-						    fdir_blend_mask);
>+						    FDIR_BLEND_MASK);
> 			mb4_5 = _mm256_andnot_si256(tmp4_5, mb4_5);
> 			rx_pkts[i + 4]->hash.fdir.hi = _mm256_extract_epi32(fdir_mb4_5, 3);
> 			rx_pkts[i + 5]->hash.fdir.hi = _mm256_extract_epi32(fdir_mb4_5, 7);
>@@ -591,7 +591,7 @@ _recv_raw_pkts_vec_avx2(struct i40e_rx_queue *rxq, struct rte_mbuf **rx_pkts,
> 			__m256i tmp6_7 = _mm256_alignr_epi8(fdir_mask, fdir_mask, 4);
> 			__m256i fdir_mb6_7 = _mm256_and_si256(mb6_7, tmp6_7);
> 			tmp6_7 = _mm256_blend_epi32(fdir_zero_mask, tmp6_7,
>-						    fdir_blend_mask);
>+						    FDIR_BLEND_MASK);
> 			mb6_7 = _mm256_andnot_si256(tmp6_7, mb6_7);
> 			rx_pkts[i + 6]->hash.fdir.hi = _mm256_extract_epi32(fdir_mb6_7, 3);
> 			rx_pkts[i + 7]->hash.fdir.hi = _mm256_extract_epi32(fdir_mb6_7, 7);
>-- 
>2.21.0
>

Acked-by: Xiaolong Ye <xiaolong.ye@intel.com>

Applied to dpdk-next-net-intel, Thanks.
  
Ferruh Yigit Nov. 13, 2019, 3:42 p.m. UTC | #2
On 11/13/2019 2:31 AM, Ye Xiaolong wrote:
> On 11/12, Bruce Richardson wrote:
>> When compiling with 16B descriptor support enabled, clang compiles gave an
>> error, complaining that the final parameter of _mm256_blend_epi32() had to
>> be an immediate value (i.e. compile-time constant). While it appears that
>> GCC was able to convert the constant variable value "fdir_blend_mask" into
>> the blend call, clang was not doing so. To guarantee the use of an
>> immediate we convert the variable value to a #define.
>>
>> Fixes: 7d087a0a8b8e ("net/i40e: support flow director on AVX Rx")
>> Cc: harry.van.haaren@intel.com
>>
>> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
>> ---
>> drivers/net/i40e/i40e_rxtx_vec_avx2.c | 10 +++++-----
>> 1 file changed, 5 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/net/i40e/i40e_rxtx_vec_avx2.c b/drivers/net/i40e/i40e_rxtx_vec_avx2.c
>> index b9f1a240c..3bcef1363 100644
>> --- a/drivers/net/i40e/i40e_rxtx_vec_avx2.c
>> +++ b/drivers/net/i40e/i40e_rxtx_vec_avx2.c
>> @@ -529,6 +529,7 @@ _recv_raw_pkts_vec_avx2(struct i40e_rx_queue *rxq, struct rte_mbuf **rx_pkts,
>> 			 * identifies an FDIR ID match, and zeros the RSS value
>> 			 * in the mbuf on FDIR match to keep mbuf data clean.
>> 			 */
>> +#define FDIR_BLEND_MASK ((1 << 3) | (1 << 7))
>>
>> 			/* Flags:
>> 			 * - Take flags, shift bits to null out
>> @@ -557,9 +558,8 @@ _recv_raw_pkts_vec_avx2(struct i40e_rx_queue *rxq, struct rte_mbuf **rx_pkts,
>> 			 * otherwise the mb0_1 register RSS field is zeroed.
>> 			 */
>> 			const __m256i fdir_zero_mask = _mm256_setzero_si256();
>> -			const uint32_t fdir_blend_mask = (1 << 3) | (1 << 7);
>> 			__m256i tmp0_1 = _mm256_blend_epi32(fdir_zero_mask,
>> -						fdir_mask, fdir_blend_mask);
>> +						fdir_mask, FDIR_BLEND_MASK);
>> 			__m256i fdir_mb0_1 = _mm256_and_si256(mb0_1, fdir_mask);
>> 			mb0_1 = _mm256_andnot_si256(tmp0_1, mb0_1);
>>
>> @@ -575,7 +575,7 @@ _recv_raw_pkts_vec_avx2(struct i40e_rx_queue *rxq, struct rte_mbuf **rx_pkts,
>> 			__m256i tmp2_3 = _mm256_alignr_epi8(fdir_mask, fdir_mask, 12);
>> 			__m256i fdir_mb2_3 = _mm256_and_si256(mb2_3, tmp2_3);
>> 			tmp2_3 = _mm256_blend_epi32(fdir_zero_mask, tmp2_3,
>> -						    fdir_blend_mask);
>> +						    FDIR_BLEND_MASK);
>> 			mb2_3 = _mm256_andnot_si256(tmp2_3, mb2_3);
>> 			rx_pkts[i + 2]->hash.fdir.hi = _mm256_extract_epi32(fdir_mb2_3, 3);
>> 			rx_pkts[i + 3]->hash.fdir.hi = _mm256_extract_epi32(fdir_mb2_3, 7);
>> @@ -583,7 +583,7 @@ _recv_raw_pkts_vec_avx2(struct i40e_rx_queue *rxq, struct rte_mbuf **rx_pkts,
>> 			__m256i tmp4_5 = _mm256_alignr_epi8(fdir_mask, fdir_mask, 8);
>> 			__m256i fdir_mb4_5 = _mm256_and_si256(mb4_5, tmp4_5);
>> 			tmp4_5 = _mm256_blend_epi32(fdir_zero_mask, tmp4_5,
>> -						    fdir_blend_mask);
>> +						    FDIR_BLEND_MASK);
>> 			mb4_5 = _mm256_andnot_si256(tmp4_5, mb4_5);
>> 			rx_pkts[i + 4]->hash.fdir.hi = _mm256_extract_epi32(fdir_mb4_5, 3);
>> 			rx_pkts[i + 5]->hash.fdir.hi = _mm256_extract_epi32(fdir_mb4_5, 7);
>> @@ -591,7 +591,7 @@ _recv_raw_pkts_vec_avx2(struct i40e_rx_queue *rxq, struct rte_mbuf **rx_pkts,
>> 			__m256i tmp6_7 = _mm256_alignr_epi8(fdir_mask, fdir_mask, 4);
>> 			__m256i fdir_mb6_7 = _mm256_and_si256(mb6_7, tmp6_7);
>> 			tmp6_7 = _mm256_blend_epi32(fdir_zero_mask, tmp6_7,
>> -						    fdir_blend_mask);
>> +						    FDIR_BLEND_MASK);
>> 			mb6_7 = _mm256_andnot_si256(tmp6_7, mb6_7);
>> 			rx_pkts[i + 6]->hash.fdir.hi = _mm256_extract_epi32(fdir_mb6_7, 3);
>> 			rx_pkts[i + 7]->hash.fdir.hi = _mm256_extract_epi32(fdir_mb6_7, 7);
>> -- 
>> 2.21.0
>>
> 
> Acked-by: Xiaolong Ye <xiaolong.ye@intel.com>
> 
> Applied to dpdk-next-net-intel, Thanks.
> 

Adding following build error into commit log while pulling to next-net, thanks
to Bruce:

 .../i40e_rxtx_vec_avx2.c:561:21: error: argument to '__builtin_ia32_pblendd256'
must be a constant integer
                         __m256i tmp0_1 = _mm256_blend_epi32(fdir_zero_mask,
                                          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  

Patch

diff --git a/drivers/net/i40e/i40e_rxtx_vec_avx2.c b/drivers/net/i40e/i40e_rxtx_vec_avx2.c
index b9f1a240c..3bcef1363 100644
--- a/drivers/net/i40e/i40e_rxtx_vec_avx2.c
+++ b/drivers/net/i40e/i40e_rxtx_vec_avx2.c
@@ -529,6 +529,7 @@  _recv_raw_pkts_vec_avx2(struct i40e_rx_queue *rxq, struct rte_mbuf **rx_pkts,
 			 * identifies an FDIR ID match, and zeros the RSS value
 			 * in the mbuf on FDIR match to keep mbuf data clean.
 			 */
+#define FDIR_BLEND_MASK ((1 << 3) | (1 << 7))
 
 			/* Flags:
 			 * - Take flags, shift bits to null out
@@ -557,9 +558,8 @@  _recv_raw_pkts_vec_avx2(struct i40e_rx_queue *rxq, struct rte_mbuf **rx_pkts,
 			 * otherwise the mb0_1 register RSS field is zeroed.
 			 */
 			const __m256i fdir_zero_mask = _mm256_setzero_si256();
-			const uint32_t fdir_blend_mask = (1 << 3) | (1 << 7);
 			__m256i tmp0_1 = _mm256_blend_epi32(fdir_zero_mask,
-						fdir_mask, fdir_blend_mask);
+						fdir_mask, FDIR_BLEND_MASK);
 			__m256i fdir_mb0_1 = _mm256_and_si256(mb0_1, fdir_mask);
 			mb0_1 = _mm256_andnot_si256(tmp0_1, mb0_1);
 
@@ -575,7 +575,7 @@  _recv_raw_pkts_vec_avx2(struct i40e_rx_queue *rxq, struct rte_mbuf **rx_pkts,
 			__m256i tmp2_3 = _mm256_alignr_epi8(fdir_mask, fdir_mask, 12);
 			__m256i fdir_mb2_3 = _mm256_and_si256(mb2_3, tmp2_3);
 			tmp2_3 = _mm256_blend_epi32(fdir_zero_mask, tmp2_3,
-						    fdir_blend_mask);
+						    FDIR_BLEND_MASK);
 			mb2_3 = _mm256_andnot_si256(tmp2_3, mb2_3);
 			rx_pkts[i + 2]->hash.fdir.hi = _mm256_extract_epi32(fdir_mb2_3, 3);
 			rx_pkts[i + 3]->hash.fdir.hi = _mm256_extract_epi32(fdir_mb2_3, 7);
@@ -583,7 +583,7 @@  _recv_raw_pkts_vec_avx2(struct i40e_rx_queue *rxq, struct rte_mbuf **rx_pkts,
 			__m256i tmp4_5 = _mm256_alignr_epi8(fdir_mask, fdir_mask, 8);
 			__m256i fdir_mb4_5 = _mm256_and_si256(mb4_5, tmp4_5);
 			tmp4_5 = _mm256_blend_epi32(fdir_zero_mask, tmp4_5,
-						    fdir_blend_mask);
+						    FDIR_BLEND_MASK);
 			mb4_5 = _mm256_andnot_si256(tmp4_5, mb4_5);
 			rx_pkts[i + 4]->hash.fdir.hi = _mm256_extract_epi32(fdir_mb4_5, 3);
 			rx_pkts[i + 5]->hash.fdir.hi = _mm256_extract_epi32(fdir_mb4_5, 7);
@@ -591,7 +591,7 @@  _recv_raw_pkts_vec_avx2(struct i40e_rx_queue *rxq, struct rte_mbuf **rx_pkts,
 			__m256i tmp6_7 = _mm256_alignr_epi8(fdir_mask, fdir_mask, 4);
 			__m256i fdir_mb6_7 = _mm256_and_si256(mb6_7, tmp6_7);
 			tmp6_7 = _mm256_blend_epi32(fdir_zero_mask, tmp6_7,
-						    fdir_blend_mask);
+						    FDIR_BLEND_MASK);
 			mb6_7 = _mm256_andnot_si256(tmp6_7, mb6_7);
 			rx_pkts[i + 6]->hash.fdir.hi = _mm256_extract_epi32(fdir_mb6_7, 3);
 			rx_pkts[i + 7]->hash.fdir.hi = _mm256_extract_epi32(fdir_mb6_7, 7);