[PATCH v3] net/ixgbe: add proper memory barriers for some Rx functions

zhoumin zhoumin at loongson.cn
Mon Jun 12 13:58:53 CEST 2023


Hi Thomas,

On Mon, June 12, 2023 at 6:26PM, Thomas Monjalon wrote:
> 15/05/2023 04:10, Zhang, Qi Z:
>> From: Ruifeng Wang <Ruifeng.Wang at arm.com>
>>> From: Min Zhou <zhoumin at loongson.cn>
>>>> Segmentation fault has been observed while running the
>>>> ixgbe_recv_pkts_lro() function to receive packets on the Loongson
>>>> 3C5000 processor which has 64 cores and 4 NUMA nodes.
>>>>
>>>>  From the ixgbe_recv_pkts_lro() function, we found that as long as the
>>>> first packet has the EOP bit set, and the length of this packet is
>>>> less than or equal to rxq->crc_len, the segmentation fault will
>>>> definitely happen even though on the other platforms. For example, if
>>>> we made the first packet which had the EOP bit set had a zero length by
>>> force, the segmentation fault would happen on X86.
>>>> Because when processd the first packet the first_seg->next will be
>>>> NULL, if at the same time this packet has the EOP bit set and its
>>>> length is less than or equal to rxq->crc_len, the following loop will be
>>> executed:
>>>>      for (lp = first_seg; lp->next != rxm; lp = lp->next)
>>>>          ;
>>>>
>>>> We know that the first_seg->next will be NULL under this condition. So
>>>> the expression of
>>>> lp->next->next will cause the segmentation fault.
>>>>
>>>> Normally, the length of the first packet with EOP bit set will be
>>>> greater than rxq-
>>>>> crc_len. However, the out-of-order execution of CPU may make the read
>>>>> ordering of the
>>>> status and the rest of the descriptor fields in this function not be
>>>> correct. The related codes are as following:
>>>>
>>>>          rxdp = &rx_ring[rx_id];
>>>>   #1     staterr = rte_le_to_cpu_32(rxdp->wb.upper.status_error);
>>>>
>>>>          if (!(staterr & IXGBE_RXDADV_STAT_DD))
>>>>              break;
>>>>
>>>>   #2     rxd = *rxdp;
>>>>
>>>> The sentence #2 may be executed before sentence #1. This action is
>>>> likely to make the ready packet zero length. If the packet is the
>>>> first packet and has the EOP bit set, the above segmentation fault will
>>> happen.
>>>> So, we should add a proper memory barrier to ensure the read ordering
>>>> be correct. We also did the same thing in the ixgbe_recv_pkts()
>>>> function to make the rxd data be valid even though we did not find
>>> segmentation fault in this function.
>>>> Fixes: 8eecb3295ae ("ixgbe: add LRO support")
>>>> Cc: stable at dpdk.org
>>>>
>>>> Signed-off-by: Min Zhou <zhoumin at loongson.cn>
>>>> ---
>>>> v3:
>>>> - Use rte_smp_rmb() as the proper memory barrier instead of rte_rmb()
>>>> ---
>>>> v2:
>>>> - Make the calling of rte_rmb() for all platforms
>>>> ---
> [...]
>>> Reviewed-by: Ruifeng Wang <ruifeng.wang at arm.com>
>> Applied to dpdk-next-net-intel.
>>
>> Thanks
>> Qi
>>
> Why ignoring checkpatch?
> It is saying:
> "
> Warning in drivers/net/ixgbe/ixgbe_rxtx.c:
> Using rte_smp_[r/w]mb
> "


I'm sorry. Should we never use rte_smp_[r/w]mb in the driver's code?


> Ruifeng proposed "rte_atomic_thread_fence(__ATOMIC_ACQUIRE)"
> in a comment on the v2.


Thanks, I see. I think I also can use rte_atomic_thread_fence() to solve 
this problem. I will send the V4 patch.


>
> I will drop this patch from the pull of next-net-intel branch.
>



More information about the stable mailing list