[PATCH 1/1] net/mlx5: fix device removal event handling

Raslan Darawsheh rasland at nvidia.com
Mon Jun 19 14:14:49 CEST 2023


Hi,

> -----Original Message-----
> From: Slava Ovsiienko <viacheslavo at nvidia.com>
> Sent: Tuesday, May 30, 2023 6:13 PM
> To: dev at dpdk.org
> Cc: Ori Kam <orika at nvidia.com>; Raslan Darawsheh <rasland at nvidia.com>;
> Matan Azrad <matan at nvidia.com>; stable at dpdk.org
> Subject: [PATCH 1/1] net/mlx5: fix device removal event handling
> 
> On device removal, the kernel notifies the user space application by queueing
> the IBV_DEVICE_FATAL_EVENT and signaling the appropriate file descriptor. The
> Mellanox kernel driver stack emits this event twice, from different layers
> (mlx5 and uverbs). The IB port index is not applicable in the event structure
> and should be ignored for IBV_DEVICE_FATAL_EVENT events.
> 
> Also, on older kernels (at least from OFED 4.9) there might be race
> conditions causing the event queue to close before the application fetches
> the IBV_DEVICE_FATAL_EVENT message with the ibv_get_async_event() API.
> 
> To provide reliable device removal event detection, the patch:
> 
>   - ignores the IB port index for the IBV_DEVICE_FATAL_EVENT
>   - introduces a flag to notify the PMD about removal only once
>   - acks the event with ibv_ack_async_event() after actual handling
>   - checks for the EIO error, making sure the queue is not closed yet
> 
> Fixes: 40d9f906f4e2 ("net/mlx5: fix device removal handler for multiport")
> Cc: stable at dpdk.org
> 
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo at nvidia.com>
> ---
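
For readers following the four points listed in the commit message, here is
a minimal sketch of that handling order. It is not the code from the patch.
To keep it self-contained it does not link against libibverbs: get_event()
and ack_event() are hypothetical stand-ins for ibv_get_async_event() and
ibv_ack_async_event(), and struct ev_source, struct shared_ctx,
notify_removal() and drain_async_events() are names invented for the
illustration. Note that libibverbs itself spells the event type
IBV_EVENT_DEVICE_FATAL. Treating EIO from the fetch call as "the queue is
already closed, so the device is gone" is my reading of the last bullet,
not something the message states outright.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the verbs event types used in this sketch. */
enum ev_type { EV_PORT_CHANGE, EV_DEVICE_FATAL };

struct async_event {
	enum ev_type type;
	int port_num;	/* not meaningful for EV_DEVICE_FATAL */
};

/* Simulated async event queue (plays the role of the verbs context). */
struct ev_source {
	const struct async_event *events;
	int count;
	int next;
	int acked;
	bool closed;	/* when drained, fail with EIO instead of EAGAIN */
};

/* Per-device state shared by all ports (hypothetical layout). */
struct shared_ctx {
	bool removed;			/* the "notify only once" flag */
	int removal_notifications;
	int port_events;
};

/* Stand-in for ibv_get_async_event(): 0 on success, -1 with errno set. */
static int get_event(struct ev_source *src, struct async_event *out)
{
	if (src->next < src->count) {
		*out = src->events[src->next++];
		return 0;
	}
	errno = src->closed ? EIO : EAGAIN;
	return -1;
}

/* Stand-in for ibv_ack_async_event(). */
static void ack_event(struct ev_source *src)
{
	src->acked++;
}

/* Report removal to the upper layer at most once per device. */
static void notify_removal(struct shared_ctx *sh)
{
	if (sh->removed)
		return;
	sh->removed = true;
	sh->removal_notifications++;
}

static void drain_async_events(struct ev_source *src, struct shared_ctx *sh,
			       int nports)
{
	struct async_event ev;

	assert(src != NULL && sh != NULL);
	for (;;) {
		if (get_event(src, &ev) != 0) {
			/* Assumption: EIO means the queue is already closed. */
			if (errno == EIO)
				notify_removal(sh);
			break;
		}
		if (ev.type == EV_DEVICE_FATAL) {
			/* The port index is deliberately not inspected here. */
			notify_removal(sh);
		} else if (ev.port_num >= 1 && ev.port_num <= nports) {
			sh->port_events++;	/* per-port handling goes here */
		}
		/* Ack only after the event has actually been handled. */
		ack_event(src);
	}
}
```

With this ordering a duplicated fatal event (one from each kernel layer) is
still acked both times, but the removal callback runs only once.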

Patch applied to next-net-mlx.

Kindest regards,
Raslan Darawsheh

