[dpdk-dev] [PATCH 3/4] ixgbe: automatic link recovery on VF

Olivier MATZ olivier.matz at 6wind.com
Tue May 17 09:50:49 CEST 2016


Hi Wenzhuo,

On 05/17/2016 03:11 AM, Lu, Wenzhuo wrote:
>> -----Original Message-----
>> From: Olivier Matz [mailto:olivier.matz at 6wind.com]
>> If I understand well, ixgbevf_dev_link_up_down_handler() is called by
>> ixgbevf_recv_pkts_fake() on a dataplane core. It means that the core that
>> acquired the lock will loop during 100us + 1sec at least.
>> If this core was also in charge of polling other queues of other ports, or timers,
>> many packets will be dropped (even with a 100us loop). I don't think it is
>> acceptable to actively wait inside a rx function.
>>
>> I think it would avoid many issues to delegate this work to the application,
>> maybe by notifying it that the port is in a bad state and must be restarted. The
>> application could then properly stop polling the queues, and stop and restart the
>> port in a separate thread, without bothering the dataplane cores.
> Thanks for the comments.
> Yes, you're right. I had a wrong assumption that every queue is handled by one core.
> But surely it's not right, we cannot tell how the users will deploy their system.
>
> I plan to update this patch set. The solution now is, first let the users choose if they want this
> auto-reset feature. If so, we will apply another series rx/tx functions which have lock. So we
> can stop the rx/tx of the bad ports.
> And we also apply a reset API for users. The APPs should call this API in their management thread or so.
> It means APPs should guarantee the thread safe for the API.
> You see, there're 2 things,
> 1, Lock the rx/tx to stop them for users.
> 2, Apply a resetting API for users, and every NIC can do their own job. APPs need not to worry about the difference
> between different NICs.
>
> Surely, it's not *automatic* now. The reason is DPDK doesn't guarantee the thread safe. So the operations have to be
> left to the APPs and let them to guarantee the thread safe.
>
> And if the users choose not using auto-reset feature, we will leave this work to the APP :)

Yes, I think having 2 modes is a good approach:

- the first mode would let the application know a reset has to
   be performed, without active loop or lock in the rx/tx funcs.
- the second mode would transparently manage the reset in the driver,
   but may lock the core during some time.

By the way, you talk about a reset API, why not just using the
usual stop/start functions? I think it would work the same.

Regards,
Olivier


More information about the dev mailing list