[dpdk-users] vmxnet3 tx queue fills then never empties

Paul Atkins patkins at brocade.com
Fri Oct 7 16:29:03 CEST 2016


Hi,

I am having an issue with the vmxnet3 driver that I have done a lot of 
investigation into (details follow) and have reached the stage where I 
feel I need help from people with access to the vmxnet3 virtual nic code.

The issue is with the DPDK vmxnet3 driver, where a tx queue fills up 
and then never empties.  The trigger for this is sending 1500-byte 
packets out of the interface at ~60 kpps and then marking the tx 
interface as 'not connected' in VMware.  At this stage the tx queues 
fill up, and when the interface is then marked as 'connected' again in 
VMware some of the tx queues are in a state where they never send any 
further packets.  In my setup, which has 4 tx queues with the traffic 
shared equally across them, I typically see this in 1 of the 4 
queues when the interface comes back up.

If we don't use DPDK and send the traffic via the Linux kernel instead, 
the problem is not seen, which would suggest a bug in the DPDK driver. 
However, if I modify the Linux kernel driver to call 
vmxnet3_tq_tx_complete() inline with the tx code (in the same way that 
it is done for DPDK) then we start to see the bug with the kernel too.  
This suggests a timing issue with calling the tx_complete function: when 
it was called from the interrupt handler the issue was not seen.  
Further, adding debug output at the start of the tx_complete function 
(before any work is done) caused the issue to no longer be seen, again 
suggesting a timing race.
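
As a rough illustration of the inline-completion pattern described 
above, here is a simplified, self-contained C model of a completion ring 
that uses a generation (gen) bit.  All names (comp_desc, comp_ring, 
model_dev, tx_complete) and the ring size are hypothetical and chosen 
for this sketch.  It is not the vmxnet3 driver code and it does not 
reproduce the bug; it only shows the polling step that the tx path runs 
when completion is done inline rather than from an interrupt handler.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 8u /* hypothetical; kept small for illustration */

/* One completion descriptor: which tx descriptor finished, plus a gen bit. */
struct comp_desc {
    uint32_t txd_idx;
    uint8_t  gen;
};

/* Driver-side view of the completion ring. */
struct comp_ring {
    struct comp_desc desc[RING_SIZE];
    uint32_t next2proc; /* next descriptor the driver will inspect */
    uint8_t  gen;       /* gen value that marks a descriptor as new */
};

/* Device-side write position (stands in for the virtual NIC). */
struct model_dev {
    uint32_t idx;
    uint8_t  gen;
};

void comp_ring_init(struct comp_ring *ring, struct model_dev *dev)
{
    for (uint32_t i = 0; i < RING_SIZE; i++) {
        ring->desc[i].txd_idx = 0;
        ring->desc[i].gen = 0; /* differs from the initial gen, so "not new" */
    }
    ring->next2proc = 0;
    ring->gen = 1;
    dev->idx = 0;
    dev->gen = 1;
}

/* Model of the device posting one completion. */
void model_dev_complete(struct comp_ring *ring, struct model_dev *dev,
                        uint32_t txd_idx)
{
    ring->desc[dev->idx].txd_idx = txd_idx;
    ring->desc[dev->idx].gen = dev->gen; /* gen write publishes the entry */
    if (++dev->idx == RING_SIZE) {
        dev->idx = 0;
        dev->gen ^= 1; /* flip on wrap so stale entries stop matching */
    }
}

/*
 * Inline completion poll: consume every descriptor whose gen bit matches
 * the value the driver currently expects.  Returns the number consumed and
 * stores the last completed tx descriptor index in *last_txd.
 */
unsigned tx_complete(struct comp_ring *ring, uint32_t *last_txd)
{
    unsigned done = 0;

    assert(ring->next2proc < RING_SIZE);
    while (ring->desc[ring->next2proc].gen == ring->gen) {
        *last_txd = ring->desc[ring->next2proc].txd_idx;
        done++;
        if (++ring->next2proc == RING_SIZE) {
            ring->next2proc = 0;
            ring->gen ^= 1;
        }
    }
    return done;
}
```

In this model the driver only ever learns about completions by 
comparing gen bits, so if the device side never posts a matching entry 
the poll returns 0 indefinitely and the tx queue stays full.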

I then proceeded to add debug code that stores the index of next2fill 
and next2comp in the cmd ring, plus their gen bits, plus the gen bits of 
the 3 indices above/below.  For the data ring I stored the index of the 
comp_ring, its gen bit and the gen bits for the 3 indices above/below.  
These values were stored in binary form (no string conversion until 
later) each time we entered/exited the tx_complete function, and each 
time we exited the xmit function.  Once the issue was seen, these were 
formatted, and the values all looked correct for both the working and 
the non-working case.
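
For anyone wanting to try the same kind of low-overhead tracing, a 
minimal sketch of the idea follows.  It is not the debug code I actually 
used: the record layout, the names (trace_rec, trace_buf, 
pack_gen_window, trace_store, trace_dump) and the buffer size are all 
assumptions made for illustration.  The gen bits of the 3 indices on 
either side of an index are packed into one byte, records are written 
in binary form into a circular buffer, and formatting only happens when 
the buffer is dumped afterwards.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define TRACE_SLOTS 256u /* hypothetical; must be a power of two */

enum trace_point { TP_COMPLETE_ENTER, TP_COMPLETE_EXIT, TP_XMIT_EXIT };

/* One binary trace record; no string conversion on the hot path. */
struct trace_rec {
    uint32_t seq;       /* running record number */
    uint16_t next2fill;
    uint16_t next2comp;
    uint8_t  point;     /* enum trace_point */
    uint8_t  fill_gens; /* gen bits of next2fill-3 .. next2fill+3 */
    uint8_t  comp_gens; /* gen bits of next2comp-3 .. next2comp+3 */
};

struct trace_buf {
    struct trace_rec rec[TRACE_SLOTS];
    uint32_t head; /* total number of records ever stored */
};

/*
 * Pack the gen bits of the 7 ring entries centred on idx into one byte.
 * Bit k holds the gen bit of entry (idx - 3 + k), wrapping around the ring.
 * gen[] is a hypothetical array with one gen bit (0 or 1) per ring entry.
 */
uint8_t pack_gen_window(const uint8_t *gen, uint32_t ring_size, uint32_t idx)
{
    uint8_t bits = 0;

    assert(ring_size >= 3 && idx < ring_size);
    for (uint32_t k = 0; k < 7; k++) {
        uint32_t i = (idx + ring_size - 3 + k) % ring_size;
        bits |= (uint8_t)((gen[i] & 1u) << k);
    }
    return bits;
}

/* Store one record, overwriting the oldest once the buffer is full. */
void trace_store(struct trace_buf *tb, enum trace_point point,
                 const uint8_t *gen, uint32_t ring_size,
                 uint16_t next2fill, uint16_t next2comp)
{
    struct trace_rec *r = &tb->rec[tb->head & (TRACE_SLOTS - 1)];

    r->seq = tb->head++;
    r->next2fill = next2fill;
    r->next2comp = next2comp;
    r->point = (uint8_t)point;
    r->fill_gens = pack_gen_window(gen, ring_size, next2fill);
    r->comp_gens = pack_gen_window(gen, ring_size, next2comp);
}

/* Format the stored records, oldest first, once the issue has been seen. */
void trace_dump(const struct trace_buf *tb, FILE *out)
{
    uint32_t count = tb->head < TRACE_SLOTS ? tb->head : TRACE_SLOTS;

    for (uint32_t i = tb->head - count; i != tb->head; i++) {
        const struct trace_rec *r = &tb->rec[i & (TRACE_SLOTS - 1)];
        fprintf(out, "%u pt=%u fill=%u comp=%u fill_gens=0x%02x comp_gens=0x%02x\n",
                (unsigned)r->seq, (unsigned)r->point,
                (unsigned)r->next2fill, (unsigned)r->next2comp,
                (unsigned)r->fill_gens, (unsigned)r->comp_gens);
    }
}
```

Keeping the store path down to a few integer writes matters here, since 
heavier debug at the start of tx_complete was enough to hide the 
problem.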

I suspect that the cause of this is some quirk of the way the driver 
code is interacting with the virtual NIC, but I have no access to the 
code for the virtual NIC, so am struggling to make any progress 
identifying the root cause.

Is this a known issue, and do you have any suggestions as to how best to 
proceed with this?

I have seen this with the following versions:

ESXi 5.5 and later (VM version 10)
ESXi 5.0 and later (VM version 8)
Linux 4.4 kernel
dpdk 2.2

thanks,
Paul


