[dpdk-users] vmxnet3 tx queue fills then never empties
Paul Atkins
patkins at brocade.com
Fri Oct 7 16:29:03 CEST 2016
Hi,
I am having an issue with the vmxnet3 driver that I have investigated
at length (details follow), and I have reached the stage where I need
help from people with access to the vmxnet3 virtual NIC code.
The issue is in the DPDK vmxnet3 driver: a tx queue fills up and then
never empties. The trigger is sending 1500-byte packets out of the
interface at ~60kpps and then marking the tx interface as 'not
connected' in VMware. At this stage the tx queues fill up, and when the
interface is marked as 'connected' again in VMware, some of the tx
queues are left in a state where they never send any further packets.
My setup has 4 tx queues with the traffic shared equally across them,
and I typically see this in 1 of the 4 queues when the interface comes
back up.
If we don't use DPDK and send the traffic via the Linux kernel instead,
the problem is not seen, which would suggest a bug in the DPDK driver.
However, if I modify the Linux kernel driver to call
vmxnet3_tq_tx_complete() inline with the tx code (in the same way that
it is done for DPDK), then we start to see the bug with the kernel too.
This suggests a timing issue with calling the tx_complete function. When
it was called from the interrupt handler the issue was not seen.
Further, adding debug output at the start of the tx_complete function
(before any work is done) caused the issue to no longer be seen, again
suggesting some timing race.
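For readers unfamiliar with the mechanism, here is a minimal sketch of
gen-bit based tx completion called inline from the transmit path. It is
a simplified model, not the driver code: struct tx_queue, xmit(),
tx_complete() and device_complete_one() are hypothetical names, the
"device" is an ordinary function, and one completion is assumed per
packet. It does not reproduce the bug; it only shows the ownership
check that the race would have to involve.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define RING_SIZE 8u

/* Hypothetical completion descriptor: the device writes gen last. */
struct comp_desc { uint32_t txd_idx; uint8_t gen; };

struct tx_queue {
    struct comp_desc comp[RING_SIZE];
    uint32_t comp_next;  /* next completion slot the driver inspects */
    uint8_t  comp_gen;   /* gen value meaning "written by the device" */
    uint32_t dev_next;   /* device-side write position (model only) */
    uint8_t  dev_gen;    /* device-side gen value (model only) */
    uint32_t in_flight;  /* packets queued but not yet reclaimed */
};

void txq_init(struct tx_queue *q)
{
    memset(q, 0, sizeof(*q));   /* all descriptors start with gen 0 */
    q->comp_gen = 1;            /* so nothing is owned by the driver yet */
    q->dev_gen = 1;
}

/* Reclaim every completion the device has published so far. */
uint32_t tx_complete(struct tx_queue *q)
{
    uint32_t done = 0;
    while (q->comp[q->comp_next].gen == q->comp_gen) {
        /* A real driver needs a read barrier here before it reads
         * the rest of the descriptor. */
        assert(q->in_flight > 0);
        q->in_flight--;
        done++;
        if (++q->comp_next == RING_SIZE) {
            q->comp_next = 0;
            q->comp_gen ^= 1u;  /* expected gen flips on every wrap */
        }
    }
    return done;
}

/* Inline completion: reclaim first, then queue if there is room. */
bool xmit(struct tx_queue *q)
{
    tx_complete(q);
    if (q->in_flight == RING_SIZE)
        return false;           /* ring full, packet not queued */
    q->in_flight++;
    return true;
}

/* Stand-in for the virtual NIC publishing one completion. */
void device_complete_one(struct tx_queue *q)
{
    struct comp_desc *d = &q->comp[q->dev_next];
    d->txd_idx = q->dev_next;
    d->gen = q->dev_gen;
    if (++q->dev_next == RING_SIZE) {
        q->dev_next = 0;
        q->dev_gen ^= 1u;
    }
}
```

In this model a queue that fills while the device publishes nothing
(the 'not connected' case) starts draining again as soon as
device_complete_one() runs, because the next xmit() sees a matching gen
bit. The stuck queues described above behave as if that match never
happens.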
I then added debug code that stores the next2fill and next2comp indices
in the cmd ring, plus their gen bits, plus the gen bits of the 3 indices
above/below each. For the data ring I stored the index of the
comp_ring, its gen bit and the gen bits for the 3 indices above/below.
These values were stored in binary form (no string conversion until
later) each time we entered/exited the tx_complete function, and each
time we exited the xmit function. Once the issue was seen, the records
were formatted, and the values all looked correct in both the working
and the non-working case.
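The kind of tracing described above can be sketched roughly as follows.
This is an illustration, not the code I actually used: struct
ring_view, struct trace_rec, trace_record() and trace_dump() are
hypothetical names, and each ring is reduced to an array holding one
gen bit per descriptor. The point is that the hot path only copies
fixed-size binary records, and formatting happens later.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define TRACE_SLOTS 1024u  /* power of two, so the index can be masked */
#define GEN_WINDOW  3u     /* neighbours sampled on each side */
#define GEN_SAMPLES (2u * GEN_WINDOW + 1u)

enum trace_point { TP_COMPLETE_ENTER, TP_COMPLETE_EXIT, TP_XMIT_EXIT };

/* Hypothetical simplified ring: one gen bit per descriptor. */
struct ring_view { const uint8_t *gen; uint32_t size; };

struct trace_rec {
    uint8_t  point;                 /* enum trace_point */
    uint32_t next2fill, next2comp;  /* cmd ring indices */
    uint32_t comp_idx;              /* comp ring index */
    uint8_t  fill_gen[GEN_SAMPLES]; /* gen bits around next2fill */
    uint8_t  n2c_gen[GEN_SAMPLES];  /* gen bits around next2comp */
    uint8_t  comp_gen[GEN_SAMPLES]; /* gen bits around comp_idx */
};

struct trace_log {
    struct trace_rec rec[TRACE_SLOTS];
    uint64_t count;                 /* total records ever written */
};

/* Copy the gen bit at idx and its neighbours, wrapping at the ring end. */
void sample_gen(const struct ring_view *r, uint32_t idx,
                uint8_t out[GEN_SAMPLES])
{
    assert(r->size > GEN_WINDOW && idx < r->size);
    for (uint32_t i = 0; i < GEN_SAMPLES; i++) {
        uint32_t pos = (idx + r->size - GEN_WINDOW + i) % r->size;
        out[i] = r->gen[pos] & 1u;
    }
}

/* Hot path: store raw values only, overwriting the oldest record. */
void trace_record(struct trace_log *tl, enum trace_point point,
                  const struct ring_view *cmd, uint32_t next2fill,
                  uint32_t next2comp, const struct ring_view *comp,
                  uint32_t comp_idx)
{
    struct trace_rec *t = &tl->rec[tl->count++ & (TRACE_SLOTS - 1u)];
    t->point = (uint8_t)point;
    t->next2fill = next2fill;
    t->next2comp = next2comp;
    t->comp_idx = comp_idx;
    sample_gen(cmd, next2fill, t->fill_gen);
    sample_gen(cmd, next2comp, t->n2c_gen);
    sample_gen(comp, comp_idx, t->comp_gen);
}

static void put_bits(FILE *out, const char *tag, const uint8_t *b)
{
    fprintf(out, " %s=", tag);
    for (uint32_t k = 0; k < GEN_SAMPLES; k++)
        fputc('0' + b[k], out);
}

/* Cold path: format the surviving records once the issue has been seen. */
void trace_dump(const struct trace_log *tl, FILE *out)
{
    uint64_t n = tl->count < TRACE_SLOTS ? tl->count : TRACE_SLOTS;
    for (uint64_t i = tl->count - n; i < tl->count; i++) {
        const struct trace_rec *t = &tl->rec[i & (TRACE_SLOTS - 1u)];
        fprintf(out, "%" PRIu64 " pt=%u n2f=%" PRIu32 " n2c=%" PRIu32
                " comp=%" PRIu32, i, (unsigned)t->point, t->next2fill,
                t->next2comp, t->comp_idx);
        put_bits(out, "fill_gen", t->fill_gen);
        put_bits(out, "n2c_gen", t->n2c_gen);
        put_bits(out, "comp_gen", t->comp_gen);
        fputc('\n', out);
    }
}
```

The middle character of each printed gen string is the gen bit at the
index itself, with the 3 neighbours below it on the left and the 3
above it on the right.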
I suspect that the cause of this is some quirk of the way the driver
code is interacting with the virtual NIC, but I have no access to the
code for the virtual NIC, so am struggling to make any progress
identifying the root cause.
Is this a known issue, and do you have any suggestions as to how best to
proceed with this?
I have seen this with the following versions:
ESXi 5.5 and later (VM version 10)
ESXi 5.0 and later (VM version 8)
Linux 4.4 kernel
dpdk 2.2
thanks,
Paul