[dpdk-dev] IXGBE RX packet loss with 5+ cores

Sanford, Robert rsanford at akamai.com
Tue Oct 13 16:47:36 CEST 2015


>>> [Robert:]
>>> 1. The 82599 device supports up to 128 queues. Why do we see trouble
>>> with as few as 5 queues? What could limit the system (and one port
>>> controlled by 5+ cores) from receiving at line-rate without loss?
>>>
>>> 2. As far as we can tell, the RX path only touches the device
>>> registers when it updates a Receive Descriptor Tail register (RDT[n]),
>>> roughly every rx_free_thresh packets. Is there a big difference
>>> between one core doing this and N cores doing it 1/N as often?

>>[Stephen:]
>>As you add cores, there is more traffic on the PCI bus from each core
>>polling. There is a fixed number of PCI bus transactions per second
>>possible. Each core is increasing the number of useless (empty)
>>transactions.

>[Bruce:]
>The polling for packets by the core should not be using PCI bandwidth
>directly, as the ixgbe driver (and other drivers) check for the DD bit
>being set on the descriptor in memory/cache.

I was preparing to reply with the same point.
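
To make that concrete, here is a minimal C sketch of the polling pattern
as I understand it. It is not the ixgbe code: the struct layout, the
names (rx_desc, rx_queue, rxq_poll, nic_deliver) and the ring size are
all made up for illustration, and the tail-register write is only
counted, not performed. The point it tries to show is that an empty poll
reads host memory only, and that the device is touched once per
free_thresh packets.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 64u   /* descriptors per ring (assumed value) */
#define DD_BIT    0x1u  /* "descriptor done" flag, set by the NIC */

/* Hypothetical, simplified RX descriptor living in host memory. */
struct rx_desc {
    uint32_t status;    /* the NIC would set DD here via DMA */
    uint32_t length;
};

/* Hypothetical per-queue state for this sketch. */
struct rx_queue {
    struct rx_desc ring[RING_SIZE];
    uint32_t next;         /* next descriptor software will check */
    uint32_t held;         /* descriptors consumed but not yet returned */
    uint32_t free_thresh;  /* return descriptors in batches of this size */
    uint64_t mmio_writes;  /* simulated tail-register (RDT) writes */
};

static void rxq_init(struct rx_queue *q, uint32_t free_thresh)
{
    assert(free_thresh > 0 && free_thresh < RING_SIZE);
    *q = (struct rx_queue){ .free_thresh = free_thresh };
}

/* Stand-in for the NIC: mark n descriptors done, starting at start. */
static void nic_deliver(struct rx_queue *q, uint32_t start, uint32_t n)
{
    for (uint32_t i = 0; i < n; i++)
        q->ring[(start + i) % RING_SIZE].status |= DD_BIT;
}

/* Poll for up to burst packets. If the first descriptor has no DD bit,
 * the loop exits after a single read of host memory. */
static uint32_t rxq_poll(struct rx_queue *q, uint32_t burst)
{
    uint32_t got = 0;

    while (got < burst && (q->ring[q->next].status & DD_BIT)) {
        q->ring[q->next].status = 0;        /* slot is free again */
        q->next = (q->next + 1) % RING_SIZE;
        got++;
        if (++q->held >= q->free_thresh) {  /* the only device access */
            q->mmio_writes++;
            q->held = 0;
        }
    }
    return got;
}
```

In this model the total number of tail writes depends only on the total
number of packets and free_thresh, so N queues each receiving 1/N of the
traffic produce about the same number of writes as one queue receiving
all of it.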

>>[Stephen:] Why do you think adding more cores will help?

We're using run-to-completion and sometimes spend too many cycles per
packet. We realize that we need to move to an I/O+workers model, but
wanted a better understanding of the dynamics involved here.



>[Bruce:] However, using an increased number of queues can use PCI
>bandwidth in other ways, for instance, with more queues you reduce the
>amount of descriptor coalescing that can be done by the NICs, so that
>instead of having a single transaction of 4 descriptors to one queue,
>the NIC may instead have to do 4 transactions each writing 1 descriptor
>to 4 different queues. This is possibly why sending all traffic to a
>single queue works ok - the polling on the other queues is still being
>done, but has little effect.

Brilliant! This idea did not occur to me.
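
A rough way to put numbers on it, as a sketch only: assume the NIC can
merge up to max_batch completed descriptors of the same queue into one
write-back transaction, and that packets arriving in one window are
spread round-robin over the queues. Both assumptions, and the function
name writeback_transactions, are mine for illustration and are not taken
from the 82599 documentation.

```c
#include <assert.h>
#include <stdint.h>

/* Count write-back transactions for pkts packets arriving in one window,
 * spread round-robin over nq queues, when up to max_batch descriptors of
 * the SAME queue can share one transaction (assumed model). */
static uint32_t writeback_transactions(uint32_t pkts, uint32_t nq,
                                       uint32_t max_batch)
{
    assert(nq > 0 && max_batch > 0);

    uint32_t total = 0;
    for (uint32_t q = 0; q < nq; q++) {
        /* Round-robin share: the first (pkts % nq) queues get one extra. */
        uint32_t per_queue = pkts / nq + (q < pkts % nq ? 1u : 0u);
        /* Ceiling division: a partial batch still costs a transaction. */
        total += (per_queue + max_batch - 1) / max_batch;
    }
    return total;
}
```

With max_batch = 4, this gives 1 transaction for 4 packets on 1 queue and
4 transactions for the same 4 packets on 4 queues, which matches Bruce's
example.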



--
Thanks guys,
Robert


