[dpdk-dev] IXGBE RX packet loss with 5+ cores
Sanford, Robert
rsanford at akamai.com
Tue Oct 13 16:47:36 CEST 2015
>>> [Robert:]
>>> 1. The 82599 device supports up to 128 queues. Why do we see trouble
>>> with as few as 5 queues? What could limit the system (and one port
>>> controlled by 5+ cores) from receiving at line-rate without loss?
>>>
>>> 2. As far as we can tell, the RX path only touches the device
>>> registers when it updates a Receive Descriptor Tail register (RDT[n]),
>>> roughly every rx_free_thresh packets. Is there a big difference
>>> between one core doing this and N cores doing it 1/N as often?
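To make question 2 concrete: the only per-burst register write we see on
the RX path is the tail update, and how often it happens is set by
rx_free_thresh at queue-setup time. A minimal sketch of what we mean (the
function name and values are illustrative only; the
rte_eth_rx_queue_setup prototype is as in DPDK releases of this era):

    #include <string.h>
    #include <rte_ethdev.h>
    #include <rte_mempool.h>

    /* Illustrative values only. */
    #define RX_RING_SIZE   512
    #define RX_FREE_THRESH  32

    static int
    setup_rx_queue(uint8_t port, uint16_t queue, unsigned int socket_id,
                   struct rte_mempool *mb_pool)
    {
            struct rte_eth_rxconf rxconf;

            memset(&rxconf, 0, sizeof(rxconf));
            /*
             * The PMD only writes the queue's RDT register after it has
             * refilled rx_free_thresh descriptors, so this MMIO write
             * happens roughly once per 32 received packets here, not
             * once per packet.
             */
            rxconf.rx_free_thresh = RX_FREE_THRESH;

            return rte_eth_rx_queue_setup(port, queue, RX_RING_SIZE,
                                          socket_id, &rxconf, mb_pool);
    }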
>>[Stephen:]
>>As you add cores, there is more traffic on the PCI bus from each core
>>polling. There is a fixed number of PCI bus transactions per second
>>possible.
>>Each core is increasing the number of useless (empty) transactions.
>[Bruce:]
>The polling for packets by the core should not be using PCI bandwidth
>directly, as the ixgbe driver (and other drivers) check for the DD bit
>being set on the descriptor in memory/cache.
I was preparing to reply with the same point.
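For anyone following along: the descriptor ring lives in host memory and
the NIC DMAs the write-back (including the DD bit) into it, so an empty
poll is just a memory/cache read, not a PCI read. A toy model, with
made-up names rather than the real ixgbe driver structures:

    #include <stdint.h>
    #include <stdbool.h>

    /*
     * Toy model of an advanced RX descriptor in write-back form, just
     * enough to show where the DD (Descriptor Done) bit lives.  Names
     * are illustrative, not the driver's.
     */
    #define DD_BIT 0x01u

    struct rx_desc_wb {
            uint64_t pkt_addr_or_rss;   /* fields reused on write-back */
            uint32_t length_vlan;
            uint32_t status_error;      /* DD bit set by the NIC's DMA */
    };

    /*
     * Polling reads the descriptor from host memory; until the NIC has
     * written something back, this is usually a cache hit.  Empty polls
     * on extra queues therefore cost CPU cycles, not PCI bandwidth.
     */
    static inline bool
    rx_desc_done(const volatile struct rx_desc_wb *d)
    {
            return (d->status_error & DD_BIT) != 0;
    }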
>>[Stephen:] Why do you think adding more cores will help?
We're using run-to-completion and sometimes spend too many cycles per pkt.
We realize that we need to move to an io+workers model, but wanted a
better understanding of the dynamics involved here.
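For context, the split we have in mind is roughly the following. This is
only a sketch, not our code: the lcore functions, BURST size, and
port/queue 0 are illustrative, and the ring-call prototypes are as in
DPDK releases of this era (later releases added an extra out-parameter).
One I/O lcore drains the NIC queue and hands mbufs to workers over an
rte_ring, so per-packet processing cost no longer limits how fast the RX
ring is drained:

    #include <rte_ethdev.h>
    #include <rte_ring.h>
    #include <rte_mbuf.h>

    #define BURST 32

    static int
    io_lcore(void *arg)
    {
            struct rte_ring *to_workers = arg;
            struct rte_mbuf *bufs[BURST];

            for (;;) {
                    uint16_t n = rte_eth_rx_burst(0 /* port */,
                                                  0 /* queue */,
                                                  bufs, BURST);
                    if (n == 0)
                            continue;
                    unsigned sent = rte_ring_enqueue_burst(to_workers,
                                                    (void **)bufs, n);
                    while (sent < n)        /* drop if the ring is full */
                            rte_pktmbuf_free(bufs[sent++]);
            }
            return 0;
    }

    static int
    worker_lcore(void *arg)
    {
            struct rte_ring *from_io = arg;
            struct rte_mbuf *bufs[BURST];

            for (;;) {
                    unsigned n = rte_ring_dequeue_burst(from_io,
                                                    (void **)bufs, BURST);
                    for (unsigned i = 0; i < n; i++) {
                            /* expensive per-packet work goes here */
                            rte_pktmbuf_free(bufs[i]);
                    }
            }
            return 0;
    }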
>[Bruce:] However, using an increased number of queues can use PCI
>bandwidth in other ways, for instance, with more queues you reduce the
>amount of descriptor coalescing that can be done by the NICs, so that
>instead of having a single transaction of 4 descriptors to one queue,
>the NIC may instead have to do 4 transactions each writing 1 descriptor
>to 4 different queues. This is possibly why sending all traffic to a
>single queue works ok - the polling on the other queues is still being
>done, but has little effect.
Brilliant! This idea did not occur to me.
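To put rough numbers on it (assumptions: 16-byte advanced RX descriptors,
so 4 consecutive write-backs to one queue fit one 64-byte PCIe write, and
64-byte frames at 10GbE line rate, about 14.88 Mpps):

    #include <stdio.h>

    /* Back-of-envelope illustration of the coalescing effect above. */
    int main(void)
    {
            const double pps = 14.88e6;        /* 64B frames, 10GbE     */
            const double desc_per_txn_1q = 4;  /* coalesced, one queue  */
            const double desc_per_txn_nq = 1;  /* scattered over queues */

            printf("1 queue : ~%.1f M write-back transactions/s\n",
                   pps / desc_per_txn_1q / 1e6);
            printf("N queues: ~%.1f M write-back transactions/s\n",
                   pps / desc_per_txn_nq / 1e6);
            return 0;
    }

So splitting the same line-rate traffic across queues could roughly
quadruple the descriptor write-back transaction rate on the bus.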
--
Thanks guys,
Robert