[dpdk-dev] [PATCH RFC] librte_reorder: new reorder library
Bruce Richardson
bruce.richardson at intel.com
Thu Oct 9 11:14:21 CEST 2014
On Wed, Oct 08, 2014 at 04:07:28PM -0700, Matthew Hall wrote:
> On Wed, Oct 08, 2014 at 06:55:41PM -0400, Neil Horman wrote:
> > I think because there is a possibility that multiple workers may be used for a
> > single tx queue.
> >
> > Neil
>
> OK, so, in my application packets are RX'ed to a predictable RX queue and core
> using RSS.
>
> Then you put them into a predictable TX queue for the same core, in the same
> order they came in from the RX queue with RSS enabled.
>
> So you've got a consistent-hashed subset of packets as input, being converted
> to output in the same order.
>
> Will it work, or not work? I'm just curious if my app is doing it wrong and I
> need to fix it, or how this case should be handled in general...
>
> Matthew.
Hi Matthew,

What you are doing will indeed work, and it's the way the vast majority of
the sample apps are written. However, this will not always work for everyone
else, sadly.

First off, with RSS, there are a number of limitations. On the 1G and 10G
NICs, RSS works only with IP traffic, and won't work with other protocols
or where IP is encapsulated in anything other than a single VLAN. Those
cases need software load distribution. As well as this, you have very
little control over where flows get put, as the separation into queues
(which go to cores) is done only on the low seven bits of the RSS hash.
For applications which work with a small number of flows, e.g. where
multiple flows are contained inside a single tunnel, you get a large flow
imbalance, where far more traffic comes to one queue/core than to another.
Again, in this instance, software load balancing is needed.

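
To make the "low seven bits" point concrete, here is a rough model in plain
C. It is only a sketch, not DPDK API: it assumes a 128-entry redirection
table indexed by the low seven bits of a 32-bit RSS hash and filled
round-robin, and the names reta_fill, rss_queue and queue_histogram are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 128-entry redirection table, indexed by hash bits 0..6. */
#define RETA_SIZE 128u

/* Fill the table round-robin across nb_queues RX queues. */
static void reta_fill(uint8_t reta[RETA_SIZE], uint8_t nb_queues)
{
    assert(nb_queues > 0);
    for (uint32_t i = 0; i < RETA_SIZE; i++)
        reta[i] = (uint8_t)(i % nb_queues);
}

/* Only the low seven bits of the hash influence the queue choice. */
static uint8_t rss_queue(const uint8_t reta[RETA_SIZE], uint32_t hash)
{
    return reta[hash & (RETA_SIZE - 1u)];
}

/* Count how many of the given flow hashes land on each queue. */
static void queue_histogram(const uint8_t reta[RETA_SIZE],
                            const uint32_t *hashes, unsigned n,
                            unsigned *counts, uint8_t nb_queues)
{
    for (uint8_t q = 0; q < nb_queues; q++)
        counts[q] = 0;
    for (unsigned i = 0; i < n; i++)
        counts[rss_queue(reta, hashes[i])]++;
}
```

With four queues in this model, three flows whose hashes are 0x11, 0x91 and
0x1234FF15 all land on queue 1, leaving the other three queues idle. That is
the kind of imbalance a small number of flows can produce.
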
Secondly, following on from that, it is entirely possible when doing
software load balancing to strictly process packets for a flow in order -
and indeed this is what the existing packet distributor does. However, for
certain types of flow where processing of packets for that flow can be done
in parallel, forcing things to be done serially can slow things down. As
well as this, there can sometimes be requirements for the load balancing
between cores to be done as fairly as possible, so that all cores are
guaranteed approximately the same load, irrespective of the number of input
flows. In these cases, having the option to blindly distribute traffic to
cores and then reorder packets on TX is the best way to ensure even load
distribution. It's not going to be for everyone, but it's good to have the
option - and there are a number of people doing things this way already.

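
As an illustration of "reorder packets on TX", here is a minimal sketch in
plain C. It is not the API proposed in the RFC. It assumes each packet is
stamped with an increasing 32-bit sequence number before being handed to a
worker, and that the TX side keeps a small power-of-two window of slots. The
names struct rob, rob_init, rob_insert, rob_drain and ROB_SIZE are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ROB_SIZE 8u  /* hypothetical window size, must be a power of two */

struct rob {
    void    *slot[ROB_SIZE]; /* NULL means "not arrived yet" */
    uint32_t next_seq;       /* next sequence number due for TX */
};

static void rob_init(struct rob *r)
{
    assert(r != NULL);
    for (size_t i = 0; i < ROB_SIZE; i++)
        r->slot[i] = NULL;
    r->next_seq = 0;
}

/* Store a packet coming back from a worker. Returns false if the packet
 * is NULL or its sequence number falls outside the current window. */
static bool rob_insert(struct rob *r, uint32_t seq, void *pkt)
{
    uint32_t offset = seq - r->next_seq; /* unsigned, so wrap-safe */

    if (pkt == NULL || offset >= ROB_SIZE)
        return false;
    r->slot[seq & (ROB_SIZE - 1u)] = pkt;
    return true;
}

/* Copy the in-order run starting at next_seq into out[], up to max
 * entries, stopping at the first gap. Returns the number copied. */
static unsigned rob_drain(struct rob *r, void **out, unsigned max)
{
    unsigned n = 0;

    while (n < max) {
        uint32_t idx = r->next_seq & (ROB_SIZE - 1u);

        if (r->slot[idx] == NULL)
            break;
        out[n++] = r->slot[idx];
        r->slot[idx] = NULL;
        r->next_seq++;
    }
    return n;
}
```

If workers return packets 2, 0, 1 in that order, the first drain after
packet 0 arrives yields only packet 0, and the drain after packet 1 arrives
yields packets 1 and 2 together. A real implementation would also need a
policy for packets that never come back (a timeout, or skipping the gap),
which this sketch leaves out.
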
Lastly, there is also the assumption being made that all flows are
independent, which again may not always be the case. If you need ordering
across flows and also need to share load between cores, then reordering on
transmission is the only way to do things.

Hope this helps,
Regards,
/Bruce