[dpdk-dev] qos: traffic shaping at queue level

Dumitrescu, Cristian cristian.dumitrescu at intel.com
Tue Oct 11 16:11:35 CEST 2016



From: Nikhil Jagtap [mailto:nikhil.jagtap at gmail.com]
Sent: Wednesday, October 5, 2016 8:10 AM
To: Dumitrescu, Cristian <cristian.dumitrescu at intel.com>
Cc: dev at dpdk.org; users at dpdk.org
Subject: Re: qos: traffic shaping at queue level

Hi Cristian,

Thanks for the info. A few more comments/questions inline.

On 3 October 2016 at 23:42, Dumitrescu, Cristian <cristian.dumitrescu at intel.com> wrote:


From: Nikhil Jagtap [mailto:nikhil.jagtap at gmail.com]
Sent: Friday, September 30, 2016 7:12 AM
To: dev at dpdk.org; Dumitrescu, Cristian <cristian.dumitrescu at intel.com>; users at dpdk.org
Subject: Re: qos: traffic shaping at queue level

Hi,
Can someone please answer my queries?
I tried using queue weights to distribute traffic-class bandwidth among the child queues, but did not get the desired results.
[Cristian] Can you please describe what issues you see?
[Nikhil] At the end of a 20-minute test, the total number of packets dequeued from the respective queues was not in the 1:5 ratio.
In another test, where 4 equal-rate traffic streams were hitting 4 different queues of the same TC configured with weights 1:2:4:8, I observed that the queue with the highest weight had the fewest dequeued packets, when in theory it should have had the most.

[Cristian] No idea why you are running into this issue … Please keep me posted once you find the root cause, maybe there is something that we can improve here.

Regards,
Nikhil

On 27 September 2016 at 15:34, Nikhil Jagtap <nikhil.jagtap at gmail.com> wrote:
Hi,

I have a few questions about the hierarchical scheduler. I am taking a simple example here to get a better understanding.

Reference example:
  pipe rate = 30 Mbps
  tc 0 rate = 30 Mbps
  traffic-type 0 being queued to queue 0, tc 0.
  traffic-type 1 being queued to queue 1, tc 0.
  Assume traffic-type 0 is being received at the rate of 25 Mbps.
  Assume traffic-type 1 is also being received at the rate of 25 Mbps.

Requirement:
  To limit traffic-type 0 to (CIR =  5 Mbps, PIR = 30 Mbps), AND
      limit traffic-type 1 to (CIR = 25 Mbps, PIR = 30 Mbps).
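
For reference, these CIR/PIR pairs map directly onto the two-rate three-color marker (trTCM) parameters of librte_meter. A minimal sketch, assuming the rte_meter API of that DPDK generation; the cbs/pbs burst sizes are illustrative values, not taken from this thread:

#include <rte_meter.h>

/* Rates are expressed in bytes per second in the rte_meter API. */
static struct rte_meter_trtcm_params type0_meter_params = {
	.cir = 5000000 / 8,    /* committed rate:  5 Mbps */
	.pir = 30000000 / 8,   /* peak rate:      30 Mbps */
	.cbs = 2048,           /* committed burst size (bytes), illustrative */
	.pbs = 2048,           /* peak burst size (bytes), illustrative */
};

static struct rte_meter_trtcm_params type1_meter_params = {
	.cir = 25000000 / 8,   /* committed rate: 25 Mbps */
	.pir = 30000000 / 8,   /* peak rate:      30 Mbps */
	.cbs = 2048,
	.pbs = 2048,
};

Each traffic-type would own one rte_meter_trtcm object configured from these parameters; whether the resulting packet color can then be used to enforce the rates inside the scheduler is what question 2 below is about.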

The questions:
1) I understand that with the scheduler, it is possible to do rate limiting only at the sub-port and pipe levels and not at the individual queue level.
[Cristian] Yes, correct, only subports and pipes own token buckets, with all the pipe traffic classes and queues sharing their pipe token bucket.
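
As an illustration of that split, here is a minimal sketch of a subport configuration, with field names from the rte_sched.h of that DPDK generation and values following the 30 Mbps reference example (tb_size and tc_period are made-up values). The pipe-level parameters carry a similar token bucket plus per-queue WRR weights, and neither structure exposes a per-queue rate:

#include <rte_sched.h>

static struct rte_sched_subport_params subport_params = {
	.tb_rate   = 30000000 / 8,   /* subport token bucket rate, in bytes/s */
	.tb_size   = 1000000,        /* bucket size, in credits (illustrative) */
	.tc_rate   = { 30000000 / 8, 30000000 / 8, 30000000 / 8, 30000000 / 8 },
	.tc_period = 10,             /* TC credit update period, in ms (illustrative) */
	/* Note: no per-queue rate fields exist at any level of the hierarchy. */
};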

Is it possible to achieve rate limiting using the notion of queue weights? For the above example, will assigning weights in a 1:5 ratio to the two queues help shape the two traffic-types at the two different rates?
[Cristian] Yes. However, enforcing the weights accurately relies on all the queues being backlogged (always having packets to dequeue). When a pipe and a given TC are examined for dequeuing, the relative weights are enforced only between the queues that have packets at that precise moment, with the empty queues being ignored. The fully backlogged scenario rarely takes place in practice, and the set of non-empty queues changes over time. As said in the past, having big relative weight ratios between queues helps (1:5 should be good).
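
As a concrete illustration, the 1:5 example could be expressed as a pipe profile along these lines (a sketch assuming the 4 TC x 4 queue layout of that DPDK generation, where wrr_weights[] is indexed by tc_index * 4 + queue_index; tb_size and tc_period are illustrative):

#include <rte_sched.h>

static struct rte_sched_pipe_params pipe_profile_1_5 = {
	.tb_rate   = 30000000 / 8,   /* pipe rate: 30 Mbps, in bytes/s */
	.tb_size   = 1000000,        /* illustrative */
	.tc_rate   = { 30000000 / 8, 30000000 / 8, 30000000 / 8, 30000000 / 8 },
	.tc_period = 40,             /* illustrative, in ms */
	.wrr_weights = {
		1, 5, 1, 1,          /* TC 0: queue 0 weight 1, queue 1 weight 5 */
		1, 1, 1, 1,          /* TC 1 */
		1, 1, 1, 1,          /* TC 2 */
		1, 1, 1, 1,          /* TC 3 */
	},
};

The profile would be listed in rte_sched_port_params.pipe_profiles and bound to a pipe with rte_sched_pipe_config(port, subport_id, pipe_id, profile_index). As noted above, the 1:5 ratio is only enforced between the TC 0 queues that are non-empty at the moment the TC is served.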
[Nikhil] I see. So I guess not having fully backlogged queues could be one of the reasons for the observations I mentioned above, where the weight ratio does not directly translate into a rate ratio. I should also mention that there was no pipelining, i.e. packet processing, enqueuing, and dequeuing were all being done inline in a run-to-completion model.
a) Would having some kind of pipelining help achieve a better rate ratio? Maybe at least splitting the enqueue and dequeue operations?
b) If pipelining is not an option, what would be the recommended values for the enqueue and dequeue packet counts in the run-to-completion model? You have mentioned in one of your presentations to use different values for the two. If I go with (enqueue# > dequeue#), don't I run the risk of filling up the scheduler queues and seeing failed enqueues even at rates lower than the scheduler pipe rates? In the other case, where (dequeue# > enqueue#), we would end up dequeuing all the packets that were enqueued every time.

[Cristian]
a) In order to provide determinism for the hierarchical scheduler (e.g. frequent-enough calls of the enqueue and dequeue operations), I recommend dedicating a separate CPU core to run it, as opposed to running a lot of other stuff on the same core, which might result in the scheduler not being called regularly. This requires a pipeline of at least two CPU cores, i.e. one running your worker (run-to-completion), which feeds the second core running the scheduler.
b) As documented, for performance reasons, the API is not thread safe, so you need to run enqueue and dequeue of a given port on the same CPU core. Any (enqueue, dequeue) pair with enqueue > dequeue works. For DPDK apps using a vector PMD, the burst size is usually 32, so we typically use e.g. (32, 28) or (32, 24); for apps not using a vector PMD, we used (64, 48) and (64, 32) in the past; recently, in Cisco VPP, we used (256, 240), as the typical VPP burst size is 256 packets.
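
A minimal sketch of such a dedicated scheduler core using the (32, 24) pair mentioned above. The rte_ring hand-off from the worker core ("sched_ring") and the TX port/queue ids are assumptions made for this sketch; the rte_sched, rte_ring and rte_ethdev calls follow the API of that DPDK generation:

#include <rte_ethdev.h>
#include <rte_mbuf.h>
#include <rte_ring.h>
#include <rte_sched.h>

#define SCHED_ENQ_BURST 32
#define SCHED_DEQ_BURST 24

static void
sched_core_loop(struct rte_sched_port *port, struct rte_ring *sched_ring,
		uint8_t tx_port_id, uint16_t tx_queue_id)
{
	struct rte_mbuf *enq[SCHED_ENQ_BURST];
	struct rte_mbuf *deq[SCHED_DEQ_BURST];

	for (;;) {
		/* Pull up to 32 packets classified by the worker core and hand
		 * them to the scheduler; packets that hit a congested queue are
		 * dropped inside rte_sched_port_enqueue() (tail drop or WRED). */
		unsigned int n_enq = rte_ring_sc_dequeue_burst(sched_ring,
				(void **)enq, SCHED_ENQ_BURST);
		if (n_enq > 0)
			rte_sched_port_enqueue(port, enq, n_enq);

		/* Dequeue at most 24 packets and push them to the wire. */
		int n_deq = rte_sched_port_dequeue(port, deq, SCHED_DEQ_BURST);
		if (n_deq > 0) {
			uint16_t n_tx = rte_eth_tx_burst(tx_port_id, tx_queue_id,
					deq, (uint16_t)n_deq);
			while (n_tx < n_deq)
				rte_pktmbuf_free(deq[n_tx++]);
		}
	}
}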

2) In continuation of the previous question: if queue weights don't help, would it be possible to use metering to achieve rate limiting? Assume we meter the individual traffic-types (using the CIR-PIR config mentioned above) before queuing them to the scheduler queues. To achieve the respective queue rates, the dequeuer would then be expected to prioritise green packets over yellow.
Looking into the code, the packet color is used as an input to the dropper block, but does not seem to be used anywhere in the scheduler. So I guess it is not possible to prioritise green packets when dequeuing?
[Cristian] Packet color is used by Weighted RED (WRED) congestion management scheme on the enqueue side, not on the dequeue side. Once the packet has been enqueued, it cannot be dropped (i.e. every enqueued packet will eventually be dequeued), so rate limiting cannot be enforced on the dequeue side.
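
For illustration, here is a sketch of how the color is consumed on the enqueue side when the RTE_SCHED_RED build-time option is enabled: each (traffic class, color) pair gets its own RED profile, and making the yellow/red profiles drop earlier than green is what effectively de-prioritises non-conforming packets. The threshold values below are purely illustrative:

#include <rte_meter.h>
#include <rte_red.h>
#include <rte_sched.h>

#ifdef RTE_SCHED_RED
/* Indexed as [traffic class][color]; color order is GREEN, YELLOW, RED. */
static const struct rte_red_params
red_params[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE][e_RTE_METER_COLORS] = {
	[0] = {
		{ .min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9 }, /* green  */
		{ .min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9 }, /* yellow */
		{ .min_th = 16, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9 }, /* red    */
	},
	/* TCs 1..3 would be filled in the same way. */
};
/* This array is assigned to rte_sched_port_params.red_params before
 * rte_sched_port_config() is called; rte_sched_port_enqueue() then uses the
 * packet color to pick the profile, so all color-based differentiation
 * happens at enqueue time, never at dequeue time. */
#endif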

Regards,
Nikhil


Thanks.
Nikhil

