[v7] sched: enable traffic class oversubscription conditionally
Commit Message
Added new flag to enable or disable TC oversubscription for best
effort traffic class at subport level.
By default TC OV is disabled.
Signed-off-by: Marcin Danilewicz <marcinx.danilewicz@intel.com>
History:
- v1 - TC OV disabled by default
- v2 - throughput improvements
- v3, v4, v5 - changes from comments
- v6 - removed rte_sched_subport_tc_ov_config declaration and map
- v7 - changes from comments on v6
---
lib/sched/rte_sched.c | 93 +++++++++++++++++++++++++++++++++++++++++--
1 file changed, 90 insertions(+), 3 deletions(-)
Comments
Hi Marcin,
Comments inline below.
> -----Original Message-----
> From: Danilewicz, MarcinX <marcinx.danilewicz@intel.com>
> Sent: Monday, May 30, 2022 12:55 PM
> To: dev@dpdk.org; Singh, Jasvinder <jasvinder.singh@intel.com>;
> Dumitrescu, Cristian <cristian.dumitrescu@intel.com>
> Cc: Ajmera, Megha <megha.ajmera@intel.com>
> Subject: [PATCH v7] sched: enable traffic class oversubscription conditionally
>
> Added new flag to enable or disable TC oversubscription for best
> effort traffic class at subport level.
>
> By default TC OV is disabled.
>
> Signed-off-by: Marcin Danilewicz <marcinx.danilewicz@intel.com>
>
> History:
> - v1 - TC OV disabled by default
> - v2 - throughput improvements
> - v3, v4, v5 - changes from comments
> - v6 - removed rte_sched_subport_tc_ov_config declaration and map
> - v7 - changes from comments on v6
I see you moved the history a bit below, but still this is not the proper place for it.
> ---
This is the place for the history log. Please note the "---" line above.
> lib/sched/rte_sched.c | 93
> +++++++++++++++++++++++++++++++++++++++++--
> 1 file changed, 90 insertions(+), 3 deletions(-)
>
Still only changes in rte_sched.c and no change in rte_sched.h for the API to configure this feature?
<snip>
Regards,
Cristian
Hi Cristian,
Please find inline answers:
> > History:
> > - v1 - TC OV disabled by default
> > - v2 - throughput improvements
> > - v3, v4, v5 - changes from comments
> > - v6 - removed rte_sched_subport_tc_ov_config declaration and map
> > - v7 - changes from comments on v6
>
> I see you moved the history a bit below, but still this is not the proper place
> for it.
>
> > ---
>
> This is the place for the history log. Please note the "---" line above.
I see.
>
> Still only changes in rte_sched.c and no change in rte_sched.h for the API to
> configure this feature?
Yes, because you said to remove the whole
rte_sched_subport_tc_ov_config(struct rte_sched_port *port,
	uint32_t subport_id,
	bool tc_ov_enable)
function, in your comment on v4:
> >
> > This function should not exist, please remove it and keep the initial code that
> > computes the tc_ov related variable regardless of whether tc_ov is enabled
> or
> > not.
And with the latest changes, TC OV is enabled by default. All other init for this feature is done in sched init, as per your other explanations. In turn, anyone can change this new flag, but apparently only in code, without a proper API for that?
Isn't that what you wanted?
BR,
/Marcin
PS: Meanwhile, I am pushing v8 with the "---" in the right place.
Hi Marcin,
<snip>
> >
> > Still only changes in rte_sched.c and no change in rte_sched.h for the API
> to
> > configure this feature?
>
> Yes, because you said to remove the whole
> rte_sched_subport_tc_ov_config(struct rte_sched_port *port,
> 	uint32_t subport_id,
> 	bool tc_ov_enable)
> function, in your comment on v4:
> > >
> > > This function should not exist, please remove it and keep the initial code
> that
> > > computes the tc_ov related variable regardless of whether tc_ov is
> enabled
> > or
> > > not.
>
> And with the latest changes, TC OV is enabled by default. All other init
> for this feature is done in sched init, as per your other explanations. In
> turn, anyone can change this new flag, but apparently only in code, without
> a proper API for that?
>
> Isn't that what you wanted?
>
Nope, it looks like we have a misunderstanding here. Looking back at my comments from V3: What I meant is that the configuration values related to this feature (all the tc_ov configuration values) should be computed at initialization regardless of whether this feature is enabled or not in order to minimize code changes and the size of the patch. In V3, you moved a lot of the init code into a different function, but it was my mistake not to realize this was the API function you introduced, sorry about the misunderstanding.
I think we definitely need an API function to simply set the internal subport tc_ov_enabled flag (while also doing the proper argument checks that any well behaved API function must do), but we should not move here the init code that does not really need to be here. Makes sense?
Regards,
Cristian
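
For illustration only, the kind of setter described above could look roughly like the sketch below. It is not the code from any version of this patch. The struct types and the sched_subport_tc_ov_config name are hypothetical stand-ins that borrow the parameter list quoted earlier in the thread, and the sketch assumes that all tc_ov values were already computed at subport configuration time, so the function only validates its arguments and sets the flag.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-ins for the structures in rte_sched.c. */
struct sched_subport {
	int tc_ov_enabled;	/* TC oversubscription activation */
};

struct sched_port {
	uint32_t n_subports_per_port;
	struct sched_subport **subports;
};

/*
 * Hypothetical setter: check the arguments, then only flip the flag.
 * No tc_ov initialization is done here; it is assumed to have happened
 * when the subport was configured.
 */
int
sched_subport_tc_ov_config(struct sched_port *port, uint32_t subport_id,
	bool tc_ov_enable)
{
	struct sched_subport *s;

	if (port == NULL)
		return -EINVAL;

	if (subport_id >= port->n_subports_per_port)
		return -EINVAL;

	s = port->subports[subport_id];
	if (s == NULL)
		return -EINVAL;	/* subport not configured yet */

	s->tc_ov_enabled = tc_ov_enable ? 1 : 0;
	return 0;
}
```

Keeping the setter this small means the init path stays where it was, which is the "minimize code changes" goal stated above.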
Hi Cristian,
> Nope, it looks like we have a misunderstanding here. Looking back at my
> comments from V3: What I meant is that the configuration values related to
> this feature (all the tc_ov configuration values) should be computed at
> initialization regardless of whether this feature is enabled or not in order to
> minimize code changes and the size of the patch. In V3, you moved a lot of
> the init code into a different function, but it was my mistake not to realize
> this was the API function you introduced, sorry about the misunderstanding.
That’s the way of life, no simple idea is simple 😊
> I think we definitely need an API function to simply set the internal subport
> tc_ov_enabled flag (while also doing the proper argument checks that any
> well behaved API function must do), but we should not move here the init
> code that does not really need to be here. Makes sense?
Agree. Will work out something asap.
Regards,
/Marcin
@@ -213,6 +213,9 @@ struct rte_sched_subport {
uint8_t *bmp_array;
struct rte_mbuf **queue_array;
uint8_t memory[0] __rte_cache_aligned;
+
+ /* TC oversubscription activation */
+ int tc_ov_enabled;
} __rte_cache_aligned;
struct rte_sched_port {
@@ -1254,6 +1257,9 @@ rte_sched_subport_config(struct rte_sched_port *port,
s->n_pipe_profiles = params->n_pipe_profiles;
s->n_max_pipe_profiles = params->n_max_pipe_profiles;
+ /* TC over-subscription is enabled by default */
+ s->tc_ov_enabled = 1;
+
#ifdef RTE_SCHED_CMAN
if (params->cman_params != NULL) {
s->cman_enabled = true;
@@ -2318,6 +2324,45 @@ grinder_credits_update(struct rte_sched_port *port,
pipe->tb_credits = RTE_MIN(pipe->tb_credits, params->tb_size);
pipe->tb_time += n_periods * params->tb_period;
+ /* Subport TCs */
+ if (unlikely(port->time >= subport->tc_time)) {
+ for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++)
+ subport->tc_credits[i] = sp->tc_credits_per_period[i];
+
+ subport->tc_time = port->time + sp->tc_period;
+ }
+
+ /* Pipe TCs */
+ if (unlikely(port->time >= pipe->tc_time)) {
+ for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++)
+ pipe->tc_credits[i] = params->tc_credits_per_period[i];
+ pipe->tc_time = port->time + params->tc_period;
+ }
+}
+
+static inline void
+grinder_credits_update_with_tc_ov(struct rte_sched_port *port,
+ struct rte_sched_subport *subport, uint32_t pos)
+{
+ struct rte_sched_grinder *grinder = subport->grinder + pos;
+ struct rte_sched_pipe *pipe = grinder->pipe;
+ struct rte_sched_pipe_profile *params = grinder->pipe_params;
+ struct rte_sched_subport_profile *sp = grinder->subport_params;
+ uint64_t n_periods;
+ uint32_t i;
+
+ /* Subport TB */
+ n_periods = (port->time - subport->tb_time) / sp->tb_period;
+ subport->tb_credits += n_periods * sp->tb_credits_per_period;
+ subport->tb_credits = RTE_MIN(subport->tb_credits, sp->tb_size);
+ subport->tb_time += n_periods * sp->tb_period;
+
+ /* Pipe TB */
+ n_periods = (port->time - pipe->tb_time) / params->tb_period;
+ pipe->tb_credits += n_periods * params->tb_credits_per_period;
+ pipe->tb_credits = RTE_MIN(pipe->tb_credits, params->tb_size);
+ pipe->tb_time += n_periods * params->tb_period;
+
/* Subport TCs */
if (unlikely(port->time >= subport->tc_time)) {
subport->tc_ov_wm =
@@ -2348,6 +2393,39 @@ grinder_credits_update(struct rte_sched_port *port,
static inline int
grinder_credits_check(struct rte_sched_port *port,
struct rte_sched_subport *subport, uint32_t pos)
+{
+ struct rte_sched_grinder *grinder = subport->grinder + pos;
+ struct rte_sched_pipe *pipe = grinder->pipe;
+ struct rte_mbuf *pkt = grinder->pkt;
+ uint32_t tc_index = grinder->tc_index;
+ uint64_t pkt_len = pkt->pkt_len + port->frame_overhead;
+ uint64_t subport_tb_credits = subport->tb_credits;
+ uint64_t subport_tc_credits = subport->tc_credits[tc_index];
+ uint64_t pipe_tb_credits = pipe->tb_credits;
+ uint64_t pipe_tc_credits = pipe->tc_credits[tc_index];
+ int enough_credits;
+
+ /* Check pipe and subport credits */
+ enough_credits = (pkt_len <= subport_tb_credits) &&
+ (pkt_len <= subport_tc_credits) &&
+ (pkt_len <= pipe_tb_credits) &&
+ (pkt_len <= pipe_tc_credits);
+
+ if (!enough_credits)
+ return 0;
+
+ /* Update pipe and subport credits */
+ subport->tb_credits -= pkt_len;
+ subport->tc_credits[tc_index] -= pkt_len;
+ pipe->tb_credits -= pkt_len;
+ pipe->tc_credits[tc_index] -= pkt_len;
+
+ return 1;
+}
+
+static inline int
+grinder_credits_check_with_tc_ov(struct rte_sched_port *port,
+ struct rte_sched_subport *subport, uint32_t pos)
{
struct rte_sched_grinder *grinder = subport->grinder + pos;
struct rte_sched_pipe *pipe = grinder->pipe;
@@ -2403,8 +2481,13 @@ grinder_schedule(struct rte_sched_port *port,
uint32_t pkt_len = pkt->pkt_len + port->frame_overhead;
uint32_t be_tc_active;
- if (!grinder_credits_check(port, subport, pos))
- return 0;
+ if (subport->tc_ov_enabled) {
+ if (!grinder_credits_check_with_tc_ov(port, subport, pos))
+ return 0;
+ } else {
+ if (!grinder_credits_check(port, subport, pos))
+ return 0;
+ }
/* Advance port time */
port->time += pkt_len;
@@ -2770,7 +2853,11 @@ grinder_handle(struct rte_sched_port *port,
subport->profile;
grinder_prefetch_tc_queue_arrays(subport, pos);
- grinder_credits_update(port, subport, pos);
+
+ if (subport->tc_ov_enabled)
+ grinder_credits_update_with_tc_ov(port, subport, pos);
+ else
+ grinder_credits_update(port, subport, pos);
grinder->state = e_GRINDER_PREFETCH_MBUF;
return 0;
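
As a reading aid, the credit check that the diff adds for the case where TC oversubscription is off can be modeled in isolation as below. This is a simplified sketch, not the DPDK code: the bucket_set type, the credits_check name and the N_TC value are assumptions made for the example. It shows only the rule visible in the hunk above: a packet is sent when the subport and pipe token buckets and the subport and pipe traffic-class counters all hold at least pkt_len credits, and all four are then debited.

```c
#include <stdint.h>

#define N_TC 4	/* assumption: toy number of traffic classes */

/* Hypothetical stand-in for the credit fields of a subport or a pipe. */
struct bucket_set {
	uint64_t tb_credits;		/* token bucket credits */
	uint64_t tc_credits[N_TC];	/* per traffic class credits */
};

/*
 * Return 1 and debit all four counters if the packet fits everywhere,
 * otherwise return 0 and leave every counter untouched.
 */
int
credits_check(struct bucket_set *subport, struct bucket_set *pipe,
	uint32_t tc_index, uint64_t pkt_len)
{
	int enough_credits;

	/* Check pipe and subport credits */
	enough_credits = (pkt_len <= subport->tb_credits) &&
		(pkt_len <= subport->tc_credits[tc_index]) &&
		(pkt_len <= pipe->tb_credits) &&
		(pkt_len <= pipe->tc_credits[tc_index]);

	if (!enough_credits)
		return 0;

	/* Update pipe and subport credits */
	subport->tb_credits -= pkt_len;
	subport->tc_credits[tc_index] -= pkt_len;
	pipe->tb_credits -= pkt_len;
	pipe->tc_credits[tc_index] -= pkt_len;

	return 1;
}
```

The diff does not show the body of the _with_tc_ov variant, so it is not modeled here.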