[dpdk-dev] eventdev: add producer enqueue hint

Message ID 20170612114627.18893-1-jerin.jacob@caviumnetworks.com (mailing list archive)
State Superseded, archived
Headers

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation success Compilation OK

Commit Message

Jerin Jacob June 12, 2017, 11:46 a.m. UTC
  Some PMD like OCTEONTX can have optimized handling of
events if the PMD knows it is a producer pattern in advance.
For instance, OCTEONTX PMD can support burst mode if
op == RTE_EVENT_OP_NEW.

Since the event producer initialize(set all_op_new == 1) the event
object before the main producer loop, This scheme does not call for
any performance regression on other PMDs.

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
---
Another option is to add a flag in enqueue API or have parallel enqueue API.
---
 drivers/event/octeontx/ssovf_worker.c | 12 ++++++++++--
 lib/librte_eventdev/rte_eventdev.h    | 10 +++++++++-
 2 files changed, 19 insertions(+), 3 deletions(-)
  

Comments

Eads, Gage June 26, 2017, 3:44 p.m. UTC | #1
> -----Original Message-----
> From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> Sent: Monday, June 12, 2017 6:46 AM
> To: dev@dpdk.org
> Cc: Richardson, Bruce <bruce.richardson@intel.com>; Van Haaren, Harry
> <harry.van.haaren@intel.com>; hemant.agrawal@nxp.com; Eads, Gage
> <gage.eads@intel.com>; nipun.gupta@nxp.com; Vangati, Narender
> <narender.vangati@intel.com>; Rao, Nikhil <nikhil.rao@intel.com>; Jerin Jacob
> <jerin.jacob@caviumnetworks.com>
> Subject: [dpdk-dev] [PATCH] eventdev: add producer enqueue hint
> 
> Some PMD like OCTEONTX can have optimized handling of events if the PMD
> knows it is a producer pattern in advance.
> For instance, OCTEONTX PMD can support burst mode if op ==
> RTE_EVENT_OP_NEW.
> 
> Since the event producer initialize(set all_op_new == 1) the event object before
> the main producer loop, This scheme does not call for any performance
> regression on other PMDs.
> 
> Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> ---
> Another option is to add a flag in enqueue API or have parallel enqueue API.
> ---
>  drivers/event/octeontx/ssovf_worker.c | 12 ++++++++++--
>  lib/librte_eventdev/rte_eventdev.h    | 10 +++++++++-
>  2 files changed, 19 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/event/octeontx/ssovf_worker.c
> b/drivers/event/octeontx/ssovf_worker.c
> index ad3fe684d..209c595cf 100644
> --- a/drivers/event/octeontx/ssovf_worker.c
> +++ b/drivers/event/octeontx/ssovf_worker.c
> @@ -196,8 +196,16 @@ ssows_enq(void *port, const struct rte_event *ev)
> uint16_t __hot  ssows_enq_burst(void *port, const struct rte_event ev[],
> uint16_t nb_events)  {
> -	RTE_SET_USED(nb_events);
> -	return ssows_enq(port, ev);
> +	uint16_t i;
> +	struct ssows *ws = port;
> +
> +	if (ev[0].all_op_new) {
> +		rte_smp_wmb();
> +		for (i = 0; i < nb_events; i++)
> +			ssows_new_event(ws, &ev[i]);
> +		return i;
> +	} else
> +		return ssows_enq(port, ev);
>  }
> 
>  void
> diff --git a/lib/librte_eventdev/rte_eventdev.h
> b/lib/librte_eventdev/rte_eventdev.h
> index a248fe90e..1c1a46593 100644
> --- a/lib/librte_eventdev/rte_eventdev.h
> +++ b/lib/librte_eventdev/rte_eventdev.h
> @@ -933,7 +933,15 @@ struct rte_event {
>  			 * and is undefined on dequeue.
>  			 * @see RTE_EVENT_OP_NEW, (RTE_EVENT_OP_*)
>  			 */
> -			uint8_t rsvd:4;
> +			uint8_t all_op_new:1;
> +			/**< Valid only with event enqueue operation - This hint
> +			 * indicates that the enqueue request has only the
> +			 * events with op == RTE_EVENT_OP_NEW.
> +			 * The event producer, typically use this pattern to
> +			 * inject the events to eventdev.
> +			 * @see RTE_EVENT_OP_NEW
> rte_event_enqueue_burst()
> +			 */
> +			uint8_t rsvd:3;
>  			/**< Reserved for future use */
>  			uint8_t sched_type:2;
>  			/**< Scheduler synchronization type
> (RTE_SCHED_TYPE_*)
> --
> 2.13.1

I slightly prefer the parallel enqueue API -- I can see folks making the mistake of setting all_op_new without setting the op to RTE_EVENT_OP_NEW, and later adding a "forward-only" enqueue API could be interesting for the sw PMD -- but this looks fine to me. Curious if others have any thoughts.
  
Jerin Jacob June 27, 2017, 8:08 a.m. UTC | #2
-----Original Message-----
> Date: Mon, 26 Jun 2017 15:44:10 +0000
> From: "Eads, Gage" <gage.eads@intel.com>
> To: Jerin Jacob <jerin.jacob@caviumnetworks.com>, "dev@dpdk.org"
>  <dev@dpdk.org>
> CC: "Richardson, Bruce" <bruce.richardson@intel.com>, "Van Haaren, Harry"
>  <harry.van.haaren@intel.com>, "hemant.agrawal@nxp.com"
>  <hemant.agrawal@nxp.com>, "nipun.gupta@nxp.com" <nipun.gupta@nxp.com>,
>  "Vangati, Narender" <narender.vangati@intel.com>, "Rao, Nikhil"
>  <nikhil.rao@intel.com>
> Subject: RE: [dpdk-dev] [PATCH] eventdev: add producer enqueue hint
> 
> 
> 
> > -----Original Message-----
> > From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> > Sent: Monday, June 12, 2017 6:46 AM
> > To: dev@dpdk.org
> > Cc: Richardson, Bruce <bruce.richardson@intel.com>; Van Haaren, Harry
> > <harry.van.haaren@intel.com>; hemant.agrawal@nxp.com; Eads, Gage
> > <gage.eads@intel.com>; nipun.gupta@nxp.com; Vangati, Narender
> > <narender.vangati@intel.com>; Rao, Nikhil <nikhil.rao@intel.com>; Jerin Jacob
> > <jerin.jacob@caviumnetworks.com>
> > Subject: [dpdk-dev] [PATCH] eventdev: add producer enqueue hint
> > 
> > Some PMD like OCTEONTX can have optimized handling of events if the PMD
> > knows it is a producer pattern in advance.
> > For instance, OCTEONTX PMD can support burst mode if op ==
> > RTE_EVENT_OP_NEW.
> > 
> > Since the event producer initialize(set all_op_new == 1) the event object before
> > the main producer loop, This scheme does not call for any performance
> > regression on other PMDs.
> > 
> > Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> > ---
> > Another option is to add a flag in enqueue API or have parallel enqueue API.
> > ---
> >  drivers/event/octeontx/ssovf_worker.c | 12 ++++++++++--
> >  lib/librte_eventdev/rte_eventdev.h    | 10 +++++++++-
> >  2 files changed, 19 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/event/octeontx/ssovf_worker.c
> > b/drivers/event/octeontx/ssovf_worker.c
> > index ad3fe684d..209c595cf 100644
> > --- a/drivers/event/octeontx/ssovf_worker.c
> > +++ b/drivers/event/octeontx/ssovf_worker.c
> > @@ -196,8 +196,16 @@ ssows_enq(void *port, const struct rte_event *ev)
> > uint16_t __hot  ssows_enq_burst(void *port, const struct rte_event ev[],
> > uint16_t nb_events)  {
> > -	RTE_SET_USED(nb_events);
> > -	return ssows_enq(port, ev);
> > +	uint16_t i;
> > +	struct ssows *ws = port;
> > +
> > +	if (ev[0].all_op_new) {
> > +		rte_smp_wmb();
> > +		for (i = 0; i < nb_events; i++)
> > +			ssows_new_event(ws, &ev[i]);
> > +		return i;
> > +	} else
> > +		return ssows_enq(port, ev);
> >  }
> > 
> >  void
> > diff --git a/lib/librte_eventdev/rte_eventdev.h
> > b/lib/librte_eventdev/rte_eventdev.h
> > index a248fe90e..1c1a46593 100644
> > --- a/lib/librte_eventdev/rte_eventdev.h
> > +++ b/lib/librte_eventdev/rte_eventdev.h
> > @@ -933,7 +933,15 @@ struct rte_event {
> >  			 * and is undefined on dequeue.
> >  			 * @see RTE_EVENT_OP_NEW, (RTE_EVENT_OP_*)
> >  			 */
> > -			uint8_t rsvd:4;
> > +			uint8_t all_op_new:1;
> > +			/**< Valid only with event enqueue operation - This hint
> > +			 * indicates that the enqueue request has only the
> > +			 * events with op == RTE_EVENT_OP_NEW.
> > +			 * The event producer, typically use this pattern to
> > +			 * inject the events to eventdev.
> > +			 * @see RTE_EVENT_OP_NEW
> > rte_event_enqueue_burst()
> > +			 */
> > +			uint8_t rsvd:3;
> >  			/**< Reserved for future use */
> >  			uint8_t sched_type:2;
> >  			/**< Scheduler synchronization type
> > (RTE_SCHED_TYPE_*)
> > --
> > 2.13.1
> 
> I slightly prefer the parallel enqueue API -- I can see folks making the mistake of setting all_op_new without setting the op to RTE_EVENT_OP_NEW, and later adding a "forward-only" enqueue API could be interesting for the sw PMD -- but this looks fine to me. Curious if others have any thoughts.

If forward-only parallel enqueue API interesting for the SW PMD then I
can drop this one and introduce forward-only API. Let me know if others
have any thoughts?
  
Van Haaren, Harry June 27, 2017, 8:44 a.m. UTC | #3
> From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> Sent: Tuesday, June 27, 2017 9:08 AM
> To: Eads, Gage <gage.eads@intel.com>
> Cc: dev@dpdk.org; Richardson, Bruce <bruce.richardson@intel.com>; Van Haaren, Harry
> <harry.van.haaren@intel.com>; hemant.agrawal@nxp.com; nipun.gupta@nxp.com; Vangati,
> Narender <narender.vangati@intel.com>; Rao, Nikhil <nikhil.rao@intel.com>
> Subject: Re: [dpdk-dev] [PATCH] eventdev: add producer enqueue hint

<snip some patch code>

> > >  void
> > > diff --git a/lib/librte_eventdev/rte_eventdev.h
> > > b/lib/librte_eventdev/rte_eventdev.h
> > > index a248fe90e..1c1a46593 100644
> > > --- a/lib/librte_eventdev/rte_eventdev.h
> > > +++ b/lib/librte_eventdev/rte_eventdev.h
> > > @@ -933,7 +933,15 @@ struct rte_event {
> > >  			 * and is undefined on dequeue.
> > >  			 * @see RTE_EVENT_OP_NEW, (RTE_EVENT_OP_*)
> > >  			 */
> > > -			uint8_t rsvd:4;
> > > +			uint8_t all_op_new:1;
> > > +			/**< Valid only with event enqueue operation - This hint
> > > +			 * indicates that the enqueue request has only the
> > > +			 * events with op == RTE_EVENT_OP_NEW.
> > > +			 * The event producer, typically use this pattern to
> > > +			 * inject the events to eventdev.
> > > +			 * @see RTE_EVENT_OP_NEW
> > > rte_event_enqueue_burst()
> > > +			 */
> > > +			uint8_t rsvd:3;
> > >  			/**< Reserved for future use */
> > >  			uint8_t sched_type:2;
> > >  			/**< Scheduler synchronization type
> > > (RTE_SCHED_TYPE_*)
> > > --
> > > 2.13.1
> >
> > I slightly prefer the parallel enqueue API -- I can see folks making the mistake of
> setting all_op_new without setting the op to RTE_EVENT_OP_NEW, and later adding a
> "forward-only" enqueue API could be interesting for the sw PMD -- but this looks fine to
> me. Curious if others have any thoughts.
> 
> If forward-only parallel enqueue API interesting for the SW PMD then I
> can drop this one and introduce forward-only API. Let me know if others
> have any thoughts?


To make sure I understand correctly, the "parallel API" idea is to add a new function pointer per-PMD, and dedicate it to enqueueing a burst of packets with the same OP? So the end result would be function(s) in the public API like this:

rte_event_enqueue_burst_new(port, new_events, n_events);
rte_event_enqueue_burst_forward(port, new_events, n_events);

Given these are a "specialization" of the generic enqueue_burst() function, the PMD is not obliged to implement them. If they are NULL, the eventdev.c infrastructure can just point the burst_new() and burst_forward() to the generic enqueue without any performance delta?

The cost is some added code in the public header and infrastructure.
The gain is that we don't overload the current API with new behavior. 


Assuming my description of the parallel proposal above is correct, +1 for the parallel function approach. I like APIs that "do what they say on the tin" :)
  
Jerin Jacob June 27, 2017, 10:17 a.m. UTC | #4
-----Original Message-----
> Date: Tue, 27 Jun 2017 08:44:34 +0000
> From: "Van Haaren, Harry" <harry.van.haaren@intel.com>
> To: Jerin Jacob <jerin.jacob@caviumnetworks.com>, "Eads, Gage"
>  <gage.eads@intel.com>
> CC: "dev@dpdk.org" <dev@dpdk.org>, "Richardson, Bruce"
>  <bruce.richardson@intel.com>, "hemant.agrawal@nxp.com"
>  <hemant.agrawal@nxp.com>, "nipun.gupta@nxp.com" <nipun.gupta@nxp.com>,
>  "Vangati, Narender" <narender.vangati@intel.com>, "Rao, Nikhil"
>  <nikhil.rao@intel.com>
> Subject: RE: [dpdk-dev] [PATCH] eventdev: add producer enqueue hint
> 
> > From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> > Sent: Tuesday, June 27, 2017 9:08 AM
> > To: Eads, Gage <gage.eads@intel.com>
> > Cc: dev@dpdk.org; Richardson, Bruce <bruce.richardson@intel.com>; Van Haaren, Harry
> > <harry.van.haaren@intel.com>; hemant.agrawal@nxp.com; nipun.gupta@nxp.com; Vangati,
> > Narender <narender.vangati@intel.com>; Rao, Nikhil <nikhil.rao@intel.com>
> > Subject: Re: [dpdk-dev] [PATCH] eventdev: add producer enqueue hint
> 
> <snip some patch code>
> 
> > > >  void
> > > > diff --git a/lib/librte_eventdev/rte_eventdev.h
> > > > b/lib/librte_eventdev/rte_eventdev.h
> > > > index a248fe90e..1c1a46593 100644
> > > > --- a/lib/librte_eventdev/rte_eventdev.h
> > > > +++ b/lib/librte_eventdev/rte_eventdev.h
> > > > @@ -933,7 +933,15 @@ struct rte_event {
> > > >  			 * and is undefined on dequeue.
> > > >  			 * @see RTE_EVENT_OP_NEW, (RTE_EVENT_OP_*)
> > > >  			 */
> > > > -			uint8_t rsvd:4;
> > > > +			uint8_t all_op_new:1;
> > > > +			/**< Valid only with event enqueue operation - This hint
> > > > +			 * indicates that the enqueue request has only the
> > > > +			 * events with op == RTE_EVENT_OP_NEW.
> > > > +			 * The event producer, typically use this pattern to
> > > > +			 * inject the events to eventdev.
> > > > +			 * @see RTE_EVENT_OP_NEW
> > > > rte_event_enqueue_burst()
> > > > +			 */
> > > > +			uint8_t rsvd:3;
> > > >  			/**< Reserved for future use */
> > > >  			uint8_t sched_type:2;
> > > >  			/**< Scheduler synchronization type
> > > > (RTE_SCHED_TYPE_*)
> > > > --
> > > > 2.13.1
> > >
> > > I slightly prefer the parallel enqueue API -- I can see folks making the mistake of
> > setting all_op_new without setting the op to RTE_EVENT_OP_NEW, and later adding a
> > "forward-only" enqueue API could be interesting for the sw PMD -- but this looks fine to
> > me. Curious if others have any thoughts.
> > 
> > If forward-only parallel enqueue API interesting for the SW PMD then I
> > can drop this one and introduce forward-only API. Let me know if others
> > have any thoughts?
> 
> 
> To make sure I understand correctly, the "parallel API" idea is to add a new function pointer per-PMD, and dedicate it to enqueueing a burst of packets with the same OP? So the end result would be function(s) in the public API like this:
> 
> rte_event_enqueue_burst_new(port, new_events, n_events);
> rte_event_enqueue_burst_forward(port, new_events, n_events);
> 
> Given these are a "specialization" of the generic enqueue_burst() function, the PMD is not obliged to implement them. If they are NULL, the eventdev.c infrastructure can just point the burst_new() and burst_forward() to the generic enqueue without any performance delta?
> 
> The cost is some added code in the public header and infrastructure.
> The gain is that we don't overload the current API with new behavior. 
> 
> 
> Assuming my description of the parallel proposal above is correct, +1 for the parallel function approach. I like APIs that "do what they say on the tin" :)

Yes. We are on the same page. I will send the v2.
  

Patch

diff --git a/drivers/event/octeontx/ssovf_worker.c b/drivers/event/octeontx/ssovf_worker.c
index ad3fe684d..209c595cf 100644
--- a/drivers/event/octeontx/ssovf_worker.c
+++ b/drivers/event/octeontx/ssovf_worker.c
@@ -196,8 +196,16 @@  ssows_enq(void *port, const struct rte_event *ev)
 uint16_t __hot
 ssows_enq_burst(void *port, const struct rte_event ev[], uint16_t nb_events)
 {
-	RTE_SET_USED(nb_events);
-	return ssows_enq(port, ev);
+	uint16_t i;
+	struct ssows *ws = port;
+
+	if (ev[0].all_op_new) {
+		rte_smp_wmb();
+		for (i = 0; i < nb_events; i++)
+			ssows_new_event(ws, &ev[i]);
+		return i;
+	} else
+		return ssows_enq(port, ev);
 }
 
 void
diff --git a/lib/librte_eventdev/rte_eventdev.h b/lib/librte_eventdev/rte_eventdev.h
index a248fe90e..1c1a46593 100644
--- a/lib/librte_eventdev/rte_eventdev.h
+++ b/lib/librte_eventdev/rte_eventdev.h
@@ -933,7 +933,15 @@  struct rte_event {
 			 * and is undefined on dequeue.
 			 * @see RTE_EVENT_OP_NEW, (RTE_EVENT_OP_*)
 			 */
-			uint8_t rsvd:4;
+			uint8_t all_op_new:1;
+			/**< Valid only with event enqueue operation - This hint
+			 * indicates that the enqueue request has only the
+			 * events with op == RTE_EVENT_OP_NEW.
+			 * The event producer, typically use this pattern to
+			 * inject the events to eventdev.
+			 * @see RTE_EVENT_OP_NEW rte_event_enqueue_burst()
+			 */
+			uint8_t rsvd:3;
 			/**< Reserved for future use */
 			uint8_t sched_type:2;
 			/**< Scheduler synchronization type (RTE_SCHED_TYPE_*)