[dpdk-dev,1/2] event/sw: code refractor to reduce the fetch stall

Message ID: 1519932900-10571-1-git-send-email-vipin.varghese@intel.com (mailing list archive)
State: Superseded, archived
Delegated to: Jerin Jacob
Checks

Context               Check    Description
ci/checkpatch         success  coding style OK
ci/Intel-compilation  fail     Compilation issues

Commit Message

Varghese, Vipin March 1, 2018, 7:34 p.m. UTC
Rearranging the code to prefetch the contents before the
loop check increases performance for single and multistage
atomic pipelines.

Signed-off-by: Vipin Varghese <vipin.varghese@intel.com>
---
 drivers/event/sw/sw_evdev_scheduler.c | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)
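
For readers unfamiliar with the pattern, the restructuring described in the commit message can be sketched outside the driver. This is only an illustration under stated assumptions: `demo_event`, `demo_fid`, `DEMO_NUM_FLOWS` and both functions are hypothetical stand-ins, not the real `rte_event`, `sw_fid_t` or scheduler code. The first function loads per-event state at the top of every iteration; the second loads it for event 0 before the loop and for event i + 1 at the end of pass i, which is the shape the patch moves to.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_NUM_FLOWS 4                   /* hypothetical table size */

struct demo_event { uint32_t flow_id; };   /* hypothetical, not rte_event */
struct demo_fid { int cq; };               /* hypothetical, not sw_fid_t */

/* Original shape: the flow entry is looked up at the top of each pass. */
static int
demo_sum_cq_load_in_loop(const struct demo_event *evs, size_t count,
		const struct demo_fid *fids)
{
	int sum = 0;
	size_t i;

	for (i = 0; i < count; i++) {
		const struct demo_fid *fid =
			&fids[evs[i].flow_id % DEMO_NUM_FLOWS];
		sum += fid->cq;
	}
	return sum;
}

/* Patched shape: look up event 0 first, then event i + 1 at the end of pass i. */
static int
demo_sum_cq_load_ahead(const struct demo_event *evs, size_t count,
		const struct demo_fid *fids)
{
	const struct demo_fid *fid;
	int sum = 0;
	int cq;
	size_t i;

	if (count == 0)   /* guard added for this sketch only */
		return 0;

	fid = &fids[evs[0].flow_id % DEMO_NUM_FLOWS];
	cq = fid->cq;

	for (i = 0; i < count; i++) {
		sum += cq;

		/* Start the next event's loads before the loop condition is checked. */
		if (i + 1 < count) {
			fid = &fids[evs[i + 1].flow_id % DEMO_NUM_FLOWS];
			cq = fid->cq;
		}
	}
	return sum;
}
```

Both functions return the same result; whether the second shape is actually faster depends on the compiler and the hardware, so it should be measured rather than assumed.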
  

Comments

Jerin Jacob April 2, 2018, 8:06 a.m. UTC | #1
-----Original Message-----
> Date: Fri, 2 Mar 2018 01:04:59 +0530
> From: Vipin Varghese <vipin.varghese@intel.com>
> To: dev@dpdk.org, harry.van.haaren@intel.com
> CC: Vipin Varghese <vipin.varghese@intel.com>
> Subject: [dpdk-dev] [PATCH 1/2] event/sw: code refractor to reduce the
>  fetch stall
> X-Mailer: git-send-email 2.7.4
> 
> With rearranging the code to prefetch the contents before
> loop check increases performance from single and multistage
> atomic pipeline.
> 
> Signed-off-by: Vipin Varghese <vipin.varghese@intel.com>

Harry,

Can you review these patches (1/1 and 1/2) so that I can include them in the
RC1 pull request?
  
Van Haaren, Harry April 3, 2018, 12:47 p.m. UTC | #2
Hey,

> -----Original Message-----
> From: Varghese, Vipin
> Sent: Thursday, March 1, 2018 7:35 PM
> To: dev@dpdk.org; Van Haaren, Harry <harry.van.haaren@intel.com>
> Cc: Varghese, Vipin <vipin.varghese@intel.com>
> Subject: [PATCH 1/2] event/sw: code refractor to reduce the fetch stall
> 
> With rearranging the code to prefetch the contents before
> loop check increases performance from single and multistage
> atomic pipeline.
> 
> Signed-off-by: Vipin Varghese <vipin.varghese@intel.com>

There seems to be a compilation issue with this, see "const" in flow_id.
The flow_id variable is updated later, so it can't be marked const.
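
A minimal sketch of the issue, assuming a hypothetical `DEMO_HASH_FLOWID` macro in place of the driver's `SW_HASH_FLOWID` and a made-up helper function: once the variable is reassigned at the end of each pass, it has to be declared without `const`, otherwise the assignment is rejected at compile time.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for SW_HASH_FLOWID; the real macro lives in the driver. */
#define DEMO_HASH_FLOWID(f) ((uint16_t)((f) & 0x3ff))

/* Sums the hashed flow ids, reloading flow_id at the end of each pass. */
static uint32_t
demo_sum_flow_ids(const uint32_t *flows, size_t count)
{
	uint32_t sum = 0;
	uint16_t flow_id;   /* deliberately not const: it is reassigned below */
	size_t i;

	if (count == 0)
		return 0;

	flow_id = DEMO_HASH_FLOWID(flows[0]);
	for (i = 0; i < count; i++) {
		sum += flow_id;
		if (i + 1 < count)
			/* With "const uint16_t flow_id" this line fails to compile. */
			flow_id = DEMO_HASH_FLOWID(flows[i + 1]);
	}
	return sum;
}
```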

After the compilation fix, I see a small performance improvement here,
so you can include my Ack for V2 of this patch:

Acked-by: Harry van Haaren <harry.van.haaren@intel.com>


> ---
>  drivers/event/sw/sw_evdev_scheduler.c | 18 +++++++++++++-----
>  1 file changed, 13 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/event/sw/sw_evdev_scheduler.c
> b/drivers/event/sw/sw_evdev_scheduler.c
> index e3a41e0..70d1970 100644
> --- a/drivers/event/sw/sw_evdev_scheduler.c
> +++ b/drivers/event/sw/sw_evdev_scheduler.c
> @@ -44,12 +44,13 @@ sw_schedule_atomic_to_cq(struct sw_evdev *sw, struct
> sw_qid * const qid,
>  	uint32_t qid_id = qid->id;
> 
>  	iq_dequeue_burst(sw, &qid->iq[iq_num], qes, count);
> -	for (i = 0; i < count; i++) {
> -		const struct rte_event *qe = &qes[i];
> -		const uint16_t flow_id = SW_HASH_FLOWID(qes[i].flow_id);
> -		struct sw_fid_t *fid = &qid->fids[flow_id];
> -		int cq = fid->cq;
> 
> +	const struct rte_event *qe = &qes[0];
> +	const uint16_t flow_id = SW_HASH_FLOWID(qes[0].flow_id);

     ^^^^^^ remove the const here.
  

Patch

diff --git a/drivers/event/sw/sw_evdev_scheduler.c b/drivers/event/sw/sw_evdev_scheduler.c
index e3a41e0..70d1970 100644
--- a/drivers/event/sw/sw_evdev_scheduler.c
+++ b/drivers/event/sw/sw_evdev_scheduler.c
@@ -44,12 +44,13 @@  sw_schedule_atomic_to_cq(struct sw_evdev *sw, struct sw_qid * const qid,
 	uint32_t qid_id = qid->id;
 
 	iq_dequeue_burst(sw, &qid->iq[iq_num], qes, count);
-	for (i = 0; i < count; i++) {
-		const struct rte_event *qe = &qes[i];
-		const uint16_t flow_id = SW_HASH_FLOWID(qes[i].flow_id);
-		struct sw_fid_t *fid = &qid->fids[flow_id];
-		int cq = fid->cq;
 
+	const struct rte_event *qe = &qes[0];
+	const uint16_t flow_id = SW_HASH_FLOWID(qes[0].flow_id);
+	struct sw_fid_t *fid = &qid->fids[flow_id];
+	int cq = fid->cq;
+
+	for (i = 0; i < count; i++) {
 		if (cq < 0) {
 			uint32_t cq_idx = qid->cq_next_tx++;
 			if (qid->cq_next_tx == qid->cq_num_mapped_cqs)
@@ -101,6 +102,13 @@  sw_schedule_atomic_to_cq(struct sw_evdev *sw, struct sw_qid * const qid,
 					&sw->cq_ring_space[cq]);
 			p->cq_buf_count = 0;
 		}
+
+		if (likely(i+1 < count)) {
+			qe = (qes + i + 1);
+			flow_id = SW_HASH_FLOWID(qes[i + 1].flow_id);
+			fid = &qid->fids[flow_id];
+			cq = fid->cq;
+		}
 	}
 	iq_put_back(sw, &qid->iq[iq_num], blocked_qes, nb_blocked);