[dpdk-dev] kni: optimize the kni release speed

Message ID bdf9aa61-2363-4b42-942b-ee392c2df7a6@zyc-PC.local (mailing list archive)
State Superseded, archived
Delegated to: Ferruh Yigit
Headers

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation success Compilation OK

Commit Message

Matt Feb. 6, 2018, 10:33 a.m. UTC
  Physical addresses in the fifo named alloc_q need to be traversed to
release in user space. The physical address to the virtual address
conversion in kernel space is much better.

Signed-off-by: Yangchao Zhou <zhouyates@gmail.com>
---
 lib/librte_eal/linuxapp/kni/kni_dev.h  |  1 +
 lib/librte_eal/linuxapp/kni/kni_misc.c |  1 +
 lib/librte_eal/linuxapp/kni/kni_net.c  | 15 +++++++++++++++
 lib/librte_kni/rte_kni.c               | 26 +-------------------------
 4 files changed, 18 insertions(+), 25 deletions(-)
  

Comments

Ferruh Yigit March 26, 2018, 4:44 p.m. UTC | #1
On 2/6/2018 10:33 AM, zhouyangchao wrote:
> Physical addresses in the fifo named alloc_q need to be traversed to
> release in user space. The physical address to the virtual address
> conversion in kernel space is much better.

Yes current approach should be slower but this is not in data path, this is when
a kni interface released, I expect no recognizable difference.

> 
> Signed-off-by: Yangchao Zhou <zhouyates@gmail.com>
> ---
>  lib/librte_eal/linuxapp/kni/kni_dev.h  |  1 +
>  lib/librte_eal/linuxapp/kni/kni_misc.c |  1 +
>  lib/librte_eal/linuxapp/kni/kni_net.c  | 15 +++++++++++++++
>  lib/librte_kni/rte_kni.c               | 26 +-------------------------
>  4 files changed, 18 insertions(+), 25 deletions(-)
> 
> diff --git a/lib/librte_eal/linuxapp/kni/kni_dev.h b/lib/librte_eal/linuxapp/kni/kni_dev.h
> index c9393d8..7cd9bf8 100644
> --- a/lib/librte_eal/linuxapp/kni/kni_dev.h
> +++ b/lib/librte_eal/linuxapp/kni/kni_dev.h
> @@ -92,6 +92,7 @@ struct kni_dev {
>  	void *alloc_va[MBUF_BURST_SZ];
>  };
>  
> +void kni_net_fifo_pa2va(struct kni_dev *kni);
>  void kni_net_rx(struct kni_dev *kni);
>  void kni_net_init(struct net_device *dev);
>  void kni_net_config_lo_mode(char *lo_str);
> diff --git a/lib/librte_eal/linuxapp/kni/kni_misc.c b/lib/librte_eal/linuxapp/kni/kni_misc.c
> index 01574ec..668488b 100644
> --- a/lib/librte_eal/linuxapp/kni/kni_misc.c
> +++ b/lib/librte_eal/linuxapp/kni/kni_misc.c
> @@ -507,6 +507,7 @@ kni_ioctl_release(struct net *net, uint32_t ioctl_num,
>  			dev->pthread = NULL;
>  		}
>  
> +		kni_net_fifo_pa2va(dev);
>  		kni_dev_remove(dev);
>  		list_del(&dev->list);
>  		ret = 0;
> diff --git a/lib/librte_eal/linuxapp/kni/kni_net.c b/lib/librte_eal/linuxapp/kni/kni_net.c
> index 9f9b798..662a527 100644
> --- a/lib/librte_eal/linuxapp/kni/kni_net.c
> +++ b/lib/librte_eal/linuxapp/kni/kni_net.c
> @@ -73,6 +73,21 @@ va2pa(void *va, struct rte_kni_mbuf *m)
>  	return pa;
>  }
>  
> +/* convert physical addresses to virtual addresses in fifo for kni release */
> +void
> +kni_net_fifo_pa2va(struct kni_dev *kni)
> +{
> +	void *fifo = kni->alloc_q;
> +	int i, count = kni_fifo_count(fifo);
> +	void *pa = NULL, *kva, *va;
> +	for (i = 0; i < count; ++i) {
> +		(void)kni_fifo_get(fifo, &pa, 1);
> +		kva = pa2kva(pa);
> +		va = pa2va(pa, kva);
> +		(void)kni_fifo_put(fifo, &va, 1);

kni fifo are single producer, single consumer. For alloc_q kernel side is
consumer, I aware at this stage applications should stop the traffic, but still
I am not comfortable mixing producer/consumer roles here.

Also alloc_q should have physical addresses this logic stores virtual addresses
in it and not sure about this either to mix addressing logic in the queue.

Instead of this conversion, what about moving from alloc_q to free_q? free_q
already has virtual addresses and freed by userspace, so this will be safe.
I suggest keeping alloc_q free logic in the userspace in any case, if alloc_q is
free it won't cost anyway.



And while checking for this I may found something else. We have same problem
with rx_q, it has physical addresses which makes hard to free in userspace. The
existing intention is to give some time to kernel to consume the rx_q so that it
won't be an issue for userspace. But that logic can be wrong.
During the time userspace waits the netdev may be already destroyed and there is
nothing to receive the packet, perhaps we should move wait above the ioctl.
Since you are already checking these parts perhaps you would like to comment :)

> +	}
> +}
> +
>  /*
>   * It can be called to process the request.
>   */
> diff --git a/lib/librte_kni/rte_kni.c b/lib/librte_kni/rte_kni.c
> index 2867411..f8398a9 100644
> --- a/lib/librte_kni/rte_kni.c
> +++ b/lib/librte_kni/rte_kni.c
> @@ -435,30 +435,6 @@ va2pa(struct rte_mbuf *m)
>  			 (unsigned long)m->buf_iova));
>  }
>  
> -static void
> -obj_free(struct rte_mempool *mp __rte_unused, void *opaque, void *obj,
> -		unsigned obj_idx __rte_unused)
> -{
> -	struct rte_mbuf *m = obj;
> -	void *mbuf_phys = opaque;
> -
> -	if (va2pa(m) == mbuf_phys)
> -		rte_pktmbuf_free(m);
> -}
> -
> -static void
> -kni_free_fifo_phy(struct rte_mempool *mp, struct rte_kni_fifo *fifo)
> -{
> -	void *mbuf_phys;
> -	int ret;
> -
> -	do {
> -		ret = kni_fifo_get(fifo, &mbuf_phys, 1);
> -		if (ret)
> -			rte_mempool_obj_iter(mp, obj_free, mbuf_phys);
> -	} while (ret);
> -}
> -
>  int
>  rte_kni_release(struct rte_kni *kni)
>  {
> @@ -484,7 +460,7 @@ rte_kni_release(struct rte_kni *kni)
>  	if (kni_fifo_count(kni->rx_q))
>  		RTE_LOG(ERR, KNI, "Fail to free all Rx-q items\n");
>  
> -	kni_free_fifo_phy(kni->pktmbuf_pool, kni->alloc_q);
> +	kni_free_fifo(kni->alloc_q);
>  	kni_free_fifo(kni->tx_q);
>  	kni_free_fifo(kni->free_q);
>  
>
  
Matt April 3, 2018, 1:30 p.m. UTC | #2
On Tue, Mar 27, 2018 at 12:44 AM Ferruh Yigit <ferruh.yigit@intel.com>
wrote:

> On 2/6/2018 10:33 AM, zhouyangchao wrote:
> > Physical addresses in the fifo named alloc_q need to be traversed to
> > release in user space. The physical address to the virtual address
> > conversion in kernel space is much better.
>
> Yes current approach should be slower but this is not in data path, this
> is when
> a kni interface released, I expect no recognizable difference.
>
>
I changed the number of pre-allocation mbufs to 512 to improve performance.
It cost nearly 3 seconds to release mbufs with a mbuf pool that has more
than 131072 mbufs. I need to lock the forwarding threads for some reason.


> >
> > Signed-off-by: Yangchao Zhou <zhouyates@gmail.com>
> > ---
> >  lib/librte_eal/linuxapp/kni/kni_dev.h  |  1 +
> >  lib/librte_eal/linuxapp/kni/kni_misc.c |  1 +
> >  lib/librte_eal/linuxapp/kni/kni_net.c  | 15 +++++++++++++++
> >  lib/librte_kni/rte_kni.c               | 26 +-------------------------
> >  4 files changed, 18 insertions(+), 25 deletions(-)
> >
> > diff --git a/lib/librte_eal/linuxapp/kni/kni_dev.h
> b/lib/librte_eal/linuxapp/kni/kni_dev.h
> > index c9393d8..7cd9bf8 100644
> > --- a/lib/librte_eal/linuxapp/kni/kni_dev.h
> > +++ b/lib/librte_eal/linuxapp/kni/kni_dev.h
> > @@ -92,6 +92,7 @@ struct kni_dev {
> >       void *alloc_va[MBUF_BURST_SZ];
> >  };
> >
> > +void kni_net_fifo_pa2va(struct kni_dev *kni);
> >  void kni_net_rx(struct kni_dev *kni);
> >  void kni_net_init(struct net_device *dev);
> >  void kni_net_config_lo_mode(char *lo_str);
> > diff --git a/lib/librte_eal/linuxapp/kni/kni_misc.c
> b/lib/librte_eal/linuxapp/kni/kni_misc.c
> > index 01574ec..668488b 100644
> > --- a/lib/librte_eal/linuxapp/kni/kni_misc.c
> > +++ b/lib/librte_eal/linuxapp/kni/kni_misc.c
> > @@ -507,6 +507,7 @@ kni_ioctl_release(struct net *net, uint32_t
> ioctl_num,
> >                       dev->pthread = NULL;
> >               }
> >
> > +             kni_net_fifo_pa2va(dev);
> >               kni_dev_remove(dev);
> >               list_del(&dev->list);
> >               ret = 0;
> > diff --git a/lib/librte_eal/linuxapp/kni/kni_net.c
> b/lib/librte_eal/linuxapp/kni/kni_net.c
> > index 9f9b798..662a527 100644
> > --- a/lib/librte_eal/linuxapp/kni/kni_net.c
> > +++ b/lib/librte_eal/linuxapp/kni/kni_net.c
> > @@ -73,6 +73,21 @@ va2pa(void *va, struct rte_kni_mbuf *m)
> >       return pa;
> >  }
> >
> > +/* convert physical addresses to virtual addresses in fifo for kni
> release */
> > +void
> > +kni_net_fifo_pa2va(struct kni_dev *kni)
> > +{
> > +     void *fifo = kni->alloc_q;
> > +     int i, count = kni_fifo_count(fifo);
> > +     void *pa = NULL, *kva, *va;
> > +     for (i = 0; i < count; ++i) {
> > +             (void)kni_fifo_get(fifo, &pa, 1);
> > +             kva = pa2kva(pa);
> > +             va = pa2va(pa, kva);
> > +             (void)kni_fifo_put(fifo, &va, 1);
>
> kni fifo are single producer, single consumer. For alloc_q kernel side is
> consumer, I aware at this stage applications should stop the traffic, but
> still
> I am not comfortable mixing producer/consumer roles here.
>
> Also alloc_q should have physical addresses this logic stores virtual
> addresses
> in it and not sure about this either to mix addressing logic in the queue.
>
> Instead of this conversion, what about moving from alloc_q to free_q?
> free_q
> already has virtual addresses and freed by userspace, so this will be safe.
> I suggest keeping alloc_q free logic in the userspace in any case, if
> alloc_q is
> free it won't cost anyway.
>
>
>
This is a very valuable suggestion. My implementation is not compatible
and needs to upgrade rte_kni.ko and application at the same time.


> And while checking for this I may found something else. We have same
> problem
> with rx_q, it has physical addresses which makes hard to free in
> userspace. The
> existing intention is to give some time to kernel to consume the rx_q so
> that it
> won't be an issue for userspace. But that logic can be wrong.
> During the time userspace waits the netdev may be already destroyed and
> there is
> nothing to receive the packet, perhaps we should move wait above the ioctl.
> Since you are already checking these parts perhaps you would like to
> comment :)
>
>
In the actual scene, its impact on time is small. The mbuf leak in
kni_release does not occur in single-threaded mode, but in
multi-threaded mode. I didn't realize this leak before because I've been
using single-threaded mode. I suggest moving rx_q to free_q first and then
moving alloc_q to free_q when kni release in kernel space.


> > +     }
> > +}
> > +
> >  /*
> >   * It can be called to process the request.
> >   */
> > diff --git a/lib/librte_kni/rte_kni.c b/lib/librte_kni/rte_kni.c
> > index 2867411..f8398a9 100644
> > --- a/lib/librte_kni/rte_kni.c
> > +++ b/lib/librte_kni/rte_kni.c
> > @@ -435,30 +435,6 @@ va2pa(struct rte_mbuf *m)
> >                        (unsigned long)m->buf_iova));
> >  }
> >
> > -static void
> > -obj_free(struct rte_mempool *mp __rte_unused, void *opaque, void *obj,
> > -             unsigned obj_idx __rte_unused)
> > -{
> > -     struct rte_mbuf *m = obj;
> > -     void *mbuf_phys = opaque;
> > -
> > -     if (va2pa(m) == mbuf_phys)
> > -             rte_pktmbuf_free(m);
> > -}
> > -
> > -static void
> > -kni_free_fifo_phy(struct rte_mempool *mp, struct rte_kni_fifo *fifo)
> > -{
> > -     void *mbuf_phys;
> > -     int ret;
> > -
> > -     do {
> > -             ret = kni_fifo_get(fifo, &mbuf_phys, 1);
> > -             if (ret)
> > -                     rte_mempool_obj_iter(mp, obj_free, mbuf_phys);
> > -     } while (ret);
> > -}
> > -
> >  int
> >  rte_kni_release(struct rte_kni *kni)
> >  {
> > @@ -484,7 +460,7 @@ rte_kni_release(struct rte_kni *kni)
> >       if (kni_fifo_count(kni->rx_q))
> >               RTE_LOG(ERR, KNI, "Fail to free all Rx-q items\n");
> >
> > -     kni_free_fifo_phy(kni->pktmbuf_pool, kni->alloc_q);
> > +     kni_free_fifo(kni->alloc_q);
> >       kni_free_fifo(kni->tx_q);
> >       kni_free_fifo(kni->free_q);
> >
> >
>
>
  

Patch

diff --git a/lib/librte_eal/linuxapp/kni/kni_dev.h b/lib/librte_eal/linuxapp/kni/kni_dev.h
index c9393d8..7cd9bf8 100644
--- a/lib/librte_eal/linuxapp/kni/kni_dev.h
+++ b/lib/librte_eal/linuxapp/kni/kni_dev.h
@@ -92,6 +92,7 @@  struct kni_dev {
 	void *alloc_va[MBUF_BURST_SZ];
 };
 
+void kni_net_fifo_pa2va(struct kni_dev *kni);
 void kni_net_rx(struct kni_dev *kni);
 void kni_net_init(struct net_device *dev);
 void kni_net_config_lo_mode(char *lo_str);
diff --git a/lib/librte_eal/linuxapp/kni/kni_misc.c b/lib/librte_eal/linuxapp/kni/kni_misc.c
index 01574ec..668488b 100644
--- a/lib/librte_eal/linuxapp/kni/kni_misc.c
+++ b/lib/librte_eal/linuxapp/kni/kni_misc.c
@@ -507,6 +507,7 @@  kni_ioctl_release(struct net *net, uint32_t ioctl_num,
 			dev->pthread = NULL;
 		}
 
+		kni_net_fifo_pa2va(dev);
 		kni_dev_remove(dev);
 		list_del(&dev->list);
 		ret = 0;
diff --git a/lib/librte_eal/linuxapp/kni/kni_net.c b/lib/librte_eal/linuxapp/kni/kni_net.c
index 9f9b798..662a527 100644
--- a/lib/librte_eal/linuxapp/kni/kni_net.c
+++ b/lib/librte_eal/linuxapp/kni/kni_net.c
@@ -73,6 +73,21 @@  va2pa(void *va, struct rte_kni_mbuf *m)
 	return pa;
 }
 
+/* convert physical addresses to virtual addresses in fifo for kni release */
+void
+kni_net_fifo_pa2va(struct kni_dev *kni)
+{
+	void *fifo = kni->alloc_q;
+	int i, count = kni_fifo_count(fifo);
+	void *pa = NULL, *kva, *va;
+	for (i = 0; i < count; ++i) {
+		(void)kni_fifo_get(fifo, &pa, 1);
+		kva = pa2kva(pa);
+		va = pa2va(pa, kva);
+		(void)kni_fifo_put(fifo, &va, 1);
+	}
+}
+
 /*
  * It can be called to process the request.
  */
diff --git a/lib/librte_kni/rte_kni.c b/lib/librte_kni/rte_kni.c
index 2867411..f8398a9 100644
--- a/lib/librte_kni/rte_kni.c
+++ b/lib/librte_kni/rte_kni.c
@@ -435,30 +435,6 @@  va2pa(struct rte_mbuf *m)
 			 (unsigned long)m->buf_iova));
 }
 
-static void
-obj_free(struct rte_mempool *mp __rte_unused, void *opaque, void *obj,
-		unsigned obj_idx __rte_unused)
-{
-	struct rte_mbuf *m = obj;
-	void *mbuf_phys = opaque;
-
-	if (va2pa(m) == mbuf_phys)
-		rte_pktmbuf_free(m);
-}
-
-static void
-kni_free_fifo_phy(struct rte_mempool *mp, struct rte_kni_fifo *fifo)
-{
-	void *mbuf_phys;
-	int ret;
-
-	do {
-		ret = kni_fifo_get(fifo, &mbuf_phys, 1);
-		if (ret)
-			rte_mempool_obj_iter(mp, obj_free, mbuf_phys);
-	} while (ret);
-}
-
 int
 rte_kni_release(struct rte_kni *kni)
 {
@@ -484,7 +460,7 @@  rte_kni_release(struct rte_kni *kni)
 	if (kni_fifo_count(kni->rx_q))
 		RTE_LOG(ERR, KNI, "Fail to free all Rx-q items\n");
 
-	kni_free_fifo_phy(kni->pktmbuf_pool, kni->alloc_q);
+	kni_free_fifo(kni->alloc_q);
 	kni_free_fifo(kni->tx_q);
 	kni_free_fifo(kni->free_q);