[dpdk-dev,4/5] net/vhost: remove limit of vhost TX burst size

Message ID 1487926101-4637-5-git-send-email-zhiyong.yang@intel.com (mailing list archive)
State Superseded, archived
Delegated to: Ferruh Yigit
Headers

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation fail Compilation issues

Commit Message

Yang, Zhiyong Feb. 24, 2017, 8:48 a.m. UTC
  vhost removes limit of TX burst size(32 pkts) and supports to make
an best effort to transmit pkts.

Cc: yuanhan.liu@linux.intel.com
Cc: maxime.coquelin@redhat.com

Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
---
 drivers/net/vhost/rte_eth_vhost.c | 24 ++++++++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)
  

Comments

Kevin Traynor Feb. 24, 2017, 11:08 a.m. UTC | #1
On 02/24/2017 08:48 AM, Zhiyong Yang wrote:
> vhost removes limit of TX burst size(32 pkts) and supports to make
> an best effort to transmit pkts.
> 
> Cc: yuanhan.liu@linux.intel.com
> Cc: maxime.coquelin@redhat.com
> 
> Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
> ---
>  drivers/net/vhost/rte_eth_vhost.c | 24 ++++++++++++++++++++++--
>  1 file changed, 22 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/vhost/rte_eth_vhost.c b/drivers/net/vhost/rte_eth_vhost.c
> index e98cffd..1e1fa34 100644
> --- a/drivers/net/vhost/rte_eth_vhost.c
> +++ b/drivers/net/vhost/rte_eth_vhost.c
> @@ -52,6 +52,7 @@
>  #define ETH_VHOST_QUEUES_ARG		"queues"
>  #define ETH_VHOST_CLIENT_ARG		"client"
>  #define ETH_VHOST_DEQUEUE_ZERO_COPY	"dequeue-zero-copy"
> +#define VHOST_MAX_PKT_BURST 32
>  
>  static const char *valid_arguments[] = {
>  	ETH_VHOST_IFACE_ARG,
> @@ -434,8 +435,27 @@ eth_vhost_tx(void *q, struct rte_mbuf **bufs, uint16_t nb_bufs)
>  		goto out;
>  
>  	/* Enqueue packets to guest RX queue */
> -	nb_tx = rte_vhost_enqueue_burst(r->vid,
> -			r->virtqueue_id, bufs, nb_bufs);
> +	if (likely(nb_bufs <= VHOST_MAX_PKT_BURST))
> +		nb_tx = rte_vhost_enqueue_burst(r->vid, r->virtqueue_id,
> +						bufs, nb_bufs);
> +	else {
> +		uint16_t nb_send = nb_bufs;
> +
> +		while (nb_send) {
> +			uint16_t nb_pkts;
> +			uint16_t num = (uint16_t)RTE_MIN(nb_send,
> +					VHOST_MAX_PKT_BURST);
> +
> +			nb_pkts = rte_vhost_enqueue_burst(r->vid,
> +							  r->virtqueue_id,
> +							  &bufs[nb_tx], num);
> +
> +			nb_tx += nb_pkts;
> +			nb_send -= nb_pkts;
> +			if (nb_pkts < num)
> +				break;
> +		}

In the code above,
- if the VM does not service the queue, this will spin forever
- if the queue is almost full, it will be very slow

In the example of OVS, other ports being serviced by the same core would
stop being serviced or drop a lot of packets.

If you want to remove the 32 pkt limitation, the
rte_vhost_enqueue_burst() api limitation needs to be changed first. It
doesn't make sense to put retry code in eth_vhost_tx() fn, the
application can retry if it wants to.

> +	}
>  
>  	r->stats.pkts += nb_tx;
>  	r->stats.missed_pkts += nb_bufs - nb_tx;
>
  
Bruce Richardson Feb. 24, 2017, 1:04 p.m. UTC | #2
On Fri, Feb 24, 2017 at 11:08:56AM +0000, Kevin Traynor wrote:
> On 02/24/2017 08:48 AM, Zhiyong Yang wrote:
> > vhost removes limit of TX burst size(32 pkts) and supports to make
> > an best effort to transmit pkts.
> > 
> > Cc: yuanhan.liu@linux.intel.com
> > Cc: maxime.coquelin@redhat.com
> > 
> > Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
> > ---
> >  drivers/net/vhost/rte_eth_vhost.c | 24 ++++++++++++++++++++++--
> >  1 file changed, 22 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/net/vhost/rte_eth_vhost.c b/drivers/net/vhost/rte_eth_vhost.c
> > index e98cffd..1e1fa34 100644
> > --- a/drivers/net/vhost/rte_eth_vhost.c
> > +++ b/drivers/net/vhost/rte_eth_vhost.c
> > @@ -52,6 +52,7 @@
> >  #define ETH_VHOST_QUEUES_ARG		"queues"
> >  #define ETH_VHOST_CLIENT_ARG		"client"
> >  #define ETH_VHOST_DEQUEUE_ZERO_COPY	"dequeue-zero-copy"
> > +#define VHOST_MAX_PKT_BURST 32
> >  
> >  static const char *valid_arguments[] = {
> >  	ETH_VHOST_IFACE_ARG,
> > @@ -434,8 +435,27 @@ eth_vhost_tx(void *q, struct rte_mbuf **bufs, uint16_t nb_bufs)
> >  		goto out;
> >  
> >  	/* Enqueue packets to guest RX queue */
> > -	nb_tx = rte_vhost_enqueue_burst(r->vid,
> > -			r->virtqueue_id, bufs, nb_bufs);
> > +	if (likely(nb_bufs <= VHOST_MAX_PKT_BURST))
> > +		nb_tx = rte_vhost_enqueue_burst(r->vid, r->virtqueue_id,
> > +						bufs, nb_bufs);
> > +	else {
> > +		uint16_t nb_send = nb_bufs;
> > +
> > +		while (nb_send) {
> > +			uint16_t nb_pkts;
> > +			uint16_t num = (uint16_t)RTE_MIN(nb_send,
> > +					VHOST_MAX_PKT_BURST);
> > +
> > +			nb_pkts = rte_vhost_enqueue_burst(r->vid,
> > +							  r->virtqueue_id,
> > +							  &bufs[nb_tx], num);
> > +
> > +			nb_tx += nb_pkts;
> > +			nb_send -= nb_pkts;
> > +			if (nb_pkts < num)
> > +				break;
> > +		}
> 
> In the code above,
> - if the VM does not service the queue, this will spin forever
I don't think that is the case. As soon as the enqueue stops sending a
full burst of (presumably 32) pkts, it will hit the break condition and
quit. If it does send the full burst, it makes good forward progress
until it runs out of packets to send and then quits.

> - if the queue is almost full, it will be very slow
Again, should not be the case. As soon as a full burst is not full
enqueued the logic quits the loop.

/Bruce
  
Kevin Traynor Feb. 24, 2017, 1:33 p.m. UTC | #3
On 02/24/2017 01:04 PM, Bruce Richardson wrote:
> On Fri, Feb 24, 2017 at 11:08:56AM +0000, Kevin Traynor wrote:
>> On 02/24/2017 08:48 AM, Zhiyong Yang wrote:
>>> vhost removes limit of TX burst size(32 pkts) and supports to make
>>> an best effort to transmit pkts.
>>>
>>> Cc: yuanhan.liu@linux.intel.com
>>> Cc: maxime.coquelin@redhat.com
>>>
>>> Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
>>> ---
>>>  drivers/net/vhost/rte_eth_vhost.c | 24 ++++++++++++++++++++++--
>>>  1 file changed, 22 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/net/vhost/rte_eth_vhost.c b/drivers/net/vhost/rte_eth_vhost.c
>>> index e98cffd..1e1fa34 100644
>>> --- a/drivers/net/vhost/rte_eth_vhost.c
>>> +++ b/drivers/net/vhost/rte_eth_vhost.c
>>> @@ -52,6 +52,7 @@
>>>  #define ETH_VHOST_QUEUES_ARG		"queues"
>>>  #define ETH_VHOST_CLIENT_ARG		"client"
>>>  #define ETH_VHOST_DEQUEUE_ZERO_COPY	"dequeue-zero-copy"
>>> +#define VHOST_MAX_PKT_BURST 32
>>>  
>>>  static const char *valid_arguments[] = {
>>>  	ETH_VHOST_IFACE_ARG,
>>> @@ -434,8 +435,27 @@ eth_vhost_tx(void *q, struct rte_mbuf **bufs, uint16_t nb_bufs)
>>>  		goto out;
>>>  
>>>  	/* Enqueue packets to guest RX queue */
>>> -	nb_tx = rte_vhost_enqueue_burst(r->vid,
>>> -			r->virtqueue_id, bufs, nb_bufs);
>>> +	if (likely(nb_bufs <= VHOST_MAX_PKT_BURST))
>>> +		nb_tx = rte_vhost_enqueue_burst(r->vid, r->virtqueue_id,
>>> +						bufs, nb_bufs);
>>> +	else {
>>> +		uint16_t nb_send = nb_bufs;
>>> +
>>> +		while (nb_send) {
>>> +			uint16_t nb_pkts;
>>> +			uint16_t num = (uint16_t)RTE_MIN(nb_send,
>>> +					VHOST_MAX_PKT_BURST);
>>> +
>>> +			nb_pkts = rte_vhost_enqueue_burst(r->vid,
>>> +							  r->virtqueue_id,
>>> +							  &bufs[nb_tx], num);
>>> +
>>> +			nb_tx += nb_pkts;
>>> +			nb_send -= nb_pkts;
>>> +			if (nb_pkts < num)
>>> +				break;
>>> +		}
>>
>> In the code above,
>> - if the VM does not service the queue, this will spin forever
> I don't think that is the case. As soon as the enqueue stops sending a
> full burst of (presumably 32) pkts, it will hit the break condition and
> quit. If it does send the full burst, it makes good forward progress
> until it runs out of packets to send and then quits.
> 
>> - if the queue is almost full, it will be very slow
> Again, should not be the case. As soon as a full burst is not full
> enqueued the logic quits the loop.
> 

My bad - you are of course correct. In that case it makes sense, as the
retries are just enough to allow all packets a chance to be sent but not
to retry when packets fail to send.

thanks,
Kevin.

> /Bruce
>
  
Maxime Coquelin March 1, 2017, 9:44 a.m. UTC | #4
On 02/24/2017 09:48 AM, Zhiyong Yang wrote:
> vhost removes limit of TX burst size(32 pkts) and supports to make
> an best effort to transmit pkts.
>
> Cc: yuanhan.liu@linux.intel.com
> Cc: maxime.coquelin@redhat.com
>
> Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
> ---
>  drivers/net/vhost/rte_eth_vhost.c | 24 ++++++++++++++++++++++--
>  1 file changed, 22 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/vhost/rte_eth_vhost.c b/drivers/net/vhost/rte_eth_vhost.c
> index e98cffd..1e1fa34 100644
> --- a/drivers/net/vhost/rte_eth_vhost.c
> +++ b/drivers/net/vhost/rte_eth_vhost.c
> @@ -52,6 +52,7 @@
>  #define ETH_VHOST_QUEUES_ARG		"queues"
>  #define ETH_VHOST_CLIENT_ARG		"client"
>  #define ETH_VHOST_DEQUEUE_ZERO_COPY	"dequeue-zero-copy"
> +#define VHOST_MAX_PKT_BURST 32
>
>  static const char *valid_arguments[] = {
>  	ETH_VHOST_IFACE_ARG,
> @@ -434,8 +435,27 @@ eth_vhost_tx(void *q, struct rte_mbuf **bufs, uint16_t nb_bufs)
>  		goto out;
>
>  	/* Enqueue packets to guest RX queue */
> -	nb_tx = rte_vhost_enqueue_burst(r->vid,
> -			r->virtqueue_id, bufs, nb_bufs);
> +	if (likely(nb_bufs <= VHOST_MAX_PKT_BURST))
> +		nb_tx = rte_vhost_enqueue_burst(r->vid, r->virtqueue_id,
> +						bufs, nb_bufs);
> +	else {
> +		uint16_t nb_send = nb_bufs;
> +
> +		while (nb_send) {
> +			uint16_t nb_pkts;
> +			uint16_t num = (uint16_t)RTE_MIN(nb_send,
> +					VHOST_MAX_PKT_BURST);
> +
> +			nb_pkts = rte_vhost_enqueue_burst(r->vid,
> +							  r->virtqueue_id,
> +							  &bufs[nb_tx], num);
> +
> +			nb_tx += nb_pkts;
> +			nb_send -= nb_pkts;
> +			if (nb_pkts < num)
> +				break;
> +		}
> +	}
It looks like the if/else could be avoided, but maybe you did so for
performance reason?
If this is the case, maybe you could add a comment or at least state
this in the commit message.

Thanks,
Maxime
>
>  	r->stats.pkts += nb_tx;
>  	r->stats.missed_pkts += nb_bufs - nb_tx;
>
  
Yang, Zhiyong March 1, 2017, 1:24 p.m. UTC | #5
Hi, Maxime:

> -----Original Message-----
> From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
> Sent: Wednesday, March 1, 2017 5:44 PM
> To: Yang, Zhiyong <zhiyong.yang@intel.com>; dev@dpdk.org
> Cc: yuanhan.liu@linux.intel.com
> Subject: Re: [PATCH 4/5] net/vhost: remove limit of vhost TX burst size
> 
> 
> 
> On 02/24/2017 09:48 AM, Zhiyong Yang wrote:
> > vhost removes limit of TX burst size(32 pkts) and supports to make an
> > best effort to transmit pkts.
> >
> > Cc: yuanhan.liu@linux.intel.com
> > Cc: maxime.coquelin@redhat.com
> >
> > Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
> > ---
> >  drivers/net/vhost/rte_eth_vhost.c | 24 ++++++++++++++++++++++--
> >  1 file changed, 22 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/net/vhost/rte_eth_vhost.c
> > b/drivers/net/vhost/rte_eth_vhost.c
> > index e98cffd..1e1fa34 100644
> > --- a/drivers/net/vhost/rte_eth_vhost.c
> > +++ b/drivers/net/vhost/rte_eth_vhost.c
> > @@ -52,6 +52,7 @@
> >  #define ETH_VHOST_QUEUES_ARG		"queues"
> >  #define ETH_VHOST_CLIENT_ARG		"client"
> >  #define ETH_VHOST_DEQUEUE_ZERO_COPY	"dequeue-zero-copy"
> > +#define VHOST_MAX_PKT_BURST 32
> >
> >  static const char *valid_arguments[] = {
> >  	ETH_VHOST_IFACE_ARG,
> > @@ -434,8 +435,27 @@ eth_vhost_tx(void *q, struct rte_mbuf **bufs,
> uint16_t nb_bufs)
> >  		goto out;
> >
> >  	/* Enqueue packets to guest RX queue */
> > -	nb_tx = rte_vhost_enqueue_burst(r->vid,
> > -			r->virtqueue_id, bufs, nb_bufs);
> > +	if (likely(nb_bufs <= VHOST_MAX_PKT_BURST))
> > +		nb_tx = rte_vhost_enqueue_burst(r->vid, r->virtqueue_id,
> > +						bufs, nb_bufs);
> > +	else {
> > +		uint16_t nb_send = nb_bufs;
> > +
> > +		while (nb_send) {
> > +			uint16_t nb_pkts;
> > +			uint16_t num = (uint16_t)RTE_MIN(nb_send,
> > +					VHOST_MAX_PKT_BURST);
> > +
> > +			nb_pkts = rte_vhost_enqueue_burst(r->vid,
> > +							  r->virtqueue_id,
> > +							  &bufs[nb_tx], num);
> > +
> > +			nb_tx += nb_pkts;
> > +			nb_send -= nb_pkts;
> > +			if (nb_pkts < num)
> > +				break;
> > +		}
> > +	}
> It looks like the if/else could be avoided, but maybe you did so for
> performance reason?
> If this is the case, maybe you could add a comment or at least state this in the
> commit message.

Yes, you are right.
if/else can be avoided and code will  look more clean.
I  choose performance between them. 
Comments will be added in V2 here.

Thanks
Zhiyong

> 
> Thanks,
> Maxime
> >
> >  	r->stats.pkts += nb_tx;
> >  	r->stats.missed_pkts += nb_bufs - nb_tx;
> >
  

Patch

diff --git a/drivers/net/vhost/rte_eth_vhost.c b/drivers/net/vhost/rte_eth_vhost.c
index e98cffd..1e1fa34 100644
--- a/drivers/net/vhost/rte_eth_vhost.c
+++ b/drivers/net/vhost/rte_eth_vhost.c
@@ -52,6 +52,7 @@ 
 #define ETH_VHOST_QUEUES_ARG		"queues"
 #define ETH_VHOST_CLIENT_ARG		"client"
 #define ETH_VHOST_DEQUEUE_ZERO_COPY	"dequeue-zero-copy"
+#define VHOST_MAX_PKT_BURST 32
 
 static const char *valid_arguments[] = {
 	ETH_VHOST_IFACE_ARG,
@@ -434,8 +435,27 @@  eth_vhost_tx(void *q, struct rte_mbuf **bufs, uint16_t nb_bufs)
 		goto out;
 
 	/* Enqueue packets to guest RX queue */
-	nb_tx = rte_vhost_enqueue_burst(r->vid,
-			r->virtqueue_id, bufs, nb_bufs);
+	if (likely(nb_bufs <= VHOST_MAX_PKT_BURST))
+		nb_tx = rte_vhost_enqueue_burst(r->vid, r->virtqueue_id,
+						bufs, nb_bufs);
+	else {
+		uint16_t nb_send = nb_bufs;
+
+		while (nb_send) {
+			uint16_t nb_pkts;
+			uint16_t num = (uint16_t)RTE_MIN(nb_send,
+					VHOST_MAX_PKT_BURST);
+
+			nb_pkts = rte_vhost_enqueue_burst(r->vid,
+							  r->virtqueue_id,
+							  &bufs[nb_tx], num);
+
+			nb_tx += nb_pkts;
+			nb_send -= nb_pkts;
+			if (nb_pkts < num)
+				break;
+		}
+	}
 
 	r->stats.pkts += nb_tx;
 	r->stats.missed_pkts += nb_bufs - nb_tx;