[dpdk-dev,v2] vhost: fix virtio_net false sharing

Message ID 1490283898-23019-1-git-send-email-ktraynor@redhat.com (mailing list archive)
State Accepted, archived
Delegated to: Yuanhan Liu
Headers

Checks

Context Check Description
ci/Intel-compilation success Compilation OK
ci/checkpatch success coding style OK

Commit Message

Kevin Traynor March 23, 2017, 3:44 p.m. UTC
  The broadcast_rarp field in the virtio_net struct is checked in the
dequeue datapath regardless of whether descriptors are available or not.

As it is checked with cmpset leading to a write, false sharing on the
virtio_net struct can happen between enqueue and dequeue datapaths
regardless of whether a RARP is requested. In OVS, the issue can cause
a uni-directional performance drop of up to 15%.

Fix that by only performing the cmpset if a read of broadcast_rarp
indicates that the cmpset is likely to succeed.

Fixes: a66bcad32240 ("vhost: arrange struct fields for better cache sharing")
Cc: stable@dpdk.org

Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
---

V2:
Change from fixing by moving broadcast_rarp location in virtio_net struct
to performing a read before cmpset.

 lib/librte_vhost/virtio_net.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)
  

Comments

Maxime Coquelin March 27, 2017, 7:34 a.m. UTC | #1
On 03/23/2017 04:44 PM, Kevin Traynor wrote:
> The broadcast_rarp field in the virtio_net struct is checked in the
> dequeue datapath regardless of whether descriptors are available or not.
>
> As it is checked with cmpset leading to a write, false sharing on the
> virtio_net struct can happen between enqueue and dequeue datapaths
> regardless of whether a RARP is requested. In OVS, the issue can cause
> a uni-directional performance drop of up to 15%.
>
> Fix that by only performing the cmpset if a read of broadcast_rarp
> indicates that the cmpset is likely to succeed.
>
> Fixes: a66bcad32240 ("vhost: arrange struct fields for better cache sharing")
> Cc: stable@dpdk.org
>
> Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
> ---
>
> V2:
> Change from fixing by moving broadcast_rarp location in virtio_net struct
> to performing a read before cmpset.
>
>  lib/librte_vhost/virtio_net.c | 16 ++++++++++++++--
>  1 file changed, 14 insertions(+), 2 deletions(-)

Nice!
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>

I'll try to benchmark it with testpmd only to see if we measure the
same gain without OVS.

Thanks,
Maxime
  
Yuanhan Liu March 27, 2017, 8:33 a.m. UTC | #2
On Mon, Mar 27, 2017 at 09:34:19AM +0200, Maxime Coquelin wrote:
> 
> 
> On 03/23/2017 04:44 PM, Kevin Traynor wrote:
> >The broadcast_rarp field in the virtio_net struct is checked in the
> >dequeue datapath regardless of whether descriptors are available or not.
> >
> >As it is checked with cmpset leading to a write, false sharing on the
> >virtio_net struct can happen between enqueue and dequeue datapaths
> >regardless of whether a RARP is requested. In OVS, the issue can cause
> >a uni-directional performance drop of up to 15%.
> >
> >Fix that by only performing the cmpset if a read of broadcast_rarp
> >indicates that the cmpset is likely to succeed.
> >
> >Fixes: a66bcad32240 ("vhost: arrange struct fields for better cache sharing")
> >Cc: stable@dpdk.org
> >
> >Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
> >---
> >
> >V2:
> >Change from fixing by moving broadcast_rarp location in virtio_net struct
> >to performing a read before cmpset.
> >
> > lib/librte_vhost/virtio_net.c | 16 ++++++++++++++--
> > 1 file changed, 14 insertions(+), 2 deletions(-)
> 
> Nice!
> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>

Applied to dpdk-next-virtio.

Thanks.

	--yliu
> 
> I'll try to benchmark it with testpmd only to see if we measure the
> same gain without OVS.
> 
> Thanks,
> Maxime
  

Patch

diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 337470d..d0a3b11 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b/lib/librte_vhost/virtio_net.c
@@ -1057,7 +1057,19 @@  static inline bool __attribute__((always_inline))
 	 *
 	 * Check user_send_rarp() for more information.
+	 *
+	 * broadcast_rarp shares a cacheline in the virtio_net structure
+	 * with some fields that are accessed during enqueue and
+	 * rte_atomic16_cmpset() causes a write if using cmpxchg. This could
+	 * result in false sharing between enqueue and dequeue.
+	 *
+	 * Prevent unnecessary false sharing by reading broadcast_rarp first
+	 * and only performing cmpset if the read indicates it is likely to
+	 * be set.
 	 */
-	if (unlikely(rte_atomic16_cmpset((volatile uint16_t *)
-					 &dev->broadcast_rarp.cnt, 1, 0))) {
+
+	if (unlikely(rte_atomic16_read(&dev->broadcast_rarp) &&
+			rte_atomic16_cmpset((volatile uint16_t *)
+				&dev->broadcast_rarp.cnt, 1, 0))) {
+
 		rarp_mbuf = rte_pktmbuf_alloc(mbuf_pool);
 		if (rarp_mbuf == NULL) {