[v2] net: adjust the header length parse size

Message ID 20200905030646.374157-1-haiyue.wang@intel.com (mailing list archive)
State Superseded, archived
Delegated to: Ferruh Yigit

Checks

Context                       Check    Description
ci/Intel-compilation          success  Compilation OK
ci/iol-broadcom-Functional    success  Functional Testing PASS
ci/iol-broadcom-Performance   success  Performance Testing PASS
ci/iol-testing                success  Testing PASS
ci/iol-mellanox-Performance   success  Performance Testing PASS
ci/travis-robot               success  Travis build: errored
ci/checkpatch                 success  coding style OK

Commit Message

Wang, Haiyue Sept. 5, 2020, 3:06 a.m. UTC
Align the field sizes of the header length parse result with the Tx header
length fields in rte_mbuf:
	struct {
		uint64_t   l2_len:7;             /*    88: 0  8 */
		uint64_t   l3_len:9;             /*    88: 7  8 */
		uint64_t   l4_len:8;             /*    88:16  8 */
		uint64_t   tso_segsz:16;         /*    88:24  8 */
		uint64_t   outer_l3_len:9;       /*    88:40  8 */
		uint64_t   outer_l2_len:7;       /*    88:49  8 */
	};

With l3_len widened to 9 bits, IPv6 packets with larger extension headers
can now be supported.

The structure hole analysis results are below:

Before:
struct rte_net_hdr_lens {
        uint8_t                    l2_len;               /*     0     1 */
        uint8_t                    l3_len;               /*     1     1 */
        uint8_t                    l4_len;               /*     2     1 */
        uint8_t                    tunnel_len;           /*     3     1 */
        uint8_t                    inner_l2_len;         /*     4     1 */
        uint8_t                    inner_l3_len;         /*     5     1 */
        uint8_t                    inner_l4_len;         /*     6     1 */

        /* size: 7, cachelines: 1, members: 7 */
        /* last cacheline: 7 bytes */
};

Now:
struct rte_net_hdr_lens {
        uint64_t                   l2_len:7;             /*     0: 0  8 */
        uint64_t                   l3_len:9;             /*     0: 7  8 */
        uint64_t                   l4_len:8;             /*     0:16  8 */
        uint64_t                   tunnel_len:8;         /*     0:24  8 */
        uint64_t                   inner_l2_len:7;       /*     0:32  8 */
        uint64_t                   inner_l3_len:9;       /*     0:39  8 */
        uint64_t                   inner_l4_len:8;       /*     0:48  8 */

        /* size: 8, cachelines: 1, members: 7 */
        /* bit_padding: 8 bits */
        /* last cacheline: 8 bytes */
};

Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
---
v2: use bit field to avoid creating a structure hole.
---
 lib/librte_net/rte_net.h | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)
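
The effect of the wider l3_len can be illustrated with a minimal sketch. It
copies the "Before" and "Now" layouts from the commit message under the
hypothetical names hdr_lens_u8 and hdr_lens_bits (so they cannot clash with
the real struct rte_net_hdr_lens); the two round-trip helpers are likewise
made up for illustration. It assumes GCC or Clang, since uint64_t bit-fields
and __extension__ are compiler extensions.

```c
#include <assert.h>
#include <stdint.h>

/* "Before" layout from the commit message (hypothetical name). */
struct hdr_lens_u8 {
	uint8_t l2_len;
	uint8_t l3_len;
	uint8_t l4_len;
	uint8_t tunnel_len;
	uint8_t inner_l2_len;
	uint8_t inner_l3_len;
	uint8_t inner_l4_len;
};

/* "Now" layout from this patch (hypothetical name). */
__extension__
struct hdr_lens_bits {
	uint64_t l2_len:7;
	uint64_t l3_len:9;
	uint64_t l4_len:8;
	uint64_t tunnel_len:8;
	uint64_t inner_l2_len:7;
	uint64_t inner_l3_len:9;
	uint64_t inner_l4_len:8;
};

/* Sizes reported by the structure hole analysis above. */
_Static_assert(sizeof(struct hdr_lens_u8) == 7, "seven 1-byte members");
_Static_assert(sizeof(struct hdr_lens_bits) == 8, "one 64-bit unit");

/* Store an L3 length in the old 8-bit field and read it back. */
static unsigned int
l3_roundtrip_u8(unsigned int len)
{
	struct hdr_lens_u8 h = {0};

	h.l3_len = (uint8_t)len; /* only the low 8 bits survive */
	return h.l3_len;
}

/* Store an L3 length in the 9-bit field and read it back. */
static unsigned int
l3_roundtrip_bits(unsigned int len)
{
	struct hdr_lens_bits h = {0};

	h.l3_len = len & 0x1ff; /* 9 bits hold values up to 511 */
	return (unsigned int)h.l3_len;
}
```

For a hypothetical packet with the 40-byte IPv6 fixed header plus 256 bytes
of extension headers (296 bytes in total), l3_roundtrip_u8(296) returns 40,
while l3_roundtrip_bits(296) returns 296.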
  

Comments

Stephen Hemminger Sept. 5, 2020, 4:56 p.m. UTC | #1
On Sat,  5 Sep 2020 11:06:46 +0800
Haiyue Wang <haiyue.wang@intel.com> wrote:

> Align to the rte_mbuf's design about Tx header length data size for the
> header length parse result.
> 	struct {
> 		uint64_t   l2_len:7;             /*    88: 0  8 */
> 		uint64_t   l3_len:9;             /*    88: 7  8 */
> 		uint64_t   l4_len:8;             /*    88:16  8 */
> 		uint64_t   tso_segsz:16;         /*    88:24  8 */
> 		uint64_t   outer_l3_len:9;       /*    88:40  8 */
> 		uint64_t   outer_l2_len:7;       /*    88:49  8 */
> 	};
> 
> Now the IPv6 can support bigger extension header.
> 
> The below is the structure hole analysis result:
> 
> Before:
> struct rte_net_hdr_lens {
>         uint8_t                    l2_len;               /*     0     1 */
>         uint8_t                    l3_len;               /*     1     1 */
>         uint8_t                    l4_len;               /*     2     1 */
>         uint8_t                    tunnel_len;           /*     3     1 */
>         uint8_t                    inner_l2_len;         /*     4     1 */
>         uint8_t                    inner_l3_len;         /*     5     1 */
>         uint8_t                    inner_l4_len;         /*     6     1 */
> 
>         /* size: 7, cachelines: 1, members: 7 */
>         /* last cacheline: 7 bytes */
> };
> 
> Now:
> struct rte_net_hdr_lens {
>         uint64_t                   l2_len:7;             /*     0: 0  8 */
>         uint64_t                   l3_len:9;             /*     0: 7  8 */
>         uint64_t                   l4_len:8;             /*     0:16  8 */
>         uint64_t                   tunnel_len:8;         /*     0:24  8 */
>         uint64_t                   inner_l2_len:7;       /*     0:32  8 */
>         uint64_t                   inner_l3_len:9;       /*     0:39  8 */
>         uint64_t                   inner_l4_len:8;       /*     0:48  8 */
> 
>         /* size: 8, cachelines: 1, members: 7 */
>         /* bit_padding: 8 bits */
>         /* last cacheline: 8 bytes */
> };
> 
> Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>

Bitfields are slow to access: the compiler has to do mask/shift operations.
And there is no requirement that the structure stay the same size.

There is no requirement that fields be ordered the same as
the protocol header. Also, the tunnel length might get big.
Why not:

struct rte_net_hdr_lens {
	uint8_t l2_len;
	uint8_t inner_l2_len;
	uint16_t l3_len;
	uint16_t inner_l3_len;
	uint16_t tunnel_len;
	uint8_t l4_len;
	uint8_t inner_l4_len;
};
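
As a quick check of the layout proposed above, the following sketch copies
it under the hypothetical name hdr_lens_proposed and asserts that ordering
the members this way leaves no padding. It assumes a typical ABI where
uint16_t is 2-byte aligned; hdr_lens_member_bytes() is a helper made up for
this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Proposed layout from the review comment (hypothetical name). */
struct hdr_lens_proposed {
	uint8_t l2_len;
	uint8_t inner_l2_len;
	uint16_t l3_len;
	uint16_t inner_l3_len;
	uint16_t tunnel_len;
	uint8_t l4_len;
	uint8_t inner_l4_len;
};

/* The two leading 1-byte members put the 16-bit members on even offsets. */
_Static_assert(offsetof(struct hdr_lens_proposed, l3_len) == 2, "l3_len");
_Static_assert(offsetof(struct hdr_lens_proposed, tunnel_len) == 6, "tunnel");
_Static_assert(offsetof(struct hdr_lens_proposed, l4_len) == 8, "l4_len");
_Static_assert(sizeof(struct hdr_lens_proposed) == 10, "no padding");

/* Sum of the member sizes: four 1-byte and three 2-byte fields. */
static size_t
hdr_lens_member_bytes(void)
{
	return 4 * sizeof(uint8_t) + 3 * sizeof(uint16_t);
}
```

The structure grows from 7 to 10 bytes, but every member is a plain integer
that can be loaded and stored without masking, and the 16-bit fields hold
lengths up to 65535.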
  
Wang, Haiyue Sept. 7, 2020, 2:14 a.m. UTC | #2
Hi Stephen,

> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Sunday, September 6, 2020 00:56
> To: Wang, Haiyue <haiyue.wang@intel.com>
> Cc: dev@dpdk.org; Yigit, Ferruh <ferruh.yigit@intel.com>; Olivier Matz <olivier.matz@6wind.com>
> Subject: Re: [PATCH v2] net: adjust the header length parse size
> 
> On Sat,  5 Sep 2020 11:06:46 +0800
> Haiyue Wang <haiyue.wang@intel.com> wrote:
> 
> > Align to the rte_mbuf's design about Tx header length data size for the
> > header length parse result.
> > 	struct {
> > 		uint64_t   l2_len:7;             /*    88: 0  8 */
> > 		uint64_t   l3_len:9;             /*    88: 7  8 */
> > 		uint64_t   l4_len:8;             /*    88:16  8 */
> > 		uint64_t   tso_segsz:16;         /*    88:24  8 */
> > 		uint64_t   outer_l3_len:9;       /*    88:40  8 */
> > 		uint64_t   outer_l2_len:7;       /*    88:49  8 */
> > 	};
> >
> > Now the IPv6 can support bigger extension header.
> >
> > The below is the structure hole analysis result:
> >
> > Before:
> > struct rte_net_hdr_lens {
> >         uint8_t                    l2_len;               /*     0     1 */
> >         uint8_t                    l3_len;               /*     1     1 */
> >         uint8_t                    l4_len;               /*     2     1 */
> >         uint8_t                    tunnel_len;           /*     3     1 */
> >         uint8_t                    inner_l2_len;         /*     4     1 */
> >         uint8_t                    inner_l3_len;         /*     5     1 */
> >         uint8_t                    inner_l4_len;         /*     6     1 */
> >
> >         /* size: 7, cachelines: 1, members: 7 */
> >         /* last cacheline: 7 bytes */
> > };
> >
> > Now:
> > struct rte_net_hdr_lens {
> >         uint64_t                   l2_len:7;             /*     0: 0  8 */
> >         uint64_t                   l3_len:9;             /*     0: 7  8 */
> >         uint64_t                   l4_len:8;             /*     0:16  8 */
> >         uint64_t                   tunnel_len:8;         /*     0:24  8 */
> >         uint64_t                   inner_l2_len:7;       /*     0:32  8 */
> >         uint64_t                   inner_l3_len:9;       /*     0:39  8 */
> >         uint64_t                   inner_l4_len:8;       /*     0:48  8 */
> >
> >         /* size: 8, cachelines: 1, members: 7 */
> >         /* bit_padding: 8 bits */
> >         /* last cacheline: 8 bytes */
> > };
> >
> > Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
> 
> Bitfields are slow to access, compiler has to do mask/shift operations.

Yes, you are right. I used rdtsc to trace the run clock of rte_net_get_ptype()
over about 100000 runs: the bitfield version is near 40, while the original
is about 30.
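
The mask/shift work behind that difference can be written out by hand. This
is only a sketch: it packs the fields into a plain uint64_t using the bit
offsets from the pahole output quoted above (l2_len at bit 0, 7 bits wide;
l3_len at bit 7, 9 bits wide). The macro and function names are made up for
illustration, and the code a compiler actually emits for a bit-field depends
on the compiler and target.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions assumed from the pahole output of the v2 layout. */
#define L2_SHIFT 0
#define L2_MASK  UINT64_C(0x7f)   /* 7 bits */
#define L3_SHIFT 7
#define L3_MASK  UINT64_C(0x1ff)  /* 9 bits */

/* Reading a field: shift it down to bit 0, then mask off its neighbours. */
static inline uint64_t
get_l2(uint64_t word)
{
	return (word >> L2_SHIFT) & L2_MASK;
}

static inline uint64_t
get_l3(uint64_t word)
{
	return (word >> L3_SHIFT) & L3_MASK;
}

/* Writing a field: clear its bits, then OR in the new value. */
static inline uint64_t
set_l3(uint64_t word, uint64_t len)
{
	word &= ~(L3_MASK << L3_SHIFT);
	word |= (len & L3_MASK) << L3_SHIFT;
	return word;
}
```

Each write is a read-modify-write of the shared 64-bit word, whereas a
uint8_t or uint16_t member can be stored directly.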


> And there is no requirement that structure be the same size.
> 
> There is no requirement that fields be ordered the same as
> the protocol header. Also tunnel length might get big.
> Why not:
> 
> struct rte_net_hdr_lens {
> 	uint8_t l2_len;
> 	uint8_t inner_l2_len;
> 	uint16_t l3_len;
> 	uint16_t inner_l3_len;
> 	uint16_t tunnel_len;
> 	uint8_t l4_len;
> 	uint8_t inner_l4_len;
> };
> 

Thanks for your comment. This is better, and it will be in v3. ;-) The run
clock is nearly the same as with the original types.
  

Patch

diff --git a/lib/librte_net/rte_net.h b/lib/librte_net/rte_net.h
index 94b06d9ee..a14e3d814 100644
--- a/lib/librte_net/rte_net.h
+++ b/lib/librte_net/rte_net.h
@@ -18,14 +18,15 @@  extern "C" {
  * Structure containing header lengths associated to a packet, filled
  * by rte_net_get_ptype().
  */
+__extension__
 struct rte_net_hdr_lens {
-	uint8_t l2_len;
-	uint8_t l3_len;
-	uint8_t l4_len;
-	uint8_t tunnel_len;
-	uint8_t inner_l2_len;
-	uint8_t inner_l3_len;
-	uint8_t inner_l4_len;
+	uint64_t l2_len:7;
+	uint64_t l3_len:9;
+	uint64_t l4_len:8;
+	uint64_t tunnel_len:8;
+	uint64_t inner_l2_len:7;
+	uint64_t inner_l3_len:9;
+	uint64_t inner_l4_len:8;
 };
 
 /**