[dpdk-dev] [PATCH v6 1/3] lib: add Generic Receive Offload API framework
Tan, Jianfeng
jianfeng.tan at intel.com
Sun Jun 25 18:53:42 CEST 2017
Hi Jiayu,
On 6/23/2017 10:43 PM, Jiayu Hu wrote:
> Generic Receive Offload (GRO) is a widely used SW-based offloading
> technique to reduce per-packet processing overhead. It gains
> performance by reassembling small packets into large ones. This
> patchset is to support GRO in DPDK. To support GRO, this patch
> implements a GRO API framework.
>
> To enable more flexibility to applications, DPDK GRO is implemented as
> a user library. Applications explicitly use the GRO library to merge
> small packets into large ones. DPDK GRO provides two reassembly modes.
> One is called lightweight mode, the other is called heavyweight mode.
> If applications want to merge packets in a simple way, they can use
> lightweight mode. If applications need more fine-grained control,
> they can choose heavyweight mode.
>
> rte_gro_reassemble_burst is the main reassembly API which is used in
> lightweight mode and processes N packets at a time. For applications,
> performing GRO in lightweight mode is simple. They just need to invoke
> rte_gro_reassemble_burst. Applications can get GROed packets as soon as
> rte_gro_reassemble_burst returns.
>
> rte_gro_reassemble is the main reassembly API which is used in
> heavyweight mode and processes one packet at a time. For applications,
> performing GRO in heavyweight mode is relatively complicated. Before
> performing GRO, applications need to create a GRO table by
> rte_gro_tbl_create. Then they can use rte_gro_reassemble to process
> packets one by one. The processed packets are in the GRO table. If
> applications want to get them, applications need to manually flush
> them by flush APIs.
>
> Signed-off-by: Jiayu Hu <jiayu.hu at intel.com>
> ---
> config/common_base | 5 +
> lib/Makefile | 2 +
> lib/librte_gro/Makefile | 50 ++++++++++
> lib/librte_gro/rte_gro.c | 125 ++++++++++++++++++++++++
> lib/librte_gro/rte_gro.h | 191 +++++++++++++++++++++++++++++++++++++
> lib/librte_gro/rte_gro_version.map | 12 +++
> mk/rte.app.mk | 1 +
> 7 files changed, 386 insertions(+)
> create mode 100644 lib/librte_gro/Makefile
> create mode 100644 lib/librte_gro/rte_gro.c
> create mode 100644 lib/librte_gro/rte_gro.h
> create mode 100644 lib/librte_gro/rte_gro_version.map
>
> diff --git a/config/common_base b/config/common_base
> index f6aafd1..167f5ef 100644
> --- a/config/common_base
> +++ b/config/common_base
> @@ -712,6 +712,11 @@ CONFIG_RTE_LIBRTE_VHOST_DEBUG=n
> CONFIG_RTE_LIBRTE_PMD_VHOST=n
>
> #
> +# Compile GRO library
> +#
> +CONFIG_RTE_LIBRTE_GRO=y
> +
> +#
> #Compile Xen domain0 support
> #
> CONFIG_RTE_LIBRTE_XEN_DOM0=n
> diff --git a/lib/Makefile b/lib/Makefile
> index 07e1fd0..ac1c2f6 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -106,6 +106,8 @@ DIRS-$(CONFIG_RTE_LIBRTE_REORDER) += librte_reorder
> DEPDIRS-librte_reorder := librte_eal librte_mempool librte_mbuf
> DIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += librte_pdump
> DEPDIRS-librte_pdump := librte_eal librte_mempool librte_mbuf librte_ether
> +DIRS-$(CONFIG_RTE_LIBRTE_GRO) += librte_gro
> +DEPDIRS-librte_gro := librte_eal librte_mbuf librte_ether librte_net
>
> ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
> DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni
> diff --git a/lib/librte_gro/Makefile b/lib/librte_gro/Makefile
> new file mode 100644
> index 0000000..7e0f128
> --- /dev/null
> +++ b/lib/librte_gro/Makefile
> @@ -0,0 +1,50 @@
> +# BSD LICENSE
> +#
> +# Copyright(c) 2017 Intel Corporation. All rights reserved.
> +# All rights reserved.
> +#
> +# Redistribution and use in source and binary forms, with or without
> +# modification, are permitted provided that the following conditions
> +# are met:
> +#
> +# * Redistributions of source code must retain the above copyright
> +# notice, this list of conditions and the following disclaimer.
> +# * Redistributions in binary form must reproduce the above copyright
> +# notice, this list of conditions and the following disclaimer in
> +# the documentation and/or other materials provided with the
> +# distribution.
> +# * Neither the name of Intel Corporation nor the names of its
> +# contributors may be used to endorse or promote products derived
> +# from this software without specific prior written permission.
> +#
> +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> +
> +include $(RTE_SDK)/mk/rte.vars.mk
> +
> +# library name
> +LIB = librte_gro.a
> +
> +CFLAGS += -O3
> +CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
> +
> +EXPORT_MAP := rte_gro_version.map
> +
> +LIBABIVER := 1
> +
> +# source files
> +SRCS-$(CONFIG_RTE_LIBRTE_GRO) += rte_gro.c
> +
> +# install this header file
> +SYMLINK-$(CONFIG_RTE_LIBRTE_GRO)-include += rte_gro.h
> +
> +include $(RTE_SDK)/mk/rte.lib.mk
> diff --git a/lib/librte_gro/rte_gro.c b/lib/librte_gro/rte_gro.c
> new file mode 100644
> index 0000000..ebc545f
> --- /dev/null
> +++ b/lib/librte_gro/rte_gro.c
> @@ -0,0 +1,125 @@
> +/*-
> + * BSD LICENSE
> + *
> + * Copyright(c) 2017 Intel Corporation. All rights reserved.
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions
> + * are met:
> + *
> + * * Redistributions of source code must retain the above copyright
> + * notice, this list of conditions and the following disclaimer.
> + * * Redistributions in binary form must reproduce the above copyright
> + * notice, this list of conditions and the following disclaimer in
> + * the documentation and/or other materials provided with the
> + * distribution.
> + * * Neither the name of Intel Corporation nor the names of its
> + * contributors may be used to endorse or promote products derived
> + * from this software without specific prior written permission.
> + *
> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#include <rte_malloc.h>
> +#include <rte_mbuf.h>
> +
> +#include "rte_gro.h"
> +
> +static gro_tbl_create_fn tbl_create_functions[GRO_TYPE_MAX_NB];
> +static gro_tbl_destroy_fn tbl_destroy_functions[GRO_TYPE_MAX_NB];
> +
> +struct rte_gro_tbl *rte_gro_tbl_create(uint16_t socket_id,
> + uint16_t max_flow_num,
> + uint16_t max_item_per_flow,
> + uint32_t max_packet_size,
> + uint64_t max_timeout_cycles,
> + uint64_t desired_gro_types)
> +{
> + gro_tbl_create_fn create_tbl_fn;
> + struct rte_gro_tbl *gro_tbl;
> + uint64_t gro_type_flag = 0;
> + uint8_t i;
> +
> + gro_tbl = rte_zmalloc_socket(__func__,
> + sizeof(struct rte_gro_tbl),
> + RTE_CACHE_LINE_SIZE,
> + socket_id);
> + gro_tbl->max_packet_size = max_packet_size;
> + gro_tbl->max_timeout_cycles = max_timeout_cycles;
> + gro_tbl->desired_gro_types = desired_gro_types;
> +
> + for (i = 0; i < GRO_TYPE_MAX_NB; i++) {
> + gro_type_flag = 1ULL << i;
> + if (desired_gro_types & gro_type_flag) {
> + create_tbl_fn = tbl_create_functions[i];
> + if (create_tbl_fn)
> + gro_tbl->tbls[i] = create_tbl_fn(socket_id,
> + max_flow_num,
> + max_item_per_flow);
> + else
> + gro_tbl->tbls[i] = NULL;
> + }
> + }
> + return gro_tbl;
> +}
> +
> +void rte_gro_tbl_destroy(struct rte_gro_tbl *gro_tbl)
> +{
> + gro_tbl_destroy_fn destroy_tbl_fn;
> + uint64_t gro_type_flag;
> + uint8_t i;
> +
> + if (gro_tbl == NULL)
> + return;
> + for (i = 0; i < GRO_TYPE_MAX_NB; i++) {
> + gro_type_flag = 1ULL << i;
> + if (gro_tbl->desired_gro_types & gro_type_flag) {
> + destroy_tbl_fn = tbl_destroy_functions[i];
> + if (destroy_tbl_fn)
> + destroy_tbl_fn(gro_tbl->tbls[i]);
> + gro_tbl->tbls[i] = NULL;
> + }
> + }
> + rte_free(gro_tbl);
> +}
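A note on the type-flag loops in rte_gro_tbl_create and rte_gro_tbl_destroy: GRO_TYPE_MAX_NB is 64, so the per-type mask has to be computed in 64 bits, since shifting a plain int constant by 32 or more is undefined behaviour in C. A minimal sketch of that point follows; gro_type_mask and count_requested_types are hypothetical helper names used only for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

#define GRO_TYPE_MAX_NB 64 /* same value as in the patch */

/* Hypothetical helper: 64-bit mask for GRO type index i. */
static inline uint64_t
gro_type_mask(uint8_t i)
{
	assert(i < GRO_TYPE_MAX_NB);
	/* UINT64_C(1) makes the shifted operand 64 bits wide. */
	return UINT64_C(1) << i;
}

/* Hypothetical helper: count how many GRO types are requested. */
static inline unsigned int
count_requested_types(uint64_t desired_gro_types)
{
	unsigned int n = 0;
	uint8_t i;

	for (i = 0; i < GRO_TYPE_MAX_NB; i++)
		if (desired_gro_types & gro_type_mask(i))
			n++;
	return n;
}
```

With this form, type indexes 32..63 produce distinct masks instead of relying on an int-sized shift.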
> +
> +uint16_t
> +rte_gro_reassemble_burst(struct rte_mbuf **pkts __rte_unused,
> + const uint16_t nb_pkts,
> + const struct rte_gro_param param __rte_unused)
> +{
> + return nb_pkts;
> +}
> +
> +int rte_gro_reassemble(struct rte_mbuf *pkt __rte_unused,
> + struct rte_gro_tbl *gro_tbl __rte_unused)
> +{
> + return -1;
> +}
> +
> +uint16_t rte_gro_flush(struct rte_gro_tbl *gro_tbl __rte_unused,
> + uint64_t desired_gro_types __rte_unused,
> + struct rte_mbuf **out __rte_unused,
> + const uint16_t max_nb_out __rte_unused)
> +{
> + return 0;
> +}
> +
> +uint16_t
> +rte_gro_timeout_flush(struct rte_gro_tbl *gro_tbl __rte_unused,
> + uint64_t desired_gro_types __rte_unused,
> + struct rte_mbuf **out __rte_unused,
> + const uint16_t max_nb_out __rte_unused)
> +{
> + return 0;
> +}
> diff --git a/lib/librte_gro/rte_gro.h b/lib/librte_gro/rte_gro.h
> new file mode 100644
> index 0000000..2c547fa
> --- /dev/null
> +++ b/lib/librte_gro/rte_gro.h
> @@ -0,0 +1,191 @@
> +/*-
> + * BSD LICENSE
> + *
> + * Copyright(c) 2017 Intel Corporation. All rights reserved.
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions
> + * are met:
> + *
> + * * Redistributions of source code must retain the above copyright
> + * notice, this list of conditions and the following disclaimer.
> + * * Redistributions in binary form must reproduce the above copyright
> + * notice, this list of conditions and the following disclaimer in
> + * the documentation and/or other materials provided with the
> + * distribution.
> + * * Neither the name of Intel Corporation nor the names of its
> + * contributors may be used to endorse or promote products derived
> + * from this software without specific prior written permission.
> + *
> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#ifndef _RTE_GRO_H_
> +#define _RTE_GRO_H_
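The C++ linkage guard is missing here:
#ifdef __cplusplus
extern "C" {
#endif
It also needs the matching close before the final #endif of the header:
#ifdef __cplusplus
}
#endif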
> +
> +/* max number of supported GRO types */
> +#define GRO_TYPE_MAX_NB 64
> +#define GRO_TYPE_SUPPORT_NB 0 /**< current supported GRO num */
I would prefer the suffix _NUM over _NB, though this is not a strong objection.
> +
> +/**
> + * GRO table, which is used to merge packets. It keeps many reassembly
> + * tables of desired GRO types. Applications need to create GRO tables
> + * before using rte_gro_reassemble to perform GRO.
> + */
> +struct rte_gro_tbl {
> + uint64_t desired_gro_types; /**< GRO types to perform */
> + /* max TTL measured in nanosecond */
> + uint64_t max_timeout_cycles;
> + /* max length of merged packet measured in byte */
> + uint32_t max_packet_size;
> + /* reassembly tables of desired GRO types */
> + void *tbls[GRO_TYPE_MAX_NB];
> +};
> +
> +struct rte_gro_param {
> + uint64_t desired_gro_types; /**< desired GRO types */
> + uint32_t max_packet_size; /**< max length of merged packets */
> + uint16_t max_flow_num; /**< max flow number */
> + uint16_t max_item_per_flow; /**< max packet number per flow */
> +};
> +
> +typedef void *(*gro_tbl_create_fn)(uint16_t socket_id,
> + uint16_t max_flow_num,
> + uint16_t max_item_per_flow);
> +typedef void (*gro_tbl_destroy_fn)(void *tbl);
> +
> +/**
> + * This function creates a GRO table, which is used to merge packets.
> + *
> + * @param socket_id
> + * socket index where the Ethernet port connects to.
> + * @param max_flow_num
> + * max number of flows in the GRO table.
> + * @param max_item_per_flow
> + * max packet number per flow. We use the value of (max_flow_num *
> + * max_item_per_flow) to calculate table size.
> + * @param max_packet_size
> + * max length of merged packets. Measured in byte.
> + * @param max_timeout_cycles
> + * max TTL for a packet in the GRO table. It's measured in nanosecond.
> + * @param desired_gro_types
> + * GRO types to perform.
> + * @return
> + * on success, return a pointer to the GRO table. Otherwise,
> + * return NULL.
> + */
> +struct rte_gro_tbl *rte_gro_tbl_create(uint16_t socket_id,
> + uint16_t max_flow_num,
> + uint16_t max_item_per_flow,
> + uint32_t max_packet_size,
> + uint64_t max_timeout_cycles,
> + uint64_t desired_gro_types);
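The comment above describes the timeout in nanoseconds, while the parameter is named max_timeout_cycles. Assuming the stored value is meant to be in timer cycles, an application would have to convert before calling rte_gro_tbl_create. The sketch below shows one way to do that; ns_to_cycles is a hypothetical helper, not part of the patch, and the frequency is passed in as a parameter so the example stays self-contained (in a DPDK application it would typically come from rte_get_timer_hz()).

```c
#include <assert.h>
#include <stdint.h>

#define NS_PER_S UINT64_C(1000000000)

/*
 * Convert a timeout in nanoseconds to timer cycles for a timer running
 * at hz cycles per second. Splitting ns into whole seconds and a
 * remainder keeps the intermediate products from overflowing for
 * realistic frequencies (a few GHz).
 */
static inline uint64_t
ns_to_cycles(uint64_t ns, uint64_t hz)
{
	assert(hz > 0);
	return (ns / NS_PER_S) * hz + (ns % NS_PER_S) * hz / NS_PER_S;
}
```

Whichever unit is intended, the comment and the parameter name should agree.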
> +/**
> + * This function destroys a GRO table.
> + */
> +void rte_gro_tbl_destroy(struct rte_gro_tbl *gro_tbl);
> +
> +/**
> + * This is one of the main reassembly APIs, which merges a burst of
> + * packets at a time. It assumes that all input packets have
> + * correct checksums. That is, applications should guarantee all
> + * input packets are correct. Besides, it doesn't re-calculate
> + * checksums for merged packets. If input packets are IP fragmented,
> + * this function assumes they are complete (i.e. with L4 header). After
> + * finishing processing, it returns all GROed packets to applications
> + * immediately.
> + *
> + * @param pkts
> + * a pointer array which points to the packets to reassemble. Besides,
> + * it keeps addresses of GROed packets.
> + * @param nb_pkts
> + * the number of packets to reassemble.
> + * @param param
> + * applications use it to tell rte_gro_reassemble_burst what rules
> + * are demanded.
> + * @return
> + * the number of packets remaining after GRO.
> + */
> +uint16_t rte_gro_reassemble_burst(struct rte_mbuf **pkts,
> + const uint16_t nb_pkts,
> + const struct rte_gro_param param);
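To make the in/out convention of pkts concrete (the same array carries the input packets and, on return, the GROed packets, with the return value giving the new count), here is a toy sketch. It uses a stand-in packet type and a deliberately simplistic merge rule (same flow as the previous output packet, merged length within the limit); toy_pkt and toy_reassemble_burst are assumptions for illustration and say nothing about how the real implementation matches or chains mbufs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy packet: flow id plus payload length. NOT the DPDK rte_mbuf. */
struct toy_pkt {
	uint32_t flow;
	uint32_t len;
};

/*
 * Toy burst reassembly: merges a packet into the previous output packet
 * when both belong to the same flow and the merged length fits.
 * Survivors are compacted to the front of pkts[]; returns their count.
 */
static uint16_t
toy_reassemble_burst(struct toy_pkt **pkts, uint16_t nb_pkts,
		uint32_t max_packet_size)
{
	uint16_t i, nb_out = 0;

	assert(pkts != NULL);
	for (i = 0; i < nb_pkts; i++) {
		struct toy_pkt *prev = nb_out ? pkts[nb_out - 1] : NULL;

		if (prev != NULL && prev->flow == pkts[i]->flow &&
				prev->len + pkts[i]->len <= max_packet_size)
			prev->len += pkts[i]->len; /* merged into prev */
		else
			pkts[nb_out++] = pkts[i]; /* kept as its own packet */
	}
	return nb_out;
}
```

The caller then transmits only the first nb_out entries of the array.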
> +
> +/**
> + * Reassembly function, which tries to merge the input packet with
> + * one packet in a given GRO table. This function assumes the input
> + * packet has correct checksums, and it won't update checksums if
> + * two packets are merged. Besides, if the input packet is IP
> + * fragmented, this function assumes it's a complete packet (i.e. with
> + * L4 header).
> + *
> + * If the input packet doesn't have data or has an unsupported GRO
> + * type, the function returns immediately. Otherwise, the input packet is
> + * either merged or inserted into the table. If applications want to get
> + * packets in the table, they need to call flush APIs.
> + *
> + * @param pkt
> + * packet to reassemble.
> + * @param gro_tbl
> + * a pointer to a GRO table.
> + * @return
> + * if the packet is merged successfully, return a positive value. If it
> + * fails to merge, return zero. If the packet doesn't have data, or its GRO
> + * type is unsupported, return a negative value.
> + */
> +int rte_gro_reassemble(struct rte_mbuf *pkt,
> + struct rte_gro_tbl *gro_tbl);
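The three-way return value implies a specific calling pattern, sketched below with toy stand-ins. The sketch reads the comment above as: a non-negative return means the table now holds the packet (merged, or stored as a new item), and a negative return means the caller still owns it and must forward it itself. That reading, and the names toy_pkt, toy_tbl, toy_reassemble and feed_table, are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins, NOT the DPDK rte_mbuf / rte_gro_tbl types. */
struct toy_pkt { int has_payload; };
struct toy_tbl { unsigned int held; };

/* Toy reassembly following the documented return convention. */
static int
toy_reassemble(struct toy_pkt *pkt, struct toy_tbl *tbl)
{
	assert(pkt != NULL && tbl != NULL);
	if (!pkt->has_payload)
		return -1;	/* nothing to merge: caller keeps the packet */
	if (tbl->held > 0)
		return 1;	/* merged into a packet already in the table */
	tbl->held = 1;
	return 0;		/* not merged: stored in the table as a new item */
}

/*
 * Feed packets one by one; collect in bypass[] the packets the table
 * refused. Returns the number of bypassed packets.
 */
static unsigned int
feed_table(struct toy_pkt *pkts, unsigned int nb_pkts,
		struct toy_tbl *tbl, struct toy_pkt **bypass)
{
	unsigned int i, nb_bypass = 0;

	for (i = 0; i < nb_pkts; i++)
		if (toy_reassemble(&pkts[i], tbl) < 0)
			bypass[nb_bypass++] = &pkts[i];
	return nb_bypass;
}
```

Packets accepted by the table only come back to the application through the flush APIs.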
> +
> +/**
> + * This function flushes packets from reassembly tables of desired GRO
> + * types. It won't re-calculate checksums for merged packets in the
> + * tables. That is, the returned packets may have wrong checksums.
> + *
> + * @param gro_tbl
> + * a pointer to a GRO table object.
> + * @param desired_gro_types
> + * GRO types whose packets will be flushed.
> + * @param out
> + * a pointer array that is used to keep flushed packets.
> + * @param max_nb_out
> + * the size of out.
> + * @return
> + * the number of flushed packets. If no packets are flushed, return 0.
> + */
> +uint16_t rte_gro_flush(struct rte_gro_tbl *gro_tbl,
> + uint64_t desired_gro_types,
> + struct rte_mbuf **out,
> + const uint16_t max_nb_out);
> +
> +/**
> + * This function flushes the timed-out packets from reassembly tables of
> + * desired GRO types. It won't re-calculate checksums for merged packets
> + * in the tables. That is, the returned packets may have wrong
> + * checksums.
> + *
> + * @param gro_tbl
> + * a pointer to a GRO table object.
> + * @param desired_gro_types
> + * rte_gro_timeout_flush only processes packets which belong to the
> + * GRO types specified by desired_gro_types.
> + * @param out
> + * a pointer array that is used to keep flushed packets.
> + * @param max_nb_out
> + * the size of out.
> + * @return
> + * the number of flushed packets. If no packets are flushed, return 0.
> + */
> +uint16_t rte_gro_timeout_flush(struct rte_gro_tbl *gro_tbl,
> + uint64_t desired_gro_types,
> + struct rte_mbuf **out,
> + const uint16_t max_nb_out);
> +#endif
> diff --git a/lib/librte_gro/rte_gro_version.map b/lib/librte_gro/rte_gro_version.map
> new file mode 100644
> index 0000000..827596b
> --- /dev/null
> +++ b/lib/librte_gro/rte_gro_version.map
> @@ -0,0 +1,12 @@
> +DPDK_17.08 {
> + global:
> +
> + rte_gro_tbl_create;
> + rte_gro_tbl_destroy;
> + rte_gro_reassemble_burst;
> + rte_gro_reassemble;
> + rte_gro_flush;
> + rte_gro_timeout_flush;
> +
> + local: *;
> +};
> diff --git a/mk/rte.app.mk b/mk/rte.app.mk
> index bcaf1b3..fc3776d 100644
> --- a/mk/rte.app.mk
> +++ b/mk/rte.app.mk
> @@ -98,6 +98,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_RING) += -lrte_ring
> _LDLIBS-$(CONFIG_RTE_LIBRTE_EAL) += -lrte_eal
> _LDLIBS-$(CONFIG_RTE_LIBRTE_CMDLINE) += -lrte_cmdline
> _LDLIBS-$(CONFIG_RTE_LIBRTE_REORDER) += -lrte_reorder
> +_LDLIBS-$(CONFIG_RTE_LIBRTE_GRO) += -lrte_gro
>
> ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
> _LDLIBS-$(CONFIG_RTE_LIBRTE_KNI) += -lrte_kni