[dpdk-dev,v10,1/3] lib: add Generic Receive Offload API framework

Message ID 1498907323-17563-2-git-send-email-jiayu.hu@intel.com (mailing list archive)
State Superseded, archived
Checks

Context               Check    Description
ci/checkpatch         success  coding style OK
ci/Intel-compilation  fail     Compilation issues

Commit Message

Hu, Jiayu July 1, 2017, 11:08 a.m. UTC
  Generic Receive Offload (GRO) is a widely used software-based offloading
technique that reduces per-packet processing overhead. It gains
performance by reassembling small packets into large ones. This
patchset adds GRO support to DPDK; this patch implements the GRO API
framework.

To give applications more flexibility, DPDK GRO is implemented as
a user library. Applications explicitly call the GRO library to merge
small packets into large ones. DPDK GRO provides two reassembly modes:
lightweight and heavyweight. Applications that want to merge packets
in a simple way, and that handle a relatively small number of packets,
can use the lightweight mode. Applications that need more fine-grained
control can choose the heavyweight mode.

rte_gro_reassemble_burst is the main reassembly API for lightweight
mode; it processes N packets at a time. Performing GRO in lightweight
mode is simple: applications just invoke rte_gro_reassemble_burst, and
the GROed packets are available as soon as it returns.
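
As a rough illustration of the lightweight idea (merge within one burst, hand everything back at once), here is a minimal sketch in plain C. It assumes a toy segment type instead of rte_mbuf; toy_seg and toy_reassemble_burst are hypothetical names, and this is not the implementation in this patchset, which has to parse headers and chain mbufs.

```c
#include <stdint.h>

/* Hypothetical stand-in for a TCP segment; not the DPDK rte_mbuf. */
struct toy_seg {
	uint32_t flow;	/* flow identifier */
	uint32_t seq;	/* sequence number of the first payload byte */
	uint32_t len;	/* payload length in bytes */
};

/*
 * Merge segments in place. A segment is appended to an earlier kept
 * segment of the same flow when it starts exactly where that one ends.
 * Returns the number of segments left at the front of segs[].
 */
static uint16_t
toy_reassemble_burst(struct toy_seg *segs, uint16_t nb)
{
	uint16_t i, j, kept = 0;

	for (i = 0; i < nb; i++) {
		for (j = 0; j < kept; j++) {
			if (segs[j].flow == segs[i].flow &&
			    segs[j].seq + segs[j].len == segs[i].seq) {
				segs[j].len += segs[i].len;
				break;
			}
		}
		if (j == kept)	/* no neighbour found: keep as its own packet */
			segs[kept++] = segs[i];
	}
	return kept;
}
```

The caller passes the burst it just received and continues with the first N entries of the same array, where N is the return value; nothing is held back between calls.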

rte_gro_reassemble is the main reassembly API for heavyweight mode; it
tries to merge N input packets with the packets in a given GRO table.
Performing GRO in heavyweight mode is more involved. Before performing
GRO, applications need to create a GRO table with rte_gro_tbl_create.
They can then call rte_gro_reassemble to merge packets. The GROed
packets stay in the GRO table; to retrieve them, applications need to
flush them explicitly with the flush API (rte_gro_timeout_flush).
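
The heavyweight flow (a table that persists across calls, plus an explicit timeout flush) can be sketched in the same toy style. All names here (toy_tbl, toy_item, toy_reassemble, toy_timeout_flush) and the fixed capacity are assumptions made for illustration; the sketch only models the control flow described above, not the real table layout.

```c
#include <stdint.h>

#define TOY_TBL_CAP 8	/* arbitrary capacity chosen for the sketch */

struct toy_item {
	uint32_t flow;
	uint32_t seq;
	uint32_t len;
	uint64_t start;	/* time at which the first segment was inserted */
};

struct toy_tbl {
	struct toy_item items[TOY_TBL_CAP];
	uint16_t nb_items;
	uint64_t max_timeout;	/* same time unit as the "now" arguments */
};

/* Merge or insert one segment. Returns 0 if absorbed, 1 if the table is full. */
static int
toy_reassemble(struct toy_tbl *t, uint32_t flow, uint32_t seq,
	       uint32_t len, uint64_t now)
{
	uint16_t i;

	for (i = 0; i < t->nb_items; i++) {
		struct toy_item *it = &t->items[i];

		if (it->flow == flow && it->seq + it->len == seq) {
			it->len += len;
			return 0;
		}
	}
	if (t->nb_items == TOY_TBL_CAP)
		return 1;
	t->items[t->nb_items++] = (struct toy_item){ flow, seq, len, now };
	return 0;
}

/* Move items that have timed out into out[]; returns how many were flushed. */
static uint16_t
toy_timeout_flush(struct toy_tbl *t, uint64_t now,
		  struct toy_item *out, uint16_t max_nb_out)
{
	uint16_t i = 0, n = 0;

	while (i < t->nb_items && n < max_nb_out) {
		if (now - t->items[i].start >= t->max_timeout) {
			out[n++] = t->items[i];
			/* fill the hole with the last item */
			t->items[i] = t->items[--t->nb_items];
		} else {
			i++;
		}
	}
	return n;
}
```

Unlike the burst variant, merged packets are visible to the application only after a flush, which is what gives it control over how long packets may wait for neighbours.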

Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
---
 config/common_base                 |   5 ++
 lib/Makefile                       |   2 +
 lib/librte_gro/Makefile            |  50 +++++++++++
 lib/librte_gro/rte_gro.c           | 176 +++++++++++++++++++++++++++++++++++++
 lib/librte_gro/rte_gro.h           | 176 +++++++++++++++++++++++++++++++++++++
 lib/librte_gro/rte_gro_version.map |  12 +++
 mk/rte.app.mk                      |   1 +
 7 files changed, 422 insertions(+)
 create mode 100644 lib/librte_gro/Makefile
 create mode 100644 lib/librte_gro/rte_gro.c
 create mode 100644 lib/librte_gro/rte_gro.h
 create mode 100644 lib/librte_gro/rte_gro_version.map
  

Comments

Jianfeng Tan July 2, 2017, 10:19 a.m. UTC | #1
On 7/1/2017 7:08 PM, Jiayu Hu wrote:
> Generic Receive Offload (GRO) is a widely used SW-based offloading
> technique to reduce per-packet processing overhead. It gains
> performance by reassembling small packets into large ones. This
> patchset is to support GRO in DPDK. To support GRO, this patch
> implements a GRO API framework.
>
> To enable more flexibility to applications, DPDK GRO is implemented as
> a user library. Applications explicitly use the GRO library to merge
> small packets into large ones. DPDK GRO provides two reassembly modes.
> One is called lightweight mode, the other is called heavyweight mode.
> If applications want to merge packets in a simple way and the number
> of packets is relatively small, they can use the lightweight mode.
> If applications need more fine-grained controls, they can choose the
> heavyweight mode.
>
> rte_gro_reassemble_burst is the main reassembly API which is used in
> lightweight mode and processes N packets at a time. For applications,
> performing GRO in lightweight mode is simple. They just need to invoke
> rte_gro_reassemble_burst. Applications can get GROed packets as soon as
> rte_gro_reassemble_burst returns.
>
> rte_gro_reassemble is the main reassembly API which is used in
> heavyweight mode and tries to merge N inputted packets with the packets
> in a givn GRO table. For applications, performing GRO in heavyweight
> mode is relatively complicated. Before performing GRO, applications need
> to create a GRO table by rte_gro_tbl_create. Then they can use
> rte_gro_reassemble to merge packets. The GROed packets are in the GRO
> table. If applications want to get them, applications need to manually
> flush them by flush API.

> Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
> ---
>   config/common_base                 |   5 ++
>   lib/Makefile                       |   2 +
>   lib/librte_gro/Makefile            |  50 +++++++++++
>   lib/librte_gro/rte_gro.c           | 176 +++++++++++++++++++++++++++++++++++++
>   lib/librte_gro/rte_gro.h           | 176 +++++++++++++++++++++++++++++++++++++
>   lib/librte_gro/rte_gro_version.map |  12 +++
>   mk/rte.app.mk                      |   1 +
>   7 files changed, 422 insertions(+)
>   create mode 100644 lib/librte_gro/Makefile
>   create mode 100644 lib/librte_gro/rte_gro.c
>   create mode 100644 lib/librte_gro/rte_gro.h
>   create mode 100644 lib/librte_gro/rte_gro_version.map
>
> diff --git a/config/common_base b/config/common_base
> index f6aafd1..167f5ef 100644
> --- a/config/common_base
> +++ b/config/common_base
> @@ -712,6 +712,11 @@ CONFIG_RTE_LIBRTE_VHOST_DEBUG=n
>   CONFIG_RTE_LIBRTE_PMD_VHOST=n
>   
>   #
> +# Compile GRO library
> +#
> +CONFIG_RTE_LIBRTE_GRO=y
> +
> +#
>   #Compile Xen domain0 support
>   #
>   CONFIG_RTE_LIBRTE_XEN_DOM0=n
> diff --git a/lib/Makefile b/lib/Makefile
> index 07e1fd0..ac1c2f6 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -106,6 +106,8 @@ DIRS-$(CONFIG_RTE_LIBRTE_REORDER) += librte_reorder
>   DEPDIRS-librte_reorder := librte_eal librte_mempool librte_mbuf
>   DIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += librte_pdump
>   DEPDIRS-librte_pdump := librte_eal librte_mempool librte_mbuf librte_ether
> +DIRS-$(CONFIG_RTE_LIBRTE_GRO) += librte_gro
> +DEPDIRS-librte_gro := librte_eal librte_mbuf librte_ether librte_net
>   
>   ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
>   DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni
> diff --git a/lib/librte_gro/Makefile b/lib/librte_gro/Makefile
> new file mode 100644
> index 0000000..7e0f128
> --- /dev/null
> +++ b/lib/librte_gro/Makefile
> @@ -0,0 +1,50 @@
> +#   BSD LICENSE
> +#
> +#   Copyright(c) 2017 Intel Corporation. All rights reserved.
> +#   All rights reserved.
> +#
> +#   Redistribution and use in source and binary forms, with or without
> +#   modification, are permitted provided that the following conditions
> +#   are met:
> +#
> +#     * Redistributions of source code must retain the above copyright
> +#       notice, this list of conditions and the following disclaimer.
> +#     * Redistributions in binary form must reproduce the above copyright
> +#       notice, this list of conditions and the following disclaimer in
> +#       the documentation and/or other materials provided with the
> +#       distribution.
> +#     * Neither the name of Intel Corporation nor the names of its
> +#       contributors may be used to endorse or promote products derived
> +#       from this software without specific prior written permission.
> +#
> +#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> +#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> +#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> +#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> +#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> +#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> +#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> +#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> +#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> +#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> +#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> +
> +include $(RTE_SDK)/mk/rte.vars.mk
> +
> +# library name
> +LIB = librte_gro.a
> +
> +CFLAGS += -O3
> +CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
> +
> +EXPORT_MAP := rte_gro_version.map
> +
> +LIBABIVER := 1
> +
> +# source files
> +SRCS-$(CONFIG_RTE_LIBRTE_GRO) += rte_gro.c
> +
> +# install this header file
> +SYMLINK-$(CONFIG_RTE_LIBRTE_GRO)-include += rte_gro.h
> +
> +include $(RTE_SDK)/mk/rte.lib.mk
> diff --git a/lib/librte_gro/rte_gro.c b/lib/librte_gro/rte_gro.c
> new file mode 100644
> index 0000000..648835b
> --- /dev/null
> +++ b/lib/librte_gro/rte_gro.c
> @@ -0,0 +1,176 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#include <rte_malloc.h>
> +#include <rte_mbuf.h>
> +
> +#include "rte_gro.h"
> +
> +typedef void *(*gro_tbl_create_fn)(uint16_t socket_id,
> +		uint16_t max_flow_num,
> +		uint16_t max_item_per_flow);
> +typedef void (*gro_tbl_destroy_fn)(void *tbl);
> +typedef uint32_t (*gro_tbl_item_num_fn)(void *tbl);
> +
> +static gro_tbl_create_fn tbl_create_functions[RTE_GRO_TYPE_MAX_NUM];
> +static gro_tbl_destroy_fn tbl_destroy_functions[RTE_GRO_TYPE_MAX_NUM];
> +static gro_tbl_item_num_fn tbl_item_num_functions[RTE_GRO_TYPE_MAX_NUM];
> +
> +/**
> + * GRO table, which is used to merge packets. It keeps many reassembly
> + * tables of desired GRO types. Applications need to create GRO tables
> + * before using rte_gro_reassemble to perform GRO.
> + */
> +struct gro_tbl {
> +	uint64_t desired_gro_types;	/**< GRO types to perform */
> +	/* max TTL measured in nanosecond */
> +	uint64_t max_timeout_cycles;
> +	/* max length of merged packet measured in byte */
> +	uint32_t max_packet_size;
> +	/* reassebly tables of desired GRO types */
> +	void *tbls[RTE_GRO_TYPE_MAX_NUM];
> +};
> +
> +void *rte_gro_tbl_create(const
> +		const struct rte_gro_param *param)

The name of this API and the definition of struct gro_tbl are
confusing: a GRO table contains GRO tables? I think a better name is
needed, for example, struct gro_ctl.
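
To make the distinction concrete, here is a rough standalone sketch of a control object that owns one reassembly table per enabled GRO type. It borrows the gro_ctl name suggested above, but every identifier (toy_gro_ctl, the registry arrays, the tcp4 stand-in) is hypothetical, and it uses calloc/free rather than the rte_malloc API so that it compiles on its own.

```c
#include <stdint.h>
#include <stdlib.h>

#define TOY_GRO_TYPE_MAX_NUM 64

typedef void *(*toy_tbl_create_fn)(void);
typedef void (*toy_tbl_destroy_fn)(void *tbl);

/* Hypothetical registry: slot i serves GRO type bit i; NULL means unsupported. */
static toy_tbl_create_fn toy_create_fns[TOY_GRO_TYPE_MAX_NUM];
static toy_tbl_destroy_fn toy_destroy_fns[TOY_GRO_TYPE_MAX_NUM];

/* Control object: not a table itself, it owns the per-type tables. */
struct toy_gro_ctl {
	uint64_t gro_types;
	void *tbls[TOY_GRO_TYPE_MAX_NUM];
};

static void
toy_gro_ctl_destroy(struct toy_gro_ctl *ctl)
{
	unsigned int i;

	if (ctl == NULL)
		return;
	for (i = 0; i < TOY_GRO_TYPE_MAX_NUM; i++) {
		if (ctl->tbls[i] != NULL && toy_destroy_fns[i] != NULL)
			toy_destroy_fns[i](ctl->tbls[i]);
	}
	free(ctl);
}

static struct toy_gro_ctl *
toy_gro_ctl_create(uint64_t gro_types)
{
	struct toy_gro_ctl *ctl = calloc(1, sizeof(*ctl));
	unsigned int i;

	if (ctl == NULL)
		return NULL;
	ctl->gro_types = gro_types;
	for (i = 0; i < TOY_GRO_TYPE_MAX_NUM; i++) {
		/* 64-bit constant, so testing bits 32..63 is well defined */
		if ((gro_types & (UINT64_C(1) << i)) == 0 ||
		    toy_create_fns[i] == NULL)
			continue;
		ctl->tbls[i] = toy_create_fns[i]();
		if (ctl->tbls[i] == NULL) {
			toy_gro_ctl_destroy(ctl);	/* frees tables made so far */
			return NULL;
		}
	}
	return ctl;
}

/* Stand-in per-type table: a single heap-allocated counter. */
static void *toy_tcp4_tbl_create(void) { return calloc(1, sizeof(uint32_t)); }
static void toy_tcp4_tbl_destroy(void *tbl) { free(tbl); }
```

With this split, "table" refers only to a per-type reassembly table, and the object handed to the application is the control structure that dispatches to them.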

> +{
> +	gro_tbl_create_fn create_tbl_fn;
> +	gro_tbl_destroy_fn destroy_tbl_fn;
> +	struct gro_tbl *gro_tbl;
> +	uint64_t gro_type_flag = 0;
> +	uint8_t i, j;
> +
> +	gro_tbl = rte_zmalloc_socket(__func__,
> +			sizeof(struct gro_tbl),
> +			RTE_CACHE_LINE_SIZE,
> +			param->socket_id);
> +	if (gro_tbl == NULL)
> +		return NULL;
> +	gro_tbl->max_packet_size = param->max_packet_size;
> +	gro_tbl->max_timeout_cycles = param->max_timeout_cycles;
> +	gro_tbl->desired_gro_types = param->desired_gro_types;
> +
> +	for (i = 0; i < RTE_GRO_TYPE_MAX_NUM; i++) {
> +		gro_type_flag = 1 << i;
> +
> +		if ((param->desired_gro_types & gro_type_flag) == 0)
> +			continue;
> +		create_tbl_fn = tbl_create_functions[i];
> +		if (create_tbl_fn == NULL)
> +			continue;
> +
> +		gro_tbl->tbls[i] = create_tbl_fn(
> +				param->socket_id,
> +				param->max_flow_num,
> +				param->max_item_per_flow);

Here and elsewhere: the alignment does not seem correct.
         gro_tbl->tbls[i] = create_tbl_fn(param->socket_id,
                                          /* keep all parameters aligned like this */
                                          param->max_flow_num,
                                          param->max_item_per_flow);
> +		if (gro_tbl->tbls[i] == NULL) {
> +			/* destroy all allocated tables */
> +			for (j = 0; j < i; j++) {
> +				gro_type_flag = 1 << j;
> +				if ((param->desired_gro_types & gro_type_flag) == 0)
> +					continue;
> +				destroy_tbl_fn = tbl_destroy_functions[j];
> +				if (destroy_tbl_fn)
> +					destroy_tbl_fn(gro_tbl->tbls[j]);
> +			}
> +			rte_free(gro_tbl);
> +			return NULL;
> +		}
> +	}
> +	return gro_tbl;
> +}
> +
> +void rte_gro_tbl_destroy(void *tbl)
> +{
> +	gro_tbl_destroy_fn destroy_tbl_fn;
> +	struct gro_tbl *gro_tbl = (struct gro_tbl *)tbl;
> +	uint64_t gro_type_flag;
> +	uint8_t i;
> +
> +	if (gro_tbl == NULL)
> +		return;
> +	for (i = 0; i < RTE_GRO_TYPE_MAX_NUM; i++) {
> +		gro_type_flag = 1 << i;
> +		if ((gro_tbl->desired_gro_types & gro_type_flag) == 0)
> +			continue;
> +		destroy_tbl_fn = tbl_destroy_functions[i];
> +		if (destroy_tbl_fn)
> +			destroy_tbl_fn(gro_tbl->tbls[i]);
> +	}
> +	rte_free(gro_tbl);
> +}
> +
> +uint16_t
> +rte_gro_reassemble_burst(struct rte_mbuf **pkts __rte_unused,
> +		uint16_t nb_pkts,
> +		const struct rte_gro_param *param __rte_unused)
> +{
> +	return nb_pkts;
> +}
> +
> +uint16_t
> +rte_gro_reassemble(struct rte_mbuf **pkts __rte_unused,
> +		uint16_t nb_pkts,
> +		void *tbl __rte_unused)
> +{
> +	return nb_pkts;
> +}
> +
> +uint16_t
> +rte_gro_timeout_flush(void *tbl __rte_unused,
> +		uint64_t desired_gro_types __rte_unused,
> +		struct rte_mbuf **out __rte_unused,
> +		uint16_t max_nb_out __rte_unused)
> +{
> +	return 0;
> +}
> +
> +uint64_t rte_gro_tbl_item_num(void *tbl)

Does rte_gro_get_count() sound better?

> +{
> +	struct gro_tbl *gro_tbl = (struct gro_tbl *)tbl;
> +	gro_tbl_item_num_fn item_num_fn;
> +	uint64_t item_num = 0;
> +	uint64_t gro_type_flag;
> +	uint8_t i;
> +
> +	for (i = 0; i < RTE_GRO_TYPE_MAX_NUM; i++) {
> +		gro_type_flag = 1 << i;
> +		if ((gro_tbl->desired_gro_types & gro_type_flag) == 0)
> +			continue;
> +
> +		item_num_fn = tbl_item_num_functions[i];
> +		if (item_num_fn == NULL)
> +			continue;
> +		item_num += item_num_fn(gro_tbl->tbls[i]);
> +	}
> +	return item_num;
> +}
> diff --git a/lib/librte_gro/rte_gro.h b/lib/librte_gro/rte_gro.h
> new file mode 100644
> index 0000000..02c9113
> --- /dev/null
> +++ b/lib/librte_gro/rte_gro.h
> @@ -0,0 +1,176 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#ifndef _RTE_GRO_H_
> +#define _RTE_GRO_H_
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/**
> + * the max packets number that rte_gro_reassemble_burst can
> + * process in each invocation.
> + */
> +#define RTE_GRO_MAX_BURST_ITEM_NUM 128UL
> +
> +/* max number of supported GRO types */
> +#define RTE_GRO_TYPE_MAX_NUM 64
> +#define RTE_GRO_TYPE_SUPPORT_NUM 0	/**< current supported GRO num */
> +
> +
> +struct rte_gro_param {
> +	uint64_t desired_gro_types;	/**< desired GRO types */

Make it gro_types for simplicity.

> +	uint32_t max_packet_size;	/**< max length of merged packets */

Referring to the tcp4 GRO implementation, this is the max size of the
TCP payload. But in principle, the 65535-byte limitation (including the
TCP header) exists because the length field in the IP header is 2 bytes long.

What does it mean for other GRO engines then? I think this should be
decided by each GRO engine, and applications should not have to change it.
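
For reference, the IPv4 total length field is 16 bits wide and counts the IP header plus everything after it, so a merged IPv4 packet cannot exceed 65535 bytes. A small sketch of the check an engine could apply before merging; the function name and parameters are hypothetical and not taken from the patch.

```c
#include <stdint.h>

/* Largest value the 16-bit IPv4 total length field can carry. */
#define TOY_IPV4_MAX_TOTAL_LEN UINT16_MAX

/*
 * Returns 1 if a packet carrying both payloads behind one IP header and
 * one L4 header still fits in the IPv4 total length field, 0 otherwise.
 */
static int
toy_merge_fits_ipv4(uint16_t ip_hdr_len, uint16_t l4_hdr_len,
		    uint32_t payload_a, uint32_t payload_b)
{
	uint64_t total = (uint64_t)ip_hdr_len + l4_hdr_len +
			 payload_a + payload_b;

	return total <= TOY_IPV4_MAX_TOTAL_LEN;
}
```

For example, with 20-byte IP and TCP headers, two 32768-byte payloads add up to 65576 bytes and do not fit, while 32000 plus 33000 bytes of payload (65040 bytes in total) does.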

> +	uint16_t max_flow_num;	/**< max flow number */
> +	uint16_t max_item_per_flow;	/**< max packet number per flow */
> +
> +	/* socket index where the Ethernet port connects to */

The comment needs to be refined. The port and the PMD thread can have
different socket indexes. We should just explain how this field is used:
"socket index for allocating GRO related data structures".

> +	uint16_t socket_id;
> +	/* max TTL for a packet in the GRO table, measured in nanosecond */
> +	uint64_t max_timeout_cycles;

We don't need to set it in lightweight mode. Please add this into the 
comment.

> +};
> +
> +/**
> + * This function create a GRO table, which is used to merge packets in
> + * rte_gro_reassemble.
> + *
> + * @param param
> + *  applications use it to pass needed parameters to create a GRO table.
> + * @return
> + *  if create successfully, return a pointer which points to the GRO
> + *  table. Otherwise, return NULL.
> + */
> +void *rte_gro_tbl_create(
> +		const struct rte_gro_param *param);

Merge above two lines into one.

> +/**
> + * This function destroys a GRO table.
> + */
> +void rte_gro_tbl_destroy(void *tbl);
> +
> +/**
> + * This is one of the main reassembly APIs, which merges numbers of
> + * packets at a time. It assumes that all inputted packets are with
> + * correct checksums. That is, applications should guarantee all
> + * inputted packets are correct. Besides, it doesn't re-calculate
> + * checksums for merged packets. If inputted packets are IP fragmented,
> + * this function assumes them are complete (i.e. with L4 header). After
> + * finishing processing, it returns all GROed packets to applications
> + * immediately.
> + *
> + * @param pkts
> + *  a pointer array which points to the packets to reassemble. Besides,
> + *  it keeps packet addresses for GROed packets.
> + * @param nb_pkts
> + *  the number of packets to reassemble.
> + * @param param
> + *  applications use it to tell rte_gro_reassemble_burst what rules
> + *  are demanded.
> + * @return
> + *  the number of packets after been GROed.
> + */
> +uint16_t rte_gro_reassemble_burst(struct rte_mbuf **pkts,
> +		uint16_t nb_pkts,
> +		const struct rte_gro_param *param);

Fix the alignment.

> +
> +/**
> + * Reassembly function, which tries to merge inputted packets with
> + * the packets in a given GRO table. This function assumes all inputted
> + * packet is with correct checksums. And it won't update checksums if
> + * two packets are merged. Besides, if inputted packets are IP
> + * fragmented, this function assumes they are complete packets (i.e.
> + * with L4 header).
> + *
> + * If the inputted packets don't have data or are with unsupported GRO
> + * types, they won't be processed and are returned to applications.
> + * Otherwise, the inputted packets are either merged or inserted into
> + * the table. If applications want get packets in the table, they need
> + * to call flush API.
> + *
> + * @param pkts
> + *  packet to reassemble. Besides, after this function finishes, it
> + *  keeps the unprocessed packets (i.e. without data or unsupported
> + *  GRO types).
> + * @param nb_pkts
> + *  the number of packets to reassemble.
> + * @param tbl
> + *  a pointer points to a GRO table.
> + * @return
> + *  return the number of unprocessed packets (i.e. without data or
> + *  unsupported GRO types). If all packets are processed (merged or
> + *  inserted into the table), return 0.
> + */
> +uint16_t rte_gro_reassemble(struct rte_mbuf **pkts,
> +		uint16_t nb_pkts,
> +		void *tbl);
> +
> +/**
> + * This function flushes the timeout packets from reassembly tables of
> + * desired GRO types. The max number of flushed timeout packets is the
> + * element number of the array which is used to keep the flushed packets.
> + *
> + * Besides, this function won't re-calculate checksums for merged
> + * packets in the tables. That is, the returned packets may be with
> + * wrong checksums.
> + *
> + * @param tbl
> + *  a pointer points to a GRO table object.
> + * @param desired_gro_types
> + * rte_gro_timeout_flush only processes packets which belong to the
> + * GRO types specified by desired_gro_types.
> + * @param out
> + *  a pointer array that is used to keep flushed timeout packets.
> + * @param nb_out
> + *  the element number of out. It's also the max number of timeout
> + *  packets that can be flushed finally.
> + * @return
> + *  the number of flushed packets. If no packets are flushed, return 0.
> + */
> +uint16_t rte_gro_timeout_flush(void *tbl,
> +		uint64_t desired_gro_types,
> +		struct rte_mbuf **out,
> +		uint16_t max_nb_out);
> +
> +/**
> + * This function returns the number of packets in a given GRO table.
> + * @param tbl
> + *  pointer points to a GRO table.
> + * @return
> + *  the number of packets in the table.
> + */
> +uint64_t rte_gro_tbl_item_num(void *tbl);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif
> diff --git a/lib/librte_gro/rte_gro_version.map b/lib/librte_gro/rte_gro_version.map
> new file mode 100644
> index 0000000..358fb9d
> --- /dev/null
> +++ b/lib/librte_gro/rte_gro_version.map
> @@ -0,0 +1,12 @@
> +DPDK_17.08 {
> +	global:
> +
> +	rte_gro_tbl_create;
> +	rte_gro_tbl_destroy;

As stated earlier, here are the API names I suggested: 
rte_gro_ctl_create()/rte_gro_ctl_destroy()/rte_gro_get_count().

> +	rte_gro_reassemble_burst;
> +	rte_gro_reassemble;
> +	rte_gro_timeout_flush;
> +	rte_gro_tbl_item_num;
> +
> +	local: *;
> +};
> diff --git a/mk/rte.app.mk b/mk/rte.app.mk
> index bcaf1b3..fc3776d 100644
> --- a/mk/rte.app.mk
> +++ b/mk/rte.app.mk
> @@ -98,6 +98,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_RING)           += -lrte_ring
>   _LDLIBS-$(CONFIG_RTE_LIBRTE_EAL)            += -lrte_eal
>   _LDLIBS-$(CONFIG_RTE_LIBRTE_CMDLINE)        += -lrte_cmdline
>   _LDLIBS-$(CONFIG_RTE_LIBRTE_REORDER)        += -lrte_reorder
> +_LDLIBS-$(CONFIG_RTE_LIBRTE_GRO)        	+= -lrte_gro
>   
>   ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
>   _LDLIBS-$(CONFIG_RTE_LIBRTE_KNI)            += -lrte_kni
  
Hu, Jiayu July 3, 2017, 5:56 a.m. UTC | #2
Hi Jianfeng,

> -----Original Message-----
> From: Tan, Jianfeng
> Sent: Sunday, July 2, 2017 6:20 PM
> To: Hu, Jiayu <jiayu.hu@intel.com>; dev@dpdk.org
> Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>;
> stephen@networkplumber.org; yliu@fridaylinux.org; Wu, Jingjing
> <jingjing.wu@intel.com>; Yao, Lei A <lei.a.yao@intel.com>; Bie, Tiwei
> <tiwei.bie@intel.com>
> Subject: Re: [PATCH v10 1/3] lib: add Generic Receive Offload API framework
> 
> 
> 
> On 7/1/2017 7:08 PM, Jiayu Hu wrote:
> > Generic Receive Offload (GRO) is a widely used SW-based offloading
> > technique to reduce per-packet processing overhead. It gains
> > performance by reassembling small packets into large ones. This
> > patchset is to support GRO in DPDK. To support GRO, this patch
> > implements a GRO API framework.
> >
> > To enable more flexibility to applications, DPDK GRO is implemented as
> > a user library. Applications explicitly use the GRO library to merge
> > small packets into large ones. DPDK GRO provides two reassembly modes.
> > One is called lightweight mode, the other is called heavyweight mode.
> > If applications want to merge packets in a simple way and the number
> > of packets is relatively small, they can use the lightweight mode.
> > If applications need more fine-grained controls, they can choose the
> > heavyweight mode.
> >
> > rte_gro_reassemble_burst is the main reassembly API which is used in
> > lightweight mode and processes N packets at a time. For applications,
> > performing GRO in lightweight mode is simple. They just need to invoke
> > rte_gro_reassemble_burst. Applications can get GROed packets as soon as
> > rte_gro_reassemble_burst returns.
> >
> > rte_gro_reassemble is the main reassembly API which is used in
> > heavyweight mode and tries to merge N inputted packets with the packets
> > in a givn GRO table. For applications, performing GRO in heavyweight
> > mode is relatively complicated. Before performing GRO, applications need
> > to create a GRO table by rte_gro_tbl_create. Then they can use
> > rte_gro_reassemble to merge packets. The GROed packets are in the GRO
> > table. If applications want to get them, applications need to manually
> > flush them by flush API.
> 
> > Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
> > ---
> >   config/common_base                 |   5 ++
> >   lib/Makefile                       |   2 +
> >   lib/librte_gro/Makefile            |  50 +++++++++++
> >   lib/librte_gro/rte_gro.c           | 176 +++++++++++++++++++++++++++++++++++++
> >   lib/librte_gro/rte_gro.h           | 176 +++++++++++++++++++++++++++++++++++++
> >   lib/librte_gro/rte_gro_version.map |  12 +++
> >   mk/rte.app.mk                      |   1 +
> >   7 files changed, 422 insertions(+)
> >   create mode 100644 lib/librte_gro/Makefile
> >   create mode 100644 lib/librte_gro/rte_gro.c
> >   create mode 100644 lib/librte_gro/rte_gro.h
> >   create mode 100644 lib/librte_gro/rte_gro_version.map
> >
> > diff --git a/config/common_base b/config/common_base
> > index f6aafd1..167f5ef 100644
> > --- a/config/common_base
> > +++ b/config/common_base
> > @@ -712,6 +712,11 @@ CONFIG_RTE_LIBRTE_VHOST_DEBUG=n
> >   CONFIG_RTE_LIBRTE_PMD_VHOST=n
> >
> >   #
> > +# Compile GRO library
> > +#
> > +CONFIG_RTE_LIBRTE_GRO=y
> > +
> > +#
> >   #Compile Xen domain0 support
> >   #
> >   CONFIG_RTE_LIBRTE_XEN_DOM0=n
> > diff --git a/lib/Makefile b/lib/Makefile
> > index 07e1fd0..ac1c2f6 100644
> > --- a/lib/Makefile
> > +++ b/lib/Makefile
> > @@ -106,6 +106,8 @@ DIRS-$(CONFIG_RTE_LIBRTE_REORDER) += librte_reorder
> >   DEPDIRS-librte_reorder := librte_eal librte_mempool librte_mbuf
> >   DIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += librte_pdump
> >   DEPDIRS-librte_pdump := librte_eal librte_mempool librte_mbuf librte_ether
> > +DIRS-$(CONFIG_RTE_LIBRTE_GRO) += librte_gro
> > +DEPDIRS-librte_gro := librte_eal librte_mbuf librte_ether librte_net
> >
> >   ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
> >   DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni
> > diff --git a/lib/librte_gro/Makefile b/lib/librte_gro/Makefile
> > new file mode 100644
> > index 0000000..7e0f128
> > --- /dev/null
> > +++ b/lib/librte_gro/Makefile
> > @@ -0,0 +1,50 @@
> > +#   BSD LICENSE
> > +#
> > +#   Copyright(c) 2017 Intel Corporation. All rights reserved.
> > +#   All rights reserved.
> > +#
> > +#   Redistribution and use in source and binary forms, with or without
> > +#   modification, are permitted provided that the following conditions
> > +#   are met:
> > +#
> > +#     * Redistributions of source code must retain the above copyright
> > +#       notice, this list of conditions and the following disclaimer.
> > +#     * Redistributions in binary form must reproduce the above copyright
> > +#       notice, this list of conditions and the following disclaimer in
> > +#       the documentation and/or other materials provided with the
> > +#       distribution.
> > +#     * Neither the name of Intel Corporation nor the names of its
> > +#       contributors may be used to endorse or promote products derived
> > +#       from this software without specific prior written permission.
> > +#
> > +#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> > +#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> > +#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> > +#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> > +#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> > +#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> > +#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> > +#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> > +#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> > +#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> > +#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> > +
> > +include $(RTE_SDK)/mk/rte.vars.mk
> > +
> > +# library name
> > +LIB = librte_gro.a
> > +
> > +CFLAGS += -O3
> > +CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
> > +
> > +EXPORT_MAP := rte_gro_version.map
> > +
> > +LIBABIVER := 1
> > +
> > +# source files
> > +SRCS-$(CONFIG_RTE_LIBRTE_GRO) += rte_gro.c
> > +
> > +# install this header file
> > +SYMLINK-$(CONFIG_RTE_LIBRTE_GRO)-include += rte_gro.h
> > +
> > +include $(RTE_SDK)/mk/rte.lib.mk
> > diff --git a/lib/librte_gro/rte_gro.c b/lib/librte_gro/rte_gro.c
> > new file mode 100644
> > index 0000000..648835b
> > --- /dev/null
> > +++ b/lib/librte_gro/rte_gro.c
> > @@ -0,0 +1,176 @@
> > +/*-
> > + *   BSD LICENSE
> > + *
> > + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> > + *
> > + *   Redistribution and use in source and binary forms, with or without
> > + *   modification, are permitted provided that the following conditions
> > + *   are met:
> > + *
> > + *     * Redistributions of source code must retain the above copyright
> > + *       notice, this list of conditions and the following disclaimer.
> > + *     * Redistributions in binary form must reproduce the above copyright
> > + *       notice, this list of conditions and the following disclaimer in
> > + *       the documentation and/or other materials provided with the
> > + *       distribution.
> > + *     * Neither the name of Intel Corporation nor the names of its
> > + *       contributors may be used to endorse or promote products derived
> > + *       from this software without specific prior written permission.
> > + *
> > + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> > + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> > + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> > + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> > + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> > + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> > + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> > + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> > + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> > + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> > + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> > + */
> > +
> > +#include <rte_malloc.h>
> > +#include <rte_mbuf.h>
> > +
> > +#include "rte_gro.h"
> > +
> > +typedef void *(*gro_tbl_create_fn)(uint16_t socket_id,
> > +		uint16_t max_flow_num,
> > +		uint16_t max_item_per_flow);
> > +typedef void (*gro_tbl_destroy_fn)(void *tbl);
> > +typedef uint32_t (*gro_tbl_item_num_fn)(void *tbl);
> > +
> > +static gro_tbl_create_fn tbl_create_functions[RTE_GRO_TYPE_MAX_NUM];
> > +static gro_tbl_destroy_fn tbl_destroy_functions[RTE_GRO_TYPE_MAX_NUM];
> > +static gro_tbl_item_num_fn tbl_item_num_functions[RTE_GRO_TYPE_MAX_NUM];
> > +
> > +/**
> > + * GRO table, which is used to merge packets. It keeps many reassembly
> > + * tables of desired GRO types. Applications need to create GRO tables
> > + * before using rte_gro_reassemble to perform GRO.
> > + */
> > +struct gro_tbl {
> > +	uint64_t desired_gro_types;	/**< GRO types to perform */
> > +	/* max TTL measured in nanosecond */
> > +	uint64_t max_timeout_cycles;
> > +	/* max length of merged packet measured in byte */
> > +	uint32_t max_packet_size;
> > +	/* reassebly tables of desired GRO types */
> > +	void *tbls[RTE_GRO_TYPE_MAX_NUM];
> > +};
> > +
> > +void *rte_gro_tbl_create(const
> > +		const struct rte_gro_param *param)
> 
> The name of this API and the definition of struct gro_tbl involve some
> confusion. A gro table contains gro tables? I suppose a better name is
> needed, for example, struct gro_ctl.

Actually, a GRO table includes N reassembly tables. But gro_tbl is not a good
name. I will change the name. Thanks.

> 
> > +{
> > +	gro_tbl_create_fn create_tbl_fn;
> > +	gro_tbl_destroy_fn destroy_tbl_fn;
> > +	struct gro_tbl *gro_tbl;
> > +	uint64_t gro_type_flag = 0;
> > +	uint8_t i, j;
> > +
> > +	gro_tbl = rte_zmalloc_socket(__func__,
> > +			sizeof(struct gro_tbl),
> > +			RTE_CACHE_LINE_SIZE,
> > +			param->socket_id);
> > +	if (gro_tbl == NULL)
> > +		return NULL;
> > +	gro_tbl->max_packet_size = param->max_packet_size;
> > +	gro_tbl->max_timeout_cycles = param->max_timeout_cycles;
> > +	gro_tbl->desired_gro_types = param->desired_gro_types;
> > +
> > +	for (i = 0; i < RTE_GRO_TYPE_MAX_NUM; i++) {
> > +		gro_type_flag = 1 << i;
> > +
> > +		if ((param->desired_gro_types & gro_type_flag) == 0)
> > +			continue;
> > +		create_tbl_fn = tbl_create_functions[i];
> > +		if (create_tbl_fn == NULL)
> > +			continue;
> > +
> > +		gro_tbl->tbls[i] = create_tbl_fn(
> > +				param->socket_id,
> > +				param->max_flow_num,
> > +				param->max_item_per_flow);
> 
> Here and elsewhere: the alignment does not look correct.
>          /* keep all parameters aligned like this */
>          gro_tbl->tbls[i] = create_tbl_fn(param->socket_id,
>                                           param->max_flow_num,
>                                           param->max_item_per_flow);

Thanks, I will modify it.

> > +		if (gro_tbl->tbls[i] == NULL) {
> > +			/* destroy all allocated tables */
> > +			for (j = 0; j < i; j++) {
> > +				gro_type_flag = 1 << j;
> > +				if ((param->desired_gro_types & gro_type_flag) == 0)
> > +					continue;
> > +				destroy_tbl_fn = tbl_destroy_functions[j];
> > +				if (destroy_tbl_fn)
> > +					destroy_tbl_fn(gro_tbl->tbls[j]);
> > +			}
> > +			rte_free(gro_tbl);
> > +			return NULL;
> > +		}
> > +	}
> > +	return gro_tbl;
> > +}
> > +
> > +void rte_gro_tbl_destroy(void *tbl)
> > +{
> > +	gro_tbl_destroy_fn destroy_tbl_fn;
> > +	struct gro_tbl *gro_tbl = (struct gro_tbl *)tbl;
> > +	uint64_t gro_type_flag;
> > +	uint8_t i;
> > +
> > +	if (gro_tbl == NULL)
> > +		return;
> > +	for (i = 0; i < RTE_GRO_TYPE_MAX_NUM; i++) {
> > +		gro_type_flag = 1 << i;
> > +		if ((gro_tbl->desired_gro_types & gro_type_flag) == 0)
> > +			continue;
> > +		destroy_tbl_fn = tbl_destroy_functions[i];
> > +		if (destroy_tbl_fn)
> > +			destroy_tbl_fn(gro_tbl->tbls[i]);
> > +	}
> > +	rte_free(gro_tbl);
> > +}
> > +
> > +uint16_t
> > +rte_gro_reassemble_burst(struct rte_mbuf **pkts __rte_unused,
> > +		uint16_t nb_pkts,
> > +		const struct rte_gro_param *param __rte_unused)
> > +{
> > +	return nb_pkts;
> > +}
> > +
> > +uint16_t
> > +rte_gro_reassemble(struct rte_mbuf **pkts __rte_unused,
> > +		uint16_t nb_pkts,
> > +		void *tbl __rte_unused)
> > +{
> > +	return nb_pkts;
> > +}
> > +
> > +uint16_t
> > +rte_gro_timeout_flush(void *tbl __rte_unused,
> > +		uint64_t desired_gro_types __rte_unused,
> > +		struct rte_mbuf **out __rte_unused,
> > +		uint16_t max_nb_out __rte_unused)
> > +{
> > +	return 0;
> > +}
> > +
> > +uint64_t rte_gro_tbl_item_num(void *tbl)
> 
> Does rte_gro_get_count() sound better?

OK, I will change the name.

> 
> > +{
> > +	struct gro_tbl *gro_tbl = (struct gro_tbl *)tbl;
> > +	gro_tbl_item_num_fn item_num_fn;
> > +	uint64_t item_num = 0;
> > +	uint64_t gro_type_flag;
> > +	uint8_t i;
> > +
> > +	for (i = 0; i < RTE_GRO_TYPE_MAX_NUM; i++) {
> > +		gro_type_flag = 1 << i;
> > +		if ((gro_tbl->desired_gro_types & gro_type_flag) == 0)
> > +			continue;
> > +
> > +		item_num_fn = tbl_item_num_functions[i];
> > +		if (item_num_fn == NULL)
> > +			continue;
> > +		item_num += item_num_fn(gro_tbl->tbls[i]);
> > +	}
> > +	return item_num;
> > +}
> > diff --git a/lib/librte_gro/rte_gro.h b/lib/librte_gro/rte_gro.h
> > new file mode 100644
> > index 0000000..02c9113
> > --- /dev/null
> > +++ b/lib/librte_gro/rte_gro.h
> > @@ -0,0 +1,176 @@
> > +/*-
> > + *   BSD LICENSE
> > + *
> > + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> > + *
> > + *   Redistribution and use in source and binary forms, with or without
> > + *   modification, are permitted provided that the following conditions
> > + *   are met:
> > + *
> > + *     * Redistributions of source code must retain the above copyright
> > + *       notice, this list of conditions and the following disclaimer.
> > + *     * Redistributions in binary form must reproduce the above copyright
> > + *       notice, this list of conditions and the following disclaimer in
> > + *       the documentation and/or other materials provided with the
> > + *       distribution.
> > + *     * Neither the name of Intel Corporation nor the names of its
> > + *       contributors may be used to endorse or promote products derived
> > + *       from this software without specific prior written permission.
> > + *
> > + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> > + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> > + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> > + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> > + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> > + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> > + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> > + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> > + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> > + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> > + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> > + */
> > +
> > +#ifndef _RTE_GRO_H_
> > +#define _RTE_GRO_H_
> > +
> > +#ifdef __cplusplus
> > +extern "C" {
> > +#endif
> > +
> > +/**
> > + * the max packets number that rte_gro_reassemble_burst can
> > + * process in each invocation.
> > + */
> > +#define RTE_GRO_MAX_BURST_ITEM_NUM 128UL
> > +
> > +/* max number of supported GRO types */
> > +#define RTE_GRO_TYPE_MAX_NUM 64
> > +#define RTE_GRO_TYPE_SUPPORT_NUM 0	/**< current supported GRO num */
> > +
> > +
> > +struct rte_gro_param {
> > +	uint64_t desired_gro_types;	/**< desired GRO types */
> 
> Make it gro_types for simplicity.

Thanks, I will change it.

> 
> > +	uint32_t max_packet_size;	/**< max length of merged packets */
> 
> Referring to the tcp4 GRO implementation, this is the max size of the TCP
> payload. But in principle, the 65535-byte limit (including the TCP
> header) exists because the IP header's length field is 2 bytes long.

Yes, it's a bug. The length of (IP header + TCP header + payload)
should be less than 64KB. But max_packet_size is uint32_t.

> 
> What does it mean for other GRO engines then? I think these should be
> decided by each GRO engine. And applications don't have to change them.

Makes sense. We shouldn't expose it to applications. I will remove this parameter.
As for the above bug, I will fix it by limiting the max length of a TCP/IPv4 packet
to (l2_header + 64KB).
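
For illustration only, the length limit described above could be checked as in the sketch below. The helper name can_merge_tcp4 and its parameters are hypothetical and not part of this patch; the sketch assumes the limit is the 16-bit IPv4 total length (65535 bytes) on top of the L2 header, and that pkt_len is never smaller than l2_len.

```c
#include <assert.h>
#include <stdint.h>

/* IPv4 total length is a 16-bit field: IP header + TCP header + payload
 * cannot exceed 65535 bytes. */
#define IPV4_MAX_TOTAL_LEN UINT16_MAX

/*
 * Hypothetical helper (not part of the patch): return 1 if payload_len more
 * bytes can be appended to a packet of pkt_len bytes (L2 header included)
 * without the L3 part growing beyond what the IPv4 length field can hold.
 * Assumes pkt_len >= l2_len.
 */
static int
can_merge_tcp4(uint32_t pkt_len, uint16_t l2_len, uint16_t payload_len)
{
	return pkt_len - l2_len + payload_len <= IPV4_MAX_TOTAL_LEN;
}
```

With a check like this, the limit stays inside the TCP/IPv4 engine and the application no longer needs to pass max_packet_size.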

> 
> > +	uint16_t max_flow_num;	/**< max flow number */
> > +	uint16_t max_item_per_flow;	/**< max packet number per flow */
> > +
> > +	/* socket index where the Ethernet port connects to */
> 
> The comment needs to be refined. We have different socket idx for port,
> for pmd thread. We just explain how this will be used: "socket index for
> allocating gro related data structure".

Thanks, I will change it.

> 
> > +	uint16_t socket_id;
> > +	/* max TTL for a packet in the GRO table, measured in nanosecond */
> > +	uint64_t max_timeout_cycles;
> 
> We don't need to set it in lightweight mode. Please add this into the
> comment.

Thanks, I will add this comment.

> 
> > +};
> > +
> > +/**
> > + * This function create a GRO table, which is used to merge packets in
> > + * rte_gro_reassemble.
> > + *
> > + * @param param
> > + *  applications use it to pass needed parameters to create a GRO table.
> > + * @return
> > + *  if create successfully, return a pointer which points to the GRO
> > + *  table. Otherwise, return NULL.
> > + */
> > +void *rte_gro_tbl_create(
> > +		const struct rte_gro_param *param);
> 
> Merge above two lines into one.

Thanks, I will modify it.

> 
> > +/**
> > + * This function destroys a GRO table.
> > + */
> > +void rte_gro_tbl_destroy(void *tbl);
> > +
> > +/**
> > + * This is one of the main reassembly APIs, which merges numbers of
> > + * packets at a time. It assumes that all inputted packets are with
> > + * correct checksums. That is, applications should guarantee all
> > + * inputted packets are correct. Besides, it doesn't re-calculate
> > + * checksums for merged packets. If inputted packets are IP fragmented,
> > + * this function assumes them are complete (i.e. with L4 header). After
> > + * finishing processing, it returns all GROed packets to applications
> > + * immediately.
> > + *
> > + * @param pkts
> > + *  a pointer array which points to the packets to reassemble. Besides,
> > + *  it keeps packet addresses for GROed packets.
> > + * @param nb_pkts
> > + *  the number of packets to reassemble.
> > + * @param param
> > + *  applications use it to tell rte_gro_reassemble_burst what rules
> > + *  are demanded.
> > + * @return
> > + *  the number of packets after been GROed.
> > + */
> > +uint16_t rte_gro_reassemble_burst(struct rte_mbuf **pkts,
> > +		uint16_t nb_pkts,
> > +		const struct rte_gro_param *param);
> 
> Fix the alignment.
> 
> > +
> > +/**
> > + * Reassembly function, which tries to merge inputted packets with
> > + * the packets in a given GRO table. This function assumes all inputted
> > + * packet is with correct checksums. And it won't update checksums if
> > + * two packets are merged. Besides, if inputted packets are IP
> > + * fragmented, this function assumes they are complete packets (i.e.
> > + * with L4 header).
> > + *
> > + * If the inputted packets don't have data or are with unsupported GRO
> > + * types, they won't be processed and are returned to applications.
> > + * Otherwise, the inputted packets are either merged or inserted into
> > + * the table. If applications want get packets in the table, they need
> > + * to call flush API.
> > + *
> > + * @param pkts
> > + *  packet to reassemble. Besides, after this function finishes, it
> > + *  keeps the unprocessed packets (i.e. without data or unsupported
> > + *  GRO types).
> > + * @param nb_pkts
> > + *  the number of packets to reassemble.
> > + * @param tbl
> > + *  a pointer points to a GRO table.
> > + * @return
> > + *  return the number of unprocessed packets (i.e. without data or
> > + *  unsupported GRO types). If all packets are processed (merged or
> > + *  inserted into the table), return 0.
> > + */
> > +uint16_t rte_gro_reassemble(struct rte_mbuf **pkts,
> > +		uint16_t nb_pkts,
> > +		void *tbl);
> > +
> > +/**
> > + * This function flushes the timeout packets from reassembly tables of
> > + * desired GRO types. The max number of flushed timeout packets is the
> > + * element number of the array which is used to keep the flushed packets.
> > + *
> > + * Besides, this function won't re-calculate checksums for merged
> > + * packets in the tables. That is, the returned packets may be with
> > + * wrong checksums.
> > + *
> > + * @param tbl
> > + *  a pointer points to a GRO table object.
> > + * @param desired_gro_types
> > + * rte_gro_timeout_flush only processes packets which belong to the
> > + * GRO types specified by desired_gro_types.
> > + * @param out
> > + *  a pointer array that is used to keep flushed timeout packets.
> > + * @param nb_out
> > + *  the element number of out. It's also the max number of timeout
> > + *  packets that can be flushed finally.
> > + * @return
> > + *  the number of flushed packets. If no packets are flushed, return 0.
> > + */
> > +uint16_t rte_gro_timeout_flush(void *tbl,
> > +		uint64_t desired_gro_types,
> > +		struct rte_mbuf **out,
> > +		uint16_t max_nb_out);
> > +
> > +/**
> > + * This function returns the number of packets in a given GRO table.
> > + * @param tbl
> > + *  pointer points to a GRO table.
> > + * @return
> > + *  the number of packets in the table.
> > + */
> > +uint64_t rte_gro_tbl_item_num(void *tbl);
> > +
> > +#ifdef __cplusplus
> > +}
> > +#endif
> > +
> > +#endif
> > diff --git a/lib/librte_gro/rte_gro_version.map b/lib/librte_gro/rte_gro_version.map
> > new file mode 100644
> > index 0000000..358fb9d
> > --- /dev/null
> > +++ b/lib/librte_gro/rte_gro_version.map
> > @@ -0,0 +1,12 @@
> > +DPDK_17.08 {
> > +	global:
> > +
> > +	rte_gro_tbl_create;
> > +	rte_gro_tbl_destroy;
> 
> As stated earlier, here are the API names I suggested:
> rte_gro_ctl_create()/rte_gro_ctl_destroy()/rte_gro_get_count().

Thanks, I will change the names.

> 
> > +	rte_gro_reassemble_burst;
> > +	rte_gro_reassemble;
> > +	rte_gro_timeout_flush;
> > +	rte_gro_tbl_item_num;
> > +
> > +	local: *;
> > +};
> > diff --git a/mk/rte.app.mk b/mk/rte.app.mk
> > index bcaf1b3..fc3776d 100644
> > --- a/mk/rte.app.mk
> > +++ b/mk/rte.app.mk
> > @@ -98,6 +98,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_RING)           += -lrte_ring
> >   _LDLIBS-$(CONFIG_RTE_LIBRTE_EAL)            += -lrte_eal
> >   _LDLIBS-$(CONFIG_RTE_LIBRTE_CMDLINE)        += -lrte_cmdline
> >   _LDLIBS-$(CONFIG_RTE_LIBRTE_REORDER)        += -lrte_reorder
> > +_LDLIBS-$(CONFIG_RTE_LIBRTE_GRO)        	+= -lrte_gro
> >
> >   ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
> >   _LDLIBS-$(CONFIG_RTE_LIBRTE_KNI)            += -lrte_kni
  
Yuanhan Liu July 4, 2017, 8:11 a.m. UTC | #3
On Mon, Jul 03, 2017 at 05:56:20AM +0000, Hu, Jiayu wrote:
> > > +/**
> > > + * GRO table, which is used to merge packets. It keeps many reassembly
> > > + * tables of desired GRO types. Applications need to create GRO tables
> > > + * before using rte_gro_reassemble to perform GRO.
> > > + */
> > > +struct gro_tbl {
> > > +	uint64_t desired_gro_types;	/**< GRO types to perform */
> > > +	/* max TTL measured in nanosecond */
> > > +	uint64_t max_timeout_cycles;
> > > +	/* max length of merged packet measured in byte */
> > > +	uint32_t max_packet_size;
> > > +	/* reassebly tables of desired GRO types */
> > > +	void *tbls[RTE_GRO_TYPE_MAX_NUM];
> > > +};
> > > +
> > > +void *rte_gro_tbl_create(const
> > > +		const struct rte_gro_param *param)
> > 
> > The name of this API and the definition of struct gro_tbl involve some
> > confusion. A gro table contains gro tables? I suppose a better name is
> > needed, for example, struct gro_ctl.
> 
> Actually, a GRO table includes N reassembly tables. But gro_tbl is not a good
> name. I will change the name. Thanks.

Haven't looked at the details yet, but, probably, gro_ctx (context) is a better
and more typical name?

	--yliu
  
Yuanhan Liu July 4, 2017, 8:37 a.m. UTC | #4
Haven't looked at the details yet, and below are some quick comments
after a glimpse.

On Sat, Jul 01, 2017 at 07:08:41PM +0800, Jiayu Hu wrote:
...
> +void *rte_gro_tbl_create(const
> +		const struct rte_gro_param *param)

The DPDK style is:

void *
rte_gro_tbl_create(...)

Also, you should revisit all the other functions, as I have seen quite a few
coding style issues like this.

> +{
> +	gro_tbl_create_fn create_tbl_fn;
> +	gro_tbl_destroy_fn destroy_tbl_fn;
> +	struct gro_tbl *gro_tbl;
> +	uint64_t gro_type_flag = 0;
> +	uint8_t i, j;
> +
> +	gro_tbl = rte_zmalloc_socket(__func__,
> +			sizeof(struct gro_tbl),
> +			RTE_CACHE_LINE_SIZE,
> +			param->socket_id);
> +	if (gro_tbl == NULL)
> +		return NULL;
> +	gro_tbl->max_packet_size = param->max_packet_size;
> +	gro_tbl->max_timeout_cycles = param->max_timeout_cycles;
> +	gro_tbl->desired_gro_types = param->desired_gro_types;
> +
> +	for (i = 0; i < RTE_GRO_TYPE_MAX_NUM; i++) {
> +		gro_type_flag = 1 << i;
> +
> +		if ((param->desired_gro_types & gro_type_flag) == 0)
> +			continue;
> +		create_tbl_fn = tbl_create_functions[i];
> +		if (create_tbl_fn == NULL)
> +			continue;
> +
> +		gro_tbl->tbls[i] = create_tbl_fn(
> +				param->socket_id,
> +				param->max_flow_num,
> +				param->max_item_per_flow);
> +		if (gro_tbl->tbls[i] == NULL) {
> +			/* destroy all allocated tables */
> +			for (j = 0; j < i; j++) {
> +				gro_type_flag = 1 << j;
> +				if ((param->desired_gro_types & gro_type_flag) == 0)
> +					continue;
> +				destroy_tbl_fn = tbl_destroy_functions[j];
> +				if (destroy_tbl_fn)
> +					destroy_tbl_fn(gro_tbl->tbls[j]);
> +			}
> +			rte_free(gro_tbl);
> +			return NULL;

The typical way to handle this is to re-use rte_gro_tbl_destroy() as
much as possible. This saves duplicate code.
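
A minimal standalone sketch of that idea follows. It uses calloc/free instead of rte_zmalloc_socket/rte_free so that it builds without DPDK, and every name in it (tbl_create, tbl_destroy, the stub engines) is hypothetical. It relies on the table being zero-initialized, so the destroy path can skip slots that were never allocated. It also uses UINT64_C(1) for the type flag, since shifting a plain int 1 by 31 bits or more is undefined where int is 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define TYPE_MAX_NUM 64	/* mirrors RTE_GRO_TYPE_MAX_NUM */

typedef void *(*create_fn)(void);
typedef void (*destroy_fn)(void *tbl);

static create_fn create_fns[TYPE_MAX_NUM];
static destroy_fn destroy_fns[TYPE_MAX_NUM];

struct gro_tbl {
	uint64_t types;
	void *tbls[TYPE_MAX_NUM];
};

/* Stub engines, for demonstration only. */
static int live_tables;

static void *
ok_create(void)
{
	void *p = malloc(1);

	if (p != NULL)
		live_tables++;
	return p;
}

static void *
fail_create(void)
{
	return NULL;
}

static void
count_destroy(void *tbl)
{
	free(tbl);
	live_tables--;
}

static void
tbl_destroy(struct gro_tbl *t)
{
	unsigned int i;

	if (t == NULL)
		return;
	for (i = 0; i < TYPE_MAX_NUM; i++) {
		/* slots never allocated are still NULL and are skipped */
		if (t->tbls[i] != NULL && destroy_fns[i] != NULL)
			destroy_fns[i](t->tbls[i]);
	}
	free(t);
}

static struct gro_tbl *
tbl_create(uint64_t types)
{
	struct gro_tbl *t = calloc(1, sizeof(*t));
	unsigned int i;

	if (t == NULL)
		return NULL;
	t->types = types;
	for (i = 0; i < TYPE_MAX_NUM; i++) {
		if ((types & (UINT64_C(1) << i)) == 0 || create_fns[i] == NULL)
			continue;
		t->tbls[i] = create_fns[i]();
		if (t->tbls[i] == NULL) {
			/* one cleanup path shared with the normal destroy */
			tbl_destroy(t);
			return NULL;
		}
	}
	return t;
}
```

The error branch shrinks to a single call, and any later change to the teardown logic only has to be made in one place.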

> +		}
> +	}
> +	return gro_tbl;
> +}
> +
> +void rte_gro_tbl_destroy(void *tbl)
> +{
> +	gro_tbl_destroy_fn destroy_tbl_fn;
> +	struct gro_tbl *gro_tbl = (struct gro_tbl *)tbl;

The cast (from void *) is unnecessary and can be dropped.
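
As a small illustration of this point (a sketch with a trimmed-down, hypothetical struct, not the patch's code): in C, a void * converts implicitly to any object pointer type, so the assignment compiles cleanly without the cast.

```c
#include <assert.h>
#include <stdint.h>

/* Trimmed-down stand-in for the struct in the patch. */
struct gro_tbl {
	uint64_t desired_gro_types;
};

static uint64_t
tbl_types(void *tbl)
{
	/* no cast needed: void * converts implicitly in C */
	struct gro_tbl *gro_tbl = tbl;

	return gro_tbl->desired_gro_types;
}
```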

...
> +/**
> + * the max packets number that rte_gro_reassemble_burst can
> + * process in each invocation.
> + */
> +#define RTE_GRO_MAX_BURST_ITEM_NUM 128UL
> +
> +/* max number of supported GRO types */
> +#define RTE_GRO_TYPE_MAX_NUM 64
> +#define RTE_GRO_TYPE_SUPPORT_NUM 0	/**< current supported GRO num */

The reason we need to use the "/**< ... */" comment style is that this
is what the doc generator (doxygen) recognizes. Otherwise, your
comment won't be displayed on the generated doc page (for example,
http://dpdk.org/doc/api/rte__ethdev_8h.html#ade7de72f6c0f8102d01a0b3438856900).

The format, as far as I know, could be:

    /** here is a comment */
    #define A_MACRO		x

Or the one you did for RTE_GRO_TYPE_SUPPORT_NUM: put "/**< ... */" at
the end of the line.

That being said, the comments for RTE_GRO_MAX_BURST_ITEM_NUM and
RTE_GRO_TYPE_MAX_NUM should be changed. Again, you should revisit
other places.
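
A short sketch of the two comment forms, using shortened placeholder names rather than the real macros: in Doxygen, "/** ... */" documents the entity that follows it, while "/**< ... */" documents the entity it trails on the same line.

```c
#include <assert.h>
#include <stdint.h>

/** Max number of packets that one burst call can process. */
#define MAX_BURST_ITEM_NUM 128UL

#define TYPE_MAX_NUM 64	/**< max number of supported GRO types */

/** Placeholder parameter struct, for illustration only. */
struct param {
	uint16_t socket_id;	/**< socket index for allocating GRO data */
};
```

A plain "/* ... */" comment in either position is ignored by the doc generator.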

> +
> +
> +struct rte_gro_param {
> +	uint64_t desired_gro_types;	/**< desired GRO types */
> +	uint32_t max_packet_size;	/**< max length of merged packets */
> +	uint16_t max_flow_num;	/**< max flow number */
> +	uint16_t max_item_per_flow;	/**< max packet number per flow */
> +
> +	/* socket index where the Ethernet port connects to */

Ditto.

...
> +++ b/lib/librte_gro/rte_gro_version.map
> @@ -0,0 +1,12 @@
> +DPDK_17.08 {
> +	global:
> +
> +	rte_gro_tbl_create;
> +	rte_gro_tbl_destroy;
> +	rte_gro_reassemble_burst;
> +	rte_gro_reassemble;
> +	rte_gro_timeout_flush;
> +	rte_gro_tbl_item_num;

The unwritten convention is to list them in alphabetical order.

	--yliu
  
Hu, Jiayu July 4, 2017, 4:01 p.m. UTC | #5
Hi Yuanhan,

> -----Original Message-----
> From: Yuanhan Liu [mailto:yliu@fridaylinux.org]
> Sent: Tuesday, July 4, 2017 4:37 PM
> To: Hu, Jiayu <jiayu.hu@intel.com>
> Cc: dev@dpdk.org; Ananyev, Konstantin <konstantin.ananyev@intel.com>;
> stephen@networkplumber.org; Tan, Jianfeng <jianfeng.tan@intel.com>; Wu,
> Jingjing <jingjing.wu@intel.com>; Yao, Lei A <lei.a.yao@intel.com>; Bie,
> Tiwei <tiwei.bie@intel.com>
> Subject: Re: [PATCH v10 1/3] lib: add Generic Receive Offload API framework
> 
> Haven't looked at the details yet, and below are some quick comments
> after a glimpse.
> 
> On Sat, Jul 01, 2017 at 07:08:41PM +0800, Jiayu Hu wrote:
> ...
> > +void *rte_gro_tbl_create(const
> > +		const struct rte_gro_param *param)
> 
> The DPDK style is:
> 
> void *
> rte_gro_tbl_destroy(...)
> 
> Also you should revisit all other functions, as I have seen quite many
> coding style issues like this.

Thanks, I will fix the style issues.

> 
> > +{
> > +	gro_tbl_create_fn create_tbl_fn;
> > +	gro_tbl_destroy_fn destroy_tbl_fn;
> > +	struct gro_tbl *gro_tbl;
> > +	uint64_t gro_type_flag = 0;
> > +	uint8_t i, j;
> > +
> > +	gro_tbl = rte_zmalloc_socket(__func__,
> > +			sizeof(struct gro_tbl),
> > +			RTE_CACHE_LINE_SIZE,
> > +			param->socket_id);
> > +	if (gro_tbl == NULL)
> > +		return NULL;
> > +	gro_tbl->max_packet_size = param->max_packet_size;
> > +	gro_tbl->max_timeout_cycles = param->max_timeout_cycles;
> > +	gro_tbl->desired_gro_types = param->desired_gro_types;
> > +
> > +	for (i = 0; i < RTE_GRO_TYPE_MAX_NUM; i++) {
> > +		gro_type_flag = 1 << i;
> > +
> > +		if ((param->desired_gro_types & gro_type_flag) == 0)
> > +			continue;
> > +		create_tbl_fn = tbl_create_functions[i];
> > +		if (create_tbl_fn == NULL)
> > +			continue;
> > +
> > +		gro_tbl->tbls[i] = create_tbl_fn(
> > +				param->socket_id,
> > +				param->max_flow_num,
> > +				param->max_item_per_flow);
> > +		if (gro_tbl->tbls[i] == NULL) {
> > +			/* destroy all allocated tables */
> > +			for (j = 0; j < i; j++) {
> > +				gro_type_flag = 1 << j;
> > +				if ((param->desired_gro_types & gro_type_flag) == 0)
> > +					continue;
> > +				destroy_tbl_fn = tbl_destroy_functions[j];
> > +				if (destroy_tbl_fn)
> > +					destroy_tbl_fn(gro_tbl->tbls[j]);
> > +			}
> > +			rte_free(gro_tbl);
> > +			return NULL;
> 
> The typical way to handle this is to re-use rte_gro_tbl_destroy() as
> much as possible. This saves duplicate code.

Thanks, I will change it.

> 
> > +		}
> > +	}
> > +	return gro_tbl;
> > +}
> > +
> > +void rte_gro_tbl_destroy(void *tbl)
> > +{
> > +	gro_tbl_destroy_fn destroy_tbl_fn;
> > +	struct gro_tbl *gro_tbl = (struct gro_tbl *)tbl;
> 
> The cast (from void *) is unnecessary and can be dropped.

Thanks, I will remove them.

> 
> ...
> > +/**
> > + * the max packets number that rte_gro_reassemble_burst can
> > + * process in each invocation.
> > + */
> > +#define RTE_GRO_MAX_BURST_ITEM_NUM 128UL
> > +
> > +/* max number of supported GRO types */
> > +#define RTE_GRO_TYPE_MAX_NUM 64
> > +#define RTE_GRO_TYPE_SUPPORT_NUM 0	/**< current supported GRO num */
> 
> The reason we need use comment style of "/**< ... */" is because this
> is what the doc generator (doxygen) recognizes. If not doing this, your
> comment won't be displayed at the generated doc page (for example,
> http://dpdk.org/doc/api/rte__ethdev_8h.html#ade7de72f6c0f8102d01a0b3438856900).
> 
> The format, as far as I known, could be:
> 
>     /**< here is a comment */
>     #define A_MACRO		x
> 
> Or the one you did for RTE_GRO_TYPE_SUPPORT_NUM: put it at the end
> of the line.
> 
> That being said, the comments for RTE_GRO_MAX_BURST_ITEM_NUM and
> RTE_GRO_TYPE_MAX_NUM should be changed. Again, you should revisit
> other places.

Thanks, I will modify the comments style.

> 
> > +
> > +
> > +struct rte_gro_param {
> > +	uint64_t desired_gro_types;	/**< desired GRO types */
> > +	uint32_t max_packet_size;	/**< max length of merged packets */
> > +	uint16_t max_flow_num;	/**< max flow number */
> > +	uint16_t max_item_per_flow;	/**< max packet number per flow */
> > +
> > +	/* socket index where the Ethernet port connects to */
> 
> Ditto.
> 
> ...
> > +++ b/lib/librte_gro/rte_gro_version.map
> > @@ -0,0 +1,12 @@
> > +DPDK_17.08 {
> > +	global:
> > +
> > +	rte_gro_tbl_create;
> > +	rte_gro_tbl_destroy;
> > +	rte_gro_reassemble_burst;
> > +	rte_gro_reassemble;
> > +	rte_gro_timeout_flush;
> > +	rte_gro_tbl_item_num;
> 
> The undocumented habit is to list them in alpha order.

Thanks, I will change the order.

BRs,
Jiayu
> 
> 	--yliu
  

Patch

diff --git a/config/common_base b/config/common_base
index f6aafd1..167f5ef 100644
--- a/config/common_base
+++ b/config/common_base
@@ -712,6 +712,11 @@  CONFIG_RTE_LIBRTE_VHOST_DEBUG=n
 CONFIG_RTE_LIBRTE_PMD_VHOST=n
 
 #
+# Compile GRO library
+#
+CONFIG_RTE_LIBRTE_GRO=y
+
+#
 #Compile Xen domain0 support
 #
 CONFIG_RTE_LIBRTE_XEN_DOM0=n
diff --git a/lib/Makefile b/lib/Makefile
index 07e1fd0..ac1c2f6 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -106,6 +106,8 @@  DIRS-$(CONFIG_RTE_LIBRTE_REORDER) += librte_reorder
 DEPDIRS-librte_reorder := librte_eal librte_mempool librte_mbuf
 DIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += librte_pdump
 DEPDIRS-librte_pdump := librte_eal librte_mempool librte_mbuf librte_ether
+DIRS-$(CONFIG_RTE_LIBRTE_GRO) += librte_gro
+DEPDIRS-librte_gro := librte_eal librte_mbuf librte_ether librte_net
 
 ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
 DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni
diff --git a/lib/librte_gro/Makefile b/lib/librte_gro/Makefile
new file mode 100644
index 0000000..7e0f128
--- /dev/null
+++ b/lib/librte_gro/Makefile
@@ -0,0 +1,50 @@ 
+#   BSD LICENSE
+#
+#   Copyright(c) 2017 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_gro.a
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
+
+EXPORT_MAP := rte_gro_version.map
+
+LIBABIVER := 1
+
+# source files
+SRCS-$(CONFIG_RTE_LIBRTE_GRO) += rte_gro.c
+
+# install this header file
+SYMLINK-$(CONFIG_RTE_LIBRTE_GRO)-include += rte_gro.h
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_gro/rte_gro.c b/lib/librte_gro/rte_gro.c
new file mode 100644
index 0000000..648835b
--- /dev/null
+++ b/lib/librte_gro/rte_gro.c
@@ -0,0 +1,176 @@ 
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <rte_malloc.h>
+#include <rte_mbuf.h>
+
+#include "rte_gro.h"
+
+typedef void *(*gro_tbl_create_fn)(uint16_t socket_id,
+		uint16_t max_flow_num,
+		uint16_t max_item_per_flow);
+typedef void (*gro_tbl_destroy_fn)(void *tbl);
+typedef uint32_t (*gro_tbl_item_num_fn)(void *tbl);
+
+static gro_tbl_create_fn tbl_create_functions[RTE_GRO_TYPE_MAX_NUM];
+static gro_tbl_destroy_fn tbl_destroy_functions[RTE_GRO_TYPE_MAX_NUM];
+static gro_tbl_item_num_fn tbl_item_num_functions[RTE_GRO_TYPE_MAX_NUM];
+
+/**
+ * GRO table, which is used to merge packets. It keeps many reassembly
+ * tables of desired GRO types. Applications need to create GRO tables
+ * before using rte_gro_reassemble to perform GRO.
+ */
+struct gro_tbl {
+	uint64_t desired_gro_types;	/**< GRO types to perform */
+	/* max TTL measured in nanosecond */
+	uint64_t max_timeout_cycles;
+	/* max length of merged packet measured in byte */
+	uint32_t max_packet_size;
+	/* reassembly tables of desired GRO types */
+	void *tbls[RTE_GRO_TYPE_MAX_NUM];
+};
+
+void *
+rte_gro_tbl_create(const struct rte_gro_param *param)
+{
+	gro_tbl_create_fn create_tbl_fn;
+	gro_tbl_destroy_fn destroy_tbl_fn;
+	struct gro_tbl *gro_tbl;
+	uint64_t gro_type_flag = 0;
+	uint8_t i, j;
+
+	gro_tbl = rte_zmalloc_socket(__func__,
+			sizeof(struct gro_tbl),
+			RTE_CACHE_LINE_SIZE,
+			param->socket_id);
+	if (gro_tbl == NULL)
+		return NULL;
+	gro_tbl->max_packet_size = param->max_packet_size;
+	gro_tbl->max_timeout_cycles = param->max_timeout_cycles;
+	gro_tbl->desired_gro_types = param->desired_gro_types;
+
+	for (i = 0; i < RTE_GRO_TYPE_MAX_NUM; i++) {
+		gro_type_flag = 1ULL << i;
+
+		if ((param->desired_gro_types & gro_type_flag) == 0)
+			continue;
+		create_tbl_fn = tbl_create_functions[i];
+		if (create_tbl_fn == NULL)
+			continue;
+
+		gro_tbl->tbls[i] = create_tbl_fn(
+				param->socket_id,
+				param->max_flow_num,
+				param->max_item_per_flow);
+		if (gro_tbl->tbls[i] == NULL) {
+			/* destroy all allocated tables */
+			for (j = 0; j < i; j++) {
+				gro_type_flag = 1ULL << j;
+				if ((param->desired_gro_types & gro_type_flag) == 0)
+					continue;
+				destroy_tbl_fn = tbl_destroy_functions[j];
+				if (destroy_tbl_fn)
+					destroy_tbl_fn(gro_tbl->tbls[j]);
+			}
+			rte_free(gro_tbl);
+			return NULL;
+		}
+	}
+	return gro_tbl;
+}
+
+void rte_gro_tbl_destroy(void *tbl)
+{
+	gro_tbl_destroy_fn destroy_tbl_fn;
+	struct gro_tbl *gro_tbl = (struct gro_tbl *)tbl;
+	uint64_t gro_type_flag;
+	uint8_t i;
+
+	if (gro_tbl == NULL)
+		return;
+	for (i = 0; i < RTE_GRO_TYPE_MAX_NUM; i++) {
+		gro_type_flag = 1ULL << i;
+		if ((gro_tbl->desired_gro_types & gro_type_flag) == 0)
+			continue;
+		destroy_tbl_fn = tbl_destroy_functions[i];
+		if (destroy_tbl_fn)
+			destroy_tbl_fn(gro_tbl->tbls[i]);
+	}
+	rte_free(gro_tbl);
+}
+
+uint16_t
+rte_gro_reassemble_burst(struct rte_mbuf **pkts __rte_unused,
+		uint16_t nb_pkts,
+		const struct rte_gro_param *param __rte_unused)
+{
+	return nb_pkts;
+}
+
+uint16_t
+rte_gro_reassemble(struct rte_mbuf **pkts __rte_unused,
+		uint16_t nb_pkts,
+		void *tbl __rte_unused)
+{
+	return nb_pkts;
+}
+
+uint16_t
+rte_gro_timeout_flush(void *tbl __rte_unused,
+		uint64_t desired_gro_types __rte_unused,
+		struct rte_mbuf **out __rte_unused,
+		uint16_t max_nb_out __rte_unused)
+{
+	return 0;
+}
+
+uint64_t rte_gro_tbl_item_num(void *tbl)
+{
+	struct gro_tbl *gro_tbl = (struct gro_tbl *)tbl;
+	gro_tbl_item_num_fn item_num_fn;
+	uint64_t item_num = 0;
+	uint64_t gro_type_flag;
+	uint8_t i;
+
+	for (i = 0; i < RTE_GRO_TYPE_MAX_NUM; i++) {
+		gro_type_flag = 1ULL << i;
+		if ((gro_tbl->desired_gro_types & gro_type_flag) == 0)
+			continue;
+
+		item_num_fn = tbl_item_num_functions[i];
+		if (item_num_fn == NULL)
+			continue;
+		item_num += item_num_fn(gro_tbl->tbls[i]);
+	}
+	return item_num;
+}
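Note on the per-type dispatch used by rte_gro_tbl_item_num and the create/destroy paths: each GRO type owns one bit of a 64-bit mask and one slot in a function-pointer array. The following is a minimal standalone sketch of that pattern, not the library code. All names prefixed with ex_ or EX_ are hypothetical stand-ins, and the per-type table is reduced to a plain counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* hypothetical stand-in for RTE_GRO_TYPE_MAX_NUM */
#define EX_TYPE_MAX_NUM 64

typedef uint32_t (*ex_item_num_fn)(void *tbl);

/* hypothetical per-type reassembly table: only an item counter */
struct ex_tbl {
	uint32_t item_num;
};

static uint32_t
ex_tbl_item_num(void *tbl)
{
	return ((struct ex_tbl *)tbl)->item_num;
}

/* Sum the item counts of every type whose bit is set in 'types'. */
static uint64_t
ex_total_item_num(uint64_t types, void *tbls[EX_TYPE_MAX_NUM],
		const ex_item_num_fn fns[EX_TYPE_MAX_NUM])
{
	uint64_t total = 0;
	uint8_t i;

	assert(tbls != NULL && fns != NULL);
	for (i = 0; i < EX_TYPE_MAX_NUM; i++) {
		/* shift a 64-bit one so that bits 32..63 are reachable */
		uint64_t flag = UINT64_C(1) << i;

		/* skip types that are not requested or not registered */
		if ((types & flag) == 0 || fns[i] == NULL || tbls[i] == NULL)
			continue;
		total += fns[i](tbls[i]);
	}
	return total;
}
```

A caller would register ex_tbl_item_num in the slots of the types it supports and pass the same mask it used when the tables were created. Slots without a registered function are skipped rather than treated as errors.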
diff --git a/lib/librte_gro/rte_gro.h b/lib/librte_gro/rte_gro.h
new file mode 100644
index 0000000..02c9113
--- /dev/null
+++ b/lib/librte_gro/rte_gro.h
@@ -0,0 +1,176 @@ 
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_GRO_H_
+#define _RTE_GRO_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * the max number of packets that rte_gro_reassemble_burst can
+ * process in each invocation.
+ */
+#define RTE_GRO_MAX_BURST_ITEM_NUM 128UL
+
+/* max number of supported GRO types */
+#define RTE_GRO_TYPE_MAX_NUM 64
+#define RTE_GRO_TYPE_SUPPORT_NUM 0	/**< number of supported GRO types */
+
+
+struct rte_gro_param {
+	uint64_t desired_gro_types;	/**< desired GRO types */
+	uint32_t max_packet_size;	/**< max length of merged packets */
+	uint16_t max_flow_num;	/**< max flow number */
+	uint16_t max_item_per_flow;	/**< max packet number per flow */
+
+	/* index of the socket that the Ethernet port connects to */
+	uint16_t socket_id;
+	/* max TTL for a packet in the GRO table, measured in nanoseconds */
+	uint64_t max_timeout_cycles;
+};
+
+/**
+ * This function creates a GRO table, which is used to merge packets in
+ * rte_gro_reassemble.
+ *
+ * @param param
+ *  applications use it to pass needed parameters to create a GRO table.
+ * @return
+ *  a pointer to the GRO table if it is created successfully.
+ *  Otherwise, NULL.
+ */
+void *rte_gro_tbl_create(
+		const struct rte_gro_param *param);
+/**
+ * This function destroys a GRO table.
+ */
+void rte_gro_tbl_destroy(void *tbl);
+
+/**
+ * This is one of the main reassembly APIs, which merges a burst of
+ * packets at a time. It assumes that all inputted packets have
+ * correct checksums. That is, applications should guarantee all
+ * inputted packets are correct. Besides, it doesn't re-calculate
+ * checksums for merged packets. If inputted packets are IP fragmented,
+ * this function assumes they are complete (i.e. with L4 header). After
+ * finishing processing, it returns all GROed packets to applications
+ * immediately.
+ *
+ * @param pkts
+ *  an array of pointers to the packets to reassemble. On return,
+ *  it holds the addresses of the GROed packets.
+ * @param nb_pkts
+ *  the number of packets to reassemble.
+ * @param param
+ *  applications use it to tell rte_gro_reassemble_burst what rules
+ *  are demanded.
+ * @return
+ *  the number of packets after being GROed.
+ */
+uint16_t rte_gro_reassemble_burst(struct rte_mbuf **pkts,
+		uint16_t nb_pkts,
+		const struct rte_gro_param *param);
+
+/**
+ * Reassembly function, which tries to merge inputted packets with
+ * the packets in a given GRO table. This function assumes all inputted
+ * packets have correct checksums, and it won't update checksums if
+ * two packets are merged. Besides, if inputted packets are IP
+ * fragmented, this function assumes they are complete packets (i.e.
+ * with L4 header).
+ *
+ * If the inputted packets don't have data or are with unsupported GRO
+ * types, they won't be processed and are returned to applications.
+ * Otherwise, the inputted packets are either merged or inserted into
+ * the table. If applications want to get packets in the table, they need
+ * to call the flush API.
+ *
+ * @param pkts
+ *  packets to reassemble. After this function finishes, it
+ *  keeps the unprocessed packets (i.e. without data or unsupported
+ *  GRO types).
+ * @param nb_pkts
+ *  the number of packets to reassemble.
+ * @param tbl
+ *  a pointer to a GRO table.
+ * @return
+ *  the number of unprocessed packets (i.e. without data or
+ *  unsupported GRO types). If all packets are processed (merged or
+ *  inserted into the table), 0.
+ */
+uint16_t rte_gro_reassemble(struct rte_mbuf **pkts,
+		uint16_t nb_pkts,
+		void *tbl);
+
+/**
+ * This function flushes the timeout packets from reassembly tables of
+ * desired GRO types. The max number of flushed timeout packets is the
+ * number of elements in the array used to keep the flushed packets.
+ *
+ * Besides, this function won't re-calculate checksums for merged
+ * packets in the tables. That is, the returned packets may have
+ * wrong checksums.
+ *
+ * @param tbl
+ *  a pointer to a GRO table object.
+ * @param desired_gro_types
+ *  rte_gro_timeout_flush only processes packets which belong to the
+ *  GRO types specified by desired_gro_types.
+ * @param out
+ *  a pointer array that is used to keep flushed timeout packets.
+ * @param max_nb_out
+ *  the number of elements in out. It's also the max number of timeout
+ *  packets that can be flushed.
+ * @return
+ *  the number of flushed packets. If no packets are flushed, return 0.
+ */
+uint16_t rte_gro_timeout_flush(void *tbl,
+		uint64_t desired_gro_types,
+		struct rte_mbuf **out,
+		uint16_t max_nb_out);
+
+/**
+ * This function returns the number of packets in a given GRO table.
+ * @param tbl
+ *  a pointer to a GRO table.
+ * @return
+ *  the number of packets in the table.
+ */
+uint64_t rte_gro_tbl_item_num(void *tbl);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
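Note on the rte_gro_timeout_flush contract documented above: the caller supplies an output array, and the function writes at most max_nb_out expired packets into it and returns how many it wrote. In this patch the function is still a stub that returns 0, so the following is only a standalone sketch of that contract under assumed data structures. struct ex_item, ex_timeout_flush and the flat item array are hypothetical and do not reflect the table layout of any GRO type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* hypothetical table entry; not the layout used by the patch */
struct ex_item {
	void *pkt;		/* stored packet, NULL if the slot is free */
	uint64_t start_time;	/* time at which the packet was inserted */
};

/*
 * Move packets that have stayed in the table for at least
 * 'max_timeout' into 'out', writing at most 'max_nb_out' of them.
 * Returns the number of packets written to 'out'.
 */
static uint16_t
ex_timeout_flush(struct ex_item *items, uint32_t nb_items,
		uint64_t now, uint64_t max_timeout,
		void **out, uint16_t max_nb_out)
{
	uint16_t nb_out = 0;
	uint32_t i;

	assert(out != NULL || max_nb_out == 0);
	/* stop as soon as the output array is full */
	for (i = 0; i < nb_items && nb_out < max_nb_out; i++) {
		if (items[i].pkt == NULL)
			continue;
		if (now - items[i].start_time < max_timeout)
			continue;
		out[nb_out++] = items[i].pkt;
		items[i].pkt = NULL;	/* release the slot */
	}
	return nb_out;
}
```

Because the output array bounds each call, expired packets that do not fit stay in the table. A caller that wants to drain everything would call the function repeatedly until it returns fewer packets than max_nb_out.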
diff --git a/lib/librte_gro/rte_gro_version.map b/lib/librte_gro/rte_gro_version.map
new file mode 100644
index 0000000..358fb9d
--- /dev/null
+++ b/lib/librte_gro/rte_gro_version.map
@@ -0,0 +1,12 @@ 
+DPDK_17.08 {
+	global:
+
+	rte_gro_tbl_create;
+	rte_gro_tbl_destroy;
+	rte_gro_reassemble_burst;
+	rte_gro_reassemble;
+	rte_gro_timeout_flush;
+	rte_gro_tbl_item_num;
+
+	local: *;
+};
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index bcaf1b3..fc3776d 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -98,6 +98,7 @@  _LDLIBS-$(CONFIG_RTE_LIBRTE_RING)           += -lrte_ring
 _LDLIBS-$(CONFIG_RTE_LIBRTE_EAL)            += -lrte_eal
 _LDLIBS-$(CONFIG_RTE_LIBRTE_CMDLINE)        += -lrte_cmdline
 _LDLIBS-$(CONFIG_RTE_LIBRTE_REORDER)        += -lrte_reorder
+_LDLIBS-$(CONFIG_RTE_LIBRTE_GRO)            += -lrte_gro
 
 ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
 _LDLIBS-$(CONFIG_RTE_LIBRTE_KNI)            += -lrte_kni