[dpdk-dev,v7,1/3] lib: add Generic Receive Offload API framework

Message ID 1498459430-116048-2-git-send-email-jiayu.hu@intel.com (mailing list archive)
State Superseded, archived

Checks

Context               Check    Description
ci/Intel-compilation  success  Compilation OK
ci/checkpatch         success  coding style OK

Commit Message

Hu, Jiayu June 26, 2017, 6:43 a.m. UTC
  Generic Receive Offload (GRO) is a widely used SW-based offloading
technique to reduce per-packet processing overhead. It gains
performance by reassembling small packets into large ones. This
patchset is to support GRO in DPDK. To support GRO, this patch
implements a GRO API framework.

To give applications more flexibility, DPDK GRO is implemented as
a user library. Applications explicitly call the GRO library to merge
small packets into large ones. DPDK GRO provides two reassembly modes:
lightweight mode and heavyweight mode. Applications that want to merge
a relatively small number of packets in a simple way can use
lightweight mode. Applications that need more fine-grained control can
choose heavyweight mode.

rte_gro_reassemble_burst is the main reassembly API in lightweight
mode and processes N packets at a time. For applications, performing
GRO in lightweight mode is simple: they just invoke
rte_gro_reassemble_burst and get the GROed packets as soon as it
returns.

rte_gro_reassemble is the main reassembly API in heavyweight mode
and processes one packet at a time. For applications, performing GRO
in heavyweight mode is more involved. Before performing GRO,
applications need to create a GRO table with rte_gro_tbl_create. Then
they can use rte_gro_reassemble to process packets one by one. The
processed packets stay in the GRO table; to retrieve them,
applications need to flush them manually with the flush APIs.

Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
---
 config/common_base                 |   5 +
 lib/Makefile                       |   2 +
 lib/librte_gro/Makefile            |  50 +++++++++
 lib/librte_gro/rte_gro.c           | 125 ++++++++++++++++++++++
 lib/librte_gro/rte_gro.h           | 205 +++++++++++++++++++++++++++++++++++++
 lib/librte_gro/rte_gro_version.map |  12 +++
 mk/rte.app.mk                      |   1 +
 7 files changed, 400 insertions(+)
 create mode 100644 lib/librte_gro/Makefile
 create mode 100644 lib/librte_gro/rte_gro.c
 create mode 100644 lib/librte_gro/rte_gro.h
 create mode 100644 lib/librte_gro/rte_gro_version.map
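
The commit message describes GRO as reassembling small packets into
large ones, with the burst API returning the merged packets in the input
array. The following is only a toy sketch of that idea, not the code in
this patch: struct toy_seg, toy_gro_burst and the max_len limit are
hypothetical names, and a "packet" is reduced to a flow id, a sequence
number and a length. It only appends a segment to an earlier one that it
directly follows in sequence space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a received TCP segment; not a DPDK type. */
struct toy_seg {
	uint32_t flow_id;	/* identifies the connection */
	uint32_t seq;		/* sequence number of the first payload byte */
	uint32_t len;		/* payload length in bytes */
};

/*
 * Merge segments of the same flow that are contiguous in sequence
 * space. Merged results are compacted to the front of segs[] and the
 * new element count is returned.
 */
static uint16_t
toy_gro_burst(struct toy_seg *segs, uint16_t nb_segs, uint32_t max_len)
{
	uint16_t nb_out = 0;
	uint16_t i, j;

	assert(segs != NULL || nb_segs == 0);
	for (i = 0; i < nb_segs; i++) {
		int merged = 0;

		/* look for an already kept segment that segs[i] extends */
		for (j = 0; j < nb_out; j++) {
			struct toy_seg *out = &segs[j];

			if (out->flow_id == segs[i].flow_id &&
			    out->seq + out->len == segs[i].seq &&
			    out->len + segs[i].len <= max_len) {
				out->len += segs[i].len;
				merged = 1;
				break;
			}
		}
		/* nb_out <= i always holds, so this never clobbers input */
		if (!merged)
			segs[nb_out++] = segs[i];
	}
	return nb_out;
}
```

A caller would pass the array it just received and afterwards use only
the first N entries, where N is the return value.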
  

Comments

Ananyev, Konstantin June 27, 2017, 11:42 p.m. UTC | #1
Hi Jiayu,

> -----Original Message-----
> From: Hu, Jiayu
> Sent: Monday, June 26, 2017 7:44 AM
> To: dev@dpdk.org
> Cc: Tan, Jianfeng <jianfeng.tan@intel.com>; Ananyev, Konstantin <konstantin.ananyev@intel.com>; stephen@networkplumber.org;
> yliu@fridaylinux.org; Wu, Jingjing <jingjing.wu@intel.com>; Yao, Lei A <lei.a.yao@intel.com>; Wiles, Keith <keith.wiles@intel.com>; Bie,
> Tiwei <tiwei.bie@intel.com>; Hu, Jiayu <jiayu.hu@intel.com>
> Subject: [PATCH v7 1/3] lib: add Generic Receive Offload API framework
> 
> Generic Receive Offload (GRO) is a widely used SW-based offloading
> technique to reduce per-packet processing overhead. It gains
> performance by reassembling small packets into large ones. This
> patchset is to support GRO in DPDK. To support GRO, this patch
> implements a GRO API framework.
> 
> To enable more flexibility to applications, DPDK GRO is implemented as
> a user library. Applications explicitly use the GRO library to merge
> small packets into large ones. DPDK GRO provides two reassembly modes.
> One is called lightweight mode, the other is called heavyweight mode.
> If applications want to merge packets in a simple way and the number
> of packets is relatively small, they can use the lightweight mode.
> If applications need more fine-grained controls, they can choose the
> heavyweight mode.
> 
> rte_gro_reassemble_burst is the main reassembly API which is used in
> lightweight mode and processes N packets at a time. For applications,
> performing GRO in lightweight mode is simple. They just need to invoke
> rte_gro_reassemble_burst. Applications can get GROed packets as soon as
> rte_gro_reassemble_burst returns.
> 
> rte_gro_reassemble is the main reassembly API which is used in
> heavyweight mode and processes one packet at a time. For applications,
> performing GRO in heavyweight mode is relatively complicated. Before
> performing GRO, applications need to create a GRO table by
> rte_gro_tbl_create. Then they can use rte_gro_reassemble to process
> packets one by one. The processed packets are in the GRO table. If
> applications want to get them, applications need to manually flush
> them by flush APIs.
> 
> Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
> ---
>  config/common_base                 |   5 +
>  lib/Makefile                       |   2 +
>  lib/librte_gro/Makefile            |  50 +++++++++
>  lib/librte_gro/rte_gro.c           | 125 ++++++++++++++++++++++
>  lib/librte_gro/rte_gro.h           | 205 +++++++++++++++++++++++++++++++++++++
>  lib/librte_gro/rte_gro_version.map |  12 +++
>  mk/rte.app.mk                      |   1 +
>  7 files changed, 400 insertions(+)
>  create mode 100644 lib/librte_gro/Makefile
>  create mode 100644 lib/librte_gro/rte_gro.c
>  create mode 100644 lib/librte_gro/rte_gro.h
>  create mode 100644 lib/librte_gro/rte_gro_version.map
> 
> diff --git a/config/common_base b/config/common_base
> index f6aafd1..167f5ef 100644
> --- a/config/common_base
> +++ b/config/common_base
> @@ -712,6 +712,11 @@ CONFIG_RTE_LIBRTE_VHOST_DEBUG=n
>  CONFIG_RTE_LIBRTE_PMD_VHOST=n
> 
>  #
> +# Compile GRO library
> +#
> +CONFIG_RTE_LIBRTE_GRO=y
> +
> +#
>  #Compile Xen domain0 support
>  #
>  CONFIG_RTE_LIBRTE_XEN_DOM0=n
> diff --git a/lib/Makefile b/lib/Makefile
> index 07e1fd0..ac1c2f6 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -106,6 +106,8 @@ DIRS-$(CONFIG_RTE_LIBRTE_REORDER) += librte_reorder
>  DEPDIRS-librte_reorder := librte_eal librte_mempool librte_mbuf
>  DIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += librte_pdump
>  DEPDIRS-librte_pdump := librte_eal librte_mempool librte_mbuf librte_ether
> +DIRS-$(CONFIG_RTE_LIBRTE_GRO) += librte_gro
> +DEPDIRS-librte_gro := librte_eal librte_mbuf librte_ether librte_net
> 
>  ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
>  DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni
> diff --git a/lib/librte_gro/Makefile b/lib/librte_gro/Makefile
> new file mode 100644
> index 0000000..7e0f128
> --- /dev/null
> +++ b/lib/librte_gro/Makefile
> @@ -0,0 +1,50 @@
> +#   BSD LICENSE
> +#
> +#   Copyright(c) 2017 Intel Corporation. All rights reserved.
> +#   All rights reserved.
> +#
> +#   Redistribution and use in source and binary forms, with or without
> +#   modification, are permitted provided that the following conditions
> +#   are met:
> +#
> +#     * Redistributions of source code must retain the above copyright
> +#       notice, this list of conditions and the following disclaimer.
> +#     * Redistributions in binary form must reproduce the above copyright
> +#       notice, this list of conditions and the following disclaimer in
> +#       the documentation and/or other materials provided with the
> +#       distribution.
> +#     * Neither the name of Intel Corporation nor the names of its
> +#       contributors may be used to endorse or promote products derived
> +#       from this software without specific prior written permission.
> +#
> +#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> +#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> +#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> +#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> +#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> +#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> +#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> +#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> +#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> +#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> +#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> +
> +include $(RTE_SDK)/mk/rte.vars.mk
> +
> +# library name
> +LIB = librte_gro.a
> +
> +CFLAGS += -O3
> +CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
> +
> +EXPORT_MAP := rte_gro_version.map
> +
> +LIBABIVER := 1
> +
> +# source files
> +SRCS-$(CONFIG_RTE_LIBRTE_GRO) += rte_gro.c
> +
> +# install this header file
> +SYMLINK-$(CONFIG_RTE_LIBRTE_GRO)-include += rte_gro.h
> +
> +include $(RTE_SDK)/mk/rte.lib.mk
> diff --git a/lib/librte_gro/rte_gro.c b/lib/librte_gro/rte_gro.c
> new file mode 100644
> index 0000000..33275e8
> --- /dev/null
> +++ b/lib/librte_gro/rte_gro.c
> @@ -0,0 +1,125 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#include <rte_malloc.h>
> +#include <rte_mbuf.h>
> +
> +#include "rte_gro.h"
> +
> +static gro_tbl_create_fn tbl_create_functions[GRO_TYPE_MAX_NUM];
> +static gro_tbl_destroy_fn tbl_destroy_functions[GRO_TYPE_MAX_NUM];
> +
> +struct rte_gro_tbl *rte_gro_tbl_create(uint16_t socket_id,
> +		uint16_t max_flow_num,
> +		uint16_t max_item_per_flow,
> +		uint32_t max_packet_size,
> +		uint64_t max_timeout_cycles,
> +		uint64_t desired_gro_types)
> +{
> +	gro_tbl_create_fn create_tbl_fn;
> +	struct rte_gro_tbl *gro_tbl;
> +	uint64_t gro_type_flag = 0;
> +	uint8_t i;
> +
> +	gro_tbl = rte_zmalloc_socket(__func__,
> +			sizeof(struct rte_gro_tbl),
> +			RTE_CACHE_LINE_SIZE,
> +			socket_id);
> +	gro_tbl->max_packet_size = max_packet_size;
> +	gro_tbl->max_timeout_cycles = max_timeout_cycles;
> +	gro_tbl->desired_gro_types = desired_gro_types;
> +
> +	for (i = 0; i < GRO_TYPE_MAX_NUM; i++) {
> +		gro_type_flag = 1 << i;
> +		if (desired_gro_types & gro_type_flag) {
> +			create_tbl_fn = tbl_create_functions[i];
> +			if (create_tbl_fn)
> +				create_tbl_fn(socket_id,
> +						max_flow_num,
> +						max_item_per_flow);

As I understand it, create_tbl_fn() can fail.
You should handle that situation correctly.
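
One possible shape for that failure handling is sketched below. It is
not the follow-up patch: it uses plain calloc()/free() instead of
rte_zmalloc_socket()/rte_free() so that it builds without DPDK, and all
toy_* names are hypothetical. It also stores the hook's return value in
tbls[i], which the quoted loop discards, checks the allocation of the
table itself, and builds the type flag from a 64-bit constant because
shifting a plain int by 32 or more bits is undefined.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_GRO_TYPE_MAX_NUM 64

typedef void *(*toy_tbl_create_fn)(uint16_t max_flow_num,
		uint16_t max_item_per_flow);
typedef void (*toy_tbl_destroy_fn)(void *tbl);

struct toy_gro_tbl {
	uint64_t desired_gro_types;
	void *tbls[TOY_GRO_TYPE_MAX_NUM];
};

/* Per-type hooks; a NULL entry means the type is not supported. */
static toy_tbl_create_fn toy_create_fns[TOY_GRO_TYPE_MAX_NUM];
static toy_tbl_destroy_fn toy_destroy_fns[TOY_GRO_TYPE_MAX_NUM];

/* Example hooks: one that succeeds and one that always fails. */
static void *
toy_ok_create(uint16_t max_flow_num, uint16_t max_item_per_flow)
{
	return malloc((size_t)max_flow_num * max_item_per_flow + 1);
}

static void *
toy_fail_create(uint16_t max_flow_num, uint16_t max_item_per_flow)
{
	(void)max_flow_num;
	(void)max_item_per_flow;
	return NULL;
}

static void
toy_gro_tbl_destroy(struct toy_gro_tbl *gro_tbl)
{
	unsigned int i;

	if (gro_tbl == NULL)
		return;
	/* only slots that were really created are non-NULL */
	for (i = 0; i < TOY_GRO_TYPE_MAX_NUM; i++) {
		if (gro_tbl->tbls[i] != NULL && toy_destroy_fns[i] != NULL)
			toy_destroy_fns[i](gro_tbl->tbls[i]);
	}
	free(gro_tbl);
}

static struct toy_gro_tbl *
toy_gro_tbl_create(uint16_t max_flow_num, uint16_t max_item_per_flow,
		uint64_t desired_gro_types)
{
	struct toy_gro_tbl *gro_tbl;
	unsigned int i;

	gro_tbl = calloc(1, sizeof(*gro_tbl));
	if (gro_tbl == NULL)
		return NULL;
	gro_tbl->desired_gro_types = desired_gro_types;

	for (i = 0; i < TOY_GRO_TYPE_MAX_NUM; i++) {
		if ((desired_gro_types & (UINT64_C(1) << i)) == 0)
			continue;
		if (toy_create_fns[i] == NULL)
			continue;	/* unsupported type: slot stays NULL */
		gro_tbl->tbls[i] = toy_create_fns[i](max_flow_num,
				max_item_per_flow);
		if (gro_tbl->tbls[i] == NULL) {
			/* roll back the tables created so far */
			toy_gro_tbl_destroy(gro_tbl);
			return NULL;
		}
	}
	return gro_tbl;
}
```

With this shape a caller only has to check the returned pointer for
NULL; a partially built table is never handed out.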


> +			else
> +				gro_tbl->tbls[i] = NULL;
> +		}
> +	}
> +	return gro_tbl;
> +}
> +
> +void rte_gro_tbl_destroy(struct rte_gro_tbl *gro_tbl)
> +{
> +	gro_tbl_destroy_fn destroy_tbl_fn;
> +	uint64_t gro_type_flag;
> +	uint8_t i;
> +
> +	if (gro_tbl == NULL)
> +		return;
> +	for (i = 0; i < GRO_TYPE_MAX_NUM; i++) {
> +		gro_type_flag = 1 << i;
> +		if (gro_tbl->desired_gro_types & gro_type_flag) {
> +			destroy_tbl_fn = tbl_destroy_functions[i];
> +			if (destroy_tbl_fn)
> +				destroy_tbl_fn(gro_tbl->tbls[i]);
> +			gro_tbl->tbls[i] = NULL;
> +		}
> +	}
> +	rte_free(gro_tbl);
> +}
> +
> +uint16_t
> +rte_gro_reassemble_burst(struct rte_mbuf **pkts __rte_unused,
> +		const uint16_t nb_pkts,
> +		const struct rte_gro_param param __rte_unused)
> +{
> +	return nb_pkts;
> +}
> +
> +int rte_gro_reassemble(struct rte_mbuf *pkt __rte_unused,
> +		struct rte_gro_tbl *gro_tbl __rte_unused)
> +{
> +	return -1;
> +}
> +
> +uint16_t rte_gro_flush(struct rte_gro_tbl *gro_tbl __rte_unused,
> +		uint64_t desired_gro_types __rte_unused,
> +		struct rte_mbuf **out __rte_unused,
> +		const uint16_t max_nb_out __rte_unused)
> +{
> +	return 0;
> +}
> +
> +uint16_t
> +rte_gro_timeout_flush(struct rte_gro_tbl *gro_tbl __rte_unused,
> +		uint64_t desired_gro_types __rte_unused,
> +		struct rte_mbuf **out __rte_unused,
> +		const uint16_t max_nb_out __rte_unused)
> +{
> +	return 0;
> +}
> diff --git a/lib/librte_gro/rte_gro.h b/lib/librte_gro/rte_gro.h
> new file mode 100644
> index 0000000..f9d36e8
> --- /dev/null
> +++ b/lib/librte_gro/rte_gro.h
> @@ -0,0 +1,205 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of Intel Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#ifndef _RTE_GRO_H_
> +#define _RTE_GRO_H_
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/**
> + * the max packets number that rte_gro_reassemble_burst can
> + * process in each invocation.
> + */
> +#define GRO_MAX_BURST_ITEM_NUM 1024UL
> +
> +/* max number of supported GRO types */
> +#define GRO_TYPE_MAX_NUM 64
> +#define GRO_TYPE_SUPPORT_NUM 0	/**< current supported GRO num */

Here and everywhere, public macros should start with the RTE_ prefix to follow
DPDK coding style.

> +
> +/**
> + * GRO table, which is used to merge packets. It keeps many reassembly
> + * tables of desired GRO types. Applications need to create GRO tables
> + * before using rte_gro_reassemble to perform GRO.
> + */
> +struct rte_gro_tbl {
> +	uint64_t desired_gro_types;	/**< GRO types to perform */
> +	/* max TTL measured in nanosecond */
> +	uint64_t max_timeout_cycles;
> +	/* max length of merged packet measured in byte */
> +	uint32_t max_packet_size;
> +	/* reassebly tables of desired GRO types */
> +	void *tbls[GRO_TYPE_MAX_NUM];
> +};

I am not sure why you need to define that structure here.
As I understand it, the structure is internal to the library,
so a forward declaration should be enough.
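
The forward-declaration approach can be sketched in a single file as
below. The split between "header" and ".c" parts is marked with
comments; all opaque_* names are hypothetical and the layout is only an
illustration, not the structure a later revision of the patch uses. The
internal hook typedef lives in the private part as well, so
applications never see it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* ---- what a public header would expose ---- */
struct opaque_gro_tbl;	/* incomplete type: callers hold only a pointer */

struct opaque_gro_tbl *opaque_gro_tbl_create(uint32_t max_packet_size);
uint32_t opaque_gro_tbl_max_packet_size(const struct opaque_gro_tbl *tbl);
void opaque_gro_tbl_destroy(struct opaque_gro_tbl *tbl);

/* ---- what would stay private in the .c file ---- */
typedef void (*opaque_tbl_destroy_fn)(void *tbl);

struct opaque_gro_tbl {
	uint32_t max_packet_size;
	opaque_tbl_destroy_fn destroy_fn;
	void *priv;		/* stand-in for a per-type table */
};

struct opaque_gro_tbl *
opaque_gro_tbl_create(uint32_t max_packet_size)
{
	struct opaque_gro_tbl *tbl = calloc(1, sizeof(*tbl));

	if (tbl == NULL)
		return NULL;
	tbl->max_packet_size = max_packet_size;
	tbl->priv = malloc(64);
	if (tbl->priv == NULL) {
		free(tbl);
		return NULL;
	}
	tbl->destroy_fn = free;
	return tbl;
}

uint32_t
opaque_gro_tbl_max_packet_size(const struct opaque_gro_tbl *tbl)
{
	assert(tbl != NULL);
	return tbl->max_packet_size;
}

void
opaque_gro_tbl_destroy(struct opaque_gro_tbl *tbl)
{
	if (tbl == NULL)
		return;
	if (tbl->destroy_fn != NULL)
		tbl->destroy_fn(tbl->priv);
	free(tbl);
}
```

Because callers cannot see the fields, the structure can change between
releases without breaking the ABI; the cost is an accessor call for any
field an application needs to read.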

> +
> +struct rte_gro_param {
> +	uint64_t desired_gro_types;	/**< desired GRO types */
> +	uint32_t max_packet_size;	/**< max length of merged packets */
> +	uint16_t max_flow_num;	/**< max flow number */
> +	uint16_t max_item_per_flow;	/**< max packet number per flow */
> +};
> +
> +typedef void *(*gro_tbl_create_fn)(uint16_t socket_id,
> +		uint16_t max_flow_num,
> +		uint16_t max_item_per_flow);
> +typedef void (*gro_tbl_destroy_fn)(void *tbl);

Same here: users probably shouldn't see these typedefs,
so it is better to hide them inside the library.

> +
> +/**
> + * This function create a GRO table, which is used to merge packets.
> + *
> + * @param socket_id
> + *  socket index where the Ethernet port connects to.
> + * @param max_flow_num
> + *  max number of flows in the GRO table.
> + * @param max_item_per_flow
> + *  max packet number per flow. We use the value of (max_flow_num *
> + *  max_item_per_fow) to calculate table size.
> + * @param max_packet_size
> + *  max length of merged packets. Measured in byte.
> + * @param max_timeout_cycles
> + *  max TTL for a packet in the GRO table. It's measured in nanosecond.
> + * @param desired_gro_types
> + *  GRO types to perform.
> + * @return
> + *  if create successfully, return a pointer which points to the GRO
> + *  table. Otherwise, return NULL.
> + */
> +struct rte_gro_tbl *rte_gro_tbl_create(uint16_t socket_id,
> +		uint16_t max_flow_num,
> +		uint16_t max_item_per_flow,
> +		uint32_t max_packet_size,
> +		uint64_t max_timeout_cycles,
> +		uint64_t desired_gro_types);

Hm, couldn't we take a struct rte_gro_tbl_param * here instead of half a dozen arguments?
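
A parameter structure along those lines might look like the sketch
below. The field set simply mirrors the six arguments of the quoted
prototype; struct toy_gro_tbl_param, the check function and its
validation rules are assumptions made for illustration. The structure
is passed by const pointer rather than by value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical parameter block; fields mirror the quoted prototype. */
struct toy_gro_tbl_param {
	uint64_t desired_gro_types;	/* GRO types to perform */
	uint64_t max_timeout_cycles;	/* max time an item may stay */
	uint32_t max_packet_size;	/* max merged length, in bytes */
	uint16_t socket_id;
	uint16_t max_flow_num;
	uint16_t max_item_per_flow;
};

/* Return 0 if the parameters are usable, -1 otherwise. */
static int
toy_gro_tbl_param_check(const struct toy_gro_tbl_param *param)
{
	if (param == NULL)
		return -1;
	if (param->desired_gro_types == 0)
		return -1;
	if (param->max_flow_num == 0 || param->max_item_per_flow == 0)
		return -1;
	if (param->max_packet_size == 0)
		return -1;
	return 0;
}

/*
 * Table size as described in the doc comment: max_flow_num times
 * max_item_per_flow. The product is computed in 32 bits because it
 * does not fit in uint16_t in general.
 */
static uint32_t
toy_gro_tbl_item_num(const struct toy_gro_tbl_param *param)
{
	assert(param != NULL);
	return (uint32_t)param->max_flow_num * param->max_item_per_flow;
}
```

New fields can then be added to the structure later without changing
the prototype of the create function.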

> +/**
> + * This function destroys a GRO table.
> + */
> +void rte_gro_tbl_destroy(struct rte_gro_tbl *gro_tbl);
> +
> +/**
> + * This is one of the main reassembly APIs, which merges numbers of
> + * packets at a time. It assumes that all inputted packets are with
> + * correct checksums. That is, applications should guarantee all
> + * inputted packets are correct. Besides, it doesn't re-calculate
> + * checksums for merged packets. If inputted packets are IP fragmented,
> + * this function assumes them are complete (i.e. with L4 header). After
> + * finishing processing, it returns all GROed packets to  applications
> + * immediately.
> + *
> + * @param pkts
> + *  a pointer array which points to the packets to reassemble. Besides,
> + *  it keeps addresses of GROed packets.
> + * @param nb_pkts
> + *  the number of packets to reassemble.
> + * @param param
> + *  applications use it to tell rte_gro_reassemble_burst what rules
> + *  are demanded.
> + * @return
> + *  the number of packets after GROed.
> + */
> +uint16_t rte_gro_reassemble_burst(struct rte_mbuf **pkts,
> +		const uint16_t nb_pkts,

Here and everywhere: there is not much point in declaring an integer input parameter
(or any other parameter passed by value) as const.

> +		const struct rte_gro_param param);

You probably meant 'const struct rte_gro_param *param' here?

> +
> +/**
> + * Reassembly function, which tries to merge the inputted packet with
> + * one packet in a given GRO table. This function assumes the inputted
> + * packet is with correct checksums. And it won't update checksums if
> + * two packets are merged. Besides, if the inputted packet is IP
> + * fragmented, this function assumes it's a complete packet (i.e. with
> + * L4 header).
> + *
> + * If the inputted packet doesn't have data or it's with unsupported GRO
> + * type, function returns immediately. Otherwise, the inputted packet is
> + * either merged or inserted into the table. If applications want get
> + * packets in the table, they need to call flush APIs.
> + *
> + * @param pkt
> + *  packet to reassemble.
> + * @param gro_tbl
> + *  a pointer points to a GRO table.
> + * @return
> + *  if merge the packet successfully, return a positive value. If fail
> + *  to merge, return zero. If the packet doesn't have data, or its GRO
> + *  type is unsupported, return a negative value.
> + */
> +int rte_gro_reassemble(struct rte_mbuf *pkt,
> +		struct rte_gro_tbl *gro_tbl);


OK, and why can't the table-based one process bursts?


> +
> +/**
> + * This function flushed packets from reassembly tables of desired GRO
> + * types. It won't re-calculate checksums for merged packets in the
> + * tables. That is, the returned packets may be with wrong checksums.
> + *
> + * @param gro_tbl
> + *  a pointer points to a GRO table object.
> + * @param desired_gro_types
> + *  GRO types whose packets will be flushed.
> + * @param out
> + *  a pointer array that is used to keep flushed packets.
> + * @param nb_out
> + *  the size of out.
> + * @return
> + *  the number of flushed packets. If no packets are flushed, return 0.
> + */
> +uint16_t rte_gro_flush(struct rte_gro_tbl *gro_tbl,
> +		uint64_t desired_gro_types,
> +		struct rte_mbuf **out,
> +		const uint16_t max_nb_out);
> +
> +/**
> + * This function flushes the timeout packets from reassembly tables of
> + * desired GRO types. It won't re-calculate checksums for merged packets
> + * in the tables. That is, the returned packets may be with wrong
> + * checksums.
> + *
> + * @param gro_tbl
> + *  a pointer points to a GRO table object.
> + * @param desired_gro_types
> + * rte_gro_timeout_flush only processes packets which belong to the
> + * GRO types specified by desired_gro_types.
> + * @param out
> + *  a pointer array that is used to keep flushed packets.
> + * @param nb_out
> + *  the size of out.
> + * @return
> + *  the number of flushed packets. If no packets are flushed, return 0.
> + */
> +uint16_t rte_gro_timeout_flush(struct rte_gro_tbl *gro_tbl,
> +		uint64_t desired_gro_types,
> +		struct rte_mbuf **out,
> +		const uint16_t max_nb_out);

There is no point in having two flush() functions.
I suggest merging them into one.
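
One way the two functions could collapse into one is to give the flush
call a timeout argument and treat a timeout of 0 as "flush everything".
The sketch below shows that idea on a toy table; the toy_* types, the
explicit "now" argument and the use of an int as a stand-in for an mbuf
pointer are all assumptions, and the desired_gro_types filter is left
out for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_TBL_SIZE 8

struct toy_item {
	int pkt_id;		/* stand-in for an mbuf pointer */
	uint64_t start_time;	/* when the item entered the table */
	int in_use;
};

struct toy_tbl {
	struct toy_item items[TOY_TBL_SIZE];
};

/*
 * Flush items that have stayed in the table for at least
 * timeout_cycles, measured against now. A timeout of 0 flushes every
 * item. Assumes now is never earlier than an item's start_time.
 * Returns the number of items written to out[].
 */
static uint16_t
toy_gro_flush(struct toy_tbl *tbl, uint64_t now, uint64_t timeout_cycles,
		int *out, uint16_t max_nb_out)
{
	uint16_t n = 0;
	unsigned int i;

	assert(tbl != NULL && out != NULL);
	for (i = 0; i < TOY_TBL_SIZE && n < max_nb_out; i++) {
		struct toy_item *it = &tbl->items[i];

		if (!it->in_use)
			continue;
		if (now - it->start_time < timeout_cycles)
			continue;	/* not old enough yet */
		out[n++] = it->pkt_id;
		it->in_use = 0;
	}
	return n;
}
```

A caller that wants the old rte_gro_flush behaviour would pass a
timeout of 0, and one that wants the rte_gro_timeout_flush behaviour
would pass the configured maximum age.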

> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif
> diff --git a/lib/librte_gro/rte_gro_version.map b/lib/librte_gro/rte_gro_version.map
> new file mode 100644
> index 0000000..827596b
> --- /dev/null
> +++ b/lib/librte_gro/rte_gro_version.map
> @@ -0,0 +1,12 @@
> +DPDK_17.08 {
> +	global:
> +
> +	rte_gro_tbl_create;
> +	rte_gro_tbl_destroy;
> +	rte_gro_reassemble_burst;
> +	rte_gro_reassemble;
> +	rte_gro_flush;
> +	rte_gro_timeout_flush;
> +
> +	local: *;
> +};
> diff --git a/mk/rte.app.mk b/mk/rte.app.mk
> index bcaf1b3..fc3776d 100644
> --- a/mk/rte.app.mk
> +++ b/mk/rte.app.mk
> @@ -98,6 +98,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_RING)           += -lrte_ring
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_EAL)            += -lrte_eal
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_CMDLINE)        += -lrte_cmdline
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_REORDER)        += -lrte_reorder
> +_LDLIBS-$(CONFIG_RTE_LIBRTE_GRO)        	+= -lrte_gro
> 
>  ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_KNI)            += -lrte_kni
> --
> 2.7.4
  
Hu, Jiayu June 28, 2017, 2:17 a.m. UTC | #2
Hi Konstantin,

On Wed, Jun 28, 2017 at 07:42:01AM +0800, Ananyev, Konstantin wrote:
> Hi Jiayu,
> 
> > -----Original Message-----
> > From: Hu, Jiayu
> > Sent: Monday, June 26, 2017 7:44 AM
> > To: dev@dpdk.org
> > Cc: Tan, Jianfeng <jianfeng.tan@intel.com>; Ananyev, Konstantin <konstantin.ananyev@intel.com>; stephen@networkplumber.org;
> > yliu@fridaylinux.org; Wu, Jingjing <jingjing.wu@intel.com>; Yao, Lei A <lei.a.yao@intel.com>; Wiles, Keith <keith.wiles@intel.com>; Bie,
> > Tiwei <tiwei.bie@intel.com>; Hu, Jiayu <jiayu.hu@intel.com>
> > Subject: [PATCH v7 1/3] lib: add Generic Receive Offload API framework
> > 
> > Generic Receive Offload (GRO) is a widely used SW-based offloading
> > technique to reduce per-packet processing overhead. It gains
> > performance by reassembling small packets into large ones. This
> > patchset is to support GRO in DPDK. To support GRO, this patch
> > implements a GRO API framework.
> > 
> > To enable more flexibility to applications, DPDK GRO is implemented as
> > a user library. Applications explicitly use the GRO library to merge
> > small packets into large ones. DPDK GRO provides two reassembly modes.
> > One is called lightweight mode, the other is called heavyweight mode.
> > If applications want to merge packets in a simple way and the number
> > of packets is relatively small, they can use the lightweight mode.
> > If applications need more fine-grained controls, they can choose the
> > heavyweight mode.
> > 
> > rte_gro_reassemble_burst is the main reassembly API which is used in
> > lightweight mode and processes N packets at a time. For applications,
> > performing GRO in lightweight mode is simple. They just need to invoke
> > rte_gro_reassemble_burst. Applications can get GROed packets as soon as
> > rte_gro_reassemble_burst returns.
> > 
> > rte_gro_reassemble is the main reassembly API which is used in
> > heavyweight mode and processes one packet at a time. For applications,
> > performing GRO in heavyweight mode is relatively complicated. Before
> > performing GRO, applications need to create a GRO table by
> > rte_gro_tbl_create. Then they can use rte_gro_reassemble to process
> > packets one by one. The processed packets are in the GRO table. If
> > applications want to get them, applications need to manually flush
> > them by flush APIs.
> > 
> > Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
> > ---
> >  config/common_base                 |   5 +
> >  lib/Makefile                       |   2 +
> >  lib/librte_gro/Makefile            |  50 +++++++++
> >  lib/librte_gro/rte_gro.c           | 125 ++++++++++++++++++++++
> >  lib/librte_gro/rte_gro.h           | 205 +++++++++++++++++++++++++++++++++++++
> >  lib/librte_gro/rte_gro_version.map |  12 +++
> >  mk/rte.app.mk                      |   1 +
> >  7 files changed, 400 insertions(+)
> >  create mode 100644 lib/librte_gro/Makefile
> >  create mode 100644 lib/librte_gro/rte_gro.c
> >  create mode 100644 lib/librte_gro/rte_gro.h
> >  create mode 100644 lib/librte_gro/rte_gro_version.map
> > 
> > diff --git a/config/common_base b/config/common_base
> > index f6aafd1..167f5ef 100644
> > --- a/config/common_base
> > +++ b/config/common_base
> > @@ -712,6 +712,11 @@ CONFIG_RTE_LIBRTE_VHOST_DEBUG=n
> >  CONFIG_RTE_LIBRTE_PMD_VHOST=n
> > 
> >  #
> > +# Compile GRO library
> > +#
> > +CONFIG_RTE_LIBRTE_GRO=y
> > +
> > +#
> >  #Compile Xen domain0 support
> >  #
> >  CONFIG_RTE_LIBRTE_XEN_DOM0=n
> > diff --git a/lib/Makefile b/lib/Makefile
> > index 07e1fd0..ac1c2f6 100644
> > --- a/lib/Makefile
> > +++ b/lib/Makefile
> > @@ -106,6 +106,8 @@ DIRS-$(CONFIG_RTE_LIBRTE_REORDER) += librte_reorder
> >  DEPDIRS-librte_reorder := librte_eal librte_mempool librte_mbuf
> >  DIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += librte_pdump
> >  DEPDIRS-librte_pdump := librte_eal librte_mempool librte_mbuf librte_ether
> > +DIRS-$(CONFIG_RTE_LIBRTE_GRO) += librte_gro
> > +DEPDIRS-librte_gro := librte_eal librte_mbuf librte_ether librte_net
> > 
> >  ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
> >  DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni
> > diff --git a/lib/librte_gro/Makefile b/lib/librte_gro/Makefile
> > new file mode 100644
> > index 0000000..7e0f128
> > --- /dev/null
> > +++ b/lib/librte_gro/Makefile
> > @@ -0,0 +1,50 @@
> > +#   BSD LICENSE
> > +#
> > +#   Copyright(c) 2017 Intel Corporation. All rights reserved.
> > +#   All rights reserved.
> > +#
> > +#   Redistribution and use in source and binary forms, with or without
> > +#   modification, are permitted provided that the following conditions
> > +#   are met:
> > +#
> > +#     * Redistributions of source code must retain the above copyright
> > +#       notice, this list of conditions and the following disclaimer.
> > +#     * Redistributions in binary form must reproduce the above copyright
> > +#       notice, this list of conditions and the following disclaimer in
> > +#       the documentation and/or other materials provided with the
> > +#       distribution.
> > +#     * Neither the name of Intel Corporation nor the names of its
> > +#       contributors may be used to endorse or promote products derived
> > +#       from this software without specific prior written permission.
> > +#
> > +#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> > +#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> > +#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> > +#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> > +#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> > +#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> > +#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> > +#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> > +#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> > +#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> > +#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> > +
> > +include $(RTE_SDK)/mk/rte.vars.mk
> > +
> > +# library name
> > +LIB = librte_gro.a
> > +
> > +CFLAGS += -O3
> > +CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
> > +
> > +EXPORT_MAP := rte_gro_version.map
> > +
> > +LIBABIVER := 1
> > +
> > +# source files
> > +SRCS-$(CONFIG_RTE_LIBRTE_GRO) += rte_gro.c
> > +
> > +# install this header file
> > +SYMLINK-$(CONFIG_RTE_LIBRTE_GRO)-include += rte_gro.h
> > +
> > +include $(RTE_SDK)/mk/rte.lib.mk
> > diff --git a/lib/librte_gro/rte_gro.c b/lib/librte_gro/rte_gro.c
> > new file mode 100644
> > index 0000000..33275e8
> > --- /dev/null
> > +++ b/lib/librte_gro/rte_gro.c
> > @@ -0,0 +1,125 @@
> > +/*-
> > + *   BSD LICENSE
> > + *
> > + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> > + *
> > + *   Redistribution and use in source and binary forms, with or without
> > + *   modification, are permitted provided that the following conditions
> > + *   are met:
> > + *
> > + *     * Redistributions of source code must retain the above copyright
> > + *       notice, this list of conditions and the following disclaimer.
> > + *     * Redistributions in binary form must reproduce the above copyright
> > + *       notice, this list of conditions and the following disclaimer in
> > + *       the documentation and/or other materials provided with the
> > + *       distribution.
> > + *     * Neither the name of Intel Corporation nor the names of its
> > + *       contributors may be used to endorse or promote products derived
> > + *       from this software without specific prior written permission.
> > + *
> > + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> > + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> > + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> > + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> > + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> > + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> > + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> > + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> > + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> > + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> > + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> > + */
> > +
> > +#include <rte_malloc.h>
> > +#include <rte_mbuf.h>
> > +
> > +#include "rte_gro.h"
> > +
> > +static gro_tbl_create_fn tbl_create_functions[GRO_TYPE_MAX_NUM];
> > +static gro_tbl_destroy_fn tbl_destroy_functions[GRO_TYPE_MAX_NUM];
> > +
> > +struct rte_gro_tbl *rte_gro_tbl_create(uint16_t socket_id,
> > +		uint16_t max_flow_num,
> > +		uint16_t max_item_per_flow,
> > +		uint32_t max_packet_size,
> > +		uint64_t max_timeout_cycles,
> > +		uint64_t desired_gro_types)
> > +{
> > +	gro_tbl_create_fn create_tbl_fn;
> > +	struct rte_gro_tbl *gro_tbl;
> > +	uint64_t gro_type_flag = 0;
> > +	uint8_t i;
> > +
> > +	gro_tbl = rte_zmalloc_socket(__func__,
> > +			sizeof(struct rte_gro_tbl),
> > +			RTE_CACHE_LINE_SIZE,
> > +			socket_id);
> > +	gro_tbl->max_packet_size = max_packet_size;
> > +	gro_tbl->max_timeout_cycles = max_timeout_cycles;
> > +	gro_tbl->desired_gro_types = desired_gro_types;
> > +
> > +	for (i = 0; i < GRO_TYPE_MAX_NUM; i++) {
> > +		gro_type_flag = 1 << i;
> > +		if (desired_gro_types & gro_type_flag) {
> > +			create_tbl_fn = tbl_create_functions[i];
> > +			if (create_tbl_fn)
> > +				create_tbl_fn(socket_id,
> > +						max_flow_num,
> > +						max_item_per_flow);
> 
> As I understand, create_tbl_fn() can fail.
> You should handle that situation correctly.

Thanks, I will add a failure check.
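
A minimal sketch of what such a failure check could look like, assuming
plain calloc()/free() in place of rte_zmalloc_socket()/rte_free() so that
it builds without DPDK. All sketch_* and stub_* names are hypothetical and
not part of the patch. It checks the top-level allocation, stores the
result of each per-type create callback, and rolls back the tables already
created when one callback fails. It also builds the type flag with a
64-bit constant, since shifting a plain int by 32 or more bits is
undefined in C.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_GRO_TYPE_MAX_NUM 64

typedef void *(*sketch_create_fn)(uint16_t socket_id,
		uint16_t max_flow_num, uint16_t max_item_per_flow);
typedef void (*sketch_destroy_fn)(void *tbl);

/* Stand-in for struct rte_gro_tbl (hypothetical, reduced layout). */
struct sketch_gro_tbl {
	uint64_t desired_gro_types;
	void *tbls[SKETCH_GRO_TYPE_MAX_NUM];
};

static sketch_create_fn sketch_create_fns[SKETCH_GRO_TYPE_MAX_NUM];
static sketch_destroy_fn sketch_destroy_fns[SKETCH_GRO_TYPE_MAX_NUM];

/* Stub callbacks, only here to exercise the success and failure paths. */
int stub_destroy_calls;

void *stub_create_ok(uint16_t socket_id, uint16_t max_flow_num,
		uint16_t max_item_per_flow)
{
	(void)socket_id; (void)max_flow_num; (void)max_item_per_flow;
	return malloc(1);
}

void *stub_create_fail(uint16_t socket_id, uint16_t max_flow_num,
		uint16_t max_item_per_flow)
{
	(void)socket_id; (void)max_flow_num; (void)max_item_per_flow;
	return NULL;
}

void stub_destroy(void *tbl)
{
	assert(tbl != NULL);
	stub_destroy_calls++;
	free(tbl);
}

struct sketch_gro_tbl *
sketch_gro_tbl_create(uint16_t socket_id, uint16_t max_flow_num,
		uint16_t max_item_per_flow, uint64_t desired_gro_types)
{
	struct sketch_gro_tbl *gro_tbl;
	unsigned int i;

	gro_tbl = calloc(1, sizeof(*gro_tbl));
	if (gro_tbl == NULL)
		return NULL;
	gro_tbl->desired_gro_types = desired_gro_types;

	for (i = 0; i < SKETCH_GRO_TYPE_MAX_NUM; i++) {
		if ((desired_gro_types & (UINT64_C(1) << i)) == 0)
			continue;
		if (sketch_create_fns[i] == NULL)
			continue;	/* type not registered, tbls[i] stays NULL */
		gro_tbl->tbls[i] = sketch_create_fns[i](socket_id,
				max_flow_num, max_item_per_flow);
		if (gro_tbl->tbls[i] == NULL)
			goto rollback;
	}
	return gro_tbl;

rollback:
	/* Destroy the per-type tables created before index i. */
	while (i-- > 0) {
		if (gro_tbl->tbls[i] != NULL && sketch_destroy_fns[i] != NULL)
			sketch_destroy_fns[i](gro_tbl->tbls[i]);
	}
	free(gro_tbl);
	return NULL;
}
```

With this shape the caller only has to test the return value against
NULL, which matches what the header comment of rte_gro_tbl_create
already promises.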

> 
> 
> > +			else
> > +				gro_tbl->tbls[i] = NULL;
> > +		}
> > +	}
> > +	return gro_tbl;
> > +}
> > +
> > +void rte_gro_tbl_destroy(struct rte_gro_tbl *gro_tbl)
> > +{
> > +	gro_tbl_destroy_fn destroy_tbl_fn;
> > +	uint64_t gro_type_flag;
> > +	uint8_t i;
> > +
> > +	if (gro_tbl == NULL)
> > +		return;
> > +	for (i = 0; i < GRO_TYPE_MAX_NUM; i++) {
> > +		gro_type_flag = 1 << i;
> > +		if (gro_tbl->desired_gro_types & gro_type_flag) {
> > +			destroy_tbl_fn = tbl_destroy_functions[i];
> > +			if (destroy_tbl_fn)
> > +				destroy_tbl_fn(gro_tbl->tbls[i]);
> > +			gro_tbl->tbls[i] = NULL;
> > +		}
> > +	}
> > +	rte_free(gro_tbl);
> > +}
> > +
> > +uint16_t
> > +rte_gro_reassemble_burst(struct rte_mbuf **pkts __rte_unused,
> > +		const uint16_t nb_pkts,
> > +		const struct rte_gro_param param __rte_unused)
> > +{
> > +	return nb_pkts;
> > +}
> > +
> > +int rte_gro_reassemble(struct rte_mbuf *pkt __rte_unused,
> > +		struct rte_gro_tbl *gro_tbl __rte_unused)
> > +{
> > +	return -1;
> > +}
> > +
> > +uint16_t rte_gro_flush(struct rte_gro_tbl *gro_tbl __rte_unused,
> > +		uint64_t desired_gro_types __rte_unused,
> > +		struct rte_mbuf **out __rte_unused,
> > +		const uint16_t max_nb_out __rte_unused)
> > +{
> > +	return 0;
> > +}
> > +
> > +uint16_t
> > +rte_gro_timeout_flush(struct rte_gro_tbl *gro_tbl __rte_unused,
> > +		uint64_t desired_gro_types __rte_unused,
> > +		struct rte_mbuf **out __rte_unused,
> > +		const uint16_t max_nb_out __rte_unused)
> > +{
> > +	return 0;
> > +}
> > diff --git a/lib/librte_gro/rte_gro.h b/lib/librte_gro/rte_gro.h
> > new file mode 100644
> > index 0000000..f9d36e8
> > --- /dev/null
> > +++ b/lib/librte_gro/rte_gro.h
> > @@ -0,0 +1,205 @@
> > +/*-
> > + *   BSD LICENSE
> > + *
> > + *   Copyright(c) 2017 Intel Corporation. All rights reserved.
> > + *
> > + *   Redistribution and use in source and binary forms, with or without
> > + *   modification, are permitted provided that the following conditions
> > + *   are met:
> > + *
> > + *     * Redistributions of source code must retain the above copyright
> > + *       notice, this list of conditions and the following disclaimer.
> > + *     * Redistributions in binary form must reproduce the above copyright
> > + *       notice, this list of conditions and the following disclaimer in
> > + *       the documentation and/or other materials provided with the
> > + *       distribution.
> > + *     * Neither the name of Intel Corporation nor the names of its
> > + *       contributors may be used to endorse or promote products derived
> > + *       from this software without specific prior written permission.
> > + *
> > + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> > + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> > + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> > + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> > + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> > + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> > + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> > + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> > + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> > + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> > + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> > + */
> > +
> > +#ifndef _RTE_GRO_H_
> > +#define _RTE_GRO_H_
> > +
> > +#ifdef __cplusplus
> > +extern "C" {
> > +#endif
> > +
> > +/**
> > + * the max packets number that rte_gro_reassemble_burst can
> > + * process in each invocation.
> > + */
> > +#define GRO_MAX_BURST_ITEM_NUM 1024UL
> > +
> > +/* max number of supported GRO types */
> > +#define GRO_TYPE_MAX_NUM 64
> > +#define GRO_TYPE_SUPPORT_NUM 0	/**< current supported GRO num */
> 
> Here and everywhere, public macros should start with the RTE_ prefix to
> follow the DPDK coding style.

Thanks, I will change the names.

> 
> > +
> > +/**
> > + * GRO table, which is used to merge packets. It keeps many reassembly
> > + * tables of desired GRO types. Applications need to create GRO tables
> > + * before using rte_gro_reassemble to perform GRO.
> > + */
> > +struct rte_gro_tbl {
> > +	uint64_t desired_gro_types;	/**< GRO types to perform */
> > +	/* max TTL measured in nanosecond */
> > +	uint64_t max_timeout_cycles;
> > +	/* max length of merged packet measured in byte */
> > +	uint32_t max_packet_size;
> > +	/* reassebly tables of desired GRO types */
> > +	void *tbls[GRO_TYPE_MAX_NUM];
> > +};
> 
> Not sure why you need to define that structure here.
> As I understand, it is internal to the library.
> Just a declaration should be enough.

This structure defines a GRO table, which is used by rte_gro_reassemble
to merge packets. Applications need to create this table before calling
rte_gro_reassemble. So I define it in rte_gro.h.

> 
> > +
> > +struct rte_gro_param {
> > +	uint64_t desired_gro_types;	/**< desired GRO types */
> > +	uint32_t max_packet_size;	/**< max length of merged packets */
> > +	uint16_t max_flow_num;	/**< max flow number */
> > +	uint16_t max_item_per_flow;	/**< max packet number per flow */
> > +};
> > +
> > +typedef void *(*gro_tbl_create_fn)(uint16_t socket_id,
> > +		uint16_t max_flow_num,
> > +		uint16_t max_item_per_flow);
> > +typedef void (*gro_tbl_destroy_fn)(void *tbl);
> 
> Same here - the user probably shouldn't see these typedefs,
> so better to hide them inside.

Thanks, I will hide them inside.

> 
> > +
> > +/**
> > + * This function create a GRO table, which is used to merge packets.
> > + *
> > + * @param socket_id
> > + *  socket index where the Ethernet port connects to.
> > + * @param max_flow_num
> > + *  max number of flows in the GRO table.
> > + * @param max_item_per_flow
> > + *  max packet number per flow. We use the value of (max_flow_num *
> > + *  max_item_per_fow) to calculate table size.
> > + * @param max_packet_size
> > + *  max length of merged packets. Measured in byte.
> > + * @param max_timeout_cycles
> > + *  max TTL for a packet in the GRO table. It's measured in nanosecond.
> > + * @param desired_gro_types
> > + *  GRO types to perform.
> > + * @return
> > + *  if create successfully, return a pointer which points to the GRO
> > + *  table. Otherwise, return NULL.
> > + */
> > +struct rte_gro_tbl *rte_gro_tbl_create(uint16_t socket_id,
> > +		uint16_t max_flow_num,
> > +		uint16_t max_item_per_flow,
> > +		uint32_t max_packet_size,
> > +		uint64_t max_timeout_cycles,
> > +		uint64_t desired_gro_types);
> 
> Hm, couldn't we have struct rte_gro_tbl_param * here instead of a dozen arguments?

Thanks, I will change it.

> 
> > +/**
> > + * This function destroys a GRO table.
> > + */
> > +void rte_gro_tbl_destroy(struct rte_gro_tbl *gro_tbl);
> > +
> > +/**
> > + * This is one of the main reassembly APIs, which merges numbers of
> > + * packets at a time. It assumes that all inputted packets are with
> > + * correct checksums. That is, applications should guarantee all
> > + * inputted packets are correct. Besides, it doesn't re-calculate
> > + * checksums for merged packets. If inputted packets are IP fragmented,
> > + * this function assumes them are complete (i.e. with L4 header). After
> > + * finishing processing, it returns all GROed packets to  applications
> > + * immediately.
> > + *
> > + * @param pkts
> > + *  a pointer array which points to the packets to reassemble. Besides,
> > + *  it keeps addresses of GROed packets.
> > + * @param nb_pkts
> > + *  the number of packets to reassemble.
> > + * @param param
> > + *  applications use it to tell rte_gro_reassemble_burst what rules
> > + *  are demanded.
> > + * @return
> > + *  the number of packets after GROed.
> > + */
> > +uint16_t rte_gro_reassemble_burst(struct rte_mbuf **pkts,
> > +		const uint16_t nb_pkts,
> 
> Here and everywhere - not much point in defining an integer input parameter
> (or any other that is passed by value) as const.

Thanks. I will modify it.

> 
> > +		const struct rte_gro_param param);
> 
> You probably meant 'const struct rte_gro_param *param' here?

'const struct rte_gro_param *param' is better. I will modify it.
Thanks.

> 
> > +
> > +/**
> > + * Reassembly function, which tries to merge the inputted packet with
> > + * one packet in a given GRO table. This function assumes the inputted
> > + * packet is with correct checksums. And it won't update checksums if
> > + * two packets are merged. Besides, if the inputted packet is IP
> > + * fragmented, this function assumes it's a complete packet (i.e. with
> > + * L4 header).
> > + *
> > + * If the inputted packet doesn't have data or it's with unsupported GRO
> > + * type, function returns immediately. Otherwise, the inputted packet is
> > + * either merged or inserted into the table. If applications want get
> > + * packets in the table, they need to call flush APIs.
> > + *
> > + * @param pkt
> > + *  packet to reassemble.
> > + * @param gro_tbl
> > + *  a pointer points to a GRO table.
> > + * @return
> > + *  if merge the packet successfully, return a positive value. If fail
> > + *  to merge, return zero. If the packet doesn't have data, or its GRO
> > + *  type is unsupported, return a negative value.
> > + */
> > +int rte_gro_reassemble(struct rte_mbuf *pkt,
> > +		struct rte_gro_tbl *gro_tbl);
> 
> 
> Ok, and why can't the tbl one do bursts?

In the current design, applications that want to do bursts don't need to
create a gro_tbl. rte_gro_reassemble_burst creates a temporary table on
the stack. So for bursts (we call it lightweight mode), the application's
job is very simple: call rte_gro_reassemble_burst. After
rte_gro_reassemble_burst returns, the application gets all merged
packets. rte_gro_reassemble is the API for the other mode, called
heavyweight mode. The gro_tbl is only used by rte_gro_reassemble, which
processes one packet at a time.

So you mean we should enable rte_gro_reassemble to merge N input
packets with the packets in a given gro_tbl?

> 
> 
> > +
> > +/**
> > + * This function flushed packets from reassembly tables of desired GRO
> > + * types. It won't re-calculate checksums for merged packets in the
> > + * tables. That is, the returned packets may be with wrong checksums.
> > + *
> > + * @param gro_tbl
> > + *  a pointer points to a GRO table object.
> > + * @param desired_gro_types
> > + *  GRO types whose packets will be flushed.
> > + * @param out
> > + *  a pointer array that is used to keep flushed packets.
> > + * @param nb_out
> > + *  the size of out.
> > + * @return
> > + *  the number of flushed packets. If no packets are flushed, return 0.
> > + */
> > +uint16_t rte_gro_flush(struct rte_gro_tbl *gro_tbl,
> > +		uint64_t desired_gro_types,
> > +		struct rte_mbuf **out,
> > +		const uint16_t max_nb_out);
> > +
> > +/**
> > + * This function flushes the timeout packets from reassembly tables of
> > + * desired GRO types. It won't re-calculate checksums for merged packets
> > + * in the tables. That is, the returned packets may be with wrong
> > + * checksums.
> > + *
> > + * @param gro_tbl
> > + *  a pointer points to a GRO table object.
> > + * @param desired_gro_types
> > + * rte_gro_timeout_flush only processes packets which belong to the
> > + * GRO types specified by desired_gro_types.
> > + * @param out
> > + *  a pointer array that is used to keep flushed packets.
> > + * @param nb_out
> > + *  the size of out.
> > + * @return
> > + *  the number of flushed packets. If no packets are flushed, return 0.
> > + */
> > +uint16_t rte_gro_timeout_flush(struct rte_gro_tbl *gro_tbl,
> > +		uint64_t desired_gro_types,
> > +		struct rte_mbuf **out,
> > +		const uint16_t max_nb_out);
> 
> No point in having 2 flush() functions.
> I suggest merging them together.

rte_gro_flush flushes all packets from the table, but rte_gro_timeout_flush
only flushes timed-out packets. They do different things. If we merge them
together, should the merged function flush all packets or only the
timed-out ones?

> 
> > +#ifdef __cplusplus
> > +}
> > +#endif
> > +
> > +#endif
> > diff --git a/lib/librte_gro/rte_gro_version.map b/lib/librte_gro/rte_gro_version.map
> > new file mode 100644
> > index 0000000..827596b
> > --- /dev/null
> > +++ b/lib/librte_gro/rte_gro_version.map
> > @@ -0,0 +1,12 @@
> > +DPDK_17.08 {
> > +	global:
> > +
> > +	rte_gro_tbl_create;
> > +	rte_gro_tbl_destroy;
> > +	rte_gro_reassemble_burst;
> > +	rte_gro_reassemble;
> > +	rte_gro_flush;
> > +	rte_gro_timeout_flush;
> > +
> > +	local: *;
> > +};
> > diff --git a/mk/rte.app.mk b/mk/rte.app.mk
> > index bcaf1b3..fc3776d 100644
> > --- a/mk/rte.app.mk
> > +++ b/mk/rte.app.mk
> > @@ -98,6 +98,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_RING)           += -lrte_ring
> >  _LDLIBS-$(CONFIG_RTE_LIBRTE_EAL)            += -lrte_eal
> >  _LDLIBS-$(CONFIG_RTE_LIBRTE_CMDLINE)        += -lrte_cmdline
> >  _LDLIBS-$(CONFIG_RTE_LIBRTE_REORDER)        += -lrte_reorder
> > +_LDLIBS-$(CONFIG_RTE_LIBRTE_GRO)        	+= -lrte_gro
> > 
> >  ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
> >  _LDLIBS-$(CONFIG_RTE_LIBRTE_KNI)            += -lrte_kni
> > --
> > 2.7.4
  
Ananyev, Konstantin June 28, 2017, 5:41 p.m. UTC | #3
Hi Jiayu,

> 
> >
> > > +
> > > +/**
> > > + * GRO table, which is used to merge packets. It keeps many reassembly
> > > + * tables of desired GRO types. Applications need to create GRO tables
> > > + * before using rte_gro_reassemble to perform GRO.
> > > + */
> > > +struct rte_gro_tbl {
> > > +	uint64_t desired_gro_types;	/**< GRO types to perform */
> > > +	/* max TTL measured in nanosecond */
> > > +	uint64_t max_timeout_cycles;
> > > +	/* max length of merged packet measured in byte */
> > > +	uint32_t max_packet_size;
> > > +	/* reassebly tables of desired GRO types */
> > > +	void *tbls[GRO_TYPE_MAX_NUM];
> > > +};
> >
> > Not sure why you need to define that structure here.
> > As I understand, it is internal to the library.
> > Just a declaration should be enough.
> 
> This structure defines a GRO table, which is used by rte_gro_reassemble
> to merge packets. Applications need to create this table before calling
> rte_gro_reassemble. So I define it in rte_gro.h.

Yes, the application has to call gro_table_create().
But the application doesn't need to access the contents of struct rte_gro_tbl,
which means it can (and should) treat it as an opaque pointer.
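
The opaque-pointer pattern being described can be sketched as follows.
This is a generic C illustration with hypothetical sketch_* names and
calloc()/free() standing in for the DPDK allocator; it is not the
library's actual header. The public part only forward-declares the
struct, so applications can hold and pass the pointer but cannot depend
on its layout.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* ---- what the public header could expose ---- */
struct sketch_gro_tbl;	/* forward declaration only, layout stays private */

struct sketch_gro_tbl *sketch_gro_tbl_new(uint32_t max_packet_size);
uint32_t sketch_gro_tbl_max_packet_size(const struct sketch_gro_tbl *tbl);
void sketch_gro_tbl_free(struct sketch_gro_tbl *tbl);

/* ---- what would live only in the .c file ---- */
struct sketch_gro_tbl {
	uint32_t max_packet_size;
};

struct sketch_gro_tbl *
sketch_gro_tbl_new(uint32_t max_packet_size)
{
	struct sketch_gro_tbl *tbl = calloc(1, sizeof(*tbl));

	if (tbl != NULL)
		tbl->max_packet_size = max_packet_size;
	return tbl;
}

uint32_t
sketch_gro_tbl_max_packet_size(const struct sketch_gro_tbl *tbl)
{
	assert(tbl != NULL);
	return tbl->max_packet_size;
}

void
sketch_gro_tbl_free(struct sketch_gro_tbl *tbl)
{
	free(tbl);	/* free(NULL) is a no-op */
}
```

Because the layout is private, fields can be added or reordered later
without breaking the ABI seen by applications.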

> > > +
> > > +/**
> > > + * Reassembly function, which tries to merge the inputted packet with
> > > + * one packet in a given GRO table. This function assumes the inputted
> > > + * packet is with correct checksums. And it won't update checksums if
> > > + * two packets are merged. Besides, if the inputted packet is IP
> > > + * fragmented, this function assumes it's a complete packet (i.e. with
> > > + * L4 header).
> > > + *
> > > + * If the inputted packet doesn't have data or it's with unsupported GRO
> > > + * type, function returns immediately. Otherwise, the inputted packet is
> > > + * either merged or inserted into the table. If applications want get
> > > + * packets in the table, they need to call flush APIs.
> > > + *
> > > + * @param pkt
> > > + *  packet to reassemble.
> > > + * @param gro_tbl
> > > + *  a pointer points to a GRO table.
> > > + * @return
> > > + *  if merge the packet successfully, return a positive value. If fail
> > > + *  to merge, return zero. If the packet doesn't have data, or its GRO
> > > + *  type is unsupported, return a negative value.
> > > + */
> > > +int rte_gro_reassemble(struct rte_mbuf *pkt,
> > > +		struct rte_gro_tbl *gro_tbl);
> >
> >
> > Ok, and why can't the tbl one do bursts?
> 
> In the current design, applications that want to do bursts don't need to
> create a gro_tbl. rte_gro_reassemble_burst creates a temporary table on
> the stack. So for bursts (we call it lightweight mode), the application's
> job is very simple: call rte_gro_reassemble_burst. After
> rte_gro_reassemble_burst returns, the application gets all merged
> packets. rte_gro_reassemble is the API for the other mode, called
> heavyweight mode. The gro_tbl is only used by rte_gro_reassemble, which
> processes one packet at a time.
> 
> So you mean we should enable rte_gro_reassemble to merge N input
> packets with the packets in a given gro_tbl?

Yes, I suppose that will be faster.

> 
> >
> >
> > > +
> > > +/**
> > > + * This function flushed packets from reassembly tables of desired GRO
> > > + * types. It won't re-calculate checksums for merged packets in the
> > > + * tables. That is, the returned packets may be with wrong checksums.
> > > + *
> > > + * @param gro_tbl
> > > + *  a pointer points to a GRO table object.
> > > + * @param desired_gro_types
> > > + *  GRO types whose packets will be flushed.
> > > + * @param out
> > > + *  a pointer array that is used to keep flushed packets.
> > > + * @param nb_out
> > > + *  the size of out.
> > > + * @return
> > > + *  the number of flushed packets. If no packets are flushed, return 0.
> > > + */
> > > +uint16_t rte_gro_flush(struct rte_gro_tbl *gro_tbl,
> > > +		uint64_t desired_gro_types,
> > > +		struct rte_mbuf **out,
> > > +		const uint16_t max_nb_out);
> > > +
> > > +/**
> > > + * This function flushes the timeout packets from reassembly tables of
> > > + * desired GRO types. It won't re-calculate checksums for merged packets
> > > + * in the tables. That is, the returned packets may be with wrong
> > > + * checksums.
> > > + *
> > > + * @param gro_tbl
> > > + *  a pointer points to a GRO table object.
> > > + * @param desired_gro_types
> > > + * rte_gro_timeout_flush only processes packets which belong to the
> > > + * GRO types specified by desired_gro_types.
> > > + * @param out
> > > + *  a pointer array that is used to keep flushed packets.
> > > + * @param nb_out
> > > + *  the size of out.
> > > + * @return
> > > + *  the number of flushed packets. If no packets are flushed, return 0.
> > > + */
> > > +uint16_t rte_gro_timeout_flush(struct rte_gro_tbl *gro_tbl,
> > > +		uint64_t desired_gro_types,
> > > +		struct rte_mbuf **out,
> > > +		const uint16_t max_nb_out);
> >
> > No point in having 2 flush() functions.
> > I suggest merging them together.
> 
> rte_gro_flush flushes all packets from the table, but rte_gro_timeout_flush
> only flushes timed-out packets. They do different things. If we merge them
> together, should the merged function flush all packets or only the
> timed-out ones?

We can specify that if the timeout is zero (or less than the current time),
then we flush all packets.
  
Hu, Jiayu June 29, 2017, 1:19 a.m. UTC | #4
Hi Konstantin,

On Thu, Jun 29, 2017 at 01:41:40AM +0800, Ananyev, Konstantin wrote:
> 
> Hi Jiayu,
> 
> > 
> > >
> > > > +
> > > > +/**
> > > > + * GRO table, which is used to merge packets. It keeps many reassembly
> > > > + * tables of desired GRO types. Applications need to create GRO tables
> > > > + * before using rte_gro_reassemble to perform GRO.
> > > > + */
> > > > +struct rte_gro_tbl {
> > > > +	uint64_t desired_gro_types;	/**< GRO types to perform */
> > > > +	/* max TTL measured in nanosecond */
> > > > +	uint64_t max_timeout_cycles;
> > > > +	/* max length of merged packet measured in byte */
> > > > +	uint32_t max_packet_size;
> > > > +	/* reassebly tables of desired GRO types */
> > > > +	void *tbls[GRO_TYPE_MAX_NUM];
> > > > +};
> > >
> > > Not sure why you need to define that structure here.
> > > As I understand, it is internal to the library.
> > > Just a declaration should be enough.
> > 
> > This structure defines a GRO table, which is used by rte_gro_reassemble
> > to merge packets. Applications need to create this table before calling
> > rte_gro_reassemble. So I define it in rte_gro.h.
> 
> Yes, the application has to call gro_table_create().
> But the application doesn't need to access the contents of struct rte_gro_tbl,
> which means it can (and should) treat it as an opaque pointer.

Thanks, I will modify it.

> 
> > > > +
> > > > +/**
> > > > + * Reassembly function, which tries to merge the inputted packet with
> > > > + * one packet in a given GRO table. This function assumes the inputted
> > > > + * packet is with correct checksums. And it won't update checksums if
> > > > + * two packets are merged. Besides, if the inputted packet is IP
> > > > + * fragmented, this function assumes it's a complete packet (i.e. with
> > > > + * L4 header).
> > > > + *
> > > > + * If the inputted packet doesn't have data or it's with unsupported GRO
> > > > + * type, function returns immediately. Otherwise, the inputted packet is
> > > > + * either merged or inserted into the table. If applications want get
> > > > + * packets in the table, they need to call flush APIs.
> > > > + *
> > > > + * @param pkt
> > > > + *  packet to reassemble.
> > > > + * @param gro_tbl
> > > > + *  a pointer points to a GRO table.
> > > > + * @return
> > > > + *  if merge the packet successfully, return a positive value. If fail
> > > > + *  to merge, return zero. If the packet doesn't have data, or its GRO
> > > > + *  type is unsupported, return a negative value.
> > > > + */
> > > > +int rte_gro_reassemble(struct rte_mbuf *pkt,
> > > > +		struct rte_gro_tbl *gro_tbl);
> > >
> > >
> > > Ok, and why can't the tbl one do bursts?
> > 
> > In the current design, applications that want to do bursts don't need to
> > create a gro_tbl. rte_gro_reassemble_burst creates a temporary table on
> > the stack. So for bursts (we call it lightweight mode), the application's
> > job is very simple: call rte_gro_reassemble_burst. After
> > rte_gro_reassemble_burst returns, the application gets all merged
> > packets. rte_gro_reassemble is the API for the other mode, called
> > heavyweight mode. The gro_tbl is only used by rte_gro_reassemble, which
> > processes one packet at a time.
> > 
> > So you mean we should enable rte_gro_reassemble to merge N input
> > packets with the packets in a given gro_tbl?
> 
> Yes, I suppose that will be faster.

Thanks, I will enable it to process N packets at a time.

> 
> > 
> > >
> > >
> > > > +
> > > > +/**
> > > > + * This function flushed packets from reassembly tables of desired GRO
> > > > + * types. It won't re-calculate checksums for merged packets in the
> > > > + * tables. That is, the returned packets may be with wrong checksums.
> > > > + *
> > > > + * @param gro_tbl
> > > > + *  a pointer points to a GRO table object.
> > > > + * @param desired_gro_types
> > > > + *  GRO types whose packets will be flushed.
> > > > + * @param out
> > > > + *  a pointer array that is used to keep flushed packets.
> > > > + * @param nb_out
> > > > + *  the size of out.
> > > > + * @return
> > > > + *  the number of flushed packets. If no packets are flushed, return 0.
> > > > + */
> > > > +uint16_t rte_gro_flush(struct rte_gro_tbl *gro_tbl,
> > > > +		uint64_t desired_gro_types,
> > > > +		struct rte_mbuf **out,
> > > > +		const uint16_t max_nb_out);
> > > > +
> > > > +/**
> > > > + * This function flushes the timeout packets from reassembly tables of
> > > > + * desired GRO types. It won't re-calculate checksums for merged packets
> > > > + * in the tables. That is, the returned packets may be with wrong
> > > > + * checksums.
> > > > + *
> > > > + * @param gro_tbl
> > > > + *  a pointer points to a GRO table object.
> > > > + * @param desired_gro_types
> > > > + * rte_gro_timeout_flush only processes packets which belong to the
> > > > + * GRO types specified by desired_gro_types.
> > > > + * @param out
> > > > + *  a pointer array that is used to keep flushed packets.
> > > > + * @param nb_out
> > > > + *  the size of out.
> > > > + * @return
> > > > + *  the number of flushed packets. If no packets are flushed, return 0.
> > > > + */
> > > > +uint16_t rte_gro_timeout_flush(struct rte_gro_tbl *gro_tbl,
> > > > +		uint64_t desired_gro_types,
> > > > +		struct rte_mbuf **out,
> > > > +		const uint16_t max_nb_out);
> > >
> > > No point in having 2 flush() functions.
> > > I suggest merging them together.
> > 
> > rte_gro_flush flushes all packets from the table, but rte_gro_timeout_flush
> > only flushes timed-out packets. They do different things. If we merge them
> > together, should the merged function flush all packets or only the
> > timed-out ones?
> 
> We can specify that if the timeout is zero (or less than the current time),
> then we flush all packets.

Thanks, I will merge them together.
  

Patch

diff --git a/config/common_base b/config/common_base
index f6aafd1..167f5ef 100644
--- a/config/common_base
+++ b/config/common_base
@@ -712,6 +712,11 @@  CONFIG_RTE_LIBRTE_VHOST_DEBUG=n
 CONFIG_RTE_LIBRTE_PMD_VHOST=n
 
 #
+# Compile GRO library
+#
+CONFIG_RTE_LIBRTE_GRO=y
+
+#
 #Compile Xen domain0 support
 #
 CONFIG_RTE_LIBRTE_XEN_DOM0=n
diff --git a/lib/Makefile b/lib/Makefile
index 07e1fd0..ac1c2f6 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -106,6 +106,8 @@  DIRS-$(CONFIG_RTE_LIBRTE_REORDER) += librte_reorder
 DEPDIRS-librte_reorder := librte_eal librte_mempool librte_mbuf
 DIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += librte_pdump
 DEPDIRS-librte_pdump := librte_eal librte_mempool librte_mbuf librte_ether
+DIRS-$(CONFIG_RTE_LIBRTE_GRO) += librte_gro
+DEPDIRS-librte_gro := librte_eal librte_mbuf librte_ether librte_net
 
 ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
 DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni
diff --git a/lib/librte_gro/Makefile b/lib/librte_gro/Makefile
new file mode 100644
index 0000000..7e0f128
--- /dev/null
+++ b/lib/librte_gro/Makefile
@@ -0,0 +1,50 @@ 
+#   BSD LICENSE
+#
+#   Copyright(c) 2017 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_gro.a
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
+
+EXPORT_MAP := rte_gro_version.map
+
+LIBABIVER := 1
+
+# source files
+SRCS-$(CONFIG_RTE_LIBRTE_GRO) += rte_gro.c
+
+# install this header file
+SYMLINK-$(CONFIG_RTE_LIBRTE_GRO)-include += rte_gro.h
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_gro/rte_gro.c b/lib/librte_gro/rte_gro.c
new file mode 100644
index 0000000..33275e8
--- /dev/null
+++ b/lib/librte_gro/rte_gro.c
@@ -0,0 +1,125 @@ 
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <rte_malloc.h>
+#include <rte_mbuf.h>
+
+#include "rte_gro.h"
+
+static gro_tbl_create_fn tbl_create_functions[GRO_TYPE_MAX_NUM];
+static gro_tbl_destroy_fn tbl_destroy_functions[GRO_TYPE_MAX_NUM];
+
+struct rte_gro_tbl *rte_gro_tbl_create(uint16_t socket_id,
+		uint16_t max_flow_num,
+		uint16_t max_item_per_flow,
+		uint32_t max_packet_size,
+		uint64_t max_timeout_cycles,
+		uint64_t desired_gro_types)
+{
+	gro_tbl_create_fn create_tbl_fn;
+	struct rte_gro_tbl *gro_tbl;
+	uint64_t gro_type_flag = 0;
+	uint8_t i;
+
+	gro_tbl = rte_zmalloc_socket(__func__,
+			sizeof(struct rte_gro_tbl),
+			RTE_CACHE_LINE_SIZE,
+			socket_id);
+	gro_tbl->max_packet_size = max_packet_size;
+	gro_tbl->max_timeout_cycles = max_timeout_cycles;
+	gro_tbl->desired_gro_types = desired_gro_types;
+
+	for (i = 0; i < GRO_TYPE_MAX_NUM; i++) {
+		gro_type_flag = 1 << i;
+		if (desired_gro_types & gro_type_flag) {
+			create_tbl_fn = tbl_create_functions[i];
+			if (create_tbl_fn)
+				create_tbl_fn(socket_id,
+						max_flow_num,
+						max_item_per_flow);
+			else
+				gro_tbl->tbls[i] = NULL;
+		}
+	}
+	return gro_tbl;
+}
+
+void rte_gro_tbl_destroy(struct rte_gro_tbl *gro_tbl)
+{
+	gro_tbl_destroy_fn destroy_tbl_fn;
+	uint64_t gro_type_flag;
+	uint8_t i;
+
+	if (gro_tbl == NULL)
+		return;
+	for (i = 0; i < GRO_TYPE_MAX_NUM; i++) {
+		gro_type_flag = 1 << i;
+		if (gro_tbl->desired_gro_types & gro_type_flag) {
+			destroy_tbl_fn = tbl_destroy_functions[i];
+			if (destroy_tbl_fn)
+				destroy_tbl_fn(gro_tbl->tbls[i]);
+			gro_tbl->tbls[i] = NULL;
+		}
+	}
+	rte_free(gro_tbl);
+}
+
+uint16_t
+rte_gro_reassemble_burst(struct rte_mbuf **pkts __rte_unused,
+		const uint16_t nb_pkts,
+		const struct rte_gro_param param __rte_unused)
+{
+	return nb_pkts;
+}
+
+int rte_gro_reassemble(struct rte_mbuf *pkt __rte_unused,
+		struct rte_gro_tbl *gro_tbl __rte_unused)
+{
+	return -1;
+}
+
+uint16_t rte_gro_flush(struct rte_gro_tbl *gro_tbl __rte_unused,
+		uint64_t desired_gro_types __rte_unused,
+		struct rte_mbuf **out __rte_unused,
+		const uint16_t max_nb_out __rte_unused)
+{
+	return 0;
+}
+
+uint16_t
+rte_gro_timeout_flush(struct rte_gro_tbl *gro_tbl __rte_unused,
+		uint64_t desired_gro_types __rte_unused,
+		struct rte_mbuf **out __rte_unused,
+		const uint16_t max_nb_out __rte_unused)
+{
+	return 0;
+}
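
Note on the per-type loops in rte_gro_tbl_create and rte_gro_tbl_destroy:
they walk a 64-bit mask and dispatch through an array of function
pointers indexed by bit position. The standalone sketch below shows that
pattern under stated assumptions. It is not the library code: TYPE_MAX_NUM,
create_fn, create_fns, dummy_create, type_flag and create_tables are
hypothetical names used only for illustration. The flag is built from a
64-bit constant so the shift is well defined for every index below 64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TYPE_MAX_NUM 64	/* illustrative; mirrors GRO_TYPE_MAX_NUM */

/* Hypothetical per-type constructor, same shape as gro_tbl_create_fn. */
typedef void *(*create_fn)(uint16_t socket_id, uint16_t max_flow_num,
		uint16_t max_item_per_flow);

static create_fn create_fns[TYPE_MAX_NUM];	/* NULL means "not registered" */
static void *tbls[TYPE_MAX_NUM];

/* Stand-in constructor that returns a non-NULL dummy object. */
static int dummy_tbl;
static void *dummy_create(uint16_t socket_id, uint16_t max_flow_num,
		uint16_t max_item_per_flow)
{
	(void)socket_id;
	(void)max_flow_num;
	(void)max_item_per_flow;
	return &dummy_tbl;
}

/* Build the mask bit for one type index using a 64-bit constant. */
static uint64_t type_flag(unsigned int idx)
{
	assert(idx < TYPE_MAX_NUM);
	return UINT64_C(1) << idx;
}

/*
 * Walk the mask, call the registered constructor for each requested
 * type, store its result, and return how many tables were created.
 */
static unsigned int create_tables(uint64_t desired_types)
{
	unsigned int i, created = 0;

	for (i = 0; i < TYPE_MAX_NUM; i++) {
		tbls[i] = NULL;
		if ((desired_types & type_flag(i)) == 0)
			continue;
		if (create_fns[i] != NULL)
			tbls[i] = create_fns[i](0, 4, 4);
		if (tbls[i] != NULL)
			created++;
	}
	return created;
}
```

A requested type with no registered constructor simply leaves its slot
NULL, which matches what the loop in rte_gro_tbl_create does for that
case.
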
diff --git a/lib/librte_gro/rte_gro.h b/lib/librte_gro/rte_gro.h
new file mode 100644
index 0000000..f9d36e8
--- /dev/null
+++ b/lib/librte_gro/rte_gro.h
@@ -0,0 +1,205 @@ 
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_GRO_H_
+#define _RTE_GRO_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * The maximum number of packets that rte_gro_reassemble_burst can
+ * process in each invocation.
+ */
+#define GRO_MAX_BURST_ITEM_NUM 1024UL
+
+/* max number of supported GRO types */
+#define GRO_TYPE_MAX_NUM 64
+#define GRO_TYPE_SUPPORT_NUM 0	/**< number of GRO types supported now */
+
+/**
+ * GRO table, which is used to merge packets. It keeps many reassembly
+ * tables of desired GRO types. Applications need to create GRO tables
+ * before using rte_gro_reassemble to perform GRO.
+ */
+struct rte_gro_tbl {
+	uint64_t desired_gro_types;	/**< GRO types to perform */
+	/* max TTL measured in nanoseconds */
+	uint64_t max_timeout_cycles;
+	/* max length of merged packet measured in bytes */
+	uint32_t max_packet_size;
+	/* reassembly tables of desired GRO types */
+	void *tbls[GRO_TYPE_MAX_NUM];
+};
+
+struct rte_gro_param {
+	uint64_t desired_gro_types;	/**< desired GRO types */
+	uint32_t max_packet_size;	/**< max length of merged packets */
+	uint16_t max_flow_num;	/**< max flow number */
+	uint16_t max_item_per_flow;	/**< max packet number per flow */
+};
+
+typedef void *(*gro_tbl_create_fn)(uint16_t socket_id,
+		uint16_t max_flow_num,
+		uint16_t max_item_per_flow);
+typedef void (*gro_tbl_destroy_fn)(void *tbl);
+
+/**
+ * This function creates a GRO table, which is used to merge packets.
+ *
+ * @param socket_id
+ *  socket index to which the Ethernet port is connected.
+ * @param max_flow_num
+ *  max number of flows in the GRO table.
+ * @param max_item_per_flow
+ *  max packet number per flow. We use the value of (max_flow_num *
+ *  max_item_per_flow) to calculate table size.
+ * @param max_packet_size
+ *  max length of merged packets. Measured in bytes.
+ * @param max_timeout_cycles
+ *  max TTL for a packet in the GRO table. It's measured in nanoseconds.
+ * @param desired_gro_types
+ *  GRO types to perform.
+ * @return
+ *  on success, a pointer to the newly created GRO table.
+ *  Otherwise, NULL.
+ */
+struct rte_gro_tbl *rte_gro_tbl_create(uint16_t socket_id,
+		uint16_t max_flow_num,
+		uint16_t max_item_per_flow,
+		uint32_t max_packet_size,
+		uint64_t max_timeout_cycles,
+		uint64_t desired_gro_types);
+/**
+ * This function destroys a GRO table.
+ */
+void rte_gro_tbl_destroy(struct rte_gro_tbl *gro_tbl);
+
+/**
+ * This is one of the main reassembly APIs; it merges a burst of
+ * packets at a time. It assumes that all input packets have correct
+ * checksums; that is, applications must guarantee that all input
+ * packets are correct. It does not re-calculate checksums for merged
+ * packets. If input packets are IP fragmented, this function assumes
+ * they are complete (i.e. with L4 header). After processing, it
+ * returns all GROed packets to the application immediately, in the
+ * same call.
+ *
+ * @param pkts
+ *  an array of pointers to the packets to reassemble. On return,
+ *  it holds the addresses of the GROed packets.
+ * @param nb_pkts
+ *  the number of packets to reassemble.
+ * @param param
+ *  applications use it to tell rte_gro_reassemble_burst what rules
+ *  are demanded.
+ * @return
+ *  the number of packets remaining after GRO.
+ */
+uint16_t rte_gro_reassemble_burst(struct rte_mbuf **pkts,
+		const uint16_t nb_pkts,
+		const struct rte_gro_param param);
+
+/**
+ * Reassembly function, which tries to merge the input packet with
+ * one packet in a given GRO table. This function assumes the input
+ * packet has correct checksums, and it won't update checksums if
+ * two packets are merged. If the input packet is IP fragmented,
+ * this function assumes it's a complete packet (i.e. with L4
+ * header).
+ *
+ * If the input packet has no data or is of an unsupported GRO type,
+ * the function returns immediately. Otherwise, the input packet is
+ * either merged or inserted into the table. If applications want to
+ * get packets from the table, they need to call the flush APIs.
+ *
+ * @param pkt
+ *  packet to reassemble.
+ * @param gro_tbl
+ *  a pointer to a GRO table.
+ * @return
+ *  a positive value if the packet is merged successfully; zero if it
+ *  fails to merge; a negative value if the packet has no data or its
+ *  GRO type is unsupported.
+ */
+int rte_gro_reassemble(struct rte_mbuf *pkt,
+		struct rte_gro_tbl *gro_tbl);
+
+/**
+ * This function flushes packets from reassembly tables of desired GRO
+ * types. It won't re-calculate checksums for merged packets in the
+ * tables. That is, the returned packets may have wrong checksums.
+ *
+ * @param gro_tbl
+ *  a pointer to a GRO table object.
+ * @param desired_gro_types
+ *  GRO types whose packets will be flushed.
+ * @param out
+ *  a pointer array that is used to keep flushed packets.
+ * @param max_nb_out
+ *  the size of out.
+ * @return
+ *  the number of flushed packets. If no packets are flushed, return 0.
+ */
+uint16_t rte_gro_flush(struct rte_gro_tbl *gro_tbl,
+		uint64_t desired_gro_types,
+		struct rte_mbuf **out,
+		const uint16_t max_nb_out);
+
+/**
+ * This function flushes the timed-out packets from reassembly tables
+ * of desired GRO types. It won't re-calculate checksums for merged
+ * packets in the tables. That is, the returned packets may have wrong
+ * checksums.
+ *
+ * @param gro_tbl
+ *  a pointer to a GRO table object.
+ * @param desired_gro_types
+ *  rte_gro_timeout_flush only processes packets which belong to the
+ *  GRO types specified by desired_gro_types.
+ * @param out
+ *  a pointer array that is used to keep flushed packets.
+ * @param max_nb_out
+ *  the size of out.
+ * @return
+ *  the number of flushed packets. If no packets are flushed, return 0.
+ */
+uint16_t rte_gro_timeout_flush(struct rte_gro_tbl *gro_tbl,
+		uint64_t desired_gro_types,
+		struct rte_mbuf **out,
+		const uint16_t max_nb_out);
+#ifdef __cplusplus
+}
+#endif
+
+#endif
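
Note on the rte_gro_reassemble return convention documented above: a
negative value means the packet was rejected (no data, or unsupported
GRO type), while zero or a positive value means it was inserted or
merged into the table. A minimal caller-side sketch of one way to use
that convention follows. It is only an illustration and does not link
against DPDK: struct pkt stands in for struct rte_mbuf, reassemble_fn
stands in for rte_gro_reassemble, and feed_packets and fake_reassemble
are hypothetical helpers. The sketch also assumes that a rejected packet
stays owned by the caller, which the header comments imply but do not
state outright.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct pkt {		/* stand-in for struct rte_mbuf */
	int id;
};

/*
 * Stand-in with the documented return convention of
 * rte_gro_reassemble: > 0 merged, 0 not merged, < 0 rejected.
 */
typedef int (*reassemble_fn)(struct pkt *pkt, void *tbl);

/*
 * Feed packets one by one. Rejected packets are compacted to the
 * front of pkts so the caller can forward them unchanged. Returns
 * the number of rejected packets.
 */
static uint16_t feed_packets(struct pkt **pkts, uint16_t nb_pkts,
		void *tbl, reassemble_fn fn)
{
	uint16_t i, nb_left = 0;

	assert(pkts != NULL && fn != NULL);
	for (i = 0; i < nb_pkts; i++) {
		if (fn(pkts[i], tbl) < 0)
			pkts[nb_left++] = pkts[i];
	}
	return nb_left;
}

/* Illustrative stand-in: rejects odd ids, accepts even ids. */
static int fake_reassemble(struct pkt *pkt, void *tbl)
{
	(void)tbl;
	return (pkt->id % 2) ? -1 : 0;
}
```

Packets that were accepted stay in the table until the application
calls one of the flush APIs.
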
diff --git a/lib/librte_gro/rte_gro_version.map b/lib/librte_gro/rte_gro_version.map
new file mode 100644
index 0000000..827596b
--- /dev/null
+++ b/lib/librte_gro/rte_gro_version.map
@@ -0,0 +1,12 @@ 
+DPDK_17.08 {
+	global:
+
+	rte_gro_tbl_create;
+	rte_gro_tbl_destroy;
+	rte_gro_reassemble_burst;
+	rte_gro_reassemble;
+	rte_gro_flush;
+	rte_gro_timeout_flush;
+
+	local: *;
+};
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index bcaf1b3..fc3776d 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -98,6 +98,7 @@  _LDLIBS-$(CONFIG_RTE_LIBRTE_RING)           += -lrte_ring
 _LDLIBS-$(CONFIG_RTE_LIBRTE_EAL)            += -lrte_eal
 _LDLIBS-$(CONFIG_RTE_LIBRTE_CMDLINE)        += -lrte_cmdline
 _LDLIBS-$(CONFIG_RTE_LIBRTE_REORDER)        += -lrte_reorder
+_LDLIBS-$(CONFIG_RTE_LIBRTE_GRO)            += -lrte_gro
 
 ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
 _LDLIBS-$(CONFIG_RTE_LIBRTE_KNI)            += -lrte_kni