[dpdk-dev,v4,1/2] gro: code cleanup

Message ID 1515132769-52572-2-git-send-email-jiayu.hu@intel.com (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Checks

Context               Check    Description
ci/checkpatch         success  coding style OK
ci/Intel-compilation  success  Compilation OK

Commit Message

Hu, Jiayu Jan. 5, 2018, 6:12 a.m. UTC
- Remove needless checks and variables
- For better understanding, update the programmer guide and rename
  internal functions and variables
- To support tunneled GRO, move common internal functions from
  gro_tcp4.c to gro_tcp4.h
- Comply with RFC 6864 when processing the IPv4 ID field

Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Reviewed-by: Junjie Chen <junjie.j.chen@intel.com>
---
 .../prog_guide/generic_receive_offload_lib.rst     | 246 ++++++++-------
 doc/guides/prog_guide/img/gro-key-algorithm.svg    | 223 ++++++++++++++
 lib/librte_gro/gro_tcp4.c                          | 339 +++++++--------------
 lib/librte_gro/gro_tcp4.h                          | 253 ++++++++++-----
 lib/librte_gro/rte_gro.c                           | 102 +++----
 lib/librte_gro/rte_gro.h                           |  92 +++---
 6 files changed, 750 insertions(+), 505 deletions(-)
 create mode 100644 doc/guides/prog_guide/img/gro-key-algorithm.svg
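
As a rough illustration of the last bullet in the commit message, the IPv4 ID rule stated in the updated programmer guide (check the ID only when DF is 0, ignore it when DF is 1, never merge packets whose DF bits differ) might be sketched as below. The helper name and its signature are hypothetical and are not taken from gro_tcp4.h. Treating "increased by 1" as modulo-2^16 arithmetic is also an assumption of this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of librte_gro: decide whether the IPv4 ID
 * fields allow packet B to be merged right after packet A.
 */
static bool
ipv4_id_allows_merge(bool df_a, uint16_t id_a, bool df_b, uint16_t id_b)
{
	/* Packets with different DF bits are never merged. */
	if (df_a != df_b)
		return false;

	/* DF == 1: the ID field is ignored. */
	if (df_a)
		return true;

	/* DF == 0: B's ID must be A's ID plus one (assumed to wrap at 2^16). */
	return (uint16_t)(id_a + 1) == id_b;
}
```

With DF set to 0 on both packets, IDs 7 and 8 pass the check while 7 and 9 do not. With DF set to 1 on both, any pair of IDs passes.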
  

Comments

Yao, Lei A Jan. 8, 2018, 1:15 a.m. UTC | #1
> -----Original Message-----

> From: Hu, Jiayu

> Sent: Friday, January 5, 2018 2:13 PM

> To: dev@dpdk.org

> Cc: Richardson, Bruce <bruce.richardson@intel.com>; Chen, Junjie J

> <junjie.j.chen@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>;

> stephen@networkplumber.org; Yigit, Ferruh <ferruh.yigit@intel.com>;

> Ananyev, Konstantin <konstantin.ananyev@intel.com>; Yao, Lei A

> <lei.a.yao@intel.com>; Hu, Jiayu <jiayu.hu@intel.com>

> Subject: [PATCH v4 1/2] gro: code cleanup

> 

> - Remove needless checks and variables

> - For better understanding, update the programmer guide and rename

>   internal functions and variables

> - To support tunneled GRO, move common internal functions from

>   gro_tcp4.c to gro_tcp4.h

> - Comply with RFC 6864 when processing the IPv4 ID field

> 

> Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>

> Reviewed-by: Junjie Chen <junjie.j.chen@intel.com>

Tested-by: Lei Yao <lei.a.yao@intel.com>

I have tested this patch with the following traffic flow:
NIC1 (in kernel) --> NIC2 (PMD, GRO on) --> vhost-user --> virtio-net (in VM)
An iperf test with 1 stream shows that VxLAN GRO can improve
throughput from 6 Gbps (GRO off) to 16 Gbps (GRO on).
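
For readers who want a feel for what is being exercised here, the key-based merging described in the quoted programmer guide below (find a flow by key, look for a neighbor, otherwise store the packet) might be sketched roughly as follows. All types, names and table sizes are hypothetical stand-ins rather than the librte_gro structures. Only sequence numbers are compared; the other checks the guide lists (TCP flags, IPv4 ID) and the Ethernet addresses in the flow key are left out.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_FLOWS 8
#define MAX_ITEMS 16
#define NO_INDEX  UINT32_MAX

/* Simplified flow key (hypothetical; Ethernet addresses omitted). */
struct flow_key {
	uint32_t src_ip, dst_ip;
	uint16_t src_port, dst_port;
	uint32_t recv_ack;
};

struct item {
	uint32_t seq;   /* sequence number of the first payload byte */
	uint32_t len;   /* payload bytes covered by this item */
	uint32_t next;  /* next item in the same flow, or NO_INDEX */
};

struct flow {
	struct flow_key key;
	uint32_t start; /* index of the first item in this flow */
};

struct table {
	struct flow flows[MAX_FLOWS];
	struct item items[MAX_ITEMS];
	uint32_t nb_flows, nb_items;
};

enum result { MERGED, STORED, NEW_FLOW, TABLE_FULL };

static bool
key_equal(const struct flow_key *a, const struct flow_key *b)
{
	return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
		a->src_port == b->src_port && a->dst_port == b->dst_port &&
		a->recv_ack == b->recv_ack;
}

static enum result
process_packet(struct table *t, const struct flow_key *k,
		uint32_t seq, uint32_t len)
{
	uint32_t f, i;

	/* Step 1: search for a flow with the same key. */
	for (f = 0; f < t->nb_flows; f++)
		if (key_equal(&t->flows[f].key, k))
			break;

	if (f < t->nb_flows) {
		/* Step 2: search the flow for a neighbor and merge with it. */
		for (i = t->flows[f].start; i != NO_INDEX; i = t->items[i].next) {
			struct item *it = &t->items[i];

			if (it->seq + it->len == seq) {  /* append */
				it->len += len;
				return MERGED;
			}
			if (seq + len == it->seq) {      /* prepend */
				it->seq = seq;
				it->len += len;
				return MERGED;
			}
		}
		/* No neighbor (out-of-order packet): store it in the flow. */
		if (t->nb_items == MAX_ITEMS)
			return TABLE_FULL;
		i = t->nb_items++;
		t->items[i].seq = seq;
		t->items[i].len = len;
		t->items[i].next = t->flows[f].start;
		t->flows[f].start = i;
		return STORED;
	}

	/* No matching flow: insert a new flow and store the packet. */
	if (t->nb_flows == MAX_FLOWS || t->nb_items == MAX_ITEMS)
		return TABLE_FULL;
	i = t->nb_items++;
	t->items[i].seq = seq;
	t->items[i].len = len;
	t->items[i].next = NO_INDEX;
	t->flows[f].key = *k;
	t->flows[f].start = i;
	t->nb_flows++;
	return NEW_FLOW;
}
```

In this sketch a packet that arrives out of order is kept as a separate item, so a later packet that fills the gap can still merge with it. The sketch does not coalesce two stored items with each other.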

> ---

>  .../prog_guide/generic_receive_offload_lib.rst     | 246 ++++++++-------

>  doc/guides/prog_guide/img/gro-key-algorithm.svg    | 223 ++++++++++++++

>  lib/librte_gro/gro_tcp4.c                          | 339 +++++++--------------

>  lib/librte_gro/gro_tcp4.h                          | 253 ++++++++++-----

>  lib/librte_gro/rte_gro.c                           | 102 +++----

>  lib/librte_gro/rte_gro.h                           |  92 +++---

>  6 files changed, 750 insertions(+), 505 deletions(-)

>  create mode 100644 doc/guides/prog_guide/img/gro-key-algorithm.svg

> 

> diff --git a/doc/guides/prog_guide/generic_receive_offload_lib.rst b/doc/guides/prog_guide/generic_receive_offload_lib.rst

> index 22e50ec..c2d7a41 100644

> --- a/doc/guides/prog_guide/generic_receive_offload_lib.rst

> +++ b/doc/guides/prog_guide/generic_receive_offload_lib.rst

> @@ -32,128 +32,162 @@ Generic Receive Offload Library

>  ===============================

> 

>  Generic Receive Offload (GRO) is a widely used SW-based offloading

> -technique to reduce per-packet processing overhead. It gains performance

> -by reassembling small packets into large ones. To enable more flexibility

> -to applications, DPDK implements GRO as a standalone library. Applications

> -explicitly use the GRO library to merge small packets into large ones.

> -

> -The GRO library assumes all input packets have correct checksums. In

> -addition, the GRO library doesn't re-calculate checksums for merged

> -packets. If input packets are IP fragmented, the GRO library assumes

> -they are complete packets (i.e. with L4 headers).

> -

> -Currently, the GRO library implements TCP/IPv4 packet reassembly.

> -

> -Reassembly Modes

> -----------------

> -

> -The GRO library provides two reassembly modes: lightweight and

> -heavyweight mode. If applications want to merge packets in a simple way,

> -they can use the lightweight mode API. If applications want more

> -fine-grained controls, they can choose the heavyweight mode API.

> -

> -Lightweight Mode

> -~~~~~~~~~~~~~~~~

> -

> -The ``rte_gro_reassemble_burst()`` function is used for reassembly in

> -lightweight mode. It tries to merge N input packets at a time, where

> -N should be less than or equal to ``RTE_GRO_MAX_BURST_ITEM_NUM``.

> -

> -In each invocation, ``rte_gro_reassemble_burst()`` allocates temporary

> -reassembly tables for the desired GRO types. Note that the reassembly

> -table is a table structure used to reassemble packets and different GRO

> -types (e.g. TCP/IPv4 GRO and TCP/IPv6 GRO) have different reassembly table

> -structures. The ``rte_gro_reassemble_burst()`` function uses the reassembly

> -tables to merge the N input packets.

> -

> -For applications, performing GRO in lightweight mode is simple. They

> -just need to invoke ``rte_gro_reassemble_burst()``. Applications can get

> -GROed packets as soon as ``rte_gro_reassemble_burst()`` returns.

> -

> -Heavyweight Mode

> -~~~~~~~~~~~~~~~~

> -

> -The ``rte_gro_reassemble()`` function is used for reassembly in heavyweight

> -mode. Compared with the lightweight mode, performing GRO in heavyweight mode

> -is relatively complicated.

> -

> -Before performing GRO, applications need to create a GRO context object

> -by calling ``rte_gro_ctx_create()``. A GRO context object holds the

> -reassembly tables of desired GRO types. Note that all update/lookup

> -operations on the context object are not thread safe. So if different

> -processes or threads want to access the same context object simultaneously,

> -some external syncing mechanisms must be used.

> -

> -Once the GRO context is created, applications can then use the

> -``rte_gro_reassemble()`` function to merge packets. In each invocation,

> -``rte_gro_reassemble()`` tries to merge input packets with the packets

> -in the reassembly tables. If an input packet is an unsupported GRO type,

> -or other errors happen (e.g. SYN bit is set), ``rte_gro_reassemble()``

> -returns the packet to applications. Otherwise, the input packet is either

> -merged or inserted into a reassembly table.

> -

> -When applications want to get GRO processed packets, they need to use

> -``rte_gro_timeout_flush()`` to flush them from the tables manually.

> +technique to reduce per-packet processing overheads. By reassembling

> +small packets into larger ones, GRO enables applications to process

> +fewer large packets directly, thus reducing the number of packets to

> +be processed. To benefit DPDK-based applications, like Open vSwitch,

> +DPDK also provides own GRO implementation. In DPDK, GRO is implemented

> +as a standalone library. Applications explicitly use the GRO library to

> +reassemble packets.

> +

> +Overview

> +--------

> +

> +In the GRO library, there are many GRO types which are defined by packet

> +types. One GRO type is in charge of process one kind of packets. For

> +example, TCP/IPv4 GRO processes TCP/IPv4 packets.

> +

> +Each GRO type has a reassembly function, which defines own algorithm and

> +table structure to reassemble packets. We assign input packets to the

> +corresponding GRO functions by MBUF->packet_type.

> +

> +The GRO library doesn't check if input packets have correct checksums and

> +doesn't re-calculate checksums for merged packets. The GRO library

> +assumes the packets are complete (i.e., MF==0 && frag_off==0), when IP

> +fragmentation is possible (i.e., DF==0). Additionally, it complies RFC

> +6864 to process the IPv4 ID field.

> 

> -TCP/IPv4 GRO

> -------------

> +Currently, the GRO library provides GRO supports for TCP/IPv4 packets.

> +

> +Two Sets of API

> +---------------

> +

> +For different usage scenarios, the GRO library provides two sets of API.

> +The one is called the lightweight mode API, which enables applications to

> +merge a small number of packets rapidly; the other is called the

> +heavyweight mode API, which provides fine-grained controls to

> +applications and supports to merge a large number of packets.

> +

> +Lightweight Mode API

> +~~~~~~~~~~~~~~~~~~~~

> +

> +The lightweight mode only has one function ``rte_gro_reassemble_burst()``,

> +which process N packets at a time. Using the lightweight mode API to

> +merge packets is very simple. Calling ``rte_gro_reassemble_burst()`` is

> +enough. The GROed packets are returned to applications as soon as it

> +finishes.

> +

> +In ``rte_gro_reassemble_burst()``, table structures of different GRO

> +types are allocated in the stack. This design simplifies applications'

> +operations. However, limited by the stack size, the maximum number of

> +packets that ``rte_gro_reassemble_burst()`` can process in an invocation

> +should be less than or equal to ``RTE_GRO_MAX_BURST_ITEM_NUM``.

> +

> +Heavyweight Mode API

> +~~~~~~~~~~~~~~~~~~~~

> +

> +Compared with the lightweight mode, using the heavyweight mode API is

> +relatively complex. Firstly, applications need to create a GRO context

> +by ``rte_gro_ctx_create()``. ``rte_gro_ctx_create()`` allocates tables

> +structures in the heap and stores their pointers in the GRO context.

> +Secondly, applications use ``rte_gro_reassemble()`` to merge packets.

> +If input packets have invalid parameters, ``rte_gro_reassemble()``

> +returns them to applications. For example, packets of unsupported GRO

> +types or TCP SYN packets are returned. Otherwise, the input packets are

> +either merged with the existed packets in the tables or inserted into the

> +tables. Finally, applications use ``rte_gro_timeout_flush()`` to flush

> +packets from the tables, when they want to get the GROed packets.

> +

> +Note that all update/lookup operations on the GRO context are not thread

> +safe. So if different processes or threads want to access the same

> +context object simultaneously, some external syncing mechanisms must be

> +used.

> +

> +Reassembly Algorithm

> +--------------------

> +

> +The reassembly algorithm is used for reassembling packets. In the GRO

> +library, different GRO types can use different algorithms. In this

> +section, we will introduce an algorithm, which is used by TCP/IPv4 GRO.

> 

> -TCP/IPv4 GRO supports merging small TCP/IPv4 packets into large ones,

> -using a table structure called the TCP/IPv4 reassembly table.

> +Challenges

> +~~~~~~~~~~

> 

> -TCP/IPv4 Reassembly Table

> -~~~~~~~~~~~~~~~~~~~~~~~~~

> +The reassembly algorithm determines the efficiency of GRO. There are two

> +challenges in the algorithm design:

> 

> -A TCP/IPv4 reassembly table includes a "key" array and an "item" array.

> -The key array keeps the criteria to merge packets and the item array

> -keeps the packet information.

> +- a high cost algorithm/implementation would cause packet dropping in a

> +  high speed network.

> 

> -Each key in the key array points to an item group, which consists of

> -packets which have the same criteria values but can't be merged. A key

> -in the key array includes two parts:

> +- packet reordering makes it hard to merge packets. For example, Linux

> +  GRO fails to merge packets when encounters packet reordering.

> 

> -* ``criteria``: the criteria to merge packets. If two packets can be

> -  merged, they must have the same criteria values.

> +The above two challenges require our algorithm is:

> 

> -* ``start_index``: the item array index of the first packet in the item

> -  group.

> +- lightweight enough to scale fast networking speed

> 

> -Each element in the item array keeps the information of a packet. An item

> -in the item array mainly includes three parts:

> +- capable of handling packet reordering

> 

> -* ``firstseg``: the mbuf address of the first segment of the packet.

> +In DPDK GRO, we use a key-based algorithm to address the two challenges.

> 

> -* ``lastseg``: the mbuf address of the last segment of the packet.

> +Key-based Reassembly Algorithm

> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

> +

> +:numref:`figure_gro-key-algorithm` illustrates the procedure of the

> +key-based algorithm. Packets are classified into "flows" by some header

> +fields (we call them as "key"). To process an input packet, the algorithm

> +searches for a matched "flow" (i.e., the same value of key) for the

> +packet first, then checks all packets in the "flow" and tries to find a

> +"neighbor" for it. If find a "neighbor", merge the two packets together.

> +If can't find a "neighbor", store the packet into its "flow". If can't

> +find a matched "flow", insert a new "flow" and store the packet into the

> +"flow".

> +

> +.. note::

> +        Packets in the same "flow" that can't merge are always caused

> +        by packet reordering.

> +

> +The key-based algorithm has two characters:

> +

> +- classifying packets into "flows" to accelerate packet aggregation is

> +  simple (address challenge 1).

> +

> +- storing out-of-order packets makes it possible to merge later (address

> +  challenge 2).

> +

> +.. _figure_gro-key-algorithm:

> +

> +.. figure:: img/gro-key-algorithm.*

> +   :align: center

> +

> +   Key-based Reassembly Algorithm

> +

> +TCP/IPv4 GRO

> +------------

> 

> -* ``next_pkt_index``: the item array index of the next packet in the same

> -  item group. TCP/IPv4 GRO uses ``next_pkt_index`` to chain the packets

> -  that have the same criteria value but can't be merged together.

> +The table structure used by TCP/IPv4 GRO contains two arrays: flow array

> +and item array. The flow array keeps flow information, and the item array

> +keeps packet information.

> 

> -Procedure to Reassemble a Packet

> -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

> +Header fields used to define a TCP/IPv4 flow include:

> 

> -To reassemble an incoming packet needs three steps:

> +- source and destination: Ethernet and IP address, TCP port

> 

> -#. Check if the packet should be processed. Packets with one of the

> -   following properties aren't processed and are returned immediately:

> +- TCP acknowledge number

> 

> -   * FIN, SYN, RST, URG, PSH, ECE or CWR bit is set.

> +TCP/IPv4 packets whose FIN, SYN, RST, URG, PSH, ECE or CWR bit is set

> +won't be processed.

> 

> -   * L4 payload length is 0.

> +Header fields deciding if two packets are neighbors include:

> 

> -#.  Traverse the key array to find a key which has the same criteria

> -    value with the incoming packet. If found, go to the next step.

> -    Otherwise, insert a new key and a new item for the packet.

> +- TCP sequence number

> 

> -#. Locate the first packet in the item group via ``start_index``. Then

> -   traverse all packets in the item group via ``next_pkt_index``. If a

> -   packet is found which can be merged with the incoming one, merge them

> -   together. If one isn't found, insert the packet into this item group.

> -   Note that to merge two packets is to link them together via mbuf's

> -   ``next`` field.

> +- IPv4 ID. The IPv4 ID fields of the packets, whose DF bit is 0, should

> +  be increased by 1.

> 

> -When packets are flushed from the reassembly table, TCP/IPv4 GRO updates

> -packet header fields for the merged packets. Note that before reassembling

> -the packet, TCP/IPv4 GRO doesn't check if the checksums of packets are

> -correct. Also, TCP/IPv4 GRO doesn't re-calculate checksums for merged

> -packets.

> +.. note::

> +        We comply RFC 6864 to process the IPv4 ID field. Specifically,

> +        we check IPv4 ID fields for the packets whose DF bit is 0 and

> +        ignore IPv4 ID fields for the packets whose DF bit is 1.

> +        Additionally, packets which have different value of DF bit can't

> +        be merged.

> diff --git a/doc/guides/prog_guide/img/gro-key-algorithm.svg b/doc/guides/prog_guide/img/gro-key-algorithm.svg

> new file mode 100644

> index 0000000..94e42f5

> --- /dev/null

> +++ b/doc/guides/prog_guide/img/gro-key-algorithm.svg

> @@ -0,0 +1,223 @@

> +<?xml version="1.0" encoding="UTF-8" standalone="no"?>

> +<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.0//EN" "http://www.w3.org/TR/2001/REC-SVG-20010904/DTD/svg10.dtd">

> +<!-- Generated by Microsoft Visio 11.0, SVG Export, v1.0 gro-key-algorithm.svg Page-1 -->

> +<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:ev="http://www.w3.org/2001/xml-events"

> +		xmlns:v="http://schemas.microsoft.com/visio/2003/SVGExtensions/" width="6.06163in" height="2.66319in"

> +		viewBox="0 0 436.438 191.75" xml:space="preserve" color-interpolation-filters="sRGB" class="st10">

> +	<v:documentProperties v:langID="1033" v:viewMarkup="false"/>

> +

> +	<style type="text/css">

> +	<![CDATA[

> +		.st1 {fill:url(#grad30-4);stroke:#404040;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.25}

> +		.st2 {fill:#000000;font-family:Calibri;font-size:1.00001em}

> +		.st3 {font-size:1em;font-weight:bold}

> +		.st4 {fill:#000000;font-family:Calibri;font-size:1.00001em;font-weight:bold}

> +		.st5 {font-size:1em;font-weight:normal}

> +		.st6 {marker-end:url(#mrkr5-38);stroke:#404040;stroke-linecap:round;stroke-linejoin:round;stroke-width:1}

> +		.st7 {fill:#404040;fill-opacity:1;stroke:#404040;stroke-opacity:1;stroke-width:0.28409090909091}

> +		.st8 {fill:none;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.25}

> +		.st9 {fill:#000000;font-family:Calibri;font-size:0.833336em}

> +		.st10 {fill:none;fill-rule:evenodd;font-size:12px;overflow:visible;stroke-linecap:square;stroke-miterlimit:3}

> +	]]>

> +	</style>

> +

> +	<defs id="Patterns_And_Gradients">

> +		<linearGradient id="grad30-4" v:fillPattern="30"

> v:foreground="#c6d09f" v:background="#d1dab4" x1="0" y1="1" x2="0"

> y2="0">

> +			<stop offset="0" style="stop-color:#c6d09f;stop-

> opacity:1"/>

> +			<stop offset="1" style="stop-color:#d1dab4;stop-

> opacity:1"/>

> +		</linearGradient>

> +		<linearGradient id="grad30-35" v:fillPattern="30"

> v:foreground="#f0f0f0" v:background="#ffffff" x1="0" y1="1" x2="0" y2="0">

> +			<stop offset="0" style="stop-color:#f0f0f0;stop-

> opacity:1"/>

> +			<stop offset="1" style="stop-color:#ffffff;stop-

> opacity:1"/>

> +		</linearGradient>

> +	</defs>

> +	<defs id="Markers">

> +		<g id="lend5">

> +			<path d="M 2 1 L 0 0 L 1.98117 -0.993387 C 1.67173 -

> 0.364515 1.67301 0.372641 1.98465 1.00043 " style="stroke:none"/>

> +		</g>

> +		<marker id="mrkr5-38" class="st7" v:arrowType="5"

> v:arrowSize="2" v:setback="6.16" refX="-6.16" orient="auto"

> +				markerUnits="strokeWidth"

> overflow="visible">

> +			<use xlink:href="#lend5" transform="scale(-3.52,-

> 3.52) "/>

> +		</marker>

> +	</defs>

> +	<g v:mID="0" v:index="1" v:groupContext="foregroundPage">

> +		<title>Page-1</title>

> +		<v:pageProperties v:drawingScale="1" v:pageScale="1"

> v:drawingUnits="0" v:shadowOffsetX="9" v:shadowOffsetY="-9"/>

> +		<v:layer v:name="Connector" v:index="0"/>

> +		<g id="shape1-1" v:mID="1" v:groupContext="shape"

> transform="translate(0.25,-117.25)">

> +			<title>Rounded rectangle</title>

> +			<desc>Categorize into an existed “flow”</desc>

> +			<v:userDefs>

> +				<v:ud v:nameU="visVersion"

> v:val="VT0(14):26"/>

> +				<v:ud v:nameU="msvThemeColors"

> v:val="VT0(36):26"/>

> +				<v:ud v:nameU="msvThemeEffects"

> v:val="VT0(16):26"/>

> +			</v:userDefs>

> +			<v:textBlock v:margins="rect(4,4,4,4)"/>

> +			<v:textRect cx="90" cy="173.75" width="180"

> height="36"/>

> +			<path d="M171 191.75 A9.00007 9.00007 -180 0 0 180

> 182.75 L180 164.75 A9.00007 9.00007 -180 0 0 171 155.75 L9 155.75

> +						 A9.00007 9.00007 -180 0 0 -0

> 164.75 L0 182.75 A9.00007 9.00007 -180 0 0 9 191.75 L171 191.75 Z"

> +					class="st1"/>

> +			<text x="8.91" y="177.35" class="st2"

> v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Categorize into

> an <tspan

> +

> 	class="st3">existed</tspan><tspan class="st3" v:langID="2052">

> </tspan>“<tspan class="st3">flow</tspan>”</text>		</g>

> +		<g id="shape2-9" v:mID="2" v:groupContext="shape"

> transform="translate(0.25,-58.75)">

> +			<title>Rounded rectangle.2</title>

> +			<desc>Search for a “neighbor”</desc>

> +			<v:userDefs>

> +				<v:ud v:nameU="visVersion"

> v:val="VT0(14):26"/>

> +				<v:ud v:nameU="msvThemeColors"

> v:val="VT0(36):26"/>

> +				<v:ud v:nameU="msvThemeEffects"

> v:val="VT0(16):26"/>

> +			</v:userDefs>

> +			<v:textBlock v:margins="rect(4,4,4,4)"/>

> +			<v:textRect cx="90" cy="173.75" width="180"

> height="36"/>

> +			<path d="M171 191.75 A9.00007 9.00007 -180 0 0 180

> 182.75 L180 164.75 A9.00007 9.00007 -180 0 0 171 155.75 L9 155.75

> +						 A9.00007 9.00007 -180 0 0 -0

> 164.75 L0 182.75 A9.00007 9.00007 -180 0 0 9 191.75 L171 191.75 Z"

> +					class="st1"/>

> +			<text x="32.19" y="177.35" class="st2"

> v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Search for a

> “<tspan

> +

> 	class="st3">neighbor</tspan>”</text>		</g>

> +		<g id="shape3-14" v:mID="3" v:groupContext="shape"

> transform="translate(225.813,-117.25)">

> +			<title>Rounded rectangle.3</title>

> +			<desc>Insert a new “flow” and store the

> packet</desc>

> +			<v:userDefs>

> +				<v:ud v:nameU="visVersion"

> v:val="VT0(14):26"/>

> +				<v:ud v:nameU="msvThemeColors"

> v:val="VT0(36):26"/>

> +				<v:ud v:nameU="msvThemeEffects"

> v:val="VT0(16):26"/>

> +			</v:userDefs>

> +			<v:textBlock v:margins="rect(4,4,4,4)"/>

> +			<v:textRect cx="105.188" cy="173.75" width="210.38"

> height="36"/>

> +			<path d="M201.37 191.75 A9.00007 9.00007 -180 0 0

> 210.37 182.75 L210.37 164.75 A9.00007 9.00007 -180 0 0 201.37 155.75

> +						 L9 155.75 A9.00007 9.00007 -

> 180 0 0 -0 164.75 L0 182.75 A9.00007 9.00007 -180 0 0 9 191.75 L201.37 191.75

> +						 Z" class="st1"/>

> +			<text x="5.45" y="177.35" class="st2"

> v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Insert a <tspan

> +						class="st3">new

> </tspan>“<tspan class="st3">flow</tspan>” and <tspan class="st3">store

> </tspan>the packet</text>		</g>

> +		<g id="shape4-21" v:mID="4" v:groupContext="shape"

> transform="translate(225.25,-58.75)">

> +			<title>Rounded rectangle.4</title>

> +			<desc>Store the packet</desc>

> +			<v:userDefs>

> +				<v:ud v:nameU="visVersion"

> v:val="VT0(14):26"/>

> +				<v:ud v:nameU="msvThemeColors"

> v:val="VT0(36):26"/>

> +				<v:ud v:nameU="msvThemeEffects"

> v:val="VT0(16):26"/>

> +			</v:userDefs>

> +			<v:textBlock v:margins="rect(4,4,4,4)"/>

> +			<v:textRect cx="83.25" cy="173.75" width="166.5"

> height="36"/>

> +			<path d="M157.5 191.75 A9.00007 9.00007 -180 0 0

> 166.5 182.75 L166.5 164.75 A9.00007 9.00007 -180 0 0 157.5 155.75 L9

> +						 155.75 A9.00007 9.00007 -180

> 0 0 -0 164.75 L0 182.75 A9.00007 9.00007 -180 0 0 9 191.75 L157.5 191.75 Z"

> +					class="st1"/>

> +			<text x="42.81" y="177.35" class="st4"

> v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Store <tspan

> +						class="st5">the

> packet</tspan></text>		</g>

> +		<g id="shape5-26" v:mID="5" v:groupContext="shape"

> transform="translate(0.25,-0.25)">

> +			<title>Rounded rectangle.5</title>

> +			<desc>Merge the packet</desc>

> +			<v:userDefs>

> +				<v:ud v:nameU="visVersion"

> v:val="VT0(14):26"/>

> +				<v:ud v:nameU="msvThemeColors"

> v:val="VT0(36):26"/>

> +				<v:ud v:nameU="msvThemeEffects"

> v:val="VT0(16):26"/>

> +			</v:userDefs>

> +			<v:textBlock v:margins="rect(4,4,4,4)"/>

> +			<v:textRect cx="90" cy="173.75" width="180"

> height="36"/>

> +			<path d="M171 191.75 A9.00007 9.00007 -180 0 0 180

> 182.75 L180 164.75 A9.00007 9.00007 -180 0 0 171 155.75 L9 155.75

> +						 A9.00007 9.00007 -180 0 0 -0

> 164.75 L0 182.75 A9.00007 9.00007 -180 0 0 9 191.75 L171 191.75 Z"

> +					class="st1"/>

> +			<text x="46.59" y="177.35" class="st4"

> v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Merge <tspan

> +						class="st5">the

> packet</tspan></text>		</g>

> +		<g id="shape6-31" v:mID="6" v:groupContext="shape"

> v:layerMember="0" transform="translate(81.25,-175.75)">

> +			<title>Dynamic connector</title>

> +			<v:userDefs>

> +				<v:ud v:nameU="visVersion"

> v:val="VT0(14):26"/>

> +				<v:ud v:nameU="msvThemeColors"

> v:val="VT0(36):26"/>

> +				<v:ud v:nameU="msvThemeEffects"

> v:val="VT0(16):26"/>

> +			</v:userDefs>

> +			<path d="M9 191.75 L9 208.09" class="st6"/>

> +		</g>

> +		<g id="shape7-39" v:mID="7" v:groupContext="shape"

> v:layerMember="0" transform="translate(81.25,-117.25)">

> +			<title>Dynamic connector.7</title>

> +			<v:userDefs>

> +				<v:ud v:nameU="visVersion"

> v:val="VT0(14):26"/>

> +				<v:ud v:nameU="msvThemeColors"

> v:val="VT0(36):26"/>

> +				<v:ud v:nameU="msvThemeEffects"

> v:val="VT0(16):26"/>

> +			</v:userDefs>

> +			<path d="M9 191.75 L9 208.09" class="st6"/>

> +		</g>

> +		<g id="shape8-45" v:mID="8" v:groupContext="shape"

> v:layerMember="0" transform="translate(81.25,-58.75)">

> +			<title>Dynamic connector.8</title>

> +			<v:userDefs>

> +				<v:ud v:nameU="visVersion"

> v:val="VT0(14):26"/>

> +				<v:ud v:nameU="msvThemeColors"

> v:val="VT0(36):26"/>

> +				<v:ud v:nameU="msvThemeEffects"

> v:val="VT0(16):26"/>

> +			</v:userDefs>

> +			<path d="M9 191.75 L9 208.09" class="st6"/>

> +		</g>

> +		<g id="shape9-51" v:mID="9" v:groupContext="shape"

> v:layerMember="0" transform="translate(180.25,-126.25)">

> +			<title>Dynamic connector.9</title>

> +			<v:userDefs>

> +				<v:ud v:nameU="visVersion"

> v:val="VT0(14):26"/>

> +				<v:ud v:nameU="msvThemeColors"

> v:val="VT0(36):26"/>

> +				<v:ud v:nameU="msvThemeEffects"

> v:val="VT0(16):26"/>

> +			</v:userDefs>

> +			<path d="M0 182.75 L39.4 182.75" class="st6"/>

> +		</g>

> +		<g id="shape10-57" v:mID="10" v:groupContext="shape"

> v:layerMember="0" transform="translate(180.25,-67.75)">

> +			<title>Dynamic connector.10</title>

> +			<v:userDefs>

> +				<v:ud v:nameU="visVersion"

> v:val="VT0(14):26"/>

> +				<v:ud v:nameU="msvThemeColors"

> v:val="VT0(36):26"/>

> +				<v:ud v:nameU="msvThemeEffects"

> v:val="VT0(16):26"/>

> +			</v:userDefs>

> +			<path d="M0 182.75 L38.84 182.75" class="st6"/>

> +		</g>

> +		<g id="shape11-63" v:mID="11" v:groupContext="shape"

> transform="translate(65.5,-173.5)">

> +			<title>Sheet.11</title>

> +			<desc>packet</desc>

> +			<v:userDefs>

> +				<v:ud v:nameU="msvThemeColors"

> v:val="VT0(36):26"/>

> +				<v:ud v:nameU="msvThemeEffects"

> v:val="VT0(16):26"/>

> +			</v:userDefs>

> +			<v:textBlock v:margins="rect(4,4,4,4)"/>

> +			<v:textRect cx="24.75" cy="182.75" width="49.5"

> height="18"/>

> +			<rect x="0" y="173.75" width="49.5" height="18"

> class="st8"/>

> +			<text x="8.46" y="186.35" class="st2"

> v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>packet</text>

> 		</g>

> +		<g id="shape14-66" v:mID="14" v:groupContext="shape"

> transform="translate(98.125,-98.125)">

> +			<title>Sheet.14</title>

> +			<desc>find a “flow”</desc>

> +			<v:userDefs>

> +				<v:ud v:nameU="msvThemeColors"

> v:val="VT0(36):26"/>

> +				<v:ud v:nameU="msvThemeEffects"

> v:val="VT0(16):26"/>

> +			</v:userDefs>

> +			<v:textBlock v:margins="rect(4,4,4,4)"/>

> +			<v:textRect cx="32.0625" cy="183.875" width="64.13"

> height="15.75"/>

> +			<rect x="0" y="176" width="64.125" height="15.75"

> class="st8"/>

> +			<text x="6.41" y="186.88" class="st9"

> v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>find a

> “flow”</text>		</g>

> +		<g id="shape15-69" v:mID="15" v:groupContext="shape"

> transform="translate(99.25,-39.625)">

> +			<title>Sheet.15</title>

> +			<desc>find a “neighbor”</desc>

> +			<v:userDefs>

> +				<v:ud v:nameU="msvThemeColors"

> v:val="VT0(36):26"/>

> +				<v:ud v:nameU="msvThemeEffects"

> v:val="VT0(16):26"/>

> +			</v:userDefs>

> +			<v:textBlock v:margins="rect(4,4,4,4)"/>

> +			<v:textRect cx="40.5" cy="183.875" width="81"

> height="15.75"/>

> +			<rect x="0" y="176" width="81" height="15.75"

> class="st8"/>

> +			<text x="5.48" y="186.88" class="st9"

> v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>find a

> “neighbor”</text>		</g>

> +		<g id="shape13-72" v:mID="13" v:groupContext="shape"

> transform="translate(181.375,-79)">

> +			<title>Sheet.13</title>

> +			<desc>not find</desc>

> +			<v:userDefs>

> +				<v:ud v:nameU="msvThemeColors"

> v:val="VT0(36):26"/>

> +				<v:ud v:nameU="msvThemeEffects"

> v:val="VT0(16):26"/>

> +			</v:userDefs>

> +			<v:textBlock v:margins="rect(4,4,4,4)"/>

> +			<v:textRect cx="21.375" cy="183.875" width="42.75"

> height="15.75"/>

> +			<rect x="0" y="176" width="42.75" height="15.75"

> class="st8"/>

> +			<text x="5.38" y="186.88" class="st9"

> v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>not find</text>

> 		</g>

> +		<g id="shape12-75" v:mID="12" v:groupContext="shape"

> transform="translate(181.375,-137.5)">

> +			<title>Sheet.12</title>

> +			<desc>not find</desc>

> +			<v:userDefs>

> +				<v:ud v:nameU="msvThemeColors"

> v:val="VT0(36):26"/>

> +				<v:ud v:nameU="msvThemeEffects"

> v:val="VT0(16):26"/>

> +			</v:userDefs>

> +			<v:textBlock v:margins="rect(4,4,4,4)"/>

> +			<v:textRect cx="21.375" cy="183.875" width="42.75"

> height="15.75"/>

> +			<rect x="0" y="176" width="42.75" height="15.75"

> class="st8"/>

> +			<text x="5.38" y="186.88" class="st9"

> v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>not find</text>

> 		</g>

> +	</g>

> +</svg>

> diff --git a/lib/librte_gro/gro_tcp4.c b/lib/librte_gro/gro_tcp4.c

> index 03e5ccf..27af23e 100644

> --- a/lib/librte_gro/gro_tcp4.c

> +++ b/lib/librte_gro/gro_tcp4.c

> @@ -6,8 +6,6 @@

>  #include <rte_mbuf.h>

>  #include <rte_cycles.h>

>  #include <rte_ethdev.h>

> -#include <rte_ip.h>

> -#include <rte_tcp.h>

> 

>  #include "gro_tcp4.h"

> 

> @@ -44,20 +42,20 @@ gro_tcp4_tbl_create(uint16_t socket_id,

>  	}

>  	tbl->max_item_num = entries_num;

> 

> -	size = sizeof(struct gro_tcp4_key) * entries_num;

> -	tbl->keys = rte_zmalloc_socket(__func__,

> +	size = sizeof(struct gro_tcp4_flow) * entries_num;

> +	tbl->flows = rte_zmalloc_socket(__func__,

>  			size,

>  			RTE_CACHE_LINE_SIZE,

>  			socket_id);

> -	if (tbl->keys == NULL) {

> +	if (tbl->flows == NULL) {

>  		rte_free(tbl->items);

>  		rte_free(tbl);

>  		return NULL;

>  	}

> -	/* INVALID_ARRAY_INDEX indicates empty key */

> +	/* INVALID_ARRAY_INDEX indicates an empty flow */

>  	for (i = 0; i < entries_num; i++)

> -		tbl->keys[i].start_index = INVALID_ARRAY_INDEX;

> -	tbl->max_key_num = entries_num;

> +		tbl->flows[i].start_index = INVALID_ARRAY_INDEX;

> +	tbl->max_flow_num = entries_num;

> 

>  	return tbl;

>  }

> @@ -69,116 +67,15 @@ gro_tcp4_tbl_destroy(void *tbl)

> 

>  	if (tcp_tbl) {

>  		rte_free(tcp_tbl->items);

> -		rte_free(tcp_tbl->keys);

> +		rte_free(tcp_tbl->flows);

>  	}

>  	rte_free(tcp_tbl);

>  }

> 

> -/*

> - * merge two TCP/IPv4 packets without updating checksums.

> - * If cmp is larger than 0, append the new packet to the

> - * original packet. Otherwise, pre-pend the new packet to

> - * the original packet.

> - */

> -static inline int

> -merge_two_tcp4_packets(struct gro_tcp4_item *item_src,

> -		struct rte_mbuf *pkt,

> -		uint16_t ip_id,

> -		uint32_t sent_seq,

> -		int cmp)

> -{

> -	struct rte_mbuf *pkt_head, *pkt_tail, *lastseg;

> -	uint16_t tcp_datalen;

> -

> -	if (cmp > 0) {

> -		pkt_head = item_src->firstseg;

> -		pkt_tail = pkt;

> -	} else {

> -		pkt_head = pkt;

> -		pkt_tail = item_src->firstseg;

> -	}

> -

> -	/* check if the packet length will be beyond the max value */

> -	tcp_datalen = pkt_tail->pkt_len - pkt_tail->l2_len -

> -		pkt_tail->l3_len - pkt_tail->l4_len;

> -	if (pkt_head->pkt_len - pkt_head->l2_len + tcp_datalen >

> -			TCP4_MAX_L3_LENGTH)

> -		return 0;

> -

> -	/* remove packet header for the tail packet */

> -	rte_pktmbuf_adj(pkt_tail,

> -			pkt_tail->l2_len +

> -			pkt_tail->l3_len +

> -			pkt_tail->l4_len);

> -

> -	/* chain two packets together */

> -	if (cmp > 0) {

> -		item_src->lastseg->next = pkt;

> -		item_src->lastseg = rte_pktmbuf_lastseg(pkt);

> -		/* update IP ID to the larger value */

> -		item_src->ip_id = ip_id;

> -	} else {

> -		lastseg = rte_pktmbuf_lastseg(pkt);

> -		lastseg->next = item_src->firstseg;

> -		item_src->firstseg = pkt;

> -		/* update sent_seq to the smaller value */

> -		item_src->sent_seq = sent_seq;

> -	}

> -	item_src->nb_merged++;

> -

> -	/* update mbuf metadata for the merged packet */

> -	pkt_head->nb_segs += pkt_tail->nb_segs;

> -	pkt_head->pkt_len += pkt_tail->pkt_len;

> -

> -	return 1;

> -}

> -

> -static inline int

> -check_seq_option(struct gro_tcp4_item *item,

> -		struct tcp_hdr *tcp_hdr,

> -		uint16_t tcp_hl,

> -		uint16_t tcp_dl,

> -		uint16_t ip_id,

> -		uint32_t sent_seq)

> -{

> -	struct rte_mbuf *pkt0 = item->firstseg;

> -	struct ipv4_hdr *ipv4_hdr0;

> -	struct tcp_hdr *tcp_hdr0;

> -	uint16_t tcp_hl0, tcp_dl0;

> -	uint16_t len;

> -

> -	ipv4_hdr0 = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt0, char *) +

> -			pkt0->l2_len);

> -	tcp_hdr0 = (struct tcp_hdr *)((char *)ipv4_hdr0 + pkt0->l3_len);

> -	tcp_hl0 = pkt0->l4_len;

> -

> -	/* check if TCP option fields equal. If not, return 0. */

> -	len = RTE_MAX(tcp_hl, tcp_hl0) - sizeof(struct tcp_hdr);

> -	if ((tcp_hl != tcp_hl0) ||

> -			((len > 0) && (memcmp(tcp_hdr + 1,

> -					tcp_hdr0 + 1,

> -					len) != 0)))

> -		return 0;

> -

> -	/* check if the two packets are neighbors */

> -	tcp_dl0 = pkt0->pkt_len - pkt0->l2_len - pkt0->l3_len - tcp_hl0;

> -	if ((sent_seq == (item->sent_seq + tcp_dl0)) &&

> -			(ip_id == (item->ip_id + 1)))

> -		/* append the new packet */

> -		return 1;

> -	else if (((sent_seq + tcp_dl) == item->sent_seq) &&

> -			((ip_id + item->nb_merged) == item->ip_id))

> -		/* pre-pend the new packet */

> -		return -1;

> -	else

> -		return 0;

> -}

> -

>  static inline uint32_t

>  find_an_empty_item(struct gro_tcp4_tbl *tbl)

>  {

> -	uint32_t i;

> -	uint32_t max_item_num = tbl->max_item_num;

> +	uint32_t max_item_num = tbl->max_item_num, i;

> 

>  	for (i = 0; i < max_item_num; i++)

>  		if (tbl->items[i].firstseg == NULL)

> @@ -187,13 +84,12 @@ find_an_empty_item(struct gro_tcp4_tbl *tbl)

>  }

> 

>  static inline uint32_t

> -find_an_empty_key(struct gro_tcp4_tbl *tbl)

> +find_an_empty_flow(struct gro_tcp4_tbl *tbl)

>  {

> -	uint32_t i;

> -	uint32_t max_key_num = tbl->max_key_num;

> +	uint32_t max_flow_num = tbl->max_flow_num, i;

> 

> -	for (i = 0; i < max_key_num; i++)

> -		if (tbl->keys[i].start_index == INVALID_ARRAY_INDEX)

> +	for (i = 0; i < max_flow_num; i++)

> +		if (tbl->flows[i].start_index == INVALID_ARRAY_INDEX)

>  			return i;

>  	return INVALID_ARRAY_INDEX;

>  }

> @@ -201,10 +97,11 @@ find_an_empty_key(struct gro_tcp4_tbl *tbl)

>  static inline uint32_t

>  insert_new_item(struct gro_tcp4_tbl *tbl,

>  		struct rte_mbuf *pkt,

> -		uint16_t ip_id,

> -		uint32_t sent_seq,

> +		uint64_t start_time,

>  		uint32_t prev_idx,

> -		uint64_t start_time)

> +		uint32_t sent_seq,

> +		uint16_t ip_id,

> +		uint8_t is_atomic)

>  {

>  	uint32_t item_idx;

> 

> @@ -219,9 +116,10 @@ insert_new_item(struct gro_tcp4_tbl *tbl,

>  	tbl->items[item_idx].sent_seq = sent_seq;

>  	tbl->items[item_idx].ip_id = ip_id;

>  	tbl->items[item_idx].nb_merged = 1;

> +	tbl->items[item_idx].is_atomic = is_atomic;

>  	tbl->item_num++;

> 

> -	/* if the previous packet exists, chain the new one with it */

> +	/* If the previous packet exists, chain them together. */

>  	if (prev_idx != INVALID_ARRAY_INDEX) {

>  		tbl->items[item_idx].next_pkt_idx =

>  			tbl->items[prev_idx].next_pkt_idx;

> @@ -232,12 +130,13 @@ insert_new_item(struct gro_tcp4_tbl *tbl,

>  }

> 

>  static inline uint32_t

> -delete_item(struct gro_tcp4_tbl *tbl, uint32_t item_idx,

> +delete_item(struct gro_tcp4_tbl *tbl,

> +		uint32_t item_idx,

>  		uint32_t prev_item_idx)

>  {

>  	uint32_t next_idx = tbl->items[item_idx].next_pkt_idx;

> 

> -	/* set NULL to firstseg to indicate it's an empty item */

> +	/* NULL indicates an empty item. */

>  	tbl->items[item_idx].firstseg = NULL;

>  	tbl->item_num--;

>  	if (prev_item_idx != INVALID_ARRAY_INDEX)

> @@ -247,53 +146,33 @@ delete_item(struct gro_tcp4_tbl *tbl, uint32_t item_idx,

>  }

> 

>  static inline uint32_t

> -insert_new_key(struct gro_tcp4_tbl *tbl,

> -		struct tcp4_key *key_src,

> +insert_new_flow(struct gro_tcp4_tbl *tbl,

> +		struct tcp4_flow_key *src,

>  		uint32_t item_idx)

>  {

> -	struct tcp4_key *key_dst;

> -	uint32_t key_idx;

> +	struct tcp4_flow_key *dst;

> +	uint32_t flow_idx;

> 

> -	key_idx = find_an_empty_key(tbl);

> -	if (key_idx == INVALID_ARRAY_INDEX)

> +	flow_idx = find_an_empty_flow(tbl);

> +	if (unlikely(flow_idx == INVALID_ARRAY_INDEX))

>  		return INVALID_ARRAY_INDEX;

> 

> -	key_dst = &(tbl->keys[key_idx].key);

> +	dst = &(tbl->flows[flow_idx].key);

> 

> -	ether_addr_copy(&(key_src->eth_saddr), &(key_dst->eth_saddr));

> -	ether_addr_copy(&(key_src->eth_daddr), &(key_dst->eth_daddr));

> -	key_dst->ip_src_addr = key_src->ip_src_addr;

> -	key_dst->ip_dst_addr = key_src->ip_dst_addr;

> -	key_dst->recv_ack = key_src->recv_ack;

> -	key_dst->src_port = key_src->src_port;

> -	key_dst->dst_port = key_src->dst_port;

> +	ether_addr_copy(&(src->eth_saddr), &(dst->eth_saddr));

> +	ether_addr_copy(&(src->eth_daddr), &(dst->eth_daddr));

> +	dst->ip_src_addr = src->ip_src_addr;

> +	dst->ip_dst_addr = src->ip_dst_addr;

> +	dst->recv_ack = src->recv_ack;

> +	dst->src_port = src->src_port;

> +	dst->dst_port = src->dst_port;

> 

> -	/* non-INVALID_ARRAY_INDEX value indicates this key is valid */

> -	tbl->keys[key_idx].start_index = item_idx;

> -	tbl->key_num++;

> +	tbl->flows[flow_idx].start_index = item_idx;

> +	tbl->flow_num++;

> 

> -	return key_idx;

> +	return flow_idx;

>  }

> 

> -static inline int

> -is_same_key(struct tcp4_key k1, struct tcp4_key k2)

> -{

> -	if (is_same_ether_addr(&k1.eth_saddr, &k2.eth_saddr) == 0)

> -		return 0;

> -

> -	if (is_same_ether_addr(&k1.eth_daddr, &k2.eth_daddr) == 0)

> -		return 0;

> -

> -	return ((k1.ip_src_addr == k2.ip_src_addr) &&

> -			(k1.ip_dst_addr == k2.ip_dst_addr) &&

> -			(k1.recv_ack == k2.recv_ack) &&

> -			(k1.src_port == k2.src_port) &&

> -			(k1.dst_port == k2.dst_port));

> -}

> -

> -/*

> - * update packet length for the flushed packet.

> - */

>  static inline void

>  update_header(struct gro_tcp4_item *item)

>  {

> @@ -315,84 +194,106 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,

>  	struct ipv4_hdr *ipv4_hdr;

>  	struct tcp_hdr *tcp_hdr;

>  	uint32_t sent_seq;

> -	uint16_t tcp_dl, ip_id;

> +	uint16_t tcp_dl, ip_id, frag_off, hdr_len;

> +	uint8_t is_atomic;

> 

> -	struct tcp4_key key;

> +	struct tcp4_flow_key key;

>  	uint32_t cur_idx, prev_idx, item_idx;

> -	uint32_t i, max_key_num;

> +	uint32_t i, max_flow_num, left_flow_num;

>  	int cmp;

> +	uint8_t find;

> 

>  	eth_hdr = rte_pktmbuf_mtod(pkt, struct ether_hdr *);

>  	ipv4_hdr = (struct ipv4_hdr *)((char *)eth_hdr + pkt->l2_len);

>  	tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);

> +	hdr_len = pkt->l2_len + pkt->l3_len + pkt->l4_len;

> 

>  	/*

> -	 * if FIN, SYN, RST, PSH, URG, ECE or

> -	 * CWR is set, return immediately.

> +	 * Don't process the packet which has FIN, SYN, RST, PSH, URG, ECE

> +	 * or CWR set.

>  	 */

>  	if (tcp_hdr->tcp_flags != TCP_ACK_FLAG)

>  		return -1;

> -	/* if payload length is 0, return immediately */

> -	tcp_dl = rte_be_to_cpu_16(ipv4_hdr->total_length) - pkt->l3_len -

> -		pkt->l4_len;

> -	if (tcp_dl == 0)

> +	/*

> +	 * Don't process the packet whose payload length is less than or

> +	 * equal to 0.

> +	 */

> +	tcp_dl = pkt->pkt_len - hdr_len;

> +	if (tcp_dl <= 0)

>  		return -1;

> 

> -	ip_id = rte_be_to_cpu_16(ipv4_hdr->packet_id);

> +	/*

> +	 * Save IPv4 ID for the packet whose DF bit is 0. For the packet

> +	 * whose DF bit is 1, IPv4 ID is ignored.

> +	 */

> +	frag_off = rte_be_to_cpu_16(ipv4_hdr->fragment_offset);

> +	is_atomic = (frag_off & IPV4_HDR_DF_FLAG) == IPV4_HDR_DF_FLAG;

> +	ip_id = is_atomic ? 0 : rte_be_to_cpu_16(ipv4_hdr->packet_id);

>  	sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);

> 

>  	ether_addr_copy(&(eth_hdr->s_addr), &(key.eth_saddr));

>  	ether_addr_copy(&(eth_hdr->d_addr), &(key.eth_daddr));

>  	key.ip_src_addr = ipv4_hdr->src_addr;

>  	key.ip_dst_addr = ipv4_hdr->dst_addr;

> +	key.recv_ack = tcp_hdr->recv_ack;

>  	key.src_port = tcp_hdr->src_port;

>  	key.dst_port = tcp_hdr->dst_port;

> -	key.recv_ack = tcp_hdr->recv_ack;

> 

> -	/* search for a key */

> -	max_key_num = tbl->max_key_num;

> -	for (i = 0; i < max_key_num; i++) {

> -		if ((tbl->keys[i].start_index != INVALID_ARRAY_INDEX) &&

> -				is_same_key(tbl->keys[i].key, key))

> -			break;

> +	/* Search for a matched flow. */

> +	max_flow_num = tbl->max_flow_num;

> +	left_flow_num = tbl->flow_num;

> +	find = 0;

> +	for (i = 0; i < max_flow_num && left_flow_num; i++) {

> +		if (tbl->flows[i].start_index != INVALID_ARRAY_INDEX) {

> +			if (is_same_tcp4_flow(tbl->flows[i].key, key)) {

> +				find = 1;

> +				break;

> +			}

> +			left_flow_num--;

> +		}

>  	}

> 

> -	/* can't find a key, so insert a new key and a new item. */

> -	if (i == tbl->max_key_num) {

> -		item_idx = insert_new_item(tbl, pkt, ip_id, sent_seq,

> -				INVALID_ARRAY_INDEX, start_time);

> +	/*

> +	 * Fail to find a matched flow. Insert a new flow and store the

> +	 * packet into the flow.

> +	 */

> +	if (find == 0) {

> +		item_idx = insert_new_item(tbl, pkt, start_time,

> +				INVALID_ARRAY_INDEX, sent_seq, ip_id,

> +				is_atomic);

>  		if (item_idx == INVALID_ARRAY_INDEX)

>  			return -1;

> -		if (insert_new_key(tbl, &key, item_idx) ==

> +		if (insert_new_flow(tbl, &key, item_idx) ==

>  				INVALID_ARRAY_INDEX) {

> -			/*

> -			 * fail to insert a new key, so

> -			 * delete the inserted item

> -			 */

> +			/* Fail to insert a new flow. */

>  			delete_item(tbl, item_idx, INVALID_ARRAY_INDEX);

>  			return -1;

>  		}

>  		return 0;

>  	}

> 

> -	/* traverse all packets in the item group to find one to merge */

> -	cur_idx = tbl->keys[i].start_index;

> +	/*

> +	 * Check all packets in the flow and try to find a neighbor for

> +	 * the input packet.

> +	 */

> +	cur_idx = tbl->flows[i].start_index;

>  	prev_idx = cur_idx;

>  	do {

>  		cmp = check_seq_option(&(tbl->items[cur_idx]), tcp_hdr,

> -				pkt->l4_len, tcp_dl, ip_id, sent_seq);

> +				sent_seq, ip_id, pkt->l4_len, tcp_dl, 0,

> +				is_atomic);

>  		if (cmp) {

>  			if (merge_two_tcp4_packets(&(tbl->items[cur_idx]),

> -						pkt, ip_id,

> -						sent_seq, cmp))

> +						pkt, cmp, sent_seq, ip_id, 0))

>  				return 1;

>  			/*

> -			 * fail to merge two packets since the packet

> -			 * length will be greater than the max value.

> -			 * So insert the packet into the item group.

> +			 * Fail to merge the two packets, as the packet

> +			 * length is greater than the max value. Store

> +			 * the packet into the flow.

>  			 */

> -			if (insert_new_item(tbl, pkt, ip_id, sent_seq,

> -						prev_idx, start_time) ==

> +			if (insert_new_item(tbl, pkt, start_time, prev_idx,

> +						sent_seq, ip_id,

> +						is_atomic) ==

>  					INVALID_ARRAY_INDEX)

>  				return -1;

>  			return 0;

> @@ -401,12 +302,9 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,

>  		cur_idx = tbl->items[cur_idx].next_pkt_idx;

>  	} while (cur_idx != INVALID_ARRAY_INDEX);

> 

> -	/*

> -	 * can't find a packet in the item group to merge,

> -	 * so insert the packet into the item group.

> -	 */

> -	if (insert_new_item(tbl, pkt, ip_id, sent_seq, prev_idx,

> -				start_time) == INVALID_ARRAY_INDEX)

> +	/* Fail to find a neighbor, so store the packet into the flow. */

> +	if (insert_new_item(tbl, pkt, start_time, prev_idx, sent_seq,

> +				ip_id, is_atomic) == INVALID_ARRAY_INDEX)

>  		return -1;

> 

>  	return 0;

> @@ -418,46 +316,35 @@ gro_tcp4_tbl_timeout_flush(struct gro_tcp4_tbl *tbl,

>  		struct rte_mbuf **out,

>  		uint16_t nb_out)

>  {

> -	uint16_t k = 0;

> +	uint32_t max_flow_num = tbl->max_flow_num;

>  	uint32_t i, j;

> -	uint32_t max_key_num = tbl->max_key_num;

> +	uint16_t k = 0;

> 

> -	for (i = 0; i < max_key_num; i++) {

> -		/* all keys have been checked, return immediately */

> -		if (tbl->key_num == 0)

> +	for (i = 0; i < max_flow_num; i++) {

> +		if (unlikely(tbl->flow_num == 0))

>  			return k;

> 

> -		j = tbl->keys[i].start_index;

> +		j = tbl->flows[i].start_index;

>  		while (j != INVALID_ARRAY_INDEX) {

>  			if (tbl->items[j].start_time <= flush_timestamp) {

>  				out[k++] = tbl->items[j].firstseg;

>  				if (tbl->items[j].nb_merged > 1)

>  					update_header(&(tbl->items[j]));

>  				/*

> -				 * delete the item and get

> -				 * the next packet index

> +				 * Delete the packet and get the next

> +				 * packet in the flow.

>  				 */

> -				j = delete_item(tbl, j,

> -						INVALID_ARRAY_INDEX);

> +				j = delete_item(tbl, j, INVALID_ARRAY_INDEX);

> +				tbl->flows[i].start_index = j;

> +				if (j == INVALID_ARRAY_INDEX)

> +					tbl->flow_num--;

> 

> -				/*

> -				 * delete the key as all of

> -				 * packets are flushed

> -				 */

> -				if (j == INVALID_ARRAY_INDEX) {

> -					tbl->keys[i].start_index =

> -						INVALID_ARRAY_INDEX;

> -					tbl->key_num--;

> -				} else

> -					/* update start_index of the key */

> -					tbl->keys[i].start_index = j;

> -

> -				if (k == nb_out)

> +				if (unlikely(k == nb_out))

>  					return k;

>  			} else

>  				/*

> -				 * left packets of this key won't be

> -				 * timeout, so go to check other keys.

> +				 * The left packets in this flow won't be

> +				 * timeout. Go to check other flows.

>  				 */

>  				break;

>  		}
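
An illustrative aside on the flow search in gro_tcp4_reassemble() above: the new loop stops as soon as it has examined flow_num occupied slots, instead of always walking all max_flow_num entries. The following is a minimal stand-alone sketch of that idea, not the library code. The sketch_* names and the two-field key are hypothetical and only serve to keep the example free of DPDK headers; the real key also carries MAC addresses, IPv4 addresses and the ACK number.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for INVALID_ARRAY_INDEX. */
#define SKETCH_INVALID_INDEX 0xffffffffU

/* Hypothetical, reduced flow key: ports only, for illustration. */
struct sketch_flow_key {
	uint16_t src_port;
	uint16_t dst_port;
};

struct sketch_flow {
	struct sketch_flow_key key;
	uint32_t start_index; /* SKETCH_INVALID_INDEX marks an empty slot */
};

static int
sketch_same_flow(const struct sketch_flow_key *a,
		const struct sketch_flow_key *b)
{
	return a->src_port == b->src_port && a->dst_port == b->dst_port;
}

/*
 * Return the slot of the matching flow, or SKETCH_INVALID_INDEX.
 * 'flow_num' is the number of occupied slots; the scan ends once
 * that many occupied slots have been compared without a match.
 */
static uint32_t
sketch_find_flow(const struct sketch_flow *flows, uint32_t max_flow_num,
		uint32_t flow_num, const struct sketch_flow_key *key)
{
	uint32_t i, left = flow_num;

	assert(flows != NULL && key != NULL);
	for (i = 0; i < max_flow_num && left; i++) {
		if (flows[i].start_index == SKETCH_INVALID_INDEX)
			continue; /* empty slot: does not count */
		if (sketch_same_flow(&flows[i].key, key))
			return i;
		left--;
	}
	return SKETCH_INVALID_INDEX;
}
```

With a sparsely filled table the early exit avoids touching the unused tail of the array; an empty table (flow_num == 0) is rejected without any comparison.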

> diff --git a/lib/librte_gro/gro_tcp4.h b/lib/librte_gro/gro_tcp4.h

> index d129523..c2b66a8 100644

> --- a/lib/librte_gro/gro_tcp4.h

> +++ b/lib/librte_gro/gro_tcp4.h

> @@ -5,17 +5,20 @@

>  #ifndef _GRO_TCP4_H_

>  #define _GRO_TCP4_H_

> 

> +#include <rte_ip.h>

> +#include <rte_tcp.h>

> +

>  #define INVALID_ARRAY_INDEX 0xffffffffUL

>  #define GRO_TCP4_TBL_MAX_ITEM_NUM (1024UL * 1024UL)

> 

>  /*

> - * the max L3 length of a TCP/IPv4 packet. The L3 length

> - * is the sum of ipv4 header, tcp header and L4 payload.

> + * The max length of a IPv4 packet, which includes the length of the L3

> + * header, the L4 header and the data payload.

>   */

> -#define TCP4_MAX_L3_LENGTH UINT16_MAX

> +#define MAX_IPV4_PKT_LENGTH UINT16_MAX

> 

> -/* criteria of mergeing packets */

> -struct tcp4_key {

> +/* Header fields representing a TCP/IPv4 flow */

> +struct tcp4_flow_key {

>  	struct ether_addr eth_saddr;

>  	struct ether_addr eth_daddr;

>  	uint32_t ip_src_addr;

> @@ -26,77 +29,76 @@ struct tcp4_key {

>  	uint16_t dst_port;

>  };

> 

> -struct gro_tcp4_key {

> -	struct tcp4_key key;

> +struct gro_tcp4_flow {

> +	struct tcp4_flow_key key;

>  	/*

> -	 * the index of the first packet in the item group.

> -	 * If the value is INVALID_ARRAY_INDEX, it means

> -	 * the key is empty.

> +	 * The index of the first packet in the flow.

> +	 * INVALID_ARRAY_INDEX indicates an empty flow.

>  	 */

>  	uint32_t start_index;

>  };

> 

>  struct gro_tcp4_item {

>  	/*

> -	 * first segment of the packet. If the value

> +	 * The first MBUF segment of the packet. If the value

>  	 * is NULL, it means the item is empty.

>  	 */

>  	struct rte_mbuf *firstseg;

> -	/* last segment of the packet */

> +	/* The last MBUF segment of the packet */

>  	struct rte_mbuf *lastseg;

>  	/*

> -	 * the time when the first packet is inserted

> -	 * into the table. If a packet in the table is

> -	 * merged with an incoming packet, this value

> -	 * won't be updated. We set this value only

> -	 * when the first packet is inserted into the

> -	 * table.

> +	 * The time when the first packet is inserted into the table.

> +	 * This value won't be updated, even if the packet is merged

> +	 * with other packets.

>  	 */

>  	uint64_t start_time;

>  	/*

> -	 * we use next_pkt_idx to chain the packets that

> -	 * have same key value but can't be merged together.

> +	 * next_pkt_idx is used to chain the packets that

> +	 * are in the same flow but can't be merged together

> +	 * (e.g. caused by packet reordering).

>  	 */

>  	uint32_t next_pkt_idx;

> -	/* the sequence number of the packet */

> +	/* TCP sequence number of the packet */

>  	uint32_t sent_seq;

> -	/* the IP ID of the packet */

> +	/* IPv4 ID of the packet */

>  	uint16_t ip_id;

> -	/* the number of merged packets */

> +	/* The number of merged packets */

>  	uint16_t nb_merged;

> +	/* Indicate if IPv4 ID can be ignored */

> +	uint8_t is_atomic;

>  };

> 

>  /*

> - * TCP/IPv4 reassembly table structure.

> + * TCP/IPv4 reassembly table structure

>   */

>  struct gro_tcp4_tbl {

>  	/* item array */

>  	struct gro_tcp4_item *items;

> -	/* key array */

> -	struct gro_tcp4_key *keys;

> +	/* flow array */

> +	struct gro_tcp4_flow *flows;

>  	/* current item number */

>  	uint32_t item_num;

> -	/* current key num */

> -	uint32_t key_num;

> +	/* current flow num */

> +	uint32_t flow_num;

>  	/* item array size */

>  	uint32_t max_item_num;

> -	/* key array size */

> -	uint32_t max_key_num;

> +	/* flow array size */

> +	uint32_t max_flow_num;

>  };

> 

>  /**

>   * This function creates a TCP/IPv4 reassembly table.

>   *

>   * @param socket_id

> - *  socket index for allocating TCP/IPv4 reassemble table

> + *  Socket index for allocating the TCP/IPv4 reassemble table

>   * @param max_flow_num

> - *  the maximum number of flows in the TCP/IPv4 GRO table

> + *  The maximum number of flows in the TCP/IPv4 GRO table

>   * @param max_item_per_flow

> - *  the maximum packet number per flow.

> + *  The maximum number of packets per flow

>   *

>   * @return

> - *  if create successfully, return a pointer which points to the

> - *  created TCP/IPv4 GRO table. Otherwise, return NULL.

> + *  - Return the table pointer on success.

> + *  - Return NULL on failure.

>   */

>  void *gro_tcp4_tbl_create(uint16_t socket_id,

>  		uint16_t max_flow_num,

> @@ -106,62 +108,56 @@ void *gro_tcp4_tbl_create(uint16_t socket_id,

>   * This function destroys a TCP/IPv4 reassembly table.

>   *

>   * @param tbl

> - *  a pointer points to the TCP/IPv4 reassembly table.

> + *  Pointer pointing to the TCP/IPv4 reassembly table.

>   */

>  void gro_tcp4_tbl_destroy(void *tbl);

> 

>  /**

> - * This function searches for a packet in the TCP/IPv4 reassembly table

> - * to merge with the inputted one. To merge two packets is to chain them

> - * together and update packet headers. Packets, whose SYN, FIN, RST, PSH

> - * CWR, ECE or URG bit is set, are returned immediately. Packets which

> - * only have packet headers (i.e. without data) are also returned

> - * immediately. Otherwise, the packet is either merged, or inserted into

> - * the table. Besides, if there is no available space to insert the

> - * packet, this function returns immediately too.

> + * This function merges a TCP/IPv4 packet. It doesn't process the packet,

> + * which has SYN, FIN, RST, PSH, CWR, ECE or URG set, or doesn't have

> + * payload.

>   *

> - * This function assumes the inputted packet is with correct IPv4 and

> - * TCP checksums. And if two packets are merged, it won't re-calculate

> - * IPv4 and TCP checksums. Besides, if the inputted packet is IP

> - * fragmented, it assumes the packet is complete (with TCP header).

> + * This function doesn't check if the packet has correct checksums and

> + * doesn't re-calculate checksums for the merged packet. Additionally,

> + * it assumes the packets are complete (i.e., MF==0 && frag_off==0),

> + * when IP fragmentation is possible (i.e., DF==0). It returns the

> + * packet, if the packet has invalid parameters (e.g. SYN bit is set)

> + * or there is no available space in the table.

>   *

>   * @param pkt

> - *  packet to reassemble.

> + *  Packet to reassemble

>   * @param tbl

> - *  a pointer that points to a TCP/IPv4 reassembly table.

> + *  Pointer pointing to the TCP/IPv4 reassembly table

>   * @start_time

> - *  the start time that the packet is inserted into the table

> + *  The time when the packet is inserted into the table

>   *

>   * @return

> - *  if the packet doesn't have data, or SYN, FIN, RST, PSH, CWR, ECE

> - *  or URG bit is set, or there is no available space in the table to

> - *  insert a new item or a new key, return a negative value. If the

> - *  packet is merged successfully, return an positive value. If the

> - *  packet is inserted into the table, return 0.

> + *  - Return a positive value if the packet is merged.

> + *  - Return zero if the packet isn't merged but stored in the table.

> + *  - Return a negative value for invalid parameters or no available

> + *    space in the table.

>   */

>  int32_t gro_tcp4_reassemble(struct rte_mbuf *pkt,

>  		struct gro_tcp4_tbl *tbl,

>  		uint64_t start_time);

> 

>  /**

> - * This function flushes timeout packets in a TCP/IPv4 reassembly table

> - * to applications, and without updating checksums for merged packets.

> - * The max number of flushed timeout packets is the element number of

> - * the array which is used to keep flushed packets.

> + * This function flushes timeout packets in a TCP/IPv4 reassembly table,

> + * and without updating checksums.

>   *

>   * @param tbl

> - *  a pointer that points to a TCP GRO table.

> + *  TCP/IPv4 reassembly table pointer

>   * @param flush_timestamp

> - *  this function flushes packets which are inserted into the table

> - *  before or at the flush_timestamp.

> + *  Flush packets which are inserted into the table before or at the

> + *  flush_timestamp.

>   * @param out

> - *  pointer array which is used to keep flushed packets.

> + *  Pointer array used to keep flushed packets

>   * @param nb_out

> - *  the element number of out. It's also the max number of timeout

> + *  The element number in 'out'. It also determines the maximum number of

>   *  packets that can be flushed finally.

>   *

>   * @return

> - *  the number of packets that are returned.

> + *  The number of flushed packets

>   */

>  uint16_t gro_tcp4_tbl_timeout_flush(struct gro_tcp4_tbl *tbl,

>  		uint64_t flush_timestamp,

> @@ -173,10 +169,131 @@ uint16_t gro_tcp4_tbl_timeout_flush(struct gro_tcp4_tbl *tbl,

>   * reassembly table.

>   *

>   * @param tbl

> - *  pointer points to a TCP/IPv4 reassembly table.

> + *  TCP/IPv4 reassembly table pointer

>   *

>   * @return

> - *  the number of packets in the table

> + *  The number of packets in the table

>   */

>  uint32_t gro_tcp4_tbl_pkt_count(void *tbl);

> +

> +/*

> + * Check if two TCP/IPv4 packets belong to the same flow.

> + */

> +static inline int

> +is_same_tcp4_flow(struct tcp4_flow_key k1, struct tcp4_flow_key k2)

> +{

> +	return (is_same_ether_addr(&k1.eth_saddr, &k2.eth_saddr) &&

> +			is_same_ether_addr(&k1.eth_daddr, &k2.eth_daddr) &&

> +			(k1.ip_src_addr == k2.ip_src_addr) &&

> +			(k1.ip_dst_addr == k2.ip_dst_addr) &&

> +			(k1.recv_ack == k2.recv_ack) &&

> +			(k1.src_port == k2.src_port) &&

> +			(k1.dst_port == k2.dst_port));

> +}

> +

> +/*

> + * Check if two TCP/IPv4 packets are neighbors.

> + */

> +static inline int

> +check_seq_option(struct gro_tcp4_item *item,

> +		struct tcp_hdr *tcph,

> +		uint32_t sent_seq,

> +		uint16_t ip_id,

> +		uint16_t tcp_hl,

> +		uint16_t tcp_dl,

> +		uint16_t l2_offset,

> +		uint8_t is_atomic)

> +{

> +	struct rte_mbuf *pkt_orig = item->firstseg;

> +	struct ipv4_hdr *iph_orig;

> +	struct tcp_hdr *tcph_orig;

> +	uint16_t len, l4_len_orig;

> +

> +	iph_orig = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt_orig, char *) +

> +			l2_offset + pkt_orig->l2_len);

> +	tcph_orig = (struct tcp_hdr *)((char *)iph_orig + pkt_orig->l3_len);

> +	l4_len_orig = pkt_orig->l4_len;

> +

> +	/* Check if TCP option fields equal */

> +	len = RTE_MAX(tcp_hl, l4_len_orig) - sizeof(struct tcp_hdr);

> +	if ((tcp_hl != l4_len_orig) || ((len > 0) &&

> +				(memcmp(tcph + 1, tcph_orig + 1,

> +					len) != 0)))

> +		return 0;

> +

> +	/* Don't merge packets whose DF bits are different */

> +	if (unlikely(item->is_atomic ^ is_atomic))

> +		return 0;

> +

> +	/* Check if the two packets are neighbors */

> +	len = pkt_orig->pkt_len - l2_offset - pkt_orig->l2_len -

> +		pkt_orig->l3_len - l4_len_orig;

> +	if ((sent_seq == item->sent_seq + len) && (is_atomic ||

> +				(ip_id == item->ip_id + item->nb_merged)))

> +		/* Append the new packet */

> +		return 1;

> +	else if ((sent_seq + tcp_dl == item->sent_seq) && (is_atomic ||

> +				(ip_id + 1 == item->ip_id)))

> +		/* Pre-pend the new packet */

> +		return -1;

> +

> +	return 0;

> +}

> +

> +/*

> + * Merge two TCP/IPv4 packets without updating checksums.

> + * If cmp is larger than 0, append the new packet to the

> + * original packet. Otherwise, pre-pend the new packet to

> + * the original packet.

> + */

> +static inline int

> +merge_two_tcp4_packets(struct gro_tcp4_item *item,

> +		struct rte_mbuf *pkt,

> +		int cmp,

> +		uint32_t sent_seq,

> +		uint16_t ip_id,

> +		uint16_t l2_offset)

> +{

> +	struct rte_mbuf *pkt_head, *pkt_tail, *lastseg;

> +	uint16_t hdr_len, l2_len;

> +

> +	if (cmp > 0) {

> +		pkt_head = item->firstseg;

> +		pkt_tail = pkt;

> +	} else {

> +		pkt_head = pkt;

> +		pkt_tail = item->firstseg;

> +	}

> +

> +	/* Check if the IPv4 packet length is greater than the max value */

> +	hdr_len = l2_offset + pkt_head->l2_len + pkt_head->l3_len +

> +		pkt_head->l4_len;

> +	l2_len = l2_offset > 0 ? pkt_head->outer_l2_len : pkt_head->l2_len;

> +	if (unlikely(pkt_head->pkt_len - l2_len + pkt_tail->pkt_len - hdr_len >

> +			MAX_IPV4_PKT_LENGTH))

> +		return 0;

> +

> +	/* Remove the packet header */

> +	rte_pktmbuf_adj(pkt_tail, hdr_len);

> +

> +	/* Chain two packets together */

> +	if (cmp > 0) {

> +		item->lastseg->next = pkt;

> +		item->lastseg = rte_pktmbuf_lastseg(pkt);

> +	} else {

> +		lastseg = rte_pktmbuf_lastseg(pkt);

> +		lastseg->next = item->firstseg;

> +		item->firstseg = pkt;

> +		/* Update sent_seq and ip_id */

> +		item->sent_seq = sent_seq;

> +		item->ip_id = ip_id;

> +	}

> +	item->nb_merged++;

> +

> +	/* Update MBUF metadata for the merged packet */

> +	pkt_head->nb_segs += pkt_tail->nb_segs;

> +	pkt_head->pkt_len += pkt_tail->pkt_len;

> +

> +	return 1;

> +}

>  #endif
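
An illustrative aside on the RFC 6864 handling in check_seq_option() above: when DF is set (an "atomic" datagram) the IPv4 ID is ignored and only the TCP sequence numbers decide whether two packets are neighbors; when DF is clear the IDs must also be consecutive. The following is a minimal stand-alone sketch of that rule under simplifying assumptions, not the library code. The sketch_* names are hypothetical, the item stores its payload length directly instead of deriving it from the mbuf, and the TCP option comparison is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* DF bit of the host-order flags/fragment-offset field (bit 14). */
#define SKETCH_IPV4_DF_FLAG 0x4000

/* Hypothetical, reduced view of a table item. */
struct sketch_item {
	uint32_t sent_seq;    /* TCP sequence number of the first byte */
	uint16_t payload_len; /* TCP payload bytes held by the item */
	uint16_t ip_id;       /* IPv4 ID of the first packet */
	uint16_t nb_merged;   /* number of packets merged into the item */
	uint8_t is_atomic;    /* 1 if DF was set, so the ID is ignored */
};

/* Derive the atomic flag from the host-order fragment_offset field. */
static uint8_t
sketch_is_atomic(uint16_t frag_off_host)
{
	return (frag_off_host & SKETCH_IPV4_DF_FLAG) == SKETCH_IPV4_DF_FLAG;
}

/*
 * Return 1 to append the new packet to the item, -1 to pre-pend it,
 * and 0 if the two are not neighbors.
 */
static int
sketch_neighbor_cmp(const struct sketch_item *item, uint32_t sent_seq,
		uint16_t ip_id, uint16_t tcp_dl, uint8_t is_atomic)
{
	assert(item != NULL);

	/* Packets with different DF bits are never merged. */
	if (item->is_atomic ^ is_atomic)
		return 0;

	if (sent_seq == item->sent_seq + item->payload_len &&
			(is_atomic || ip_id == item->ip_id + item->nb_merged))
		return 1;
	if (sent_seq + tcp_dl == item->sent_seq &&
			(is_atomic || ip_id + 1 == item->ip_id))
		return -1;

	return 0;
}
```

Note that the ID comparisons are written the same way as in the patch, so the sums are evaluated as int and an ID that wraps from 65535 to 0 does not compare as consecutive.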

> diff --git a/lib/librte_gro/rte_gro.c b/lib/librte_gro/rte_gro.c

> index d6b8cd1..7176c0e 100644

> --- a/lib/librte_gro/rte_gro.c

> +++ b/lib/librte_gro/rte_gro.c

> @@ -23,11 +23,14 @@ static gro_tbl_destroy_fn tbl_destroy_fn[RTE_GRO_TYPE_MAX_NUM] = {

>  static gro_tbl_pkt_count_fn tbl_pkt_count_fn[RTE_GRO_TYPE_MAX_NUM] = {

>  			gro_tcp4_tbl_pkt_count, NULL};

> 

> +#define IS_IPV4_TCP_PKT(ptype) (RTE_ETH_IS_IPV4_HDR(ptype) && \

> +		((ptype & RTE_PTYPE_L4_TCP) == RTE_PTYPE_L4_TCP))

> +

>  /*

> - * GRO context structure, which is used to merge packets. It keeps

> - * many reassembly tables of desired GRO types. Applications need to

> - * create GRO context objects before using rte_gro_reassemble to

> - * perform GRO.

> + * GRO context structure. It keeps the table structures, which are

> + * used to merge packets, for different GRO types. Before using

> + * rte_gro_reassemble(), applications need to create the GRO context

> + * first.

>   */

>  struct gro_ctx {

>  	/* GRO types to perform */

> @@ -65,7 +68,7 @@ rte_gro_ctx_create(const struct rte_gro_param *param)

>  				param->max_flow_num,

>  				param->max_item_per_flow);

>  		if (gro_ctx->tbls[i] == NULL) {

> -			/* destroy all created tables */

> +			/* Destroy all created tables */

>  			gro_ctx->gro_types = gro_types;

>  			rte_gro_ctx_destroy(gro_ctx);

>  			return NULL;

> @@ -85,8 +88,6 @@ rte_gro_ctx_destroy(void *ctx)

>  	uint64_t gro_type_flag;

>  	uint8_t i;

> 

> -	if (gro_ctx == NULL)

> -		return;

>  	for (i = 0; i < RTE_GRO_TYPE_MAX_NUM; i++) {

>  		gro_type_flag = 1ULL << i;

>  		if ((gro_ctx->gro_types & gro_type_flag) == 0)

> @@ -103,62 +104,54 @@ rte_gro_reassemble_burst(struct rte_mbuf **pkts,

>  		uint16_t nb_pkts,

>  		const struct rte_gro_param *param)

>  {

> -	uint16_t i;

> -	uint16_t nb_after_gro = nb_pkts;

> -	uint32_t item_num;

> -

> -	/* allocate a reassembly table for TCP/IPv4 GRO */

> +	/* Allocate a reassembly table for TCP/IPv4 GRO */

>  	struct gro_tcp4_tbl tcp_tbl;

> -	struct gro_tcp4_key tcp_keys[RTE_GRO_MAX_BURST_ITEM_NUM];

> +	struct gro_tcp4_flow tcp_flows[RTE_GRO_MAX_BURST_ITEM_NUM];

>  	struct gro_tcp4_item tcp_items[RTE_GRO_MAX_BURST_ITEM_NUM] = {{0} };

> 

>  	struct rte_mbuf *unprocess_pkts[nb_pkts];

> -	uint16_t unprocess_num = 0;

> +	uint32_t item_num;

>  	int32_t ret;

> -	uint64_t current_time;

> +	uint16_t i, unprocess_num = 0, nb_after_gro = nb_pkts;

> 

> -	if ((param->gro_types & RTE_GRO_TCP_IPV4) == 0)

> +	if (unlikely((param->gro_types & RTE_GRO_TCP_IPV4) == 0))

>  		return nb_pkts;

> 

> -	/* get the actual number of packets */

> +	/* Get the maximum number of packets */

>  	item_num = RTE_MIN(nb_pkts, (param->max_flow_num *

> -			param->max_item_per_flow));

> +				param->max_item_per_flow));

>  	item_num = RTE_MIN(item_num, RTE_GRO_MAX_BURST_ITEM_NUM);

> 

>  	for (i = 0; i < item_num; i++)

> -		tcp_keys[i].start_index = INVALID_ARRAY_INDEX;

> +		tcp_flows[i].start_index = INVALID_ARRAY_INDEX;

> 

> -	tcp_tbl.keys = tcp_keys;

> +	tcp_tbl.flows = tcp_flows;

>  	tcp_tbl.items = tcp_items;

> -	tcp_tbl.key_num = 0;

> +	tcp_tbl.flow_num = 0;

>  	tcp_tbl.item_num = 0;

> -	tcp_tbl.max_key_num = item_num;

> +	tcp_tbl.max_flow_num = item_num;

>  	tcp_tbl.max_item_num = item_num;

> 

> -	current_time = rte_rdtsc();

> -

>  	for (i = 0; i < nb_pkts; i++) {

> -		if ((pkts[i]->packet_type & (RTE_PTYPE_L3_IPV4 |

> -					RTE_PTYPE_L4_TCP)) ==

> -				(RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L4_TCP)) {

> -			ret = gro_tcp4_reassemble(pkts[i],

> -					&tcp_tbl,

> -					current_time);

> +		if (IS_IPV4_TCP_PKT(pkts[i]->packet_type)) {

> +			/*

> +			 * The timestamp is ignored, since all packets

> +			 * will be flushed from the tables.

> +			 */

> +			ret = gro_tcp4_reassemble(pkts[i], &tcp_tbl, 0);

>  			if (ret > 0)

> -				/* merge successfully */

> +				/* Merge successfully */

>  				nb_after_gro--;

> -			else if (ret < 0) {

> -				unprocess_pkts[unprocess_num++] =

> -					pkts[i];

> -			}

> +			else if (ret < 0)

> +				unprocess_pkts[unprocess_num++] = pkts[i];

>  		} else

>  			unprocess_pkts[unprocess_num++] = pkts[i];

>  	}

> 

> -	/* re-arrange GROed packets */

>  	if (nb_after_gro < nb_pkts) {

> -		i = gro_tcp4_tbl_timeout_flush(&tcp_tbl, current_time,

> -				pkts, nb_pkts);

> +		/* Flush all packets from the tables */

> +		i = gro_tcp4_tbl_timeout_flush(&tcp_tbl, 0, pkts, nb_pkts);

> +		/* Copy unprocessed packets */

>  		if (unprocess_num > 0) {

>  			memcpy(&pkts[i], unprocess_pkts,

>  					sizeof(struct rte_mbuf *) *

> @@ -174,31 +167,28 @@ rte_gro_reassemble(struct rte_mbuf **pkts,

>  		uint16_t nb_pkts,

>  		void *ctx)

>  {

> -	uint16_t i, unprocess_num = 0;

>  	struct rte_mbuf *unprocess_pkts[nb_pkts];

>  	struct gro_ctx *gro_ctx = ctx;

> +	void *tcp_tbl;

>  	uint64_t current_time;

> +	uint16_t i, unprocess_num = 0;

> 

> -	if ((gro_ctx->gro_types & RTE_GRO_TCP_IPV4) == 0)

> +	if (unlikely((gro_ctx->gro_types & RTE_GRO_TCP_IPV4) == 0))

>  		return nb_pkts;

> 

> +	tcp_tbl = gro_ctx->tbls[RTE_GRO_TCP_IPV4_INDEX];

>  	current_time = rte_rdtsc();

> 

>  	for (i = 0; i < nb_pkts; i++) {

> -		if ((pkts[i]->packet_type & (RTE_PTYPE_L3_IPV4 |

> -					RTE_PTYPE_L4_TCP)) ==

> -				(RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L4_TCP))

> {

> -			if (gro_tcp4_reassemble(pkts[i],

> -						gro_ctx->tbls

> -						[RTE_GRO_TCP_IPV4_INDEX],

> +		if (IS_IPV4_TCP_PKT(pkts[i]->packet_type)) {

> +			if (gro_tcp4_reassemble(pkts[i], tcp_tbl,

>  						current_time) < 0)

>  				unprocess_pkts[unprocess_num++] = pkts[i];

>  		} else

>  			unprocess_pkts[unprocess_num++] = pkts[i];

>  	}

>  	if (unprocess_num > 0) {

> -		memcpy(pkts, unprocess_pkts,

> -				sizeof(struct rte_mbuf *) *

> +		memcpy(pkts, unprocess_pkts, sizeof(struct rte_mbuf *) *

>  				unprocess_num);

>  	}

> 

> @@ -224,6 +214,7 @@ rte_gro_timeout_flush(void *ctx,

>  				flush_timestamp,

>  				out, max_nb_out);

>  	}

> +

>  	return 0;

>  }

> 

> @@ -232,19 +223,20 @@ rte_gro_get_pkt_count(void *ctx)

>  {

>  	struct gro_ctx *gro_ctx = ctx;

>  	gro_tbl_pkt_count_fn pkt_count_fn;

> +	uint64_t gro_types = gro_ctx->gro_types, flag;

>  	uint64_t item_num = 0;

> -	uint64_t gro_type_flag;

>  	uint8_t i;

> 

> -	for (i = 0; i < RTE_GRO_TYPE_MAX_NUM; i++) {

> -		gro_type_flag = 1ULL << i;

> -		if ((gro_ctx->gro_types & gro_type_flag) == 0)

> +	for (i = 0; i < RTE_GRO_TYPE_MAX_NUM && gro_types; i++) {

> +		flag = 1ULL << i;

> +		if ((gro_types & flag) == 0)

>  			continue;

> 

> +		gro_types ^= flag;

>  		pkt_count_fn = tbl_pkt_count_fn[i];

> -		if (pkt_count_fn == NULL)

> -			continue;

> -		item_num += pkt_count_fn(gro_ctx->tbls[i]);

> +		if (pkt_count_fn)

> +			item_num += pkt_count_fn(gro_ctx->tbls[i]);

>  	}

> +

>  	return item_num;

>  }
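
The early-exit loop in the rte_gro_get_pkt_count() hunk above can be illustrated with a small standalone sketch. It is not the library code: TYPE_MAX_NUM, pkt_count_fn, count_three(), count_five() and count_pkts() are hypothetical stand-ins, and the iters counter exists only to make the early exit visible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TYPE_MAX_NUM 64 /* hypothetical stand-in for RTE_GRO_TYPE_MAX_NUM */

/* Hypothetical per-type counting callback (stand-in for gro_tbl_pkt_count_fn). */
typedef uint32_t (*pkt_count_fn)(void *tbl);

static uint32_t count_three(void *tbl) { (void)tbl; return 3; }
static uint32_t count_five(void *tbl) { (void)tbl; return 5; }

/*
 * Sum the packet counts of all enabled types. Each handled bit is
 * cleared from 'types', so the loop stops as soon as no enabled type
 * remains instead of always walking all TYPE_MAX_NUM bits.
 */
static uint64_t
count_pkts(uint64_t types, pkt_count_fn const fns[], void *const tbls[],
		unsigned int *iters)
{
	uint64_t flag, total = 0;
	uint8_t i;

	*iters = 0;
	for (i = 0; i < TYPE_MAX_NUM && types; i++) {
		(*iters)++;
		flag = 1ULL << i;
		if ((types & flag) == 0)
			continue;

		types ^= flag; /* this type is handled; clear its bit */
		if (fns[i])
			total += fns[i](tbls[i]);
	}

	return total;
}
```

With only bits 0 and 2 enabled, this loop runs three iterations rather than 64, which is the effect of the added `&& gro_types` condition.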

> diff --git a/lib/librte_gro/rte_gro.h b/lib/librte_gro/rte_gro.h

> index 81a2eac..7979a59 100644

> --- a/lib/librte_gro/rte_gro.h

> +++ b/lib/librte_gro/rte_gro.h

> @@ -31,8 +31,8 @@ extern "C" {

>  /**< TCP/IPv4 GRO flag */

> 

>  /**

> - * A structure which is used to create GRO context objects or tell

> - * rte_gro_reassemble_burst() what reassembly rules are demanded.

> + * Structure used to create GRO context objects or used to pass

> + * application-determined parameters to rte_gro_reassemble_burst().

>   */

>  struct rte_gro_param {

>  	uint64_t gro_types;

> @@ -78,26 +78,23 @@ void rte_gro_ctx_destroy(void *ctx);

> 

>  /**

>   * This is one of the main reassembly APIs, which merges numbers of

> - * packets at a time. It assumes that all inputted packets are with

> - * correct checksums. That is, applications should guarantee all

> - * inputted packets are correct. Besides, it doesn't re-calculate

> - * checksums for merged packets. If inputted packets are IP fragmented,

> - * this function assumes them are complete (i.e. with L4 header). After

> - * finishing processing, it returns all GROed packets to applications

> - * immediately.

> + * packets at a time. It doesn't check if input packets have correct

> + * checksums and doesn't re-calculate checksums for merged packets.

> + * It assumes the packets are complete (i.e., MF==0 && frag_off==0),

> + * when IP fragmentation is possible (i.e., DF==0). The GROed packets

> + * are returned as soon as the function finishes.

>   *

>   * @param pkts

> - *  a pointer array which points to the packets to reassemble. Besides,

> - *  it keeps mbuf addresses for the GROed packets.

> + *  Pointer array pointing to the packets to reassemble. Besides, it

> + *  keeps MBUF addresses for the GROed packets.

>   * @param nb_pkts

> - *  the number of packets to reassemble.

> + *  The number of packets to reassemble

>   * @param param

> - *  applications use it to tell rte_gro_reassemble_burst() what rules

> - *  are demanded.

> + *  Application-determined parameters for reassembling packets.

>   *

>   * @return

> - *  the number of packets after been GROed. If no packets are merged,

> - *  the returned value is nb_pkts.

> + *  The number of packets after being GROed. If no packets are merged,

> + *  the return value equals nb_pkts.

>   */

>  uint16_t rte_gro_reassemble_burst(struct rte_mbuf **pkts,

>  		uint16_t nb_pkts,

> @@ -107,32 +104,28 @@ uint16_t rte_gro_reassemble_burst(struct

> rte_mbuf **pkts,

>   * @warning

>   * @b EXPERIMENTAL: this API may change without prior notice

>   *

> - * Reassembly function, which tries to merge inputted packets with

> - * the packets in the reassembly tables of a given GRO context. This

> - * function assumes all inputted packets are with correct checksums.

> - * And it won't update checksums if two packets are merged. Besides,

> - * if inputted packets are IP fragmented, this function assumes they

> - * are complete packets (i.e. with L4 header).

> + * Reassembly function, which tries to merge input packets with the

> + * existing packets in the reassembly tables of a given GRO context.

> + * It doesn't check if input packets have correct checksums and doesn't

> + * re-calculate checksums for merged packets. Additionally, it assumes

> + * the packets are complete (i.e., MF==0 && frag_off==0), when IP

> + * fragmentation is possible (i.e., DF==0).

>   *

> - * If the inputted packets don't have data or are with unsupported GRO

> - * types etc., they won't be processed and are returned to applications.

> - * Otherwise, the inputted packets are either merged or inserted into

> - * the table. If applications want get packets in the table, they need

> - * to call flush API.

> + * If the input packets have invalid parameters (e.g. no data payload,

> + * unsupported GRO types), they are returned to applications. Otherwise,

> + * they are either merged or inserted into the table. Applications need

> + * to flush packets from the tables by flush API, if they want to get the

> + * GROed packets.

>   *

>   * @param pkts

> - *  packet to reassemble. Besides, after this function finishes, it

> - *  keeps the unprocessed packets (e.g. without data or unsupported

> - *  GRO types).

> + *  Packets to reassemble. It's also used to store the unprocessed packets.

>   * @param nb_pkts

> - *  the number of packets to reassemble.

> + *  The number of packets to reassemble

>   * @param ctx

> - *  a pointer points to a GRO context object.

> + *  GRO context object pointer

>   *

>   * @return

> - *  return the number of unprocessed packets (e.g. without data or

> - *  unsupported GRO types). If all packets are processed (merged or

> - *  inserted into the table), return 0.

> + *  The number of unprocessed packets.

>   */

>  uint16_t rte_gro_reassemble(struct rte_mbuf **pkts,

>  		uint16_t nb_pkts,

> @@ -142,29 +135,28 @@ uint16_t rte_gro_reassemble(struct rte_mbuf

> **pkts,

>   * @warning

>   * @b EXPERIMENTAL: this API may change without prior notice

>   *

> - * This function flushes the timeout packets from reassembly tables of

> - * desired GRO types. The max number of flushed timeout packets is the

> - * element number of the array which is used to keep the flushed packets.

> + * This function flushes the timeout packets from the reassembly tables

> + * of desired GRO types. The max number of flushed packets is the

> + * element number of 'out'.

>   *

> - * Besides, this function won't re-calculate checksums for merged

> - * packets in the tables. That is, the returned packets may be with

> - * wrong checksums.

> + * Additionally, the flushed packets may have incorrect checksums, since

> + * this function doesn't re-calculate checksums for merged packets.

>   *

>   * @param ctx

> - *  a pointer points to a GRO context object.

> + *  GRO context object pointer.

>   * @param timeout_cycles

> - *  max TTL for packets in reassembly tables, measured in nanosecond.

> + *  The max TTL for packets in reassembly tables, measured in nanosecond.

>   * @param gro_types

> - *  this function only flushes packets which belong to the GRO types

> - *  specified by gro_types.

> + *  This function flushes packets whose GRO types are specified by

> + *  gro_types.

>   * @param out

> - *  a pointer array that is used to keep flushed timeout packets.

> + *  Pointer array used to keep flushed packets.

>   * @param max_nb_out

> - *  the element number of out. It's also the max number of timeout

> + *  The element number of 'out'. It's also the max number of timeout

>   *  packets that can be flushed finally.

>   *

>   * @return

> - *  the number of flushed packets. If no packets are flushed, return 0.

> + *  The number of flushed packets.

>   */

>  uint16_t rte_gro_timeout_flush(void *ctx,

>  		uint64_t timeout_cycles,

> @@ -180,10 +172,10 @@ uint16_t rte_gro_timeout_flush(void *ctx,

>   * of a given GRO context.

>   *

>   * @param ctx

> - *  pointer points to a GRO context object.

> + *  GRO context object pointer.

>   *

>   * @return

> - *  the number of packets in all reassembly tables.

> + *  The number of packets in the tables.

>   */

>  uint64_t rte_gro_get_pkt_count(void *ctx);

> 

> --

> 2.7.4
  
Thomas Monjalon Jan. 10, 2018, 12:09 a.m. UTC | #2
Hi,

05/01/2018 07:12, Jiayu Hu:
> - Remove needless check and variants
> - For better understanding, update the programmer guide and rename
>   internal functions and variants
> - For supporting tunneled gro, move common internal functions from
>   gro_tcp4.c to gro_tcp4.h
> - Comply RFC 6864 to process the IPv4 ID field

I think you could split this patch into several ones.
Please remember that the git history can be used later to understand
why the changes were made.

Thanks
  
Hu, Jiayu Jan. 10, 2018, 1:55 a.m. UTC | #3
> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> Sent: Wednesday, January 10, 2018 8:09 AM
> To: Hu, Jiayu <jiayu.hu@intel.com>
> Cc: dev@dpdk.org; Richardson, Bruce <bruce.richardson@intel.com>; Chen,
> Junjie J <junjie.j.chen@intel.com>; Tan, Jianfeng <jianfeng.tan@intel.com>;
> stephen@networkplumber.org; Yigit, Ferruh <ferruh.yigit@intel.com>;
> Ananyev, Konstantin <konstantin.ananyev@intel.com>; Yao, Lei A
> <lei.a.yao@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v4 1/2] gro: code cleanup
> 
> Hi,
> 
> 05/01/2018 07:12, Jiayu Hu:
> > - Remove needless check and variants
> > - For better understanding, update the programmer guide and rename
> >   internal functions and variants
> > - For supporting tunneled gro, move common internal functions from
> >   gro_tcp4.c to gro_tcp4.h
> > - Comply RFC 6864 to process the IPv4 ID field
> 
> I think you could split this patch into several ones.
> Please remember that the git history can be used later to understand
> why the changes were made.

Thanks for your suggestion. I will split this patch into three patches: code cleanup,
complying with RFC 6864 when processing the IPv4 ID field, and extracting common
functions to support tunneled GRO.

Regards,
Jiayu
> 
> Thanks
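
For reference, the IPv4 ID rule that the guide changes in this patch describe (check the ID only when DF is 0, ignore it when DF is 1, never merge packets with different DF values) can be sketched as below. This is a simplified illustration, not the gro_tcp4 code: struct toy_hdr and can_append() are hypothetical names, and only the append direction for two single packets is covered.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified view of the header fields that matter here. */
struct toy_hdr {
	uint32_t tcp_seq;     /* TCP sequence number */
	uint16_t payload_len; /* TCP payload length in bytes */
	uint16_t ip_id;       /* IPv4 ID field */
	uint8_t df;           /* 1 if the DF bit is set, else 0 */
};

/* Return 1 if 'next' can be appended directly after 'prev', else 0. */
static int
can_append(const struct toy_hdr *prev, const struct toy_hdr *next)
{
	/* Packets with different DF values are never merged. */
	if (prev->df != next->df)
		return 0;

	/* The TCP payloads must be contiguous. */
	if (next->tcp_seq != prev->tcp_seq + prev->payload_len)
		return 0;

	/* DF == 0: the IPv4 ID must increase by 1 (with 16-bit wrap-around). */
	if (!prev->df && next->ip_id != (uint16_t)(prev->ip_id + 1))
		return 0;

	/* DF == 1: the IPv4 ID is ignored. */
	return 1;
}
```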
  

Patch

diff --git a/doc/guides/prog_guide/generic_receive_offload_lib.rst b/doc/guides/prog_guide/generic_receive_offload_lib.rst
index 22e50ec..c2d7a41 100644
--- a/doc/guides/prog_guide/generic_receive_offload_lib.rst
+++ b/doc/guides/prog_guide/generic_receive_offload_lib.rst
@@ -32,128 +32,162 @@  Generic Receive Offload Library
 ===============================
 
 Generic Receive Offload (GRO) is a widely used SW-based offloading
-technique to reduce per-packet processing overhead. It gains performance
-by reassembling small packets into large ones. To enable more flexibility
-to applications, DPDK implements GRO as a standalone library. Applications
-explicitly use the GRO library to merge small packets into large ones.
-
-The GRO library assumes all input packets have correct checksums. In
-addition, the GRO library doesn't re-calculate checksums for merged
-packets. If input packets are IP fragmented, the GRO library assumes
-they are complete packets (i.e. with L4 headers).
-
-Currently, the GRO library implements TCP/IPv4 packet reassembly.
-
-Reassembly Modes
-----------------
-
-The GRO library provides two reassembly modes: lightweight and
-heavyweight mode. If applications want to merge packets in a simple way,
-they can use the lightweight mode API. If applications want more
-fine-grained controls, they can choose the heavyweight mode API.
-
-Lightweight Mode
-~~~~~~~~~~~~~~~~
-
-The ``rte_gro_reassemble_burst()`` function is used for reassembly in
-lightweight mode. It tries to merge N input packets at a time, where
-N should be less than or equal to ``RTE_GRO_MAX_BURST_ITEM_NUM``.
-
-In each invocation, ``rte_gro_reassemble_burst()`` allocates temporary
-reassembly tables for the desired GRO types. Note that the reassembly
-table is a table structure used to reassemble packets and different GRO
-types (e.g. TCP/IPv4 GRO and TCP/IPv6 GRO) have different reassembly table
-structures. The ``rte_gro_reassemble_burst()`` function uses the reassembly
-tables to merge the N input packets.
-
-For applications, performing GRO in lightweight mode is simple. They
-just need to invoke ``rte_gro_reassemble_burst()``. Applications can get
-GROed packets as soon as ``rte_gro_reassemble_burst()`` returns.
-
-Heavyweight Mode
-~~~~~~~~~~~~~~~~
-
-The ``rte_gro_reassemble()`` function is used for reassembly in heavyweight
-mode. Compared with the lightweight mode, performing GRO in heavyweight mode
-is relatively complicated.
-
-Before performing GRO, applications need to create a GRO context object
-by calling ``rte_gro_ctx_create()``. A GRO context object holds the
-reassembly tables of desired GRO types. Note that all update/lookup
-operations on the context object are not thread safe. So if different
-processes or threads want to access the same context object simultaneously,
-some external syncing mechanisms must be used.
-
-Once the GRO context is created, applications can then use the
-``rte_gro_reassemble()`` function to merge packets. In each invocation,
-``rte_gro_reassemble()`` tries to merge input packets with the packets
-in the reassembly tables. If an input packet is an unsupported GRO type,
-or other errors happen (e.g. SYN bit is set), ``rte_gro_reassemble()``
-returns the packet to applications. Otherwise, the input packet is either
-merged or inserted into a reassembly table.
-
-When applications want to get GRO processed packets, they need to use
-``rte_gro_timeout_flush()`` to flush them from the tables manually.
+technique to reduce per-packet processing overheads. By reassembling
+small packets into larger ones, GRO enables applications to process
+fewer large packets directly, thus reducing the number of packets to
+be processed. To benefit DPDK-based applications, such as Open vSwitch,
+DPDK also provides its own GRO implementation. In DPDK, GRO is implemented
+as a standalone library. Applications explicitly use the GRO library to
+reassemble packets.
+
+Overview
+--------
+
+In the GRO library, there are many GRO types, which are defined by packet
+types. Each GRO type is in charge of processing one kind of packet. For
+example, TCP/IPv4 GRO processes TCP/IPv4 packets.
+
+Each GRO type has a reassembly function, which defines its own algorithm
+and table structure to reassemble packets. Input packets are assigned to
+the corresponding GRO functions by MBUF->packet_type.
+
+The GRO library doesn't check if input packets have correct checksums and
+doesn't re-calculate checksums for merged packets. The GRO library
+assumes the packets are complete (i.e., MF==0 && frag_off==0), when IP
+fragmentation is possible (i.e., DF==0). Additionally, it complies with
+RFC 6864 when processing the IPv4 ID field.
 
-TCP/IPv4 GRO
-------------
+Currently, the GRO library provides GRO support for TCP/IPv4 packets.
+
+Two Sets of API
+---------------
+
+For different usage scenarios, the GRO library provides two sets of API.
+One is called the lightweight mode API, which enables applications to
+merge a small number of packets rapidly; the other is called the
+heavyweight mode API, which provides fine-grained control to
+applications and supports merging a large number of packets.
+
+Lightweight Mode API
+~~~~~~~~~~~~~~~~~~~~
+
+The lightweight mode only has one function ``rte_gro_reassemble_burst()``,
+which processes N packets at a time. Using the lightweight mode API to
+merge packets is very simple. Calling ``rte_gro_reassemble_burst()`` is
+enough. The GROed packets are returned to applications as soon as it
+finishes.
+
+In ``rte_gro_reassemble_burst()``, table structures of different GRO
+types are allocated on the stack. This design simplifies applications'
+operations. However, limited by the stack size, the maximum number of
+packets that ``rte_gro_reassemble_burst()`` can process in an invocation
+should be less than or equal to ``RTE_GRO_MAX_BURST_ITEM_NUM``.
+
+Heavyweight Mode API
+~~~~~~~~~~~~~~~~~~~~
+
+Compared with the lightweight mode, using the heavyweight mode API is
+relatively complex. Firstly, applications need to create a GRO context
+by ``rte_gro_ctx_create()``. ``rte_gro_ctx_create()`` allocates table
+structures on the heap and stores their pointers in the GRO context.
+Secondly, applications use ``rte_gro_reassemble()`` to merge packets.
+If input packets have invalid parameters, ``rte_gro_reassemble()``
+returns them to applications. For example, packets of unsupported GRO
+types or TCP SYN packets are returned. Otherwise, the input packets are
+either merged with the existing packets in the tables or inserted into the
+tables. Finally, applications use ``rte_gro_timeout_flush()`` to flush
+packets from the tables, when they want to get the GROed packets.
+
+Note that all update/lookup operations on the GRO context are not thread
+safe. So if different processes or threads want to access the same
+context object simultaneously, some external syncing mechanisms must be
+used.
+
+Reassembly Algorithm
+--------------------
+
+The reassembly algorithm is used for reassembling packets. In the GRO
+library, different GRO types can use different algorithms. In this
+section, we will introduce an algorithm, which is used by TCP/IPv4 GRO.
 
-TCP/IPv4 GRO supports merging small TCP/IPv4 packets into large ones,
-using a table structure called the TCP/IPv4 reassembly table.
+Challenges
+~~~~~~~~~~
 
-TCP/IPv4 Reassembly Table
-~~~~~~~~~~~~~~~~~~~~~~~~~
+The reassembly algorithm determines the efficiency of GRO. There are two
+challenges in the algorithm design:
 
-A TCP/IPv4 reassembly table includes a "key" array and an "item" array.
-The key array keeps the criteria to merge packets and the item array
-keeps the packet information.
+- a high cost algorithm/implementation would cause packet dropping in a
+  high speed network.
 
-Each key in the key array points to an item group, which consists of
-packets which have the same criteria values but can't be merged. A key
-in the key array includes two parts:
+- packet reordering makes it hard to merge packets. For example, Linux
+  GRO fails to merge packets when it encounters packet reordering.
 
-* ``criteria``: the criteria to merge packets. If two packets can be
-  merged, they must have the same criteria values.
+The above two challenges require that our algorithm be:
 
-* ``start_index``: the item array index of the first packet in the item
-  group.
+- lightweight enough to scale to fast networking speeds
 
-Each element in the item array keeps the information of a packet. An item
-in the item array mainly includes three parts:
+- capable of handling packet reordering
 
-* ``firstseg``: the mbuf address of the first segment of the packet.
+In DPDK GRO, we use a key-based algorithm to address the two challenges.
 
-* ``lastseg``: the mbuf address of the last segment of the packet.
+Key-based Reassembly Algorithm
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+:numref:`figure_gro-key-algorithm` illustrates the procedure of the
+key-based algorithm. Packets are classified into "flows" by some header
+fields (we call them the "key"). To process an input packet, the algorithm
+first searches for a matching "flow" (i.e., one with the same key value),
+then checks all packets in the "flow" and tries to find a "neighbor" for
+it. If a "neighbor" is found, the two packets are merged together. If no
+"neighbor" is found, the packet is stored into its "flow". If no matching
+"flow" is found, a new "flow" is inserted and the packet is stored into
+it.
+
+.. note::
+        Packets in the same "flow" that can't be merged are always the
+        result of packet reordering.
+
+The key-based algorithm has two characteristics:
+
+- classifying packets into "flows" to accelerate packet aggregation is
+  simple (addresses challenge 1).
+
+- storing out-of-order packets makes it possible to merge them later
+  (addresses challenge 2).
+
+.. _figure_gro-key-algorithm:
+
+.. figure:: img/gro-key-algorithm.*
+   :align: center
+
+   Key-based Reassembly Algorithm
+
+TCP/IPv4 GRO
+------------
 
-* ``next_pkt_index``: the item array index of the next packet in the same
-  item group. TCP/IPv4 GRO uses ``next_pkt_index`` to chain the packets
-  that have the same criteria value but can't be merged together.
+The table structure used by TCP/IPv4 GRO contains two arrays: flow array
+and item array. The flow array keeps flow information, and the item array
+keeps packet information.
 
-Procedure to Reassemble a Packet
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Header fields used to define a TCP/IPv4 flow include:
 
-To reassemble an incoming packet needs three steps:
+- source and destination: Ethernet and IP address, TCP port
 
-#. Check if the packet should be processed. Packets with one of the
-   following properties aren't processed and are returned immediately:
+- TCP acknowledgment number
 
-   * FIN, SYN, RST, URG, PSH, ECE or CWR bit is set.
+TCP/IPv4 packets whose FIN, SYN, RST, URG, PSH, ECE or CWR bit is set
+won't be processed.
 
-   * L4 payload length is 0.
+Header fields deciding if two packets are neighbors include:
 
-#.  Traverse the key array to find a key which has the same criteria
-    value with the incoming packet. If found, go to the next step.
-    Otherwise, insert a new key and a new item for the packet.
+- TCP sequence number
 
-#. Locate the first packet in the item group via ``start_index``. Then
-   traverse all packets in the item group via ``next_pkt_index``. If a
-   packet is found which can be merged with the incoming one, merge them
-   together. If one isn't found, insert the packet into this item group.
-   Note that to merge two packets is to link them together via mbuf's
-   ``next`` field.
+- IPv4 ID. The IPv4 ID fields of packets whose DF bit is 0 must
+  increase by 1 between neighbors.
 
-When packets are flushed from the reassembly table, TCP/IPv4 GRO updates
-packet header fields for the merged packets. Note that before reassembling
-the packet, TCP/IPv4 GRO doesn't check if the checksums of packets are
-correct. Also, TCP/IPv4 GRO doesn't re-calculate checksums for merged
-packets.
+.. note::
+        We comply with RFC 6864 when processing the IPv4 ID field.
+        Specifically, we check IPv4 ID fields for the packets whose DF
+        bit is 0 and ignore IPv4 ID fields for the packets whose DF bit
+        is 1. Additionally, packets which have different values of the
+        DF bit can't be merged.
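
The key-based procedure described in the guide text above (find a matching "flow", then look for a "neighbor", otherwise store) can be sketched as a toy model. This is only an illustration under simplifying assumptions, not the librte_gro implementation: all names (toy_pkt, toy_item, toy_flow, toy_tbl, store_item, toy_reassemble) are hypothetical, the key is a single integer, and neighbors are decided by sequence number and length alone, with no mbuf chaining or IPv4 ID check.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_FLOWS 4
#define MAX_ITEMS 8
#define INVALID_INDEX 0xffffffffU

struct toy_pkt  { uint32_t key; uint32_t seq; uint32_t len; };
struct toy_item { uint32_t seq; uint32_t len; uint32_t next; };
struct toy_flow { uint32_t key; uint32_t start; };

struct toy_tbl {
	struct toy_flow flows[MAX_FLOWS];
	struct toy_item items[MAX_ITEMS];
	uint32_t flow_num, item_num;
};

/* Store a packet as a new item; return its index or INVALID_INDEX. */
static uint32_t
store_item(struct toy_tbl *t, const struct toy_pkt *p, uint32_t next)
{
	if (t->item_num == MAX_ITEMS)
		return INVALID_INDEX;
	t->items[t->item_num] = (struct toy_item){ p->seq, p->len, next };
	return t->item_num++;
}

/* Return 1 if merged, 0 if stored, -1 if the table is full. */
static int
toy_reassemble(struct toy_tbl *t, const struct toy_pkt *p)
{
	uint32_t f, i, idx;

	/* Step 1: search for a flow with the same key. */
	for (f = 0; f < t->flow_num; f++)
		if (t->flows[f].key == p->key)
			break;

	if (f == t->flow_num) {
		/* No matching flow: insert a new flow and store the packet. */
		if (t->flow_num == MAX_FLOWS)
			return -1;
		idx = store_item(t, p, INVALID_INDEX);
		if (idx == INVALID_INDEX)
			return -1;
		t->flows[t->flow_num++] = (struct toy_flow){ p->key, idx };
		return 0;
	}

	/* Step 2: look for a neighbor among the items of the flow. */
	for (i = t->flows[f].start; i != INVALID_INDEX; i = t->items[i].next) {
		struct toy_item *it = &t->items[i];

		if (p->seq == it->seq + it->len) { /* packet follows the item */
			it->len += p->len;
			return 1;
		}
		if (p->seq + p->len == it->seq) { /* packet precedes the item */
			it->seq = p->seq;
			it->len += p->len;
			return 1;
		}
	}

	/* Step 3: no neighbor (e.g. reordering); keep the packet in the flow. */
	idx = store_item(t, p, t->flows[f].start);
	if (idx == INVALID_INDEX)
		return -1;
	t->flows[f].start = idx;
	return 0;
}
```

The return convention mirrors how the callers in rte_gro.c treat the result of gro_tcp4_reassemble(): a positive value means a merge, a negative value means the packet was not processed.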
diff --git a/doc/guides/prog_guide/img/gro-key-algorithm.svg b/doc/guides/prog_guide/img/gro-key-algorithm.svg
new file mode 100644
index 0000000..94e42f5
--- /dev/null
+++ b/doc/guides/prog_guide/img/gro-key-algorithm.svg
@@ -0,0 +1,223 @@ 
+<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.0//EN" "http://www.w3.org/TR/2001/REC-SVG-20010904/DTD/svg10.dtd">
+<!-- Generated by Microsoft Visio 11.0, SVG Export, v1.0 gro-key-algorithm.svg Page-1 -->
+<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:ev="http://www.w3.org/2001/xml-events"
+		xmlns:v="http://schemas.microsoft.com/visio/2003/SVGExtensions/" width="6.06163in" height="2.66319in"
+		viewBox="0 0 436.438 191.75" xml:space="preserve" color-interpolation-filters="sRGB" class="st10">
+	<v:documentProperties v:langID="1033" v:viewMarkup="false"/>
+
+	<style type="text/css">
+	<![CDATA[
+		.st1 {fill:url(#grad30-4);stroke:#404040;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.25}
+		.st2 {fill:#000000;font-family:Calibri;font-size:1.00001em}
+		.st3 {font-size:1em;font-weight:bold}
+		.st4 {fill:#000000;font-family:Calibri;font-size:1.00001em;font-weight:bold}
+		.st5 {font-size:1em;font-weight:normal}
+		.st6 {marker-end:url(#mrkr5-38);stroke:#404040;stroke-linecap:round;stroke-linejoin:round;stroke-width:1}
+		.st7 {fill:#404040;fill-opacity:1;stroke:#404040;stroke-opacity:1;stroke-width:0.28409090909091}
+		.st8 {fill:none;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.25}
+		.st9 {fill:#000000;font-family:Calibri;font-size:0.833336em}
+		.st10 {fill:none;fill-rule:evenodd;font-size:12px;overflow:visible;stroke-linecap:square;stroke-miterlimit:3}
+	]]>
+	</style>
+
+	<defs id="Patterns_And_Gradients">
+		<linearGradient id="grad30-4" v:fillPattern="30" v:foreground="#c6d09f" v:background="#d1dab4" x1="0" y1="1" x2="0" y2="0">
+			<stop offset="0" style="stop-color:#c6d09f;stop-opacity:1"/>
+			<stop offset="1" style="stop-color:#d1dab4;stop-opacity:1"/>
+		</linearGradient>
+		<linearGradient id="grad30-35" v:fillPattern="30" v:foreground="#f0f0f0" v:background="#ffffff" x1="0" y1="1" x2="0" y2="0">
+			<stop offset="0" style="stop-color:#f0f0f0;stop-opacity:1"/>
+			<stop offset="1" style="stop-color:#ffffff;stop-opacity:1"/>
+		</linearGradient>
+	</defs>
+	<defs id="Markers">
+		<g id="lend5">
+			<path d="M 2 1 L 0 0 L 1.98117 -0.993387 C 1.67173 -0.364515 1.67301 0.372641 1.98465 1.00043 " style="stroke:none"/>
+		</g>
+		<marker id="mrkr5-38" class="st7" v:arrowType="5" v:arrowSize="2" v:setback="6.16" refX="-6.16" orient="auto"
+				markerUnits="strokeWidth" overflow="visible">
+			<use xlink:href="#lend5" transform="scale(-3.52,-3.52) "/>
+		</marker>
+	</defs>
+	<g v:mID="0" v:index="1" v:groupContext="foregroundPage">
+		<title>Page-1</title>
+		<v:pageProperties v:drawingScale="1" v:pageScale="1" v:drawingUnits="0" v:shadowOffsetX="9" v:shadowOffsetY="-9"/>
+		<v:layer v:name="Connector" v:index="0"/>
+		<g id="shape1-1" v:mID="1" v:groupContext="shape" transform="translate(0.25,-117.25)">
+			<title>Rounded rectangle</title>
+			<desc>Categorize into an existed “flow”</desc>
+			<v:userDefs>
+				<v:ud v:nameU="visVersion" v:val="VT0(14):26"/>
+				<v:ud v:nameU="msvThemeColors" v:val="VT0(36):26"/>
+				<v:ud v:nameU="msvThemeEffects" v:val="VT0(16):26"/>
+			</v:userDefs>
+			<v:textBlock v:margins="rect(4,4,4,4)"/>
+			<v:textRect cx="90" cy="173.75" width="180" height="36"/>
+			<path d="M171 191.75 A9.00007 9.00007 -180 0 0 180 182.75 L180 164.75 A9.00007 9.00007 -180 0 0 171 155.75 L9 155.75
+						 A9.00007 9.00007 -180 0 0 -0 164.75 L0 182.75 A9.00007 9.00007 -180 0 0 9 191.75 L171 191.75 Z"
+					class="st1"/>
+			<text x="8.91" y="177.35" class="st2" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Categorize into an <tspan
+						class="st3">existed</tspan><tspan class="st3" v:langID="2052"> </tspan>“<tspan class="st3">flow</tspan>”</text>		</g>
+		<g id="shape2-9" v:mID="2" v:groupContext="shape" transform="translate(0.25,-58.75)">
+			<title>Rounded rectangle.2</title>
+			<desc>Search for a “neighbor”</desc>
+			<v:userDefs>
+				<v:ud v:nameU="visVersion" v:val="VT0(14):26"/>
+				<v:ud v:nameU="msvThemeColors" v:val="VT0(36):26"/>
+				<v:ud v:nameU="msvThemeEffects" v:val="VT0(16):26"/>
+			</v:userDefs>
+			<v:textBlock v:margins="rect(4,4,4,4)"/>
+			<v:textRect cx="90" cy="173.75" width="180" height="36"/>
+			<path d="M171 191.75 A9.00007 9.00007 -180 0 0 180 182.75 L180 164.75 A9.00007 9.00007 -180 0 0 171 155.75 L9 155.75
+						 A9.00007 9.00007 -180 0 0 -0 164.75 L0 182.75 A9.00007 9.00007 -180 0 0 9 191.75 L171 191.75 Z"
+					class="st1"/>
+			<text x="32.19" y="177.35" class="st2" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Search for a “<tspan
+						class="st3">neighbor</tspan>”</text>		</g>
+		<g id="shape3-14" v:mID="3" v:groupContext="shape" transform="translate(225.813,-117.25)">
+			<title>Rounded rectangle.3</title>
+			<desc>Insert a new “flow” and store the packet</desc>
+			<v:userDefs>
+				<v:ud v:nameU="visVersion" v:val="VT0(14):26"/>
+				<v:ud v:nameU="msvThemeColors" v:val="VT0(36):26"/>
+				<v:ud v:nameU="msvThemeEffects" v:val="VT0(16):26"/>
+			</v:userDefs>
+			<v:textBlock v:margins="rect(4,4,4,4)"/>
+			<v:textRect cx="105.188" cy="173.75" width="210.38" height="36"/>
+			<path d="M201.37 191.75 A9.00007 9.00007 -180 0 0 210.37 182.75 L210.37 164.75 A9.00007 9.00007 -180 0 0 201.37 155.75
+						 L9 155.75 A9.00007 9.00007 -180 0 0 -0 164.75 L0 182.75 A9.00007 9.00007 -180 0 0 9 191.75 L201.37 191.75
+						 Z" class="st1"/>
+			<text x="5.45" y="177.35" class="st2" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Insert a <tspan
+						class="st3">new </tspan>“<tspan class="st3">flow</tspan>” and <tspan class="st3">store </tspan>the packet</text>		</g>
+		<g id="shape4-21" v:mID="4" v:groupContext="shape" transform="translate(225.25,-58.75)">
+			<title>Rounded rectangle.4</title>
+			<desc>Store the packet</desc>
+			<v:userDefs>
+				<v:ud v:nameU="visVersion" v:val="VT0(14):26"/>
+				<v:ud v:nameU="msvThemeColors" v:val="VT0(36):26"/>
+				<v:ud v:nameU="msvThemeEffects" v:val="VT0(16):26"/>
+			</v:userDefs>
+			<v:textBlock v:margins="rect(4,4,4,4)"/>
+			<v:textRect cx="83.25" cy="173.75" width="166.5" height="36"/>
+			<path d="M157.5 191.75 A9.00007 9.00007 -180 0 0 166.5 182.75 L166.5 164.75 A9.00007 9.00007 -180 0 0 157.5 155.75 L9
+						 155.75 A9.00007 9.00007 -180 0 0 -0 164.75 L0 182.75 A9.00007 9.00007 -180 0 0 9 191.75 L157.5 191.75 Z"
+					class="st1"/>
+			<text x="42.81" y="177.35" class="st4" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Store <tspan
+						class="st5">the packet</tspan></text>		</g>
+		<g id="shape5-26" v:mID="5" v:groupContext="shape" transform="translate(0.25,-0.25)">
+			<title>Rounded rectangle.5</title>
+			<desc>Merge the packet</desc>
+			<v:userDefs>
+				<v:ud v:nameU="visVersion" v:val="VT0(14):26"/>
+				<v:ud v:nameU="msvThemeColors" v:val="VT0(36):26"/>
+				<v:ud v:nameU="msvThemeEffects" v:val="VT0(16):26"/>
+			</v:userDefs>
+			<v:textBlock v:margins="rect(4,4,4,4)"/>
+			<v:textRect cx="90" cy="173.75" width="180" height="36"/>
+			<path d="M171 191.75 A9.00007 9.00007 -180 0 0 180 182.75 L180 164.75 A9.00007 9.00007 -180 0 0 171 155.75 L9 155.75
+						 A9.00007 9.00007 -180 0 0 -0 164.75 L0 182.75 A9.00007 9.00007 -180 0 0 9 191.75 L171 191.75 Z"
+					class="st1"/>
+			<text x="46.59" y="177.35" class="st4" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Merge <tspan
+						class="st5">the packet</tspan></text>		</g>
+		<g id="shape6-31" v:mID="6" v:groupContext="shape" v:layerMember="0" transform="translate(81.25,-175.75)">
+			<title>Dynamic connector</title>
+			<v:userDefs>
+				<v:ud v:nameU="visVersion" v:val="VT0(14):26"/>
+				<v:ud v:nameU="msvThemeColors" v:val="VT0(36):26"/>
+				<v:ud v:nameU="msvThemeEffects" v:val="VT0(16):26"/>
+			</v:userDefs>
+			<path d="M9 191.75 L9 208.09" class="st6"/>
+		</g>
+		<g id="shape7-39" v:mID="7" v:groupContext="shape" v:layerMember="0" transform="translate(81.25,-117.25)">
+			<title>Dynamic connector.7</title>
+			<v:userDefs>
+				<v:ud v:nameU="visVersion" v:val="VT0(14):26"/>
+				<v:ud v:nameU="msvThemeColors" v:val="VT0(36):26"/>
+				<v:ud v:nameU="msvThemeEffects" v:val="VT0(16):26"/>
+			</v:userDefs>
+			<path d="M9 191.75 L9 208.09" class="st6"/>
+		</g>
+		<g id="shape8-45" v:mID="8" v:groupContext="shape" v:layerMember="0" transform="translate(81.25,-58.75)">
+			<title>Dynamic connector.8</title>
+			<v:userDefs>
+				<v:ud v:nameU="visVersion" v:val="VT0(14):26"/>
+				<v:ud v:nameU="msvThemeColors" v:val="VT0(36):26"/>
+				<v:ud v:nameU="msvThemeEffects" v:val="VT0(16):26"/>
+			</v:userDefs>
+			<path d="M9 191.75 L9 208.09" class="st6"/>
+		</g>
+		<g id="shape9-51" v:mID="9" v:groupContext="shape" v:layerMember="0" transform="translate(180.25,-126.25)">
+			<title>Dynamic connector.9</title>
+			<v:userDefs>
+				<v:ud v:nameU="visVersion" v:val="VT0(14):26"/>
+				<v:ud v:nameU="msvThemeColors" v:val="VT0(36):26"/>
+				<v:ud v:nameU="msvThemeEffects" v:val="VT0(16):26"/>
+			</v:userDefs>
+			<path d="M0 182.75 L39.4 182.75" class="st6"/>
+		</g>
+		<g id="shape10-57" v:mID="10" v:groupContext="shape" v:layerMember="0" transform="translate(180.25,-67.75)">
+			<title>Dynamic connector.10</title>
+			<v:userDefs>
+				<v:ud v:nameU="visVersion" v:val="VT0(14):26"/>
+				<v:ud v:nameU="msvThemeColors" v:val="VT0(36):26"/>
+				<v:ud v:nameU="msvThemeEffects" v:val="VT0(16):26"/>
+			</v:userDefs>
+			<path d="M0 182.75 L38.84 182.75" class="st6"/>
+		</g>
+		<g id="shape11-63" v:mID="11" v:groupContext="shape" transform="translate(65.5,-173.5)">
+			<title>Sheet.11</title>
+			<desc>packet</desc>
+			<v:userDefs>
+				<v:ud v:nameU="msvThemeColors" v:val="VT0(36):26"/>
+				<v:ud v:nameU="msvThemeEffects" v:val="VT0(16):26"/>
+			</v:userDefs>
+			<v:textBlock v:margins="rect(4,4,4,4)"/>
+			<v:textRect cx="24.75" cy="182.75" width="49.5" height="18"/>
+			<rect x="0" y="173.75" width="49.5" height="18" class="st8"/>
+			<text x="8.46" y="186.35" class="st2" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>packet</text>		</g>
+		<g id="shape14-66" v:mID="14" v:groupContext="shape" transform="translate(98.125,-98.125)">
+			<title>Sheet.14</title>
+			<desc>find a “flow”</desc>
+			<v:userDefs>
+				<v:ud v:nameU="msvThemeColors" v:val="VT0(36):26"/>
+				<v:ud v:nameU="msvThemeEffects" v:val="VT0(16):26"/>
+			</v:userDefs>
+			<v:textBlock v:margins="rect(4,4,4,4)"/>
+			<v:textRect cx="32.0625" cy="183.875" width="64.13" height="15.75"/>
+			<rect x="0" y="176" width="64.125" height="15.75" class="st8"/>
+			<text x="6.41" y="186.88" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>find a “flow”</text>		</g>
+		<g id="shape15-69" v:mID="15" v:groupContext="shape" transform="translate(99.25,-39.625)">
+			<title>Sheet.15</title>
+			<desc>find a “neighbor”</desc>
+			<v:userDefs>
+				<v:ud v:nameU="msvThemeColors" v:val="VT0(36):26"/>
+				<v:ud v:nameU="msvThemeEffects" v:val="VT0(16):26"/>
+			</v:userDefs>
+			<v:textBlock v:margins="rect(4,4,4,4)"/>
+			<v:textRect cx="40.5" cy="183.875" width="81" height="15.75"/>
+			<rect x="0" y="176" width="81" height="15.75" class="st8"/>
+			<text x="5.48" y="186.88" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>find a “neighbor”</text>		</g>
+		<g id="shape13-72" v:mID="13" v:groupContext="shape" transform="translate(181.375,-79)">
+			<title>Sheet.13</title>
+			<desc>not find</desc>
+			<v:userDefs>
+				<v:ud v:nameU="msvThemeColors" v:val="VT0(36):26"/>
+				<v:ud v:nameU="msvThemeEffects" v:val="VT0(16):26"/>
+			</v:userDefs>
+			<v:textBlock v:margins="rect(4,4,4,4)"/>
+			<v:textRect cx="21.375" cy="183.875" width="42.75" height="15.75"/>
+			<rect x="0" y="176" width="42.75" height="15.75" class="st8"/>
+			<text x="5.38" y="186.88" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>not find</text>		</g>
+		<g id="shape12-75" v:mID="12" v:groupContext="shape" transform="translate(181.375,-137.5)">
+			<title>Sheet.12</title>
+			<desc>not find</desc>
+			<v:userDefs>
+				<v:ud v:nameU="msvThemeColors" v:val="VT0(36):26"/>
+				<v:ud v:nameU="msvThemeEffects" v:val="VT0(16):26"/>
+			</v:userDefs>
+			<v:textBlock v:margins="rect(4,4,4,4)"/>
+			<v:textRect cx="21.375" cy="183.875" width="42.75" height="15.75"/>
+			<rect x="0" y="176" width="42.75" height="15.75" class="st8"/>
+			<text x="5.38" y="186.88" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>not find</text>		</g>
+	</g>
+</svg>
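
The flowchart added above (categorize the packet, find a "flow", find a "neighbor", then merge or store) can be sketched in a few lines of C. This is only an illustrative toy, not the library code: the toy_* names, the integer flow key and the fixed array sizes are all assumptions made for the example, and it tracks sequence ranges rather than mbufs.

```c
#include <stdint.h>

#define TOY_MAX_FLOWS 4
#define TOY_MAX_ITEMS 8
#define TOY_NONE UINT32_MAX

/* Hypothetical, simplified stand-ins for the flow and item arrays. */
struct toy_item { uint32_t seq, len, next; int used; };
struct toy_flow { uint32_t key, start; };
struct toy_tbl {
	struct toy_flow flows[TOY_MAX_FLOWS];
	struct toy_item items[TOY_MAX_ITEMS];
};

static void toy_init(struct toy_tbl *t)
{
	uint32_t i;

	for (i = 0; i < TOY_MAX_FLOWS; i++)
		t->flows[i].start = TOY_NONE;	/* empty flow */
	for (i = 0; i < TOY_MAX_ITEMS; i++)
		t->items[i].used = 0;		/* empty item */
}

/* Store a packet in the first free item slot. */
static uint32_t toy_store(struct toy_tbl *t, uint32_t seq, uint32_t len)
{
	uint32_t i;

	for (i = 0; i < TOY_MAX_ITEMS; i++)
		if (!t->items[i].used) {
			t->items[i] = (struct toy_item){ seq, len, TOY_NONE, 1 };
			return i;
		}
	return TOY_NONE;
}

/* Returns 1 if merged, 0 if stored, -1 if there is no space left. */
static int toy_reassemble(struct toy_tbl *t, uint32_t key,
		uint32_t seq, uint32_t len)
{
	uint32_t f, free_f = TOY_NONE, cur, prev = TOY_NONE, idx;

	/* Step 1: find a flow with the same key. */
	for (f = 0; f < TOY_MAX_FLOWS; f++) {
		if (t->flows[f].start == TOY_NONE) {
			if (free_f == TOY_NONE)
				free_f = f;
		} else if (t->flows[f].key == key)
			break;
	}
	if (f == TOY_MAX_FLOWS) {
		/* Flow not found: store the packet in a new flow. */
		if (free_f == TOY_NONE)
			return -1;
		idx = toy_store(t, seq, len);
		if (idx == TOY_NONE)
			return -1;
		t->flows[free_f].key = key;
		t->flows[free_f].start = idx;
		return 0;
	}

	/* Step 2: find a neighbor inside the flow. */
	for (cur = t->flows[f].start; cur != TOY_NONE;
			cur = t->items[cur].next) {
		struct toy_item *it = &t->items[cur];

		if (seq == it->seq + it->len) {		/* append */
			it->len += len;
			return 1;
		}
		if (seq + len == it->seq) {		/* pre-pend */
			it->seq = seq;
			it->len += len;
			return 1;
		}
		prev = cur;
	}

	/* Neighbor not found: store the packet at the end of the flow. */
	idx = toy_store(t, seq, len);
	if (idx == TOY_NONE)
		return -1;
	t->items[prev].next = idx;
	return 0;
}
```

The return values follow the convention documented for gro_tcp4_reassemble() in this patch: positive for merged, zero for stored, negative when the packet cannot be handled.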
diff --git a/lib/librte_gro/gro_tcp4.c b/lib/librte_gro/gro_tcp4.c
index 03e5ccf..27af23e 100644
--- a/lib/librte_gro/gro_tcp4.c
+++ b/lib/librte_gro/gro_tcp4.c
@@ -6,8 +6,6 @@ 
 #include <rte_mbuf.h>
 #include <rte_cycles.h>
 #include <rte_ethdev.h>
-#include <rte_ip.h>
-#include <rte_tcp.h>
 
 #include "gro_tcp4.h"
 
@@ -44,20 +42,20 @@  gro_tcp4_tbl_create(uint16_t socket_id,
 	}
 	tbl->max_item_num = entries_num;
 
-	size = sizeof(struct gro_tcp4_key) * entries_num;
-	tbl->keys = rte_zmalloc_socket(__func__,
+	size = sizeof(struct gro_tcp4_flow) * entries_num;
+	tbl->flows = rte_zmalloc_socket(__func__,
 			size,
 			RTE_CACHE_LINE_SIZE,
 			socket_id);
-	if (tbl->keys == NULL) {
+	if (tbl->flows == NULL) {
 		rte_free(tbl->items);
 		rte_free(tbl);
 		return NULL;
 	}
-	/* INVALID_ARRAY_INDEX indicates empty key */
+	/* INVALID_ARRAY_INDEX indicates an empty flow */
 	for (i = 0; i < entries_num; i++)
-		tbl->keys[i].start_index = INVALID_ARRAY_INDEX;
-	tbl->max_key_num = entries_num;
+		tbl->flows[i].start_index = INVALID_ARRAY_INDEX;
+	tbl->max_flow_num = entries_num;
 
 	return tbl;
 }
@@ -69,116 +67,15 @@  gro_tcp4_tbl_destroy(void *tbl)
 
 	if (tcp_tbl) {
 		rte_free(tcp_tbl->items);
-		rte_free(tcp_tbl->keys);
+		rte_free(tcp_tbl->flows);
 	}
 	rte_free(tcp_tbl);
 }
 
-/*
- * merge two TCP/IPv4 packets without updating checksums.
- * If cmp is larger than 0, append the new packet to the
- * original packet. Otherwise, pre-pend the new packet to
- * the original packet.
- */
-static inline int
-merge_two_tcp4_packets(struct gro_tcp4_item *item_src,
-		struct rte_mbuf *pkt,
-		uint16_t ip_id,
-		uint32_t sent_seq,
-		int cmp)
-{
-	struct rte_mbuf *pkt_head, *pkt_tail, *lastseg;
-	uint16_t tcp_datalen;
-
-	if (cmp > 0) {
-		pkt_head = item_src->firstseg;
-		pkt_tail = pkt;
-	} else {
-		pkt_head = pkt;
-		pkt_tail = item_src->firstseg;
-	}
-
-	/* check if the packet length will be beyond the max value */
-	tcp_datalen = pkt_tail->pkt_len - pkt_tail->l2_len -
-		pkt_tail->l3_len - pkt_tail->l4_len;
-	if (pkt_head->pkt_len - pkt_head->l2_len + tcp_datalen >
-			TCP4_MAX_L3_LENGTH)
-		return 0;
-
-	/* remove packet header for the tail packet */
-	rte_pktmbuf_adj(pkt_tail,
-			pkt_tail->l2_len +
-			pkt_tail->l3_len +
-			pkt_tail->l4_len);
-
-	/* chain two packets together */
-	if (cmp > 0) {
-		item_src->lastseg->next = pkt;
-		item_src->lastseg = rte_pktmbuf_lastseg(pkt);
-		/* update IP ID to the larger value */
-		item_src->ip_id = ip_id;
-	} else {
-		lastseg = rte_pktmbuf_lastseg(pkt);
-		lastseg->next = item_src->firstseg;
-		item_src->firstseg = pkt;
-		/* update sent_seq to the smaller value */
-		item_src->sent_seq = sent_seq;
-	}
-	item_src->nb_merged++;
-
-	/* update mbuf metadata for the merged packet */
-	pkt_head->nb_segs += pkt_tail->nb_segs;
-	pkt_head->pkt_len += pkt_tail->pkt_len;
-
-	return 1;
-}
-
-static inline int
-check_seq_option(struct gro_tcp4_item *item,
-		struct tcp_hdr *tcp_hdr,
-		uint16_t tcp_hl,
-		uint16_t tcp_dl,
-		uint16_t ip_id,
-		uint32_t sent_seq)
-{
-	struct rte_mbuf *pkt0 = item->firstseg;
-	struct ipv4_hdr *ipv4_hdr0;
-	struct tcp_hdr *tcp_hdr0;
-	uint16_t tcp_hl0, tcp_dl0;
-	uint16_t len;
-
-	ipv4_hdr0 = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt0, char *) +
-			pkt0->l2_len);
-	tcp_hdr0 = (struct tcp_hdr *)((char *)ipv4_hdr0 + pkt0->l3_len);
-	tcp_hl0 = pkt0->l4_len;
-
-	/* check if TCP option fields equal. If not, return 0. */
-	len = RTE_MAX(tcp_hl, tcp_hl0) - sizeof(struct tcp_hdr);
-	if ((tcp_hl != tcp_hl0) ||
-			((len > 0) && (memcmp(tcp_hdr + 1,
-					tcp_hdr0 + 1,
-					len) != 0)))
-		return 0;
-
-	/* check if the two packets are neighbors */
-	tcp_dl0 = pkt0->pkt_len - pkt0->l2_len - pkt0->l3_len - tcp_hl0;
-	if ((sent_seq == (item->sent_seq + tcp_dl0)) &&
-			(ip_id == (item->ip_id + 1)))
-		/* append the new packet */
-		return 1;
-	else if (((sent_seq + tcp_dl) == item->sent_seq) &&
-			((ip_id + item->nb_merged) == item->ip_id))
-		/* pre-pend the new packet */
-		return -1;
-	else
-		return 0;
-}
-
 static inline uint32_t
 find_an_empty_item(struct gro_tcp4_tbl *tbl)
 {
-	uint32_t i;
-	uint32_t max_item_num = tbl->max_item_num;
+	uint32_t max_item_num = tbl->max_item_num, i;
 
 	for (i = 0; i < max_item_num; i++)
 		if (tbl->items[i].firstseg == NULL)
@@ -187,13 +84,12 @@  find_an_empty_item(struct gro_tcp4_tbl *tbl)
 }
 
 static inline uint32_t
-find_an_empty_key(struct gro_tcp4_tbl *tbl)
+find_an_empty_flow(struct gro_tcp4_tbl *tbl)
 {
-	uint32_t i;
-	uint32_t max_key_num = tbl->max_key_num;
+	uint32_t max_flow_num = tbl->max_flow_num, i;
 
-	for (i = 0; i < max_key_num; i++)
-		if (tbl->keys[i].start_index == INVALID_ARRAY_INDEX)
+	for (i = 0; i < max_flow_num; i++)
+		if (tbl->flows[i].start_index == INVALID_ARRAY_INDEX)
 			return i;
 	return INVALID_ARRAY_INDEX;
 }
@@ -201,10 +97,11 @@  find_an_empty_key(struct gro_tcp4_tbl *tbl)
 static inline uint32_t
 insert_new_item(struct gro_tcp4_tbl *tbl,
 		struct rte_mbuf *pkt,
-		uint16_t ip_id,
-		uint32_t sent_seq,
+		uint64_t start_time,
 		uint32_t prev_idx,
-		uint64_t start_time)
+		uint32_t sent_seq,
+		uint16_t ip_id,
+		uint8_t is_atomic)
 {
 	uint32_t item_idx;
 
@@ -219,9 +116,10 @@  insert_new_item(struct gro_tcp4_tbl *tbl,
 	tbl->items[item_idx].sent_seq = sent_seq;
 	tbl->items[item_idx].ip_id = ip_id;
 	tbl->items[item_idx].nb_merged = 1;
+	tbl->items[item_idx].is_atomic = is_atomic;
 	tbl->item_num++;
 
-	/* if the previous packet exists, chain the new one with it */
+	/* If the previous packet exists, chain them together. */
 	if (prev_idx != INVALID_ARRAY_INDEX) {
 		tbl->items[item_idx].next_pkt_idx =
 			tbl->items[prev_idx].next_pkt_idx;
@@ -232,12 +130,13 @@  insert_new_item(struct gro_tcp4_tbl *tbl,
 }
 
 static inline uint32_t
-delete_item(struct gro_tcp4_tbl *tbl, uint32_t item_idx,
+delete_item(struct gro_tcp4_tbl *tbl,
+		uint32_t item_idx,
 		uint32_t prev_item_idx)
 {
 	uint32_t next_idx = tbl->items[item_idx].next_pkt_idx;
 
-	/* set NULL to firstseg to indicate it's an empty item */
+	/* NULL indicates an empty item. */
 	tbl->items[item_idx].firstseg = NULL;
 	tbl->item_num--;
 	if (prev_item_idx != INVALID_ARRAY_INDEX)
@@ -247,53 +146,33 @@  delete_item(struct gro_tcp4_tbl *tbl, uint32_t item_idx,
 }
 
 static inline uint32_t
-insert_new_key(struct gro_tcp4_tbl *tbl,
-		struct tcp4_key *key_src,
+insert_new_flow(struct gro_tcp4_tbl *tbl,
+		struct tcp4_flow_key *src,
 		uint32_t item_idx)
 {
-	struct tcp4_key *key_dst;
-	uint32_t key_idx;
+	struct tcp4_flow_key *dst;
+	uint32_t flow_idx;
 
-	key_idx = find_an_empty_key(tbl);
-	if (key_idx == INVALID_ARRAY_INDEX)
+	flow_idx = find_an_empty_flow(tbl);
+	if (unlikely(flow_idx == INVALID_ARRAY_INDEX))
 		return INVALID_ARRAY_INDEX;
 
-	key_dst = &(tbl->keys[key_idx].key);
+	dst = &(tbl->flows[flow_idx].key);
 
-	ether_addr_copy(&(key_src->eth_saddr), &(key_dst->eth_saddr));
-	ether_addr_copy(&(key_src->eth_daddr), &(key_dst->eth_daddr));
-	key_dst->ip_src_addr = key_src->ip_src_addr;
-	key_dst->ip_dst_addr = key_src->ip_dst_addr;
-	key_dst->recv_ack = key_src->recv_ack;
-	key_dst->src_port = key_src->src_port;
-	key_dst->dst_port = key_src->dst_port;
+	ether_addr_copy(&(src->eth_saddr), &(dst->eth_saddr));
+	ether_addr_copy(&(src->eth_daddr), &(dst->eth_daddr));
+	dst->ip_src_addr = src->ip_src_addr;
+	dst->ip_dst_addr = src->ip_dst_addr;
+	dst->recv_ack = src->recv_ack;
+	dst->src_port = src->src_port;
+	dst->dst_port = src->dst_port;
 
-	/* non-INVALID_ARRAY_INDEX value indicates this key is valid */
-	tbl->keys[key_idx].start_index = item_idx;
-	tbl->key_num++;
+	tbl->flows[flow_idx].start_index = item_idx;
+	tbl->flow_num++;
 
-	return key_idx;
+	return flow_idx;
 }
 
-static inline int
-is_same_key(struct tcp4_key k1, struct tcp4_key k2)
-{
-	if (is_same_ether_addr(&k1.eth_saddr, &k2.eth_saddr) == 0)
-		return 0;
-
-	if (is_same_ether_addr(&k1.eth_daddr, &k2.eth_daddr) == 0)
-		return 0;
-
-	return ((k1.ip_src_addr == k2.ip_src_addr) &&
-			(k1.ip_dst_addr == k2.ip_dst_addr) &&
-			(k1.recv_ack == k2.recv_ack) &&
-			(k1.src_port == k2.src_port) &&
-			(k1.dst_port == k2.dst_port));
-}
-
-/*
- * update packet length for the flushed packet.
- */
 static inline void
 update_header(struct gro_tcp4_item *item)
 {
@@ -315,84 +194,106 @@  gro_tcp4_reassemble(struct rte_mbuf *pkt,
 	struct ipv4_hdr *ipv4_hdr;
 	struct tcp_hdr *tcp_hdr;
 	uint32_t sent_seq;
-	uint16_t tcp_dl, ip_id;
+	uint16_t tcp_dl, ip_id, frag_off, hdr_len;
+	uint8_t is_atomic;
 
-	struct tcp4_key key;
+	struct tcp4_flow_key key;
 	uint32_t cur_idx, prev_idx, item_idx;
-	uint32_t i, max_key_num;
+	uint32_t i, max_flow_num, left_flow_num;
 	int cmp;
+	uint8_t find;
 
 	eth_hdr = rte_pktmbuf_mtod(pkt, struct ether_hdr *);
 	ipv4_hdr = (struct ipv4_hdr *)((char *)eth_hdr + pkt->l2_len);
 	tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
+	hdr_len = pkt->l2_len + pkt->l3_len + pkt->l4_len;
 
 	/*
-	 * if FIN, SYN, RST, PSH, URG, ECE or
-	 * CWR is set, return immediately.
+	 * Don't process the packet which has FIN, SYN, RST, PSH, URG, ECE
+	 * or CWR set.
 	 */
 	if (tcp_hdr->tcp_flags != TCP_ACK_FLAG)
 		return -1;
-	/* if payload length is 0, return immediately */
-	tcp_dl = rte_be_to_cpu_16(ipv4_hdr->total_length) - pkt->l3_len -
-		pkt->l4_len;
-	if (tcp_dl == 0)
+	/*
+	 * Don't process the packet whose payload length is less than or
+	 * equal to 0.
+	 */
+	tcp_dl = pkt->pkt_len - hdr_len;
+	if (pkt->pkt_len <= hdr_len)
 		return -1;
 
-	ip_id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
+	/*
+	 * Save IPv4 ID for the packet whose DF bit is 0. For the packet
+	 * whose DF bit is 1, IPv4 ID is ignored.
+	 */
+	frag_off = rte_be_to_cpu_16(ipv4_hdr->fragment_offset);
+	is_atomic = (frag_off & IPV4_HDR_DF_FLAG) == IPV4_HDR_DF_FLAG;
+	ip_id = is_atomic ? 0 : rte_be_to_cpu_16(ipv4_hdr->packet_id);
 	sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
 
 	ether_addr_copy(&(eth_hdr->s_addr), &(key.eth_saddr));
 	ether_addr_copy(&(eth_hdr->d_addr), &(key.eth_daddr));
 	key.ip_src_addr = ipv4_hdr->src_addr;
 	key.ip_dst_addr = ipv4_hdr->dst_addr;
+	key.recv_ack = tcp_hdr->recv_ack;
 	key.src_port = tcp_hdr->src_port;
 	key.dst_port = tcp_hdr->dst_port;
-	key.recv_ack = tcp_hdr->recv_ack;
 
-	/* search for a key */
-	max_key_num = tbl->max_key_num;
-	for (i = 0; i < max_key_num; i++) {
-		if ((tbl->keys[i].start_index != INVALID_ARRAY_INDEX) &&
-				is_same_key(tbl->keys[i].key, key))
-			break;
+	/* Search for a matched flow. */
+	max_flow_num = tbl->max_flow_num;
+	left_flow_num = tbl->flow_num;
+	find = 0;
+	for (i = 0; i < max_flow_num && left_flow_num; i++) {
+		if (tbl->flows[i].start_index != INVALID_ARRAY_INDEX) {
+			if (is_same_tcp4_flow(tbl->flows[i].key, key)) {
+				find = 1;
+				break;
+			}
+			left_flow_num--;
+		}
 	}
 
-	/* can't find a key, so insert a new key and a new item. */
-	if (i == tbl->max_key_num) {
-		item_idx = insert_new_item(tbl, pkt, ip_id, sent_seq,
-				INVALID_ARRAY_INDEX, start_time);
+	/*
+	 * No matching flow was found. Insert a new flow and store the
+	 * packet in it.
+	 */
+	if (find == 0) {
+		item_idx = insert_new_item(tbl, pkt, start_time,
+				INVALID_ARRAY_INDEX, sent_seq, ip_id,
+				is_atomic);
 		if (item_idx == INVALID_ARRAY_INDEX)
 			return -1;
-		if (insert_new_key(tbl, &key, item_idx) ==
+		if (insert_new_flow(tbl, &key, item_idx) ==
 				INVALID_ARRAY_INDEX) {
-			/*
-			 * fail to insert a new key, so
-			 * delete the inserted item
-			 */
+			/* No room for a new flow; remove the item. */
 			delete_item(tbl, item_idx, INVALID_ARRAY_INDEX);
 			return -1;
 		}
 		return 0;
 	}
 
-	/* traverse all packets in the item group to find one to merge */
-	cur_idx = tbl->keys[i].start_index;
+	/*
+	 * Check all packets in the flow and try to find a neighbor for
+	 * the input packet.
+	 */
+	cur_idx = tbl->flows[i].start_index;
 	prev_idx = cur_idx;
 	do {
 		cmp = check_seq_option(&(tbl->items[cur_idx]), tcp_hdr,
-				pkt->l4_len, tcp_dl, ip_id, sent_seq);
+				sent_seq, ip_id, pkt->l4_len, tcp_dl, 0,
+				is_atomic);
 		if (cmp) {
 			if (merge_two_tcp4_packets(&(tbl->items[cur_idx]),
-						pkt, ip_id,
-						sent_seq, cmp))
+						pkt, cmp, sent_seq, ip_id, 0))
 				return 1;
 			/*
-			 * fail to merge two packets since the packet
-			 * length will be greater than the max value.
-			 * So insert the packet into the item group.
+			 * Failed to merge the two packets, as the merged
+			 * length would exceed the max value. Store the
+			 * packet in the flow instead.
 			 */
-			if (insert_new_item(tbl, pkt, ip_id, sent_seq,
-						prev_idx, start_time) ==
+			if (insert_new_item(tbl, pkt, start_time, prev_idx,
+						sent_seq, ip_id,
+						is_atomic) ==
 					INVALID_ARRAY_INDEX)
 				return -1;
 			return 0;
@@ -401,12 +302,9 @@  gro_tcp4_reassemble(struct rte_mbuf *pkt,
 		cur_idx = tbl->items[cur_idx].next_pkt_idx;
 	} while (cur_idx != INVALID_ARRAY_INDEX);
 
-	/*
-	 * can't find a packet in the item group to merge,
-	 * so insert the packet into the item group.
-	 */
-	if (insert_new_item(tbl, pkt, ip_id, sent_seq, prev_idx,
-				start_time) == INVALID_ARRAY_INDEX)
+	/* No neighbor was found, so store the packet in the flow. */
+	if (insert_new_item(tbl, pkt, start_time, prev_idx, sent_seq,
+				ip_id, is_atomic) == INVALID_ARRAY_INDEX)
 		return -1;
 
 	return 0;
@@ -418,46 +316,35 @@  gro_tcp4_tbl_timeout_flush(struct gro_tcp4_tbl *tbl,
 		struct rte_mbuf **out,
 		uint16_t nb_out)
 {
-	uint16_t k = 0;
+	uint32_t max_flow_num = tbl->max_flow_num;
 	uint32_t i, j;
-	uint32_t max_key_num = tbl->max_key_num;
+	uint16_t k = 0;
 
-	for (i = 0; i < max_key_num; i++) {
-		/* all keys have been checked, return immediately */
-		if (tbl->key_num == 0)
+	for (i = 0; i < max_flow_num; i++) {
+		if (unlikely(tbl->flow_num == 0))
 			return k;
 
-		j = tbl->keys[i].start_index;
+		j = tbl->flows[i].start_index;
 		while (j != INVALID_ARRAY_INDEX) {
 			if (tbl->items[j].start_time <= flush_timestamp) {
 				out[k++] = tbl->items[j].firstseg;
 				if (tbl->items[j].nb_merged > 1)
 					update_header(&(tbl->items[j]));
 				/*
-				 * delete the item and get
-				 * the next packet index
+				 * Delete the packet and get the next
+				 * packet in the flow.
 				 */
-				j = delete_item(tbl, j,
-						INVALID_ARRAY_INDEX);
+				j = delete_item(tbl, j, INVALID_ARRAY_INDEX);
+				tbl->flows[i].start_index = j;
+				if (j == INVALID_ARRAY_INDEX)
+					tbl->flow_num--;
 
-				/*
-				 * delete the key as all of
-				 * packets are flushed
-				 */
-				if (j == INVALID_ARRAY_INDEX) {
-					tbl->keys[i].start_index =
-						INVALID_ARRAY_INDEX;
-					tbl->key_num--;
-				} else
-					/* update start_index of the key */
-					tbl->keys[i].start_index = j;
-
-				if (k == nb_out)
+				if (unlikely(k == nb_out))
 					return k;
 			} else
 				/*
-				 * left packets of this key won't be
-				 * timeout, so go to check other keys.
+				 * The remaining packets in this flow haven't
+				 * timed out yet. Go on to check other flows.
 				 */
 				break;
 		}
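
Two per-packet decisions in gro_tcp4_reassemble() above are easy to get wrong, so here is a minimal standalone sketch of both. The toy_* names and the TOY_DF_FLAG constant are assumptions for illustration (the value mirrors the DF bit position in the host-order fragment field); the length helper compares before subtracting, which is one way to keep a packet shorter than its headers from wrapping an unsigned result.

```c
#include <stdint.h>

/* DF bit of the host-order flags/fragment-offset field (assumed name). */
#define TOY_DF_FLAG 0x4000

struct toy_ids {
	uint8_t is_atomic;	/* 1 when the IPv4 ID can be ignored */
	uint16_t ip_id;		/* 0 for DF packets, else the header ID */
};

/*
 * Derive is_atomic and ip_id as the patch does: only the DF bit is
 * tested, and the ID of a DF packet is stored as 0.
 */
static struct toy_ids toy_ip_id(uint16_t frag_off_host,
		uint16_t packet_id_host)
{
	struct toy_ids r;

	r.is_atomic = (frag_off_host & TOY_DF_FLAG) == TOY_DF_FLAG;
	r.ip_id = r.is_atomic ? 0 : packet_id_host;
	return r;
}

/*
 * Return the payload length, or -1 when the packet carries no payload
 * or is shorter than its headers. The comparison is done first, so the
 * unsigned subtraction below can never wrap.
 */
static int toy_payload_len(uint32_t pkt_len, uint16_t hdr_len)
{
	if (pkt_len <= hdr_len)
		return -1;
	return (int)(pkt_len - hdr_len);
}
```

For example, a 1514-byte frame with 54 bytes of Ethernet, IPv4 and TCP headers yields a 1460-byte payload, while a bare 54-byte ACK is rejected.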
diff --git a/lib/librte_gro/gro_tcp4.h b/lib/librte_gro/gro_tcp4.h
index d129523..c2b66a8 100644
--- a/lib/librte_gro/gro_tcp4.h
+++ b/lib/librte_gro/gro_tcp4.h
@@ -5,17 +5,20 @@ 
 #ifndef _GRO_TCP4_H_
 #define _GRO_TCP4_H_
 
+#include <rte_ip.h>
+#include <rte_tcp.h>
+
 #define INVALID_ARRAY_INDEX 0xffffffffUL
 #define GRO_TCP4_TBL_MAX_ITEM_NUM (1024UL * 1024UL)
 
 /*
- * the max L3 length of a TCP/IPv4 packet. The L3 length
- * is the sum of ipv4 header, tcp header and L4 payload.
+ * The max length of an IPv4 packet, which includes the length of the L3
+ * header, the L4 header and the data payload.
  */
-#define TCP4_MAX_L3_LENGTH UINT16_MAX
+#define MAX_IPV4_PKT_LENGTH UINT16_MAX
 
-/* criteria of mergeing packets */
-struct tcp4_key {
+/* Header fields representing a TCP/IPv4 flow */
+struct tcp4_flow_key {
 	struct ether_addr eth_saddr;
 	struct ether_addr eth_daddr;
 	uint32_t ip_src_addr;
@@ -26,77 +29,76 @@  struct tcp4_key {
 	uint16_t dst_port;
 };
 
-struct gro_tcp4_key {
-	struct tcp4_key key;
+struct gro_tcp4_flow {
+	struct tcp4_flow_key key;
 	/*
-	 * the index of the first packet in the item group.
-	 * If the value is INVALID_ARRAY_INDEX, it means
-	 * the key is empty.
+	 * The index of the first packet in the flow.
+	 * INVALID_ARRAY_INDEX indicates an empty flow.
 	 */
 	uint32_t start_index;
 };
 
 struct gro_tcp4_item {
 	/*
-	 * first segment of the packet. If the value
+	 * The first MBUF segment of the packet. If the value
 	 * is NULL, it means the item is empty.
 	 */
 	struct rte_mbuf *firstseg;
-	/* last segment of the packet */
+	/* The last MBUF segment of the packet */
 	struct rte_mbuf *lastseg;
 	/*
-	 * the time when the first packet is inserted
-	 * into the table. If a packet in the table is
-	 * merged with an incoming packet, this value
-	 * won't be updated. We set this value only
-	 * when the first packet is inserted into the
-	 * table.
+	 * The time when the first packet is inserted into the table.
+	 * This value won't be updated, even if the packet is merged
+	 * with other packets.
 	 */
 	uint64_t start_time;
 	/*
-	 * we use next_pkt_idx to chain the packets that
-	 * have same key value but can't be merged together.
+	 * next_pkt_idx is used to chain the packets that
+	 * are in the same flow but can't be merged together
+	 * (e.g. caused by packet reordering).
 	 */
 	uint32_t next_pkt_idx;
-	/* the sequence number of the packet */
+	/* TCP sequence number of the packet */
 	uint32_t sent_seq;
-	/* the IP ID of the packet */
+	/* IPv4 ID of the packet */
 	uint16_t ip_id;
-	/* the number of merged packets */
+	/* The number of merged packets */
 	uint16_t nb_merged;
+	/* Indicate if IPv4 ID can be ignored */
+	uint8_t is_atomic;
 };
 
 /*
- * TCP/IPv4 reassembly table structure.
+ * TCP/IPv4 reassembly table structure
  */
 struct gro_tcp4_tbl {
 	/* item array */
 	struct gro_tcp4_item *items;
-	/* key array */
-	struct gro_tcp4_key *keys;
+	/* flow array */
+	struct gro_tcp4_flow *flows;
 	/* current item number */
 	uint32_t item_num;
-	/* current key num */
-	uint32_t key_num;
+	/* current flow num */
+	uint32_t flow_num;
 	/* item array size */
 	uint32_t max_item_num;
-	/* key array size */
-	uint32_t max_key_num;
+	/* flow array size */
+	uint32_t max_flow_num;
 };
 
 /**
  * This function creates a TCP/IPv4 reassembly table.
  *
  * @param socket_id
- *  socket index for allocating TCP/IPv4 reassemble table
+ *  Socket index for allocating the TCP/IPv4 reassemble table
  * @param max_flow_num
- *  the maximum number of flows in the TCP/IPv4 GRO table
+ *  The maximum number of flows in the TCP/IPv4 GRO table
  * @param max_item_per_flow
- *  the maximum packet number per flow.
+ *  The maximum number of packets per flow
  *
  * @return
- *  if create successfully, return a pointer which points to the
- *  created TCP/IPv4 GRO table. Otherwise, return NULL.
+ *  - Return the table pointer on success.
+ *  - Return NULL on failure.
  */
 void *gro_tcp4_tbl_create(uint16_t socket_id,
 		uint16_t max_flow_num,
@@ -106,62 +108,56 @@  void *gro_tcp4_tbl_create(uint16_t socket_id,
  * This function destroys a TCP/IPv4 reassembly table.
  *
  * @param tbl
- *  a pointer points to the TCP/IPv4 reassembly table.
+ *  Pointer pointing to the TCP/IPv4 reassembly table.
  */
 void gro_tcp4_tbl_destroy(void *tbl);
 
 /**
- * This function searches for a packet in the TCP/IPv4 reassembly table
- * to merge with the inputted one. To merge two packets is to chain them
- * together and update packet headers. Packets, whose SYN, FIN, RST, PSH
- * CWR, ECE or URG bit is set, are returned immediately. Packets which
- * only have packet headers (i.e. without data) are also returned
- * immediately. Otherwise, the packet is either merged, or inserted into
- * the table. Besides, if there is no available space to insert the
- * packet, this function returns immediately too.
+ * This function merges a TCP/IPv4 packet. It doesn't process packets
+ * that have SYN, FIN, RST, PSH, CWR, ECE or URG set, or that have no
+ * payload.
  *
- * This function assumes the inputted packet is with correct IPv4 and
- * TCP checksums. And if two packets are merged, it won't re-calculate
- * IPv4 and TCP checksums. Besides, if the inputted packet is IP
- * fragmented, it assumes the packet is complete (with TCP header).
+ * This function doesn't check if the packet has correct checksums and
+ * doesn't re-calculate checksums for the merged packet. Additionally,
+ * it assumes the packets are complete (i.e., MF==0 && frag_off==0),
+ * when IP fragmentation is possible (i.e., DF==0). It returns the
+ * packet, if the packet has invalid parameters (e.g. SYN bit is set)
+ * or there is no available space in the table.
  *
  * @param pkt
- *  packet to reassemble.
+ *  Packet to reassemble
  * @param tbl
- *  a pointer that points to a TCP/IPv4 reassembly table.
+ *  Pointer pointing to the TCP/IPv4 reassembly table
  * @start_time
- *  the start time that the packet is inserted into the table
+ *  The time when the packet is inserted into the table
  *
  * @return
- *  if the packet doesn't have data, or SYN, FIN, RST, PSH, CWR, ECE
- *  or URG bit is set, or there is no available space in the table to
- *  insert a new item or a new key, return a negative value. If the
- *  packet is merged successfully, return an positive value. If the
- *  packet is inserted into the table, return 0.
+ *  - Return a positive value if the packet is merged.
+ *  - Return zero if the packet isn't merged but stored in the table.
+ *  - Return a negative value for invalid parameters or no available
+ *    space in the table.
  */
 int32_t gro_tcp4_reassemble(struct rte_mbuf *pkt,
 		struct gro_tcp4_tbl *tbl,
 		uint64_t start_time);
 
 /**
- * This function flushes timeout packets in a TCP/IPv4 reassembly table
- * to applications, and without updating checksums for merged packets.
- * The max number of flushed timeout packets is the element number of
- * the array which is used to keep flushed packets.
+ * This function flushes timed-out packets in a TCP/IPv4 reassembly
+ * table, without updating checksums.
  *
  * @param tbl
- *  a pointer that points to a TCP GRO table.
+ *  TCP/IPv4 reassembly table pointer
  * @param flush_timestamp
- *  this function flushes packets which are inserted into the table
- *  before or at the flush_timestamp.
+ *  Flush packets which are inserted into the table before or at the
+ *  flush_timestamp.
  * @param out
- *  pointer array which is used to keep flushed packets.
+ *  Pointer array used to keep flushed packets
  * @param nb_out
- *  the element number of out. It's also the max number of timeout
+ *  The element number in 'out'. It also determines the maximum number of
  *  packets that can be flushed finally.
  *
  * @return
- *  the number of packets that are returned.
+ *  The number of flushed packets
  */
 uint16_t gro_tcp4_tbl_timeout_flush(struct gro_tcp4_tbl *tbl,
 		uint64_t flush_timestamp,
@@ -173,10 +169,131 @@  uint16_t gro_tcp4_tbl_timeout_flush(struct gro_tcp4_tbl *tbl,
  * reassembly table.
  *
  * @param tbl
- *  pointer points to a TCP/IPv4 reassembly table.
+ *  TCP/IPv4 reassembly table pointer
  *
  * @return
- *  the number of packets in the table
+ *  The number of packets in the table
  */
 uint32_t gro_tcp4_tbl_pkt_count(void *tbl);
+
+/*
+ * Check if two TCP/IPv4 packets belong to the same flow.
+ */
+static inline int
+is_same_tcp4_flow(struct tcp4_flow_key k1, struct tcp4_flow_key k2)
+{
+	return (is_same_ether_addr(&k1.eth_saddr, &k2.eth_saddr) &&
+			is_same_ether_addr(&k1.eth_daddr, &k2.eth_daddr) &&
+			(k1.ip_src_addr == k2.ip_src_addr) &&
+			(k1.ip_dst_addr == k2.ip_dst_addr) &&
+			(k1.recv_ack == k2.recv_ack) &&
+			(k1.src_port == k2.src_port) &&
+			(k1.dst_port == k2.dst_port));
+}
+
+/*
+ * Check if two TCP/IPv4 packets are neighbors.
+ */
+static inline int
+check_seq_option(struct gro_tcp4_item *item,
+		struct tcp_hdr *tcph,
+		uint32_t sent_seq,
+		uint16_t ip_id,
+		uint16_t tcp_hl,
+		uint16_t tcp_dl,
+		uint16_t l2_offset,
+		uint8_t is_atomic)
+{
+	struct rte_mbuf *pkt_orig = item->firstseg;
+	struct ipv4_hdr *iph_orig;
+	struct tcp_hdr *tcph_orig;
+	uint16_t len, l4_len_orig;
+
+	iph_orig = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt_orig, char *) +
+			l2_offset + pkt_orig->l2_len);
+	tcph_orig = (struct tcp_hdr *)((char *)iph_orig + pkt_orig->l3_len);
+	l4_len_orig = pkt_orig->l4_len;
+
+	/* Check if TCP option fields equal */
+	len = RTE_MAX(tcp_hl, l4_len_orig) - sizeof(struct tcp_hdr);
+	if ((tcp_hl != l4_len_orig) || ((len > 0) &&
+				(memcmp(tcph + 1, tcph_orig + 1,
+					len) != 0)))
+		return 0;
+
+	/* Don't merge packets whose DF bits are different */
+	if (unlikely(item->is_atomic ^ is_atomic))
+		return 0;
+
+	/* Check if the two packets are neighbors */
+	len = pkt_orig->pkt_len - l2_offset - pkt_orig->l2_len -
+		pkt_orig->l3_len - l4_len_orig;
+	if ((sent_seq == item->sent_seq + len) && (is_atomic ||
+				(ip_id == item->ip_id + item->nb_merged)))
+		/* Append the new packet */
+		return 1;
+	else if ((sent_seq + tcp_dl == item->sent_seq) && (is_atomic ||
+				(ip_id + 1 == item->ip_id)))
+		/* Pre-pend the new packet */
+		return -1;
+
+	return 0;
+}
+
+/*
+ * Merge two TCP/IPv4 packets without updating checksums.
+ * If cmp is larger than 0, append the new packet to the
+ * original packet. Otherwise, pre-pend the new packet to
+ * the original packet.
+ */
+static inline int
+merge_two_tcp4_packets(struct gro_tcp4_item *item,
+		struct rte_mbuf *pkt,
+		int cmp,
+		uint32_t sent_seq,
+		uint16_t ip_id,
+		uint16_t l2_offset)
+{
+	struct rte_mbuf *pkt_head, *pkt_tail, *lastseg;
+	uint16_t hdr_len, l2_len;
+
+	if (cmp > 0) {
+		pkt_head = item->firstseg;
+		pkt_tail = pkt;
+	} else {
+		pkt_head = pkt;
+		pkt_tail = item->firstseg;
+	}
+
+	/* Check if the IPv4 packet length is greater than the max value */
+	hdr_len = l2_offset + pkt_head->l2_len + pkt_head->l3_len +
+		pkt_head->l4_len;
+	l2_len = l2_offset > 0 ? pkt_head->outer_l2_len : pkt_head->l2_len;
+	if (unlikely(pkt_head->pkt_len - l2_len + pkt_tail->pkt_len - hdr_len >
+			MAX_IPV4_PKT_LENGTH))
+		return 0;
+
+	/* Remove the packet header */
+	rte_pktmbuf_adj(pkt_tail, hdr_len);
+
+	/* Chain two packets together */
+	if (cmp > 0) {
+		item->lastseg->next = pkt;
+		item->lastseg = rte_pktmbuf_lastseg(pkt);
+	} else {
+		lastseg = rte_pktmbuf_lastseg(pkt);
+		lastseg->next = item->firstseg;
+		item->firstseg = pkt;
+		/* Update sent_seq and ip_id */
+		item->sent_seq = sent_seq;
+		item->ip_id = ip_id;
+	}
+	item->nb_merged++;
+
+	/* Update MBUF metadata for the merged packet */
+	pkt_head->nb_segs += pkt_tail->nb_segs;
+	pkt_head->pkt_len += pkt_tail->pkt_len;
+
+	return 1;
+}
 #endif
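
The neighbor test in check_seq_option() above combines the TCP sequence number with the RFC 6864 handling of the IPv4 ID. The following is a reduced sketch of that decision only, under stated assumptions: struct nbr_item and nbr_check() are hypothetical names, the stored payload length is passed in directly instead of being derived from the mbuf, and the TCP option comparison is left out.

```c
#include <stdint.h>

/* Hypothetical, reduced view of a stored item. */
struct nbr_item {
	uint32_t sent_seq;	/* TCP sequence number of the first byte */
	uint32_t payload_len;	/* TCP payload bytes already stored */
	uint16_t ip_id;		/* IPv4 ID of the first merged packet */
	uint16_t nb_merged;	/* number of packets merged so far */
	uint8_t is_atomic;	/* 1 when the IPv4 ID is ignored (DF set) */
};

/*
 * Return 1 if the new packet can be appended to the item, -1 if it
 * can be pre-pended, and 0 if the two are not neighbors.
 */
static int nbr_check(const struct nbr_item *item, uint32_t sent_seq,
		uint16_t ip_id, uint16_t tcp_dl, uint8_t is_atomic)
{
	/* Packets whose DF bits differ are never merged. */
	if (item->is_atomic != is_atomic)
		return 0;

	/* Append: the new data starts right after the stored data. */
	if (sent_seq == item->sent_seq + item->payload_len &&
			(is_atomic ||
			 ip_id == item->ip_id + item->nb_merged))
		return 1;

	/* Pre-pend: the new data ends right where the stored data starts. */
	if (sent_seq + tcp_dl == item->sent_seq &&
			(is_atomic || ip_id + 1 == item->ip_id))
		return -1;

	return 0;
}
```

With an item that holds two merged packets starting at sequence 1000 with ID 10, a packet at sequence 1200 with ID 12 is appended and a 100-byte packet at sequence 900 with ID 9 is pre-pended. When both sides have DF set, the ID comparison is skipped entirely.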
diff --git a/lib/librte_gro/rte_gro.c b/lib/librte_gro/rte_gro.c
index d6b8cd1..7176c0e 100644
--- a/lib/librte_gro/rte_gro.c
+++ b/lib/librte_gro/rte_gro.c
@@ -23,11 +23,14 @@  static gro_tbl_destroy_fn tbl_destroy_fn[RTE_GRO_TYPE_MAX_NUM] = {
 static gro_tbl_pkt_count_fn tbl_pkt_count_fn[RTE_GRO_TYPE_MAX_NUM] = {
 			gro_tcp4_tbl_pkt_count, NULL};
 
+#define IS_IPV4_TCP_PKT(ptype) (RTE_ETH_IS_IPV4_HDR(ptype) && \
+		((ptype & RTE_PTYPE_L4_TCP) == RTE_PTYPE_L4_TCP))
+
 /*
- * GRO context structure, which is used to merge packets. It keeps
- * many reassembly tables of desired GRO types. Applications need to
- * create GRO context objects before using rte_gro_reassemble to
- * perform GRO.
+ * GRO context structure. It keeps the table structures, which are
+ * used to merge packets, for different GRO types. Before using
+ * rte_gro_reassemble(), applications need to create the GRO context
+ * first.
  */
 struct gro_ctx {
 	/* GRO types to perform */
@@ -65,7 +68,7 @@  rte_gro_ctx_create(const struct rte_gro_param *param)
 				param->max_flow_num,
 				param->max_item_per_flow);
 		if (gro_ctx->tbls[i] == NULL) {
-			/* destroy all created tables */
+			/* Destroy all created tables */
 			gro_ctx->gro_types = gro_types;
 			rte_gro_ctx_destroy(gro_ctx);
 			return NULL;
@@ -85,8 +88,6 @@  rte_gro_ctx_destroy(void *ctx)
 	uint64_t gro_type_flag;
 	uint8_t i;
 
-	if (gro_ctx == NULL)
-		return;
 	for (i = 0; i < RTE_GRO_TYPE_MAX_NUM; i++) {
 		gro_type_flag = 1ULL << i;
 		if ((gro_ctx->gro_types & gro_type_flag) == 0)
@@ -103,62 +104,54 @@  rte_gro_reassemble_burst(struct rte_mbuf **pkts,
 		uint16_t nb_pkts,
 		const struct rte_gro_param *param)
 {
-	uint16_t i;
-	uint16_t nb_after_gro = nb_pkts;
-	uint32_t item_num;
-
-	/* allocate a reassembly table for TCP/IPv4 GRO */
+	/* Allocate a reassembly table for TCP/IPv4 GRO */
 	struct gro_tcp4_tbl tcp_tbl;
-	struct gro_tcp4_key tcp_keys[RTE_GRO_MAX_BURST_ITEM_NUM];
+	struct gro_tcp4_flow tcp_flows[RTE_GRO_MAX_BURST_ITEM_NUM];
 	struct gro_tcp4_item tcp_items[RTE_GRO_MAX_BURST_ITEM_NUM] = {{0} };
 
 	struct rte_mbuf *unprocess_pkts[nb_pkts];
-	uint16_t unprocess_num = 0;
+	uint32_t item_num;
 	int32_t ret;
-	uint64_t current_time;
+	uint16_t i, unprocess_num = 0, nb_after_gro = nb_pkts;
 
-	if ((param->gro_types & RTE_GRO_TCP_IPV4) == 0)
+	if (unlikely((param->gro_types & RTE_GRO_TCP_IPV4) == 0))
 		return nb_pkts;
 
-	/* get the actual number of packets */
+	/* Get the maximum number of packets */
 	item_num = RTE_MIN(nb_pkts, (param->max_flow_num *
-			param->max_item_per_flow));
+				param->max_item_per_flow));
 	item_num = RTE_MIN(item_num, RTE_GRO_MAX_BURST_ITEM_NUM);
 
 	for (i = 0; i < item_num; i++)
-		tcp_keys[i].start_index = INVALID_ARRAY_INDEX;
+		tcp_flows[i].start_index = INVALID_ARRAY_INDEX;
 
-	tcp_tbl.keys = tcp_keys;
+	tcp_tbl.flows = tcp_flows;
 	tcp_tbl.items = tcp_items;
-	tcp_tbl.key_num = 0;
+	tcp_tbl.flow_num = 0;
 	tcp_tbl.item_num = 0;
-	tcp_tbl.max_key_num = item_num;
+	tcp_tbl.max_flow_num = item_num;
 	tcp_tbl.max_item_num = item_num;
 
-	current_time = rte_rdtsc();
-
 	for (i = 0; i < nb_pkts; i++) {
-		if ((pkts[i]->packet_type & (RTE_PTYPE_L3_IPV4 |
-					RTE_PTYPE_L4_TCP)) ==
-				(RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L4_TCP)) {
-			ret = gro_tcp4_reassemble(pkts[i],
-					&tcp_tbl,
-					current_time);
+		if (IS_IPV4_TCP_PKT(pkts[i]->packet_type)) {
+			/*
+			 * The timestamp is ignored, since all packets
+			 * will be flushed from the tables.
+			 */
+			ret = gro_tcp4_reassemble(pkts[i], &tcp_tbl, 0);
 			if (ret > 0)
-				/* merge successfully */
+				/* Merged successfully */
 				nb_after_gro--;
-			else if (ret < 0) {
-				unprocess_pkts[unprocess_num++] =
-					pkts[i];
-			}
+			else if (ret < 0)
+				unprocess_pkts[unprocess_num++] = pkts[i];
 		} else
 			unprocess_pkts[unprocess_num++] = pkts[i];
 	}
 
-	/* re-arrange GROed packets */
 	if (nb_after_gro < nb_pkts) {
-		i = gro_tcp4_tbl_timeout_flush(&tcp_tbl, current_time,
-				pkts, nb_pkts);
+		/* Flush all packets from the tables */
+		i = gro_tcp4_tbl_timeout_flush(&tcp_tbl, 0, pkts, nb_pkts);
+		/* Copy unprocessed packets */
 		if (unprocess_num > 0) {
 			memcpy(&pkts[i], unprocess_pkts,
 					sizeof(struct rte_mbuf *) *
@@ -174,31 +167,28 @@  rte_gro_reassemble(struct rte_mbuf **pkts,
 		uint16_t nb_pkts,
 		void *ctx)
 {
-	uint16_t i, unprocess_num = 0;
 	struct rte_mbuf *unprocess_pkts[nb_pkts];
 	struct gro_ctx *gro_ctx = ctx;
+	void *tcp_tbl;
 	uint64_t current_time;
+	uint16_t i, unprocess_num = 0;
 
-	if ((gro_ctx->gro_types & RTE_GRO_TCP_IPV4) == 0)
+	if (unlikely((gro_ctx->gro_types & RTE_GRO_TCP_IPV4) == 0))
 		return nb_pkts;
 
+	tcp_tbl = gro_ctx->tbls[RTE_GRO_TCP_IPV4_INDEX];
 	current_time = rte_rdtsc();
 
 	for (i = 0; i < nb_pkts; i++) {
-		if ((pkts[i]->packet_type & (RTE_PTYPE_L3_IPV4 |
-					RTE_PTYPE_L4_TCP)) ==
-				(RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L4_TCP)) {
-			if (gro_tcp4_reassemble(pkts[i],
-						gro_ctx->tbls
-						[RTE_GRO_TCP_IPV4_INDEX],
+		if (IS_IPV4_TCP_PKT(pkts[i]->packet_type)) {
+			if (gro_tcp4_reassemble(pkts[i], tcp_tbl,
 						current_time) < 0)
 				unprocess_pkts[unprocess_num++] = pkts[i];
 		} else
 			unprocess_pkts[unprocess_num++] = pkts[i];
 	}
 	if (unprocess_num > 0) {
-		memcpy(pkts, unprocess_pkts,
-				sizeof(struct rte_mbuf *) *
+		memcpy(pkts, unprocess_pkts, sizeof(struct rte_mbuf *) *
 				unprocess_num);
 	}
 
@@ -224,6 +214,7 @@  rte_gro_timeout_flush(void *ctx,
 				flush_timestamp,
 				out, max_nb_out);
 	}
+
 	return 0;
 }
 
@@ -232,19 +223,20 @@  rte_gro_get_pkt_count(void *ctx)
 {
 	struct gro_ctx *gro_ctx = ctx;
 	gro_tbl_pkt_count_fn pkt_count_fn;
+	uint64_t gro_types = gro_ctx->gro_types, flag;
 	uint64_t item_num = 0;
-	uint64_t gro_type_flag;
 	uint8_t i;
 
-	for (i = 0; i < RTE_GRO_TYPE_MAX_NUM; i++) {
-		gro_type_flag = 1ULL << i;
-		if ((gro_ctx->gro_types & gro_type_flag) == 0)
+	for (i = 0; i < RTE_GRO_TYPE_MAX_NUM && gro_types; i++) {
+		flag = 1ULL << i;
+		if ((gro_types & flag) == 0)
 			continue;
 
+		gro_types ^= flag;
 		pkt_count_fn = tbl_pkt_count_fn[i];
-		if (pkt_count_fn == NULL)
-			continue;
-		item_num += pkt_count_fn(gro_ctx->tbls[i]);
+		if (pkt_count_fn)
+			item_num += pkt_count_fn(gro_ctx->tbls[i]);
 	}
+
 	return item_num;
 }
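The reworked loop in rte_gro_get_pkt_count() above clears each handled type bit, so iteration stops as soon as no enabled GRO types remain. A minimal standalone sketch of that pattern follows. It is not part of the patch. All sketch_* names are hypothetical, and the table size of 64 is an assumption matching the width of the uint64_t type mask.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: one slot per bit of the 64-bit type mask. */
#define SKETCH_TYPE_MAX_NUM 64

/* Hypothetical per-type counter callback. */
typedef uint32_t (*sketch_count_fn)(void *tbl);

/* Example callback: the "table" is just a uint32_t holding its count. */
static uint32_t
sketch_read_u32(void *tbl)
{
	return *(uint32_t *)tbl;
}

/*
 * Sum the counts of all enabled types. Each handled bit is cleared
 * from 'types', so the loop condition ends the scan early once every
 * enabled type has been visited.
 */
static uint64_t
sketch_pkt_count(uint64_t types, sketch_count_fn fns[], void *tbls[])
{
	uint64_t total = 0, flag;
	uint8_t i;

	for (i = 0; i < SKETCH_TYPE_MAX_NUM && types; i++) {
		flag = 1ULL << i;
		if ((types & flag) == 0)
			continue;

		types ^= flag;          /* this type is handled */
		if (fns[i])             /* types without a counter add nothing */
			total += fns[i](tbls[i]);
	}

	return total;
}
```

With only bit 0 set, the loop body runs once and then exits, instead of testing all 64 bit positions.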
diff --git a/lib/librte_gro/rte_gro.h b/lib/librte_gro/rte_gro.h
index 81a2eac..7979a59 100644
--- a/lib/librte_gro/rte_gro.h
+++ b/lib/librte_gro/rte_gro.h
@@ -31,8 +31,8 @@  extern "C" {
 /**< TCP/IPv4 GRO flag */
 
 /**
- * A structure which is used to create GRO context objects or tell
- * rte_gro_reassemble_burst() what reassembly rules are demanded.
+ * Structure used to create GRO context objects or used to pass
+ * application-determined parameters to rte_gro_reassemble_burst().
  */
 struct rte_gro_param {
 	uint64_t gro_types;
@@ -78,26 +78,23 @@  void rte_gro_ctx_destroy(void *ctx);
 
 /**
  * This is one of the main reassembly APIs, which merges numbers of
- * packets at a time. It assumes that all inputted packets are with
- * correct checksums. That is, applications should guarantee all
- * inputted packets are correct. Besides, it doesn't re-calculate
- * checksums for merged packets. If inputted packets are IP fragmented,
- * this function assumes them are complete (i.e. with L4 header). After
- * finishing processing, it returns all GROed packets to applications
- * immediately.
+ * packets at a time. It doesn't check if input packets have correct
+ * checksums and doesn't re-calculate checksums for merged packets.
+ * It assumes the packets are complete (i.e., MF==0 && frag_off==0),
+ * when IP fragmentation is possible (i.e., DF==0). The GROed packets
+ * are returned as soon as the function finishes.
  *
  * @param pkts
- *  a pointer array which points to the packets to reassemble. Besides,
- *  it keeps mbuf addresses for the GROed packets.
+ *  Array of pointers to the packets to reassemble. It is also used to
+ *  store the MBUF addresses of the GROed packets.
  * @param nb_pkts
- *  the number of packets to reassemble.
+ *  The number of packets to reassemble.
  * @param param
- *  applications use it to tell rte_gro_reassemble_burst() what rules
- *  are demanded.
+ *  Application-determined parameters for reassembling packets.
  *
  * @return
- *  the number of packets after been GROed. If no packets are merged,
- *  the returned value is nb_pkts.
+ *  The number of packets after being GROed. If no packets are merged,
+ *  the return value equals nb_pkts.
  */
 uint16_t rte_gro_reassemble_burst(struct rte_mbuf **pkts,
 		uint16_t nb_pkts,
@@ -107,32 +104,28 @@  uint16_t rte_gro_reassemble_burst(struct rte_mbuf **pkts,
  * @warning
  * @b EXPERIMENTAL: this API may change without prior notice
  *
- * Reassembly function, which tries to merge inputted packets with
- * the packets in the reassembly tables of a given GRO context. This
- * function assumes all inputted packets are with correct checksums.
- * And it won't update checksums if two packets are merged. Besides,
- * if inputted packets are IP fragmented, this function assumes they
- * are complete packets (i.e. with L4 header).
+ * Reassembly function, which tries to merge input packets with the
+ * existing packets in the reassembly tables of a given GRO context.
+ * It doesn't check if input packets have correct checksums and doesn't
+ * re-calculate checksums for merged packets. Additionally, it assumes
+ * the packets are complete (i.e., MF==0 && frag_off==0), when IP
+ * fragmentation is possible (i.e., DF==0).
  *
- * If the inputted packets don't have data or are with unsupported GRO
- * types etc., they won't be processed and are returned to applications.
- * Otherwise, the inputted packets are either merged or inserted into
- * the table. If applications want get packets in the table, they need
- * to call flush API.
+ * If the input packets have invalid parameters (e.g. no data payload,
+ * unsupported GRO types), they are returned to applications. Otherwise,
+ * they are either merged or inserted into the table. To get the GROed
+ * packets, applications need to flush them from the tables via the
+ * flush API.
  *
  * @param pkts
- *  packet to reassemble. Besides, after this function finishes, it
- *  keeps the unprocessed packets (e.g. without data or unsupported
- *  GRO types).
+ *  Packets to reassemble. It's also used to store the unprocessed packets.
  * @param nb_pkts
- *  the number of packets to reassemble.
+ *  The number of packets to reassemble.
  * @param ctx
- *  a pointer points to a GRO context object.
+ *  GRO context object pointer.
  *
  * @return
- *  return the number of unprocessed packets (e.g. without data or
- *  unsupported GRO types). If all packets are processed (merged or
- *  inserted into the table), return 0.
+ *  The number of unprocessed packets.
  */
 uint16_t rte_gro_reassemble(struct rte_mbuf **pkts,
 		uint16_t nb_pkts,
@@ -142,29 +135,28 @@  uint16_t rte_gro_reassemble(struct rte_mbuf **pkts,
  * @warning
  * @b EXPERIMENTAL: this API may change without prior notice
  *
- * This function flushes the timeout packets from reassembly tables of
- * desired GRO types. The max number of flushed timeout packets is the
- * element number of the array which is used to keep the flushed packets.
+ * This function flushes the timed-out packets from the reassembly tables
+ * of desired GRO types. The maximum number of flushed packets is the
+ * number of elements in 'out'.
  *
- * Besides, this function won't re-calculate checksums for merged
- * packets in the tables. That is, the returned packets may be with
- * wrong checksums.
+ * Additionally, the flushed packets may have incorrect checksums, since
+ * this function doesn't re-calculate checksums for merged packets.
  *
  * @param ctx
- *  a pointer points to a GRO context object.
+ *  GRO context object pointer.
  * @param timeout_cycles
- *  max TTL for packets in reassembly tables, measured in nanosecond.
+ *  The max TTL for packets in reassembly tables, measured in nanoseconds.
  * @param gro_types
- *  this function only flushes packets which belong to the GRO types
- *  specified by gro_types.
+ *  This function flushes packets whose GRO types are specified by
+ *  gro_types.
  * @param out
- *  a pointer array that is used to keep flushed timeout packets.
+ *  Pointer array used to keep flushed packets.
  * @param max_nb_out
- *  the element number of out. It's also the max number of timeout
+ *  The number of elements in 'out'. It's also the max number of timeout
  *  packets that can be flushed finally.
  *
  * @return
- *  the number of flushed packets. If no packets are flushed, return 0.
+ *  The number of flushed packets.
  */
 uint16_t rte_gro_timeout_flush(void *ctx,
 		uint64_t timeout_cycles,
@@ -180,10 +172,10 @@  uint16_t rte_gro_timeout_flush(void *ctx,
  * of a given GRO context.
  *
  * @param ctx
- *  pointer points to a GRO context object.
+ *  GRO context object pointer.
  *
  * @return
- *  the number of packets in all reassembly tables.
+ *  The number of packets in the tables.
  */
 uint64_t rte_gro_get_pkt_count(void *ctx);