[dpdk-dev] [PATCH RFC] librte_reorder: new reorder library

Jay Rolette rolette at infiniteio.com
Fri Oct 17 18:26:39 CEST 2014


Thanks for the responses, Reshma.

Can you provide a little more context about the use case that your reorder
library is intended to help with? If I'm understanding your answers
correctly, the functionality seems pretty limited and not something I would
ever end up using, but that may be more about the types of products I build
(deep packet inspection and working at L4-L7 generally, even though running
at or near line-rate).

Please take my comments in the spirit intended... If the design makes sense
for different use cases and I'm not the target audience, that's perfectly
ok and there are probably different trade-offs being made. But if it is
intended to be useful for DPI applications, I'd hate to just be quiet and
end up with something that doesn't get used as much as it might.

I haven't looked at the distributor library, so it's entirely possible it
makes more sense in that context.

More detailed responses to your previous answers inline.

Regards,
Jay

On Fri, Oct 17, 2014 at 4:44 AM, Pattan, Reshma <reshma.pattan at intel.com>
wrote:

>  Hi Jay,
>
>
>
> Please find comments inline.
>
>
>
> Thanks,
>
> Reshma
>
>
>
> *From:* Jay Rolette [mailto:rolette at infiniteio.com]
> *Sent:* Thursday, October 9, 2014 8:02 PM
> *To:* Pattan, Reshma
> *Cc:* dev at dpdk.org
> *Subject:* Re: [dpdk-dev] [PATCH RFC] librte_reorder: new reorder library
>
>
>
> Hi Reshma,
>
>
>
> A few comments and questions about your design...
>
>
>
> 1) How do you envision the reorder library to be used? Based on the
> description, it seems like the expectation is that packet order would be
> maintained at either the interface/port level or maybe at the RX queue
> level. Is that right or am I reading too much between the lines?
>
>
>
> For my purposes (and for network security products I've developed in the
> past), I mostly don't care about box or port-level order. The most
> important thing is to maintain packet order within a flow. Relative order
> of packets in different flows doesn't matter. If there is a way I can
> process packets in parallel and still avoid out-of-order transmission *within
> the flow*, that's very useful. Architecturally, it helps avoid hot-spotting
> in my packet processing pipeline and wasting cycles when load-balancing
> isn't perfect (and it never is).
>
> [Reshma]: Generic parallel processing of packets, based on sequence
> numbers, is planned for the phase 2 version of the distributor, but
> flow-based parallel processing is not.
>

See question at the top of my email about the intended use-case. For DPI
applications, global (box-wide or per port) reordering isn't normally
required. Maintaining order within flows is the important part. Depending
on your implementation and the guarantees you make, the impact it has on
aggregate system throughput can be significant.
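
To make the per-flow point concrete, here is a minimal sketch in plain C of
stamping sequence numbers per flow instead of box-wide, with the flow
definition supplied by the application through a callback. Nothing here is
from the proposed library: struct pkt is a stand-in for struct rte_mbuf, and
flow_key_fn, flow_seq_table, flow_seq_stamp and the table size are
hypothetical names chosen only for illustration.

```c
#include <stdint.h>
#include <stddef.h>

#define FLOW_TABLE_SIZE 1024u /* hypothetical; must be a power of two */

/* Stand-in for struct rte_mbuf; only the fields this sketch needs. */
struct pkt {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
    uint8_t  proto;
    uint32_t flow_id; /* slot chosen by the flow key callback */
    uint32_t seqn;    /* sequence number within that flow slot */
};

/* Application-supplied definition of "what is a flow". */
typedef uint32_t (*flow_key_fn)(const struct pkt *p);

/* FNV-1a style mixing of one 32-bit value into a running hash. */
static uint32_t fnv1a_mix(uint32_t h, uint32_t v)
{
    for (int i = 0; i < 4; i++) {
        h ^= (v >> (8 * i)) & 0xffu;
        h *= 16777619u;
    }
    return h;
}

/* One possible flow definition: the 5-tuple (works for UDP pseudo-flows too). */
static uint32_t flow_key_5tuple(const struct pkt *p)
{
    uint32_t h = 2166136261u;
    h = fnv1a_mix(h, p->src_ip);
    h = fnv1a_mix(h, p->dst_ip);
    h = fnv1a_mix(h, ((uint32_t)p->src_port << 16) | p->dst_port);
    h = fnv1a_mix(h, p->proto);
    return h;
}

struct flow_seq_table {
    flow_key_fn key_fn;                 /* how the app defines a flow */
    uint32_t next_seqn[FLOW_TABLE_SIZE]; /* next sequence number per slot */
};

/* Stamp a packet with its flow slot and the next sequence number in it. */
static void flow_seq_stamp(struct flow_seq_table *t, struct pkt *p)
{
    uint32_t slot = t->key_fn(p) & (FLOW_TABLE_SIZE - 1u);
    p->flow_id = slot;
    p->seqn = t->next_seqn[slot]++;
}
```

Flows that hash to the same slot share one sequence space in this sketch.
That is stricter than necessary, but it still preserves order within each
flow, and an app that needs VLANs, tunnels or the physical port in its flow
definition would only swap the callback.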


>   2) If the reorder library is "flow aware", then give me flexibility on
> deciding what a flow is. Let me define pseudo-flows even if the protocol
> itself isn't connection oriented (i.e., frequently useful to treat UDP
> 5-tuples as a flow). I may want to include tunnels/VLANs/etc. as part of my
> "flow" definition. I may need to include the physical port as part of the
> flow definition.
>
>
>
> Ideally, the library includes the common cases and gives me the option to
> register a callback function for doing whatever sort of "flows" I require
> for my app.
>
> [Reshma]: It is not flow aware. But to reorder the packets of a particular
> flow, you can hand that flow over to the library and the library will give
> you back the reordered data.
>

Given how a couple of other bits are described, I don't think this ends up
helping. More on that a bit further down.


>   3) Is there a way to apply the reorder library to some packets and not
> others? I might want to use for TCP and UDP, but not care about order for
> other IP traffic (for example).
>
> [Reshma]: No, the reorder library will not have any intelligence about
> traffic type (i.e. flows or protocols).
>
> Applications can split traffic by flow or protocol and hand it over to the
> library for reordering.
>

Ditto

>   4) How are you dealing with internal congestion? If I drop a packet
> somewhere in my processing pipeline, how does the TX side of the reorder
> queue/buffer deal with the missing sequence number? Is there some sort of
> timeout mechanism so that it will only wait for X microseconds for a
> missing sequence number?
>
> [Reshma]: The library only handles the packets it has received. No waiting
> mechanism is used for missing packets.
>
> Reorder processing will skip the dropped packets (i.e. leave a gap in the
> reorder buffer) and proceed with allocating slots to the later packets
> that are available.
>
>
>
> Need the ability to bound how long packets are held up in the reorder
> engine before they are released.
>
> [Reshma]: This is dependent upon how frequently packets are enqueued and
> dequeued from it. Packets which are in order and without gaps are dequeued
> at the next call to the dequeue api. If there is a gap, the time taken to
> skip over the gap will depend on the size of the reorder ring.
>

So the window for correcting out-of-order is nothing more than whatever
queueing delays happen to be on the TX queue? That seems... not very
useful. Am I missing something about the design?

For DPI applications, processing time is somewhat variable between
different packets in a flow. I'm assuming L3 apps have similar issues with
control plane traffic. In a low-latency architecture, very few packets
should be sitting in any TX queues so you really need something with some
time/cycle-count constraints to manage that window - i.e., how long should
other packets in a flow be held up in the TX queue waiting for "earlier"
packets vs. transmitting them anyway?
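
As a rough illustration of what a time-bounded window could look like, here
is a minimal sketch in plain C. It is not the proposed design: reorder_win,
win_insert, win_drain, max_wait and the window size are all hypothetical, and
the caller passes in "now" from whatever clock it has (a cycle counter such
as rte_rdtsc() would be one option in a DPDK app). The drain releases
in-order packets immediately, waits at a gap for at most max_wait, then gives
up on the missing sequence numbers and moves on.

```c
#include <stdint.h>
#include <stddef.h>

#define WIN_SIZE 8u              /* hypothetical; must be a power of two */
#define WIN_MASK (WIN_SIZE - 1u)

struct reorder_win {
    void    *slot[WIN_SIZE]; /* packet pointers; NULL means "not here yet" */
    uint32_t next_seqn;      /* next sequence number to transmit */
    uint64_t gap_since;      /* time the gap at the head was first seen */
    int      gap_open;       /* nonzero while waiting on a missing packet */
    uint64_t max_wait;       /* longest time to hold packets behind a gap */
    unsigned buffered;       /* number of packets currently held */
};

/* Store a packet by sequence number. Returns -1 if it is outside the window
 * (late or too far ahead) or a duplicate; the caller decides what to do. */
static int win_insert(struct reorder_win *w, uint32_t seqn, void *pkt)
{
    uint32_t off = seqn - w->next_seqn; /* wrap-safe modulo 2^32 */

    if (off >= WIN_SIZE || w->slot[seqn & WIN_MASK] != NULL)
        return -1;
    w->slot[seqn & WIN_MASK] = pkt;
    w->buffered++;
    return 0;
}

/* Release packets in order. A gap stalls the drain only until max_wait has
 * elapsed since the gap was first seen; after that the gap is skipped. */
static unsigned win_drain(struct reorder_win *w, uint64_t now,
                          void **out, unsigned max)
{
    unsigned n = 0;

    while (n < max && w->buffered > 0) {
        void **s = &w->slot[w->next_seqn & WIN_MASK];

        if (*s != NULL) {            /* in order: release right away */
            out[n++] = *s;
            *s = NULL;
            w->buffered--;
            w->next_seqn++;
            w->gap_open = 0;
        } else if (!w->gap_open) {   /* first sighting of this gap */
            w->gap_open = 1;
            w->gap_since = now;
            break;
        } else if (now - w->gap_since >= w->max_wait) {
            /* Timed out: skip to the next packet we hold. This terminates
             * because buffered > 0 and every held packet is in the window. */
            while (w->slot[w->next_seqn & WIN_MASK] == NULL)
                w->next_seqn++;
            w->gap_open = 0;
        } else {
            break;                   /* still within the wait budget */
        }
    }
    return n;
}
```

The point of the sketch is that the hold time is set by max_wait rather than
by how full the TX queue or the reorder ring happens to be.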

>   Assuming you address this, the reorder engine will also need to deal with
> slow packets that show up after "later" packets were transmitted.
>
> [Reshma]: As of now, the plan is to compare the sequence number of the
> current packet with the min sequence number maintained in the library.
>
> The difference between them should not exceed 2*reorder_buffer_size. If it
> does, we don't handle the packet and drop it.
>
> But we are open to suggestions on how to handle late packets. Should we
> have a config option to drop them, or just dequeue them in the next
> immediate dequeue operation?
>

Config option is the most flexible, but I would expect the normal case to
be to TX the packet "quickly". I'm hesitant to say dequeue it in the next
immediate dequeue operation because there are potential DoS attack vectors
on the system depending on implementation details.
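
For what a config option might look like, here is a minimal sketch in plain
C. It is an assumption-laden illustration, not the proposed API: seq_class,
late_policy, action, classify_seqn and handle_seqn are hypothetical names,
and "late" is defined here by serial-number style arithmetic (a packet more
than half the 32-bit sequence space ahead is treated as being behind), which
is a different rule from the 2*reorder_buffer_size check described above.

```c
#include <stdint.h>

enum seq_class   { SEQ_IN_WINDOW, SEQ_AHEAD, SEQ_LATE };
enum late_policy { LATE_DROP, LATE_TX_IMMEDIATELY };
enum action      { ACT_BUFFER, ACT_TX_NOW, ACT_DROP };

/* Classify a sequence number relative to the window starting at min_seqn.
 * Unsigned subtraction is modulo 2^32, so this survives seqn wrap-around. */
static enum seq_class classify_seqn(uint32_t seqn, uint32_t min_seqn,
                                    uint32_t win_size)
{
    uint32_t ahead = seqn - min_seqn;

    if (ahead < win_size)
        return SEQ_IN_WINDOW;
    if (ahead < 0x80000000u)
        return SEQ_AHEAD;  /* beyond the window, but still "in the future" */
    return SEQ_LATE;       /* behind min_seqn: already skipped or sent */
}

/* Decide what to do with a packet, with late handling left configurable. */
static enum action handle_seqn(uint32_t seqn, uint32_t min_seqn,
                               uint32_t win_size, enum late_policy policy)
{
    switch (classify_seqn(seqn, min_seqn, win_size)) {
    case SEQ_IN_WINDOW:
        return ACT_BUFFER;
    case SEQ_LATE:
        /* Late packets bypass the reorder buffer entirely. */
        return policy == LATE_TX_IMMEDIATELY ? ACT_TX_NOW : ACT_DROP;
    case SEQ_AHEAD:
    default:
        /* Simplification for this sketch: the RFC describes an overflow
         * buffer for this case instead of dropping. */
        return ACT_DROP;
    }
}
```

A real implementation would probably also want to rate-limit the
LATE_TX_IMMEDIATELY path, since an attacker who can force packets to be late
could otherwise use it to get around the reordering altogether.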


>  On Tue, Oct 7, 2014 at 4:33 AM, Pattan, Reshma <reshma.pattan at intel.com>
> wrote:
>
> Hi All,
>
> I am planning to implement a packet reorder library. Details are below;
> please go through them and provide comments.
>
> Requirement:
> To reorder out-of-order packets that are received from different cores.
>
> Usage:
> To be used along with the distributor library. The next version of the
> distributor is planned to distribute incoming packets to all worker cores
> irrespective of the flow type.
> In this case, to ensure in-order delivery of the packets at the output side,
> the reorder library will be used by the tx end.
>
> Assumption:
> All input packets will be marked with sequence number in seqn field of
> mbuf, this will be the reference for reordering at the tx end.
> Sequence number will be of type uint32_t. New sequence number field seqn
> will be added to mbuf structure.
>
> Design:
> a) A reorder buffer (circular buffer) structure will be maintained in the
> reorder library to store reordered packets, along with details of the
> buffer such as the head to drain packets from and the min sequence number.
> b) The library will provide insert and drain functions to reorder and fetch
> out the reordered packets respectively.
> c) Users of the library should pass the packets to the insert function for
> reordering.
>
> Insertion logic:
> The sequence number of the current packet will be used to calculate an
> offset into the reorder buffer, and the packet is written to the location
> in the reorder buffer corresponding to that offset.
> The offset is calculated as the difference between the current packet's
> sequence number and the sequence number associated with the reorder buffer.
>
> When the sequence number wraps, or the offset runs past the reorder buffer
> size, before inserting the new packet we should move offset number of
> packets to another buffer called the overflow buffer, advance the head of
> the reorder buffer by "offset-reorder buffer size", and insert the new
> packet.
>
> Insert function:
> int rte_reorder_insert(struct rte_reorder_buffer *buffer,
>                        struct rte_mbuf *mbuf);
> Note: Another insert function is also planned, to insert a burst of packets.
>
> Reorder buffer:
> struct rte_reorder_buffer {
>         unsigned int size;      /* The size (number of entries) of the buffer. */
>         unsigned int mask;      /* Mask (size - 1) of the buffer */
>         unsigned int head;      /* Current head of buffer */
>         uint32_t min_seqn;      /* latest sequence number associated with buffer */
>         /* buffer to hold reordered mbufs */
>         struct rte_mbuf *entries[MAX_REORDER_BUFFER_SIZE];
> };
>
> d) Users can fetch the reordered packets with the drain function provided
> by the library. Users must pass an mbuf array, and the drain function fills
> the passed mbuf array with the packets from the reorder buffer.
> During the drain operation, overflow buffer packets will be fetched out
> first, and then the reorder buffer packets.
>
> Drain function:
> int rte_reorder_drain(struct rte_reorder_buffer *buffer,
>                       struct rte_mbuf **mbufs);
>
> Thanks,
> Reshma
>
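
For readers following the insertion logic quoted above, the offset
calculation can be sketched in plain C as below. This is a simplified
illustration under stated assumptions, not the RFC's implementation: the
sketch_ names and struct sketch_pkt are hypothetical stand-ins, the overflow
buffer is left out (out-of-window packets are simply rejected), the drain
takes a "max" argument that the RFC prototype does not have, and the drain
stops at the first gap instead of skipping it.

```c
#include <stdint.h>
#include <stddef.h>

#define SKETCH_BUF_SIZE 8u /* hypothetical; must be a power of two */

struct sketch_pkt { uint32_t seqn; }; /* stand-in for struct rte_mbuf */

/* Same fields as the struct in the RFC, with a small fixed size. */
struct sketch_reorder_buffer {
    unsigned int size;     /* number of entries */
    unsigned int mask;     /* size - 1 */
    unsigned int head;     /* index of the slot holding min_seqn */
    uint32_t     min_seqn; /* sequence number expected at head */
    struct sketch_pkt *entries[SKETCH_BUF_SIZE];
};

static void sketch_init(struct sketch_reorder_buffer *b, uint32_t first_seqn)
{
    b->size = SKETCH_BUF_SIZE;
    b->mask = SKETCH_BUF_SIZE - 1u;
    b->head = 0;
    b->min_seqn = first_seqn;
    for (unsigned int i = 0; i < SKETCH_BUF_SIZE; i++)
        b->entries[i] = NULL;
}

/* offset = packet seqn - buffer seqn; the slot is head + offset, wrapped. */
static int sketch_insert(struct sketch_reorder_buffer *b, struct sketch_pkt *p)
{
    uint32_t offset = p->seqn - b->min_seqn; /* modulo 2^32, wrap-safe */

    if (offset >= b->size)
        return -1; /* outside the window; no overflow buffer in this sketch */
    b->entries[(b->head + offset) & b->mask] = p;
    return 0;
}

/* Copy out the contiguous run of packets starting at head. */
static unsigned int sketch_drain(struct sketch_reorder_buffer *b,
                                 struct sketch_pkt **out, unsigned int max)
{
    unsigned int n = 0;

    while (n < max && b->entries[b->head] != NULL) {
        out[n++] = b->entries[b->head];
        b->entries[b->head] = NULL;
        b->head = (b->head + 1u) & b->mask;
        b->min_seqn++;
    }
    return n;
}
```

With the buffer starting at sequence number 100, inserting 102, 100 and 101
in that order and then draining returns them as 100, 101, 102.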

