[PATCH v2 10/11] eventdev: RFC clarify comments on scheduling types

Bruce Richardson bruce.richardson at intel.com
Wed Jan 24 12:21:16 CET 2024
Previous message (by thread): [PATCH v2 10/11] eventdev: RFC clarify comments on scheduling types
Next message (by thread): [PATCH v2 10/11] eventdev: RFC clarify comments on scheduling types
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On Tue, Jan 23, 2024 at 05:19:18PM +0100, Mattias Rönnblom wrote:
> On 2024-01-19 18:43, Bruce Richardson wrote:
> > The description of ordered and atomic scheduling given in the eventdev
> > doxygen documentation was not always clear. Try and simplify this so
> > that it is clearer for the end-user of the application
> > 
> > Signed-off-by: Bruce Richardson <bruce.richardson at intel.com>
> > ---
> > 
> > NOTE TO REVIEWERS:
> > I've updated this based on my understanding of what these scheduling
> > types are meant to do. It matches my understanding of the support
> > offered by our Intel DLB2 driver, as well as the SW eventdev, and I
> > believe the DSW eventdev too. If it does not match the behaviour of
> > other eventdevs, let's have a discussion to see if we can reach a good
> > definition of the behaviour that is common.
> > ---
> >   lib/eventdev/rte_eventdev.h | 47 ++++++++++++++++++++-----------------
> >   1 file changed, 25 insertions(+), 22 deletions(-)
> > 
> > diff --git a/lib/eventdev/rte_eventdev.h b/lib/eventdev/rte_eventdev.h
> > index 2c6576e921..cb13602ffb 100644
> > --- a/lib/eventdev/rte_eventdev.h
> > +++ b/lib/eventdev/rte_eventdev.h
> > @@ -1313,26 +1313,24 @@ struct rte_event_vector {
> >   #define RTE_SCHED_TYPE_ORDERED          0
> >   /**< Ordered scheduling
> >    *
> > - * Events from an ordered flow of an event queue can be scheduled to multiple
> > + * Events from an ordered event queue can be scheduled to multiple
> 
> What is the rationale for this change?
> 
> An implementation that impose a total order on all events on a particular
> ordered queue will still adhere to the current, more relaxed, per-flow
> ordering semantics.
> 
> An application wanting a total order would just set the flow id to 0 on all
> events destined that queue, and it would work on all event devices.
> 
> Why don't you just put a note in the DLB driver saying "btw it's total
> order", so any application where per-flow ordering is crucial for
> performance (i.e., where the potentially needless head-of-line blocking is
> an issue) can use multiple queues when running with the DLB.
> 
> In the API as-written, the app is free to express more relaxed ordering
> requirements (i.e., to have multiple flows) and it's up to the event device
> to figure out if it's in a position where it can translate this to lower
> latency.
> 

Yes, you are right. I'll roll-back or rework this change in V3. Keep it
documented that flow-ordering is guaranteed, but note that some
implementations may use total ordering to achieve that.

> >    * ports for concurrent processing while maintaining the original event order.
> 
> Maybe it's worth mentioning what is the original event order. "(i.e., the
> order in which the events were enqueued to the queue)". Especially since one
> like to specify what ordering guarantees one have of events enqueued to the
> same queue on different ports and by different lcores).
> 
> I don't know where that information should go though, since it's relevant
> for both atomic and ordered-type queues.
> 

It's probably more relevant for ordered, but I'll try and see where it's
best to go.

> >    * This scheme enables the user to achieve high single flow throughput by
> > - * avoiding SW synchronization for ordering between ports which bound to cores.
> > - *
> > - * The source flow ordering from an event queue is maintained when events are
> > - * enqueued to their destination queue within the same ordered flow context.
> > - * An event port holds the context until application call
> > - * rte_event_dequeue_burst() from the same port, which implicitly releases
> > - * the context.
> > - * User may allow the scheduler to release the context earlier than that
> > - * by invoking rte_event_enqueue_burst() with RTE_EVENT_OP_RELEASE operation.
> > - *
> > - * Events from the source queue appear in their original order when dequeued
> > - * from a destination queue.
> > - * Event ordering is based on the received event(s), but also other
> > - * (newly allocated or stored) events are ordered when enqueued within the same
> > - * ordered context. Events not enqueued (e.g. released or stored) within the
> > - * context are  considered missing from reordering and are skipped at this time
> > - * (but can be ordered again within another context).
> > + * avoiding SW synchronization for ordering between ports which are polled by
> > + * different cores.
> > + *
> > + * As events are scheduled to ports/cores, the original event order from the
> > + * source event queue is recorded internally in the scheduler. As events are
> > + * returned (via FORWARD type enqueue) to the scheduler, the original event
> > + * order is restored before the events are enqueued into their new destination
> > + * queue.
> 
> Delete the first sentence on implementation.
> 
> "As events are re-enqueued to the next queue (with the op field set to
> RTE_EVENT_OP_FORWARD), the event device restores the original event order
> before the events arrive on the destination queue."
> 
> > + *
> > + * Any events not forwarded, ie. dropped explicitly via RELEASE or implicitly
> > + * released by the next dequeue from a port, are skipped by the reordering
> > + * stage and do not affect the reordering of returned events.
> > + *
> > + * The ordering behaviour of NEW events with respect to FORWARD events is
> > + * undefined and implementation dependent.
> 
> For some reason I find this a little vague. "NEW and FORWARD events enqueued
> to a queue are not ordered in relation to each other (even if the flow id is
> the same)."
> 
> I think I agree that NEW shouldn't be ordered vis-a-vi FORWARD, but maybe
> one should say that an event device should avoid excessive reordering NEW
> and FORWARD events.
> 
> I think it would also be helpful to address port-to-port ordering guarantees
> (or a lack thereof).
> 
> "Events enqueued on one port are not ordered in relation to events enqueued
> on some other port."
> 
> Or are they? Not in DSW, at least, and I'm not sure I see a use case for
> such a guarantee, but it's a little counter-intuitive to have them
> potentially re-shuffled.
> 
> (This is also relevant for atomic queues.)
> 

Ack.

> >    *
> >    * @see rte_event_queue_setup(), rte_event_dequeue_burst(), RTE_EVENT_OP_RELEASE
> >    */
> > @@ -1340,18 +1338,23 @@ struct rte_event_vector {
> >   #define RTE_SCHED_TYPE_ATOMIC           1
> >   /**< Atomic scheduling
> >    *
> > - * Events from an atomic flow of an event queue can be scheduled only to a
> > + * Events from an atomic flow, identified by @ref rte_event.flow_id,
> 
> A flow is identified by the combination of queue_id and flow_id, so if you
> reference one you should also reference the other.
> 

Yes, this is probably one to be reflected globally. Also on your previous
comment about priority, I believe that a flow for ordering guarantees
should be a combination of queue_id, flow_id and priority. Two packets with
different priorities should expect to be reordered, since that tends to be
what priority implies.

> > + * of an event queue can be scheduled only to a
> >    * single port at a time. The port is guaranteed to have exclusive (atomic)
> >    * access to the associated flow context, which enables the user to avoid SW
> >    * synchronization. Atomic flows also help to maintain event ordering
> 
> "help" here needs to go, I think. It sounds like a best-effort affair. The
> atomic queue ordering guarantees (or the lack thereof) should be spelled
> out.
> 
> "Event order in an atomic flow is maintained."

Ack.

> 
> > - * since only one port at a time can process events from a flow of an
> > + * since only one port at a time can process events from each flow of an
> >    * event queue.
> 
> Yes, and *but also since* the event device is not reshuffling events
> enqueued to an atomic queue. And that's more complicated than just something
> that falls out of atomicity, especially if you assume that FORWARD type
> enqueues are not ordered with other FORWARD type enqueues on a different
> port.
> 

Ack.

> >    *
> > - * The atomic queue synchronization context is dedicated to the port until
> > + * The atomic queue synchronization context for a flow is dedicated to the port until
> 
> What is an "atomic queue synchronization context" (except for something that
> makes for long sentences).
> 

Yes, it's rather wordy. I like the idea of using lock terminology you
suggest. The use of the word "contexts" in relation to atomic/ordered I
find confusing myself too.

> How about:
> "The atomic flow is locked to the port until /../"
> 
> You could also used the word "bound" instead of "locked".
> 
> >    * application call rte_event_dequeue_burst() from the same port,
> >    * which implicitly releases the context. User may allow the scheduler to
> >    * release the context earlier than that by invoking rte_event_enqueue_burst()
> > - * with RTE_EVENT_OP_RELEASE operation.
> > + * with RTE_EVENT_OP_RELEASE operation for each event from that flow. The context
> > + * is only released once the last event from the flow, outstanding on the port,
> > + * is released. So long as there is one event from an atomic flow scheduled to
> > + * a port/core (including any events in the port's dequeue queue, not yet read
> > + * by the application), that port will hold the synchronization context.
> 
> In case you like the "atomic flow locked/bound to port", this part would
> also need updating.
> 
> Maybe here is a good place to add a note on memory ordering and event
> ordering.
> 
> "Any memory stores done as a part of event processing will be globally
> visible before the next event in the same atomic flow is dequeued on a
> different lcore."
> 
> I.e., enqueue includes write barrier before the event can be seen.
> 
> One should probably mentioned a rmb in dequeue as well.
> 

Do we think that that is necessary? I can add it, but I would have thought
that - as with rings - it could be assumed.

/Bruce
Previous message (by thread): [PATCH v2 10/11] eventdev: RFC clarify comments on scheduling types
Next message (by thread): [PATCH v2 10/11] eventdev: RFC clarify comments on scheduling types
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
More information about the dev mailing list