[dpdk-dev] [RFC v2] doc compression API for DPDK

Verma, Shally Shally.Verma at cavium.com
Fri Feb 16 08:16:58 CET 2018


Hi Fiona, Ahmed

>-----Original Message-----
>From: Ahmed Mansour [mailto:ahmed.mansour at nxp.com]
>Sent: 16 February 2018 02:40
>To: Trahe, Fiona <fiona.trahe at intel.com>; Verma, Shally <Shally.Verma at cavium.com>; dev at dpdk.org
>Cc: Athreya, Narayana Prasad <NarayanaPrasad.Athreya at cavium.com>; Gupta, Ashish <Ashish.Gupta at cavium.com>; Sahu, Sunila
><Sunila.Sahu at cavium.com>; De Lara Guarch, Pablo <pablo.de.lara.guarch at intel.com>; Challa, Mahipal
><Mahipal.Challa at cavium.com>; Jain, Deepak K <deepak.k.jain at intel.com>; Hemant Agrawal <hemant.agrawal at nxp.com>; Roy
>Pledge <roy.pledge at nxp.com>; Youri Querry <youri.querry_1 at nxp.com>
>Subject: Re: [RFC v2] doc compression API for DPDK
>
>On 2/15/2018 1:47 PM, Trahe, Fiona wrote:
>> Hi Shally, Ahmed,
>> Sorry for the delay in replying,
>> Comments below
>>
>>> -----Original Message-----
>>> From: Verma, Shally [mailto:Shally.Verma at cavium.com]
>>> Sent: Wednesday, February 14, 2018 7:41 AM
>>> To: Ahmed Mansour <ahmed.mansour at nxp.com>; Trahe, Fiona <fiona.trahe at intel.com>;
>>> dev at dpdk.org
>>> Cc: Athreya, Narayana Prasad <NarayanaPrasad.Athreya at cavium.com>; Gupta, Ashish
>>> <Ashish.Gupta at cavium.com>; Sahu, Sunila <Sunila.Sahu at cavium.com>; De Lara Guarch, Pablo
>>> <pablo.de.lara.guarch at intel.com>; Challa, Mahipal <Mahipal.Challa at cavium.com>; Jain, Deepak K
>>> <deepak.k.jain at intel.com>; Hemant Agrawal <hemant.agrawal at nxp.com>; Roy Pledge
>>> <roy.pledge at nxp.com>; Youri Querry <youri.querry_1 at nxp.com>
>>> Subject: RE: [RFC v2] doc compression API for DPDK
>>>
>>> Hi Ahmed,
>>>
>>>> -----Original Message-----
>>>> From: Ahmed Mansour [mailto:ahmed.mansour at nxp.com]
>>>> Sent: 02 February 2018 01:53
>>>> To: Trahe, Fiona <fiona.trahe at intel.com>; Verma, Shally <Shally.Verma at cavium.com>; dev at dpdk.org
>>>> Cc: Athreya, Narayana Prasad <NarayanaPrasad.Athreya at cavium.com>; Gupta, Ashish
>>> <Ashish.Gupta at cavium.com>; Sahu, Sunila
>>>> <Sunila.Sahu at cavium.com>; De Lara Guarch, Pablo <pablo.de.lara.guarch at intel.com>; Challa,
>>> Mahipal
>>>> <Mahipal.Challa at cavium.com>; Jain, Deepak K <deepak.k.jain at intel.com>; Hemant Agrawal
>>> <hemant.agrawal at nxp.com>; Roy
>>>> Pledge <roy.pledge at nxp.com>; Youri Querry <youri.querry_1 at nxp.com>
>>>> Subject: Re: [RFC v2] doc compression API for DPDK
>>>>
>>>> On 1/31/2018 2:03 PM, Trahe, Fiona wrote:
>>>>> Hi Ahmed, Shally,
>>>>>
>>>>> ///snip///
>>>>>>>>>>> D.1.1 Stateless and OUT_OF_SPACE
>>>>>>>>>>> ------------------------------------------------
>>>>>>>>>>> OUT_OF_SPACE is a condition when output buffer runs out of space
>>>>>>>> and
>>>>>>>>>> where PMD still has more data to produce. If PMD run into such
>>>>>>>> condition,
>>>>>>>>>> then it's an error condition in stateless processing.
>>>>>>>>>>> In such case, PMD resets itself and return with status
>>>>>>>>>> RTE_COMP_OP_STATUS_OUT_OF_SPACE with produced=consumed=0
>>>>>>>> i.e.
>>>>>>>>>> no input read, no output written.
>>>>>>>>>>> Application can resubmit an full input with larger output buffer size.
>>>>>>>>>> [Ahmed] Can we add an option to allow the user to read the data that
>>>>>>>> was
>>>>>>>>>> produced while still reporting OUT_OF_SPACE? this is mainly useful for
>>>>>>>>>> decompression applications doing search.
>>>>>>>>> [Shally] It is there but applicable for stateful operation type (please refer to
>>>>>>>> handling out_of_space under
>>>>>>>>> "Stateful Section").
>>>>>>>>> By definition, "stateless" here means that application (such as IPCOMP)
>>>>>>>> knows maximum output size
>>>>>>>>> guaranteedly and ensure that uncompressed data size cannot grow more
>>>>>>>> than provided output buffer.
>>>>>>>>> Such apps can submit an op with type = STATELESS and provide full input,
>>>>>>>> then PMD assume it has
>>>>>>>>> sufficient input and output and thus doesn't need to maintain any contexts
>>>>>>>> after op is processed.
>>>>>>>>> If application doesn't know about max output size, then it should process it
>>>>>>>> as stateful op i.e. setup op
>>>>>>>>> with type = STATEFUL and attach a stream so that PMD can maintain
>>>>>>>> relevant context to handle such
>>>>>>>>> condition.
>>>>>>>> [Fiona] There may be an alternative that's useful for Ahmed, while still
>>>>>>>> respecting the stateless concept.
>>>>>>>> In Stateless case where a PMD reports OUT_OF_SPACE in decompression
>>>>>>>> case
>>>>>>>> it could also return consumed=0, produced = x, where x>0. X indicates the
>>>>>>>> amount of valid data which has
>>>>>>>>  been written to the output buffer. It is not complete, but if an application
>>>>>>>> wants to search it it may be sufficient.
>>>>>>>> If the application still wants the data it must resubmit the whole input with a
>>>>>>>> bigger output buffer, and
>>>>>>>>  decompression will be repeated from the start, it
>>>>>>>>  cannot expect to continue on as the PMD has not maintained state, history
>>>>>>>> or data.
>>>>>>>> I don't think there would be any need to indicate this in capabilities, PMDs
>>>>>>>> which cannot provide this
>>>>>>>> functionality would always return produced=consumed=0, while PMDs which
>>>>>>>> can could set produced > 0.
>>>>>>>> If this works for you both, we could consider a similar case for compression.
>>>>>>>>
>>>>>>> [Shally] Sounds Fine to me. Though then in that case, consume should also be updated to actual
>>>>>> consumed by PMD.
>>>>>>> Setting consumed = 0 with produced > 0 doesn't correlate.
>>>>>> [Ahmed]I like Fiona's suggestion, but I also do not like the implication
>>>>>> of returning consumed = 0. At the same time returning consumed = y
>>>>>> implies to the user that it can proceed from the middle. I prefer the
>>>>>> consumed = 0 implementation, but I think a different return is needed to
>>>>>> distinguish it from OUT_OF_SPACE that the use can recover from. Perhaps
>>>>>> OUT_OF_SPACE_RECOVERABLE and OUT_OF_SPACE_TERMINATED. This also allows
>>>>>> future PMD implementations to provide recover-ability even in STATELESS
>>>>>> mode if they so wish. In this model STATELESS or STATEFUL would be a
>>>>>> hint for the PMD implementation to make optimizations for each case, but
>>>>>> it does not force the PMD implementation to limit functionality if it
>>>>>> can provide recover-ability.
>>>>> [Fiona] So you're suggesting the following:
>>>>> OUT_OF_SPACE - returned only on stateful operation. Not an error. Op.produced
>>>>>     can be used and next op in stream should continue on from op.consumed+1.
>>>>> OUT_OF_SPACE_TERMINATED - returned only on stateless operation.
>>>>>     Error condition, no recovery possible.
>>>>>     consumed=produced=0. Application must resubmit all input data with
>>>>>     a bigger output buffer.
>>>>> OUT_OF_SPACE_RECOVERABLE - returned only on stateless operation, some recovery possible.
>>>>>      - consumed = 0, produced > 0. Application must resubmit all input data with
>>>>>         a bigger output buffer. However in decompression case, data up to produced
>>>>>         in dst buffer may be inspected/searched. Never happens in compression
>>>>>         case as output data would be meaningless.
>>>>>      - consumed > 0, produced > 0. PMD has stored relevant state and history and so
>>>>>         can convert to stateful, using op.produced and continuing from consumed+1.
>>>>> I don't expect our PMDs to use this last case, but maybe this works for others?
>>>>> I'm not convinced it's not just adding complexity. It sounds like a version of stateful
>>>>> without a stream, and maybe less efficient?
>>>>> If so should it respect the FLUSH flag? Which would have been FULL or FINAL in the op.
>>>>> Or treat it as FLUSH_NONE or SYNC? I don't know why an application would not
>>>>> simply have submitted a STATEFUL request if this is the behaviour it wants?
>>>> [Ahmed] I was actually suggesting the removal of OUT_OF_SPACE entirely
>>>> and replacing it with
>>>> OUT_OF_SPACE_TERMINATED - returned only on stateless operation.
>>>>        Error condition, no recovery possible.
>>>>        - consumed=0 produced=amount of data produced. Application must
>>>> resubmit all input data with
>>>>          a bigger output buffer to process all of the op
>>>> OUT_OF_SPACE_RECOVERABLE -  Normally returned on stateful operation. Not
>>>> an error. Op.produced
>>>>    can be used and next op in stream should continue on from op.consumed+1.
>>>>        -  consumed > 0, produced > 0. PMD has stored relevant state and
>>>> history and so
>>>>            can continue using op.produced and continuing from consumed+1.
>>>>
>>>> We would not return OUT_OF_SPACE_RECOVERABLE in stateless mode in our
>>>> implementation either.
>>>>
>>>> Regardless of speculative future PMDs. The more important aspect of this
>>>> for today is that the return status clearly determines
>>>> the meaning of "consumed". If it is RECOVERABLE then consumed is
>>>> meaningful. if it is TERMINATED then consumed in meaningless.
>>>> This way we take away the ambiguity of having OUT_OF_SPACE mean two
>>>> different user work flows.
>>>>
>>>> A speculative future PMD may be designed to return RECOVERABLE for
>>>> stateless ops that are attached to streams.
>>>> A future PMD may look to see if an op has a stream is attached and write
>>>> out the state there and go into recoverable mode.
>>>> in essence this leaves the choice up to the implementation and allows
>>>> the PMD to take advantage of stateless optimizations
>>>> so long as a "RECOVERABLE" scenario is rarely hit. The PMD will dump
>>>> context as soon as it fully processes an op. It will only
>>>> write context out in cases where the op chokes.
>>>> This futuristic PMD should ignore the FLUSH since this STATELESS mode as
>>>> indicated by the user and optimize
>>> [Shally] IMO, it looks okay to have two separate return code TERMINATED and RECOVERABLE with
>>> definition as you mentioned and seem doable.
>>> So then it mean all following conditions:
>>> a. stateless with flush = full/final, no stream pointer provided , PMD can return TERMINATED i.e. user
>>> has to start all over again, it's a failure (as in current definition)
>>> b. stateless with flush = full/final, stream pointer provided, here it's up to PMD to return either
>>> TERMINATED or RECOVERABLE depending upon its ability (note if Recoverable, then PMD will maintain
>>> states in stream pointer)
>>> c. stateful with flush = full / NO_SYNC, stream pointer always there, PMD will
>>> TERMINATED/RECOVERABLE depending on STATEFUL_COMPRESSION/DECOMPRESSION feature flag
>>> enabled or not
>> [Fiona] I don't think the flush flag is relevant - it could be out of space on any flush flag, and if out of space
>> should ignore the flush flag.
>> Is there a need for TERMINATED? - I didn't think it would ever need to be returned in stateful case.
>>  Why the ref to feature flag? If a PMD doesn't support a feature I think it should fail the op - not with
>>  out-of space, but unsupported or similar. Or it would fail on stream creation.
>[Ahmed] Agreed with Fiona. The flush flag only matters on success. By
>definition the PMD should return OUT_OF_SPACE_RECOVERABLE in stateful
>mode when it runs out of space.
>@Shally If the user did not provide a stream, then the PMD should
>probably return TERMINATED every time. I am not sure we should make a
>"really smart" PMD which returns RECOVERABLE even if no stream pointer
>was given. In that case the PMD must give some ID back to the caller
>that the caller can use to "recover" the op. I am not sure how it would
>be implemented in the PMD and when does the PMD decide to retire streams
>belonging to dead ops that the caller decided not to "recover".
>>
>>> and one more exception case is:
>>> d. stateless with flush = full, no stream pointer provided, PMD can return RECOVERABLE i.e. PMD
>>> internally maintained that state somehow and consumed & produced > 0, so user can start consumed+1
>>> but there's restriction on user not to alter or change op until it is fully processed?!
>> [Fiona] Why the need for this case?
>> There's always a restriction on user not to alter or change op until it is fully processed.
>> If a PMD can do this - why doesn't it create a stream when that API is called - and then it's same as b?
>[Ahmed] Agreed. The user should not touch an op once enqueued until they
>receive it in dequeue. We ignore the flush in stateless mode. We assume
>it to be final every time.

[Shally] Agreed and am not in favour of supporting such implementation either. Just listed out different possibilities up here to better visualise Ahmed requirements/applicability of TERMINATED and RECOVERABLE.

>>
>>> API currently takes care of case a and c, and case b can be supported if specification accept another
>>> proposal which mention optional usage of stream with stateless.
>> [Fiona] API has this, but as we agreed, not optional to call the create_stream() with an op_type
>> parameter (stateful/stateless). PMD can return NULL or provide a stream, if the latter then that
>> stream must be attached to ops.
>>
>>  Until then API takes no difference to
>>> case b and c i.e. we can have op such as,
>>> - type= stateful with flush = full/final, stream pointer provided, PMD can return
>>> TERMINATED/RECOVERABLE according to its ability
>>>
>>> Case d , is something exceptional, if there's requirement in PMDs to support it, then believe it will be
>>> doable with concept of different return code.
>>>
>> [Fiona] That's not quite how I understood it. Can it be simpler and only following cases?
>> a. stateless with flush = full/final, no stream pointer provided , PMD can return TERMINATED i.e. user
>>     has to start all over again, it's a failure (as in current definition).
>>     consumed = 0, produced=amount of data produced. This is usually 0, but in decompression
>>     case a PMD may return > 0 and application may find it useful to inspect that data.
>> b. stateless with flush = full/final, stream pointer provided, here it's up to PMD to return either
>>     TERMINATED or RECOVERABLE depending upon its ability (note if Recoverable, then PMD will maintain
>>     states in stream pointer)
>> c. stateful with flush = any, stream pointer always there, PMD will return RECOVERABLE.
>>     op.produced can be used and next op in stream should continue on from op.consumed+1.
>>     Consumed=0, produced=0 is an unusual but allowed case. I'm not sure if it could ever happen, but
>>     no need to change state to TERMINATED in this case. There may be useful state/history
>>     stored in the PMD, even though no output produced yet.
>[Ahmed] Agreed
[Shally] Sounds good.

>>
>>>>>>>>>>> D.2 Compression API Stateful operation
>>>>>>>>>>> ----------------------------------------------------------
>>>>>>>>>>>  A Stateful operation in DPDK compression means application invokes
>>>>>>>>>> enqueue burst() multiple times to process related chunk of data either
>>>>>>>>>> because
>>>>>>>>>>> - Application broke data into several ops, and/or
>>>>>>>>>>> - PMD ran into out_of_space situation during input processing
>>>>>>>>>>>
>>>>>>>>>>> In case of either one or all of the above conditions, PMD is required to
>>>>>>>>>> maintain state of op across enque_burst() calls and
>>>>>>>>>>> ops are setup with op_type RTE_COMP_OP_STATEFUL, and begin with
>>>>>>>>>> flush value = RTE_COMP_NO/SYNC_FLUSH and end at flush value
>>>>>>>>>> RTE_COMP_FULL/FINAL_FLUSH.
>>>>>>>>>>> D.2.1 Stateful operation state maintenance
>>>>>>>>>>> ---------------------------------------------------------------
>>>>>>>>>>> It is always an ideal expectation from application that it should parse
>>>>>>>>>> through all related chunk of source data making its mbuf-chain and
>>>>>>>> enqueue
>>>>>>>>>> it for stateless processing.
>>>>>>>>>>> However, if it need to break it into several enqueue_burst() calls, then
>>>>>>>> an
>>>>>>>>>> expected call flow would be something like:
>>>>>>>>>>> enqueue_burst( |op.no_flush |)
>>>>>>>>>> [Ahmed] The work is now in flight to the PMD.The user will call dequeue
>>>>>>>>>> burst in a loop until all ops are received. Is this correct?
>>>>>>>>>>
>>>>>>>>>>> deque_burst(op) // should dequeue before we enqueue next
>>>>>>>>> [Shally] Yes. Ideally every submitted op need to be dequeued. However
>>>>>>>> this illustration is specifically in
>>>>>>>>> context of stateful op processing to reflect if a stream is broken into
>>>>>>>> chunks, then each chunk should be
>>>>>>>>> submitted as one op at-a-time with type = STATEFUL and need to be
>>>>>>>> dequeued first before next chunk is
>>>>>>>>> enqueued.
>>>>>>>>>
>>>>>>>>>>> enqueue_burst( |op.no_flush |)
>>>>>>>>>>> deque_burst(op) // should dequeue before we enqueue next
>>>>>>>>>>> enqueue_burst( |op.full_flush |)
>>>>>>>>>> [Ahmed] Why now allow multiple work items in flight? I understand that
>>>>>>>>>> occasionaly there will be OUT_OF_SPACE exception. Can we just
>>>>>>>> distinguish
>>>>>>>>>> the response in exception cases?
>>>>>>>>> [Shally] Multiples ops are allowed in flight, however condition is each op in
>>>>>>>> such case is independent of
>>>>>>>>> each other i.e. belong to different streams altogether.
>>>>>>>>> Earlier (as part of RFC v1 doc) we did consider the proposal to process all
>>>>>>>> related chunks of data in single
>>>>>>>>> burst by passing them as ops array but later found that as not-so-useful for
>>>>>>>> PMD handling for various
>>>>>>>>> reasons. You may please refer to RFC v1 doc review comments for same.
>>>>>>>> [Fiona] Agree with Shally. In summary, as only one op can be processed at a
>>>>>>>> time, since each needs the
>>>>>>>> state of the previous, to allow more than 1 op to be in-flight at a time would
>>>>>>>> force PMDs to implement internal queueing and exception handling for
>>>>>>>> OUT_OF_SPACE conditions you mention.
>>>>>> [Ahmed] But we are putting the ops on qps which would make them
>>>>>> sequential. Handling OUT_OF_SPACE conditions would be a little bit more
>>>>>> complex but doable.
>>>>> [Fiona] In my opinion this is not doable, could be very inefficient.
>>>>> There may be many streams.
>>>>> The PMD would have to have an internal queue per stream so
>>>>> it could adjust the next src offset and length in the OUT_OF_SPACE case.
>>>>> And this may ripple back though all subsequent ops in the stream as each
>>>>> source len is increased and its dst buffer is not big enough.
>>>> [Ahmed] Regarding multi op OUT_OF_SPACE handling.
>>>> The caller would still need to adjust
>>>> the src length/output buffer as you say. The PMD cannot handle
>>>> OUT_OF_SPACE internally.
>>>> After OUT_OF_SPACE occurs, the PMD should reject all ops in this stream
>>>> until it gets explicit
>>>> confirmation from the caller to continue working on this stream. Any ops
>>>> received by
>>>> the PMD should be returned to the caller with status STREAM_PAUSED since
>>>> the caller did not
>>>> explicitly acknowledge that it has solved the OUT_OF_SPACE issue.
>>>> These semantics can be enabled by adding a new function to the API
>>>> perhaps stream_resume().
>>>> This allows the caller to indicate that it acknowledges that it has seen
>>>> the issue and this op
>>>> should be used to resolve the issue. Implementations that do not support
>>>> this mode of use
>>>> can push back immediately after one op is in flight. Implementations
>>>> that support this use
>>>> mode can allow many ops from the same session
>>>>
>>> [Shally] Is it still in context of having single burst where all op belongs to one stream? If yes, I would still
>>> say it would add an overhead to PMDs especially if it is expected to work closer to HW (which I think is
>>> the case with DPDK PMD).
>>> Though your approach is doable but why this all cannot be in a layer above PMD? i.e. a layer above PMD
>>> can either pass one-op at a time with burst size = 1 OR can make chained mbuf of input and output and
>>> pass than as one op.
>>> Is it just to ease applications of chained mbuf burden or do you see any performance /use-case
>>> impacting aspect also?
>>>
>>> if it is in context where each op belong to different stream in a burst, then why do we need
>>> stream_pause and resume? It is a expectations from app to pass more output buffer with consumed + 1
>>> from next call onwards as it has already
>>> seen OUT_OF_SPACE.
>[Ahmed] Yes, this would add extra overhead to the PMD. Our PMD
>implementation rejects all ops that belong to a stream that has entered
>"RECOVERABLE" state for one reason or another. The caller must
>acknowledge explicitly that it has received news of the problem before
>the PMD allows this stream to exit "RECOVERABLE" state. I agree with you
>that implementing this functionality in the software layer above the PMD
>is a bad idea since the latency reductions are lost.

[Shally] Just reiterating, I rather meant other way around i.e. I see it easier to put all such complexity in a layer above PMD.

>This setup is useful in latency sensitive applications where the latency
>of buffering multiple ops into one op is significant. We found latency
>makes a significant difference in search applications where the PMD
>competes with software decompression.
>> [Fiona] I still have concerns with this and would not want to support in our PMD.
>> TO make sure I understand, you want to send a burst of ops, with several from same stream.
>> If one causes OUT_OF_SPACE_RECOVERABLE, then the PMD should not process any
>> subsequent ops in that stream.
>> Should it return them in a dequeue_burst() with status still NOT_PROCESSED?
>> Or somehow drop them? How?
>> While still processing ops form other streams.
>[Ahmed] This is exactly correct. It should return them with
>NOT_PROCESSED. Yes, the PMD should continue processing other streams.
>> As we want to offload each op to hardware with as little CPU processing as possible we
>> would not want to open up each op to see which stream it's attached to and
>> make decisions to do per-stream storage, or drop it, or bypass hw and dequeue without processing.
>[Ahmed] I think I might have missed your point here, but I will try to
>answer. There is no need to "cushion" ops in DPDK. DPDK should send ops
>to the PMD and the PMD should reject until stream_continue() is called.
>The next op to be sent by the user will have a special marker in it to
>inform the PMD to continue working on this stream. Alternatively the
>DPDK layer can be made "smarter" to fail during the enqueue by checking
>the stream and its state, but like you say this adds additional CPU
>overhead during the enqueue.
>I am curious. In a simple synchronous use case. How do we prevent users
>from putting multiple ops in flight that belong to a single stream? Do
>we just currently say it is undefined behavior? Otherwise we would have
>to check the stream and incur the CPU overhead.
>>
>> Maybe we could add a capability if this behaviour is important for you?
>> e.g. ALLOW_ENQUEUE_MULTIPLE_STATEFUL_OPS ?
>> Our PMD would set this to 0. And expect no more than one op from a stateful stream
>> to be in flight at any time.
>[Ahmed] That makes sense. This way the different DPDK implementations do
>not have to add extra checking for unsupported cases.

[Shally] @ahmed, If I summarise your use-case, this is how to want to PMD to support?
- a burst *carry only one stream* and all ops then assumed to be belong to that stream? (please note, here burst is not carrying more than one stream)
-PMD will submit one op at a time to HW? 
-if processed successfully, push it back to completion queue with status = SUCCESS. If failed or run to into OUT_OF_SPACE, then push it to completion queue with status = FAILURE/ OUT_OF_SPACE_RECOVERABLE and rest with status = NOT_PROCESSED and return with enqueue count = total # of ops submitted originally with burst?
-app assumes all have been enqueued, so it go and dequeue all ops
-on seeing an op with OUT_OF_SPACE_RECOVERABLE, app resubmit a burst of ops with call to stream_continue/resume API starting from op which encountered OUT_OF_SPACE and others as NOT_PROCESSED with updated input and output buffer?
-repeat until *all* are dequeued with status = SUCCESS or *any* with status = FAILURE? If anytime failure is seen, then app start whole processing all over again or just drop this burst?!

If all of above is true, then I think we should add another API such as rte_comp_enque_single_stream() which will be functional under Feature Flag = ALLOW_ENQUEUE_MULTIPLE_STATEFUL_OPS or better name is SUPPORT_ENQUEUE_SINGLE_STREAM?!


>>
>>
>>>> Regarding the ordering of ops
>>>> We do force serialization of ops belonging to a stream in STATEFUL
>>>> operation. Related ops do
>>>> not go out of order and are given to available PMDs one at a time.
>>>>
>>>>>> The question is this mode of use useful for real
>>>>>> life applications or would we be just adding complexity? The technical
>>>>>> advantage of this is that processing of Stateful ops is interdependent
>>>>>> and PMDs can take advantage of caching and other optimizations to make
>>>>>> processing related ops much faster than switching on every op. PMDs have
>>>>>> maintain state of more than 32 KB for DEFLATE for every stream.
>>>>>>>> If the application has all the data, it can put it into chained mbufs in a single
>>>>>>>> op rather than
>>>>>>>> multiple ops, which avoids pushing all that complexity down to the PMDs.
>>>>>> [Ahmed] I think that your suggested scheme of putting all related mbufs
>>>>>> into one op may be the best solution without the extra complexity of
>>>>>> handling OUT_OF_SPACE cases, while still allowing the enqueuer extra
>>>>>> time If we have a way of marking mbufs as ready for consumption. The
>>>>>> enqueuer may not have all the data at hand but can enqueue the op with a
>>>>>> couple of empty mbus marked as not ready for consumption. The enqueuer
>>>>>> will then update the rest of the mbufs to ready for consumption once the
>>>>>> data is added. This introduces a race condition. A second flag for each
>>>>>> mbuf can be updated by the PMD to indicate that it processed it or not.
>>>>>> This way in cases where the PMD beat the application to the op, the
>>>>>> application will just update the op to point to the first unprocessed
>>>>>> mbuf and resend it to the PMD.
>>>>> [Fiona] This doesn't sound safe. You want to add data to a stream after you've
>>>>> enqueued the op. You would have to write to op.src.length at a time when the PMD
>>>>> might be reading it. Sounds like a lock would be necessary.
>>>>> Once the op has been enqueued, my understanding is its ownership is handed
>>>>> over to the PMD and the application should not touch it until it has been dequeued.
>>>>> I don't think it's a good idea to change this model.
>>>>> Can't the application just collect a stream of data in chained mbufs until it has
>>>>> enough to send an op, then construct the op and while waiting for that op to
>>>>> complete, accumulate the next batch of chained mbufs? Only construct the next op
>>>>> after the previous one is complete, based on the result of the previous one.
>>>>>
>>>> [Ahmed] Fair enough. I agree with you. I imagined it in a different way
>>>> in which each mbuf would have its own length.
>>>> The advantage to gain is in applications where there is one PMD user,
>>>> the down time between ops can be significant and setting up a single
>>>> producer consumer pair significantly reduces the CPU cycles and PMD down
>>>> time.
>>>>
>>>> ////snip////
>



More information about the dev mailing list