[PATCH v2] app/testpmd: use Tx preparation in txonly engine

Konstantin Ananyev konstantin.ananyev at huawei.com
Mon Feb 26 14:26:12 CET 2024



> >>>>>>> TSO breaks when MSS spans more than 8 data fragments. Those
> >>>>>>> packets will be dropped by Tx preparation API, but it will
> >>> cause
> >>>>>>> MDD event if txonly forwarding engine does not call the Tx
> >>>>> preparation
> >>>>>>> API before transmitting packets.
> >>>>>>>
> >>>>>>
> >>>>>> txonly is used commonly, adding Tx prepare for a specific case
> >>> may
> >>>>>> impact performance for users.
> >>>>>>
> >>>>>> What happens when driver throws MDD (Malicious Driver Detection)
> >>>>> event,
> >>>>>> can't it be ignored? As you are already OK to drop the packet,
> >>> can
> >>>>>> device be configured to drop these packages?
> >>>>>>
> >>>>>>
> >>>>>> Or as Jerin suggested adding a new forwarding engine is a
> >>> solution,
> >>>>> but
> >>>>>> that will create code duplication, I prefer to not have it if
> >>> this
> >>>>> can
> >>>>>> be handled in device level.
> >>>>>
> >>>>> Actually I am agree with the author of the patch - when TX offloads
> >>>>> and/or multisegs are enabled,
> >>>>> user supposed to invoke eth_tx_prepare().
> >>>>> Not doing that seems like a bug to me.
> >>>>
> >>>> I strongly disagree with that statement, Konstantin!
> >>>> It is not documented anywhere that using TX offloads and/or multisegs
> >>> requires calling rte_eth_tx_prepare() before
> >>>> rte_eth_tx_burst(). And none of the examples do it.
> >>>
> >>> In fact, we do use it for test-pmd/csumonly.c.
> >>> About other sample apps:
> >>> AFAIK, not many of other DPDK apps do use L4 offloads.
> >>> Right now special treatment (pseudo-header cksum calculation) is needed
> >>> only for L4 offloads (CKSUM, SEG).
> >>> So, majority of our apps who rely on other TX offloads (multi-seg, ipv4
> >>> cksum, vlan insertion) happily run without
> >>> calling tx_prepare(), even though it is not the safest way.
> >>>
> >>>>
> >>>> In my opinion:
> >>>> If some driver has limitations for a feature, e.g. max 8 fragments,
> >>> it should be documented for that driver, so the application
> >>>> developer can make the appropriate decisions when designing the
> >>> application.
> >>>> Furthermore, we have APIs for the drivers to expose to the
> >>> applications what the driver supports, so the application can configure
> >>>> itself optimally at startup. Perhaps those APIs need to be expanded.
> >>>> And if a feature limitation is common across the majority of drivers,
> >>> that limitation should be mentioned in the documentation of the
> >>>> feature itself.
> >>>
> >>> Many of such limitations *are* documented and in fact we do have an API
> >>> to check max segments that each driver support,
> >>> see struct rte_eth_desc_lim.
> >>
> >> Yes, this is the kind of API we should provide, so the application can configure itself appropriately.
> >>
> >>> The problem is:
> >>> - none of our sample app does proper check on these values, so users
> >>> don't have a good example how to do it.
> >>
> >> Agreed.
> >> Adding an example showing how to do it properly would be the best solution.
> >> Calling tx_prepare() in the examples is certainly not the solution.
> >>
> >>> - with current DPDK API not all of HW/PMD requirements could be
> >>> extracted programmatically:
> >>>    let say majority of Intel PMDs for TCP offloads expect pseudo-header
> >>> cksum to be pre-calculated by the SW.
> >>
> >> I hope this requirement is documented somewhere.
> >>
> >>>    another example, some HW expects pkt_len to be bigger then some
> >>> threshold value, otherwise HW hang may appear.
> >>
> >> I hope this requirement is also documented somewhere.
> >
> > No idea, I found it only in the code.
> 
> IMHO Tx burst must check such limitations. If you made your HW simpler
> (or just lost it on initial testing), pay in your drivers (or your
> HW+driver will be unusable because of such problems).
> 
> >> Generally, if the requirements cannot be extracted programmatically, they must be prominently documented, like this note to
> >> rte_eth_rx_burst():
> >
> > Obviously, more detailed documentation is always good, but...
> > Right now we have 50+ different PMDs from different vendors.
> > Even if each and every of them will carefully document all possible limitations and necessary preparation steps,
> > how DPDK app developer supposed to  deal with all that?
> > Do you expect everyone, to read carefully through all of them, and handle all of them properly oh his own
> > in each and every DPDK app he is going to write?
> > That seems unrealistic.
> > Again what to do with backward compatibility: when new driver (with new limitations) will arise
> > *after* your app is already written and tested?
> 
> +1
> 
> >>
> >>   * @note
> >>   *   Some drivers using vector instructions require that *nb_pkts* is
> >>   *   divisible by 4 or 8, depending on the driver implementation.
> 
> I'm wondering what application should do if it needs to send just one
> packet and do it now. IMHO, such limitations are not acceptable.
> 
> >>
> >>> - As new HW and PMD keep appearing it is hard to predict what extra
> >>> limitations/requirements will arise,
> >>>    that's why tx_prepare() was introduced as s driver op.
> >>>
> >>>>
> >>>> We don't want to check in the fast path what can be checked at
> >>> startup or build time!
> >>>
> >>> If your app supposed to work with just a few, known in advance, NIC
> >>> models, then sure, you can do that.
> >>> For apps that supposed to work 'in general'  with any possible PMDs
> >>> that DPDK supports - that might be a problem.
> >>> That's why tx_prepare() was introduced and it is strongly recommended
> >>> to use it by the apps that do use TX offloads.
> >>> Probably tx_prepare() is not the best possible approach, but right now
> >>> there are not many alternatives within DPDK.
> >>
> >> What exactly is an application supposed to do if tx_prepare() doesn't accept the full burst? It doesn't return information about
> what is
> >> wrong.
> >
> > It provides some information, but it is *very* limited: just index of 'bad' packet and error code.
> > In theory app can try to handle it in a 'smart' way: let say if ENOTSUP is returned, then try to disable all HW offloads
> > and do all in SW. But again, it is much better to do so *before* submitting packets for TX, so in practice everyone
> > just drop such 'bad' packets.
> >
> >   Dropping the packets might not be an option, e.g. for applications used in life support or tele-medicine.
> 
> Critical applications should be able to do all Tx offloads in SW and
> retry. Of course, various statistics and logs should help to improve the
> application.
> 
> > If the packet is 'bad', then it is much better to drop it, then TX corrupted packet or even hang NIC HW completely.
> 
> IMHO Tx burst must drop packet which could hang NIC HW completely. I
> realize that it is an extra checks and performance drop, but vendor
> should pay in performance if HW is not good enough.
> 
> > Though off-course it is much better to have an app that would check for limitations that can be checked by API provided
> > and enable only supported offloads.
> 
> Yes, that's why API to get limitation is much better than documentation.
> 
> >> If limitations are documented, an application can use the lowest common denominator of the NICs it supports. And if the
> application is
> >> supposed to work in general, that becomes the lowest common denominator of all NICs.
> >
> > I agree: for limitations that can be extracted with generic API, like:
> > number of segments per packet, supported TX offloads, mbuf fileds that must be provided for each TX offload,  etc. -
> > it is responsibility of well-written application to obey all of them.
> > Yes, many tx_prepare() implementations do  such checks anyway, but from my perspective it is sort of last-line of defense.
> > For well written application that should just never happen.
> > But there is one more important responsibility of tx_prepare() -  it performs PMD specific packet modifications for requested
> offloads.
> > As I already mentioned for Intel NICs - it does pseudo-header cksum calcucation, for packets with size less then minimal, it can
> > probably do padding (even if doesn't do it right now), for some other PMDs - might be something else, I didn't check.
> > Obviously it saves app developer from a burden to do all these things on his own.
> >
> >> It looks like tx_prepare() has become a horrible workaround for undocumented limitations.
> 
> I strongly disagree. Documentation is never a solution for a generic
> application which is intended to work on any HW and IMHO it is the
> goal to have more and more applications which work on any HW.
> 
> >> Limitations due to hardware and/or software tradeoffs are unavoidable, so we have to live with them; but we should not accept
> >> PMDs with undocumented limitations.
> >
> > As I already said, more documentation never hurts, but for that case, I think it is not enough.
> > I expect PMD to provide a tx_prepare() implementation that would deal with such specific things.
> 
> +1
> > Anyway, back to the original patch - I looked at it once again, and realized that the problem
> > is just in the unsupported number of segments.
> > As we discussed above - such limitations should be handled by well written app,
> > but none of DPDK apps does it right now.
> > So probably it is good opportunity to do things in a proper way and introduce such checks
> > in testpmd ;)
> 
> +1 since we already have fields in device information to report such
> limitations, but it does not say that Tx prepare should be dropped.
> Drivers which don't need Tx prepare keep it NULL and it returns
> immediately from ethdev. Since it is done per-burst, it should not
> affect performance a lot.
> 

100% agree. I think tx_prepare needs to stay, and in general has to be strongly recommended
for apps that do use TX offloads/multi-segs. 


More information about the stable mailing list