[dpdk-dev] [PATCH 1/2] ethdev: add buffered tx api

Ananyev, Konstantin konstantin.ananyev at intel.com
Fri Feb 12 12:44:42 CET 2016


> 
> > -----Original Message-----
> > From: Kulasek, TomaszX
> > Sent: Tuesday, February 09, 2016 5:03 PM
> > To: Ananyev, Konstantin; dev at dpdk.org
> > Subject: RE: [dpdk-dev] [PATCH 1/2] ethdev: add buffered tx api
> >
> >
> >
> > > -----Original Message-----
> > > From: Ananyev, Konstantin
> > > Sent: Tuesday, February 2, 2016 14:50
> > > To: Kulasek, TomaszX <tomaszx.kulasek at intel.com>; dev at dpdk.org
> > > Subject: RE: [dpdk-dev] [PATCH 1/2] ethdev: add buffered tx api
> > >
> > > Hi Tomasz,
> > >
> > > > -----Original Message-----
> > > > From: Kulasek, TomaszX
> > > > Sent: Tuesday, February 02, 2016 10:01 AM
> > > > To: Ananyev, Konstantin; dev at dpdk.org
> > > > Subject: RE: [dpdk-dev] [PATCH 1/2] ethdev: add buffered tx api
> > > >
> > > > Hi Konstantin,
> > > >
> > > > > -----Original Message-----
> > > > > From: Ananyev, Konstantin
> > > > > Sent: Friday, January 15, 2016 19:45
> > > > > To: Kulasek, TomaszX; dev at dpdk.org
> > > > > Subject: RE: [dpdk-dev] [PATCH 1/2] ethdev: add buffered tx api
> > > > >
> > > > > Hi Tomasz,
> > > > >
> > > > > >
> > > > > > +		/* get new buffer space first, but keep old space around
> > > */
> > > > > > +		new_bufs = rte_zmalloc("ethdev->txq_bufs",
> > > > > > +				sizeof(*dev->data->txq_bufs) * nb_queues, 0);
> > > > > > +		if (new_bufs == NULL)
> > > > > > +			return -(ENOMEM);
> > > > > > +
> > > > >
> > > > > > Why not allocate space for txq_bufs together with tx_queues (as
> > > > > > one chunk for both)?
> > > > > > As I understand it, there is always a one-to-one mapping between them anyway.
> > > > > > That would simplify things a bit.
> > > > > > Or even introduce a new struct to group all related tx queue
> > > > > > info together:
> > > > > > struct rte_eth_txq_data {
> > > > > > 	void *queue; /* actual pmd queue */
> > > > > > 	struct rte_eth_dev_tx_buffer buf;
> > > > > > 	uint8_t state;
> > > > > > };
> > > > > And use it inside struct rte_eth_dev_data?
> > > > > Would probably give a better data locality.
> > > > >
> > > >
> > > > > Introducing such a struct will require a huge rework of pmd drivers. I don't think it's worth it for just this one feature.
> > >
> > > Why not?
> > > > Things are getting more and more messy here: we already have a separate array
> > > > of pointers to queues and a separate array of queue states, and now you are going to add
> > > > a separate array of tx buffers.
> > > > To me it seems logical to unite all these 3 fields into one sub-struct.
> > >
> >
> > > I agree with you, and such work would probably be nice for rx queues too, but these two changes impact different parts of dpdk.
> > > The buffered tx API is more of a helper for client applications.
> > >
> > > To me these two things are different features and should be done separately because:
> > > 1) They are independent and can be done separately,
> > > 2) They can (and should) be reviewed, tested and approved separately,
> > > 3) They are addressed to different audiences (tx buffering to application developers, rte_eth_dev_data to pmd developers), so
> > > different people may be interested in having (or not having) one feature or the other.
> 
> Such division seems a bit artificial to me :)
> > You are making changes in rte_ethdev.[c,h] - I think that field regrouping would make the code cleaner and easier to read/maintain.
> 
> >
> > > Even for bug tracking it will be cleaner to separate these two things. And yes, it is logical to unite them, maybe also for rx queues, but
> > > that should be discussed separately.
> > >
> > > I've made a prototype with this rework, and the impact on code not related to this particular feature is too wide and strong to join
> > > them. I would rather provide it as an independent patch for separate discussion, if needed.
> 
> > Sure, a separate patch is fine.
> > Why not submit it as an extra one in the series?
> 
> 
> >
> > > >
> > > >
> > > > > > +/**
> > > > > > + * @internal
> > > > > > + * Structure used to buffer packets for future TX
> > > > > > + * Used by APIs rte_eth_tx_buffer and rte_eth_tx_buffer_flush  */
> > > > > > +struct rte_eth_dev_tx_buffer {
> > > > > > +	struct rte_mbuf *pkts[RTE_ETHDEV_TX_BUFSIZE];
> > > > >
> > > > > I think it is better to make size of pkts[] configurable at runtime.
> > > > > There are a lot of different usage scenarios - hard to predict what
> > > > > would be an optimal buffer size for all cases.
> > > > >
> > > >
> > > > > This buffer is allocated in eth_dev shared memory, so there are two scenarios:
> > > > > 1) We have a preallocated buffer with maximal size, and then we can set
> > > > > the threshold level without restarting the device, or
> > > > > 2) We need to set its size before starting the device.
> > >
> > > >
> > > > Second one is better, I think.
> > >
> > > Yep, I was thinking about 2) too.
> > > Might be an extra parameter in struct rte_eth_txconf.
> > >
> >
> > Struct rte_eth_txconf is passed to ethdev after rte_eth_dev_tx_queue_config, so we don't know its value when buffers are
> > allocated.
> 
> > Ok, and why can't the tx buffer be allocated in rte_eth_tx_queue_setup()?
> 
> > Actually, I just thought: why not let rte_eth_tx_buffer() accept struct rte_eth_dev_tx_buffer * as a parameter:
> > +static inline int __attribute__((always_inline))
> > +rte_eth_tx_buffer(uint8_t port_id, uint16_t queue_id, struct rte_eth_dev_tx_buffer *txb, struct rte_mbuf *tx_pkt)
> ?
> 
> > In that case we don't need to make any changes in rte_ethdev.[h,c] to alloc/free/maintain a tx_buffer inside each queue...
> > It would all be the upper layer's responsibility.
> > So no need to modify existing rte_ethdev structures/code.
> > Again, no need for an error callback - the caller would check the return value and decide what to do with unsent packets in the tx_buffer.
> 

Just to summarise why I think it is better to have tx buffering managed at the app level:

1. Avoids any ABI change.
2. Avoids extra changes in rte_ethdev.c: tx_queue_setup/tx_queue_stop.
3. Provides much more flexibility to the user:
   a) where to allocate space for tx_buffer (stack, heap, hugepages, etc).
   b) user can mix and match plain tx_burst() and tx_buffer/tx_buffer_flush()
        in any way he feels is appropriate.
   c) user can change the size of tx_buffer without stop/re-config/start of the queue:
        just allocate a new larger (or smaller) tx_buffer & copy the contents to the new one.
   d) user can preserve buffered packets through a device restart cycle:
        i.e. if, let's say, a TX hang happened and the user has to do dev_stop/dev_start -
        the contents of tx_buffer will stay unchanged and could be
        (re-)transmitted after the device is up again, or through a different port/queue if needed.
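
To make point 3c) a bit more concrete, here is a rough sketch of what resizing could look like when the buffer is owned by the application. The names app_tx_buffer and app_tx_buffer_resize() and the 'size' field are made up for illustration and are not part of the patch; plain calloc() is used only so the sketch builds without DPDK (an application could just as well take the memory from hugepages):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct rte_mbuf;	/* opaque here; the real definition lives in DPDK */

/* Hypothetical app-owned buffer, only for this sketch. */
struct app_tx_buffer {
	unsigned nb_pkts;		/* packets currently buffered */
	unsigned size;			/* capacity of pkts[] (assumed field) */
	struct rte_mbuf *pkts[];
};

/*
 * Allocate a buffer of new_size entries and move the buffered packets
 * from 'old' (may be NULL) into it. No queue stop/start is involved.
 * Returns NULL and leaves 'old' untouched if new_size is 0, if the
 * buffered packets would not fit, or if allocation fails.
 */
static struct app_tx_buffer *
app_tx_buffer_resize(struct app_tx_buffer *old, unsigned new_size)
{
	struct app_tx_buffer *nb;
	unsigned keep = (old != NULL) ? old->nb_pkts : 0;

	if (new_size == 0 || keep > new_size)
		return NULL;	/* caller should flush first */

	nb = calloc(1, sizeof(*nb) + new_size * sizeof(nb->pkts[0]));
	if (nb == NULL)
		return NULL;

	nb->size = new_size;
	nb->nb_pkts = keep;
	if (keep > 0)
		memcpy(nb->pkts, old->pkts, keep * sizeof(nb->pkts[0]));

	free(old);
	return nb;
}
```

Shrinking below the number of already buffered packets is refused here, so the caller would flush first and then retry.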
 
As a drawback, as was mentioned, tx error handling becomes less transparent...
But we can add an error handling routine and its user-provided parameter
into struct rte_eth_dev_tx_buffer, something like this:

+struct rte_eth_dev_tx_buffer {
+	buffer_tx_error_fn cbfn;
+	void *userdata;
+	unsigned nb_pkts;
+	uint64_t errors;
+	/**< Total number of queued packets that were dropped instead of being sent. */
+	struct rte_mbuf *pkts[];
+};
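
A rough sketch of how buffer/flush could work with such a struct follows. Everything in it is an assumption for illustration: the callback signature, the 'size' field, the tx_buffer_* names, and the tx_burst_fn pointer, which stands in for the real rte_eth_tx_burst() call so that the sketch builds without DPDK. The two stub functions at the end exist only to exercise it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct rte_mbuf;	/* opaque here; the real definition lives in DPDK */

/* Assumed callback signature: gets the packets that were not sent. */
typedef void (*buffer_tx_error_fn)(struct rte_mbuf **unsent, uint16_t count,
		void *userdata);

/* Stand-in for the PMD burst call, so no NIC is needed. */
typedef uint16_t (*tx_burst_fn)(struct rte_mbuf **pkts, uint16_t n);

struct tx_buffer {
	buffer_tx_error_fn cbfn;
	void *userdata;
	unsigned nb_pkts;	/* packets currently buffered */
	unsigned size;		/* capacity of pkts[] (assumed field) */
	uint64_t errors;	/* packets the burst call did not accept */
	struct rte_mbuf *pkts[];
};

static struct tx_buffer *
tx_buffer_alloc(unsigned size)
{
	struct tx_buffer *b = calloc(1, sizeof(*b) + size * sizeof(b->pkts[0]));

	if (b != NULL)
		b->size = size;
	return b;
}

/* Send everything buffered; hand the unsent tail to the callback. */
static uint16_t
tx_buffer_flush(struct tx_buffer *b, tx_burst_fn burst)
{
	uint16_t to_send = (uint16_t)b->nb_pkts;
	uint16_t sent;

	if (to_send == 0)
		return 0;

	sent = burst(b->pkts, to_send);
	b->nb_pkts = 0;
	if (sent < to_send) {
		b->errors += (uint64_t)(to_send - sent);
		if (b->cbfn != NULL)
			b->cbfn(&b->pkts[sent], (uint16_t)(to_send - sent),
				b->userdata);
	}
	return sent;
}

/* Queue one packet; flush automatically once the buffer is full. */
static uint16_t
tx_buffer_add(struct tx_buffer *b, struct rte_mbuf *pkt, tx_burst_fn burst)
{
	assert(b != NULL && b->nb_pkts < b->size);

	b->pkts[b->nb_pkts++] = pkt;
	if (b->nb_pkts < b->size)
		return 0;
	return tx_buffer_flush(b, burst);
}

/* Demo stub: a "NIC" that accepts at most 2 packets per burst. */
static uint16_t
stub_burst_max2(struct rte_mbuf **pkts, uint16_t n)
{
	(void)pkts;
	return (uint16_t)(n < 2 ? n : 2);
}

/* Demo callback: just counts unsent packets in *userdata. */
static void
count_unsent(struct rte_mbuf **unsent, uint16_t count, void *userdata)
{
	(void)unsent;
	*(unsigned *)userdata += count;
}
```

With a buffer of 4 entries and the stub above, the fourth tx_buffer_add() triggers a flush that sends 2 packets and passes the other 2 to the callback; a real callback would free or re-queue them as the application sees fit.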

Konstantin


