[dpdk-dev] [RFC] [PATCH] eventdev: abstract ethdev HW capability to inject packets to eventdev

Eads, Gage gage.eads at intel.com
Sat Apr 22 00:31:52 CEST 2017


Hi Jerin,

Thanks for getting this ball rolling, and I agree that we need a solution that covers the three cases you described. We've also been thinking about an environment where devices (NIC Rx or even Tx, crypto, or a timer "device" that uses librte_timer to inject events) can plug into an eventdev -- whether through a direct connection to the event scheduler (case #3) or using software to bridge the gap -- such that application software can have a consistent view of device interfacing across platforms.

Some initial thoughts on your proposal:

1. I imagine that deploying these service functions at the granularity of a full core can be excessive on devices with few (<= 8) cores. For example, if the crypto traffic rate is low, a cryptodev service function could be co-scheduled with other service functions and/or application work (see the first sketch after this list). I think we'll need a more flexible deployment of these service functions.

2. Knowing which device type a service function serves would be useful -- without that information, it's not possible to assign the function to the NUMA node on which the device is located (see the second sketch after this list).

3. Placing the service core logic in the PMDs is nice in terms of application ease-of-use, but it forces the PMD to provide one-size-fits-all service core functions, where, for example, the application's control of the NIC Rx functionality is limited to the options that struct rte_event_queue_producer_conf exports. An application may want customized service core behavior, such as prioritized polling of Rx queues, using Rx queue interrupts for low-traffic-rate queues, or (for "closed system" eventdevs) control over whether/when a service core drops events (and a way to notify applications of event drops). For such cases, I think the appropriate solution is to allow applications to plug in their own service core functions when hardware support isn't present; the third sketch after this list illustrates the idea.
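
To illustrate point 1, here's a rough sketch of co-scheduling two service functions on one lcore. crypto_service, nic_rx_service, and quit_signal are made-up names for illustration only, not a proposed API:

#include <stdbool.h>

static volatile bool quit_signal;

/* Made-up stand-ins for PMD- or framework-provided service callbacks. */
static int crypto_service(void *arg) { (void)arg; return 0; }
static int nic_rx_service(void *arg) { (void)arg; return 0; }

/* One lcore interleaves a low-rate cryptodev service with NIC Rx event
 * injection, instead of dedicating a full core to each. */
static int
combined_service_loop(void *arg)
{
	(void)arg;
	while (!quit_signal) {
		crypto_service(NULL);
		nic_rx_service(NULL);
	}
	return 0;
}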
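
For point 2, a sketch of NUMA-aware placement, assuming the framework can report the ethdev port behind a callback. launch_near_device is an illustrative helper, not an existing API:

#include <rte_eal.h>
#include <rte_ethdev.h>
#include <rte_launch.h>
#include <rte_lcore.h>

/* Launch fn on a slave lcore that shares a NUMA node with the ethdev. */
static int
launch_near_device(lcore_function_t *fn, uint8_t port_id)
{
	int socket = rte_eth_dev_socket_id(port_id);
	unsigned int lcore_id;

	if (socket < 0)
		socket = 0; /* SOCKET_ID_ANY: fall back to socket 0 */

	RTE_LCORE_FOREACH_SLAVE(lcore_id) {
		if (rte_lcore_to_socket_id(lcore_id) == (unsigned int)socket)
			return rte_eal_remote_launch(fn, NULL, lcore_id);
	}
	return -1; /* no slave lcore on the device's socket */
}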
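
And for point 3, a sketch of an application-supplied Rx service function with prioritized polling. enqueue_as_new_events and note_drops are made-up application helpers, and quit_signal is as in the first sketch:

#include <stdbool.h>
#include <rte_ethdev.h>
#include <rte_mbuf.h>

#define BURST 32

/* Made-up application helpers: enqueue mbufs as RTE_EVENT_OP_NEW events,
 * returning how many were accepted, and record any drops. */
extern uint16_t enqueue_as_new_events(struct rte_mbuf **bufs, uint16_t n);
extern void note_drops(uint16_t n);
extern volatile bool quit_signal;

static int
custom_rx_service(void *arg)
{
	uint8_t port = *(uint8_t *)arg;
	struct rte_mbuf *bufs[BURST];
	unsigned int iter = 0;

	while (!quit_signal) {
		/* Poll the high-priority Rx queue 0 every iteration... */
		uint16_t nb = rte_eth_rx_burst(port, 0, bufs, BURST);

		/* ...and the low-rate Rx queue 1 only every 8th iteration. */
		if ((iter++ & 7) == 0)
			nb += rte_eth_rx_burst(port, 1, &bufs[nb],
					       BURST - nb);

		/* The application decides whether/when events are dropped
		 * and is notified of the drops. */
		note_drops(nb - enqueue_as_new_events(bufs, nb));
	}
	return 0;
}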

Some of these thoughts are reflected in the eventdev_pipeline app[1] that Harry submitted earlier today, such as flexible service function deployment. In that app, the user supplies a device coremask that can pin a service function to a core, multiplex multiple functions on a core, or even affinitize the service function to multiple cores (using cmpset-based exclusion to ensure it's executed by one lcore at a time; see the sketch below). In thinking about this, Narender and I have envisioned something like a framework for eventdev applications in which these service functions can be registered and (in a similar manner to eventdev_pipeline's service functions) executed.
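
For reference, that exclusion scheme looks roughly like this. svc_in_use and try_run_service are illustrative names; eventdev_pipeline's actual code differs:

#include <rte_atomic.h>
#include <rte_launch.h>

static volatile uint32_t svc_in_use;

/* Several lcores may be affinitized to one service function, but only the
 * lcore that wins the 0 -> 1 transition executes it; the others skip it. */
static void
try_run_service(lcore_function_t *fn)
{
	if (rte_atomic32_cmpset(&svc_in_use, 0, 1)) {
		fn(NULL);
		svc_in_use = 0;
	}
}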

Thanks,
Gage

[1] http://dpdk.org/ml/archives/dev/2017-April/064511.html

>  -----Original Message-----
>  From: Jerin Jacob [mailto:jerin.jacob at caviumnetworks.com]
>  Sent: Tuesday, April 18, 2017 8:23 AM
>  To: dev at dpdk.org
>  Cc: Richardson, Bruce <bruce.richardson at intel.com>; Van Haaren, Harry
>  <harry.van.haaren at intel.com>; hemant.agrawal at nxp.com; Eads, Gage
>  <gage.eads at intel.com>; nipun.gupta at nxp.com; Jerin Jacob
>  <jerin.jacob at caviumnetworks.com>
>  Subject: [RFC] [dpdk-dev] [PATCH] eventdev: abstract ethdev HW capability to
>  inject packets to eventdev
>  
>  Some ethdev hardware is capable of injecting events (Ethernet packets) into an
>  eventdev without the need for dedicated service cores on the Rx path.
>  Since the eventdev API is device-capability agnostic, we need to address three
>  combinations of ethdev and eventdev PMD drivers.
>  
>  1) Ethdev HW is not capable of injecting the packets, paired with a SW eventdev
>  driver (all existing ethdev PMDs + the drivers/event/sw PMD combination)
>  2) Ethdev HW is not capable of injecting the packets, paired with an
>  incompatible HW eventdev driver (all existing ethdev PMDs + the
>  drivers/event/octeontx PMD combination)
>  3) Ethdev HW is capable of injecting the packets to a compatible HW eventdev
>  driver.
>  
>  This RFC attempts to abstract away this capability disparity and give the
>  application a unified way to get the functionality.
>  
>  Detailed comments are added in the header file.
>  
>  Example API usage:
>  
>  - rte_eth_dev_configure(port,..);
>  - rte_eth_rx_queue_setup(port,..);
>  - rte_eth_dev_start(port,..);
>  
>  - rte_event_dev_configure(dev_id,..);
>  
>  struct rte_event_queue_producer_conf ethdev_conf = {
>  	.event_type = RTE_EVENT_TYPE_ETHDEV,
>  	.sched_type = RTE_SCHED_TYPE_ATOMIC,
>  	.priority = RTE_EVENT_DEV_PRIORITY_LOWEST,
>  	.ethdev.ethdev_port = port,
>  	.ethdev.rx_queue_id = -1,
>  };
>  
>  struct rte_event_queue_conf conf = {
>  	.nb_producers = 1,
>  	.producers = &ethdev_conf,
>  };
>  
>  - rte_event_queue_setup(dev_id, &conf,..);
>  - rte_event_port_setup(dev_id,..);
>  
>  lcore_function_t *fns[RTE_MAX_LCORE];
>  nb_service_cores = rte_event_dev_start(dev_id, fns);
>  nscores = 0;
>  
>  RTE_LCORE_FOREACH_SLAVE(lcore_id) {
>  	if (nscores < nb_service_cores) {
>  		rte_eal_remote_launch(fns[nscores], NULL, lcore_id);
>  		nscores++;
>  	} else {
>  		rte_eal_remote_launch(normal_workers, NULL, lcore_id);
>  	}
>  }
>  
>  Another possible option is to move the lcore launch into the PMD and extend
>  enum rte_rmt_call_master_t with a SKIP_RUNNING value, so the application does
>  not have to iterate over the service cores itself.
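>  
>  With such an extension, the launch loop above could collapse to a single
>  call (SKIP_RUNNING is hypothetical, not an existing enum value):
>  
>  	rte_eal_mp_remote_launch(normal_workers, NULL, SKIP_RUNNING);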
>  
>  Comments?
>  
>  Signed-off-by: Jerin Jacob <jerin.jacob at caviumnetworks.com>
>  ---
>   lib/librte_eventdev/rte_eventdev.h | 87 +++++++++++++++++++++++++++++++++++++-
>   1 file changed, 85 insertions(+), 2 deletions(-)
>  
>  diff --git a/lib/librte_eventdev/rte_eventdev.h b/lib/librte_eventdev/rte_eventdev.h
>  index b8ed6ef08..b601b9ecd 100644
>  --- a/lib/librte_eventdev/rte_eventdev.h
>  +++ b/lib/librte_eventdev/rte_eventdev.h
>  @@ -517,6 +517,64 @@ rte_event_dev_configure(uint8_t dev_id,
>    *  @see rte_event_port_setup(), rte_event_port_link()
>    */
>  
>  +/** Event queue producers configuration structure.
>  + * Events are injected into the event device through the *enqueue* operation
>  + * with op == RTE_EVENT_OP_NEW by the event producers in the system. If the
>  + * event producer is an Ethernet device, the eventdev PMD may operate in
>  + * conjunction with the ethdev PMD to inject the events (Ethernet packets)
>  + * into the eventdev. The event injection can happen in HW, in SW, or in a
>  + * combination of the two, based on the HW capabilities of the target
>  + * eventdev and ethdev PMDs.
>  + * If the eventdev PMD needs additional threads to inject the events,
>  + * a set of callbacks will be provided in rte_event_dev_start().
>  + * The application must invoke each callback on an lcore to provide the
>  + * required functionality.
>  + *
>  + * @see rte_event_dev_start()
>  + *
>  + */
>  +struct rte_event_queue_producer_conf {
>  +	uint32_t event_type:4;
>  +	/**< Event type to classify the event source.
>  +	 * @see RTE_EVENT_TYPE_ETHDEV, (RTE_EVENT_TYPE_*)
>  +	 */
>  +	uint8_t sched_type:2;
>  +	/**< Scheduler synchronization type (RTE_SCHED_TYPE_*)
>  +	 * associated with flow id on a given event queue for the enqueue
>  +	 * operation.
>  +	 */
>  +	uint8_t priority;
>  +	/**< Event priority relative to other events in the
>  +	 * event queue. The requested priority should be in the
>  +	 * range of [RTE_EVENT_DEV_PRIORITY_HIGHEST,
>  +	 * RTE_EVENT_DEV_PRIORITY_LOWEST].
>  +	 * The implementation shall normalize the requested
>  +	 * priority to a supported priority value.
>  +	 * Valid when the device has the
>  +	 * RTE_EVENT_DEV_CAP_EVENT_QOS capability.
>  +	 */
>  +	union {
>  +		struct rte_event_ethdev_producer {
>  +			uint16_t ethdev_port;
>  +			/**< The port identifier of the Ethernet device */
>  +			int32_t rx_queue_id;
>  +			/**< The index of the receive queue from which to
>  +			 * retrieve the input packets and inject them into the
>  +			 * eventdev. The value -1 denotes that all the Rx
>  +			 * queues configured for the given ethdev_port are
>  +			 * selected for retrieving the input packets and
>  +			 * injecting the events/packets into the eventdev.
>  +			 * The rte_eth_rx_burst() result is undefined if the
>  +			 * application invokes it on a bound ethdev_port and
>  +			 * rx_queue_id.
>  +			 */
>  +		} ethdev; /* RTE_EVENT_TYPE_ETHDEV */
>  +		/**< Valid when event_type == RTE_EVENT_TYPE_ETHDEV.
>  +		 * The implementation may use the mbuf's RSS hash value as
>  +		 * the flow_id for the enqueue operation.
>  +		 */
>  +	};
>  +};
>  +
>   /** Event queue configuration structure */
>   struct rte_event_queue_conf {
>   	uint32_t nb_atomic_flows;
>  @@ -545,6 +603,17 @@ struct rte_event_queue_conf {
>   	 * event device supported priority value.
>   	 * Valid when the device has RTE_EVENT_DEV_CAP_QUEUE_QOS capability
>   	 */
>  +	uint16_t nb_producers;
>  +	/**< The number of producers that inject events with op as
>  +	 * RTE_EVENT_OP_NEW into this event queue.
>  +	 *
>  +	 * @see rte_event_queue_producer_conf RTE_EVENT_OP_NEW
>  +	 */
>  +	struct rte_event_queue_producer_conf *producers;
>  +	/**< Points to an array of *nb_producers* objects of type
>  +	 * *rte_event_queue_producer_conf*, which contain the
>  +	 * event queue producer configuration information.
>  +	 */
>   };
>  
>   /**
>  @@ -590,7 +659,14 @@ rte_event_queue_default_conf_get(uint8_t dev_id, uint8_t queue_id,
>    * @return
>    *   - 0: Success, event queue correctly set up.
>    *   - <0: event queue configuration failed
>  + *   - -EDQUOT: Quota exceeded (the application tried to configure the same
>  + *   producer on more than one event queue)
>  + *   - -EOPNOTSUPP: Implementation is not capable of pulling the events from
>  + *   the specified producer queue. On this error, the application may try to
>  + *   reconfigure the event queue with rx_queue_id as -1 in
>  + *   struct rte_event_queue_producer_conf.
>    */
>  +
>   int
>   rte_event_queue_setup(uint8_t dev_id, uint8_t queue_id,
>   		      const struct rte_event_queue_conf *queue_conf);
>  @@ -755,13 +831,20 @@ rte_event_port_count(uint8_t dev_id);
>    *
>    * @param dev_id
>    *   Event device identifier
>  + * @param[out] fns
>  + *   Block of memory into which callback pointers are written. The
>  + *   application must launch these callbacks on available lcores using
>  + *   rte_eal_remote_launch() or equivalent.
>  + *   The caller has to allocate *RTE_MAX_LCORE * sizeof(void\*)* bytes to
>  + *   store the callback pointers.
>    * @return
>  - *   - 0: Success, device started.
>  + *   - >= 0: Success, device started. The return value is the number of
>  + *   callback pointers filled into the fns array.
>    *   - -ESTALE : Not all ports of the device are configured
>    *   - -ENOLINK: Not all queues are linked, which could lead to deadlock.
>    */
>   int
>  -rte_event_dev_start(uint8_t dev_id);
>  +rte_event_dev_start(uint8_t dev_id, lcore_function_t *fns[]);
>  
>   /**
>    * Stop an event device. The device can be restarted with a call to
>  --
>  2.12.2


