[dpdk-dev] [PATCH v5 5/6] eal: add per rx queue interrupt handling based on VFIO

Zhou, Danny danny.zhou at intel.com
Wed Feb 25 16:29:34 CET 2015



From: David Marchand [mailto:david.marchand at 6wind.com]
Sent: Wednesday, February 25, 2015 6:22 PM
To: Zhou, Danny
Cc: dev at dpdk.org; Liang, Cunming
Subject: Re: [dpdk-dev] [PATCH v5 5/6] eal: add per rx queue interrupt handling based on VFIO

Hello Danny,

On Wed, Feb 25, 2015 at 7:58 AM, Zhou, Danny <danny.zhou at intel.com<mailto:danny.zhou at intel.com>> wrote:

+int
+rte_intr_wait_rx_pkt(struct rte_intr_handle *intr_handle, uint8_t queue_id)
+{
+       struct epoll_event ev;
+       unsigned numfds = 0;
+
+       if (!intr_handle || intr_handle->fd < 0 || intr_handle->uio_cfg_fd < 0)
+               return -1;
+       if (queue_id >= VFIO_MAX_QUEUE_ID)
+               return -1;
+
+       /* create epoll fd */
+       int pfd = epoll_create(1);
+       if (pfd < 0) {
+               RTE_LOG(ERR, EAL, "Cannot create epoll instance\n");
+               return -1;
+       }

Why recreate the epoll instance at each call to this function ?

DZ: To avoid recreating the epoll instance for each queue, the struct rte_intr_handle(or a new structure added to ethdev)
should be extended by adding fields storing per-queue pfd. This way, it could reduce user/kernel context  switch overhead
when calling epoll_create() each time.

Sounds good?

You don't need a epfd per queue. And hardcoding epfd == eventfd will give a not very usable api.

Plus, epoll is something linux-specific, so you can't move it out of eal/linux.
I suppose you need an abstraction here (and in the future we could add something for bsd ?).

DZ: libev provides abstraction layer which is a good candidate to integrate, rather than
reinventing one I think. The BSD support can be implemented in the files under
lib\librte_eal\bsdapp\eal folder by calling BSD specific APIs. Maybe it is a good idea to introduce
a separated component like OS Adaption Layer into EAL in the future once DPDK is widely adopted in
BSD as well as Windows, then all DPDK components invoke Linux specific APIs could instead calling abstraction APIs.

Adding an abstraction here specifically for epoll does not resolve all the porting/migration problem in my mind.


Looking at this patchset, I think there is a design issue.
eal does not need to know about portid neither queueid.

eal can provide an api to retrieve the interrupt fds, configure an epoll instance, wait on an epoll instance etc...
ethdev is then responsible to setup the mapping between port id / queue id and interrupt fds by asking the eal about those fds.

This would result in an eal api even simpler and we could add other fds in a single epoll fd for other uses.

DZ: The queueid is just an index to the queue related eventfd array stored in EAL. If this array is still in the EAL and ethdev can apply for it and setup mapping for certain queue, there
might be issue for multiple-process use case where the fd resources allocated for secondary process are not freed if the secondary process exits unexpectedly.

Not sure I follow you.
If a secondary process exits, the eventfds created in primary process should still be valid and reusable.
Why would you need to free them ? Something to do with vfio ?

DZ: See below.

Probably we can setup the eventfd array inside ethdev,  and we just need EAL API to wait for ethdev’fd. So application invokes ethdev API with portid and queueid, and ethdev calls eal
API to wait on a ethdev fd which correlates with the specified portid and queueid.

Sounds ok to you?

eventfds creation can not be handled by ethdev, since it needs infrastructure and informations from within the eal/linux.
Again, do we need an abstraction ?

ethdev must be the one that does the mappings between port/queue and eventfds (or any object that represents a way to wake up for a given port/queue).

DZ: agreed after revisiting code. Let us follow your direction to create a ethdev API, similar to rte_eth_dev_rx_queue_intr_enable()/rte_eth_dev_rx_queue_intr_disable(), to use portiid and queueid as arguments. Then this ethdev API uses the mapped eventfds to invoke corresponding EAL API, waiting for interrupt event notification from kernel.  A V6 patchset will be created for this.

--
David Marchand


More information about the dev mailing list