[dpdk-dev,V10,2/2] eal: add uevent pass and process function

Message ID 1515679534-22473-2-git-send-email-jia.guo@intel.com (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Headers

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation success Compilation OK

Commit Message

Guo, Jia Jan. 11, 2018, 2:05 p.m. UTC
  In order to handle the uevent which have been detected from the kernel
side, add uevent process function, let hot plug event to be example to
show uevent mechanism how to pass the uevent and process the uevent.

About uevent passing and processing, add below functions in linux eal
dev layer. FreeBSD not support uevent ,so let it to be void and do not
implement in function.
a.dev_uev_parse
b.dev_uev_receive
c.dev_uev_process

Signed-off-by: Jeff Guo <jia.guo@intel.com>
---
V10->V9:
delete some unuse part
---
 lib/librte_eal/common/include/rte_dev.h |  23 +++++++
 lib/librte_eal/linuxapp/eal/eal_dev.c   | 109 +++++++++++++++++++++++++++++++-
 2 files changed, 130 insertions(+), 2 deletions(-)
  

Comments

Thomas Monjalon Jan. 14, 2018, 11:24 p.m. UTC | #1
11/01/2018 15:05, Jeff Guo:
> +enum rte_dev_state {
> +	RTE_DEV_UNDEFINED,	/**< unknown device state */
> +	RTE_DEV_FAULT,	/**< device fault or error */
> +	RTE_DEV_PARSED,	/**< device have been parsed on bus*/
> +	RTE_DEV_PROBED,	/**< devcie have been probed driver  */
> +};

Let's start with nitpicks: please be careful with spacing in comments.
+ typo: devcie
+ grammar: device has

What means parsed on bus? Is it "scanned"?

> +enum rte_dev_subsystem {
> +	RTE_DEV_SUBSYSTEM_UIO,
> +	RTE_DEV_SUBSYSTEM_VFIO,
> +	RTE_DEV_SUBSYSTEM_PCI,
> +	RTE_DEV_SUBSYSTEM_MAX
> +};

I don't think PCI and UIO/VFIO should be described at the same level.
Can you re-use the enum rte_kernel_driver?

> +enum event_monitor_netlink_group {
> +	RTE_DEV_EVENT_MONITOR_KERNEL,
> +	RTE_DEV_EVENT_MONITOR_UDEV,
> +};

This enum should be prefixed with rte_

> +	enum event_monitor_netlink_group group;	/**< device netlink group */

netlink is specific to Linux.
I don't think it should be in a generic API struct.
  
Guo, Jia Jan. 15, 2018, 10:52 a.m. UTC | #2
On 1/15/2018 7:24 AM, Thomas Monjalon wrote:
> 11/01/2018 15:05, Jeff Guo:
>> +enum rte_dev_state {
>> +	RTE_DEV_UNDEFINED,	/**< unknown device state */
>> +	RTE_DEV_FAULT,	/**< device fault or error */
>> +	RTE_DEV_PARSED,	/**< device have been parsed on bus*/
>> +	RTE_DEV_PROBED,	/**< devcie have been probed driver  */
>> +};
> Let's start with nitpicks: please be careful with spacing in comments.
> + typo: devcie
> + grammar: device has
>
> What means parsed on bus? Is it "scanned"?
absolutely what i mean is scanned.
>> +enum rte_dev_subsystem {
>> +	RTE_DEV_SUBSYSTEM_UIO,
>> +	RTE_DEV_SUBSYSTEM_VFIO,
>> +	RTE_DEV_SUBSYSTEM_PCI,
>> +	RTE_DEV_SUBSYSTEM_MAX
>> +};
> I don't think PCI and UIO/VFIO should be described at the same level.
> Can you re-use the enum rte_kernel_driver?

rte_kernel_driver might be not qualify for that use, since that is the event sumsystem, it include pci/uio/vfio, such strings to identify each subsystem. i will modify it to be rte_dev_event_subsystem.

>> +enum event_monitor_netlink_group {
>> +	RTE_DEV_EVENT_MONITOR_KERNEL,
>> +	RTE_DEV_EVENT_MONITOR_UDEV,
>> +};
> This enum should be prefixed with rte_
sure.
>> +	enum event_monitor_netlink_group group;	/**< device netlink group */
> netlink is specific to Linux.
> I don't think it should be in a generic API struct.
agree.
  
Thomas Monjalon Jan. 15, 2018, 11:29 a.m. UTC | #3
15/01/2018 11:52, Guo, Jia:
> On 1/15/2018 7:24 AM, Thomas Monjalon wrote:
> > 11/01/2018 15:05, Jeff Guo:
> >> +enum rte_dev_subsystem {
> >> +	RTE_DEV_SUBSYSTEM_UIO,
> >> +	RTE_DEV_SUBSYSTEM_VFIO,
> >> +	RTE_DEV_SUBSYSTEM_PCI,
> >> +	RTE_DEV_SUBSYSTEM_MAX
> >> +};
> > 
> > I don't think PCI and UIO/VFIO should be described at the same level.
> > Can you re-use the enum rte_kernel_driver?
> 
> rte_kernel_driver might be not qualify for that use, since that is the event sumsystem, it include pci/uio/vfio, such strings to identify each subsystem. i will modify it to be rte_dev_event_subsystem.

I don't understand this classification.
A device can be both PCI and VFIO.
  
Guo, Jia Jan. 15, 2018, 3:33 p.m. UTC | #4
On 1/15/2018 7:29 PM, Thomas Monjalon wrote:
> 15/01/2018 11:52, Guo, Jia:
>> On 1/15/2018 7:24 AM, Thomas Monjalon wrote:
>>> 11/01/2018 15:05, Jeff Guo:
>>>> +enum rte_dev_subsystem {
>>>> +	RTE_DEV_SUBSYSTEM_UIO,
>>>> +	RTE_DEV_SUBSYSTEM_VFIO,
>>>> +	RTE_DEV_SUBSYSTEM_PCI,
>>>> +	RTE_DEV_SUBSYSTEM_MAX
>>>> +};
>>> I don't think PCI and UIO/VFIO should be described at the same level.
>>> Can you re-use the enum rte_kernel_driver?
>> rte_kernel_driver might be not qualify for that use, since that is the event sumsystem, it include pci/uio/vfio, such strings to identify each subsystem. i will modify it to be rte_dev_event_subsystem.
>>
>> I don't understand this classification.
>> A device can be both PCI and VFIO.
yes , i think that might be a little strange, but what i saw in the 
uevent message is that , the item of subsystem info from kernel side is 
pci sometimes, but some time is uio, i don't know if it is good to 
defferentiy them by "subsystem " or "driver" or other. let me think 
about it more.
  

Patch

diff --git a/lib/librte_eal/common/include/rte_dev.h b/lib/librte_eal/common/include/rte_dev.h
index fea037a..a3166f7 100644
--- a/lib/librte_eal/common/include/rte_dev.h
+++ b/lib/librte_eal/common/include/rte_dev.h
@@ -51,6 +51,23 @@  extern "C" {
 
 #include <rte_log.h>
 
+#define RTE_EAL_UEV_MSG_LEN 4096
+#define RTE_EAL_UEV_MSG_ELEM_LEN 128
+
+enum rte_dev_state {
+	RTE_DEV_UNDEFINED,	/**< unknown device state */
+	RTE_DEV_FAULT,	/**< device fault or error */
+	RTE_DEV_PARSED,	/**< device have been parsed on bus*/
+	RTE_DEV_PROBED,	/**< devcie have been probed driver  */
+};
+
+enum rte_dev_subsystem {
+	RTE_DEV_SUBSYSTEM_UIO,
+	RTE_DEV_SUBSYSTEM_VFIO,
+	RTE_DEV_SUBSYSTEM_PCI,
+	RTE_DEV_SUBSYSTEM_MAX
+};
+
 /**
  * The device event type.
  */
@@ -61,10 +78,16 @@  enum rte_dev_event_type {
 	RTE_DEV_EVENT_MAX	/**< max value of this enum */
 };
 
+enum event_monitor_netlink_group {
+	RTE_DEV_EVENT_MONITOR_KERNEL,
+	RTE_DEV_EVENT_MONITOR_UDEV,
+};
+
 struct rte_dev_event {
 	enum rte_dev_event_type type;	/**< device event type */
 	int subsystem;				/**< subsystem id */
 	char *devname;				/**< device name */
+	enum event_monitor_netlink_group group;	/**< device netlink group */
 };
 
 typedef int (*rte_dev_event_cb_fn)(enum rte_dev_event_type event,
diff --git a/lib/librte_eal/linuxapp/eal/eal_dev.c b/lib/librte_eal/linuxapp/eal/eal_dev.c
index bc32aab..1d0ec33 100644
--- a/lib/librte_eal/linuxapp/eal/eal_dev.c
+++ b/lib/librte_eal/linuxapp/eal/eal_dev.c
@@ -79,10 +79,115 @@  dev_monitor_enable(int netlink_fd)
 	return -1;
 }
 
+static void
+dev_uev_parse(const char *buf, struct rte_dev_event *event)
+{
+	char action[RTE_EAL_UEV_MSG_ELEM_LEN];
+	char subsystem[RTE_EAL_UEV_MSG_ELEM_LEN];
+	char dev_path[RTE_EAL_UEV_MSG_ELEM_LEN];
+	char pci_slot_name[RTE_EAL_UEV_MSG_ELEM_LEN];
+	int i = 0;
+
+	memset(action, 0, RTE_EAL_UEV_MSG_ELEM_LEN);
+	memset(subsystem, 0, RTE_EAL_UEV_MSG_ELEM_LEN);
+	memset(dev_path, 0, RTE_EAL_UEV_MSG_ELEM_LEN);
+	memset(pci_slot_name, 0, RTE_EAL_UEV_MSG_ELEM_LEN);
+
+	while (i < RTE_EAL_UEV_MSG_LEN) {
+		for (; i < RTE_EAL_UEV_MSG_LEN; i++) {
+			if (*buf)
+				break;
+			buf++;
+		}
+		if (!strncmp(buf, "libudev", 7)) {
+			buf += 7;
+			i += 7;
+			event->group = RTE_DEV_EVENT_MONITOR_UDEV;
+		}
+		if (!strncmp(buf, "ACTION=", 7)) {
+			buf += 7;
+			i += 7;
+			snprintf(action, sizeof(action), "%s", buf);
+		} else if (!strncmp(buf, "DEVPATH=", 8)) {
+			buf += 8;
+			i += 8;
+			snprintf(dev_path, sizeof(dev_path), "%s", buf);
+		} else if (!strncmp(buf, "SUBSYSTEM=", 10)) {
+			buf += 10;
+			i += 10;
+			snprintf(subsystem, sizeof(subsystem), "%s", buf);
+		} else if (!strncmp(buf, "PCI_SLOT_NAME=", 14)) {
+			buf += 14;
+			i += 14;
+			snprintf(pci_slot_name, sizeof(subsystem), "%s", buf);
+			event->devname = pci_slot_name;
+		}
+		for (; i < RTE_EAL_UEV_MSG_LEN; i++) {
+			if (*buf == '\0')
+				break;
+			buf++;
+		}
+	}
+
+	if (!strncmp(subsystem, "pci", 3))
+		event->subsystem = RTE_DEV_SUBSYSTEM_UIO;
+	if (!strncmp(action, "add", 3))
+		event->type = RTE_DEV_EVENT_ADD;
+	if (!strncmp(action, "remove", 6))
+		event->type = RTE_DEV_EVENT_REMOVE;
+	event->devname = pci_slot_name;
+}
+
+static int
+dev_uev_receive(int fd, struct rte_dev_event *uevent)
+{
+	int ret;
+	char buf[RTE_EAL_UEV_MSG_LEN];
+
+	memset(uevent, 0, sizeof(struct rte_dev_event));
+	memset(buf, 0, RTE_EAL_UEV_MSG_LEN);
+
+	ret = recv(fd, buf, RTE_EAL_UEV_MSG_LEN - 1, MSG_DONTWAIT);
+	if (ret < 0) {
+		RTE_LOG(ERR, EAL,
+		"Socket read error(%d): %s\n",
+		errno, strerror(errno));
+		return -1;
+	} else if (ret == 0)
+		/* connection closed */
+		return -1;
+
+	dev_uev_parse(buf, uevent);
+
+	return 0;
+}
+
 static int
-dev_uev_process(__rte_unused struct epoll_event *events, __rte_unused int nfds)
+dev_uev_process(struct epoll_event *events, int nfds)
 {
-	/* TODO: device uevent processing */
+	struct rte_dev_event uevent;
+	int i;
+
+	for (i = 0; i < nfds; i++) {
+		/**
+		 * check device uevent from kernel side, no need to check
+		 * uevent from udev.
+		 */
+		if ((dev_uev_receive(events[i].data.fd, &uevent)) ||
+			(uevent.group == RTE_DEV_EVENT_MONITOR_UDEV))
+			return 0;
+
+		/* default handle all pci devcie when is being hot plug */
+		if (uevent.subsystem == RTE_DEV_SUBSYSTEM_UIO) {
+			if (uevent.type == RTE_DEV_EVENT_REMOVE) {
+				return(_rte_dev_callback_process(NULL,
+				  RTE_DEV_EVENT_REMOVE));
+			} else if (uevent.type == RTE_DEV_EVENT_ADD) {
+				return(_rte_dev_callback_process(NULL,
+				  RTE_DEV_EVENT_ADD));
+			}
+		}
+	}
 	return 0;
 }