[dpdk-dev,RFC] ether: standardize getting the port by name

Message ID 1512027330-30030-1-git-send-email-yliu@fridaylinux.org (mailing list archive)
State Superseded, archived
Headers

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation success Compilation OK

Commit Message

Yuanhan Liu Nov. 30, 2017, 7:35 a.m. UTC
  The ethdev name is taken from the "name" parameter from the helper
function rte_eth_dev_allocate(name). The name is given by the caller,
thus, there is no way to guarantee all the callers will follow the
same syntax, leaving us the port name is not standardized.

For example, for all ports probed by the generic pci probe function,
the name is set to the PCI id. For all others, it's set by the caller,
aka, the PMD driver. Taking mlx PMD driver as the example, it's set
to something like "mlx5_0 port 0".

Unfortunately, ovs-dpdk uses such name for referencing a specific port.
Since there is no standard, user has to figure out what is the right
name for the nic he is using. That adds extra (unnecessary) obstruction
to users.

Thus, the name should be standardized. We should give user a consistent
interface for finding a specific port.

What this patch proposes is to use "name[,mac]" syntax. "name" is the
PCI id for pci device. For vdev, it's the vdev name given by user. The
reason "mac" is needed is for some devices (say ConnectX-3), 2 ports
(in a single NIC) have the same PCI id.

There are 2 reasons I didn't make "mac" mandatory:
- it keeps the compatibility
- in most cases, the pci id is good enough to identify a port

However, while writing this commit log, I think it might be better to
use something like UUID for standardizing the port name. This way, we
will always have a very consistent naming, no matter whether it's PCI
device or vdev device and whether a PCI devices has 2 ports share the
same PIC id, or something we have considered so far (say, few ports
sharing same PCI and mac address :/).

It's also simpler and cleaner. The only drawback is such ID is meaningless
to human.

Please also note that this patch just comes up with an API to query
a port from standard name suggested above. The ethdev name isn't really
standardized here. This patch is asking for comments after all.

Thoughts?

Cc: Thomas Monjalon <thomas@monjalon.net>
Cc: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Cc: Ciara Loftus <ciara.loftus@intel.com>
Cc: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Yuanhan Liu <yliu@fridaylinux.org>
---
 lib/librte_ether/rte_ethdev.c | 61 +++++++++++++++++++++++++++++++++++++++++++
 lib/librte_ether/rte_ethdev.h | 28 ++++++++++++++++++++
 2 files changed, 89 insertions(+)
  

Comments

Stephen Hemminger Nov. 30, 2017, 5:15 p.m. UTC | #1
On Thu, 30 Nov 2017 15:35:30 +0800
Yuanhan Liu <yliu@fridaylinux.org> wrote:

> The ethdev name is taken from the "name" parameter from the helper
> function rte_eth_dev_allocate(name). The name is given by the caller,
> thus, there is no way to guarantee all the callers will follow the
> same syntax, leaving us the port name is not standardized.
> 
> For example, for all ports probed by the generic pci probe function,
> the name is set to the PCI id. For all others, it's set by the caller,
> aka, the PMD driver. Taking mlx PMD driver as the example, it's set
> to something like "mlx5_0 port 0".
> 
> Unfortunately, ovs-dpdk uses such name for referencing a specific port.
> Since there is no standard, user has to figure out what is the right
> name for the nic he is using. That adds extra (unnecessary) obstruction
> to users.
> 
> Thus, the name should be standardized. We should give user a consistent
> interface for finding a specific port.
> 
> What this patch proposes is to use "name[,mac]" syntax. "name" is the
> PCI id for pci device. For vdev, it's the vdev name given by user. The
> reason "mac" is needed is for some devices (say ConnectX-3), 2 ports
> (in a single NIC) have the same PCI id.
> 
> There are 2 reasons I didn't make "mac" mandatory:
> - it keeps the compatibility
> - in most cases, the pci id is good enough to identify a port
> 
> However, while writing this commit log, I think it might be better to
> use something like UUID for standardizing the port name. This way, we
> will always have a very consistent naming, no matter whether it's PCI
> device or vdev device and whether a PCI devices has 2 ports share the
> same PIC id, or something we have considered so far (say, few ports
> sharing same PCI and mac address :/).
> 
> It's also simpler and cleaner. The only drawback is such ID is meaningless
> to human.
> 
> Please also note that this patch just comes up with an API to query
> a port from standard name suggested above. The ethdev name isn't really
> standardized here. This patch is asking for comments after all.
> 
> Thoughts?
> 
> Cc: Thomas Monjalon <thomas@monjalon.net>
> Cc: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> Cc: Ciara Loftus <ciara.loftus@intel.com>
> Cc: Kevin Traynor <ktraynor@redhat.com>
> Signed-off-by: Yuanhan Liu <yliu@fridaylinux.org>

Some thoughts.
1) Not all devices are PCI; look at recent VMBUS
2) The name may have to be set before MAC address is determined on boot.
3) The names themselves are not persistent or human friendly. This is hard
   see the effort udev goes to.
  
Thomas Monjalon Nov. 30, 2017, 5:35 p.m. UTC | #2
30/11/2017 18:15, Stephen Hemminger:
> Some thoughts.
> 1) Not all devices are PCI; look at recent VMBUS

Yes, we need a syntax which works for every devices.
I suggest to use the prefix "pci:" before the PCI id.
We need also a prefix and ids for NXP buses.
We could use "vmbus:" before VMBUS ids.
How VMBUS ids look like?

> 2) The name may have to be set before MAC address is determined on boot.

I don't understand this comment.
Do you mean MAC may be unknown when starting DPDK?

> 3) The names themselves are not persistent or human friendly. This is hard
>    see the effort udev goes to.

Yes udev has a syntax to identify devices. It can be inspiring.
Qemu may also be inspiring:
	https://github.com/qemu/qemu/blob/master/docs/qdev-device-use.txt
  
Stephen Hemminger Nov. 30, 2017, 9:21 p.m. UTC | #3
On Thu, 30 Nov 2017 18:35:11 +0100
Thomas Monjalon <thomas@monjalon.net> wrote:

> 30/11/2017 18:15, Stephen Hemminger:
> > Some thoughts.
> > 1) Not all devices are PCI; look at recent VMBUS  
> 
> Yes, we need a syntax which works for every devices.
> I suggest to use the prefix "pci:" before the PCI id.
> We need also a prefix and ids for NXP buses.
> We could use "vmbus:" before VMBUS ids.
> How VMBUS ids look like?
> 
> > 2) The name may have to be set before MAC address is determined on boot.  
> 
> I don't understand this comment.
> Do you mean MAC may be unknown when starting DPDK?

The MAC be known by the hardware, but the device would have to be
created before using  hardware to read it.


> 
> > 3) The names themselves are not persistent or human friendly. This is hard
> >    see the effort udev goes to.  
> 
> Yes udev has a syntax to identify devices. It can be inspiring.
> Qemu may also be inspiring:
> 	https://github.com/qemu/qemu/blob/master/docs/qdev-device-use.txt
  
Thomas Monjalon Nov. 30, 2017, 9:44 p.m. UTC | #4
30/11/2017 22:21, Stephen Hemminger:
> On Thu, 30 Nov 2017 18:35:11 +0100
> Thomas Monjalon <thomas@monjalon.net> wrote:
> 
> > 30/11/2017 18:15, Stephen Hemminger:
> > > Some thoughts.
> > > 1) Not all devices are PCI; look at recent VMBUS  
> > 
> > Yes, we need a syntax which works for every devices.
> > I suggest to use the prefix "pci:" before the PCI id.
> > We need also a prefix and ids for NXP buses.
> > We could use "vmbus:" before VMBUS ids.
> > How VMBUS ids look like?
> > 
> > > 2) The name may have to be set before MAC address is determined on boot.  
> > 
> > I don't understand this comment.
> > Do you mean MAC may be unknown when starting DPDK?
> 
> The MAC be known by the hardware, but the device would have to be
> created before using  hardware to read it.

Indeed, it is a problem if we want to use this syntax for blacklist.


> > > 3) The names themselves are not persistent or human friendly. This is hard
> > >    see the effort udev goes to.  
> > 
> > Yes udev has a syntax to identify devices. It can be inspiring.
> > Qemu may also be inspiring:
> > 	https://github.com/qemu/qemu/blob/master/docs/qdev-device-use.txt
  
Gaëtan Rivet Dec. 1, 2017, 9:47 a.m. UTC | #5
On Thu, Nov 30, 2017 at 10:44:58PM +0100, Thomas Monjalon wrote:
> 30/11/2017 22:21, Stephen Hemminger:
> > On Thu, 30 Nov 2017 18:35:11 +0100
> > Thomas Monjalon <thomas@monjalon.net> wrote:
> > 
> > > 30/11/2017 18:15, Stephen Hemminger:
> > > > Some thoughts.
> > > > 1) Not all devices are PCI; look at recent VMBUS  
> > > 
> > > Yes, we need a syntax which works for every devices.
> > > I suggest to use the prefix "pci:" before the PCI id.
> > > We need also a prefix and ids for NXP buses.
> > > We could use "vmbus:" before VMBUS ids.
> > > How VMBUS ids look like?
> > > 

rte_devargs are easily accessible, user-readable. Only thing missing
would be requiring a 1-1 mapping between an rte_devargs and a port, thus
requiring PMDs to have at least one version of a device string that
would probe a single port (as is done with port= in mlx4).

Implementing an rte_devargs to rte_device in rte_bus is simple enough,
and this would allow implementing an rte_devargs to port_id in rte_eth.

What am I missing?

> > > > 2) The name may have to be set before MAC address is determined on boot.  
> > > 
> > > I don't understand this comment.
> > > Do you mean MAC may be unknown when starting DPDK?
> > 
> > The MAC be known by the hardware, but the device would have to be
> > created before using  hardware to read it.
> 
> Indeed, it is a problem if we want to use this syntax for blacklist.
> 
> 
> > > > 3) The names themselves are not persistent or human friendly. This is hard
> > > >    see the effort udev goes to.  
> > > 
> > > Yes udev has a syntax to identify devices. It can be inspiring.
> > > Qemu may also be inspiring:
> > > 	https://github.com/qemu/qemu/blob/master/docs/qdev-device-use.txt
>
  
Yuanhan Liu Dec. 4, 2017, 1:55 p.m. UTC | #6
On Fri, Dec 01, 2017 at 10:47:50AM +0100, Gaëtan Rivet wrote:
> On Thu, Nov 30, 2017 at 10:44:58PM +0100, Thomas Monjalon wrote:
> > 30/11/2017 22:21, Stephen Hemminger:
> > > On Thu, 30 Nov 2017 18:35:11 +0100
> > > Thomas Monjalon <thomas@monjalon.net> wrote:
> > > 
> > > > 30/11/2017 18:15, Stephen Hemminger:
> > > > > Some thoughts.
> > > > > 1) Not all devices are PCI; look at recent VMBUS  
> > > > 
> > > > Yes, we need a syntax which works for every devices.
> > > > I suggest to use the prefix "pci:" before the PCI id.
> > > > We need also a prefix and ids for NXP buses.
> > > > We could use "vmbus:" before VMBUS ids.
> > > > How VMBUS ids look like?
> > > > 
> 
> rte_devargs are easily accessible, user-readable. Only thing missing
> would be requiring a 1-1 mapping between an rte_devargs and a port, thus
> requiring PMDs to have at least one version of a device string that
> would probe a single port (as is done with port= in mlx4).
> 
> Implementing an rte_devargs to rte_device in rte_bus is simple enough,
> and this would allow implementing an rte_devargs to port_id in rte_eth.
> 
> What am I missing?

rte_devargs is identified by the name (pci id for pci device). It also
includes other driver specific key-value options. It's not clear for the
user to know which one (or few) of them should be used together with the
PCI id to identify a specific port. For example, as you mentioned, in
mlx4, it's "pci_id,port=x". It could be something else in other drivers.

Actually, this patch also proposes a devarg like naming style: "name[,mac]".
What it does try to do is to define a standard syntax, so that the user
doesn't have to know any driver specific options.

However, the mac address is changeable, leaving this naming inconsistent.
Well, in practice, PCI id is also changeable.

OTOH, having a consistent naming seems a bit far away from this patch's
goal: define a standard ethdev naming and leave less harassment to the users.

	--yliu
> 
> > > > > 2) The name may have to be set before MAC address is determined on boot.  
> > > > 
> > > > I don't understand this comment.
> > > > Do you mean MAC may be unknown when starting DPDK?
> > > 
> > > The MAC be known by the hardware, but the device would have to be
> > > created before using  hardware to read it.
> > 
> > Indeed, it is a problem if we want to use this syntax for blacklist.
> > 
> > 
> > > > > 3) The names themselves are not persistent or human friendly. This is hard
> > > > >    see the effort udev goes to.  
> > > > 
> > > > Yes udev has a syntax to identify devices. It can be inspiring.
> > > > Qemu may also be inspiring:
> > > > 	https://github.com/qemu/qemu/blob/master/docs/qdev-device-use.txt
> > 
> 
> -- 
> Gaëtan Rivet
> 6WIND
  
Adrien Mazarguil Dec. 5, 2017, 11:04 a.m. UTC | #7
Hi Yuanhan,

On Mon, Dec 04, 2017 at 09:55:31PM +0800, Yuanhan Liu wrote:
> On Fri, Dec 01, 2017 at 10:47:50AM +0100, Gaëtan Rivet wrote:
> > On Thu, Nov 30, 2017 at 10:44:58PM +0100, Thomas Monjalon wrote:
> > > 30/11/2017 22:21, Stephen Hemminger:
> > > > On Thu, 30 Nov 2017 18:35:11 +0100
> > > > Thomas Monjalon <thomas@monjalon.net> wrote:
> > > > 
> > > > > 30/11/2017 18:15, Stephen Hemminger:
> > > > > > Some thoughts.
> > > > > > 1) Not all devices are PCI; look at recent VMBUS  
> > > > > 
> > > > > Yes, we need a syntax which works for every devices.
> > > > > I suggest to use the prefix "pci:" before the PCI id.
> > > > > We need also a prefix and ids for NXP buses.
> > > > > We could use "vmbus:" before VMBUS ids.
> > > > > How VMBUS ids look like?
> > > > > 
> > 
> > rte_devargs are easily accessible, user-readable. Only thing missing
> > would be requiring a 1-1 mapping between an rte_devargs and a port, thus
> > requiring PMDs to have at least one version of a device string that
> > would probe a single port (as is done with port= in mlx4).
> > 
> > Implementing an rte_devargs to rte_device in rte_bus is simple enough,
> > and this would allow implementing an rte_devargs to port_id in rte_eth.
> > 
> > What am I missing?
> 
> rte_devargs is identified by the name (pci id for pci device). It also
> includes other driver specific key-value options. It's not clear for the
> user to know which one (or few) of them should be used together with the
> PCI id to identify a specific port. For example, as you mentioned, in
> mlx4, it's "pci_id,port=x". It could be something else in other drivers.

Just for information, this "port=x" argument in mlx4 is consistent with the
value found in /sys/class/net/ethX/dev_port under Linux. If we choose to use
a port index (instead of a MAC or something else), it would make sense to
standardize it on the same order as given by the host OS for consistency
across all PMDs.

Devices with a single port per PCI address would simply use/allow "0".

> Actually, this patch also proposes a devarg like naming style: "name[,mac]".
> What it does try to do is to define a standard syntax, so that the user
> doesn't have to know any driver specific options.
> 
> However, the mac address is changeable, leaving this naming inconsistent.
> Well, in practice, PCI id is also changeable.
> 
> OTOH, having a consistent naming seems a bit far away from this patch's
> goal: define a standard ethdev naming and leave less harassment to the users.

I'm not a fan of the MAC naming scheme either, a kind of per-device physical
port index seems more robust and doesn't require much initialization to
determine how many ports are supported by the device and whether the index is
known/valid (particularly given the vast majority exposes only one per bus
address).

Would it make sense to have this name usable unmodified as a valid device
(-w) argument, including parameters?

If so, PMDs could append parameters while probing the underlying device, by
appending ",port=x", ",mac=x" followed by other unspecified parameters with
default values. This could uniquely identify the port _and_ its
configuration in a reusable fashion.

Otherwise if all we need is unique names, we can use the opposite and much
simpler approach. Let librte_ether assign them sequentially
(e.g. "rte_eth%d", no need for consistency with OS netdevices), applications
can figure the rest based on data structures if needed.

Thoughts?

> > > > > > 2) The name may have to be set before MAC address is determined on boot.  
> > > > > 
> > > > > I don't understand this comment.
> > > > > Do you mean MAC may be unknown when starting DPDK?
> > > > 
> > > > The MAC be known by the hardware, but the device would have to be
> > > > created before using  hardware to read it.
> > > 
> > > Indeed, it is a problem if we want to use this syntax for blacklist.
> > > 
> > > 
> > > > > > 3) The names themselves are not persistent or human friendly. This is hard
> > > > > >    see the effort udev goes to.  
> > > > > 
> > > > > Yes udev has a syntax to identify devices. It can be inspiring.
> > > > > Qemu may also be inspiring:
> > > > > 	https://github.com/qemu/qemu/blob/master/docs/qdev-device-use.txt
  
Thomas Monjalon Dec. 5, 2017, 1:20 p.m. UTC | #8
05/12/2017 12:04, Adrien Mazarguil:
> Hi Yuanhan,
> 
> On Mon, Dec 04, 2017 at 09:55:31PM +0800, Yuanhan Liu wrote:
> > rte_devargs is identified by the name (pci id for pci device). It also
> > includes other driver specific key-value options. It's not clear for the
> > user to know which one (or few) of them should be used together with the
> > PCI id to identify a specific port. For example, as you mentioned, in
> > mlx4, it's "pci_id,port=x". It could be something else in other drivers.
> 
> Just for information, this "port=x" argument in mlx4 is consistent with the
> value found in /sys/class/net/ethX/dev_port under Linux. If we choose to use
> a port index (instead of a MAC or something else), it would make sense to
> standardize it on the same order as given by the host OS for consistency
> across all PMDs.

Good idea, thanks.

I think we will MAC in some cases and port number in others.
It is important to have identifiers available even without initializing
the device.

> Devices with a single port per PCI address would simply use/allow "0".
> 
> > Actually, this patch also proposes a devarg like naming style: "name[,mac]".
> > What it does try to do is to define a standard syntax, so that the user
> > doesn't have to know any driver specific options.
> > 
> > However, the mac address is changeable, leaving this naming inconsistent.
> > Well, in practice, PCI id is also changeable.
> > 
> > OTOH, having a consistent naming seems a bit far away from this patch's
> > goal: define a standard ethdev naming and leave less harassment to the users.
> 
> I'm not a fan of the MAC naming scheme either, a kind of per-device physical
> port index seems more robust and doesn't require much initialization to
> determine how many ports are supported by the device and whether the index is
> known/valid (particularly given the vast majority exposes only one per bus
> address).
> 
> Would it make sense to have this name usable unmodified as a valid device
> (-w) argument, including parameters?

Yes we must provide some new key/value arguments for devargs.
So it can be used anywhere, including -w/-b options in DPDK
or port configuration in OVS.

> If so, PMDs could append parameters while probing the underlying device, by
> appending ",port=x", ",mac=x" followed by other unspecified parameters with
> default values. This could uniquely identify the port _and_ its
> configuration in a reusable fashion.

Question: should we separate device identification and configuration
in the syntax?

> Otherwise if all we need is unique names, we can use the opposite and much
> simpler approach. Let librte_ether assign them sequentially
> (e.g. "rte_eth%d", no need for consistency with OS netdevices), applications
> can figure the rest based on data structures if needed.

No, unique names are not useful / not usable by users.
  
Yuanhan Liu Dec. 5, 2017, 1:58 p.m. UTC | #9
On Tue, Dec 05, 2017 at 02:20:05PM +0100, Thomas Monjalon wrote:
> 05/12/2017 12:04, Adrien Mazarguil:
> 
> > Hi Yuanhan,
> 
> >
> 
> > On Mon, Dec 04, 2017 at 09:55:31PM +0800, Yuanhan Liu wrote:
> > > rte_devargs is identified by the name (pci id for pci device). It also
> > > includes other driver specific key-value options. It's not clear for the
> > > user to know which one (or few) of them should be used together with the
> > > PCI id to identify a specific port. For example, as you mentioned, in
> > > mlx4, it's "pci_id,port=x". It could be something else in other drivers.
> 
> > Just for information, this "port=x" argument in mlx4 is consistent with the
> > value found in /sys/class/net/ethX/dev_port under Linux. If we choose to use
> > a port index (instead of a MAC or something else), it would make sense to
> > standardize it on the same order as given by the host OS for consistency
> > across all PMDs.

Thanks for the info.

But FYI, for most of other PMDs, such sys file won't exist, as the host
driver should have been unbind and bind with something like uio. So I
don't think it applies to all other nics.

> 
> Good idea, thanks.
> 
>  
> 
> I think we will MAC in some cases and port number in others.

Could you list some examples?

> It is important to have identifiers available even without initializing
> the device.

I don't object that. I think that's good for whitelisting/blacklisting
before the device initialization. However, once the initialization is
done, everything could be used to identify a port (say, the mac this patch
mentioned).

> > Devices with a single port per PCI address would simply use/allow "0".

Yes.

> >
> 
> > > Actually, this patch also proposes a devarg like naming style: "name[,mac]
> ".
> > > What it does try to do is to define a standard syntax, so that the user
> > > doesn't have to know any driver specific options.
> > >
> > > However, the mac address is changeable, leaving this naming inconsistent.
> > > Well, in practice, PCI id is also changeable.
> 
> > >
> 
> > > OTOH, having a consistent naming seems a bit far away from this patch's
> > > goal: define a standard ethdev naming and leave less harassment to the
> users.
> 
> >
> 
> > I'm not a fan of the MAC naming scheme either, a kind of per-device physical
> > port index seems more robust and doesn't require much initialization to
> > determine how many ports are supported by the device and whether the index is
> > known/valid (particularly given the vast majority exposes only one per bus
> > address).

I'm not a fan of the MAC naming, neither. The reason this patch proposes mac
naming is that it's more clear for the user to specify a port. I also agree
that the port index could be another good option.

One thing I really haven't considered yet is how it becomes when the VF
reprenstor comes to play? (more guys are cc'ed).

> > Would it make sense to have this name usable unmodified as a valid device
> > (-w) argument, including parameters?
> 
> Yes we must provide some new key/value arguments for devargs.
> So it can be used anywhere, including -w/-b options in DPDK
> or port configuration in OVS.
>  
> 
> > If so, PMDs could append parameters while probing the underlying device, by
> > appending ",port=x", ",mac=x" followed by other unspecified parameters with
> > default values. This could uniquely identify the port _and_ its
> > configuration in a reusable fashion.
> 
>  
> 
> Question: should we separate device identification and configuration
> in the syntax?

Yes, we should. I think we should parse the identification generically,
and leave the left to the driver.

> > Otherwise if all we need is unique names, we can use the opposite and much
> > simpler approach. Let librte_ether assign them sequentially
> > (e.g. "rte_eth%d", no need for consistency with OS netdevices), applications
> > can figure the rest based on data structures if needed.
> 
> No, unique names are not useful / not usable by users.

Agree. We don't really need another unique name, since the port id is
already unique.

However, I think a consistent naming might be useful: user doesn't have
to worry it even though MAC is changed or the PCI slot is changed.

	--yliu
  
Thomas Monjalon Dec. 5, 2017, 3:28 p.m. UTC | #10
05/12/2017 14:58, Yuanhan Liu:
> On Tue, Dec 05, 2017 at 02:20:05PM +0100, Thomas Monjalon wrote:
> > 05/12/2017 12:04, Adrien Mazarguil:
> > 
> > > Hi Yuanhan,
> > 
> > >
> > 
> > > On Mon, Dec 04, 2017 at 09:55:31PM +0800, Yuanhan Liu wrote:
> > > > rte_devargs is identified by the name (pci id for pci device). It also
> > > > includes other driver specific key-value options. It's not clear for the
> > > > user to know which one (or few) of them should be used together with the
> > > > PCI id to identify a specific port. For example, as you mentioned, in
> > > > mlx4, it's "pci_id,port=x". It could be something else in other drivers.
> > 
> > > Just for information, this "port=x" argument in mlx4 is consistent with the
> > > value found in /sys/class/net/ethX/dev_port under Linux. If we choose to use
> > > a port index (instead of a MAC or something else), it would make sense to
> > > standardize it on the same order as given by the host OS for consistency
> > > across all PMDs.
> 
> Thanks for the info.
> 
> But FYI, for most of other PMDs, such sys file won't exist, as the host
> driver should have been unbind and bind with something like uio. So I
> don't think it applies to all other nics.
> 
> > 
> > Good idea, thanks.
> > 
> >  
> > 
> > I think we will MAC in some cases and port number in others.
> 
> Could you list some examples?

Port number could be used for Mellanox devices.
MAC could be used when devices are already probed in DPDK,
but can it be used for blacklisting? Do we know the MAC address?


> > It is important to have identifiers available even without initializing
> > the device.
> 
> I don't object that. I think that's good for whitelisting/blacklisting
> before the device initialization. However, once the initialization is
> done, everything could be used to identify a port (say, the mac this patch
> mentioned).
> 
> > > Devices with a single port per PCI address would simply use/allow "0".
> 
> Yes.
> 
> > >
> > 
> > > > Actually, this patch also proposes a devarg like naming style: "name[,mac]
> > ".
> > > > What it does try to do is to define a standard syntax, so that the user
> > > > doesn't have to know any driver specific options.
> > > >
> > > > However, the mac address is changeable, leaving this naming inconsistent.
> > > > Well, in practice, PCI id is also changeable.
> > 
> > > >
> > 
> > > > OTOH, having a consistent naming seems a bit far away from this patch's
> > > > goal: define a standard ethdev naming and leave less harassment to the
> > users.
> > 
> > >
> > 
> > > I'm not a fan of the MAC naming scheme either, a kind of per-device physical
> > > port index seems more robust and doesn't require much initialization to
> > > determine how many ports are supported by the device and whether the index is
> > > known/valid (particularly given the vast majority exposes only one per bus
> > > address).
> 
> I'm not a fan of the MAC naming, neither. The reason this patch proposes mac
> naming is that it's more clear for the user to specify a port. I also agree
> that the port index could be another good option.
> 
> One thing I really haven't considered yet is how it becomes when the VF
> reprenstor comes to play? (more guys are cc'ed).
> 
> > > Would it make sense to have this name usable unmodified as a valid device
> > > (-w) argument, including parameters?
> > 
> > Yes we must provide some new key/value arguments for devargs.
> > So it can be used anywhere, including -w/-b options in DPDK
> > or port configuration in OVS.
> >  
> > 
> > > If so, PMDs could append parameters while probing the underlying device, by
> > > appending ",port=x", ",mac=x" followed by other unspecified parameters with
> > > default values. This could uniquely identify the port _and_ its
> > > configuration in a reusable fashion.
> > 
> >  
> > 
> > Question: should we separate device identification and configuration
> > in the syntax?
> 
> Yes, we should. I think we should parse the identification generically,
> and leave the left to the driver.

So we must take care of a syntax to separate them.
For instance, MAC can be used for identification or configuration (changing
the MAC by user input).


> > > Otherwise if all we need is unique names, we can use the opposite and much
> > > simpler approach. Let librte_ether assign them sequentially
> > > (e.g. "rte_eth%d", no need for consistency with OS netdevices), applications
> > > can figure the rest based on data structures if needed.
> > 
> > No, unique names are not useful / not usable by users.
> 
> Agree. We don't really need another unique name, since the port id is
> already unique.
> 
> However, I think a consistent naming might be useful: user doesn't have
> to worry it even though MAC is changed or the PCI slot is changed.
> 
> 	--yliu
>
  
Adrien Mazarguil Dec. 5, 2017, 5:22 p.m. UTC | #11
On Tue, Dec 05, 2017 at 04:28:00PM +0100, Thomas Monjalon wrote:
> 05/12/2017 14:58, Yuanhan Liu:
> > On Tue, Dec 05, 2017 at 02:20:05PM +0100, Thomas Monjalon wrote:
> > > 05/12/2017 12:04, Adrien Mazarguil:
> > > 
> > > > Hi Yuanhan,
> > > 
> > > >
> > > 
> > > > On Mon, Dec 04, 2017 at 09:55:31PM +0800, Yuanhan Liu wrote:
> > > > > rte_devargs is identified by the name (pci id for pci device). It also
> > > > > includes other driver specific key-value options. It's not clear for the
> > > > > user to know which one (or few) of them should be used together with the
> > > > > PCI id to identify a specific port. For example, as you mentioned, in
> > > > > mlx4, it's "pci_id,port=x". It could be something else in other drivers.
> > > 
> > > > Just for information, this "port=x" argument in mlx4 is consistent with the
> > > > value found in /sys/class/net/ethX/dev_port under Linux. If we choose to use
> > > > a port index (instead of a MAC or something else), it would make sense to
> > > > standardize it on the same order as given by the host OS for consistency
> > > > across all PMDs.
> > 
> > Thanks for the info.
> > 
> > But FYI, for most of other PMDs, such sys file won't exist, as the host
> > driver should have been unbind and bind with something like uio. So I
> > don't think it applies to all other nics.

Sure, I only meant PMDs must implement the same numbering scheme the kernel
driver they replace would have used (as exposed through dev_port) for
consistency. Note dev_port is always present since Linux 3.15, even with
single port/bus address devices, so applications that want to construct
-w/-b arguments can rely on that before unbinding devices.

> > > Good idea, thanks.
> > > 
> > >  
> > > 
> > > I think we will MAC in some cases and port number in others.
> > 
> > Could you list some examples?
> 
> Port number could be used for Mellanox devices.
> MAC could be used when devices are already probed in DPDK,
> but can it be used for blacklisting? Do we know the MAC address?

Problem as described in a prior message (below) is that once a device is
initialized, its MAC can be modified while the device name cannot. This will
lead to confusion and probably bug reports.

Besides, it conflicts with the tap PMD that already takes a "mac" argument
to assign a default MAC address, not look for a device with the given MAC.

> > > It is important to have identifiers available even without initializing
> > > the device.
> > 
> > I don't object that. I think that's good for whitelisting/blacklisting
> > before the device initialization. However, once the initialization is
> > done, everything could be used to identify a port (say, the mac this patch
> > mentioned).
> > 
> > > > Devices with a single port per PCI address would simply use/allow "0".
> > 
> > Yes.
> > 
> > > >
> > > 
> > > > > Actually, this patch also proposes a devarg like naming style: "name[,mac]
> > > ".
> > > > > What it does try to do is to define a standard syntax, so that the user
> > > > > doesn't have to know any driver specific options.
> > > > >
> > > > > However, the mac address is changeable, leaving this naming inconsistent.
> > > > > Well, in practice, PCI id is also changeable.
> > > 
> > > > >
> > > 
> > > > > OTOH, having a consistent naming seems a bit far away from this patch's
> > > > > goal: define a standard ethdev naming and leave less harassment to the
> > > users.
> > > 
> > > >
> > > 
> > > > I'm not a fan of the MAC naming scheme either, a kind of per-device physical
> > > > port index seems more robust and doesn't require much initialization to
> > > > determine how many ports are supported by the device and whether the index is
> > > > known/valid (particularly given the vast majority exposes only one per bus
> > > > address).
> > 
> > I'm not a fan of the MAC naming, neither. The reason this patch proposes mac
> > naming is that it's more clear for the user to specify a port. I also agree
> > that the port index could be another good option.
> > 
> > One thing I really haven't considered yet is how it becomes when the VF
> > reprenstor comes to play? (more guys are cc'ed).

Won't VFs come through a separate PCI bus address anyway? Not sure extra
care is needed for those.

> > > > Would it make sense to have this name usable unmodified as a valid device
> > > > (-w) argument, including parameters?
> > > 
> > > Yes we must provide some new key/value arguments for devargs.
> > > So it can be used anywhere, including -w/-b options in DPDK
> > > or port configuration in OVS.
> > >  
> > > 
> > > > If so, PMDs could append parameters while probing the underlying device, by
> > > > appending ",port=x", ",mac=x" followed by other unspecified parameters with
> > > > default values. This could uniquely identify the port _and_ its
> > > > configuration in a reusable fashion.
> > > 
> > >  
> > > 
> > > Question: should we separate device identification and configuration
> > > in the syntax?
> > 
> > Yes, we should. I think we should parse the identification generically,
> > and leave the left to the driver.
> 
> So we must take care of a syntax to separate them.
> For instance, MAC can be used for identification or configuration (changing
> the MAC by user input).

Is it worth the hassle though? Simply documenting that "port" and/or "mac"
devargs are special and interpreted by both ethdev and the PMD to pinpoint
the exact port(s) to use, right? (although I don't recommend "mac" for the
above reasons)

Keep in mind that with mlx4, the port argument can be provided multiple
times to enable them all and generate several ethdev out of a single device
argument. Each of them should retain a single "port=X" value in their name.

> > > > Otherwise if all we need is unique names, we can use the opposite and much
> > > > simpler approach. Let librte_ether assign them sequentially
> > > > (e.g. "rte_eth%d", no need for consistency with OS netdevices), applications
> > > > can figure the rest based on data structures if needed.
> > > 
> > > No, unique names are not useful / not usable by users.
> > 
> > Agree. We don't really need another unique name, since the port id is
> > already unique.
> > 
> > However, I think a consistent naming might be useful: user doesn't have
> > to worry it even though MAC is changed or the PCI slot is changed.

Contrary to MAC, PCI bus addresses are unlikely to change from within DPDK,
there are no devops to modify them. Such an event would be triggered by
external means and should be handled through the hot-plug subsystem. The
same goes for a physical port ID in my opinion.

I think we can only rely on this kind of non-configurable properties.
  
Yuanhan Liu Dec. 6, 2017, 3:49 p.m. UTC | #12
On Tue, Dec 05, 2017 at 06:22:05PM +0100, Adrien Mazarguil wrote:
> > > > > Just for information, this "port=x" argument in mlx4 is consistent with the
> > > > > value found in /sys/class/net/ethX/dev_port under Linux. If we choose to use
> > > > > a port index (instead of a MAC or something else), it would make sense to
> > > > > standardize it on the same order as given by the host OS for consistency
> > > > > across all PMDs.
> > > 
> > > Thanks for the info.
> > > 
> > > But FYI, for most of other PMDs, such sys file won't exist, as the host
> > > driver should have been unbind and bind with something like uio. So I
> > > don't think it applies to all other nics.
> 
> Sure, I only meant PMDs must implement the same numbering scheme the kernel
> driver they replace would have used (as exposed through dev_port) for
> consistency. Note dev_port is always present since Linux 3.15, even with
> single port/bus address devices, so applications that want to construct
> -w/-b arguments can rely on that before unbinding devices.

I don't think that's a clean way. Fortunate enough though, even we
want to use the port as one of the key for identification, we don't
really need that in most cases. So, just like the mac proposed here,
we could (and probably should) make it optional.

[...]
> > > I'm not a fan of the MAC naming, neither. The reason this patch proposes mac
> > > naming is that it's more clear for the user to specify a port. I also agree
> > > that the port index could be another good option.
> > > 
> > > One thing I really haven't considered yet is how it becomes when the VF
> > > reprenstor comes to play? (more guys are cc'ed).
> 
> Won't VFs come through a separate PCI bus address anyway? Not sure extra
> care is needed for those.

Yes, for VF. But I was talking about VF representors [1]. Note that the
interface is not settled down yet, but it's likely we need some interfaces
(or more precisely, some concepts) like that. In that proposal, several
ports could share a same vdev name, while each one is differenced by a
vport_id. Again, it's not settled down yet, it might be something else
in the end, say all share a same (PF) PCI id, while each one is identified
by something else, etc.

[1]: http://dpdk.org/ml/archives/dev/2017-November/080946.html

	--yliu
  
Thomas Monjalon Dec. 18, 2017, 10:25 p.m. UTC | #13
05/12/2017 14:20, Thomas Monjalon:
> 05/12/2017 12:04, Adrien Mazarguil:
> > Just for information, this "port=x" argument in mlx4 is consistent with the
> > value found in /sys/class/net/ethX/dev_port under Linux. If we choose to use
> > a port index (instead of a MAC or something else), it would make sense to
> > standardize it on the same order as given by the host OS for consistency
> > across all PMDs.
> 
> Good idea, thanks.

dev_port is implemented in Linux for few devices:

% git grep -l '\<dev_port\>' drivers/net/ethernet
drivers/net/ethernet/broadcom/bnxt/bnxt.c
drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
drivers/net/ethernet/chelsio/cxgb4vf/cxgb4vf_main.c
drivers/net/ethernet/intel/i40e/i40e_fcoe.c
drivers/net/ethernet/mellanox/mlx4/en_netdev.c
  
Stephen Hemminger Dec. 18, 2017, 10:30 p.m. UTC | #14
On Mon, 18 Dec 2017 23:25:37 +0100
Thomas Monjalon <thomas@monjalon.net> wrote:

> 05/12/2017 14:20, Thomas Monjalon:
> > 05/12/2017 12:04, Adrien Mazarguil:  
> > > Just for information, this "port=x" argument in mlx4 is consistent with the
> > > value found in /sys/class/net/ethX/dev_port under Linux. If we choose to use
> > > a port index (instead of a MAC or something else), it would make sense to
> > > standardize it on the same order as given by the host OS for consistency
> > > across all PMDs.  
> > 
> > Good idea, thanks.  
> 
> dev_port is implemented in Linux for few devices:
> 
> % git grep -l '\<dev_port\>' drivers/net/ethernet
> drivers/net/ethernet/broadcom/bnxt/bnxt.c
> drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
> drivers/net/ethernet/chelsio/cxgb4vf/cxgb4vf_main.c
> drivers/net/ethernet/intel/i40e/i40e_fcoe.c
> drivers/net/ethernet/mellanox/mlx4/en_netdev.c
> 

port is used to distinguish multiple physical ports on a single card.
It is not global like the DPDK port.
  
Thomas Monjalon Dec. 18, 2017, 10:41 p.m. UTC | #15
18/12/2017 23:30, Stephen Hemminger:
> On Mon, 18 Dec 2017 23:25:37 +0100
> Thomas Monjalon <thomas@monjalon.net> wrote:
> 
> > 05/12/2017 14:20, Thomas Monjalon:
> > > 05/12/2017 12:04, Adrien Mazarguil:  
> > > > Just for information, this "port=x" argument in mlx4 is consistent with the
> > > > value found in /sys/class/net/ethX/dev_port under Linux. If we choose to use
> > > > a port index (instead of a MAC or something else), it would make sense to
> > > > standardize it on the same order as given by the host OS for consistency
> > > > across all PMDs.  
> > > 
> > > Good idea, thanks.  
> > 
> > dev_port is implemented in Linux for few devices:
> > 
> > % git grep -l '\<dev_port\>' drivers/net/ethernet
> > drivers/net/ethernet/broadcom/bnxt/bnxt.c
> > drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
> > drivers/net/ethernet/chelsio/cxgb4vf/cxgb4vf_main.c
> > drivers/net/ethernet/intel/i40e/i40e_fcoe.c
> > drivers/net/ethernet/mellanox/mlx4/en_netdev.c
> > 
> 
> port is used to distinguish multiple physical ports on a single card.
> It is not global like the DPDK port.

Yes, it is an additional property helpful to identify devices.
  
Thomas Monjalon Dec. 18, 2017, 11:05 p.m. UTC | #16
Let's summarize and resume this thread.

We need a generic syntax to describe a device.
This syntax can be used
	- before initializing the device (i.e. whitelist/blacklist)
	- or after the initialization (e.g. user config)

We need to answer 4 questions:
1/ what are the separators (comma, colon, etc)?
2/ how to distinguish a device identification from a configuration?
3/ what are the mandatory parts?
4/ what can be the optional properties?

30/11/2017 08:35, Yuanhan Liu:
> What this patch proposes is to use "name[,mac]" syntax. "name" is the
> PCI id for pci device. For vdev, it's the vdev name given by user. The
> reason "mac" is needed is for some devices (say ConnectX-3), 2 ports
> (in a single NIC) have the same PCI id.

Based on the feedbacks we had, I suggest a syntax where everything is
optional key/value pairs, and split in 3 categories:
	- bus (pci, vdev, vmbus, fslmc, etc)
	- class (eth, crypto)
	- driver (i40e, mlx5, virtio, etc)

Between categories, the separator is a slash.
Inside a category, the separator is a comma.
Inside a key/value pair, the separator is an equal sign.

It may look like this:
bus=BUS_NAME,id=BUS_ID/class=CLASS_NAME,dev_port=PORT_NUM,mac=MAC_ADDRESS/driver=DRIVER_NAME,driverspecificproperty=VALUE

A device is identified when every properties are matched.
Before device is probed, only the bus category is relevant.
For the simple PCI whitelist, it means moving from
	-w 0000:01:00.0
to
	-w bus=pci,id=0000:01:00.0

It is possible to mix some settings in these devargs syntax if the keys
are differents. Example: mac= is for identification by MAC, whereas
newmac= would be for specifying a MAC address to set.

Agreement?
  
Thomas Monjalon Dec. 20, 2017, 10:02 p.m. UTC | #17
Changing the title and adding more comments inline:

19/12/2017 00:05, Thomas Monjalon:
> Let's summarize and resume this thread.
> 
> We need a generic syntax to describe a device.
> This syntax can be used
> 	- before initializing the device (i.e. whitelist/blacklist)
> 	- or after the initialization (e.g. user config)
> 
> We need to answer 4 questions:
> 1/ what are the separators (comma, colon, etc)?
> 2/ how to distinguish a device identification from a configuration?
> 3/ what are the mandatory parts?
> 4/ what can be the optional properties?
> 
> 30/11/2017 08:35, Yuanhan Liu:
> > What this patch proposes is to use "name[,mac]" syntax. "name" is the
> > PCI id for pci device. For vdev, it's the vdev name given by user. The
> > reason "mac" is needed is for some devices (say ConnectX-3), 2 ports
> > (in a single NIC) have the same PCI id.
> 
> Based on the feedbacks we had, I suggest a syntax where everything is
> optional key/value pairs, and split in 3 categories:
> 	- bus (pci, vdev, vmbus, fslmc, etc)
> 	- class (eth, crypto)
> 	- driver (i40e, mlx5, virtio, etc)

The key/value pair describing the category scope is mandatory
and must be the first pair in the category properties.
Example: bus=pci, must be placed before id=0000:01:00.0

> Between categories, the separator is a slash.
> Inside a category, the separator is a comma.
> Inside a key/value pair, the separator is an equal sign.
> 
> It may look like this:
> bus=BUS_NAME,id=BUS_ID/class=CLASS_NAME,dev_port=PORT_NUM,mac=MAC_ADDRESS/driver=DRIVER_NAME,driverspecificproperty=VALUE
> 
> A device is identified when every properties are matched.
> Before device is probed, only the bus category is relevant.
> For the simple PCI whitelist, it means moving from
> 	-w 0000:01:00.0
> to
> 	-w bus=pci,id=0000:01:00.0
> 
> It is possible to mix some settings in these devargs syntax if the keys
> are differents. Example: mac= is for identification by MAC, whereas
> newmac= would be for specifying a MAC address to set.
> 
> Agreement?

Yuanhan is proposing to use this syntax in OVS option dpdk-devargs:
https://mail.openvswitch.org/pipermail/ovs-dev/2017-December/342273.html

Please, any feedback or approval that this syntax is good?
  
Shreyansh Jain Dec. 22, 2017, 7:01 a.m. UTC | #18
On Thursday 21 December 2017 03:32 AM, Thomas Monjalon wrote:
> Changing the title and adding more comments inline:
> 
> 19/12/2017 00:05, Thomas Monjalon:
>> Let's summarize and resume this thread.
>>
>> We need a generic syntax to describe a device.
>> This syntax can be used
>> 	- before initializing the device (i.e. whitelist/blacklist)
>> 	- or after the initialization (e.g. user config)
>>
>> We need to answer 4 questions:
>> 1/ what are the separators (comma, colon, etc)?
>> 2/ how to distinguish a device identification from a configuration?
>> 3/ what are the mandatory parts?
>> 4/ what can be the optional properties?
>>
>> 30/11/2017 08:35, Yuanhan Liu:
>>> What this patch proposes is to use "name[,mac]" syntax. "name" is the
>>> PCI id for pci device. For vdev, it's the vdev name given by user. The
>>> reason "mac" is needed is for some devices (say ConnectX-3), 2 ports
>>> (in a single NIC) have the same PCI id.
>>
>> Based on the feedbacks we had, I suggest a syntax where everything is
>> optional key/value pairs, and split in 3 categories:
>> 	- bus (pci, vdev, vmbus, fslmc, etc)
>> 	- class (eth, crypto)
>> 	- driver (i40e, mlx5, virtio, etc)
> 
> The key/value pair describing the category scope is mandatory
> and must be the first pair in the category properties.
> Example: bus=pci, must be placed before id=0000:01:00.0
> 
>> Between categories, the separator is a slash.

Why is a '/' required as a separator? Are you expecting the key in 
key,value pair to duplicate across categories?

>> Inside a category, the separator is a comma.
>> Inside a key/value pair, the separator is an equal sign.

sounds reasonable to me.

>>
>> It may look like this:
>> bus=BUS_NAME,id=BUS_ID/class=CLASS_NAME,dev_port=PORT_NUM,mac=MAC_ADDRESS/driver=DRIVER_NAME,driverspecificproperty=VALUE

If I take cue from fslmc and dpaa bus:

for fslmc: bus=fslmc,id=dpni.1
            bus=fslmc,id=dpsec.1
for dpaa: bus=dpaa,id=fm1-mac1
           bus=dpaa,id=dpaa-sec1

So, at least from fslmc/dpaa perspective, above fits.

Just want to highlight: in some cases the device names can contain ',' - 
but then, that can be handled at bus scan level.
For dpaa bus, the device identified from platform identifiers contains a 
longer ',' separated string - which is then stripped to form a name like 
'fm1-mac1' before being added to device->name.

>>
>> A device is identified when every properties are matched.

So, a ovs-dpdk user would have to know a dpdk bus to identify a device 
to plug into a OVS bridge?

>> Before device is probed, only the bus category is relevant.
>> For the simple PCI whitelist, it means moving from
>> 	-w 0000:01:00.0
>> to
>> 	-w bus=pci,id=0000:01:00.0

OK

>>
>> It is possible to mix some settings in these devargs syntax if the keys
>> are differents. Example: mac= is for identification by MAC, whereas
>> newmac= would be for specifying a MAC address to set.
>>
>> Agreement?

in principle, yes.
just need clarity on '/' as a separator.

> 
> Yuanhan is proposing to use this syntax in OVS option dpdk-devargs:
> https://mail.openvswitch.org/pipermail/ovs-dev/2017-December/342273.html
> 
> Please, any feedback or approval that this syntax is good?
>
  
Thomas Monjalon Dec. 22, 2017, 9 a.m. UTC | #19
22/12/2017 08:01, Shreyansh Jain:
> On Thursday 21 December 2017 03:32 AM, Thomas Monjalon wrote:
> > Changing the title and adding more comments inline:
> > 
> > 19/12/2017 00:05, Thomas Monjalon:
> >> Let's summarize and resume this thread.
> >>
> >> We need a generic syntax to describe a device.
> >> This syntax can be used
> >> 	- before initializing the device (i.e. whitelist/blacklist)
> >> 	- or after the initialization (e.g. user config)
> >>
> >> We need to answer 4 questions:
> >> 1/ what are the separators (comma, colon, etc)?
> >> 2/ how to distinguish a device identification from a configuration?
> >> 3/ what are the mandatory parts?
> >> 4/ what can be the optional properties?
> >>
> >> 30/11/2017 08:35, Yuanhan Liu:
> >>> What this patch proposes is to use "name[,mac]" syntax. "name" is the
> >>> PCI id for pci device. For vdev, it's the vdev name given by user. The
> >>> reason "mac" is needed is for some devices (say ConnectX-3), 2 ports
> >>> (in a single NIC) have the same PCI id.
> >>
> >> Based on the feedbacks we had, I suggest a syntax where everything is
> >> optional key/value pairs, and split in 3 categories:
> >> 	- bus (pci, vdev, vmbus, fslmc, etc)
> >> 	- class (eth, crypto)
> >> 	- driver (i40e, mlx5, virtio, etc)
> > 
> > The key/value pair describing the category scope is mandatory
> > and must be the first pair in the category properties.
> > Example: bus=pci, must be placed before id=0000:01:00.0
> > 
> >> Between categories, the separator is a slash.
> 
> Why is a '/' required as a separator? Are you expecting the key in 
> key,value pair to duplicate across categories?

We need to separate categories because each category is parsed by
a different parser.
The first level parser will get strings for each category and will
call the appropriate parser (by going through different levels of
bus parsers for bus and driver categories).

> >> Inside a category, the separator is a comma.
> >> Inside a key/value pair, the separator is an equal sign.
> 
> sounds reasonable to me.
> 
> >>
> >> It may look like this:
> >> bus=BUS_NAME,id=BUS_ID/class=CLASS_NAME,dev_port=PORT_NUM,mac=MAC_ADDRESS/driver=DRIVER_NAME,driverspecificproperty=VALUE
> 
> If I take cue from fslmc and dpaa bus:
> 
> for fslmc: bus=fslmc,id=dpni.1
>             bus=fslmc,id=dpsec.1
> for dpaa: bus=dpaa,id=fm1-mac1
>            bus=dpaa,id=dpaa-sec1
> 
> So, at least from fslmc/dpaa perspective, above fits.
> 
> Just want to highlight: in some cases the device names can contain ',' - 
> but then, that can be handled at bus scan level.
> For dpaa bus, the device identified from platform identifiers contains a 
> longer ',' separated string - which is then stripped to form a name like 
> 'fm1-mac1' before being added to device->name.
> 
> >>
> >> A device is identified when every properties are matched.
> 
> So, a ovs-dpdk user would have to know a dpdk bus to identify a device 
> to plug into a OVS bridge?

Yes, the user must know the bus if using a bus id.
But he can use another kind of id like the MAC address.

> >> Before device is probed, only the bus category is relevant.
> >> For the simple PCI whitelist, it means moving from
> >> 	-w 0000:01:00.0
> >> to
> >> 	-w bus=pci,id=0000:01:00.0
> 
> OK
> 
> >> It is possible to mix some settings in these devargs syntax if the keys
> >> are differents. Example: mac= is for identification by MAC, whereas
> >> newmac= would be for specifying a MAC address to set.
> >>
> >> Agreement?
> 
> in principle, yes.
> just need clarity on '/' as a separator.

Thanks for reviewing.
  
Finn Christensen Jan. 5, 2018, 7:52 a.m. UTC | #20
-----Original Message-----
    From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Thomas
    Monjalon
    Sent: 20. december 2017 23:03
    To: dev@dpdk.org
    Cc: Yuanhan Liu <yliu@fridaylinux.org>; Adrien Mazarguil
    <adrien.mazarguil@6wind.com>; Ciara Loftus <ciara.loftus@intel.com>;
    Kevin Traynor <ktraynor@redhat.com>; stephen@networkplumber.org;
    ferruh.yigit@intel.com
    Subject: Re: [dpdk-dev] standardize device identification
    
    Changing the title and adding more comments inline:
    
    19/12/2017 00:05, Thomas Monjalon:
    > Let's summarize and resume this thread.
    >
    > We need a generic syntax to describe a device.
    > This syntax can be used
    > 	- before initializing the device (i.e. whitelist/blacklist)
    > 	- or after the initialization (e.g. user config)
    >
    > We need to answer 4 questions:
    > 1/ what are the separators (comma, colon, etc)?
    > 2/ how to distinguish a device identification from a configuration?
    > 3/ what are the mandatory parts?
    > 4/ what can be the optional properties?
    >
    > 30/11/2017 08:35, Yuanhan Liu:
    > > What this patch proposes is to use "name[,mac]" syntax. "name" is
    > > the PCI id for pci device. For vdev, it's the vdev name given by
    > > user. The reason "mac" is needed is for some devices (say
    > > ConnectX-3), 2 ports (in a single NIC) have the same PCI id.
    >
    > Based on the feedbacks we had, I suggest a syntax where everything is
    > optional key/value pairs, and split in 3 categories:
    > 	- bus (pci, vdev, vmbus, fslmc, etc)
    > 	- class (eth, crypto)
    > 	- driver (i40e, mlx5, virtio, etc)
    
    The key/value pair describing the category scope is mandatory and must
    be the first pair in the category properties.
    Example: bus=pci, must be placed before id=0000:01:00.0
    
    > Between categories, the separator is a slash.
    > Inside a category, the separator is a comma.
    > Inside a key/value pair, the separator is an equal sign.
    >
    > It may look like this:
    >
    bus=BUS_NAME,id=BUS_ID/class=CLASS_NAME,dev_port=PORT_NUM,m
    ac=MAC_ADDR
    > ESS/driver=DRIVER_NAME,driverspecificproperty=VALUE
    >
    > A device is identified when every properties are matched.
    > Before device is probed, only the bus category is relevant.
    > For the simple PCI whitelist, it means moving from
    > 	-w 0000:01:00.0
    > to
    > 	-w bus=pci,id=0000:01:00.0
    >
    > It is possible to mix some settings in these devargs syntax if the
    > keys are differents. Example: mac= is for identification by MAC,
    > whereas newmac= would be for specifying a MAC address to set.
    >
    > Agreement?
    
 
We also need to distinguish between multiple ports sitting on same PCI bus ID.
and from our point of view, this will fully cover our needs.
Thanks - great proposal.

Regards,
Finn Christensen, Napatech


    Yuanhan is proposing to use this syntax in OVS option dpdk-devargs:
    https://mail.openvswitch.org/pipermail/ovs-dev/2017-
    December/342273.html
    
    Please, any feedback or approval that this syntax is good?
  
Thomas Monjalon Jan. 5, 2018, 8:39 a.m. UTC | #21
05/01/2018 08:52, Finn Christensen:
> From: Thomas
> > It may look like this:
> >
> > bus=BUS_NAME,id=BUS_ID/class=CLASS_NAME,dev_port=PORT_NUM,
> > mac=MAC_ADDRESS/driver=DRIVER_NAME,driverspecificproperty=VALUE
[...] 
> We also need to distinguish between multiple ports sitting on same PCI bus ID.
> and from our point of view, this will fully cover our needs.

Which property can help to distinguish Napatech ports?
Can you use class=eth,dev_port=X ?
The dev_port property will use /sys/class/net/DEV/dev_port
on Linux. Is it OK for you?
  
Finn Christensen Jan. 5, 2018, 11:09 a.m. UTC | #22
-----Original Message-----
    From: Thomas Monjalon [mailto:thomas@monjalon.net]
    Sent: 5. januar 2018 09:40
    To: Finn Christensen <fc@napatech.com>
    Cc: dev@dpdk.org; Yuanhan Liu <yliu@fridaylinux.org>; Adrien Mazarguil
    <adrien.mazarguil@6wind.com>; Ciara Loftus <ciara.loftus@intel.com>;
    Kevin Traynor <ktraynor@redhat.com>; stephen@networkplumber.org;
    ferruh.yigit@intel.com
    Subject: Re: [dpdk-dev] standardize device identification
    
    05/01/2018 08:52, Finn Christensen:
    > From: Thomas
    > > It may look like this:
    > >
    > >
    bus=BUS_NAME,id=BUS_ID/class=CLASS_NAME,dev_port=PORT_NUM,
    > >
    mac=MAC_ADDRESS/driver=DRIVER_NAME,driverspecificproperty=VALUE
    [...]
    > We also need to distinguish between multiple ports sitting on same PCI
    bus ID.
    > and from our point of view, this will fully cover our needs.
    
    Which property can help to distinguish Napatech ports?
    Can you use class=eth,dev_port=X ?
    The dev_port property will use /sys/class/net/DEV/dev_port on Linux. Is it
    OK for you?

Actually, what we were thinking of was using the mac property in the class 
category to distinguish our ports.
For instance: 
	-w bus=pci,id=0000:01:00.0/class=eth,mac=00:11:22:33:44:55
or simply:
	-w class=eth,mac=00:11:22:33:44:55

We will not be able to support the dev_port property, that will not work for us.
At least not for now.

Regards,
Finn
  
Thomas Monjalon Jan. 5, 2018, 12:01 p.m. UTC | #23
05/01/2018 12:09, Finn Christensen:
> From: Thomas Monjalon
>     Which property can help to distinguish Napatech ports?
>     Can you use class=eth,dev_port=X ?
>     The dev_port property will use /sys/class/net/DEV/dev_port on Linux. Is it
>     OK for you?
> 
> Actually, what we were thinking of was using the mac property in the class
> category to distinguish our ports.
> For instance:
> -w bus=pci,id=0000:01:00.0/class=eth,mac=00:11:22:33:44:55
> or simply:
> -w class=eth,mac=00:11:22:33:44:55

The problem with the mac property is that it cannot be used for
white/blacklisting in DPDK because the MAC is not known before
port initialization.

> We will not be able to support the dev_port property, that will not work for us.
> At least not for now.
  
Finn Christensen Jan. 5, 2018, 2:14 p.m. UTC | #24
>-----Original Message-----
>From: Thomas Monjalon [mailto:thomas@monjalon.net]
>Sent: 5. januar 2018 13:02
>To: Finn Christensen <fc@napatech.com>
>Cc: dev@dpdk.org; Yuanhan Liu <yliu@fridaylinux.org>; Adrien Mazarguil
><adrien.mazarguil@6wind.com>; Ciara Loftus <ciara.loftus@intel.com>; Kevin
>Traynor <ktraynor@redhat.com>; stephen@networkplumber.org;
>ferruh.yigit@intel.com
>Subject: Re: [dpdk-dev] standardize device identification
>
>05/01/2018 12:09, Finn Christensen:
>> From: Thomas Monjalon
>>     Which property can help to distinguish Napatech ports?
>>     Can you use class=eth,dev_port=X ?
>>     The dev_port property will use /sys/class/net/DEV/dev_port on Linux. Is it
>>     OK for you?
>>
>> Actually, what we were thinking of was using the mac property in the
>> class category to distinguish our ports.
>> For instance:
>> -w bus=pci,id=0000:01:00.0/class=eth,mac=00:11:22:33:44:55
>> or simply:
>> -w class=eth,mac=00:11:22:33:44:55
>
>The problem with the mac property is that it cannot be used for
>white/blacklisting in DPDK because the MAC is not known before port
>initialization.
>

Sure, that makes sense. I just for a minute thought that we could use that 
mechanism to enable individual ports at startup also. We will continue to 
use proprietary devargs passed by whiterlist to the PMD probe function.
What we needed was a way to select the individual ports, by using 
rte_eth_dev_get_port_by_name(). 


>> We will not be able to support the dev_port property, that will not work for
>us.
>> At least not for now.
  
Thomas Monjalon Jan. 5, 2018, 3:34 p.m. UTC | #25
05/01/2018 15:14, Finn Christensen:
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> >05/01/2018 12:09, Finn Christensen:
> >> From: Thomas Monjalon
> >>     Which property can help to distinguish Napatech ports?
> >>     Can you use class=eth,dev_port=X ?
> >>     The dev_port property will use /sys/class/net/DEV/dev_port on Linux. Is it
> >>     OK for you?
> >>
> >> Actually, what we were thinking of was using the mac property in the
> >> class category to distinguish our ports.
> >> For instance:
> >> -w bus=pci,id=0000:01:00.0/class=eth,mac=00:11:22:33:44:55
> >> or simply:
> >> -w class=eth,mac=00:11:22:33:44:55
> >
> >The problem with the mac property is that it cannot be used for
> >white/blacklisting in DPDK because the MAC is not known before port
> >initialization.
> >
> 
> Sure, that makes sense. I just for a minute thought that we could use that
> mechanism to enable individual ports at startup also. We will continue to
> use proprietary devargs passed by whiterlist to the PMD probe function.
> What we needed was a way to select the individual ports, by using
> rte_eth_dev_get_port_by_name().

The whitelist will be replaced by this new syntax.
And yes, you can have your own driver-specific property with this syntax.

> >> We will not be able to support the dev_port property, that will not work for
> >us.
> >> At least not for now.

It leads to a totally different question:
Can we discuss again how to integrate your driver in DPDK upstream?
Please explain again your situation in a new thread.
  
Finn Christensen Jan. 5, 2018, 8:32 p.m. UTC | #26
>-----Original Message-----
>From: Thomas Monjalon [mailto:thomas@monjalon.net]
>Sent: 5. januar 2018 16:34
>To: Finn Christensen <fc@napatech.com>
>Cc: dev@dpdk.org; Yuanhan Liu <yliu@fridaylinux.org>; Adrien Mazarguil
><adrien.mazarguil@6wind.com>; Ciara Loftus <ciara.loftus@intel.com>; Kevin
>Traynor <ktraynor@redhat.com>; stephen@networkplumber.org;
>ferruh.yigit@intel.com
>Subject: Re: [dpdk-dev] standardize device identification
>
>05/01/2018 15:14, Finn Christensen:
>> From: Thomas Monjalon [mailto:thomas@monjalon.net]
>> >05/01/2018 12:09, Finn Christensen:
>> >> From: Thomas Monjalon
>> >>     Which property can help to distinguish Napatech ports?
>> >>     Can you use class=eth,dev_port=X ?
>> >>     The dev_port property will use /sys/class/net/DEV/dev_port on Linux.
>Is it
>> >>     OK for you?
>> >>
>> >> Actually, what we were thinking of was using the mac property in
>> >> the class category to distinguish our ports.
>> >> For instance:
>> >> -w bus=pci,id=0000:01:00.0/class=eth,mac=00:11:22:33:44:55
>> >> or simply:
>> >> -w class=eth,mac=00:11:22:33:44:55
>> >
>> >The problem with the mac property is that it cannot be used for
>> >white/blacklisting in DPDK because the MAC is not known before port
>> >initialization.
>> >
>>
>> Sure, that makes sense. I just for a minute thought that we could use
>> that mechanism to enable individual ports at startup also. We will
>> continue to use proprietary devargs passed by whiterlist to the PMD probe
>function.
>> What we needed was a way to select the individual ports, by using
>> rte_eth_dev_get_port_by_name().
>
>The whitelist will be replaced by this new syntax.
>And yes, you can have your own driver-specific property with this syntax.

That's fine.

>
>> >> We will not be able to support the dev_port property, that will not
>> >> work for
>> >us.
>> >> At least not for now.
>
>It leads to a totally different question:
>Can we discuss again how to integrate your driver in DPDK upstream?
>Please explain again your situation in a new thread.

Sure, I'll get back to you in a new thread. Thanks!
  
Ferruh Yigit Jan. 16, 2018, 8:20 p.m. UTC | #27
On 11/30/2017 7:35 AM, Yuanhan Liu wrote:
> The ethdev name is taken from the "name" parameter from the helper
> function rte_eth_dev_allocate(name). The name is given by the caller,
> thus, there is no way to guarantee all the callers will follow the
> same syntax, leaving us the port name is not standardized.
> 
> For example, for all ports probed by the generic pci probe function,
> the name is set to the PCI id. For all others, it's set by the caller,
> aka, the PMD driver. Taking mlx PMD driver as the example, it's set
> to something like "mlx5_0 port 0".
> 
> Unfortunately, ovs-dpdk uses such name for referencing a specific port.
> Since there is no standard, user has to figure out what is the right
> name for the nic he is using. That adds extra (unnecessary) obstruction
> to users.

Can you please give more details about the ovs-dpdk usage?

I assume ovs-dpdk creates some virtual/physical devices while started, and
during switch configuration it need to access those devices and this is done by
getting device by name. That is why need to know the device name.

If above assumption correct, can supporting a generic id=x devargs help? Ethdev
can guarantie that unique ids provided are unique and provides API to get device
by id.
And in ovs-dpdk can use those ids directly, can this work?

> 
> Thus, the name should be standardized. We should give user a consistent
> interface for finding a specific port.
> 
> What this patch proposes is to use "name[,mac]" syntax. "name" is the
> PCI id for pci device. For vdev, it's the vdev name given by user. The
> reason "mac" is needed is for some devices (say ConnectX-3), 2 ports
> (in a single NIC) have the same PCI id.
> 
> There are 2 reasons I didn't make "mac" mandatory:
> - it keeps the compatibility
> - in most cases, the pci id is good enough to identify a port
> 
> However, while writing this commit log, I think it might be better to
> use something like UUID for standardizing the port name. This way, we
> will always have a very consistent naming, no matter whether it's PCI
> device or vdev device and whether a PCI devices has 2 ports share the
> same PIC id, or something we have considered so far (say, few ports
> sharing same PCI and mac address :/).
> 
> It's also simpler and cleaner. The only drawback is such ID is meaningless
> to human.
> 
> Please also note that this patch just comes up with an API to query
> a port from standard name suggested above. The ethdev name isn't really
> standardized here. This patch is asking for comments after all.
> 
> Thoughts?
> 
> Cc: Thomas Monjalon <thomas@monjalon.net>
> Cc: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> Cc: Ciara Loftus <ciara.loftus@intel.com>
> Cc: Kevin Traynor <ktraynor@redhat.com>
> Signed-off-by: Yuanhan Liu <yliu@fridaylinux.org>
> ---
>  lib/librte_ether/rte_ethdev.c | 61 +++++++++++++++++++++++++++++++++++++++++++
>  lib/librte_ether/rte_ethdev.h | 28 ++++++++++++++++++++
>  2 files changed, 89 insertions(+)
> 
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> index 318af28..acf29e7 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -362,6 +362,67 @@ rte_eth_dev_get_port_by_name(const char *name, uint16_t *port_id)
>  	return -ENODEV;
>  }
>  
> +static int
> +str_to_ether(const char *mac, struct ether_addr *ea)
> +{
> +	unsigned int bytes[6];
> +	int i;
> +
> +	if (sscanf(mac, "%x:%x:%x:%x:%x:%x",
> +		   &bytes[0], &bytes[1], &bytes[2],
> +		   &bytes[3], &bytes[4], &bytes[5]) != 6) {
> +		return -1;
> +	}
> +
> +	for (i = 0; i < 6; i++)
> +		ea->addr_bytes[i] = bytes[i];
> +
> +	return 0;
> +}
> +
> +int
> +rte_eth_dev_get_port(const char *devarg, uint16_t *port_id)
> +{
> +	int i;
> +	char *name;
> +	char *mac;
> +	struct ether_addr ea;
> +	int nr_match = 0;
> +
> +	if (devarg == NULL)
> +		return -EINVAL;
> +
> +	name = strdup(devarg);
> +	mac  = strchr(name, ',');
> +	if (mac) {
> +		*mac++ = '\0';
> +		if (str_to_ether(mac, &ea) < 0)
> +			return -EINVAL;
> +	}
> +
> +	RTE_ETH_FOREACH_DEV(i) {
> +		if (strncmp(name, rte_eth_dev_data[i].name, strlen(name)) == 0) {
> +			*port_id = i;
> +
> +			/* if mac is given, mac has to be matched also */
> +			if (mac && !is_same_ether_addr(&ea,
> +					&rte_eth_dev_data[i].mac_addrs[0]))
> +				continue;
> +
> +			*port_id = i;
> +			nr_match += 1;
> +		}
> +	}
> +
> +	if (nr_match > 1) {
> +		RTE_LOG(ERR, EAL, "%d ports found with devarg: %s\n",
> +				nr_match, devarg);
> +		return -EINVAL;
> +	}
> +
> +	return -ENODEV;
> +}
> +
>  /* attach the new device, then store port_id of the device */
>  int
>  rte_eth_dev_attach(const char *devargs, uint16_t *port_id)
> diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> index 341c2d6..3e131b6 100644
> --- a/lib/librte_ether/rte_ethdev.h
> +++ b/lib/librte_ether/rte_ethdev.h
> @@ -4555,6 +4555,34 @@ int
>  rte_eth_dev_get_port_by_name(const char *name, uint16_t *port_id);
>  
>  /**
> +* Get the port id from the devarg
> +*
> +* The standard syntax for the devarg is "name[,mac]".
> +*
> +* For PCI device, "name" is the PCI id, for example, "0000:2:00.0".
> +* For vdev, it's the vdev name, for example, "net_pcap0".
> +*
> +* The "mac" is optional, as in most cases, the name (say, the PCI
> +* id) is good enough to identify a specific port. However, there
> +* are cases that multiple ports share a single PCI id. In such case,
> +* mac is needed to identify the specific port.
> +*
> +* If more than one port is found by the given devarg, -EINVAL will
> +* be returned. An error message will also be dumped.
> +*
> +* @param devarg
> +*   a standard devarg string
> +* @param port_id
> +*   pointer to port identifier of the device
> +* @return
> +*   - 0 if successful and port_id is filled.
> +*   - -ENODEV if no port found
> +*   - -EINVAL for invalid devarg
> +*/
> +int
> +rte_eth_dev_get_port(const char *devarg, uint16_t *port_id);
> +
> +/**
>  * Get the device name from port id
>  *
>  * @param port_id
>
  

Patch

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 318af28..acf29e7 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -362,6 +362,67 @@  rte_eth_dev_get_port_by_name(const char *name, uint16_t *port_id)
 	return -ENODEV;
 }
 
+static int
+str_to_ether(const char *mac, struct ether_addr *ea)
+{
+	unsigned int bytes[6];
+	int i;
+
+	if (sscanf(mac, "%x:%x:%x:%x:%x:%x",
+		   &bytes[0], &bytes[1], &bytes[2],
+		   &bytes[3], &bytes[4], &bytes[5]) != 6) {
+		return -1;
+	}
+
+	for (i = 0; i < 6; i++)
+		ea->addr_bytes[i] = bytes[i];
+
+	return 0;
+}
+
+int
+rte_eth_dev_get_port(const char *devarg, uint16_t *port_id)
+{
+	int i;
+	char *name;
+	char *mac;
+	struct ether_addr ea;
+	int nr_match = 0;
+
+	if (devarg == NULL)
+		return -EINVAL;
+
+	name = strdup(devarg);
+	mac  = strchr(name, ',');
+	if (mac) {
+		*mac++ = '\0';
+		if (str_to_ether(mac, &ea) < 0)
+			return -EINVAL;
+	}
+
+	RTE_ETH_FOREACH_DEV(i) {
+		if (strncmp(name, rte_eth_dev_data[i].name, strlen(name)) == 0) {
+			*port_id = i;
+
+			/* if mac is given, mac has to be matched also */
+			if (mac && !is_same_ether_addr(&ea,
+					&rte_eth_dev_data[i].mac_addrs[0]))
+				continue;
+
+			*port_id = i;
+			nr_match += 1;
+		}
+	}
+
+	if (nr_match > 1) {
+		RTE_LOG(ERR, EAL, "%d ports found with devarg: %s\n",
+				nr_match, devarg);
+		return -EINVAL;
+	}
+
+	return -ENODEV;
+}
+
 /* attach the new device, then store port_id of the device */
 int
 rte_eth_dev_attach(const char *devargs, uint16_t *port_id)
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 341c2d6..3e131b6 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -4555,6 +4555,34 @@  int
 rte_eth_dev_get_port_by_name(const char *name, uint16_t *port_id);
 
 /**
+* Get the port id from the devarg
+*
+* The standard syntax for the devarg is "name[,mac]".
+*
+* For PCI device, "name" is the PCI id, for example, "0000:2:00.0".
+* For vdev, it's the vdev name, for example, "net_pcap0".
+*
+* The "mac" is optional, as in most cases, the name (say, the PCI
+* id) is good enough to identify a specific port. However, there
+* are cases that multiple ports share a single PCI id. In such case,
+* mac is needed to identify the specific port.
+*
+* If more than one port is found by the given devarg, -EINVAL will
+* be returned. An error message will also be dumped.
+*
+* @param devarg
+*   a standard devarg string
+* @param port_id
+*   pointer to port identifier of the device
+* @return
+*   - 0 if successful and port_id is filled.
+*   - -ENODEV if no port found
+*   - -EINVAL for invalid devarg
+*/
+int
+rte_eth_dev_get_port(const char *devarg, uint16_t *port_id);
+
+/**
 * Get the device name from port id
 *
 * @param port_id