[dpdk-dev] rte_eth_dev_socket_id() vs KVM/AWS/...

Burakov, Anatoly anatoly.burakov at intel.com
Mon May 14 10:09:36 CEST 2018


On 09-May-18 6:08 PM, Mike Stolarchuk wrote:
> Hello Dpdk,
> 
> rte_eth_dev_socket_id() describes a -1 return value as:
> 
> *Returns*
> 
> The NUMA socket id to which the Ethernet device is connected or a default
> of zero if the socket could not be determined. -1 is returned if the
> port_id value is out of range.
> 
> But, rte_eth_dev_socket_id() is implemented as:
> 
> int
> rte_eth_dev_socket_id(uint16_t port_id)
> {
>      RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -1);
>      return rte_eth_devices[port_id].data->numa_node;
> }
> 
> And numa_node here is set from /sys/bus/pci/devices/<device>/numa_node.
> And https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-bus-pci
> documents numa_node as:
> 
> What: /sys/bus/pci/devices/.../numa_node
> Date: Oct 2014
> Contact: Prarit Bhargava <prarit at redhat.com>
> Description:
> This file contains the NUMA node to which the PCI device is
> attached, or -1 if the node is unknown.  The initial value
> comes from an ACPI _PXM method or a similar firmware
> source.  If that is missing or incorrect, this file can be
> written to override the node.  In that case, please report
> a firmware bug to the system vendor.  Writing to this file
> taints the kernel with TAINT_FIRMWARE_WORKAROUND, which
> reduces the supportability of your system.
> 
> In other words, a value of -1 for numa_node means the association of the
> PCI device with a socket is unknown.
> For example, in a KVM guest with e1000 NICs,
> /sys/bus/pci/devices/<d>/numa_node can return -1.
> 
> This means that rte_eth_dev_socket_id() returns -1 in situations other than
> 'port_id value is out of range'.
> And it's not possible to tell whether the port_id is invalid or whether
> the base system didn't announce the numa_node association.
> 
> Perhaps a -1 return value should indicate that the numa_node
> association isn't known, and a different return value, say -2, should
> indicate that the port_id value is out of range.
> 
> 
> mts.
> 

For cases like these, we have rte_errno - we could set it to EINVAL in
case of an invalid port_id, and e.g. ENODEV (?) when the NUMA node is
unknown.
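
A minimal sketch of that idea, to show how a caller could tell the two -1
cases apart. It does not use DPDK itself: demo_errno stands in for
rte_errno, and demo_ports / demo_eth_dev_socket_id() are hypothetical
names made up for illustration. The choice of ENODEV is the tentative one
from above, not an agreed-upon value.

```c
#include <errno.h>
#include <stdint.h>

#define DEMO_MAX_PORTS 4

/* Stand-in for rte_errno (hypothetical; the real one is per-lcore). */
static int demo_errno;

/* Hypothetical, simplified port table; not the real rte_eth_devices[]. */
struct demo_port {
    int attached;   /* non-zero if the port_id is valid */
    int numa_node;  /* value read from sysfs; -1 means "unknown" */
};

static struct demo_port demo_ports[DEMO_MAX_PORTS] = {
    { 1, 0 },   /* port 0: attached to NUMA node 0 */
    { 1, -1 },  /* port 1: sysfs reported -1, e.g. in a KVM guest */
};

/*
 * Returns the NUMA node, or -1 with demo_errno set:
 *   EINVAL - port_id is out of range or not attached
 *   ENODEV - port is valid but its NUMA node is unknown
 */
int
demo_eth_dev_socket_id(uint16_t port_id)
{
    if (port_id >= DEMO_MAX_PORTS || !demo_ports[port_id].attached) {
        demo_errno = EINVAL;
        return -1;
    }

    if (demo_ports[port_id].numa_node < 0) {
        demo_errno = ENODEV;
        return -1;
    }

    demo_errno = 0;
    return demo_ports[port_id].numa_node;
}
```

A caller that gets -1 back would then check the errno value: EINVAL means
the port_id was bad, ENODEV means the port exists and the application can
fall back to e.g. SOCKET_ID_ANY for its allocations.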

-- 
Thanks,
Anatoly

