[dpdk-dev] [PATCH v5 0/9] ethdev: support SubFunction representor

Andrew Rybchenko andrew.rybchenko at oktetlabs.ru
Fri Feb 5 10:37:12 CET 2021


On 2/5/21 12:13 PM, Xueming(Steven) Li wrote:
> 
>> -----Original Message-----
>> From: Andrew Rybchenko <andrew.rybchenko at oktetlabs.ru>
>> Sent: Friday, February 5, 2021 3:35 PM
>> To: Xueming(Steven) Li <xuemingl at nvidia.com>
>> Cc: dev at dpdk.org; Slava Ovsiienko <viacheslavo at nvidia.com>; Asaf Penso <asafp at nvidia.com>; Thomas Monjalon
>> <tmonjalon at nvidia.com>
>> Subject: Re: [dpdk-dev] [PATCH v5 0/9] ethdev: support SubFunction representor
>>
>> On 2/4/21 5:15 PM, Xueming(Steven) Li wrote:
>>>
>>>> -----Original Message-----
>>>> From: Andrew Rybchenko <andrew.rybchenko at oktetlabs.ru>
>>>> Sent: Monday, February 1, 2021 4:39 PM
>>>> To: Xueming(Steven) Li <xuemingl at nvidia.com>
>>>> Cc: dev at dpdk.org; Slava Ovsiienko <viacheslavo at nvidia.com>; Asaf
>>>> Penso <asafp at nvidia.com>
>>>> Subject: Re: [dpdk-dev] [PATCH v5 0/9] ethdev: support SubFunction
>>>> representor
>>>>
>>>> On 1/28/21 5:31 PM, Xueming(Steven) Li wrote:
>>>>> <snip>
>>>>>>> The patch of device SF capability, but seems I misunderstood your suggestion.
>>>>>>> Let me explain process to create a SF:
>>>>>>> 1. SF can be created on the fly with scripts, unlike VF which is statically pre-created.
>>>>>>
>>>>>> Is there a maximum index and maximum total number of SF's created? How to find it?
>>>>>
>>>>> The maximum index is defined by firmware configuration, all SF's
>>>>> information could be found from sysfs. To create a SF, both PCI and sfnum have to be specified.
>>>>
>>>> sysfs is obviously Linux specific. I think the information should be available via DPDK API.
>>>
>>> Yes, the new api discussed below should resolve this issue.
>>>
>>>>
>>>>>>
>>>>>>> 2. SF is created on a PF with a SF number. SF number is named per PF, different PF may have same SF number.
>>>>>>> 3. For standalone PF, hot plug to DPDK using "PF#_BDF,representor=sf#", no need to use pf#sf# here.
>>>>>>> 4. For bonding netdev, hot plug to DPDK using "PF0_BDF,representor=pf#sf#"
>>>>>>> If using new api to return all representor IDs, need some way
>>>>>>> locate the new created SF by PF and SF number, that's why "pf#sf#" is used in this patch set.
>>>>>>
>>>>>> I think the API should simply reserve/report space for maximum
>>>>>> number of SFs. So, IDs are stable across restart/reboot in
>>>>>> assumption that NIC is not reconfigured (changed maximum number of
>>>>>> VF or
>>>> maximum number of SFs of any PF).
>>>>>
>>>>> Yes, IDs should be stable as long as no  NIC firmware configuration change.
>>>>>
>>>>> Just clarify, this api should be common enough to report all devices that a bus device supports:
>>>>> 1. name, might contains controller and pf info, example: "eth:representor:c0pf1vf"
>>>>> 2. ID range, example: 0-127
>>>>> The api describes ID ranges for each sub device type, users have to query the api and choose representor ID to probe.
>>>>>
>>>>> Prototype:
>>>>> struct rte_bus_device_range {
>>>>> 	char name[64];
>>>>> 	uint32_t base;
>>>>> 	uint32_t number;
>>>>> }
>>>>> /* return number of ranges filled, or number of ranges if list is
>>>>> NULL. */ int rte_bus_ dev_range_get(struct rte_bus_device_range
>>>>> *list, int n);
>>>>
>>>> Hm, I thought about more port representor specific API.
>>>> For me it is hard to tell if such generic naming is good or bad. I
>>>> think it should be proven that such generic API makes sense. Any other potential users / use cases?
>>>
>>> I was thinking about SF, but SF is PCI specific, not suitable for this api. So I'm fine to make it as ethdev api.
>>> To append new api into eth_dev_ops, is there ABI concern?
>>
>> No, eth_dev_ops are internal
>>
>>>> I've considered ethdev API which returns (in similar way as
>>>> above) list of possible port representors which could be controlled
>>>> by the device. Also I think it would be useful to include type information (enum with PF, VF, SF), controller ID.
>>>
>>> Agree.
>>>
>>> There is a new concern from orchestration side, currently, no
>>> interface in openstack and OVS to retrieve representor ID range info,
>>> It will take time to adapt this solution. To probe a representor,
>>> orchestration need to know how to calculate representor ID, and the ID might vary on different max SF number, i.e. VF4 on PP1
>> might got different ID. Representor ID change before that will break the product.
>>
>> I see.
>>
>>> Considering both orchestration and testpmd users, how about keeping both solution together? This will bring max flexibility IMHO.
>>
>> As I said before I don't mind and I really think it is a good idea to add suggested interface to specify representor (i.e. cXpfYvfZ), but the
>> problem is making bitmap from representor ID.
>>
>> ethdev API should use new representor info API to make a representor ID from controller/PF/{VF,SF}.
>> Or do you see any problems with such approach?
> 
> Sorry I thought the user to figure out representor ID from api.
> This combination look good, thanks for clarification :)
> 
> So the new api looks like this:

Roughly speaking - yes

> struct rte_eth_representor_info {
>   Enum representor_type;
>   Uint16_t controller; // -1 for any

I'm not sure that I understand what does "any" mean in this
case. I think it should be the zero in examples below.
I think that API should return caller controller ID and
PF ID. It would allow to interpret "vf5" correctly when
caller is not controller #0 and/or PF #0.

>   Uint16_t port; // -1 for any

port sounds like physical port, but it should be PF
 (pf, phys_fn or something like this). It could be many
PFs per physical network port.

>   Uint16_t representor_id;

May be base_id? Or rep_base_id?

The question is what to do if range for VF or SF is
not contiguous. Should we have one more index after phys_fn
to  represent it? E.g.
union {
    uint16_t vf;
    uint16_t sf;
};

>   Uint16_t count;

May be id_range which should be 1 to show one function.
It could be convenient to treat 0 this way as well,
but I doubt that it is a good idea.

>   char name[N];
> 
> int rte_eth_representor_info_get(struct rte_eth_representor_info *infos);
> - Return number of entries.
> - NULL infos just return number of entries supported.
> Sample outputs:
>  VF, -1, 0, 0, 		128, 	"pf0vf"
>  SF, -1, 0, 128, 		2048, 	"pf0sf"
>  PF, -1, 0, 32767, 	1, 	"pf"
>  VF, -1, 1, 32768, 	128, 	"pf1vf"
>  SF, -1, 0, (32768+128), 	2048, 	"pf1sf"
>  PF, -1, 0, 65535, 	1,	 "pf"
> 
>>
>>> In struct rte_eth_dev_data, reserved bits could be used to define controller and port, this will avoid bitmap. How do you think?
>>
>> Could you add a bit more on it? Just a bit more details to the idea since I don't understand what exactly you mean and how it could
>> help.
> 
> The idea is replacing reserved_64s and adding more device location info in rte_eth-dev_data like this:
>   Uint16_t representor_id;
>   Uint16_t port_id;
>   Uint16_t controller_id;
>   Enum representor_type;
> Compare them all when matching a device, this will also avoid bitmap encoding. 
> Reserved_64s[] was added to mitigate ABI conflicts, IIRC.
> But seems no need if making representor info API to make ID.
> 
>>
>>>>
>>>> There is one more bit which is not in the picture yet -
>>>> switch_info.port_id. Should it be equal to representor ID? Or different and provided in the info structure?
>>>
>>> Not exactly same AFAIK, the id used in e-switch.
>>>
>>>
> 



More information about the dev mailing list