[dpdk-dev] [PATCH 07/10] linuxapp/eal_vfio: honor iova mode before mapping
Maxime Coquelin
maxime.coquelin at redhat.com
Thu Jul 6 15:11:04 CEST 2017
On 07/06/2017 03:08 PM, Maxime Coquelin wrote:
>
>
> On 07/06/2017 01:19 PM, santosh wrote:
>> On Thursday 06 July 2017 04:29 PM, Maxime Coquelin wrote:
>>
>>>
>>> On 07/06/2017 11:49 AM, Jerin Jacob wrote:
>>>> -----Original Message-----
>>>>> Date: Thu, 6 Jul 2017 09:58:41 +0200
>>>>> From: Maxime Coquelin <maxime.coquelin at redhat.com>
>>>>> To: Jerin Jacob <jerin.jacob at caviumnetworks.com>
>>>>> CC: Santosh Shukla <santosh.shukla at caviumnetworks.com>,
>>>>> thomas at monjalon.net, bruce.richardson at intel.com, dev at dpdk.org,
>>>>> hemant.agrawal at nxp.com, shreyansh.jain at nxp.com,
>>>>> gaetan.rivet at 6wind.com
>>>>> Subject: Re: [dpdk-dev] [PATCH 07/10] linuxapp/eal_vfio: honor iova
>>>>> mode
>>>>> before mapping
>>>>> User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101
>>>>> Thunderbird/52.1.0
>>>>>
>>>>>
>>>>>
>>>>> On 07/05/2017 05:43 PM, Jerin Jacob wrote:
>>>>>> -----Original Message-----
>>>>>>> Date: Wed, 5 Jul 2017 11:14:01 +0200
>>>>>>> From: Maxime Coquelin <maxime.coquelin at redhat.com>
>>>>>>> To: Santosh Shukla <santosh.shukla at caviumnetworks.com>,
>>>>>>> thomas at monjalon.net, bruce.richardson at intel.com, dev at dpdk.org
>>>>>>> CC: jerin.jacob at caviumnetworks.com, hemant.agrawal at nxp.com,
>>>>>>> shreyansh.jain at nxp.com, gaetan.rivet at 6wind.com
>>>>>>> Subject: Re: [dpdk-dev] [PATCH 07/10] linuxapp/eal_vfio: honor
>>>>>>> iova mode
>>>>>>> before mapping
>>>>>>> User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101
>>>>>>> Thunderbird/52.1.0
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 06/08/2017 01:05 PM, Santosh Shukla wrote:
>>>>>>>> Check iova mode and accordingly map iova to pa or va.
>>>>>>>>
>>>>>>>> Signed-off-by: Santosh Shukla<santosh.shukla at caviumnetworks.com>
>>>>>>>> Signed-off-by: Jerin Jacob<jerin.jacob at caviumnetworks.com>
>>>>>>>> ---
>>>>>>>> lib/librte_eal/linuxapp/eal/eal_vfio.c | 10 ++++++++--
>>>>>>>> 1 file changed, 8 insertions(+), 2 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c
>>>>>>>> b/lib/librte_eal/linuxapp/eal/eal_vfio.c
>>>>>>>> index 04914406f..348b7a7f4 100644
>>>>>>>> --- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
>>>>>>>> +++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
>>>>>>>> @@ -706,7 +706,10 @@ vfio_type1_dma_map(int vfio_container_fd)
>>>>>>>> dma_map.argsz = sizeof(struct
>>>>>>>> vfio_iommu_type1_dma_map);
>>>>>>>> dma_map.vaddr = ms[i].addr_64;
>>>>>>>> dma_map.size = ms[i].len;
>>>>>>>> - dma_map.iova = ms[i].phys_addr;
>>>>>>>> + if (rte_eal_iova_mode() == RTE_IOVA_VA)
>>>>>>>> + dma_map.iova = dma_map.vaddr;
>>>>>>>> + else
>>>>>>>> + dma_map.iova = ms[i].phys_addr;
>>>>>>>> dma_map.flags = VFIO_DMA_MAP_FLAG_READ |
>>>>>>>> VFIO_DMA_MAP_FLAG_WRITE;
>>>>>>>
>>>>>>> IIUC, it is changing default behavior for VFIO devices.
>>>>>>>
>>>>>>> I see a possible problem, but I'm not sure the case is valid.
>>>>>>>
>>>>>>> Imagine you have two devices in the iommu group, and the two
>>>>>>> devices are used in separate processes. Each process could try to
>>>>>>> map a different physical address at the same virtual address, and
>>>>>>> so the second map would fail.
>>>>>>
>>>>>> IMO, it doesn't look like a problem. Here is the data flow:
>>>>>>
>>>>>> 1) The vfio DMA map function (vfio_type1_dma_map()) is called only
>>>>>> in the primary process
>>>>>> http://dpdk.org/browse/dpdk/tree/lib/librte_eal/linuxapp/eal/eal_vfio.c#n359
>>>>>>
>>>>>>
>>>>>> 2) In a secondary process, rte_eal_hugepage_attach() makes sure that
>>>>>> the secondary process has the _same_ virtual addresses as the
>>>>>> primary, or exits on attach.
>>>>>> http://dpdk.org/browse/dpdk/tree/lib/librte_eal/linuxapp/eal/eal_memory.c#n1452
>>>>>>
>>>>>>
>>>>>> 3) The secondary process adds the virtual addresses mapped in step (2)
>>>>>> to the OS page tables. On an SMMU entry miss (when the device issues
>>>>>> an I/O transaction), the OS will load the mapping and update the SMMU
>>>>>> "context" with page tables from the MMU.
>>>>>
>>>>> Ok thanks for the detailed info, but what about the case where the
>>>>> same
>>>>> iommu group is used by two primary processes?
>>>>
>>>> Does that case exist with DPDK? We always need to blacklist the same
>>>> BDF in the secondary process to make things work with the existing
>>>> DPDK setup, which makes sense as well. Only the primary process
>>>> configures the HW blocks.
>>>
>>> I meant the case when two BDF are in the same IOMMU group (if ACS is not
>>> supported at some point in the hierarchy). And I meant two primary
>>> processes running, like for example two containers running each a DPDK
>>> application.
>>>
>>> Maybe this is not a valid use-case (it is not secure, as it would break
>>> isolation between the two containers), but it seems that it is something
>>> DPDK allows today, if I'm not mistaken.
>>>
>> I'm not sure how two primary processes could run, because the latter
>> primary process would try accessing /var/run/.rte_config and would fail
>> at this [1] point.
>>
>> It's not a valid use-case for dpdk (imo).
>> [1]
>> http://dpdk.org/browse/dpdk/tree/lib/librte_eal/linuxapp/eal/eal.c#n204
>
> Yes, this is possible. I had never used it before, but Thomas told me it
> is supported by setting the --file-prefix option. I gave it a try, and I
> confirm it works:
> session 1> ./install/bin/testpmd -l 0,2 --socket-mem=1024 -w
> 0000:05:00.0 --proc-type=primary --file-prefix=app1 -- --disable-hw-vlan
> -i --rxq=1 --txq=1 --nb-cores=1 --forward-mode=io
> session 2> ./install/bin/testpmd -l 0,3 --socket-mem=1024 -w
> 0000:05:00.1 --proc-type=primary --file-prefix=app2 -- --disable-hw-vlan
> -i --rxq=1 --txq=1 --nb-cores=1 --forward-mode=io
>
> In the above example, two ports of the same card are used by two
> processes. Note that in this case, ACS is supported and both ports have
> their own iommu group.
# ls -al /var/run/.app*
-rw-r-----. 1 root root 208420 Jul 6 09:08 /var/run/.app1_config
-rw-r--r--. 1 root root 49728 Jul 6 09:08 /var/run/.app1_hugepage_info
srwxr-xr-x. 1 root root 0 Jul 6 09:08 /var/run/.app1_mp_socket
-rw-r-----. 1 root root 208420 Jul 6 09:08 /var/run/.app2_config
-rw-r--r--. 1 root root 45584 Jul 6 09:08 /var/run/.app2_hugepage_info
srwxr-xr-x. 1 root root 0 Jul 6 09:08 /var/run/.app2_mp_socket
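
To make the concern from earlier in the thread concrete, here is a minimal
sketch of it. This is not the EAL code: enum iova_mode, struct dma_req,
pick_iova(), struct toy_domain and toy_map() are all hypothetical names made
up for illustration. It assumes that the mappings of both primary processes
land in one shared IOMMU domain and that this domain rejects overlapping IOVA
ranges. I am not sure this situation can actually be reached, so take it as
an illustration of the question rather than a demonstration of a bug.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two modes named in the patch. */
enum iova_mode { IOVA_PA, IOVA_VA };

/* One mapping request, loosely modelled on the fields the patch touches. */
struct dma_req {
	uint64_t vaddr;     /* process virtual address of the segment */
	uint64_t phys_addr; /* physical address of the segment */
	uint64_t size;      /* segment length in bytes */
};

/* Same selection as the patched hunk: VA mode uses vaddr as the IOVA. */
static uint64_t pick_iova(enum iova_mode mode, const struct dma_req *r)
{
	return mode == IOVA_VA ? r->vaddr : r->phys_addr;
}

#define TOY_MAX_MAPS 16

/* Toy model of a single IOMMU domain: a flat list of IOVA ranges. */
struct toy_domain {
	uint64_t iova[TOY_MAX_MAPS];
	uint64_t size[TOY_MAX_MAPS];
	size_t n;
};

/*
 * Record [iova, iova + size) in the domain.
 * Returns 0 on success, -1 if the range overlaps an existing one
 * or the table is full (assumed behaviour of the toy model).
 */
static int toy_map(struct toy_domain *d, uint64_t iova, uint64_t size)
{
	size_t i;

	for (i = 0; i < d->n; i++) {
		if (iova < d->iova[i] + d->size[i] && d->iova[i] < iova + size)
			return -1;
	}
	if (d->n == TOY_MAX_MAPS)
		return -1;
	d->iova[d->n] = iova;
	d->size[d->n] = size;
	d->n++;
	return 0;
}
```

With two requests that share a vaddr but have different phys_addr values
(one per primary process), the second toy_map() call fails when pick_iova()
is given IOVA_VA and succeeds when it is given IOVA_PA. That is the scenario
I was asking about for two BDFs in the same iommu group.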