[dpdk-dev] [PATCH 2/2] eal: fix IOVA mode selection as VA for pci drivers

Jerin Jacob Kollanukkaran jerinj at marvell.com
Mon Jul 15 17:35:50 CEST 2019


> -----Original Message-----
> From: Thomas Monjalon <thomas at monjalon.net>
> Sent: Monday, July 15, 2019 8:34 PM
> To: Jerin Jacob Kollanukkaran <jerinj at marvell.com>
> Cc: Burakov, Anatoly <anatoly.burakov at intel.com>; David Marchand
> <david.marchand at redhat.com>; dev at dpdk.org; John McNamara
> <john.mcnamara at intel.com>; Marko Kovacevic
> <marko.kovacevic at intel.com>; Igor Russkikh
> <igor.russkikh at aquantia.com>; Pavel Belous <pavel.belous at aquantia.com>;
> Ajit Khaparde <ajit.khaparde at broadcom.com>; Somnath Kotur
> <somnath.kotur at broadcom.com>; Wenzhuo Lu <wenzhuo.lu at intel.com>;
> John Daley <johndale at cisco.com>; Hyong Youb Kim <hyonkim at cisco.com>;
> Qi Zhang <qi.z.zhang at intel.com>; Xiao Wang <xiao.w.wang at intel.com>;
> Beilei Xing <beilei.xing at intel.com>; Jingjing Wu <jingjing.wu at intel.com>;
> Qiming Yang <qiming.yang at intel.com>; Konstantin Ananyev
> <konstantin.ananyev at intel.com>; Matan Azrad <matan at mellanox.com>;
> Shahaf Shuler <shahafs at mellanox.com>; Yongseok Koh
> <yskoh at mellanox.com>; Viacheslav Ovsiienko
> <viacheslavo at mellanox.com>; Alejandro Lucero
> <alejandro.lucero at netronome.com>; Nithin Kumar Dabilpuram
> <ndabilpuram at marvell.com>; Kiran Kumar Kokkilagadda
> <kirankumark at marvell.com>; Rasesh Mody <rmody at marvell.com>; Shahed
> Shaikh <shshaikh at marvell.com>; Bruce Richardson
> <bruce.richardson at intel.com>; alialnu at mellanox.com;
> aconole at redhat.com
> Subject: Re: [PATCH 2/2] eal: fix IOVA mode selection as VA for pci drivers
> 
> 15/07/2019 16:26, Jerin Jacob Kollanukkaran:
> > > > > +
> > > > > +IOVA Mode is selected by considering what the current usable
> > > > > +Devices on the system requires and/or supports.
> > > > > +
> > > > > +Below is the 2-step heuristic for this choice.
> > > > > +
> > > > > +For the first step, EAL asks each bus its requirement in terms
> > > > > +of IOVA mode and decides on a preferred IOVA mode.
> > > > > +
> > > > > +- if all buses report RTE_IOVA_PA, then the preferred IOVA mode
> > > > > +is RTE_IOVA_PA,
> > > > > +- if all buses report RTE_IOVA_VA, then the preferred IOVA mode
> > > > > +is RTE_IOVA_VA,
> > > > > > +- if all buses report RTE_IOVA_DC, no bus expressed a
> > > > > > +preference, then the
> > > > > > +  preferred mode is RTE_IOVA_DC,
> > > > > +- if the buses disagree (at least one wants RTE_IOVA_PA and at
> > > > > +least one wants
> > > > > +  RTE_IOVA_VA), then the preferred IOVA mode is RTE_IOVA_DC
> > > > > +(see below with the
> > > > > +  check on Physical Addresses availability),
> > > > > +
> > > > > +The second step is checking if the preferred mode complies with
> > > > > +the Physical Addresses availability since those are only
> > > > > +available to root user in recent kernels.
> > > > > +
> > > > > +- if the preferred mode is RTE_IOVA_PA but there is no access
> > > > > +to Physical
> > > > > +  Addresses, then EAL init will fail early, since later probing
> > > > > +of the devices
> > > > > +  would fail anyway,
> > > > > +- if the preferred mode is RTE_IOVA_DC then based on the
> > > > > +Physical Addresses
> > > > > +  availability, the preferred mode is adjusted to RTE_IOVA_PA
> > > > > > +or RTE_IOVA_VA.
> > > > > +  In the case when the buses had disagreed on the IOVA Mode at
> > > > > +the first step,
> > > > > +  part of the buses won't work because of this decision.
> > > >
> > > > Is there any specific reason why we always prefer PA if physical
> > > > addresses are available? Since we're already assuming that all
> > > > devices support PA and VA anyway, what's the harm in enabling VA by
> > > > > default?
> > >
> > > If PA is available, it means we are running as root.
> > > We can assume that using root is a choice, probably related to a
> > > preference for PA.
> >
> > # Even if we are running as root, why choose PA in case of DC?
> > i.e. the following logic is not needed:
> >                 if (iova_mode == RTE_IOVA_DC) {
> >                         iova_mode = phys_addrs ? RTE_IOVA_PA : RTE_IOVA_VA;
> >                         RTE_LOG(DEBUG, EAL,
> >                                 "Buses did not request a specific IOVA mode, using '%s' based on physical addresses availability.\n",
> >                                 phys_addrs ? "PA" : "VA");
> >                 }
> 
> Why running as root if using VA anyway?
> We can assume the user knows what he is doing, so it is a user choice.
> We want to allow the user choosing, right?

The user can override the mode with the --iova-mode=pa|va EAL argument if a
specific mode is needed. People run as root for various other reasons (or just
out of laziness); that is not, or should not be, connected to setting the mode to PA.
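
To make the two options concrete, here is a rough sketch of the 2-step
selection described in the patch text above. It is not the EAL code: the enum
is a local stand-in for enum rte_iova_mode, the function names
preferred_iova_mode() and resolve_iova_mode() are made up for illustration, and
dc_prefers_va is a hypothetical knob that switches between the behaviour in the
patch (PA when physical addresses are available) and the alternative argued for
here (VA whenever nobody asked for PA). Treating a bus that reports DC as having
no opinion is also an assumption; the quoted text only spells out the
"all buses" cases.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Local stand-in for DPDK's enum rte_iova_mode (illustration only). */
enum iova_mode {
	IOVA_DC = 0,      /* don't care */
	IOVA_PA = 1 << 0, /* IOVA is the physical address */
	IOVA_VA = 1 << 1, /* IOVA is the virtual address */
};

/*
 * Step 1: combine what each bus reported into one preferred mode.
 * Assumption: a bus reporting DC has no opinion and is skipped.
 */
static enum iova_mode
preferred_iova_mode(const enum iova_mode *bus_modes, size_t n_buses)
{
	bool want_pa = false, want_va = false;
	size_t i;

	assert(bus_modes != NULL || n_buses == 0);

	for (i = 0; i < n_buses; i++) {
		if (bus_modes[i] == IOVA_PA)
			want_pa = true;
		else if (bus_modes[i] == IOVA_VA)
			want_va = true;
	}

	if (want_pa && !want_va)
		return IOVA_PA;
	if (want_va && !want_pa)
		return IOVA_VA;
	/* Nobody expressed a preference, or the buses disagree. */
	return IOVA_DC;
}

/*
 * Step 2: check the preferred mode against physical address availability.
 * Returns 0 and stores the final mode in *out, or -1 if PA is required
 * but physical addresses are not accessible (init should fail early).
 * dc_prefers_va is hypothetical: false = behaviour in the patch,
 * true = always pick VA when the preferred mode is DC.
 */
static int
resolve_iova_mode(enum iova_mode preferred, bool phys_addrs,
		  bool dc_prefers_va, enum iova_mode *out)
{
	if (preferred == IOVA_PA && !phys_addrs)
		return -1;

	if (preferred == IOVA_DC) {
		if (dc_prefers_va)
			preferred = IOVA_VA;
		else
			preferred = phys_addrs ? IOVA_PA : IOVA_VA;
	}

	*out = preferred;
	return 0;
}
```

With dc_prefers_va set to false, a root user with no opinionated bus ends up
in PA; with it set to true the same setup ends up in VA, and PA is only used
when a bus actually asks for it or the user forces it on the command line.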

> 
> > # When DPDK is running in a guest, it cannot access the real PA anyway; it will
> > be IPA.
> 
> What is IPA? Isn't it a beer?

There may be a beer with that name. In this context, it is "Intermediate Physical Address".

> 
> > So I don't understand logic behind choose PA when DC.
> > To me, it make sense to choose PA when DC.
> 
> You probably mean "choose VA".

Yup.

> 
> > # To align with RTE_PCI_DRV_NEED_MAPPING flag and reflect it "need"
> > rather than support, I think, flag can be changed to
> > RTE_PCI_DRV_NEED_IOVA_AS_VA
> 
> I think the most important is to have a good documentation of this flag (it
> was not done properly when Cavium introduced it initially).
> If you want to rename the flag, you can do it in a separate patch.
> If renaming, I really would like to get an answer to an old question:
> Why is the IO address called IOVA? The name "IOVA_AS_VA" looks strange.

IOVA = IO virtual address.
Since an IOVA can be a PA or a VA, the name IOVA_AS_VA was chosen.

> For reference, one description of addressing:
> https://lists.linuxfoundation.org/pipermail/iommu/2018-May/027686.html
> 
> About the naming, do you remember how I insisted to have a correct naming
> of all related stuff in DPDK? It was hard to get it accepted, the discussion was
> not nice and I stopped insisting to get all details fine because I just got bored.
> It was a really bad experience.

I agree.
To me, that bad experience was mostly due to not having enough technical comments
on the proposal, though I am not the author/owner of it.

> You can ask why I remind this now? Because we must take care of all details,
> make sure our messages are well understood, and be cooperative.

No disagreement.
If we look at the history, the meaning got changed/updated in the commit below,
which added the Intel drivers to it. I wouldn't say it is a big deal; it is just
C code and can be changed based on need. I think what is really important is
maintaining the feature and a commitment towards fixing any issues.

commit f37dfab21c988d2d0ecb3c82be4ba9738c7e51c7
Author: Jianfeng Tan <jianfeng.tan at intel.com>
Date:   Wed Oct 11 10:33:48 2017 +0000

    drivers/net: enable IOVA mode for Intel PMDs

    If we want to enable IOVA mode, introduced by
    commit 93878cf0255e ("eal: introduce helper API for IOVA mode"),
    we need PMDs (for PCI devices) to expose this flag.

    Signed-off-by: Jianfeng Tan <jianfeng.tan at intel.com>
    Acked-by: Anatoly Burakov <anatoly.burakov at intel.com>
    Reviewed-by: Santosh Shukla <santosh.shukla at caviumnetworks.com>

> 
> > Other than above points,
> > Reviewed this patch and tested on octeontx2, It looks good to me.
> 
> 


