[v4] bus/pci: fix VF bus error for memory access

Message ID 20200625035046.19820-1-haiyue.wang@intel.com (mailing list archive)
State Accepted, archived
Delegated to: David Marchand
Series [v4] bus/pci: fix VF bus error for memory access

Checks

Context                     | Check   | Description
----------------------------+---------+-------------------------
ci/checkpatch               | warning | coding style issues
ci/iol-broadcom-Performance | success | Performance Testing PASS
ci/iol-intel-Performance    | success | Performance Testing PASS
ci/iol-nxp-Performance      | success | Performance Testing PASS
ci/travis-robot             | success | Travis build: passed
ci/iol-mellanox-Performance | success | Performance Testing PASS
ci/iol-testing              | success | Testing PASS
ci/Intel-compilation        | fail    | Compilation issues

Commit Message

Wang, Haiyue June 25, 2020, 3:50 a.m. UTC
To fix CVE-2020-12888, the Linux vfio-pci module now invalidates mmaps
and blocks MMIO access to disabled memory; such an access sends a SIGBUS
to the application:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=abafbc551fddede3e0a08dee1dcde08fc0eb8476

When the application opens the vfio PCI device, the vfio-pci module
enables the bus memory space through PCI read/write access. According to
the PCIe specification, the 'Memory Space Enable' bit is always zero for
a VF:

             Table 9-13 Command Register Changes

Bit Location | PF and VF Register Differences | PF         | VF
             | From Base                      | Attributes | Attributes
-------------+--------------------------------+------------+-----------
             | Memory Space Enable - Does not |            |
             | apply to VFs. Must be hardwired|  Base      |  0b
     1       | to 0b for VFs. VF Memory Space |            |
             | is controlled by the VF MSE bit|            |
             | in the VF Control register.    |            |
-------------+--------------------------------+------------+-----------

Afterwards, vfio-pci initializes its own virtual PCI config space data
('vconfig') by reading the VF's physical PCI config space, so the
'Memory Space Enable' bit in vconfig is always 0b. As a result, vfio-pci
treats the BAR memory space as disabled, and a SIGBUS is triggered
whenever these BARs are accessed.

Investigation shows that a VF PCI device passed through to a guest OS by
QEMU has 'Memory Space Enable' set to 1b. That is because every PCI
driver enables the memory space when it starts, and vfio-pci hooks this
action in its virtual PCI read/write path and sets the 'Memory Space
Enable' bit in vconfig to 1b. So a VF running in a guest OS has 'Mem+',
but a VF running in the host OS has 'Mem-'.
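
The guest/host difference described above can be modelled in a few lines
of C. This is only an illustrative sketch, not the kernel's code: the
names `struct vdev`, `phys_write_command()`, `vdev_init()`,
`vdev_write_command()` and `vdev_mem_enabled()` are hypothetical, and the
whole config space is reduced to one 16-bit Command register. Only the
bit position (bit 1, 'Memory Space Enable') is taken from the table
above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 1 of the PCI Command register: Memory Space Enable. */
#define CMD_MEMORY 0x2u

/* Hypothetical model of a device seen through a virtual config space. */
struct vdev {
	bool is_vf;        /* VFs hardwire Memory Space Enable to 0b */
	uint16_t phys_cmd; /* physical Command register */
	uint16_t vcmd;     /* virtualized copy ('vconfig') */
};

/* Physical write: a VF silently drops the Memory Space Enable bit. */
static void
phys_write_command(struct vdev *d, uint16_t val)
{
	if (d->is_vf)
		val &= (uint16_t)~CMD_MEMORY;
	d->phys_cmd = val;
}

/* Open: try to enable memory, then seed vconfig from the physical value. */
static void
vdev_init(struct vdev *d, bool is_vf)
{
	d->is_vf = is_vf;
	d->phys_cmd = 0;
	phys_write_command(d, CMD_MEMORY);
	d->vcmd = d->phys_cmd;
}

/* A driver's config write is hooked and recorded in vconfig as written. */
static void
vdev_write_command(struct vdev *d, uint16_t val)
{
	phys_write_command(d, val);
	d->vcmd = val;
}

/* In this model, BAR access is allowed only if vconfig shows memory enabled. */
static bool
vdev_mem_enabled(const struct vdev *d)
{
	return (d->vcmd & CMD_MEMORY) != 0;
}
```

In this model a freshly initialized VF reports memory as disabled until
some driver writes the Command register through `vdev_write_command()`,
which is the step a guest PCI driver performs and a host application
does not.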

To align DPDK with how PCI devices work in the Guest/QEMU/Host case,
enable the PCI bus memory space explicitly to avoid accessing disabled
memory.
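
As a rough standalone illustration of that explicit enable, the
read-modify-write can be sketched against any file descriptor that
supports positional I/O. The helper name `enable_memory_bit()` and the
`cmd_off` parameter are assumptions made for this sketch. In the patch
itself the offset is
`VFIO_GET_REGION_ADDR(VFIO_PCI_CONFIG_REGION_INDEX) + PCI_COMMAND`, the
calls are `pread64`/`pwrite64`, and failures are reported through
`RTE_LOG`.

```c
#define _XOPEN_SOURCE 700 /* declare pread/pwrite under strict -std modes */
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Memory Space Enable, bit 1 of the PCI Command register. */
#define CMD_MEMORY 0x2u

/*
 * Set the Memory Space Enable bit in the 16-bit register at cmd_off.
 * Returns 0 on success (including when the bit was already set),
 * -1 if the register could not be read or written.
 */
static int
enable_memory_bit(int fd, off_t cmd_off)
{
	uint16_t cmd;

	if (pread(fd, &cmd, sizeof(cmd), cmd_off) != (ssize_t)sizeof(cmd))
		return -1;

	if (cmd & CMD_MEMORY)
		return 0; /* already enabled, nothing to write */

	cmd |= CMD_MEMORY;
	if (pwrite(fd, &cmd, sizeof(cmd), cmd_off) != (ssize_t)sizeof(cmd))
		return -1;

	return 0;
}
```

Skipping the write when the bit is already set keeps the helper
idempotent, so it is harmless on devices (such as PFs) where memory
space is already enabled.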

Fixes: 33604c31354a ("vfio: refactor PCI BAR mapping")
Cc: stable@dpdk.org

Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Harman Kalra <hkalra@marvell.com>
Tested-by: David Marchand <david.marchand@redhat.com>
---
v4: Fix commit message typo.
v3: Update the commit log, and fix one debug log with redundant
description.
v2: Rewrite the commit log, and put the link into it even though it is long.
---
 drivers/bus/pci/linux/pci_vfio.c | 37 ++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)
  

Comments

David Marchand June 25, 2020, 2:09 p.m. UTC | #1
Tested-by: Thierry Martin <thierry.martin.public@gmail.com>

Applied, thanks again Haiyue.


Kevin, Luca,

I can see that some distros have already started backporting the fix
in kernel (fc31, fc32 and rhel7, at least from what I saw).
18.11 and 19.11 will need this fix at some point.
I'll let you decide on the proper timing.
  
Kevin Traynor June 25, 2020, 4:45 p.m. UTC | #2
On 25/06/2020 15:09, David Marchand wrote:
> Tested-by: Thierry Martin <thierry.martin.public@gmail.com>
> 
> Applied, thanks again Haiyue.
> 
> 
> Kevin, Luca,
> 
> I can see that some distros have already started backporting the fix
> in kernel (fc31, fc32 and rhel7 at least for what I saw).
> 18.11 and 19.11 will need this fix at some point.
> I'll let you decide on the proper timing.
> 
> 

It looks like an important fix. I think it's worth having in 18.11.9. I
will apply it and create an 18.11.9-rc2 tomorrow, so anyone who hasn't
started validation yet can validate with it included.
  
Wang, Haiyue June 25, 2020, 6:33 p.m. UTC | #3
> -----Original Message-----
> From: Kevin Traynor <ktraynor@redhat.com>
> Sent: Friday, June 26, 2020 00:46
> To: David Marchand <david.marchand@redhat.com>; Wang, Haiyue <haiyue.wang@intel.com>; Luca Boccassi
> <bluca@debian.org>
> Cc: dev <dev@dpdk.org>; Burakov, Anatoly <anatoly.burakov@intel.com>; dpdk stable <stable@dpdk.org>;
> Harman Kalra <hkalra@marvell.com>
> Subject: Re: [PATCH v4] bus/pci: fix VF bus error for memory access
> 
> On 25/06/2020 15:09, David Marchand wrote:
> > Tested-by: Thierry Martin <thierry.martin.public@gmail.com>
> >
> > Applied, thanks again Haiyue.
> >
> >
> > Kevin, Luca,
> >
> > I can see that some distros have already started backporting the fix
> > in kernel (fc31, fc32 and rhel7 at least for what I saw).
> > 18.11 and 19.11 will need this fix at some point.
> > I'll let you decide on the proper timing.
> >
> >
> 
> It looks an important fix. I think it's worth having in 18.11.9. I will
> apply and create an 18.11.9-rc2 tomorrow, so if anyone hasn't started
> validation already, they can validate with it in.

Alex posted a fix for the kernel just now, so it looks like the DPDK
patch is nice to have, not a MUST. ;-)

https://lore.kernel.org/kvm/159310421505.27590.16617666489295503039.stgit@gimli.home/T/#u
  
Kevin Traynor June 26, 2020, 9:10 a.m. UTC | #4
On 25/06/2020 19:33, Wang, Haiyue wrote:
>> -----Original Message-----
>> From: Kevin Traynor <ktraynor@redhat.com>
>> Sent: Friday, June 26, 2020 00:46
>> To: David Marchand <david.marchand@redhat.com>; Wang, Haiyue <haiyue.wang@intel.com>; Luca Boccassi
>> <bluca@debian.org>
>> Cc: dev <dev@dpdk.org>; Burakov, Anatoly <anatoly.burakov@intel.com>; dpdk stable <stable@dpdk.org>;
>> Harman Kalra <hkalra@marvell.com>
>> Subject: Re: [PATCH v4] bus/pci: fix VF bus error for memory access
>>
>> On 25/06/2020 15:09, David Marchand wrote:
>>> Tested-by: Thierry Martin <thierry.martin.public@gmail.com>
>>>
>>> Applied, thanks again Haiyue.
>>>
>>>
>>> Kevin, Luca,
>>>
>>> I can see that some distros have already started backporting the fix
>>> in kernel (fc31, fc32 and rhel7 at least for what I saw).
>>> 18.11 and 19.11 will need this fix at some point.
>>> I'll let you decide on the proper timing.
>>>
>>>
>>
>> It looks an important fix. I think it's worth having in 18.11.9. I will
>> apply and create an 18.11.9-rc2 tomorrow, so if anyone hasn't started
>> validation already, they can validate with it in.
> 
> Alex post a fix in kernel just now. So looks like the DPDK patch is nice
> to have, not a MUST. ;-)
> 

Thanks for the update, Haiyue. That may be true in the future, but not at
the moment. The kernel patch was only submitted yesterday, so I don't know
how long it will take to filter through to all the distro kernels (and for
users to update).

I think it's still worth taking this patch now in 18.11. I will wait
until this afternoon in case anyone has reasons not to.

thanks,
Kevin.

  
David Marchand June 26, 2020, 9:17 a.m. UTC | #5
On Thu, Jun 25, 2020 at 8:34 PM Wang, Haiyue <haiyue.wang@intel.com> wrote:
> > > I can see that some distros have already started backporting the fix
> > > in kernel (fc31, fc32 and rhel7 at least for what I saw).
> > > 18.11 and 19.11 will need this fix at some point.
> > > I'll let you decide on the proper timing.
> > >
> > >
> >
> > It looks an important fix. I think it's worth having in 18.11.9. I will
> > apply and create an 18.11.9-rc2 tomorrow, so if anyone hasn't started
> > validation already, they can validate with it in.
>
> Alex post a fix in kernel just now. So looks like the DPDK patch is nice
> to have, not a MUST. ;-)
>
> https://lore.kernel.org/kvm/159310421505.27590.16617666489295503039.stgit@gimli.home/T/#u

Yes, this has been discussed offlist and Alex proposed an adjustment
on the CVE fix.
But we still have to live with the situation where people will have
only the first part of the fix.
I am still for backporting this change to stable branches.
  
Wang, Haiyue June 26, 2020, 2:14 p.m. UTC | #6
> -----Original Message-----
> From: David Marchand <david.marchand@redhat.com>
> Sent: Friday, June 26, 2020 17:17
> To: Wang, Haiyue <haiyue.wang@intel.com>
> Cc: Kevin Traynor <ktraynor@redhat.com>; Luca Boccassi <bluca@debian.org>; dev <dev@dpdk.org>; Burakov,
> Anatoly <anatoly.burakov@intel.com>; dpdk stable <stable@dpdk.org>; Harman Kalra <hkalra@marvell.com>
> Subject: Re: [PATCH v4] bus/pci: fix VF bus error for memory access
> 
> On Thu, Jun 25, 2020 at 8:34 PM Wang, Haiyue <haiyue.wang@intel.com> wrote:
> > > > I can see that some distros have already started backporting the fix
> > > > in kernel (fc31, fc32 and rhel7 at least for what I saw).
> > > > 18.11 and 19.11 will need this fix at some point.
> > > > I'll let you decide on the proper timing.
> > > >
> > > >
> > >
> > > It looks an important fix. I think it's worth having in 18.11.9. I will
> > > apply and create an 18.11.9-rc2 tomorrow, so if anyone hasn't started
> > > validation already, they can validate with it in.
> >
> > Alex post a fix in kernel just now. So looks like the DPDK patch is nice
> > to have, not a MUST. ;-)
> >
> > https://lore.kernel.org/kvm/159310421505.27590.16617666489295503039.stgit@gimli.home/T/#u
> 
> Yes, this has been discussed offlist and Alex proposed an adjustment
> on the CVE fix.
> But we still have to live with the situation where people will have
> only the first part of the fix.
> I am still for backporting this change to stable branches.
> 

Got it, thanks for sharing.

> 
> --
> David Marchand
  

Patch

diff --git a/drivers/bus/pci/linux/pci_vfio.c b/drivers/bus/pci/linux/pci_vfio.c
index 64cd84a68..ba60e7ce9 100644
--- a/drivers/bus/pci/linux/pci_vfio.c
+++ b/drivers/bus/pci/linux/pci_vfio.c
@@ -149,6 +149,38 @@  pci_vfio_get_msix_bar(int fd, struct pci_msix_table *msix_table)
 	return 0;
 }
 
+/* enable PCI bus memory space */
+static int
+pci_vfio_enable_bus_memory(int dev_fd)
+{
+	uint16_t cmd;
+	int ret;
+
+	ret = pread64(dev_fd, &cmd, sizeof(cmd),
+		      VFIO_GET_REGION_ADDR(VFIO_PCI_CONFIG_REGION_INDEX) +
+		      PCI_COMMAND);
+
+	if (ret != sizeof(cmd)) {
+		RTE_LOG(ERR, EAL, "Cannot read command from PCI config space!\n");
+		return -1;
+	}
+
+	if (cmd & PCI_COMMAND_MEMORY)
+		return 0;
+
+	cmd |= PCI_COMMAND_MEMORY;
+	ret = pwrite64(dev_fd, &cmd, sizeof(cmd),
+		       VFIO_GET_REGION_ADDR(VFIO_PCI_CONFIG_REGION_INDEX) +
+		       PCI_COMMAND);
+
+	if (ret != sizeof(cmd)) {
+		RTE_LOG(ERR, EAL, "Cannot write command to PCI config space!\n");
+		return -1;
+	}
+
+	return 0;
+}
+
 /* set PCI bus mastering */
 static int
 pci_vfio_set_bus_master(int dev_fd, bool op)
@@ -427,6 +459,11 @@  pci_rte_vfio_setup_device(struct rte_pci_device *dev, int vfio_dev_fd)
 		return -1;
 	}
 
+	if (pci_vfio_enable_bus_memory(vfio_dev_fd)) {
+		RTE_LOG(ERR, EAL, "Cannot enable bus memory!\n");
+		return -1;
+	}
+
 	/* set bus mastering for the device */
 	if (pci_vfio_set_bus_master(vfio_dev_fd, true)) {
 		RTE_LOG(ERR, EAL, "Cannot set up bus mastering!\n");