[dpdk-dev] vfio: fix file descriptor leak in multi-process applications

Message ID: 20170126230521.28314-1-patrick@patrickmacarthur.net
State: Accepted, archived
Checks

Context               Check    Description
ci/checkpatch         success  coding style OK
ci/Intel compilation  success  Compilation OK

Commit Message

Patrick MacArthur Jan. 26, 2017, 11:05 p.m. UTC
When a secondary process wants access to the VFIO container file
descriptor, the primary process calls vfio_get_container_fd(), which
always opens an entirely new file descriptor on /dev/vfio/vfio.
Once the file descriptor has been passed to the secondary process, it
is effectively duplicated, so the copy of the file descriptor in the
primary process is no longer needed.  The primary process never closes
its copy, however, which results in a resource leak.

This can be reproduced by starting a primary process that is configured
to use VFIO for at least one device and has a small RLIMIT_NOFILE limit,
and then repeatedly launching secondary processes until the file
descriptor limit is exceeded.
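
The failure mode described above can be illustrated outside DPDK. The
following is a minimal sketch, not the reproducer used for this patch: it
lowers the soft RLIMIT_NOFILE of the calling process and opens a file
repeatedly without closing it until open() fails. The helper name
leak_until_failure() and the MAX_TRACKED bound are hypothetical, and any
readable path (for example /dev/null) stands in for /dev/vfio/vfio.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/resource.h>
#include <unistd.h>

#define MAX_TRACKED 1024 /* hypothetical upper bound for this sketch */

/*
 * Lower the soft RLIMIT_NOFILE to `limit`, then open `path` repeatedly
 * without closing anything until open() fails.
 *
 * Returns the errno of the failing open() (EMFILE is expected once the
 * descriptor table is full), or -1 if the limit could not be set.
 * The number of descriptors leaked before the failure is stored in
 * *opened. The descriptors are closed and the old limit is restored
 * before returning.
 */
int leak_until_failure(const char *path, rlim_t limit, int *opened)
{
	struct rlimit old, lowered;
	int fds[MAX_TRACKED];
	int n = 0, err = 0;

	if (limit >= MAX_TRACKED || getrlimit(RLIMIT_NOFILE, &old) < 0)
		return -1;

	lowered = old;
	lowered.rlim_cur = limit;
	if (setrlimit(RLIMIT_NOFILE, &lowered) < 0)
		return -1;

	/* Simulate the leak: every iteration keeps one more fd open. */
	while (n < MAX_TRACKED) {
		int fd = open(path, O_RDONLY);

		if (fd < 0) {
			err = errno;
			break;
		}
		fds[n++] = fd;
	}
	*opened = n;

	/* Clean up so the caller is left in its original state. */
	while (n > 0)
		close(fds[--n]);
	setrlimit(RLIMIT_NOFILE, &old);

	return err;
}
```

In the primary process the same exhaustion happens more slowly, one
descriptor per container request from a secondary process.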

Fix the resource leak by closing the local vfio container file
descriptor after passing it to the secondary process.
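
For readers unfamiliar with descriptor passing, the pattern behind the fix
can be sketched as follows. This is an illustration under assumptions, not
the DPDK implementation: send_fd(), recv_fd() and send_fd_and_close() are
hypothetical helpers that use SCM_RIGHTS ancillary data on a Unix-domain
socket. Once sendmsg() has queued the descriptor, the kernel holds its own
reference to the open file, so the sender can close its local copy without
affecting the receiver.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one descriptor. */
union fd_ctrl {
	struct cmsghdr hdr;
	char buf[CMSG_SPACE(sizeof(int))];
};

/* Send `fd` over Unix-domain socket `sock`. Returns 0 or -1. */
int send_fd(int sock, int fd)
{
	char dummy = 0; /* at least one byte of real data must be sent */
	struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
	union fd_ctrl ctrl;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	memset(&ctrl, 0, sizeof(ctrl));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctrl.buf;
	msg.msg_controllen = sizeof(ctrl.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor from `sock`. Returns the new fd or -1. */
int recv_fd(int sock)
{
	char dummy;
	struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
	union fd_ctrl ctrl;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int fd;

	memset(&ctrl, 0, sizeof(ctrl));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctrl.buf;
	msg.msg_controllen = sizeof(ctrl.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;

	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;

	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}

/* The pattern of the fix: pass the fd on, then drop the local copy. */
int send_fd_and_close(int sock, int fd)
{
	int ret = send_fd(sock, fd);

	close(fd);
	return ret;
}
```

A quick way to try the helpers is a socketpair(AF_UNIX, SOCK_STREAM, 0, sv)
inside one process: send the write end of a pipe with
send_fd_and_close(sv[0], ...), fetch it with recv_fd(sv[1]), and write
through the received descriptor. The write still reaches the pipe even
though the sender has already closed its copy.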

Fixes: 2f4adfad0a69 ("vfio: add multiprocess support")
Signed-off-by: Patrick MacArthur <patrick@patrickmacarthur.net>
---
 lib/librte_eal/linuxapp/eal/eal_vfio_mp_sync.c | 1 +
 1 file changed, 1 insertion(+)
  

Comments

Burakov, Anatoly Feb. 9, 2017, 11:41 a.m. UTC | #1
> -----Original Message-----
> From: Patrick MacArthur [mailto:patrick@patrickmacarthur.net]
> Sent: Thursday, January 26, 2017 11:05 PM
> To: dev@dpdk.org; Burakov, Anatoly <anatoly.burakov@intel.com>
> Cc: Patrick MacArthur <patrick@patrickmacarthur.net>
> Subject: [PATCH] vfio: fix file descriptor leak in multi-process applications
> 
> When a secondary process wants access to the VFIO container file descriptor,
> the primary process calls vfio_get_container_fd() which always opens an
> entirely new file descriptor on /dev/vfio/vfio.
> However, once the file descriptor has been passed to the subprocess, it is
> effectively duplicated, meaning that the copy of the file descriptor in the
> primary process is no longer needed.  However, the primary process does
> not close the duplicate fd, which results in a resource leak.
> 
> This can be reproduced by starting a primary process with a small
> RLIMIT_NOFILE limit configured to use VFIO for at least one device, and
> repeatedly launching secondary processes until the file descriptor limit is
> exceeded.
> 
> Fix the resource leak by closing the local vfio container file descriptor after
> passing it to the secondary process.
> 
> Fixes: 2f4adfad0a69 ("vfio: add multiprocess support")
> Signed-off-by: Patrick MacArthur <patrick@patrickmacarthur.net>
> ---
>  lib/librte_eal/linuxapp/eal/eal_vfio_mp_sync.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio_mp_sync.c b/lib/librte_eal/linuxapp/eal/eal_vfio_mp_sync.c
> index 00cf919b64d0..fb4a2f84b180 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_vfio_mp_sync.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_vfio_mp_sync.c
> @@ -301,6 +301,7 @@ vfio_mp_sync_thread(void __rte_unused * arg)
>  				vfio_mp_sync_send_request(conn_sock, SOCKET_ERR);
>  			else
>  				vfio_mp_sync_send_fd(conn_sock, fd);
> +			close(fd);
>  			break;
>  		case SOCKET_REQ_GROUP:
>  			/* wait for group number */
> --
> 2.9.3


Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
  
Thomas Monjalon Feb. 9, 2017, 5:36 p.m. UTC | #2
> > When a secondary process wants access to the VFIO container file descriptor,
> > the primary process calls vfio_get_container_fd() which always opens an
> > entirely new file descriptor on /dev/vfio/vfio.
> > However, once the file descriptor has been passed to the subprocess, it is
> > effectively duplicated, meaning that the copy of the file descriptor in the
> > primary process is no longer needed.  However, the primary process does
> > not close the duplicate fd, which results in a resource leak.
> > 
> > This can be reproduced by starting a primary process with a small
> > RLIMIT_NOFILE limit configured to use VFIO for at least one device, and
> > repeatedly launching secondary processes until the file descriptor limit is
> > exceeded.
> > 
> > Fix the resource leak by closing the local vfio container file descriptor after
> > passing it to the secondary process.
> > 
> > Fixes: 2f4adfad0a69 ("vfio: add multiprocess support")
> > Signed-off-by: Patrick MacArthur <patrick@patrickmacarthur.net>
> 
> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>

Applied, thanks
  

Patch

diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio_mp_sync.c b/lib/librte_eal/linuxapp/eal/eal_vfio_mp_sync.c
index 00cf919b64d0..fb4a2f84b180 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio_mp_sync.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio_mp_sync.c
@@ -301,6 +301,7 @@  vfio_mp_sync_thread(void __rte_unused * arg)
 				vfio_mp_sync_send_request(conn_sock, SOCKET_ERR);
 			else
 				vfio_mp_sync_send_fd(conn_sock, fd);
+			close(fd);
 			break;
 		case SOCKET_REQ_GROUP:
 			/* wait for group number */