[dpdk-dev] [PATCH v3] mem: command line option to delete hugepage backing files

Bruce Richardson bruce.richardson at intel.com
Thu Oct 22 17:26:39 CEST 2015


On Thu, Oct 22, 2015 at 09:51:28AM +0100, Sergio Gonzalez Monroy wrote:
> On 21/10/2015 17:34, Bruce Richardson wrote:
> >On Wed, Oct 21, 2015 at 04:22:45PM +0000, shesha Sreenivasamurthy (shesha) wrote:
> >>When an application using huge-pages crash or exists, the hugetlbfs
> >>backing files are not cleaned up. This is a patch to clean those files.
> >>There are multi-process DPDK applications that may be benefited by those
> >>backing files. Therefore, I have made that configurable so that the
> >>application that does not need those backing files can remove them, thus
> >>not changing the current default behavior. The application itself can
> >>clean it up, however the rationale behind DPDK cleaning it up is, DPDK
> >>created it and therefore, it is better it unlinks it.
> >>
> >>Signed-off-by: Shesha Sreenivasamurthy <shesha at cisco.com>
> >>---
> >>  lib/librte_eal/common/eal_common_options.c | 12 ++++++++++++
> >>  lib/librte_eal/common/eal_internal_cfg.h   |  1 +
> >>  lib/librte_eal/common/eal_options.h        |  2 ++
> >>  lib/librte_eal/linuxapp/eal/eal_memory.c   | 30
> >>++++++++++++++++++++++++++++++
> >>  4 files changed, 45 insertions(+)
> >>
> ><snip>
> >>+static int
> >>+unlink_hugepage_files(struct hugepage_file *hugepg_tbl,
> >>+		unsigned num_hp_info)
> >>+{
> >>+	unsigned socket, size;
> >>+	int page, nrpages = 0;
> >>+
> >>+	/* get total number of hugepages */
> >>+	for (size = 0; size < num_hp_info; size++)
> >>+		for (socket = 0; socket < RTE_MAX_NUMA_NODES; socket++)
> >>+			nrpages += internal_config.hugepage_info[size].num_pages[socket];
> >>+
> >>+	for (page = 0; page < nrpages; page++) {
> >>+		struct hugepage_file *hp = &hugepg_tbl[page];
> >>+		if (hp->final_va != NULL && unlink(hp->filepath)) {
> >>+			RTE_LOG(WARNING, EAL, "%s(): Removing %s failed: %s\n",
> >>+				__func__, hp->filepath, strerror(errno));
> >>+		}
> >>+	}
> >>+	return 0;
> >>+}
> >>+
> >>  /*
> >>   * unmaps hugepages that are not going to be used. since we originally
> >>allocate
> >>   * ALL hugepages (not just those we need), additional unmapping needs to
> >>be done.
> >>@@ -1289,6 +1311,14 @@ rte_eal_hugepage_init(void)
> >>  		goto fail;
> >>  	}
> >>+	/* free the hugepage backing files */
> >>+	if (internal_config.hugepage_unlink &&
> >>+		unlink_hugepage_files(tmp_hp,
> >>+			internal_config.num_hugepage_sizes) < 0) {
> >>+			RTE_LOG(ERR, EAL, "Unlinking hugepage backing files failed!\n");
> >>+		goto fail;
> >>+	}
> >>+
> >Sorry for the late comment, but...
> >
> >Rather than adding a whole new function to be called here, can the same effect
> >not be got by adding in 2/3 lines like:
> >	if (internal_config.hugepage_unlink)
> >		unlink(hugetlb[i].filepath)
> >
> >at line 409 of eal_memory.c where were have done our final mmap of the file.
> >[You also need the same couple of lines for the 32-bit special case at line 351].
> >It would be a shorter diff.
> >
> >/Bruce
> If you wanted to avoid the extra function call, I might be cleaner to just
> unlink all files when
> doing unmap_all_hugepages_orig.
> My two cents: I think it would be easier to read/debug having a function
> that "unlinks files" instead
> of unlinking files at different points in map_all_hugepages.
> 
> Unfortunately the proposed approach does not work for all cases:
> - If we have single file segment, map_all_hugepages does not get call a
> second time, instead we call
>   remap_all_hugepages
> - If we use options -m or --socket-mem, because unmap_unneeded_hugepages
> does not expect files
>   already unlinked, it will fail when trying to unlink unneeded hugepage
> files.
> 
> The current patch would work as we only unlink after
> unmap_unneeded_hugepages.
> 
> Sergio
> 
I don't mind much where the functionality is done - I mostly care about reducing
the code diff. I'd prefer the unlink to be done either with the mmap of unmap, but
I won't lose any sleep if everyone else prefers a separate function for it.

/Bruce


More information about the dev mailing list