[1/5] mempool: allow unaligned addr/len in populate virt

Message ID 20191028140122.9592-2-olivier.matz@6wind.com (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Series mempool: avoid objects allocations across pages

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation fail Compilation issues
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-compilation success Compile Testing PASS
ci/iol-mellanox-Performance success Performance Testing PASS

Commit Message

Olivier Matz Oct. 28, 2019, 2:01 p.m. UTC
  rte_mempool_populate_virt() currently requires that both addr
and length are page-aligned.

Remove this unneeded constraint which can be annoying with big
hugepages (ex: 1GB).

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 18 +++++++-----------
 lib/librte_mempool/rte_mempool.h |  3 +--
 2 files changed, 8 insertions(+), 13 deletions(-)
  

Comments

Vamsi Krishna Attunuru Oct. 29, 2019, 9:02 a.m. UTC | #1
Hi Olivier,

Thanks for the patch set. I am able to run the tests with 512MB page size with this patch set on the Octeontx2 platform, but somehow the mbuf holds null fields when the pool is created with 2MB page size, and tests like l3fwd and kni are failing due to the malformed mbufs. Can you confirm whether the patch set was verified on any platform with different page sizes? Meanwhile I will also debug this issue.

Regards
A Vamsi

> -----Original Message-----
> From: Olivier Matz <olivier.matz@6wind.com>
> Sent: Monday, October 28, 2019 7:31 PM
> To: dev@dpdk.org
> Cc: Anatoly Burakov <anatoly.burakov@intel.com>; Andrew Rybchenko
> <arybchenko@solarflare.com>; Ferruh Yigit <ferruh.yigit@linux.intel.com>;
> Giridharan, Ganesan <ggiridharan@rbbn.com>; Jerin Jacob Kollanukkaran
> <jerinj@marvell.com>; Kiran Kumar Kokkilagadda <kirankumark@marvell.com>;
> Stephen Hemminger <sthemmin@microsoft.com>; Thomas Monjalon
> <thomas@monjalon.net>; Vamsi Krishna Attunuru <vattunuru@marvell.com>
> Subject: [EXT] [PATCH 1/5] mempool: allow unaligned addr/len in populate virt
> 
> External Email
> 
> ----------------------------------------------------------------------
> rte_mempool_populate_virt() currently requires that both addr and length are
> page-aligned.
> 
> Remove this unneeded constraint which can be annoying with big hugepages (ex:
> 1GB).
> 
> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
> ---
>  lib/librte_mempool/rte_mempool.c | 18 +++++++-----------
> lib/librte_mempool/rte_mempool.h |  3 +--
>  2 files changed, 8 insertions(+), 13 deletions(-)
> 
> diff --git a/lib/librte_mempool/rte_mempool.c
> b/lib/librte_mempool/rte_mempool.c
> index 0f29e8712..76cbacdf3 100644
> --- a/lib/librte_mempool/rte_mempool.c
> +++ b/lib/librte_mempool/rte_mempool.c
> @@ -368,17 +368,11 @@ rte_mempool_populate_virt(struct rte_mempool
> *mp, char *addr,
>  	size_t off, phys_len;
>  	int ret, cnt = 0;
> 
> -	/* address and len must be page-aligned */
> -	if (RTE_PTR_ALIGN_CEIL(addr, pg_sz) != addr)
> -		return -EINVAL;
> -	if (RTE_ALIGN_CEIL(len, pg_sz) != len)
> -		return -EINVAL;
> -
>  	if (mp->flags & MEMPOOL_F_NO_IOVA_CONTIG)
>  		return rte_mempool_populate_iova(mp, addr, RTE_BAD_IOVA,
>  			len, free_cb, opaque);
> 
> -	for (off = 0; off + pg_sz <= len &&
> +	for (off = 0; off < len &&
>  		     mp->populated_size < mp->size; off += phys_len) {
> 
>  		iova = rte_mem_virt2iova(addr + off);
> @@ -389,7 +383,10 @@ rte_mempool_populate_virt(struct rte_mempool *mp, char *addr,
>  		}
> 
>  		/* populate with the largest group of contiguous pages */
> -		for (phys_len = pg_sz; off + phys_len < len; phys_len += pg_sz) {
> +		for (phys_len = RTE_PTR_ALIGN_CEIL(addr + off + 1, pg_sz) -
> +			     (addr + off);
> +		     off + phys_len < len;
> +		     phys_len = RTE_MIN(phys_len + pg_sz, len - off)) {
>  			rte_iova_t iova_tmp;
> 
>  			iova_tmp = rte_mem_virt2iova(addr + off + phys_len);
> @@ -575,8 +572,7 @@ rte_mempool_populate_default(struct rte_mempool
> *mp)
>  			 * have
>  			 */
>  			mz = rte_memzone_reserve_aligned(mz_name, 0,
> -					mp->socket_id, flags,
> -					RTE_MAX(pg_sz, align));
> +					mp->socket_id, flags, align);
>  		}
>  		if (mz == NULL) {
>  			ret = -rte_errno;
> @@ -601,7 +597,7 @@ rte_mempool_populate_default(struct rte_mempool
> *mp)
>  				(void *)(uintptr_t)mz);
>  		else
>  			ret = rte_mempool_populate_virt(mp, mz->addr,
> -				RTE_ALIGN_FLOOR(mz->len, pg_sz), pg_sz,
> +				mz->len, pg_sz,
>  				rte_mempool_memchunk_mz_free,
>  				(void *)(uintptr_t)mz);
>  		if (ret < 0) {
> diff --git a/lib/librte_mempool/rte_mempool.h
> b/lib/librte_mempool/rte_mempool.h
> index 8053f7a04..0fe8aa7b8 100644
> --- a/lib/librte_mempool/rte_mempool.h
> +++ b/lib/librte_mempool/rte_mempool.h
> @@ -1042,9 +1042,8 @@ int rte_mempool_populate_iova(struct rte_mempool
> *mp, char *vaddr,
>   *   A pointer to the mempool structure.
>   * @param addr
>   *   The virtual address of memory that should be used to store objects.
> - *   Must be page-aligned.
>   * @param len
> - *   The length of memory in bytes. Must be page-aligned.
> + *   The length of memory in bytes.
>   * @param pg_sz
>   *   The size of memory pages in this virtual area.
>   * @param free_cb
> --
> 2.20.1
  
Olivier Matz Oct. 29, 2019, 9:13 a.m. UTC | #2
Hi Vamsi,

On Tue, Oct 29, 2019 at 09:02:26AM +0000, Vamsi Krishna Attunuru wrote:
> Hi Olivier,
> 
> Thanks for the patch set. I am able to run the tests with 512MB page size
> with this patch set on the Octeontx2 platform, but somehow the mbuf holds
> null fields when the pool is created with 2MB page size, and tests like
> l3fwd and kni are failing due to the malformed mbufs. Can you confirm
> whether the patch set was verified on any platform with different page
> sizes? Meanwhile I will also debug this issue.

Thank you for testing.
I tested the patch locally on x86 with 2MB huge pages, and using
travis-ci.

Maybe it is related to the octeontx2 mempool driver? There is a specific
condition to align the object start address to a multiple of total_elt_sz.

Is it this specific patch that breaks your test? Or is it the full patchset?

Thanks,
Olivier



> 
> Regards
> A Vamsi
> 
> > -----Original Message-----
> > From: Olivier Matz <olivier.matz@6wind.com>
> > Sent: Monday, October 28, 2019 7:31 PM
> > To: dev@dpdk.org
> > Cc: Anatoly Burakov <anatoly.burakov@intel.com>; Andrew Rybchenko
> > <arybchenko@solarflare.com>; Ferruh Yigit <ferruh.yigit@linux.intel.com>;
> > Giridharan, Ganesan <ggiridharan@rbbn.com>; Jerin Jacob Kollanukkaran
> > <jerinj@marvell.com>; Kiran Kumar Kokkilagadda <kirankumark@marvell.com>;
> > Stephen Hemminger <sthemmin@microsoft.com>; Thomas Monjalon
> > <thomas@monjalon.net>; Vamsi Krishna Attunuru <vattunuru@marvell.com>
> > Subject: [EXT] [PATCH 1/5] mempool: allow unaligned addr/len in populate virt
> > 
> > External Email
> > 
> > ----------------------------------------------------------------------
> > rte_mempool_populate_virt() currently requires that both addr and length are
> > page-aligned.
> > 
> > Remove this unneeded constraint which can be annoying with big hugepages (ex:
> > 1GB).
> > 
> > Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
> > ---
> >  lib/librte_mempool/rte_mempool.c | 18 +++++++-----------
> > lib/librte_mempool/rte_mempool.h |  3 +--
> >  2 files changed, 8 insertions(+), 13 deletions(-)
> > 
> > diff --git a/lib/librte_mempool/rte_mempool.c
> > b/lib/librte_mempool/rte_mempool.c
> > index 0f29e8712..76cbacdf3 100644
> > --- a/lib/librte_mempool/rte_mempool.c
> > +++ b/lib/librte_mempool/rte_mempool.c
> > @@ -368,17 +368,11 @@ rte_mempool_populate_virt(struct rte_mempool
> > *mp, char *addr,
> >  	size_t off, phys_len;
> >  	int ret, cnt = 0;
> > 
> > -	/* address and len must be page-aligned */
> > -	if (RTE_PTR_ALIGN_CEIL(addr, pg_sz) != addr)
> > -		return -EINVAL;
> > -	if (RTE_ALIGN_CEIL(len, pg_sz) != len)
> > -		return -EINVAL;
> > -
> >  	if (mp->flags & MEMPOOL_F_NO_IOVA_CONTIG)
> >  		return rte_mempool_populate_iova(mp, addr, RTE_BAD_IOVA,
> >  			len, free_cb, opaque);
> > 
> > -	for (off = 0; off + pg_sz <= len &&
> > +	for (off = 0; off < len &&
> >  		     mp->populated_size < mp->size; off += phys_len) {
> > 
> >  		iova = rte_mem_virt2iova(addr + off);
> > @@ -389,7 +383,10 @@ rte_mempool_populate_virt(struct rte_mempool *mp, char *addr,
> >  		}
> > 
> >  		/* populate with the largest group of contiguous pages */
> > -		for (phys_len = pg_sz; off + phys_len < len; phys_len += pg_sz) {
> > +		for (phys_len = RTE_PTR_ALIGN_CEIL(addr + off + 1, pg_sz) -
> > +			     (addr + off);
> > +		     off + phys_len < len;
> > +		     phys_len = RTE_MIN(phys_len + pg_sz, len - off)) {
> >  			rte_iova_t iova_tmp;
> > 
> >  			iova_tmp = rte_mem_virt2iova(addr + off + phys_len);
> > @@ -575,8 +572,7 @@ rte_mempool_populate_default(struct rte_mempool
> > *mp)
> >  			 * have
> >  			 */
> >  			mz = rte_memzone_reserve_aligned(mz_name, 0,
> > -					mp->socket_id, flags,
> > -					RTE_MAX(pg_sz, align));
> > +					mp->socket_id, flags, align);
> >  		}
> >  		if (mz == NULL) {
> >  			ret = -rte_errno;
> > @@ -601,7 +597,7 @@ rte_mempool_populate_default(struct rte_mempool
> > *mp)
> >  				(void *)(uintptr_t)mz);
> >  		else
> >  			ret = rte_mempool_populate_virt(mp, mz->addr,
> > -				RTE_ALIGN_FLOOR(mz->len, pg_sz), pg_sz,
> > +				mz->len, pg_sz,
> >  				rte_mempool_memchunk_mz_free,
> >  				(void *)(uintptr_t)mz);
> >  		if (ret < 0) {
> > diff --git a/lib/librte_mempool/rte_mempool.h
> > b/lib/librte_mempool/rte_mempool.h
> > index 8053f7a04..0fe8aa7b8 100644
> > --- a/lib/librte_mempool/rte_mempool.h
> > +++ b/lib/librte_mempool/rte_mempool.h
> > @@ -1042,9 +1042,8 @@ int rte_mempool_populate_iova(struct rte_mempool
> > *mp, char *vaddr,
> >   *   A pointer to the mempool structure.
> >   * @param addr
> >   *   The virtual address of memory that should be used to store objects.
> > - *   Must be page-aligned.
> >   * @param len
> > - *   The length of memory in bytes. Must be page-aligned.
> > + *   The length of memory in bytes.
> >   * @param pg_sz
> >   *   The size of memory pages in this virtual area.
> >   * @param free_cb
> > --
> > 2.20.1
>
  
Vamsi Krishna Attunuru Oct. 29, 2019, 9:18 a.m. UTC | #3
Hi Olivier,

> -----Original Message-----
> From: Olivier Matz <olivier.matz@6wind.com>
> Sent: Tuesday, October 29, 2019 2:43 PM
> To: Vamsi Krishna Attunuru <vattunuru@marvell.com>
> Cc: Anatoly Burakov <anatoly.burakov@intel.com>; Andrew Rybchenko
> <arybchenko@solarflare.com>; Ferruh Yigit <ferruh.yigit@linux.intel.com>;
> Giridharan, Ganesan <ggiridharan@rbbn.com>; Jerin Jacob Kollanukkaran
> <jerinj@marvell.com>; Kiran Kumar Kokkilagadda <kirankumark@marvell.com>;
> Stephen Hemminger <sthemmin@microsoft.com>; Thomas Monjalon
> <thomas@monjalon.net>; dev@dpdk.org
> Subject: Re: [EXT] [PATCH 1/5] mempool: allow unaligned addr/len in populate
> virt
> 
> Hi Vamsi,
> 
> On Tue, Oct 29, 2019 at 09:02:26AM +0000, Vamsi Krishna Attunuru wrote:
> > Hi Olivier,
> >
> > Thanks for the patch set. I am able to run the tests with 512MB page size
> > with this patch set on the Octeontx2 platform, but somehow the mbuf holds
> > null fields when the pool is created with 2MB page size, and tests like
> > l3fwd and kni are failing due to the malformed mbufs. Can you confirm
> > whether the patch set was verified on any platform with different page
> > sizes? Meanwhile I will also debug this issue.
> 
> Thank you for testing.
> I tested the patch locally on x86 with 2MB huge pages, and using travis-ci.
> 
> Maybe it is related to the octeontx2 mempool driver? There is a specific
> condition to align the object start address to a multiple of total_elt_sz.
> 
> Is it this specific patch that breaks your test? Or is it the full patchset?

No, I am testing the full patchset. I will check with individual patches to narrow down the non-working case.

> 
> Thanks,
> Olivier
> 
> 
> 
> >
> > Regards
> > A Vamsi
> >
> > > -----Original Message-----
> > > From: Olivier Matz <olivier.matz@6wind.com>
> > > Sent: Monday, October 28, 2019 7:31 PM
> > > To: dev@dpdk.org
> > > Cc: Anatoly Burakov <anatoly.burakov@intel.com>; Andrew Rybchenko
> > > <arybchenko@solarflare.com>; Ferruh Yigit
> > > <ferruh.yigit@linux.intel.com>; Giridharan, Ganesan
> > > <ggiridharan@rbbn.com>; Jerin Jacob Kollanukkaran
> > > <jerinj@marvell.com>; Kiran Kumar Kokkilagadda
> > > <kirankumark@marvell.com>; Stephen Hemminger
> > > <sthemmin@microsoft.com>; Thomas Monjalon <thomas@monjalon.net>;
> > > Vamsi Krishna Attunuru <vattunuru@marvell.com>
> > > Subject: [EXT] [PATCH 1/5] mempool: allow unaligned addr/len in
> > > populate virt
> > >
> > > External Email
> > >
> > > --------------------------------------------------------------------
> > > --
> > > rte_mempool_populate_virt() currently requires that both addr and
> > > length are page-aligned.
> > >
> > > Remove this unneeded constraint which can be annoying with big hugepages
> > > (ex: 1GB).
> > >
> > > Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
> > > ---
> > >  lib/librte_mempool/rte_mempool.c | 18 +++++++-----------
> > > lib/librte_mempool/rte_mempool.h |  3 +--
> > >  2 files changed, 8 insertions(+), 13 deletions(-)
> > >
> > > diff --git a/lib/librte_mempool/rte_mempool.c
> > > b/lib/librte_mempool/rte_mempool.c
> > > index 0f29e8712..76cbacdf3 100644
> > > --- a/lib/librte_mempool/rte_mempool.c
> > > +++ b/lib/librte_mempool/rte_mempool.c
> > > @@ -368,17 +368,11 @@ rte_mempool_populate_virt(struct rte_mempool
> > > *mp, char *addr,
> > >  	size_t off, phys_len;
> > >  	int ret, cnt = 0;
> > >
> > > -	/* address and len must be page-aligned */
> > > -	if (RTE_PTR_ALIGN_CEIL(addr, pg_sz) != addr)
> > > -		return -EINVAL;
> > > -	if (RTE_ALIGN_CEIL(len, pg_sz) != len)
> > > -		return -EINVAL;
> > > -
> > >  	if (mp->flags & MEMPOOL_F_NO_IOVA_CONTIG)
> > >  		return rte_mempool_populate_iova(mp, addr, RTE_BAD_IOVA,
> > >  			len, free_cb, opaque);
> > >
> > > -	for (off = 0; off + pg_sz <= len &&
> > > +	for (off = 0; off < len &&
> > >  		     mp->populated_size < mp->size; off += phys_len) {
> > >
> > >  		iova = rte_mem_virt2iova(addr + off);
> > > @@ -389,7 +383,10 @@ rte_mempool_populate_virt(struct rte_mempool *mp, char *addr,
> > >  		}
> > >
> > >  		/* populate with the largest group of contiguous pages */
> > > -		for (phys_len = pg_sz; off + phys_len < len; phys_len += pg_sz) {
> > > +		for (phys_len = RTE_PTR_ALIGN_CEIL(addr + off + 1, pg_sz) -
> > > +			     (addr + off);
> > > +		     off + phys_len < len;
> > > +		     phys_len = RTE_MIN(phys_len + pg_sz, len - off)) {
> > >  			rte_iova_t iova_tmp;
> > >
> > >  			iova_tmp = rte_mem_virt2iova(addr + off + phys_len);
> > > @@ -575,8 +572,7 @@ rte_mempool_populate_default(struct rte_mempool *mp)
> > >  			 * have
> > >  			 */
> > >  			mz = rte_memzone_reserve_aligned(mz_name, 0,
> > > -					mp->socket_id, flags,
> > > -					RTE_MAX(pg_sz, align));
> > > +					mp->socket_id, flags, align);
> > >  		}
> > >  		if (mz == NULL) {
> > >  			ret = -rte_errno;
> > > @@ -601,7 +597,7 @@ rte_mempool_populate_default(struct rte_mempool *mp)
> > >  				(void *)(uintptr_t)mz);
> > >  		else
> > >  			ret = rte_mempool_populate_virt(mp, mz->addr,
> > > -				RTE_ALIGN_FLOOR(mz->len, pg_sz), pg_sz,
> > > +				mz->len, pg_sz,
> > >  				rte_mempool_memchunk_mz_free,
> > >  				(void *)(uintptr_t)mz);
> > >  		if (ret < 0) {
> > > diff --git a/lib/librte_mempool/rte_mempool.h
> > > b/lib/librte_mempool/rte_mempool.h
> > > index 8053f7a04..0fe8aa7b8 100644
> > > --- a/lib/librte_mempool/rte_mempool.h
> > > +++ b/lib/librte_mempool/rte_mempool.h
> > > @@ -1042,9 +1042,8 @@ int rte_mempool_populate_iova(struct
> > > rte_mempool *mp, char *vaddr,
> > >   *   A pointer to the mempool structure.
> > >   * @param addr
> > >   *   The virtual address of memory that should be used to store objects.
> > > - *   Must be page-aligned.
> > >   * @param len
> > > - *   The length of memory in bytes. Must be page-aligned.
> > > + *   The length of memory in bytes.
> > >   * @param pg_sz
> > >   *   The size of memory pages in this virtual area.
> > >   * @param free_cb
> > > --
> > > 2.20.1
> >
  
Andrew Rybchenko Oct. 29, 2019, 9:21 a.m. UTC | #4
On 10/28/19 5:01 PM, Olivier Matz wrote:
> rte_mempool_populate_virt() currently requires that both addr
> and length are page-aligned.
>
> Remove this unneeded constraint which can be annoying with big
> hugepages (ex: 1GB).
>
> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>

One note below, other than that
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>

> ---
>   lib/librte_mempool/rte_mempool.c | 18 +++++++-----------
>   lib/librte_mempool/rte_mempool.h |  3 +--
>   2 files changed, 8 insertions(+), 13 deletions(-)
>
> diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
> index 0f29e8712..76cbacdf3 100644
> --- a/lib/librte_mempool/rte_mempool.c
> +++ b/lib/librte_mempool/rte_mempool.c
> @@ -368,17 +368,11 @@ rte_mempool_populate_virt(struct rte_mempool *mp, char *addr,
>   	size_t off, phys_len;
>   	int ret, cnt = 0;
>   
> -	/* address and len must be page-aligned */
> -	if (RTE_PTR_ALIGN_CEIL(addr, pg_sz) != addr)
> -		return -EINVAL;
> -	if (RTE_ALIGN_CEIL(len, pg_sz) != len)
> -		return -EINVAL;
> -
>   	if (mp->flags & MEMPOOL_F_NO_IOVA_CONTIG)
>   		return rte_mempool_populate_iova(mp, addr, RTE_BAD_IOVA,
>   			len, free_cb, opaque);
>   
> -	for (off = 0; off + pg_sz <= len &&
> +	for (off = 0; off < len &&
>   		     mp->populated_size < mp->size; off += phys_len) {
>   
>   		iova = rte_mem_virt2iova(addr + off);
> @@ -389,7 +383,10 @@ rte_mempool_populate_virt(struct rte_mempool *mp, char *addr,
>   		}
>   
>   		/* populate with the largest group of contiguous pages */
> -		for (phys_len = pg_sz; off + phys_len < len; phys_len += pg_sz) {
> +		for (phys_len = RTE_PTR_ALIGN_CEIL(addr + off + 1, pg_sz) -
> +			     (addr + off);
> +		     off + phys_len < len;

If the condition is false on the first check, below we'll populate memory
outside of the specified length. So we should either apply RTE_MIN above,
when phys_len is initialized, or drop the RTE_MIN in the next line but
apply it when rte_mempool_populate_iova() is called.

A bonus question, not directly related to the patch, is the
iova == RTE_BAD_IOVA case when !rte_eal_has_hugepages():
is it expected that we still use the arithmetic iova + phys_len in this case?
I guess the comparison will always be false and pages will never be merged,
but it looks suspicious anyway.

> +		     phys_len = RTE_MIN(phys_len + pg_sz, len - off)) {
>   			rte_iova_t iova_tmp;
>   
>   			iova_tmp = rte_mem_virt2iova(addr + off + phys_len);
> @@ -575,8 +572,7 @@ rte_mempool_populate_default(struct rte_mempool *mp)
>   			 * have
>   			 */
>   			mz = rte_memzone_reserve_aligned(mz_name, 0,
> -					mp->socket_id, flags,
> -					RTE_MAX(pg_sz, align));
> +					mp->socket_id, flags, align);
>   		}
>   		if (mz == NULL) {
>   			ret = -rte_errno;
> @@ -601,7 +597,7 @@ rte_mempool_populate_default(struct rte_mempool *mp)
>   				(void *)(uintptr_t)mz);
>   		else
>   			ret = rte_mempool_populate_virt(mp, mz->addr,
> -				RTE_ALIGN_FLOOR(mz->len, pg_sz), pg_sz,
> +				mz->len, pg_sz,
>   				rte_mempool_memchunk_mz_free,
>   				(void *)(uintptr_t)mz);
>   		if (ret < 0) {
> diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
> index 8053f7a04..0fe8aa7b8 100644
> --- a/lib/librte_mempool/rte_mempool.h
> +++ b/lib/librte_mempool/rte_mempool.h
> @@ -1042,9 +1042,8 @@ int rte_mempool_populate_iova(struct rte_mempool *mp, char *vaddr,
>    *   A pointer to the mempool structure.
>    * @param addr
>    *   The virtual address of memory that should be used to store objects.
> - *   Must be page-aligned.
>    * @param len
> - *   The length of memory in bytes. Must be page-aligned.
> + *   The length of memory in bytes.
>    * @param pg_sz
>    *   The size of memory pages in this virtual area.
>    * @param free_cb
  
Olivier Matz Oct. 29, 2019, 5:02 p.m. UTC | #5
Hi Andrew,

On Tue, Oct 29, 2019 at 12:21:27PM +0300, Andrew Rybchenko wrote:
> On 10/28/19 5:01 PM, Olivier Matz wrote:
> > rte_mempool_populate_virt() currently requires that both addr
> > and length are page-aligned.
> > 
> > Remove this unneeded constraint which can be annoying with big
> > hugepages (ex: 1GB).
> > 
> > Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
> 
> One note below, other than that
> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
> 
> > ---
> >   lib/librte_mempool/rte_mempool.c | 18 +++++++-----------
> >   lib/librte_mempool/rte_mempool.h |  3 +--
> >   2 files changed, 8 insertions(+), 13 deletions(-)
> > 
> > diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
> > index 0f29e8712..76cbacdf3 100644
> > --- a/lib/librte_mempool/rte_mempool.c
> > +++ b/lib/librte_mempool/rte_mempool.c
> > @@ -368,17 +368,11 @@ rte_mempool_populate_virt(struct rte_mempool *mp, char *addr,
> >   	size_t off, phys_len;
> >   	int ret, cnt = 0;
> > -	/* address and len must be page-aligned */
> > -	if (RTE_PTR_ALIGN_CEIL(addr, pg_sz) != addr)
> > -		return -EINVAL;
> > -	if (RTE_ALIGN_CEIL(len, pg_sz) != len)
> > -		return -EINVAL;
> > -
> >   	if (mp->flags & MEMPOOL_F_NO_IOVA_CONTIG)
> >   		return rte_mempool_populate_iova(mp, addr, RTE_BAD_IOVA,
> >   			len, free_cb, opaque);
> > -	for (off = 0; off + pg_sz <= len &&
> > +	for (off = 0; off < len &&
> >   		     mp->populated_size < mp->size; off += phys_len) {
> >   		iova = rte_mem_virt2iova(addr + off);
> > @@ -389,7 +383,10 @@ rte_mempool_populate_virt(struct rte_mempool *mp, char *addr,
> >   		}
> >   		/* populate with the largest group of contiguous pages */
> > -		for (phys_len = pg_sz; off + phys_len < len; phys_len += pg_sz) {
> > +		for (phys_len = RTE_PTR_ALIGN_CEIL(addr + off + 1, pg_sz) -
> > +			     (addr + off);
> > +		     off + phys_len < len;
> 
> If the condition is false on the first check, below we'll populate memory
> outside of the specified length. So we should either apply RTE_MIN above,
> when phys_len is initialized, or drop the RTE_MIN in the next line but
> apply it when rte_mempool_populate_iova() is called.

Correct, I'll fix it by adding a RTE_MIN(). Maybe I'll just move it
outside of the for loop.

> A bonus question, not directly related to the patch, is the
> iova == RTE_BAD_IOVA case when !rte_eal_has_hugepages():
> is it expected that we still use the arithmetic iova + phys_len in this case?
> I guess the comparison will always be false and pages will never be merged,
> but it looks suspicious anyway.

Yes, this is not very elegant.
I can change the test to
	if (iova_tmp != RTE_BAD_IOVA || iova_tmp != iova + phys_len)

> 
> > +		     phys_len = RTE_MIN(phys_len + pg_sz, len - off)) {
> >   			rte_iova_t iova_tmp;
> >   			iova_tmp = rte_mem_virt2iova(addr + off + phys_len);
> > @@ -575,8 +572,7 @@ rte_mempool_populate_default(struct rte_mempool *mp)
> >   			 * have
> >   			 */
> >   			mz = rte_memzone_reserve_aligned(mz_name, 0,
> > -					mp->socket_id, flags,
> > -					RTE_MAX(pg_sz, align));
> > +					mp->socket_id, flags, align);
> >   		}
> >   		if (mz == NULL) {
> >   			ret = -rte_errno;
> > @@ -601,7 +597,7 @@ rte_mempool_populate_default(struct rte_mempool *mp)
> >   				(void *)(uintptr_t)mz);
> >   		else
> >   			ret = rte_mempool_populate_virt(mp, mz->addr,
> > -				RTE_ALIGN_FLOOR(mz->len, pg_sz), pg_sz,
> > +				mz->len, pg_sz,
> >   				rte_mempool_memchunk_mz_free,
> >   				(void *)(uintptr_t)mz);
> >   		if (ret < 0) {
> > diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
> > index 8053f7a04..0fe8aa7b8 100644
> > --- a/lib/librte_mempool/rte_mempool.h
> > +++ b/lib/librte_mempool/rte_mempool.h
> > @@ -1042,9 +1042,8 @@ int rte_mempool_populate_iova(struct rte_mempool *mp, char *vaddr,
> >    *   A pointer to the mempool structure.
> >    * @param addr
> >    *   The virtual address of memory that should be used to store objects.
> > - *   Must be page-aligned.
> >    * @param len
> > - *   The length of memory in bytes. Must be page-aligned.
> > + *   The length of memory in bytes.
> >    * @param pg_sz
> >    *   The size of memory pages in this virtual area.
> >    * @param free_cb
>
  

Patch

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 0f29e8712..76cbacdf3 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -368,17 +368,11 @@  rte_mempool_populate_virt(struct rte_mempool *mp, char *addr,
 	size_t off, phys_len;
 	int ret, cnt = 0;
 
-	/* address and len must be page-aligned */
-	if (RTE_PTR_ALIGN_CEIL(addr, pg_sz) != addr)
-		return -EINVAL;
-	if (RTE_ALIGN_CEIL(len, pg_sz) != len)
-		return -EINVAL;
-
 	if (mp->flags & MEMPOOL_F_NO_IOVA_CONTIG)
 		return rte_mempool_populate_iova(mp, addr, RTE_BAD_IOVA,
 			len, free_cb, opaque);
 
-	for (off = 0; off + pg_sz <= len &&
+	for (off = 0; off < len &&
 		     mp->populated_size < mp->size; off += phys_len) {
 
 		iova = rte_mem_virt2iova(addr + off);
@@ -389,7 +383,10 @@  rte_mempool_populate_virt(struct rte_mempool *mp, char *addr,
 		}
 
 		/* populate with the largest group of contiguous pages */
-		for (phys_len = pg_sz; off + phys_len < len; phys_len += pg_sz) {
+		for (phys_len = RTE_PTR_ALIGN_CEIL(addr + off + 1, pg_sz) -
+			     (addr + off);
+		     off + phys_len < len;
+		     phys_len = RTE_MIN(phys_len + pg_sz, len - off)) {
 			rte_iova_t iova_tmp;
 
 			iova_tmp = rte_mem_virt2iova(addr + off + phys_len);
@@ -575,8 +572,7 @@  rte_mempool_populate_default(struct rte_mempool *mp)
 			 * have
 			 */
 			mz = rte_memzone_reserve_aligned(mz_name, 0,
-					mp->socket_id, flags,
-					RTE_MAX(pg_sz, align));
+					mp->socket_id, flags, align);
 		}
 		if (mz == NULL) {
 			ret = -rte_errno;
@@ -601,7 +597,7 @@  rte_mempool_populate_default(struct rte_mempool *mp)
 				(void *)(uintptr_t)mz);
 		else
 			ret = rte_mempool_populate_virt(mp, mz->addr,
-				RTE_ALIGN_FLOOR(mz->len, pg_sz), pg_sz,
+				mz->len, pg_sz,
 				rte_mempool_memchunk_mz_free,
 				(void *)(uintptr_t)mz);
 		if (ret < 0) {
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 8053f7a04..0fe8aa7b8 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -1042,9 +1042,8 @@  int rte_mempool_populate_iova(struct rte_mempool *mp, char *vaddr,
  *   A pointer to the mempool structure.
  * @param addr
  *   The virtual address of memory that should be used to store objects.
- *   Must be page-aligned.
  * @param len
- *   The length of memory in bytes. Must be page-aligned.
+ *   The length of memory in bytes.
  * @param pg_sz
  *   The size of memory pages in this virtual area.
  * @param free_cb