[dpdk-dev] [PATCH 3/4] mempool: introduce block size align flag
santosh
santosh.shukla at caviumnetworks.com
Wed Jul 5 09:35:57 CEST 2017
Hi Olivier,
On Monday 03 July 2017 10:07 PM, Olivier Matz wrote:
> On Wed, 21 Jun 2017 17:32:47 +0000, Santosh Shukla <santosh.shukla at caviumnetworks.com> wrote:
>> Some mempool hw like octeontx/fpa block, demands block size aligned
>> buffer address.
>>
> What is the meaning of block size aligned?
Block size is total_elt_sz.
> Does it mean that the address has to be a multiple of total_elt_size?
yes.
> Is this constraint on the virtual address only?
>
No, both: virtual and physical address.
>> Introducing an MEMPOOL_F_POOL_BLK_SZ_ALIGNED flag.
>> If this flag is set:
>> 1) adjust 'off' value to block size aligned value.
>> 2) Allocate one additional buffer. This buffer is used to make sure that
>> requested 'n' buffers get correctly populated to mempool.
>> Example:
>> elem_sz = 2432 // total element size.
>> n = 2111 // requested number of buffer.
>> off = 2304 // new buf_offset value after step 1)
>> vaddr = 0x0 // actual start address of pool
>> pool_len = 5133952 // total pool length i.e.. (elem_sz * n)
>>
>> Since 'off' is a non-zero value so below condition would fail for the
>> block size align case.
>>
>> (((vaddr + off) + (elem_sz * n)) <= (vaddr + pool_len))
>>
>> Which is incorrect behavior. Additional buffer will solve this
>> problem and correctly populate 'n' buffer to mempool for the aligned
>> mode.
> Sorry, but the example is not very clear.
>
Which part?
I'll try to reword.
The problem statement is:
- We want the start address of each buffer aligned to block_sz, a.k.a. total_elt_sz.
Proposed solution in this patch:
- Let's say we get a memory chunk of size 'x' from the memzone.
- Ideally we would use buffers at addresses 0 to (x - block_sz).
- The first buffer address, i.e. 0, is not necessarily aligned to block_sz.
- So we derive an offset value, 'off', for block_sz alignment.
- That 'off' makes sure that the first va/pa address of a buffer is block_sz aligned.
- Calculating 'off' may end up sacrificing the first buffer of the pool, so the total
number of buffers in the pool is n-1, which is incorrect behavior. That's why
we add one additional buffer: we request the memzone to allocate a pool area of
((n + 1) * total_elt_sz) when the F_BLK_SZ_ALIGNED flag is set.
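To make the arithmetic concrete, here is a minimal standalone sketch (not the patch code itself). The helper names blk_align_off() and elts_that_fit() are made up for illustration, and the chunk start address of 128 is an assumption, picked so that 'off' comes out as 2304 as in the example quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Offset from 'vaddr' to the next multiple of 'total_elt_sz', computed
 * the same way as in the patch. Note that it yields total_elt_sz (not 0)
 * when 'vaddr' is already aligned.
 * Hypothetical helper, not a DPDK function.
 */
static size_t blk_align_off(uintptr_t vaddr, size_t total_elt_sz)
{
	assert(total_elt_sz != 0);
	return total_elt_sz - (vaddr % total_elt_sz);
}

/*
 * Number of whole elements that fit in [vaddr + off, vaddr + len).
 * Hypothetical helper, not a DPDK function.
 */
static size_t elts_that_fit(uintptr_t vaddr, size_t len, size_t total_elt_sz)
{
	size_t off = blk_align_off(vaddr, total_elt_sz);

	if (off > len)
		return 0;
	return (len - off) / total_elt_sz;
}
```

With total_elt_sz = 2432 and n = 2111, a chunk of n * total_elt_sz bytes starting at 128 holds only 2110 aligned elements, while a chunk of (n + 1) * total_elt_sz bytes holds the requested 2111.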
>> Signed-off-by: Santosh Shukla <santosh.shukla at caviumnetworks.com>
>> Signed-off-by: Jerin Jacob <jerin.jacob at caviumnetworks.com>
>> ---
>> lib/librte_mempool/rte_mempool.c | 19 ++++++++++++++++---
>> lib/librte_mempool/rte_mempool.h | 1 +
>> 2 files changed, 17 insertions(+), 3 deletions(-)
>>
>> diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
>> index 7dec2f51d..2010857f0 100644
>> --- a/lib/librte_mempool/rte_mempool.c
>> +++ b/lib/librte_mempool/rte_mempool.c
>> @@ -350,7 +350,7 @@ rte_mempool_populate_phys(struct rte_mempool *mp, char *vaddr,
>> {
>> unsigned total_elt_sz;
>> unsigned i = 0;
>> - size_t off;
>> + size_t off, delta;
>> struct rte_mempool_memhdr *memhdr;
>> int ret;
>>
>> @@ -387,7 +387,15 @@ rte_mempool_populate_phys(struct rte_mempool *mp, char *vaddr,
>> memhdr->free_cb = free_cb;
>> memhdr->opaque = opaque;
>>
>> - if (mp->flags & MEMPOOL_F_NO_CACHE_ALIGN)
>> + if (mp->flags & MEMPOOL_F_POOL_BLK_SZ_ALIGNED) {
>> + delta = (uintptr_t)vaddr % total_elt_sz;
>> + off = total_elt_sz - delta;
>> + /* Validate alignment */
>> + if (((uintptr_t)vaddr + off) % total_elt_sz) {
>> + RTE_LOG(ERR, MEMPOOL, "vaddr(%p) not aligned to total_elt_sz(%u)\n", (vaddr + off), total_elt_sz);
>> + return -EINVAL;
>> + }
>> + } else if (mp->flags & MEMPOOL_F_NO_CACHE_ALIGN)
>> off = RTE_PTR_ALIGN_CEIL(vaddr, 8) - vaddr;
>> else
>> off = RTE_PTR_ALIGN_CEIL(vaddr, RTE_CACHE_LINE_SIZE) - vaddr;
> What is the purpose of this test? Can it fail?
The purpose is to sanity-check the blk_sz alignment. No, it won't fail.
I thought it better to keep the sanity check, but if you see no value in it
I will remove it in v2.
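As a rough illustration of why the check cannot trigger: vaddr + off equals (vaddr - delta) + total_elt_sz, and both terms are multiples of total_elt_sz (ignoring wrap-around at the very top of the address space). The sketch below brute-forces this over a range of start addresses. The function names are hypothetical and only the arithmetic is taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Returns 1 if the validation condition from the patch would trigger
 * for this start address, 0 otherwise. Hypothetical helper.
 */
static int validation_would_fail(uintptr_t vaddr, size_t total_elt_sz)
{
	size_t delta, off;

	assert(total_elt_sz != 0);
	delta = vaddr % total_elt_sz;
	off = total_elt_sz - delta;
	return ((vaddr + off) % total_elt_sz) != 0;
}

/* Brute-force the check over every start address in [first, last]. */
static int any_failure(uintptr_t first, uintptr_t last, size_t total_elt_sz)
{
	uintptr_t v;

	for (v = first; v <= last; v++)
		if (validation_would_fail(v, total_elt_sz))
			return 1;
	return 0;
}
```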
> Not sure having the delta variable is helpful. However, adding a
> small comment like this could help:
>
> /* align object start address to a multiple of total_elt_sz */
> off = total_elt_sz - ((uintptr_t)vaddr % total_elt_sz);
>
> About style, please don't mix brackets and no-bracket blocks in the
> same if/elseif/else.
OK, will fix in v2.
>> @@ -555,8 +563,13 @@ rte_mempool_populate_default(struct rte_mempool *mp)
>> }
>>
>> total_elt_sz = mp->header_size + mp->elt_size + mp->trailer_size;
>> +
>> for (mz_id = 0, n = mp->size; n > 0; mz_id++, n -= ret) {
>> - size = rte_mempool_xmem_size(n, total_elt_sz, pg_shift);
>> + if (mp->flags & MEMPOOL_F_POOL_BLK_SZ_ALIGNED)
>> + size = rte_mempool_xmem_size(n + 1, total_elt_sz,
>> + pg_shift);
>> + else
>> + size = rte_mempool_xmem_size(n, total_elt_sz, pg_shift);
>>
>> ret = snprintf(mz_name, sizeof(mz_name),
>> RTE_MEMPOOL_MZ_FORMAT "_%d", mp->name, mz_id);
>
> One issue I see here is that this new flag breaks the function
> rte_mempool_xmem_size(), which calculates the maximum amount of memory
> required to store a given number of objects.
>
> It also probably breaks rte_mempool_xmem_usage().
>
> I don't have any good solution for now. A possibility is to change
> the behavior of these functions for everyone, meaning that we will
> always reserve more memory that really required. If this is done on
> every memory chunk (struct rte_mempool_memhdr), it can eat a lot
> of memory.
>
> Another approach would be to change the API of this function to
> pass the capability flags, or the mempool pointer... but there is
> a problem because these functions are usually called before the
> mempool is instanciated.
>
Per my description in [1/4]: if we agree to call
_ops_get_capability() at the very beginning, i.e. in _populate_default(),
then 'mp->flags' holds the capability flag, and we could add one more argument
to _xmem_size(, flag) / _xmem_usage(, flag):
- xmem_size() / xmem_usage() check for that capability bit in 'flag'.
- If it is set, they increase 'elt_num' by one.
That way your approach 2) makes sense to me and fits well
in the design. It won't waste memory as you mentioned for approach 1).
Does that make sense?
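A rough sketch of what I have in mind, as a simplified standalone model rather than the real rte_mempool_xmem_size(). The name sketch_xmem_size(), the extra 'flags' argument and the SKETCH_F_BLK_SZ_ALIGNED macro are all hypothetical. The flag value just mirrors the one proposed in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability bit; value mirrors the flag proposed in this patch. */
#define SKETCH_F_BLK_SZ_ALIGNED 0x0080

/*
 * Simplified model of a pool size calculation that takes capability flags.
 * When the block-size-aligned bit is set, one extra element is accounted
 * for, to make up for the element lost to the alignment offset.
 */
static size_t sketch_xmem_size(uint32_t elt_num, size_t total_elt_sz,
			       uint32_t pg_shift, unsigned int flags)
{
	size_t pg_sz, obj_per_page, pg_num;

	assert(pg_shift < sizeof(size_t) * 8);

	if (flags & SKETCH_F_BLK_SZ_ALIGNED)
		elt_num += 1;

	if (total_elt_sz == 0)
		return 0;

	/* pg_shift == 0: memory is treated as one contiguous area */
	if (pg_shift == 0)
		return total_elt_sz * elt_num;

	pg_sz = (size_t)1 << pg_shift;
	obj_per_page = pg_sz / total_elt_sz;
	if (obj_per_page == 0) {
		/* element larger than a page: round each element up to whole pages */
		size_t pages_per_elt = (total_elt_sz + pg_sz - 1) / pg_sz;

		return pages_per_elt * pg_sz * elt_num;
	}

	/* whole pages needed when elements do not straddle page boundaries */
	pg_num = (elt_num + obj_per_page - 1) / obj_per_page;
	return pg_num << pg_shift;
}
```

Callers that do not pass the capability bit get the same size as today, so only pools that need block-size alignment pay for the extra element.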
>> diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
>> index fd8722e69..99a20263d 100644
>> --- a/lib/librte_mempool/rte_mempool.h
>> +++ b/lib/librte_mempool/rte_mempool.h
>> @@ -267,6 +267,7 @@ struct rte_mempool {
>> #define MEMPOOL_F_POOL_CREATED 0x0010 /**< Internal: pool is created. */
>> #define MEMPOOL_F_NO_PHYS_CONTIG 0x0020 /**< Don't need physically contiguous objs. */
>> #define MEMPOOL_F_POOL_CONTIG 0x0040 /**< Detect physcially contiguous objs */
>> +#define MEMPOOL_F_POOL_BLK_SZ_ALIGNED 0x0080 /**< Align buffer address to block size*/
>>
>> /**
>> * @internal When debug is enabled, store some statistics.
> Same comment than for patch 3: the explanation should really be clarified.
> It's a hw specific limitation, which won't be obvious for the people that
> will read that code, so we must document it as clear as possible.
>
I wouldn't see this as a HW limitation. As mentioned in [1/4], even an application
can request block alignment, right?
But I agree; I will reword the comment.
Thanks.