[dpdk-dev] [RFC v2 00/23] Dynamic memory allocation for DPDK

Walker, Benjamin benjamin.walker at intel.com
Tue Dec 26 18:19:25 CET 2017


On Fri, 2017-12-22 at 09:13 +0000, Burakov, Anatoly wrote:
> On 21-Dec-17 9:38 PM, Walker, Benjamin wrote:
> > SPDK will need some way to register for a notification when pages are
> > allocated or freed. For storage, the number of requests per second is
> > (relative to networking) fairly small (hundreds of thousands per second
> > in a traditional block storage stack, or a few million per second with
> > SPDK). Given that, we can afford to do a dynamic lookup from va to
> > pa/iova on each request in order to greatly simplify our APIs (users can
> > just pass pointers around instead of mbufs). DPDK has a way to lookup the
> > pa from a given va, but it does so by scanning /proc/self/pagemap and is
> > very slow. SPDK instead handles this by implementing a lookup table of va
> > to pa/iova which we populate by scanning through the DPDK memory segments
> > at start up, so the lookup in our table is sufficiently fast for storage
> > use cases. If the list of memory segments changes, we need to know about
> > it in order to update our map.
> 
> Hi Benjamin,
> 
> So, in other words, we need callbacks on alloc/free. What information 
> would SPDK need when receiving this notification? Since we can't really 
> know in advance how many pages we allocate (it may be one, it may be a 
> thousand) and they no longer are guaranteed to be contiguous, would a 
> per-page callback be OK? Alternatively, we could have one callback per 
> operation, but only provide VA and size of allocated memory, while 
> leaving everything else to the user. I do add a virt2memseg() function 
> which would allow you to look up segment physical addresses more easily, so
> you won't have to manually scan memseg lists to get IOVA for a given VA.
> 
> Thanks for your feedback and suggestions!

Yes - callbacks on alloc/free would be perfect. Ideally we want one callback
per virtual memory region allocated, plus a function we can call to find the
physical addresses/page break points within that virtual region. The function
that finds the physical addresses does not have to be efficient - we'll just
call it once when the new region is allocated and store the results in a fast
lookup table. One call per virtual region is better for us than one call per
physical page because we actually keep multiple different types of memory
address translation tables in SPDK. One translates from va to pa/iova; for that
table we have to break the region up into physical pages anyway, so it doesn't
matter whether you do one call per virtual region or one per physical page.
However, another one translates from va to RDMA lkey, and for that table it is
much more efficient if we can register large virtual regions in a single call.
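
To make the request concrete, here is a rough sketch in C of how a consumer
might use one callback per virtual region to fill a va to iova table. Every
name in it (mem_event, va_map_*, fake_slow_lookup) is made up for illustration
and is not a DPDK or SPDK API. It also assumes 2 MiB pages, page-aligned
regions, and a fixed virtual address window; the real table layout may well
differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 21                          /* assumption: 2 MiB hugepages */
#define PAGE_SIZE  ((uint64_t)1 << PAGE_SHIFT)
#define MAP_PAGES  1024                        /* toy window: 1024 pages */
#define IOVA_NONE  UINT64_MAX

/* Hypothetical event type delivered once per virtual region. */
enum mem_event { MEM_EVENT_ALLOC, MEM_EVENT_FREE };

/* Hypothetical slow per-page translation provided by the allocator. */
typedef uint64_t (*va_to_iova_fn)(uint64_t va);

/* Flat table covering [base, base + MAP_PAGES * PAGE_SIZE). */
struct va_map {
    uint64_t base;                 /* page-aligned start of the window */
    uint64_t iova[MAP_PAGES];      /* iova of each page, or IOVA_NONE */
};

static void va_map_init(struct va_map *m, uint64_t base)
{
    m->base = base;
    for (size_t i = 0; i < MAP_PAGES; i++)
        m->iova[i] = IOVA_NONE;
}

/* Region callback: the slow lookup runs once per page, only at this point. */
static int va_map_on_event(struct va_map *m, enum mem_event ev,
                           uint64_t va, uint64_t len, va_to_iova_fn slow)
{
    if (va < m->base || (va & (PAGE_SIZE - 1)) || (len & (PAGE_SIZE - 1)))
        return -1;                 /* outside window or not page aligned */

    uint64_t first = (va - m->base) >> PAGE_SHIFT;
    uint64_t count = len >> PAGE_SHIFT;
    if (first >= MAP_PAGES || count > MAP_PAGES - first)
        return -1;                 /* region does not fit in the window */

    for (uint64_t i = 0; i < count; i++)
        m->iova[first + i] = (ev == MEM_EVENT_ALLOC)
                                 ? slow(va + i * PAGE_SIZE)
                                 : IOVA_NONE;
    return 0;
}

/* Fast path: one index computation and one load per request. */
static uint64_t va_map_translate(const struct va_map *m, uint64_t va)
{
    if (va < m->base)
        return IOVA_NONE;
    uint64_t idx = (va - m->base) >> PAGE_SHIFT;
    if (idx >= MAP_PAGES || m->iova[idx] == IOVA_NONE)
        return IOVA_NONE;
    return m->iova[idx] + (va & (PAGE_SIZE - 1));
}

/* Stand-in for the slow lookup: maps pages to non-contiguous fake iovas. */
static uint64_t fake_slow_lookup(uint64_t va)
{
    return ((va >> PAGE_SHIFT) * 7 + 3) << PAGE_SHIFT;
}
```

The same (va, len) pair delivered to the callback could be handed to a single
memory registration call for the va to lkey table, which is why one
notification per region suits that table better than one per page.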

