[dpdk-dev] [RFC v2 00/23] Dynamic memory allocation for DPDK
Burakov, Anatoly
anatoly.burakov at intel.com
Fri Dec 22 10:13:04 CET 2017
On 21-Dec-17 9:38 PM, Walker, Benjamin wrote:
> On Tue, 2017-12-19 at 11:14 +0000, Anatoly Burakov wrote:
>>
>
>> Quick outline of all changes done as part of this patchset:
>>
>> * Malloc heap adjusted to handle holes in address space
>> * Single memseg list replaced by multiple expandable memseg lists
>> * VA space for hugepages is preallocated in advance
>> * Added dynamic alloc/free for pages, happening as needed on malloc/free
>
> SPDK will need some way to register for a notification when pages are allocated
> or freed. For storage, the number of requests per second is (relative to
> networking) fairly small (hundreds of thousands per second in a traditional
> block storage stack, or a few million per second with SPDK). Given that, we can
> afford to do a dynamic lookup from va to pa/iova on each request in order to
> greatly simplify our APIs (users can just pass pointers around instead of
> mbufs). DPDK has a way to lookup the pa from a given va, but it does so by
> scanning /proc/self/pagemap and is very slow. SPDK instead handles this by
> implementing a lookup table of va to pa/iova which we populate by scanning
> through the DPDK memory segments at start up, so the lookup in our table is
> sufficiently fast for storage use cases. If the list of memory segments changes,
> we need to know about it in order to update our map.
Hi Benjamin,
So, in other words, we need callbacks on alloc/free. What information
would SPDK need when receiving this notification? Since we can't really
know in advance how many pages we allocate (it may be one, it may be a
thousand) and they are no longer guaranteed to be contiguous, would a
per-page callback be OK? Alternatively, we could have one callback per
operation, but only provide the VA and size of the allocated memory,
while leaving everything else to the user. The patchset does add a
virt2memseg() function, which makes it easier to look up segment
physical addresses, so you won't have to manually scan memseg lists to
get the IOVA for a given VA.
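
To make the per-operation variant a bit more concrete, here is a rough
sketch of how a consumer could keep a VA-to-IOVA table in sync from a
callback that only receives a VA and a length. None of this is part of
the patchset: the callback signature, the event enum, the fixed 2 MB
page size, the table layout and stub_va_to_iova() (standing in for a
virt2memseg()-based translation) are all assumptions made up for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SZ   ((uintptr_t)2 * 1024 * 1024) /* assumed 2 MB hugepages */
#define MAP_SLOTS ((size_t)1024)               /* illustrative fixed capacity */
#define IOVA_NONE UINT64_MAX

enum mem_event { MEM_EVENT_ALLOC, MEM_EVENT_FREE }; /* hypothetical */
enum slot_state { SLOT_EMPTY = 0, SLOT_USED, SLOT_DEAD };

struct map_entry {
    uintptr_t va;          /* page-aligned virtual address */
    uint64_t iova;         /* IOVA of the start of that page */
    enum slot_state state;
};

static struct map_entry va_map[MAP_SLOTS]; /* open-addressing hash table */

/* Hypothetical stand-in for a virt2memseg()-style VA -> IOVA translation. */
static uint64_t stub_va_to_iova(uintptr_t page_va)
{
    return (uint64_t)page_va + 0x1000000000ULL;
}

static size_t slot_of(uintptr_t page_va)
{
    return (size_t)((page_va / PAGE_SZ) % MAP_SLOTS);
}

/* Linear probe for an existing entry; deleted slots do not end the chain. */
static struct map_entry *map_find(uintptr_t page_va)
{
    size_t start = slot_of(page_va);

    for (size_t i = 0; i < MAP_SLOTS; i++) {
        struct map_entry *e = &va_map[(start + i) % MAP_SLOTS];

        if (e->state == SLOT_EMPTY)
            return NULL;
        if (e->state == SLOT_USED && e->va == page_va)
            return e;
    }
    return NULL;
}

/* Insert or update one page; returns -1 if the table is full. */
static int map_insert(uintptr_t page_va, uint64_t iova)
{
    struct map_entry *e = map_find(page_va);
    size_t start = slot_of(page_va);

    for (size_t i = 0; e == NULL && i < MAP_SLOTS; i++) {
        struct map_entry *c = &va_map[(start + i) % MAP_SLOTS];

        if (c->state != SLOT_USED)
            e = c; /* first empty or deleted slot on the probe path */
    }
    if (e == NULL)
        return -1;
    *e = (struct map_entry){ page_va, iova, SLOT_USED };
    return 0;
}

/* Translate any address inside a known page; IOVA_NONE if unmapped. */
static uint64_t map_lookup(const void *addr)
{
    uintptr_t va = (uintptr_t)addr;
    struct map_entry *e = map_find(va - (va % PAGE_SZ));

    return e ? e->iova + (uint64_t)(va % PAGE_SZ) : IOVA_NONE;
}

/* One callback per operation: only VA and length are provided, so the
 * consumer walks the range page by page, since pages need not be
 * physically contiguous. */
static int on_mem_event(enum mem_event ev, void *va, size_t len)
{
    uintptr_t base = (uintptr_t)va;

    assert(base % PAGE_SZ == 0 && len % PAGE_SZ == 0);
    for (size_t off = 0; off < len; off += PAGE_SZ) {
        struct map_entry *e;

        if (ev == MEM_EVENT_ALLOC) {
            if (map_insert(base + off, stub_va_to_iova(base + off)) < 0)
                return -1;
        } else if ((e = map_find(base + off)) != NULL) {
            e->state = SLOT_DEAD; /* keep the probe chain intact */
        }
    }
    return 0;
}
```

With a per-page callback the loop in on_mem_event() would simply move to
the caller's side; the table itself would not need to change.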
Thanks for your feedback and suggestions!
>
> Having the map also enables a number of other nice things - for instance we
> allow users to register memory that wasn't allocated through DPDK and use it for
> DMA operations. We keep that va to pa/iova mapping in the same map. I appreciate
> you adding APIs to dynamically register this type of memory with the IOMMU on
> our behalf. That allows us to eliminate a nasty hack where we were looking up
> the vfio file descriptor through sysfs in order to send the registration ioctl.
>
>> * Added contiguous memory allocation API's for rte_malloc and rte_memzone
>> * Integrated Pawel Wodkowski's patch [1] for registering/unregistering memory
>> with VFIO
--
Thanks,
Anatoly