[dpdk-dev] [PATCH] dropping librte_ivshmem - was log: deprecate history dump

Burakov, Anatoly anatoly.burakov at intel.com
Fri Jun 10 11:47:07 CEST 2016


> > Hi Thomas,
> >
> > Just a few notes:
> >
> > > 3/ The automatic mapped allocation of DPDK objects in the guest.
> > > It should not be done in EAL.
> > > An ivshmem driver would be called by rte_eal_dev_init.
> > > It would check where are the shared DPDK structures, as currently
> > > done with the IVSHMEM_MAGIC (0x0BADC0DE), and do the appropriate
> > > allocations.
> > > Thus only the driver would depend on ring and mempool.
> >
> > The problem here is IVSHMEM doesn't allocate the memory from DPDK, it
> > allocates new memory segments by mapping a PCI device. I.e. it doesn't do
> > mallocs, it modifies mem_config and adds memory to DPDK. Can that be
> > done from within a PMD?
> 
> Everything is possible :)
> Maybe you just need to add an API to add some memory segments.
> Other question: why is it so important to register these memory segments in
> EAL? I think they just need to be known by the ivshmem driver which map
> some objects on top.

That's because we need the memzone_lookup functionality. We can get by without it for rings because those are tailq-based, so we can just put the rings there, but memzones are looked up through the memconfig, so IVSHMEM memzones have to be present there for the code to work without any additional APIs.

Although, I guess we don't really need to have _memsegs_ in order to look up memzones - we just have to create some memzones directly inside mcfg, bypassing the normal memzone_reserve stuff. That would still be a hack, but probably much less of a hack than what we have right now :)
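
To make that idea a bit more concrete, here is a rough, self-contained sketch. It is a toy model, not DPDK code: the toy_mem_config and toy_memzone types and both toy_* functions are hypothetical stand-ins, and the real mem_config layout and locking are deliberately left out. It only shows the shape of the approach: lookup walks the config table by name, so an externally mapped region becomes visible to lookup as soon as an entry for it is written into the table, with no allocation involved.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define TOY_MAX_MEMZONE 8
#define TOY_NAMESIZE    32

/* Hypothetical, simplified stand-ins; NOT the real DPDK structures. */
struct toy_memzone {
    char   name[TOY_NAMESIZE];
    void  *addr;   /* where the zone is mapped */
    size_t len;
};

struct toy_mem_config {
    struct toy_memzone memzone[TOY_MAX_MEMZONE];
    unsigned int memzone_cnt;
};

/* Lookup only walks the config table; it does not care how the memory
 * behind an entry was obtained. */
static const struct toy_memzone *
toy_memzone_lookup(const struct toy_mem_config *mcfg, const char *name)
{
    for (unsigned int i = 0; i < mcfg->memzone_cnt; i++)
        if (strncmp(mcfg->memzone[i].name, name, TOY_NAMESIZE) == 0)
            return &mcfg->memzone[i];
    return NULL;
}

/* Record a zone whose memory was mapped elsewhere (for example from a
 * PCI BAR), bypassing any reserve/allocate path. Returns 0 on success,
 * -1 if the table is full, the name is too long, or the name is taken. */
static int
toy_memzone_register(struct toy_mem_config *mcfg, const char *name,
                     void *addr, size_t len)
{
    if (mcfg->memzone_cnt >= TOY_MAX_MEMZONE || strlen(name) >= TOY_NAMESIZE)
        return -1;
    if (toy_memzone_lookup(mcfg, name) != NULL)
        return -1;

    struct toy_memzone *mz = &mcfg->memzone[mcfg->memzone_cnt++];
    snprintf(mz->name, sizeof(mz->name), "%s", name);
    mz->addr = addr;
    mz->len = len;
    return 0;
}
```

In a real implementation the table is shared between processes, so the register step would also need whatever locking protects the memzone table.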

Another possible issue here is the order in which the memory is allocated. We put IVSHMEM init in EAL because we have to map things at specific addresses. The later IVSHMEM initializes, the greater the chance that something else will already occupy the address space that IVSHMEM needs. This could probably be solved with --base-virtaddr, so the documentation will have to be updated to advise using that flag.
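
The ordering concern can be illustrated with plain POSIX mmap. This is only a sketch under assumptions: map_at_exact is a hypothetical helper, it uses anonymous memory rather than a PCI BAR or hugepage file, and it is not how EAL or IVSHMEM actually map memory. Without MAP_FIXED the requested address is only a hint, so if anything already lives in that range the kernel places the mapping somewhere else, which is exactly the failure a late initializer risks.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS under strict -std modes */
#endif
#include <stddef.h>
#include <sys/mman.h>

/* Try to map `len` bytes of anonymous memory at exactly `want`.
 * The address is passed as a hint (no MAP_FIXED), so an existing
 * mapping is never clobbered. Returns 0 and stores the address in
 * *out on success; returns -1 if the mapping failed or the kernel
 * placed it anywhere other than `want`. */
static int
map_at_exact(void *want, size_t len, void **out)
{
    void *got = mmap(want, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (got == MAP_FAILED)
        return -1;
    if (got != want) {
        /* Someone else owns (part of) the range: undo and report it. */
        munmap(got, len);
        return -1;
    }
    *out = got;
    return 0;
}
```

The earlier a caller asks for its fixed range, the less likely it is that another mapping has already taken it.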

> 
> > > The last step of the ivshmem cleanup will be to remove the memory
> > > hack RTE_EAL_SINGLE_FILE_SEGMENTS. Then CONFIG_RTE_LIBRTE_IVSHMEM
> > > could be removed.
> >
> > The reason for that hack is that we often need to map several hugepages,
> > and some of those pages could be 2M in size. If you're sharing 1G worth of
> > contiguous memory backed by 2M pages, that's 512 files in the command line
> > in vanilla DPDK, but can be made into one with
> > RTE_EAL_SINGLE_FILE_SEGMENTS, so that QEMU command-line doesn't get
> > overly long.
> >
> > So removing this hack, while definitely desired, will adversely affect
> > some use cases, such as using IVSHMEM on platforms where 1G pages
> > aren't supported. Whether we want to go with the effort of supporting
> > those is of course an open question - I personally don't have any data
> > on IVSHMEM userbase. Maybe Kevin/other OVS devs could help me out
> > here :)
> 
> We can keep supporting 2M pages by having a command line option, instead
> of the #ifdef RTE_EAL_SINGLE_FILE_SEGMENTS.
> But as I said, it is not the top priority to remove this hack.

Ah, so you're not suggesting removing the _functionality_, just the #ifdef? That could be made to work I guess...

Also, please correct me if I'm wrong, but I seem to remember some patches about putting all memory in a single file - I think that should work for IVSHMEM as well, because I believe IVSHMEM handles holes in files just fine, and can map even if everything resides inside a single file. So if that patch does what I think it does, we might just integrate it and remove the single file segments code entirely.
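
For what it's worth, the general mechanism of mapping separate pieces out of one file looks roughly like the sketch below. It is not the patch mentioned above and not IVSHMEM code: it uses an ordinary temp file in /tmp rather than hugetlbfs, and make_backing_file and map_segment are hypothetical helper names. The point is only that one file descriptor can back several mappings at different page-aligned offsets, with unmapped gaps in between.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* for mkstemp/ftruncate under strict -std modes */
#endif
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Create an unlinked temporary file of `size` bytes. Nothing is written,
 * so the file is sparse. Returns the fd, or -1 on failure. */
static int
make_backing_file(size_t size)
{
    char path[] = "/tmp/single_file_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                      /* file lives on until fd is closed */
    if (ftruncate(fd, (off_t)size) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Map `len` bytes of `fd` starting at page-aligned offset `off`.
 * Returns the mapping, or NULL on failure. */
static void *
map_segment(int fd, off_t off, size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, off);
    return p == MAP_FAILED ? NULL : p;
}
```

A caller could, for instance, map page 0 and page 2 of a four-page file and leave pages 1 and 3 untouched; both mappings are backed by the same single file.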

Thanks,
Anatoly

