[dpdk-dev] [PATCH 00/17] vhost: generic vhost API

Yuanhan Liu yuanhan.liu at linux.intel.com
Fri Mar 3 10:51:05 CET 2017


This is a first attempt to make DPDK vhost library be generic enough,
so that user could built its own vhost-user drivers on top of it. For
example, SPDK (Storage Performance Development Kit) is trying to enable
vhost-user SCSI.

The basic idea is, let DPDK vhost be a vhost-user agent. It stores all
the info about the virtio device (i.e. vring address, negotiated features,
etc) and let the specific vhost-user driver to fetch them (by the API
provided by DPDK vhost lib). With those info being provided, the vhost-user
driver then could get/put vring entries, thus, it could exchange data
between the guest and host.

The last patch demonstrates how to use these new APIs to implement a
very simple vhost-user net driver, without any fancy features enabled.


API/ABI Changes summary
=======================

- some renames
  * "struct virtio_net_device_ops" ==> "struct vhost_device_ops"
  * "rte_virtio_net.h"  ==> "rte_vhost.h"

- driver related APIs are bond with the socket file
  * rte_vhost_driver_set_features(socket_file, features);
  * rte_vhost_driver_get_features(socket_file, features);
  * rte_vhost_driver_enable_features(socket_file, features)
  * rte_vhost_driver_disable_features(socket_file, features)
  * rte_vhost_driver_callback_register(socket_file, notify_ops);

- new APIs to fetch guest and vring info
  * rte_vhost_get_vhost_memory(int vid, struct rte_vhost_memory **mem);
  * rte_vhost_get_negotiated_features(int vid);
  * rte_vhost_get_vhost_vring(int vid, uint16_t vring_idx,
			      struct rte_vhost_vring *vring);

- new exported structures 
  * struct rte_vhost_vring
  * struct rte_vhost_mem_region
  * struct rte_vhost_memory


Some design choices
===================

While making this patchset, I met quite few design choices and here are
two of them, with the issue and the reason I made such choices provided.
Please let me know if you have any comments (or better ideas).

Export public structures or not
-------------------------------

I made an ABI refactor last time (v16.07): move all the structures
internally and let applications use a "vid" to reference the internal
struct. With that, I hope we could never worry about the annoying ABI
issues.

It works great (and as expected) since then, as far as we only support
virito-net, as far as we can handle all the descs inside vhost lib. It
becomes problematic when a user wants to implement a vhost-user driver
somewhere. For example, it needs do the GPA to VVA translation. Without
any structs exported, some functions like gpa_to_vva() can't be inlined.
Calling it would be costly, especially it's a function we have to invoke
for processing each vring desc.

For that reason, the guest memory regions are exported. With that, the
gpa_to_vva could be inlined.

  
Add helper functions to fetch/update descs or not
-------------------------------------------------

I intended to do it like this way: introduce one function to get @count
of descs from a specific vring and another one to update the used descs.
It's something like
    rte_vhost_vring_get_descs(vid, vring_idx, count, offset, iov, descs);
    rte_vhost_vring_update_used_descs(vid, vring_idx, count, offset, descs);

With that, vhost-user driver programmer's task would be easier, as he/she
doesn't have to parse the descs any more (such as to handle indirect desc).

But judging that virtio 1.1 is just emerged and it proposes a completely
ring layout, and most importantly, the vring desc structure is also changed,
I'd like to hold to introduce such two functions. Otherwise, it's very
likely the two will be invalid when virtio 1.1 is out. Though I think it
may could be addressed with a care design, something like making the IOV
generic enough:

	struct rte_vhost_iov {
		uint64_t	gpa;
		uint64_t	vva;
		uint64_t	len;
	};

Instead, I go with the other way: introduce few APIs to export all the vring
infos (vring size, vring addr, callfd, etc), and let the vhost-user driver
read and update the descs. Those info could be passed to vhost-user driver
by introducing one API for each, but for saving few APIs and reducing few
calls for the programmer, I packed few key fields into a new structure, so
that it can be fetched with one call:
        struct rte_vhost_vring {
                struct vring_desc       *desc;
                struct vring_avail      *avail;
                struct vring_used       *used;
                uint64_t                log_guest_addr;
       
                int                     callfd;
                int                     kickfd;
                uint16_t                size;
        };

When virtio 1.1 comes out, likely a simple change like following would
just work:
        struct rte_vhost_vring {
		union {
			struct {
                		struct vring_desc       *desc;
                		struct vring_avail      *avail;
                		struct vring_used       *used;
                		uint64_t                log_guest_addr;
			};
			struct desc	*desc_1_1;	/* vring addr for virtio 1.1 */
		};
       
                int                     callfd;
                int                     kickfd;
                uint16_t                size;
        };

AFAIK, it's not an ABI breakage. Even if it does, we could introduce a new
API to get the virtio 1.1 ring address.

Those fields are the minimum set I got for a specific vring, with the mind
it would bring the minimum chance to break ABI for future extension. If we
need more info, we could introduce a new API.

OTOH, for getting the best performance, the two functions also have to be
inlined ("vid + vring_idx" combo is replaced with "vring"):
    rte_vhost_vring_get_descs(vring, count, offset, iov, descs);
    rte_vhost_vring_update_used_descs(vring, count, offset, descs);

That said, one way or another, we have to export rte_vhost_vring struct.
For this reason, I didn't rush into introducing the two APIs.


TODOs
=====

This series still got few small items to finish, and they are:
- update release note
- fill API comments
- set protocol features


	--yliu

---
Yuanhan Liu (17):
  vhost: introduce driver features related APIs
  net/vhost: remove feature related APIs
  vhost: use new APIs to handle features
  vhost: make notify ops per vhost driver
  vhost: export guest memory regions
  vhost: introduce API to fetch negotiated features
  vhost: export vhost vring info
  vhost: export API to translate gpa to vva
  vhost: turn queue pair to vring
  vhost: export the number of vrings
  vhost: move the device ready check at proper place
  vhost: drop the Rx and Tx queue macro
  vhost: do not include net specific headers
  vhost: rename device ops struct
  vhost: rename virtio-net to vhost
  vhost: rename header file
  examples/vhost: demonstrate the new generic vhost APIs

 doc/guides/rel_notes/deprecation.rst        |   9 -
 drivers/net/vhost/rte_eth_vhost.c           |  51 ++--
 drivers/net/vhost/rte_eth_vhost.h           |  32 +--
 drivers/net/vhost/rte_pmd_vhost_version.map |   3 -
 examples/tep_termination/main.c             |  11 +-
 examples/tep_termination/main.h             |   2 +
 examples/tep_termination/vxlan_setup.c      |   2 +-
 examples/vhost/Makefile                     |   2 +-
 examples/vhost/main.c                       |  88 ++++--
 examples/vhost/main.h                       |  33 ++-
 examples/vhost/virtio_net.c                 | 405 ++++++++++++++++++++++++++++
 lib/librte_vhost/Makefile                   |   4 +-
 lib/librte_vhost/rte_vhost.h                | 259 ++++++++++++++++++
 lib/librte_vhost/rte_vhost_version.map      |  18 +-
 lib/librte_vhost/rte_virtio_net.h           | 193 -------------
 lib/librte_vhost/socket.c                   | 143 ++++++++++
 lib/librte_vhost/vhost.c                    | 209 +++++++-------
 lib/librte_vhost/vhost.h                    |  82 +++---
 lib/librte_vhost/vhost_user.c               |  91 +++----
 lib/librte_vhost/vhost_user.h               |   2 +-
 lib/librte_vhost/virtio_net.c               |  35 +--
 21 files changed, 1140 insertions(+), 534 deletions(-)
 create mode 100644 examples/vhost/virtio_net.c
 create mode 100644 lib/librte_vhost/rte_vhost.h
 delete mode 100644 lib/librte_vhost/rte_virtio_net.h

-- 
1.9.0



More information about the dev mailing list