[dpdk-dev] [PATCH v3 5/6] virtio: Add support for qtest virtio-net PMD
Tetsuya Mukawa
mukawa at igel.co.jp
Fri Mar 4 06:05:22 CET 2016
On 2016/03/04 11:18, Tan, Jianfeng wrote:
> Hi Tetsuya,
>
> Seems that this patch is too long. Is it possible to split into
> multiple commits?
Hi Jianfeng,
Sure, will do.
>
> On 2/22/2016 4:17 PM, Tetsuya Mukawa wrote:
>> The patch adds a new virtio-net PMD configuration that allows the PMD to
>> work on host as if the PMD is in VM.
>> Here is new configuration for virtio-net PMD.
>> - CONFIG_RTE_VIRTIO_VDEV_QTEST
>> To use this mode, EAL needs map all hugepages as one file. Also the file
>> should be mapped between (1 << 31) and (1 << 44). And start address
>> should be aligned by EAL memory size.
>>
>> To allocate like above, use below options.
>> --single-file
>> --range-virtaddr=0x80000000-0x100000000000
>> --align-memsize
>> If a free regions isn't found, EAL will return error.
>>
>> To prepare virtio-net device on host, the users need to invoke QEMU
>> process in special qtest mode. This mode is mainly used for testing QEMU
>> devices from outer process. In this mode, no guest runs.
>> Here is QEMU command line.
>>
>> $ qemu-system-x86_64 \
>> -machine pc-i440fx-1.4,accel=qtest \
>> -display none -qtest-log /dev/null \
>> -qtest unix:/tmp/socket,server \
>> -netdev type=tap,script=/etc/qemu-ifup,id=net0,queues=1 \
>> -device
>> virtio-net-pci,netdev=net0,mq=on,disable-modern=false,addr=3 \
>> -chardev socket,id=chr1,path=/tmp/ivshmem,server \
>> -device ivshmem,size=1G,chardev=chr1,vectors=1,addr=4
>>
>> * Should use qemu-2.5.1, or above.
>> * QEMU process is needed per port.
>> * virtio-1.0 device are only supported.
>> * The vhost backends like vhost-net and vhost-user can be specified.
>> * In most cases, just using above command is enough, but you can also
>> specify other QEMU virtio-net options.
>> * Only checked "pc-i440fx-1.4" machine, but may work with other
>> machines.
>> * Should not add "--enable-kvm" to QEMU command line.
>
> Correct me if wrong: all control msgs go through qemu process, e.g.,
> tx notifications and rx interrupts need follow frontend-qemu-backend
> path. Question: qemu is started without --enable-kvm, as I understand,
> ioeventfd, the basis of kickfd/callfd, will not be available. So how
> does qemu kick backend or be kicked by backend?
Actually, vhost-backend process will receive kickfd and callfd as -1.
(Currently, we have a bug in librte_vhost, because the library treats -1
as "not initialized state". But actually without "--enable-kvm", -1 will
be set by qemu to initialize kickfd and callfd. I will send a patch for
the issue with next patch series.)
In our case, virtio-net driver and vhost-backend driver are PMD. So we
don't use kickfd and callfd, right?
If you worried about vhost-net case, vhost-net kernel thread will work
without ioeventfd and irqfd.
In this case, virtio-net PMD can kick the vhost-net by accessing
VIRTIO_PCI_QUEUE_NOTIFY register.
(vhost-net doesn't need to kick virtio-net driver, because the driver is
PMD.)
>
>>
>> After invoking QEMU, the PMD can connect to QEMU process using unix
>> domain sockets. Over these sockets, virtio-net, ivshmem and piix3
>> device in QEMU are probed by the PMD.
>> Here is example of command line.
>>
>> $ testpmd -c f -n 1 -m 1024 --no-pci --single-file \
>> --range-virtaddr=0x80000000-0x100000000000 --align-memsize \
>>
>> --vdev="eth_qtest_virtio0,qtest=/tmp/socket,ivshmem=/tmp/ivshmem"\
>> -- --disable-hw-vlan --txqflags=0xf00 -i
>>
>> Please specify same unix domain sockets and memory size in both QEMU
>> and DPDK command lines like above.
>> The share memory size should be power of 2, because ivshmem only
>> accepts such memory size.
>>
>> Signed-off-by: Tetsuya Mukawa <mukawa at igel.co.jp>
>> ---
>> config/common_linuxapp | 1 +
>> drivers/net/virtio/Makefile | 4 +
>> drivers/net/virtio/qtest.c | 1342
>> ++++++++++++++++++++++++++++++++++++
>> drivers/net/virtio/qtest.h | 65 ++
>> drivers/net/virtio/virtio_ethdev.c | 383 +++++++++-
>> drivers/net/virtio/virtio_pci.c | 364 +++++++++-
>> drivers/net/virtio/virtio_pci.h | 5 +-
>> 7 files changed, 2122 insertions(+), 42 deletions(-)
>> create mode 100644 drivers/net/virtio/qtest.c
>> create mode 100644 drivers/net/virtio/qtest.h
>>
>> diff --git a/config/common_linuxapp b/config/common_linuxapp
>> index 452f39c..f6e53bc 100644
>> --- a/config/common_linuxapp
>> +++ b/config/common_linuxapp
>> @@ -533,3 +533,4 @@ CONFIG_RTE_TEST_PMD_RECORD_BURST_STATS=n
>> # Enable virtio support for container
>> #
>> CONFIG_RTE_VIRTIO_VDEV=y
>> +CONFIG_RTE_VIRTIO_VDEV_QTEST=y
>> diff --git a/drivers/net/virtio/Makefile b/drivers/net/virtio/Makefile
>> index ef920f9..6c11378 100644
>> --- a/drivers/net/virtio/Makefile
>> +++ b/drivers/net/virtio/Makefile
>> @@ -56,6 +56,10 @@ ifeq ($(CONFIG_RTE_VIRTIO_VDEV),y)
>> SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += vhost_embedded.c
>> endif
>> +ifeq ($(CONFIG_RTE_VIRTIO_VDEV_QTEST),y)
>> + SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += qtest.c
>> +endif
>> +
>> # this lib depends upon:
>> DEPDIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += lib/librte_eal
>> lib/librte_ether
>> DEPDIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += lib/librte_mempool
>> lib/librte_mbuf
>> diff --git a/drivers/net/virtio/qtest.c b/drivers/net/virtio/qtest.c
>> new file mode 100644
>> index 0000000..061aab5
>> --- /dev/null
>> +++ b/drivers/net/virtio/qtest.c
>> @@ -0,0 +1,1342 @@
>> +/*-
>> + * BSD LICENSE
>> + *
>> + * Copyright(c) 2016 IGEL Co., Ltd. All rights reserved.
>> + * All rights reserved.
>> + *
>> + * Redistribution and use in source and binary forms, with or without
>> + * modification, are permitted provided that the following conditions
>> + * are met:
>> + *
>> + * * Redistributions of source code must retain the above copyright
>> + * notice, this list of conditions and the following disclaimer.
>> + * * Redistributions in binary form must reproduce the above
>> copyright
>> + * notice, this list of conditions and the following
>> disclaimer in
>> + * the documentation and/or other materials provided with the
>> + * distribution.
>> + * * Neither the name of IGEL Co., Ltd. nor the names of its
>> + * contributors may be used to endorse or promote products
>> derived
>> + * from this software without specific prior written permission.
>> + *
>> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
>> CONTRIBUTORS
>> + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
>> + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
>> FITNESS FOR
>> + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
>> COPYRIGHT
>> + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
>> INCIDENTAL,
>> + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
>> + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS
>> OF USE,
>> + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
>> ON ANY
>> + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR
>> TORT
>> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
>> THE USE
>> + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
>> DAMAGE.
>> + */
>> +#include <stdint.h>
>> +#include <stdlib.h>
>> +#include <string.h>
>> +#include <unistd.h>
>> +#include <sys/types.h>
>> +#include <sys/socket.h>
>> +#include <sys/un.h>
>> +#include <sys/queue.h>
>> +#include <signal.h>
>> +#include <pthread.h>
>> +#include <sys/stat.h>
>> +#include <fcntl.h>
>> +#include <sys/eventfd.h>
>> +#include <linux/pci_regs.h>
>> +
>> +#include <rte_memory.h>
>> +#include <rte_malloc.h>
>> +#include <rte_common.h>
>> +#include <rte_interrupts.h>
>> +
>> +#include "virtio_pci.h"
>> +#include "virtio_logs.h"
>> +#include "virtio_ethdev.h"
>> +#include "qtest.h"
>> +
>> +#define NB_BAR 6
>> +
>> +/* PIIX3 configuration registers */
>> +#define PIIX3_REG_ADDR_PIRQA 0x60
>> +#define PIIX3_REG_ADDR_PIRQB 0x61
>> +#define PIIX3_REG_ADDR_PIRQC 0x62
>> +#define PIIX3_REG_ADDR_PIRQD 0x63
>> +
>> +/* Device information */
>> +#define VIRTIO_NET_DEVICE_ID 0x1000
>> +#define VIRTIO_NET_VENDOR_ID 0x1af4
>> +#define VIRTIO_NET_IRQ_NUM 10
>> +#define IVSHMEM_DEVICE_ID 0x1110
>> +#define IVSHMEM_VENDOR_ID 0x1af4
>> +#define IVSHMEM_PROTOCOL_VERSION 0
>> +#define PIIX3_DEVICE_ID 0x7000
>> +#define PIIX3_VENDOR_ID 0x8086
>> +
>> +/* ------------------------------------------------------------
>> + * IO port mapping of qtest guest
>> + * ------------------------------------------------------------
>> + * 0x0000 - 0xbfff : not used
>> + * 0xc000 - 0xc03f : virtio-net(BAR0)
>> + * 0xc040 - 0xffff : not used
>> + *
>> + * ------------------------------------------------------------
>> + * Memory mapping of qtest quest
>> + * ------------------------------------------------------------
>> + * 0x00000000_00000000 - 0x00000000_3fffffff : not used
>> + * 0x00000000_40000000 - 0x00000000_40000fff : virtio-net(BAR1)
>> + * 0x00000000_40001000 - 0x00000000_40ffffff : not used
>> + * 0x00000000_41000000 - 0x00000000_417fffff : virtio-net(BAR4)
>> + * 0x00000000_41800000 - 0x00000000_41ffffff : not used
>> + * 0x00000000_42000000 - 0x00000000_420000ff : ivshmem(BAR0)
>> + * 0x00000000_42000100 - 0x00000000_42ffffff : not used
>> + * 0x00000000_80000000 - 0xffffffff_ffffffff : ivshmem(BAR2)
>> + *
>
> Is it possible to arrange multiple virtio-net devices here? What's the
> challenges?
Yes, you can manage multiple virtio-net devices here, if you define
correct memory map.
>
> Seems that lots of below code do the same work as libqos. So can we
> just link libqos? Or we need to maintain this code.
Problem is libqos will be GPL.
So I wrote the code from scratch.
>
>> + * We can only specify start address of a region. The region size
>> + * will be defined by the device implementation in QEMU.
>> + * The size will be pow of 2 according to the PCI specification.
>> + * Also, the region start address should be aligned by region size.
>> + *
>> + * BAR2 of ivshmem will be used to mmap DPDK application memory.
>> + * So this address will be dynamically changed, but not to overlap
>> + * others, it should be mmaped between above addresses. Such allocation
>> + * is done by EAL. Check rte_eal_get_free_region() also.
>> + */
>> +#define VIRTIO_NET_IO_START 0xc000
>> +#define VIRTIO_NET_MEMORY1_START 0x40000000
>> +#define VIRTIO_NET_MEMORY2_START 0x41000000
>> +#define IVSHMEM_MEMORY_START 0x42000000
>> +
>> +#define PCI_CONFIG_ADDR(_bus, _device, _function, _offset) ( \
>> + (1 << 31) | ((_bus) & 0xff) << 16 | ((_device) & 0x1f) << 11 | \
>> + ((_function) & 0x7) << 8 | ((_offset) & 0xfc))
>> +
>> +static char interrupt_message[32];
>> +
>> +enum qtest_pci_bar_type {
>> + QTEST_PCI_BAR_DISABLE = 0,
>> + QTEST_PCI_BAR_IO,
>> + QTEST_PCI_BAR_MEMORY_UNDER_1MB,
>> + QTEST_PCI_BAR_MEMORY_32,
>> + QTEST_PCI_BAR_MEMORY_64
>> +};
>> +
>> +struct qtest_pci_bar {
>> + enum qtest_pci_bar_type type;
>> + uint8_t addr;
>> + uint64_t region_start;
>> + uint64_t region_size;
>> +};
>> +
>> +struct qtest_session;
>> +TAILQ_HEAD(qtest_pci_device_list, qtest_pci_device);
>> +struct qtest_pci_device {
>> + TAILQ_ENTRY(qtest_pci_device) next;
>> + const char *name;
>> + uint16_t device_id;
>> + uint16_t vendor_id;
>> + uint8_t bus_addr;
>> + uint8_t device_addr;
>> + struct qtest_pci_bar bar[NB_BAR];
>> + int (*init)(struct qtest_session *s, struct qtest_pci_device *dev);
>> +};
>> +
>> +union qtest_pipefds {
>> + struct {
>> + int pipefd[2];
>> + };
>> + struct {
>> + int readfd;
>> + int writefd;
>> + };
>> +};
>> +
>> +struct qtest_session {
>> + int qtest_socket;
>> + pthread_mutex_t qtest_session_lock;
>> +
>> + struct qtest_pci_device_list head;
>> + int ivshmem_socket;
>> +
>> + pthread_t event_th;
>> + char *evq;
>> + char *evq_dequeue_ptr;
>> + size_t evq_total_len;
>> +
>> + union qtest_pipefds msgfds;
>> +
>> + pthread_t intr_th;
>> + int eventfd;
>> + rte_atomic16_t enable_intr;
>> + rte_intr_callback_fn cb;
>> + void *cb_arg;
>> + struct rte_eth_dev_data *eth_data;
>> +};
>> +
>> +static int
>> +qtest_raw_send(int fd, char *buf, size_t count)
>> +{
>> + size_t len = count;
>> + size_t total_len = 0;
>> + int ret = 0;
>> +
>> + while (len > 0) {
>> + ret = write(fd, buf, len);
>> + if (ret == -1) {
>> + if (errno == EINTR)
>> + continue;
>> + return ret;
>> + }
>> + if (ret == (int)len)
>> + break;
>
> This _if_ complicates this function, remove this and rely on while
> condition to quit? Same as in the below qtest_raw_recv().
>
Yes, I will remove it.
Thanks,
Tetsuya
> Thanks,
> Jianfeng
>
>> + total_len += ret;
>> + buf += ret;
>> + len -= ret;
>> + }
>> + return total_len + ret;
>> +}
>> +
>> +static int
>> +qtest_raw_recv(int fd, char *buf, size_t count)
>> +{
>> + size_t len = count;
>> + size_t total_len = 0;
>> + int ret = 0;
>> +
>> + while (len > 0) {
>> + ret = read(fd, buf, len);
>> + if (ret <= 0) {
>> + if (errno == EINTR) {
>> + continue;
>> + }
>> + return ret;
>> + }
>> + if (ret == (int)len)
>> + break;
>> + if (*(buf + ret - 1) == '\n')
>> + break;
>> + total_len += ret;
>> + buf += ret;
>> + len -= ret;
>> + }
>> + return total_len + ret;
>> +}
>> +
> [...]
More information about the dev
mailing list