[dpdk-dev] [PATCH v3 6/8] vhost: handle VHOST_USER_SEND_RARP request

Yuanhan Liu yuanhan.liu at linux.intel.com
Fri Feb 19 08:03:26 CET 2016


On Fri, Feb 19, 2016 at 02:11:36PM +0800, Tan, Jianfeng wrote:
> Hi Yuanhan,
> 
> On 1/29/2016 12:58 PM, Yuanhan Liu wrote:
> >The previous patch enabled the GUEST_ANNOUNCE feature, so that the
> >guest OS broadcasts a GARP message after migration to notify the
> >switch of the migrated VM's new location. However, GUEST_ANNOUNCE is
> >only available since kernel v3.5. For older kernels, the
> >VHOST_USER_SEND_RARP request comes to the rescue.
> >
> >The payload of this new request is the MAC address of the migrated VM.
> >With that, we can construct a RARP message and then broadcast it
> >to host interfaces.
> >
> >This is how the patch works:
> >
> >- list all interfaces, with the help of the SIOCGIFCONF ioctl command
> >
> >- construct a RARP message and broadcast it
> >
> >Cc: Thibaut Collet <thibaut.collet at 6wind.com>
> >Signed-off-by: Yuanhan Liu <yuanhan.liu at linux.intel.com>
> >---
> ...
> >+
> >+/*
> >+ * Broadcast a RARP message to all interfaces, to update
> >+ * switch's mac table
> >+ */
> >+int
> >+user_send_rarp(struct VhostUserMsg *msg)
> >+{
> >+	uint8_t *mac = (uint8_t *)&msg->payload.u64;
> >+	uint8_t rarp[RARP_BUF_SIZE];
> >+	struct ifconf ifc = {0, };
> >+	struct ifreq *ifr;
> >+	int nr = 16;
> >+	int fd;
> >+	uint32_t i;
> >+
> >+	RTE_LOG(DEBUG, VHOST_CONFIG,
> >+		":: mac: %02x:%02x:%02x:%02x:%02x:%02x\n",
> >+		mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
> >+
> >+	make_rarp_packet(rarp, mac);
> >+
> >+	/*
> >+	 * Get all interfaces
> >+	 */
> >+	fd = socket(AF_INET, SOCK_DGRAM, 0);
> >+	if (fd < 0) {
> >+		perror("failed to create AF_INET socket");
> >+		return -1;
> >+	}
> >+
> >+again:
> >+	ifc.ifc_len = sizeof(*ifr) * nr;
> >+	ifc.ifc_buf = realloc(ifc.ifc_buf, ifc.ifc_len);
> >+
> >+	if (ioctl(fd, SIOCGIFCONF, &ifc) < 0) {
> >+		perror("failed at SIOCGIFCONF");
> >+		free(ifc.ifc_buf);	/* don't leak the interface list */
> >+		close(fd);
> >+		return -1;
> >+	}
> >+
> >+	if (ifc.ifc_len == (int)sizeof(struct ifreq) * nr) {
> >+		/*
> >+		 * current ifc_buf is not big enough to hold
> >+		 * all interfaces; double it and try again.
> >+		 */
> >+		nr *= 2;
> >+		goto again;
> >+	}
> >+
> >+	ifr = (struct ifreq *)ifc.ifc_buf;
> >+	for (i = 0; i < ifc.ifc_len / sizeof(struct ifreq); i++)
> >+		send_rarp(ifr[i].ifr_name, rarp);
> >+
> >+	free(ifc.ifc_buf);	/* release the buffer allocated by realloc() */
> >+	close(fd);
> >+
> >+	return 0;
> >+}
> 
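For reference, the kind of frame make_rarp_packet() has to produce can be sketched as below. This is a minimal illustration, not the code from the patch: build_rarp_frame(), put_be16() and the constant names are hypothetical, and the layout is assumed to follow RFC 903 (EtherType 0x8035, opcode 3, sender and target hardware address both set to the VM's MAC, IP fields zeroed). Any padding up to RARP_BUF_SIZE is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN          6       /* Ethernet hardware address length */
#define IPV4_LEN         4       /* IPv4 protocol address length */
#define RARP_ETHERTYPE   0x8035  /* EtherType for RARP (RFC 903) */
#define RARP_OP_REQUEST  3       /* "request reverse" opcode */

/* Store a 16-bit value in network (big-endian) byte order. */
static void put_be16(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v >> 8);
	p[1] = (uint8_t)(v & 0xff);
}

/*
 * Hypothetical helper: fill buf (at least 42 bytes) with a broadcast
 * RARP request announcing mac, and return the number of bytes written.
 */
static size_t build_rarp_frame(uint8_t *buf, const uint8_t mac[MAC_LEN])
{
	uint8_t *p = buf;

	/* Ethernet header: broadcast destination, VM MAC as source. */
	memset(p, 0xff, MAC_LEN);        p += MAC_LEN;
	memcpy(p, mac, MAC_LEN);         p += MAC_LEN;
	put_be16(p, RARP_ETHERTYPE);     p += 2;

	/* ARP/RARP body. */
	put_be16(p, 1);                  p += 2;  /* hardware type: Ethernet */
	put_be16(p, 0x0800);             p += 2;  /* protocol type: IPv4 */
	*p++ = MAC_LEN;                           /* hardware address length */
	*p++ = IPV4_LEN;                          /* protocol address length */
	put_be16(p, RARP_OP_REQUEST);    p += 2;
	memcpy(p, mac, MAC_LEN);         p += MAC_LEN;   /* sender MAC */
	memset(p, 0, IPV4_LEN);          p += IPV4_LEN;  /* sender IP: unknown */
	memcpy(p, mac, MAC_LEN);         p += MAC_LEN;   /* target MAC */
	memset(p, 0, IPV4_LEN);          p += IPV4_LEN;  /* target IP: unknown */

	return (size_t)(p - buf);
}
```

The source MAC in the Ethernet header is what lets a learning switch update its forwarding table; the RARP body itself carries no IP information here.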
> From how you implement user_send_rarp(), if I understand it correctly, it
> broadcasts the RARP packet to all host interfaces, which I don't think is
> appropriate. The RARP packet should be sent only to its own L2 network. You
> should not assume that all interfaces maintained in the kernel
> are in the same L2 network. Even worse, this could cause problems when
> used in overlay networking, where two VMs in two different overlay
> networks can have the same MAC address.
> 
> What I suggest here is to move user_send_rarp() to rte_vhost_dequeue_burst(),
> controlled by a flag, so that the RARP packet is broadcast in its
> own L2 network.

I have thought of that, too. I gave it up because the SEND_RARP request is
handled in a different thread from rte_vhost_dequeue_burst(), which means
the RARP packet would not be broadcast immediately after migration
is done: it would be broadcast only the next time rte_vhost_dequeue_burst()
is invoked.

I was thinking the delay might be a problem. On second thought, it
doesn't look like one: the GUEST_ANNOUNCE packet is also broadcast through
rte_vhost_dequeue_burst(), although it is enqueued by the guest kernel. And
given that this is a poll mode driver, it won't be an issue.
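
The flag-based hand-off between the two threads could look roughly like the sketch below. It is only an illustration under stated assumptions, not the actual vhost code: struct demo_dev, demo_request_rarp() and demo_take_rarp() are hypothetical names, and C11 atomics stand in for whatever primitive the library would really use.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-device state shared by the two threads. */
struct demo_dev {
	uint8_t mac[6];            /* MAC taken from the SEND_RARP payload */
	atomic_bool rarp_pending;  /* set by control thread, cleared by datapath */
};

/*
 * Control (message handling) thread: record the MAC and raise the flag.
 * The release store publishes the MAC before the flag becomes visible.
 */
static void demo_request_rarp(struct demo_dev *dev, const uint8_t mac[6])
{
	memcpy(dev->mac, mac, 6);
	atomic_store_explicit(&dev->rarp_pending, true, memory_order_release);
}

/*
 * Datapath thread, called at the start of a dequeue burst: atomically
 * test and clear the flag. Returns true at most once per request, in
 * which case the caller would build the RARP packet from mac_out and
 * hand it to the application along with the dequeued packets.
 */
static bool demo_take_rarp(struct demo_dev *dev, uint8_t mac_out[6])
{
	if (!atomic_exchange_explicit(&dev->rarp_pending, false,
				      memory_order_acq_rel))
		return false;

	memcpy(mac_out, dev->mac, 6);
	return true;
}
```

Since the datapath thread polls continuously, the flag is picked up on the next burst, so the packet goes out through the same path, and therefore the same L2 network, as the guest's own traffic.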

So, thanks. I will give it a quick try; it should work.

	--yliu

