[dpdk-dev] [PATCH v3 6/8] vhost: handle VHOST_USER_SEND_RARP request
Yuanhan Liu
yuanhan.liu at linux.intel.com
Fri Feb 19 08:03:26 CET 2016
On Fri, Feb 19, 2016 at 02:11:36PM +0800, Tan, Jianfeng wrote:
> Hi Yuanhan,
>
> On 1/29/2016 12:58 PM, Yuanhan Liu wrote:
> >While the former patch enabled the GUEST_ANNOUNCE feature, so that the
> >guest OS broadcasts a GARP message after migration to notify the
> >switch about the new location of the migrated VM, GUEST_ANNOUNCE is
> >only available since kernel v3.5. For older kernels, the
> >VHOST_USER_SEND_RARP request comes to the rescue.
> >
> >The payload of this new request is the mac address of the migrated VM,
> >with that, we could construct a RARP message, and then broadcast it
> >to host interfaces.
> >
> >That's how this patch works:
> >
> >- list all interfaces, with the help of SIOCGIFCONF ioctl command
> >
> >- construct an RARP message and broadcast it
> >
> >Cc: Thibaut Collet <thibaut.collet at 6wind.com>
> >Signed-off-by: Yuanhan Liu <yuanhan.liu at linux.intel.com>
> >---
> ...
> >+
> >+/*
> >+ * Broadcast a RARP message to all interfaces, to update
> >+ * switch's mac table
> >+ */
> >+int
> >+user_send_rarp(struct VhostUserMsg *msg)
> >+{
> >+ uint8_t *mac = (uint8_t *)&msg->payload.u64;
> >+ uint8_t rarp[RARP_BUF_SIZE];
> >+ struct ifconf ifc = {0, };
> >+ struct ifreq *ifr;
> >+ int nr = 16;
> >+ int fd;
> >+ uint32_t i;
> >+
> >+ RTE_LOG(DEBUG, VHOST_CONFIG,
> >+ ":: mac: %02x:%02x:%02x:%02x:%02x:%02x\n",
> >+ mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
> >+
> >+ make_rarp_packet(rarp, mac);
> >+
> >+ /*
> >+ * Get all interfaces
> >+ */
> >+ fd = socket(AF_INET, SOCK_DGRAM, 0);
> >+ if (fd < 0) {
> >+ perror("failed to create AF_INET socket");
> >+ return -1;
> >+ }
> >+
> >+again:
> >+ ifc.ifc_len = sizeof(*ifr) * nr;
> >+ ifc.ifc_buf = realloc(ifc.ifc_buf, ifc.ifc_len);
> >+
> >+ if (ioctl(fd, SIOCGIFCONF, &ifc) < 0) {
> >+ perror("failed at SIOCGIFCONF");
> >+ free(ifc.ifc_buf); /* don't leak the interface list on error */
> >+ close(fd);
> >+ return -1;
> >+ }
> >+
> >+ if (ifc.ifc_len == (int)sizeof(struct ifreq) * nr) {
> >+ /*
> >+ * current ifc_buf is not big enough to hold
> >+ * all interfaces; double it and try again.
> >+ */
> >+ nr *= 2;
> >+ goto again;
> >+ }
> >+
> >+ ifr = (struct ifreq *)ifc.ifc_buf;
> >+ for (i = 0; i < ifc.ifc_len / sizeof(struct ifreq); i++)
> >+ send_rarp(ifr[i].ifr_name, rarp);
> >+
> >+ free(ifc.ifc_buf); /* release the buffer grown by realloc() above */
> >+ close(fd);
> >+
> >+ return 0;
> >+}
>
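For reference, here is a minimal sketch of the kind of frame a helper like
make_rarp_packet() would have to fill in. The helper itself is elided from
the quote above, so this is not the code from the patch: the function name
build_rarp_frame(), the constants and the 42-byte length are assumptions made
for illustration, and the layout simply follows the RARP format from RFC 903
(EtherType 0x8035, opcode 3, MAC used as both sender and target hardware
address, IP fields left at zero).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN         6
#define ETHERTYPE_RARP  0x8035  /* EtherType assigned to RARP (RFC 903) */
#define RARP_OP_REQUEST 3       /* "request reverse" opcode */
#define RARP_FRAME_LEN  42      /* 14-byte Ethernet header + 28-byte ARP body */

/* Store a 16-bit value in network (big-endian) byte order. */
static void
put_be16(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v >> 8);
	p[1] = (uint8_t)(v & 0xff);
}

/*
 * Hypothetical helper (not from the patch): fill buf with a broadcast
 * RARP request announcing mac. Returns the number of bytes written.
 */
static size_t
build_rarp_frame(uint8_t buf[RARP_FRAME_LEN], const uint8_t mac[MAC_LEN])
{
	assert(buf != NULL && mac != NULL);

	memset(buf, 0, RARP_FRAME_LEN);

	/* Ethernet header */
	memset(buf, 0xff, MAC_LEN);           /* destination: broadcast */
	memcpy(buf + 6, mac, MAC_LEN);        /* source: the migrated VM */
	put_be16(buf + 12, ETHERTYPE_RARP);

	/* ARP body */
	put_be16(buf + 14, 1);                /* hardware type: Ethernet */
	put_be16(buf + 16, 0x0800);           /* protocol type: IPv4 */
	buf[18] = MAC_LEN;                    /* hardware address length */
	buf[19] = 4;                          /* protocol address length */
	put_be16(buf + 20, RARP_OP_REQUEST);
	memcpy(buf + 22, mac, MAC_LEN);       /* sender hardware address */
	/* bytes 28..31: sender IP, left as 0.0.0.0 */
	memcpy(buf + 32, mac, MAC_LEN);       /* target hardware address */
	/* bytes 38..41: target IP, left as 0.0.0.0 */

	return RARP_FRAME_LEN;
}
```

The switch only needs the source MAC in the Ethernet header to update its
table; a real sender may also have to pad the frame up to the Ethernet
minimum, which this sketch leaves out.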
> From how you implement user_send_rarp(), if I understand it correctly, it
> broadcasts the RARP packet to all host interfaces, which I don't think is
> appropriate. The RARP packet should be sent only to the VM's own L2 network.
> You should not assume that all interfaces maintained in the kernel are in
> the same L2 network. Even worse, this could cause problems in overlay
> networking, where two VMs in two different overlay networks can have the
> same MAC address.
>
> What I suggest here is to move user_send_rarp() to rte_vhost_dequeue_burst(),
> controlled by a flag, so that the RARP packet is broadcast in the VM's own
> L2 network.
I have thought of that, too. I gave it up because the SEND_RARP request is
handled in a different thread from rte_vhost_dequeue_burst(), which means
the RARP packet would not be broadcast immediately after migration is done:
it would be broadcast only the next time rte_vhost_dequeue_burst() is invoked.
I was thinking the delay might be a problem. On second thought, it doesn't
look like one: the GUEST_ANNOUNCE packet is also broadcast through
rte_vhost_dequeue_burst() (it's enqueued by the guest kernel, though). And
since we are a poll mode driver, the delay won't be an issue.
So, thanks. I will give it a quick try; it should work.
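Roughly, the handoff between the two threads could look like the sketch
below. This is only an illustration of the flag idea, not the eventual patch:
struct dev_state, request_rarp() and take_rarp_request() are made-up names,
it uses C11 atomics rather than any DPDK primitive, and it assumes a single
message-handling thread, a single polling thread, and no second SEND_RARP
request arriving before the first one has been consumed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-device state; not the real vhost device struct. */
struct dev_state {
	uint8_t     mac[6];          /* MAC carried by the SEND_RARP payload */
	atomic_bool broadcast_rarp;  /* set by control thread, cleared by data path */
};

/* Control thread: called when a VHOST_USER_SEND_RARP message arrives. */
static void
request_rarp(struct dev_state *dev, const uint8_t mac[6])
{
	assert(dev != NULL && mac != NULL);

	memcpy(dev->mac, mac, 6);
	/* Release: the MAC written above becomes visible before the flag. */
	atomic_store_explicit(&dev->broadcast_rarp, true,
			      memory_order_release);
}

/*
 * Data path: called at the start of each dequeue burst. Returns true
 * once per request and copies the MAC out so that the caller can build
 * the RARP packet; returns false when nothing is pending.
 */
static bool
take_rarp_request(struct dev_state *dev, uint8_t mac_out[6])
{
	/* Cheap check first, so the common case costs a single load. */
	if (!atomic_load_explicit(&dev->broadcast_rarp,
				  memory_order_relaxed))
		return false;

	/* Clear the flag; acquire pairs with the release store above. */
	if (!atomic_exchange_explicit(&dev->broadcast_rarp, false,
				      memory_order_acquire))
		return false;

	memcpy(mac_out, dev->mac, 6);
	return true;
}
```

When take_rarp_request() returns true, the dequeue path would build the RARP
packet and hand it to the application together with the rest of the burst,
so it goes out on whatever L2 network the application forwards that VM's
traffic to.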
--yliu