[dpdk-dev] [PATCH v3] vhost: allow for many vhost user ports
Jan Wickbom
jan.wickbom at ericsson.com
Tue Dec 13 14:15:20 CET 2016
> -----Original Message-----
> From: Yuanhan Liu [mailto:yuanhan.liu at linux.intel.com]
> Sent: den 13 december 2016 10:15
> To: Jan Wickbom <jan.wickbom at ericsson.com>
> Cc: dev at dpdk.org; Patrik Andersson R <patrik.r.andersson at ericsson.com>
> Subject: Re: [PATCH v3] vhost: allow for many vhost user ports
>
> On Mon, Dec 12, 2016 at 05:50:34PM +0100, Jan Wickbom wrote:
> > Currently select() is used to monitor file descriptors for vhostuser
> > ports. This limits the number of ports possible to create since the
> > fd number is used as index in the fd_set and we have seen fds > 1023.
> > This patch changes select() to poll(). This way we can keep an
> > packed (pollfd) array for the fds, e.g. as many fds as the size of
> > the array.
> >
> > Also see:
> > http://dpdk.org/ml/archives/dev/2016-April/037024.html
> >
> > Signed-off-by: Jan Wickbom <jan.wickbom at ericsson.com>
> > Reported-by: Patrik Andersson <patrik.r.andersson at ericsson.com>
> ...
> > +static struct pollfd rwfds[MAX_FDS];
>
> Though it's unlikely, but just assume we have multiple instance of
> fdset_event_dispatch(pfdset), a global rwfds will not work.
>
> Thought twice, and it's better to put it into the fdset struct:
>
> struct fdset {
> struct fdentry fd[MAX_FDS];
> struct pollfd rwfds[MAX_FDS];
> ...
>
Done
> > /**
> > * This functions runs in infinite blocking loop until there is no fd in
> > * pfdset. It calls corresponding r/w handler if there is event on the fd.
> > @@ -229,55 +217,71 @@
> > void
> > fdset_event_dispatch(struct fdset *pfdset)
> > {
> > - fd_set rfds, wfds;
> > - int i, maxfds;
> > + int i;
> > struct fdentry *pfdentry;
> > - int num = MAX_FDS;
> > fd_cb rcb, wcb;
> > void *dat;
> > int fd;
> > int remove1, remove2;
> > - int ret;
> >
> > if (pfdset == NULL)
> > return;
> >
> > - while (1) {
> > - struct timeval tv;
> > - tv.tv_sec = 1;
> > - tv.tv_usec = 0;
> > - FD_ZERO(&rfds);
> > - FD_ZERO(&wfds);
> > - pthread_mutex_lock(&pfdset->fd_mutex);
> > -
> > - maxfds = fdset_fill(&rfds, &wfds, pfdset);
> > -
> > - pthread_mutex_unlock(&pfdset->fd_mutex);
> > + memset(rwfds, 0, sizeof(rwfds));
> >
> > + while (1) {
> > /*
> > - * When select is blocked, other threads might
> unregister
> > + * When poll is blocked, other threads might
> unregister
> > * listenfds from and register new listenfds into
> fdset.
> > - * When select returns, the entries for listenfds
> in the fdset
> > + * When poll returns, the entries for listenfds in
> the fdset
> > * might have been updated. It is ok if there is
> unwanted call
> > * for new listenfds.
> > */
> > - ret = select(maxfds + 1, &rfds, &wfds, NULL,
> &tv);
> > - if (ret <= 0)
> > - continue;
> > + poll(rwfds, pfdset->num, 1000 /* millisecs */);
> >
> > - for (i = 0; i < num; i++) {
> > - remove1 = remove2 = 0;
> > + for (i = 0; i < pfdset->num; ) {
> > pthread_mutex_lock(&pfdset-
> >fd_mutex);
> > +
> > pfdentry = &pfdset->fd[i];
> > fd = pfdentry->fd;
> > +
> > + if (fd < 0) {
> > + /* Removed during
> poll */
> > +
> > +
> fdset_move_last(pfdset, i);
> > +
> fdset_shrink(pfdset);
> > +
> > +
> pthread_mutex_unlock(&pfdset->fd_mutex);
> > +
> > + continue;
> > + }
> > +
> > + if (!rwfds[i].revents) {
> > + /* No revents,
> maybe added during poll */
> > +
> > + rwfds[i].fd = fd;
> > + rwfds[i].events =
> pfdentry->rcb ? POLLIN : 0;
> > + rwfds[i].events |=
> pfdentry->wcb ? POLLOUT : 0;
> > +
> pthread_mutex_unlock(&pfdset->fd_mutex);
> > +
> > + i++;
> > + continue;
>
> I think it's error-prone to manipulate the rwfds here. Besides, it
> registers an fd repeatedly.
>
> The way I think of is:
>
> - set rwfds[i] at fdset_add().
>
> This also simply makes sure that pfdset->rwfds[i] and pfdset->fd[i] is
> correctly bond.
>
> fdset_add(fdset, fd, ...) {
> lock();
>
> i = fdset_find_free_slot(..);
>
> pfdset->fd[i]->fd = fd;
> pfdset->fd[i]->rcb = rcb;
> pfdset->fd[i]->...;
>
> pfdset->rwfds[i]->fd = fd;
> pfdset->rwfds[i]->events = ...;
> pfdset->rwfds[i]->revents = 0;
> }
>
>
> - set pfdset->fd[i]->fd = -1 on fdset_del. Note we should not decrease
> 'num' here, as we may be at poll.
>
> fdset_del(fdset, fd) {
> lock();
>
> i = fdset_find_fd(pfdset, fd);
> pfdset->fd[i]->fd = -1;
>
> ...
> }
>
>
>
> - log down pfdset->num before poll, because 'num' may increase during poll.
>
> I think it's optional, since even 'num' is increased during poll, it just
> leads to few more rwfds entries will be accessed. But it's not tracked by
> kernel, and revents is initiated with 0, that there is no issue.
>
>
> - shrink the fdset and rwfds (together) for those removed entries,
> __outside__
> the for loop after poll.
>
> Works to you?
I did quite of a rework of this yesterday before reading your mail. I think
That should work, please have a look.
V4 coming in a minute
>
> --yliu
More information about the dev
mailing list