net/af_xdp: submit valid count to Tx queue

Message ID 20190410105327.110410-1-xiaolong.ye@intel.com (mailing list archive)
State Superseded, archived
Delegated to: Ferruh Yigit
Series: net/af_xdp: submit valid count to Tx queue

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation success Compilation OK
ci/mellanox-Performance-Testing success Performance Testing PASS
ci/intel-Performance-Testing fail Performance Testing issues

Commit Message

Xiaolong Ye April 10, 2019, 10:53 a.m. UTC
  Since the tx callback only picks mbufs that can be copied to the xdp desc, it
should call xsk_ring_prod__submit with the valid count rather than nb_pkts.

Fixes: f1debd77efaf ("net/af_xdp: introduce AF_XDP PMD")

Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
---
 drivers/net/af_xdp/rte_eth_af_xdp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
  

Comments

David Marchand April 10, 2019, 11:30 a.m. UTC | #1
On Wed, Apr 10, 2019 at 12:58 PM Xiaolong Ye <xiaolong.ye@intel.com> wrote:

> Since tx callback only picks mbufs that can be copied to the xdp desc, it
> should call xsk_ring_prod__submit with valid count other than nb_pkts.
>
> Fixes: f1debd77efaf ("net/af_xdp: introduce AF_XDP PMD")
>
> Reported-by: David Marchand <david.marchand@redhat.com>
> Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
> ---
>  drivers/net/af_xdp/rte_eth_af_xdp.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c
> b/drivers/net/af_xdp/rte_eth_af_xdp.c
> index 5cc643ce2..8c2ba5740 100644
> --- a/drivers/net/af_xdp/rte_eth_af_xdp.c
> +++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
> @@ -300,7 +300,7 @@ eth_af_xdp_tx(void *queue, struct rte_mbuf **bufs,
> uint16_t nb_pkts)
>                 rte_pktmbuf_free(mbuf);
>         }
>
> -       xsk_ring_prod__submit(&txq->tx, nb_pkts);
> +       xsk_ring_prod__submit(&txq->tx, valid);
>

Err, well, I think we will have an issue here.

Taking from the 5.1.0-rc4 sources I have:

static inline __u32 xsk_prod_nb_free(struct xsk_ring_prod *r, __u32 nb)
{
        __u32 free_entries = r->cached_cons - r->cached_prod;

        if (free_entries >= nb)
                return free_entries;

        /* Refresh the local tail pointer.
         * cached_cons is r->size bigger than the real consumer pointer so
         * that this addition can be avoided in the more frequently
         * executed code that computes free_entries in the beginning of
         * this function. Without this optimization it would have been
         * free_entries = r->cached_prod - r->cached_cons + r->size.
         */
        r->cached_cons = *r->consumer + r->size;

        return r->cached_cons - r->cached_prod;
}

static inline size_t xsk_ring_prod__reserve(struct xsk_ring_prod *prod,
                                            size_t nb, __u32 *idx)
{
        if (unlikely(xsk_prod_nb_free(prod, nb) < nb))
                return 0;

        *idx = prod->cached_prod;
        prod->cached_prod += nb;

        return nb;
}

static inline void xsk_ring_prod__submit(struct xsk_ring_prod *prod, size_t
nb)
{
        /* Make sure everything has been written to the ring before
signalling
         * this to the kernel.
         */
        smp_wmb();

        *prod->producer += nb;
}


If we reserve N slots but only submit n slots, we end up with an incorrect
view of the number of available slots later.
Either xsk_ring_prod__submit should also update cached_prod (but I am not
sure that was the intent of this API), or we must ensure that reserve and
submit stay consistent.

Did I miss some subtlety?
  
Xiaolong Ye April 11, 2019, 2:24 a.m. UTC | #2
On 04/10, David Marchand wrote:
>On Wed, Apr 10, 2019 at 12:58 PM Xiaolong Ye <xiaolong.ye@intel.com> wrote:
>
>> Since tx callback only picks mbufs that can be copied to the xdp desc, it
>> should call xsk_ring_prod__submit with valid count other than nb_pkts.
>>
>> Fixes: f1debd77efaf ("net/af_xdp: introduce AF_XDP PMD")
>>
>> Reported-by: David Marchand <david.marchand@redhat.com>
>> Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
>> ---
>>  drivers/net/af_xdp/rte_eth_af_xdp.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c
>> b/drivers/net/af_xdp/rte_eth_af_xdp.c
>> index 5cc643ce2..8c2ba5740 100644
>> --- a/drivers/net/af_xdp/rte_eth_af_xdp.c
>> +++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
>> @@ -300,7 +300,7 @@ eth_af_xdp_tx(void *queue, struct rte_mbuf **bufs,
>> uint16_t nb_pkts)
>>                 rte_pktmbuf_free(mbuf);
>>         }
>>
>> -       xsk_ring_prod__submit(&txq->tx, nb_pkts);
>> +       xsk_ring_prod__submit(&txq->tx, valid);
>>
>
>Err, well, I think we will have an issue here.
>
>Taking from the 5.1.0-rc4 sources I have:
>
>static inline __u32 xsk_prod_nb_free(struct xsk_ring_prod *r, __u32 nb)
>{
>        __u32 free_entries = r->cached_cons - r->cached_prod;
>
>        if (free_entries >= nb)
>                return free_entries;
>
>        /* Refresh the local tail pointer.
>         * cached_cons is r->size bigger than the real consumer pointer so
>         * that this addition can be avoided in the more frequently
>         * executed code that computs free_entries in the beginning of
>         * this function. Without this optimization it whould have been
>         * free_entries = r->cached_prod - r->cached_cons + r->size.
>         */
>        r->cached_cons = *r->consumer + r->size;
>
>        return r->cached_cons - r->cached_prod;
>}
>
>static inline size_t xsk_ring_prod__reserve(struct xsk_ring_prod *prod,
>                                            size_t nb, __u32 *idx)
>{
>        if (unlikely(xsk_prod_nb_free(prod, nb) < nb))
>                return 0;
>
>        *idx = prod->cached_prod;
>        prod->cached_prod += nb;
>
>        return nb;
>}
>
>static inline void xsk_ring_prod__submit(struct xsk_ring_prod *prod, size_t
>nb)
>{
>        /* Make sure everything has been written to the ring before
>signalling
>         * this to the kernel.
>         */
>        smp_wmb();
>
>        *prod->producer += nb;
>}
>
>
>If we reserve N slots, but only submit n slots, we end up with an incorrect
>opinion of the number of available slots later.
>Either xsk_ring_prod__submit should also update cached_prod (but I am not
>sure it was the intent of this api), or we must ensure that both reserve
>and submit are consistent.

I think you are right, the current design does have this flaw; I hadn't thought
of it before :( So, to make sure reserve and submit are consistent, what about
checking the valid count of mbufs at the beginning, then reserving exactly that
many slots?

Thanks,
Xiaolong
>
>Did I miss some subtility ?
>
>
>-- 
>David Marchand
  
David Marchand April 11, 2019, 7:20 a.m. UTC | #3
Hello Xiaolong,

On Thu, Apr 11, 2019 at 4:30 AM Ye Xiaolong <xiaolong.ye@intel.com> wrote:

> On 04/10, David Marchand wrote:
> >If we reserve N slots, but only submit n slots, we end up with an
> incorrect
> >opinion of the number of available slots later.
> >Either xsk_ring_prod__submit should also update cached_prod (but I am not
> >sure it was the intent of this api), or we must ensure that both reserve
> >and submit are consistent.
>
> I think you are right, current design does have the flaw, I haven't
> thought of
> it before :( So in order to make sure both reserve and submit are
> consistent, what
> about we check the valid count of mbuf at the beginning, then reserve the
> valid
> count slots?
>
>
Ok, I can see other places to inspect in the driver: reserve_fill_queue()
for the same issue, and eth_af_xdp_rx() for a similar issue but with
xsk_ring_cons__peek()/xsk_ring_cons__release()?
Validating the needed slot count before reserving/peeking in the prod/cons
rings seems the simplest fix.
  
Xiaolong Ye April 11, 2019, 7:27 a.m. UTC | #4
On 04/11, David Marchand wrote:
>Hello Xiaolong,
>
>On Thu, Apr 11, 2019 at 4:30 AM Ye Xiaolong <xiaolong.ye@intel.com> wrote:
>
>> On 04/10, David Marchand wrote:
>> >If we reserve N slots, but only submit n slots, we end up with an
>> incorrect
>> >opinion of the number of available slots later.
>> >Either xsk_ring_prod__submit should also update cached_prod (but I am not
>> >sure it was the intent of this api), or we must ensure that both reserve
>> >and submit are consistent.
>>
>> I think you are right, current design does have the flaw, I haven't
>> thought of
>> it before :( So in order to make sure both reserve and submit are
>> consistent, what
>> about we check the valid count of mbuf at the beginning, then reserve the
>> valid
>> count slots?
>>
>>
>Ok, I can see other places to inspect in the driver: reserve_fill_queue()
>for the same issue, and eth_af_xdp_rx() for a similar issue but with
>xsk_ring_cons__peek()/xsk_ring_cons__release() ?
>Validating the needed slots count before reserving/peeking in the prod/cons
>rings seems the most simple fix.

Ok, then I'll handle them in a follow-up patch.

Thanks,
Xiaolong
>
>
>-- 
>David Marchand
  

Patch

diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c
index 5cc643ce2..8c2ba5740 100644
--- a/drivers/net/af_xdp/rte_eth_af_xdp.c
+++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
@@ -300,7 +300,7 @@  eth_af_xdp_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
 		rte_pktmbuf_free(mbuf);
 	}
 
-	xsk_ring_prod__submit(&txq->tx, nb_pkts);
+	xsk_ring_prod__submit(&txq->tx, valid);
 
 	kick_tx(txq);