net/af_xdp: submit valid count to Tx queue
Commit Message
Since the Tx callback only picks mbufs that can be copied to the xdp descriptors, it
should call xsk_ring_prod__submit with the valid count rather than nb_pkts.
Fixes: f1debd77efaf ("net/af_xdp: introduce AF_XDP PMD")
Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
---
drivers/net/af_xdp/rte_eth_af_xdp.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Comments
On Wed, Apr 10, 2019 at 12:58 PM Xiaolong Ye <xiaolong.ye@intel.com> wrote:
> Since tx callback only picks mbufs that can be copied to the xdp desc, it
> should call xsk_ring_prod__submit with valid count other than nb_pkts.
>
> diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c
> index 5cc643ce2..8c2ba5740 100644
> --- a/drivers/net/af_xdp/rte_eth_af_xdp.c
> +++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
> @@ -300,7 +300,7 @@ eth_af_xdp_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
>  		rte_pktmbuf_free(mbuf);
>  	}
>
> -	xsk_ring_prod__submit(&txq->tx, nb_pkts);
> +	xsk_ring_prod__submit(&txq->tx, valid);
>
Err, well, I think we will have an issue here.
Taking from the 5.1.0-rc4 sources I have:
static inline __u32 xsk_prod_nb_free(struct xsk_ring_prod *r, __u32 nb)
{
	__u32 free_entries = r->cached_cons - r->cached_prod;

	if (free_entries >= nb)
		return free_entries;

	/* Refresh the local tail pointer.
	 * cached_cons is r->size bigger than the real consumer pointer so
	 * that this addition can be avoided in the more frequently
	 * executed code that computes free_entries in the beginning of
	 * this function. Without this optimization it would have been
	 * free_entries = r->cached_prod - r->cached_cons + r->size.
	 */
	r->cached_cons = *r->consumer + r->size;

	return r->cached_cons - r->cached_prod;
}

static inline size_t xsk_ring_prod__reserve(struct xsk_ring_prod *prod,
					    size_t nb, __u32 *idx)
{
	if (unlikely(xsk_prod_nb_free(prod, nb) < nb))
		return 0;

	*idx = prod->cached_prod;
	prod->cached_prod += nb;

	return nb;
}

static inline void xsk_ring_prod__submit(struct xsk_ring_prod *prod, size_t nb)
{
	/* Make sure everything has been written to the ring before
	 * signalling this to the kernel.
	 */
	smp_wmb();

	*prod->producer += nb;
}
If we reserve N slots but only submit n slots, we end up with an incorrect
view of the number of available slots later.
Either xsk_ring_prod__submit should also update cached_prod (but I am not
sure that was the intent of this API), or we must ensure that reserve
and submit are consistent.
Did I miss some subtlety?
On 04/10, David Marchand wrote:
>If we reserve N slots, but only submit n slots, we end up with an incorrect
>view of the number of available slots later.
>Either xsk_ring_prod__submit should also update cached_prod (but I am not
>sure it was the intent of this api), or we must ensure that both reserve
>and submit are consistent.
I think you are right, the current design does have this flaw; I hadn't thought
of it before :( So, to make sure reserve and submit are consistent, what about
checking the number of valid mbufs at the beginning, then reserving only that
many slots?
Thanks,
Xiaolong
Hello Xiaolong,
On Thu, Apr 11, 2019 at 4:30 AM Ye Xiaolong <xiaolong.ye@intel.com> wrote:
> On 04/10, David Marchand wrote:
> >If we reserve N slots, but only submit n slots, we end up with an incorrect
> >view of the number of available slots later.
> >Either xsk_ring_prod__submit should also update cached_prod (but I am not
> >sure it was the intent of this api), or we must ensure that both reserve
> >and submit are consistent.
>
> I think you are right, the current design does have this flaw; I hadn't
> thought of it before :( So, to make sure reserve and submit are consistent,
> what about checking the number of valid mbufs at the beginning, then
> reserving only that many slots?
>
>
Ok, I can see other places to inspect in the driver: reserve_fill_queue()
for the same issue, and eth_af_xdp_rx() for a similar issue but with
xsk_ring_cons__peek()/xsk_ring_cons__release().
Validating the needed slot count before reserving/peeking in the prod/cons
rings seems the simplest fix.
On 04/11, David Marchand wrote:
>Ok, I can see other places to inspect in the driver: reserve_fill_queue()
>for the same issue, and eth_af_xdp_rx() for a similar issue but with
>xsk_ring_cons__peek()/xsk_ring_cons__release() ?
>Validating the needed slots count before reserving/peeking in the prod/cons
>rings seems the most simple fix.
Ok, then I'll handle them in a following patch.
Thanks,
Xiaolong
@@ -300,7 +300,7 @@ eth_af_xdp_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
rte_pktmbuf_free(mbuf);
}
- xsk_ring_prod__submit(&txq->tx, nb_pkts);
+ xsk_ring_prod__submit(&txq->tx, valid);
kick_tx(txq);