[dpdk-dev] [PATCH 1/6] cxgbe: Optimize forwarding performance for 40G

Rahul Lakkireddy rahul.lakkireddy at chelsio.com
Mon Oct 5 17:07:31 CEST 2015


On Monday, October 05, 2015 at 07:09:27 -0700, Ananyev, Konstantin wrote:
> Hi Rahul,

[...]

> > > > This additional check seems redundant for single segment
> > > > packets since rte_pktmbuf_free_seg also performs rte_mbuf_sanity_check.
> > > >
> > > > Several PMDs already prefer to use rte_pktmbuf_free_seg directly over
> > > > rte_pktmbuf_free as it is faster.
> > >
> > > Other PMDs use rte_pktmbuf_free_seg() because each TD has a segment
> > > associated with it. So once HW is done with the TD, SW frees the associated segment.
> > > In your case I don't see any point in re-implementing rte_pktmbuf_free() manually,
> > > and I don't think it would be any faster.
> > >
> > > Konstantin
> > 
> > As I mentioned below, I am clearly seeing a difference of 1 Mpps. And 1
> > Mpps is not a small difference IMHO.
> 
> Agree with you here - it is a significant difference.
> 
> > 
> > When running l3fwd with 8 queues, I also collected a perf report.
> > When using rte_pktmbuf_free, I see that it consumes around 6% of CPU, as
> > shown in the perf top report below:
> > --------------------
> > 32.00%  l3fwd                        [.] cxgbe_poll
> > 22.25%  l3fwd                        [.] t4_eth_xmit
> > 20.30%  l3fwd                        [.] main_loop
> >  6.77%  l3fwd                        [.] rte_pktmbuf_free
> >  4.86%  l3fwd                        [.] refill_fl_usembufs
> >  2.00%  l3fwd                        [.] write_sgl
> > .....
> > --------------------
> > 
> > When using rte_pktmbuf_free_seg directly, however, I don't see the above
> > problem. The perf top report now reads:
> > -------------------
> > 33.36%  l3fwd                        [.] cxgbe_poll
> > 32.69%  l3fwd                        [.] t4_eth_xmit
> > 19.05%  l3fwd                        [.] main_loop
> >  5.21%  l3fwd                        [.] refill_fl_usembufs
> >  2.40%  l3fwd                        [.] write_sgl
> > ....
> > -------------------
> 
> I don't think that 6% disappeared anywhere.
> As far as I can see, t4_eth_xmit() has now increased by roughly the same amount
> (you still have the same job to do).

Right.
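
For anyone comparing the two reports, the arithmetic can be sketched as
below. The figures are copied from the perf top output quoted above; the
variable and function names are made up for illustration. Note that perf top
percentages are shares of each run's own samples, so comparing them across
runs is only approximate.

```c
#include <assert.h>

/* Percent-of-samples figures copied from the two perf top reports above. */
static const double xmit_with_free     = 22.25; /* t4_eth_xmit, rte_pktmbuf_free run */
static const double free_share         = 6.77;  /* rte_pktmbuf_free, same run */
static const double xmit_with_free_seg = 32.69; /* t4_eth_xmit, rte_pktmbuf_free_seg run */

/* Difference between two percentage shares, with a basic range check. */
static double share_delta(double after, double before)
{
    assert(after >= 0.0 && after <= 100.0);
    assert(before >= 0.0 && before <= 100.0);
    return after - before;
}

/* TX path plus free cost in the first run: 22.25 + 6.77. */
static double combined_tx_share_before(void)
{
    return xmit_with_free + free_share;
}

/* Growth of t4_eth_xmit between the two runs: 32.69 - 22.25. */
static double xmit_growth(void)
{
    return share_delta(xmit_with_free_seg, xmit_with_free);
}
```

So the share that rte_pktmbuf_free had in the first run does not vanish in
the second; t4_eth_xmit grows by at least that much.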

> To me it looks like the compiler didn't really inline rte_pktmbuf_free() in that case.
> Could you add the 'always_inline' attribute to rte_pktmbuf_free()
> and see whether it makes any difference?
> 
> Konstantin 

I will try out the above and report back.
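
A minimal sketch of the kind of change being suggested, for reference. It
does not use the real DPDK headers: toy_mbuf, toy_alloc_chain, toy_free_seg
and toy_pktmbuf_free are hypothetical stand-ins (loosely modelled on
rte_mbuf, rte_pktmbuf_free_seg and rte_pktmbuf_free), and the attribute
spelling assumes GCC or Clang. It only illustrates forcing a chain-walking
free helper inline, so that its cost is accounted to the caller in a profile.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct rte_mbuf: one segment of a packet. */
struct toy_mbuf {
    struct toy_mbuf *next;  /* next segment, NULL for the last one */
    unsigned int nb_segs;   /* number of segments from here to the tail */
};

static unsigned int toy_freed_segs; /* counts frees so the effect is observable */

/* Build a chain of n segments; returns NULL if n == 0 or malloc fails. */
static struct toy_mbuf *toy_alloc_chain(unsigned int n)
{
    struct toy_mbuf *head = NULL;

    for (unsigned int i = 0; i < n; i++) {
        struct toy_mbuf *m = malloc(sizeof(*m));
        if (m == NULL) {
            /* Undo the partial chain before giving up. */
            while (head != NULL) {
                struct toy_mbuf *t = head->next;
                free(head);
                head = t;
            }
            return NULL;
        }
        m->next = head;
        m->nb_segs = i + 1;
        head = m;
    }
    return head;
}

/* Free a single segment; the assert plays the role of a sanity check. */
static inline __attribute__((always_inline)) void
toy_free_seg(struct toy_mbuf *m)
{
    assert(m != NULL);
    toy_freed_segs++;
    free(m);
}

/* Free every segment of a packet; forced inline into the caller. */
static inline __attribute__((always_inline)) void
toy_pktmbuf_free(struct toy_mbuf *m)
{
    while (m != NULL) {
        struct toy_mbuf *next = m->next;
        toy_free_seg(m);
        m = next;
    }
}
```

With the attribute in place the helper no longer appears as a separate symbol
in a profile; whether that changes throughput is exactly what needs measuring.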


Thanks,
Rahul.

