[dpdk-dev] [PATCH v4] vhost: Add indirect descriptors support to the TX path

Maxime Coquelin maxime.coquelin at redhat.com
Fri Oct 28 09:42:13 CEST 2016



On 10/28/2016 02:49 AM, Wang, Zhihong wrote:
>
>> > -----Original Message-----
>> > From: Yuanhan Liu [mailto:yuanhan.liu at linux.intel.com]
>> > Sent: Thursday, October 27, 2016 6:46 PM
>> > To: Maxime Coquelin <maxime.coquelin at redhat.com>
>> > Cc: Wang, Zhihong <zhihong.wang at intel.com>;
>> > stephen at networkplumber.org; Pierre Pfister (ppfister)
>> > <ppfister at cisco.com>; Xie, Huawei <huawei.xie at intel.com>; dev at dpdk.org;
>> > vkaplans at redhat.com; mst at redhat.com
>> > Subject: Re: [dpdk-dev] [PATCH v4] vhost: Add indirect descriptors support
>> > to the TX path
>> >
>> > On Thu, Oct 27, 2016 at 12:35:11PM +0200, Maxime Coquelin wrote:
>>> > >
>>> > >
>>> > > On 10/27/2016 12:33 PM, Yuanhan Liu wrote:
>>>> > > >On Thu, Oct 27, 2016 at 11:10:34AM +0200, Maxime Coquelin wrote:
>>>>> > > >>Hi Zhihong,
>>>>> > > >>
>>>>> > > >>On 10/27/2016 11:00 AM, Wang, Zhihong wrote:
>>>>>> > > >>>Hi Maxime,
>>>>>> > > >>>
>>>>>> > > >>>Seems indirect desc feature is causing serious performance
>>>>>> > > >>>degradation on Haswell platform, about 20% drop for both
>>>>>> > > >>>mrg=on and mrg=off (--txqflags=0xf00, non-vector version),
>>>>>> > > >>>both iofwd and macfwd.
>>>>> > > >>I tested PVP (with macswap on guest) and Txonly/Rxonly on an Ivy
>> > Bridge
>>>>> > > >>platform, and didn't faced such a drop.
>>>> > > >
>>>> > > >I was actually wondering that may be the cause. I tested it with
>>>> > > >my IvyBridge server as well, I saw no drop.
>>>> > > >
>>>> > > >Maybe you should find a similar platform (Haswell) and have a try?
>>> > > Yes, that's why I asked Zhihong whether he could test Txonly in guest to
>>> > > see if issue is reproducible like this.
>> >
>> > I have no Haswell box, otherwise I could do a quick test for you. IIRC,
>> > he tried to disable the indirect_desc feature, then the performance
>> > recovered. So, it's likely the indirect_desc is the culprit here.
>> >
>>> > > I will be easier for me to find an Haswell machine if it has not to be
>>> > > connected back to back to and HW/SW packet generator.
> In fact simple loopback test will also do, without pktgen.
>
> Start testpmd in both host and guest, and do "start" in one
> and "start tx_first 32" in another.
>
> Perf drop is about 24% in my test.
>

Thanks, I never tried this test.
I managed to find an Haswell platform (Intel(R) Xeon(R) CPU E5-2699 v3
@ 2.30GHz), and can reproduce the problem with the loop test you
mention. I see a performance drop about 10% (8.94Mpps/8.08Mpps).
Out of curiosity, what are the numbers you get with your setup?

As I never tried this test, I run it again on my Sandy Bridge setup, and
I also see a performance regression, this time of 4%.

If I understand correctly the test, only 32 packets are allocated,
corresponding to a single burst, which is less than the queue size.
So it makes sense that the performance is lower with this test case.

Thanks,
Maxime


More information about the dev mailing list