[dpdk-users] Strange packet loss with multi-frame payloads

Shyam Shrivastav shrivastav.shyam at gmail.com
Tue Jul 18 14:20:25 CEST 2017


Yes, your RSS configuration is not the issue then.
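
If you want to double-check the flow-to-queue mapping, one option is to
recompute the Toeplitz hash in software with rte_thash.h and compare it
against the queue that actually receives the flow. A minimal sketch, assuming
you have the 40-byte key the PMD uses and that flows are spread as
hash % nb_queues (real NICs may go through a RETA indirection table instead):

#include <rte_thash.h>

static uint16_t
queue_for_flow(uint32_t sip, uint32_t dip, uint16_t sport, uint16_t dport,
               const uint8_t *rss_key, uint16_t nb_rx_queues)
{
        union rte_thash_tuple tuple;

        tuple.v4.src_addr = sip;   /* all fields in host byte order */
        tuple.v4.dst_addr = dip;
        tuple.v4.sport    = sport;
        tuple.v4.dport    = dport;

        /* Same Toeplitz function the NIC applies for ETH_RSS_UDP */
        uint32_t hash = rte_softrss((uint32_t *)&tuple,
                                    RTE_THASH_V4_L4_LEN, rss_key);
        return hash % nb_rx_queues;   /* ignores RETA; sketch only */
}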

On Tue, Jul 18, 2017 at 4:36 PM, Harold Demure <harold.demure87 at gmail.com>
wrote:

> Hello again,
>   At the bottom of this email you find my rte_eth_conf configuration,
> which includes RSS.
> For my NIC, the documentation says RSS can only be used if the transport
> layer is also taken into account [1].
> For a given client/server pair, all the packets with the same src/dst port
> are received by the same core.
> So, to ensure that all the fragments are received by the same core, I keep
> the src/dst ports fixed.
>
> Indeed, this works just fine with smaller payloads (even multi-frame), and
> the clients always get their multi-frame replies, because an individual
> logical reply has all of its segments delivered to the same client thread.
>
> Thank you again for your feedback.
> Regards,
>   Harold
>
> =========
>
> [1] http://dpdk.org/doc/guides/nics/mlx4.html
>
> static struct rte_eth_conf port_conf = {
>         .rxmode = {
>                 .mq_mode    = ETH_MQ_RX_RSS,
>                 .split_hdr_size = 0,
>                 .header_split   = 0, /**< Header Split disabled */
>                 .hw_ip_checksum = 0, /**< IP checksum offload disabled */
>                 .hw_vlan_filter = 0, /**< VLAN filtering disabled */
>                 .jumbo_frame    = 0, /**< Jumbo Frame Support disabled */
>                 .hw_strip_crc   = 0, /**< CRC stripping by hardware disabled */
>                 .max_rx_pkt_len =  ETHER_MAX_LEN,
>                 .enable_scatter = 1
>         },
>         .rx_adv_conf = {
>                 .rss_conf = {
>                         .rss_key = NULL,
>                         .rss_hf = ETH_RSS_IP | ETH_RSS_UDP,
>                 },
>         },
>         .txmode = {
>                 .mq_mode = ETH_MQ_TX_NONE,
>         },
> };
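>
> For reference, this is roughly how one can read back the RSS configuration
> the PMD actually applied, in case it does not honour every requested rss_hf
> bit (port_id and the missing error handling here are illustrative only):
>
> #include <stdio.h>
> #include <inttypes.h>
> #include <rte_ethdev.h>
>
> struct rte_eth_rss_conf conf = { .rss_key = NULL }; /* key not needed */
>
> if (rte_eth_dev_rss_hash_conf_get(port_id, &conf) == 0)
>         printf("effective rss_hf: 0x%" PRIx64 "\n", conf.rss_hf);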
>
>
> 2017-07-18 12:07 GMT+02:00 Shyam Shrivastav <shrivastav.shyam at gmail.com>:
>
>> Hi Harold
>> I meant optimal performance with respect to packets per second. If there is
>> no loss without app-level fragmentation at the target pps with, say, 8 RX
>> queues, and the same test shows missing packets with app-level
>> fragmentation, then the issue might be somewhere else. What is your RSS
>> configuration? You should not take transport headers into account:
>> ETH_RSS_IPV4 is safe; otherwise, different app fragments of the same packet
>> can go to different RX queues.
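>>
>> For example, a minimal IP-only rss_conf would look like this
>> (ETH_RSS_FRAG_IPV4 is optional; it covers IP-fragmented frames on NICs
>> that hash them separately):
>>
>> .rx_adv_conf = {
>>         .rss_conf = {
>>                 .rss_key = NULL,
>>                 .rss_hf  = ETH_RSS_IPV4 | ETH_RSS_FRAG_IPV4,
>>         },
>> },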
>>
>> On Tue, Jul 18, 2017 at 3:06 PM, Harold Demure <harold.demure87 at gmail.com
>> > wrote:
>>
>>> Hello Shyam,
>>>    Thank you for your suggestion. I will try what you say. However, this
>>> problem arises only with specific workloads. For example, if the clients
>>> only send requests of 1 frame, everything runs smoothly even with 16 active
>>> queues. My problem arises only with bigger payloads and multiple queues.
>>> Shouldn't this suggest that the problem is not "simply" that my NIC drops
>>> packets with > X active queues?
>>>
>>> Regards,
>>>   Harold
>>>
>>> 2017-07-18 7:50 GMT+02:00 Shyam Shrivastav <shrivastav.shyam at gmail.com>:
>>>
>>>> As I understand it, the problem disappears with 1 RX queue on the server.
>>>> You can reduce the number of queues on the server from 8 and arrive at an
>>>> optimal value without packet loss.
>>>> With the Intel 82599 NIC, packet loss is experienced with more than 4 RX
>>>> queues; this was reported on the dpdk dev or users mailing list, which I
>>>> read in the archives some time back while looking for similar information
>>>> about the 82599.
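>>>>
>>>> Just as a sketch of the sweep (RX_RING_SIZE, nb_txq, mbuf_pool and
>>>> port_conf are assumed to exist in your setup):
>>>>
>>>> uint16_t nb_rxq = 4;  /* try 8, 4, 2, 1 and measure loss at target pps */
>>>>
>>>> if (rte_eth_dev_configure(port_id, nb_rxq, nb_txq, &port_conf) < 0)
>>>>         rte_exit(EXIT_FAILURE, "cannot configure port\n");
>>>> for (uint16_t q = 0; q < nb_rxq; q++)
>>>>         if (rte_eth_rx_queue_setup(port_id, q, RX_RING_SIZE,
>>>>                                    rte_eth_dev_socket_id(port_id),
>>>>                                    NULL, mbuf_pool) < 0)
>>>>                 rte_exit(EXIT_FAILURE, "cannot set up RX queue %u\n", q);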
>>>>
>>>> On Tue, Jul 18, 2017 at 4:54 AM, Harold Demure <
>>>> harold.demure87 at gmail.com> wrote:
>>>>
>>>>> Hello again,
>>>>>   I tried to convert my statically defined buffers into buffers allocated
>>>>> through rte_malloc (as discussed in the previous email, see quoted text).
>>>>> Unfortunately, the problem is still there :(
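>>>>>
>>>>> For completeness, the change looks roughly like this (SIZE and temp_mbuf
>>>>> are the names from my code quoted below; the alignment and socket
>>>>> arguments are just the obvious defaults):
>>>>>
>>>>> #include <rte_malloc.h>
>>>>> #include <rte_lcore.h>
>>>>>
>>>>> /* before: static struct rte_mbuf *temp_mbuf[SIZE]; three per core */
>>>>> struct rte_mbuf **temp_mbuf =
>>>>>         rte_zmalloc_socket("temp_mbuf",
>>>>>                            SIZE * sizeof(struct rte_mbuf *),
>>>>>                            RTE_CACHE_LINE_SIZE, rte_socket_id());
>>>>> if (temp_mbuf == NULL)
>>>>>         rte_exit(EXIT_FAILURE, "cannot allocate temp_mbuf array\n");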
>>>>> Regards,
>>>>>   Harold
>>>>>
>>>>>
>>>>>
>>>>> >
>>>>> > 2. How do you know you have the packet loss?
>>>>> >
>>>>> >
>>>>> > *I know it because some fragmented packets never get reassembled
>>>>> > fully. If I print the packets seen by the server, I see something like
>>>>> > "PCKT_ID 10 FRAG 250, PCKT_ID 10 FRAG 252". And FRAG 251 is never
>>>>> > printed.*
>>>>> >
>>>>> > *Actually, something strange that happens sometimes is that a core
>>>>> > receives fragments of two packets interleaved: say, frag 1 of packet
>>>>> > X, frag 2 of packet Y, frag 3 of packet X, frag 4 of packet Y.*
>>>>> > *Or that, after "losing" a fragment of packet X, I only see fragments
>>>>> > with an EVEN frag_id printed for that packet X, at least for a while.*
>>>>> >
>>>>> > *This led me to also consider a bug in my implementation (I don't
>>>>> > experience this problem if I run with a SINGLE client thread). However,
>>>>> > with smaller payloads, even fragmented ones, everything runs smoothly.*
>>>>> > *If you have any suggestions for tests to run to spot a possible bug
>>>>> > in my implementation, it'd be more than welcome!*
>>>>> >
>>>>> > *MORE ON THIS: the buffers in which I store the packets taken from RX
>>>>> > are statically defined arrays, like struct rte_mbuf *temp_mbuf[SIZE].
>>>>> > SIZE can be pretty high (say, 10K entries), and there are 3 of those
>>>>> > arrays per core. Could it be that, somehow, they mess up the memory
>>>>> > layout (e.g., they overlap)?*
>>>>> >
>>>>>
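>>>>> One way to make the missing-fragment check systematic, instead of
>>>>> eyeballing the prints, would be a per-packet bitmap (sketch only; the
>>>>> struct and the names in it are invented, not from my code):
>>>>>
>>>>> #define MAX_FRAGS 1024
>>>>>
>>>>> struct reasm_state {
>>>>>         uint32_t pckt_id;
>>>>>         uint16_t nb_frags;              /* expected fragment count */
>>>>>         uint8_t  seen[MAX_FRAGS / 8];   /* one bit per fragment id */
>>>>> };
>>>>>
>>>>> static inline void
>>>>> mark_frag(struct reasm_state *s, uint16_t frag_id)
>>>>> {
>>>>>         s->seen[frag_id / 8] |= (uint8_t)(1u << (frag_id % 8));
>>>>> }
>>>>>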
>>>>
>>>>
>>>
>>
>

