[dpdk-dev] Occasional instability in RSS Hashes/Queues from X540 NIC

Matt Laswell laswell at infinite.io
Thu May 4 20:15:51 CEST 2017


Hey Keith,

Here is a hexdump of a subset of one of my packet captures.  In this
capture, all of the packets are part of the same TCP connection, which
happens to be NFSv3 traffic. All of them except packet number 6 get the
correct RSS hash and go to the right queue.  Packet number 6 (an NFS rename
reply with an NFS error) gets RSS hash 0 and goes to queue 0.   Whenever I
repeat this test, the reply to this particular rename attempt always goes
to the wrong core, though it seemingly differs from the rest of the flow
only in layers 4-7.

I'll also attach a pcap to this email, in case that's a more convenient
way to interact with the packets.
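
For reference, a minimal sketch of the Toeplitz hash that RSS is generally
described as using, applied to the two IPv4 addresses of this flow.  The
assumptions here are not confirmed by the thread: a 40-byte key built from
the 0x6d5a pattern mentioned in the original report quoted below, and a hash
input of source address followed by destination address (as an IPv4-only RSS
type such as ETH_RSS_IPV4 would imply).  The names toeplitz_hash,
ipv4_rss_input and SYMMETRIC_KEY are illustrative only and are not DPDK APIs.

```python
import socket

# Assumption: a 40-byte RSS key made of the 16-bit pattern 0x6d5a repeated.
SYMMETRIC_KEY = bytes([0x6D, 0x5A]) * 20


def toeplitz_hash(key: bytes, data: bytes) -> int:
    """Return the 32-bit Toeplitz hash of data under key."""
    key_int = int.from_bytes(key, "big")
    key_bits = len(key) * 8
    result = 0
    for bit_index in range(len(data) * 8):
        byte = data[bit_index // 8]
        if byte & (0x80 >> (bit_index % 8)):
            # XOR in the 32-bit key window that starts at this bit offset.
            shift = key_bits - 32 - bit_index
            result ^= (key_int >> shift) & 0xFFFFFFFF
    return result


def ipv4_rss_input(src: str, dst: str) -> bytes:
    """Assumed hash input for an IPv4-only RSS type: source, then destination."""
    return socket.inet_aton(src) + socket.inet_aton(dst)


client, server = "10.151.3.81", "10.151.3.161"
h_request = toeplitz_hash(SYMMETRIC_KEY, ipv4_rss_input(client, server))
h_reply = toeplitz_hash(SYMMETRIC_KEY, ipv4_rss_input(server, client))
print(f"request hash: {h_request:#010x}")
print(f"reply hash:   {h_reply:#010x}")
print("symmetric:", h_request == h_reply)
```

Swapping source and destination moves every input bit by 32 positions, a
multiple of the key's 16-bit period, so each bit meets the same key window
and both directions hash alike.  A software check along these lines could be
compared against the hash value reported in the mbuf for each packet.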

--
Matt Laswell
laswell at infinite.io


16:08:37.093306 IP 10.151.3.81.disclose > 10.151.3.161.nfsd: Flags [P.],
seq 3173509264:3173509380, ack 3244259549, win 580, options [nop,nop,TS val
23060466 ecr 490971270], length 116: NFS request xid 2690728524 112 access
fh Unknown/8B6BFEBB04000000CFABD10301000000FFFFFFFF00000000DABC050201000000
NFS_ACCESS_READ|NFS_ACCESS_LOOKUP|NFS_ACCESS_MODIFY|NFS_ACCESS_EXTEND|NFS_ACCESS_DELETE
0x0000:  4500 00a8 6d0f 4000 4006 b121 0a97 0351  E...m.@.@..!...Q
0x0010:  0a97 03a1 029b 0801 bd27 e890 c15f 78dd  .........'..._x.
0x0020:  8018 0244 1cba 0000 0101 080a 015f dff2  ...D........._..
0x0030:  1d43 a086 8000 0070 a061 424c 0000 0000  .C.....p.aBL....
0x0040:  0000 0002 0001 86a3 0000 0003 0000 0004  ................
0x0050:  0000 0001 0000 0020 0107 8d2f 0000 0007  .........../....
0x0060:  6573 7869 3275 3100 0000 0000 0000 0000  esxi2u1.........
0x0070:  0000 0001 0000 0000 0000 0000 0000 0000  ................
0x0080:  0000 0020 8b6b febb 0400 0000 cfab d103  .....k..........
0x0090:  0100 0000 ffff ffff 0000 0000 dabc 0502  ................
0x00a0:  0100 0000 0000 001f                      ........
16:08:37.095837 IP 10.151.3.161.nfsd > 10.151.3.81.disclose: Flags [P.],
seq 1:125, ack 116, win 28688, options [nop,nop,TS val 490971270 ecr
23060466], length 124: NFS reply xid 2690728524 reply ok 120 access c 001f
0x0000:  4500 00b0 1b80 4000 4006 02a9 0a97 03a1  E.....@.@.......
0x0010:  0a97 0351 0801 029b c15f 78dd bd27 e904  ...Q....._x..'..
0x0020:  8018 7010 a61a 0000 0101 080a 1d43 a086  ..p..........C..
0x0030:  015f dff2 8000 0078 a061 424c 0000 0001  ._.....x.aBL....
0x0040:  0000 0000 0000 0000 0000 0000 0000 0000  ................
0x0050:  0000 0000 0000 0001 0000 0002 0000 01ed  ................
0x0060:  0000 0003 0000 0000 0000 0000 0000 0000  ................
0x0070:  0000 0029 0000 0000 0000 0800 0000 00ff  ...)............
0x0080:  ffff 00ff 0000 0000 bbfe 6b8b 0000 0001  ..........k.....
0x0090:  03d1 abcf 5908 f554 3272 e4e6 5908 f554  ....Y..T2r..Y..T
0x00a0:  3272 e4e6 5908 f554 3365 2612 0000 001f  2r..Y..T3e&.....
16:08:37.096235 IP 10.151.3.81.disclose > 10.151.3.161.nfsd: Flags [P.],
seq 256:372, ack 285, win 589, options [nop,nop,TS val 23060467 ecr
490971270], length 116: NFS request xid 2724282956 112 access fh
Unknown/8B6BFEBB04000000D0ABD10301000000FFFFFFFF00000000DABC050201000000
NFS_ACCESS_READ|NFS_ACCESS_LOOKUP|NFS_ACCESS_MODIFY|NFS_ACCESS_EXTEND|NFS_ACCESS_DELETE
0x0000:  4500 00a8 6d11 4000 4006 b11f 0a97 0351  E...m.@.@......Q
0x0010:  0a97 03a1 029b 0801 bd27 e990 c15f 79f9  .........'..._y.
0x0020:  8018 024d 1cba 0000 0101 080a 015f dff3  ...M........._..
0x0030:  1d43 a086 8000 0070 a261 424c 0000 0000  .C.....p.aBL....
0x0040:  0000 0002 0001 86a3 0000 0003 0000 0004  ................
0x0050:  0000 0001 0000 0020 0107 8d2f 0000 0007  .........../....
0x0060:  6573 7869 3275 3100 0000 0000 0000 0000  esxi2u1.........
0x0070:  0000 0001 0000 0000 0000 0000 0000 0000  ................
0x0080:  0000 0020 8b6b febb 0400 0000 d0ab d103  .....k..........
0x0090:  0100 0000 ffff ffff 0000 0000 dabc 0502  ................
0x00a0:  0100 0000 0000 001f                      ........
16:08:37.098361 IP 10.151.3.161.nfsd > 10.151.3.81.disclose: Flags [P.],
seq 285:409, ack 372, win 28688, options [nop,nop,TS val 490971270 ecr
23060467], length 124: NFS reply xid 2724282956 reply ok 120 access c 001f
0x0000:  4500 00b0 1b81 4000 4006 02a8 0a97 03a1  E.....@.@.......
0x0010:  0a97 0351 0801 029b c15f 79f9 bd27 ea04  ...Q....._y..'..
0x0020:  8018 7010 ec45 0000 0101 080a 1d43 a086  ..p..E.......C..
0x0030:  015f dff3 8000 0078 a261 424c 0000 0001  ._.....x.aBL....
0x0040:  0000 0000 0000 0000 0000 0000 0000 0000  ................
0x0050:  0000 0000 0000 0001 0000 0002 0000 01ed  ................
0x0060:  0000 0004 0000 0000 0000 0000 0000 0000  ................
0x0070:  0000 0050 0000 0000 0000 0800 0000 00ff  ...P............
0x0080:  ffff 00ff 0000 0000 bbfe 6b8b 0000 0001  ..........k.....
0x0090:  03d1 abd0 5908 f554 3536 88ea 5908 f554  ....Y..T56..Y..T
0x00a0:  3536 88ea 5908 f555 01ff bf76 0000 001f  56..Y..U...v....
16:08:37.099013 IP 10.151.3.81.disclose > 10.151.3.161.nfsd: Flags [P.],
seq 652:856, ack 813, win 605, options [nop,nop,TS val 23060467 ecr
490971270], length 204: NFS request xid 2774614604 200 rename fh
Unknown/8B6BFEBB04000000D0ABD10301000000FFFFFFFF00000000DABC050201000000
"DirReplaceNotEmpty_ovr" -> fh
Unknown/8B6BFEBB04000000D0ABD10301000000FFFFFFFF00000000DABC050201000000
"DirReplaceNotEmpty_src"
0x0000:  4500 0100 6d14 4000 4006 b0c4 0a97 0351  E...m.@.@......Q
0x0010:  0a97 03a1 029b 0801 bd27 eb1c c15f 7c09  .........'..._|.
0x0020:  8018 025d 1d12 0000 0101 080a 015f dff3  ...]........._..
0x0030:  1d43 a086 8000 00c8 a561 424c 0000 0000  .C.......aBL....
0x0040:  0000 0002 0001 86a3 0000 0003 0000 000e  ................
0x0050:  0000 0001 0000 0020 0107 8d2f 0000 0007  .........../....
0x0060:  6573 7869 3275 3100 0000 0000 0000 0000  esxi2u1.........
0x0070:  0000 0001 0000 0000 0000 0000 0000 0000  ................
0x0080:  0000 0020 8b6b febb 0400 0000 d0ab d103  .....k..........
0x0090:  0100 0000 ffff ffff 0000 0000 dabc 0502  ................
0x00a0:  0100 0000 0000 0016 4469 7252 6570 6c61  ........DirRepla
0x00b0:  6365 4e6f 7445 6d70 7479 5f6f 7672 0000  ceNotEmpty_ovr..
0x00c0:  0000 0020 8b6b febb 0400 0000 d0ab d103  .....k..........
0x00d0:  0100 0000 ffff ffff 0000 0000 dabc 0502  ................
0x00e0:  0100 0000 0000 0016 4469 7252 6570 6c61  ........DirRepla
0x00f0:  6365 4e6f 7445 6d70 7479 5f73 7263 0000  ceNotEmpty_src..
16:08:37.101770 IP 10.151.3.161.nfsd > 10.151.3.81.disclose: Flags [P.],
seq 4294966865:4294966961, ack 4294967244, win 28688, options [nop,nop,TS
val 490971270 ecr 23060467], length 96: NFS reply xid 2774614604 reply ok
92 rename ERROR: File exists
0x0000:  4500 0094 1b82 4000 4006 02c3 0a97 03a1  E.....@.@.......
0x0010:  0a97 0351 0801 029b c15f 772d bd27 e85c  ...Q....._w-.'.\
0x0020:  8018 7010 c0f8 0000 0101 080a 1d43 a086  ..p..........C..
0x0030:  015f dff3 8000 005c a561 424c 0000 0001  ._.....\.aBL....
0x0040:  0000 0000 0000 0000 0000 0000 0000 0000  ................
0x0050:  0000 0011 0000 0001 0000 0000 0000 0050  ...............P
0x0060:  5908 f554 3536 88ea 5908 f555 01ff bf76  Y..T56..Y..U...v
0x0070:  0000 0000 0000 0001 0000 0000 0000 0050  ...............P
0x0080:  5908 f554 3536 88ea 5908 f555 01ff bf76  Y..T56..Y..U...v
0x0090:  0000 0000                                ....
16:08:37.101774 IP 10.151.3.81.disclose > 10.151.3.161.nfsd: Flags [.], ack
813, win 605, options [nop,nop,TS val 23060468 ecr 490971270,nop,nop,sack 1
{4294966865:4294966961}], length 0
0x0000:  4500 0040 6d15 4000 4006 b183 0a97 0351  E..@m.@.@......Q
0x0010:  0a97 03a1 029b 0801 bd27 ebe8 c15f 7c09  .........'..._|.
0x0020:  b010 025d 1c52 0000 0101 080a 015f dff4  ...].R......._..
0x0030:  1d43 a086 0101 050a c15f 772d c15f 778d  .C......._w-._w.
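
The header fields in the dumps above can be compared mechanically.  The
sketch below decodes the first 32 bytes of the fourth packet (a reply that
was hashed correctly) and of the sixth packet (the rename reply that got hash
0), then lists which IPv4 header fields differ.  The byte strings are copied
from the hexdumps; the helper names and field labels are illustrative and do
not come from any library.

```python
import struct

# First 32 bytes (IPv4 header plus start of the TCP header) from the dumps.
GOOD_REPLY = bytes.fromhex(
    "450000b01b814000400602a80a9703a1"
    "0a9703510801029bc15f79f9bd27ea04"
)
BAD_REPLY = bytes.fromhex(
    "450000941b824000400602c30a9703a1"
    "0a9703510801029bc15f772dbd27e85c"
)

IPV4_FIELDS = ("version_ihl", "tos", "total_length", "ip_id", "flags_frag",
               "ttl", "protocol", "checksum", "src_ip", "dst_ip")


def parse_ipv4(packet: bytes) -> dict:
    """Decode the fixed 20-byte IPv4 header into named fields."""
    values = struct.unpack("!BBHHHBBH4s4s", packet[:20])
    return dict(zip(IPV4_FIELDS, values))


def parse_tcp_ports(packet: bytes) -> tuple:
    """Return (source port, destination port) from the TCP header."""
    ihl_bytes = (packet[0] & 0x0F) * 4
    return struct.unpack("!HH", packet[ihl_bytes:ihl_bytes + 4])


def differing_fields(a: dict, b: dict) -> list:
    """Names of the fields whose values are not equal in a and b."""
    return [name for name in a if a[name] != b[name]]


good, bad = parse_ipv4(GOOD_REPLY), parse_ipv4(BAD_REPLY)
print("IPv4 fields that differ:", differing_fields(good, bad))
print("TCP ports equal:",
      parse_tcp_ports(GOOD_REPLY) == parse_tcp_ports(BAD_REPLY))
```

Run against these two packets, the only IPv4 header differences it reports
are the total length, IP ID and checksum, and the TCP ports are identical.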

On Thu, May 4, 2017 at 11:34 AM, Wiles, Keith <keith.wiles at intel.com> wrote:

>
> > On May 4, 2017, at 8:04 AM, Matt Laswell <laswell at infinite.io> wrote:
> >
> > Hey Folks,
> >
> > I'm seeing some strange behavior with regard to the RSS hash values in my
> > applications and was hoping somebody might have some pointers on where to
> > look.  In my application, I'm using RSS to divide work among multiple
> > cores, each of which services a single RX queue.  When dealing with a
> > single long-lived TCP connection, I occasionally see packets going to the
> > wrong core.  That is, almost all of the packets in the connection go to
> > core 5 in this case, but every once in a while, one goes to core 0
> > instead.
> >
> > Upon further investigation, I find two problems are occurring.  The first
> > is that problem packets have the RSS hash value in the mbuf incorrectly
> > set to zero.  They are therefore put in queue zero, where they are read
> > by core zero.  Other packets from the same connection that occur
> > immediately before and after the packet in question have the correct hash
> > value and therefore go to a different core.  The second problem is that
> > we sometimes see packets in which the RSS hash in the mbuf appears
> > correct, but the packets are incorrectly put into queue zero.  As with
> > the first, this results in the wrong core getting the packet.  Either one
> > of these confuses the state tracking we're doing per-core.
> >
> > A few details:
> >
> >   - Using an Intel X540-AT2 NIC and the igb_uio driver
> >   - DPDK 16.04
> >   - A particular packet in our workflow always encounters this problem.
> >   - Retransmissions of the packet in question also encounter the problem
> >   - The packet is IPv4, with a header length of 20 bytes (so no options)
> >   and no fragmentation.
> >   - The only differences I can see in the IP header between packets that
> >   get the right hash value and those that get the wrong one are in the
> >   IP ID, total length, and checksum fields.
> >   - Using ETH_RSS_IPV4
> >   - The packet is TCP with about 100 bytes of payload - it's not a jumbo
> >   or a runt
> >   - We fill the key with the repeating pattern 0x6d5a to get symmetric
> >   hashing of both sides of the connection
> >   - We only configure RSS information at boot; things like the key or
> >   header fields are not being changed dynamically
> >   - Traffic load is light when the problem occurs
> >
> > Is anybody aware of an erratum, either in the NIC or in the PMD's
> > configuration of it, that might explain something like this?  Failing
> > that, if you ran into this sort of behavior, how would you approach
> > finding the reason for the error?  Every failure mode I can think of
> > would tend to affect all of the packets in the connection consistently,
> > even if incorrectly.
>
> Just to add more information to this email, can you provide hexdumps of
> the packets to help someone maybe spot the problem?
>
> Need the previous OK packet plus the one after it and the failing packets
> you are seeing.
>
> I do not know why this is happening as I do not know of any errata to
> explain this issue.
>
> >
> > Thanks in advance for any ideas.
> >
> > --
> > Matt Laswell
> > laswell at infinite.io
>
> Regards,
> Keith
>
>

