Issues around packet capture when secondary process is doing rx/tx

Morten Brørup mb at smartsharesystems.com
Mon Jan 8 11:41:17 CET 2024


> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> Sent: Monday, 8 January 2024 02.59
> 
> I have been looking at a problem reported by Sandesh
> where packet capture does not work if rx/tx burst is done in secondary
> process.
> 
> The root cause is that existing rx/tx callback model just doesn't work
> unless the process doing the rx/tx burst calls is the same one that
> registered the callbacks.

So, callbacks don't work across processes, because the code, and the address it is loaded at, might differ across processes.

If process A is running, and RX'ing and TX'ing, and process B wants to install its own callbacks (e.g. packet capture) on RX and TX, we basically want process A to execute code residing in process B, which is impossible.

An alternative could be to pass the packets through a ring in shared memory. However, this method would add the ring processing latency of process B to the RX/TX latency of process A.

I think we can conclude that callbacks are one of the things that don't work with secondary processes.

With this decided, we can then consider how to best add packet capture. The concept of passing "data" (instead of calling functions) across processes obviously applies to this use case.
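
To make the "pass data, not function pointers" idea concrete, here is a minimal sketch in plain C11, not DPDK code. All names (cap_shared, cap_offer, cap_poll, CAP_RING_SIZE, CAP_SNAPLEN) are hypothetical. It assumes a single producer (the process doing rx/tx) and a single consumer (the capture process), and that the struct lives in memory mapped by both. Only a flag, indices and copied bytes are shared, so nothing in it depends on where code is loaded in either process.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define CAP_RING_SIZE 8u    /* hypothetical; must be a power of two */
#define CAP_SNAPLEN   64u   /* hypothetical; bytes kept per packet */

static_assert((CAP_RING_SIZE & (CAP_RING_SIZE - 1u)) == 0u,
              "ring size must be a power of two");

struct cap_record {
    uint16_t len;                 /* number of valid bytes in data[] */
    uint8_t  data[CAP_SNAPLEN];
};

/* Would live in shared memory: a flag, two indices, and copied bytes. */
struct cap_shared {
    atomic_bool enabled;          /* set by the capture process */
    atomic_uint head;             /* advanced only by the producer */
    atomic_uint tail;             /* advanced only by the consumer */
    struct cap_record slots[CAP_RING_SIZE];
};

/* Datapath side: called after rx (or before tx) for each packet.
 * Returns false if capture is off or the ring is full; it never blocks. */
static bool cap_offer(struct cap_shared *s, const void *pkt, uint16_t len)
{
    if (!atomic_load_explicit(&s->enabled, memory_order_acquire))
        return false;

    unsigned head = atomic_load_explicit(&s->head, memory_order_relaxed);
    unsigned tail = atomic_load_explicit(&s->tail, memory_order_acquire);
    if (head - tail == CAP_RING_SIZE)
        return false;             /* full: drop instead of stalling rx/tx */

    struct cap_record *r = &s->slots[head & (CAP_RING_SIZE - 1u)];
    r->len = len < CAP_SNAPLEN ? len : (uint16_t)CAP_SNAPLEN;
    memcpy(r->data, pkt, r->len);
    atomic_store_explicit(&s->head, head + 1u, memory_order_release);
    return true;
}

/* Capture side: copies out one record if available. */
static bool cap_poll(struct cap_shared *s, struct cap_record *out)
{
    unsigned tail = atomic_load_explicit(&s->tail, memory_order_relaxed);
    unsigned head = atomic_load_explicit(&s->head, memory_order_acquire);
    if (tail == head)
        return false;             /* empty */

    *out = s->slots[tail & (CAP_RING_SIZE - 1u)];
    atomic_store_explicit(&s->tail, tail + 1u, memory_order_release);
    return true;
}
```

When capture is disabled, the cost on the datapath is one load and one branch. When it is enabled, the datapath pays for the copy but never waits for the capture process; a full ring means the capture loses packets, not that rx/tx stalls. Several queues or lcores would need one ring each, or a multi-producer ring.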

> 
> An example sequence would be:
>    1. dumpcap (or pdump) as secondary tells pdump in primary to
>       register callback
>    2. secondary process calls rx_burst.
>    3. rx_burst sees the callback but it has pointer pdump_rx which
>       is not necessarily at same location in primary and secondary
>       process.
>    4. indirect function call in secondary to bad location likely
>       causes crash.
> 
> Some possible workarounds.
>    1. Keep callback list per-process: messy, but won't crash.
>       Capture won't work without other changes. In this primary
>       would register callback, but secondaries would not use them
>       in rx/tx burst.
>
>    2. Replace use of rx/tx callback in pdump with change to
>       rte_ethdev to have a capture flag. (i.e. don't use
>       indirection).  Likely ABI problems. Basically, ignore the
>       rx/tx callback mechanism. This is my preferred solution.
>
>    3. Some fix up mechanism (in EAL mp support?) to have each
>       process fixup its callback mechanism.
>
>    4. Do something in pdump_init to register the callback in same
>       process context (probably need callbacks to be per-process).
>       Would mean callback is always on independent of capture being
>       enabled.
>
>    5. Get rid of indirect function call pointer, and replace it by
>       index into a static table of callback functions. Every
>       process would have same code (in this case pdump_rx) but at
>       different address.  Requires all callbacks to be statically
>       defined at build time.
> 
> The existing rx/tx callback is not safe if rx/tx burst is called from
> a different process than where the callback is registered.
> 
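
For reference, workaround 5 (an index into a static table instead of a function pointer) could look roughly like the sketch below. This is plain C, not the rte_ethdev API. The names (rx_cb_fn, rx_cb_table, shared_queue_cfg, run_rx_cb, count_pkts_cb) are hypothetical, and the callback signature is simplified for illustration. It assumes every process is built with the same table contents.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical callback signature (not rte_rx_callback_fn). */
typedef uint16_t (*rx_cb_fn)(uint16_t port, uint16_t nb_pkts, void *user);

/* Callback identifiers: the only thing that would be stored in shared memory. */
enum rx_cb_id {
    RX_CB_NONE = 0,
    RX_CB_COUNT_PKTS,
    RX_CB_MAX
};

static_assert(RX_CB_MAX <= UINT8_MAX, "callback id must fit in uint8_t");

/* Stand-in for pdump_rx: counts packets into a process-local counter. */
static uint16_t count_pkts_cb(uint16_t port, uint16_t nb_pkts, void *user)
{
    (void)port;
    *(uint64_t *)user += nb_pkts;
    return nb_pkts;
}

/* Static table, resolved separately in each process at link/load time.
 * The addresses may differ between processes; the indices do not. */
static const rx_cb_fn rx_cb_table[RX_CB_MAX] = {
    [RX_CB_NONE]       = NULL,
    [RX_CB_COUNT_PKTS] = count_pkts_cb,
};

/* Per-queue state as it might sit in shared memory: an index, no pointer. */
struct shared_queue_cfg {
    uint8_t rx_cb_id;
};

/* Called from the rx burst path of whichever process does the rx. */
static uint16_t run_rx_cb(const struct shared_queue_cfg *cfg,
                          uint16_t port, uint16_t nb_pkts, void *user)
{
    uint8_t id = cfg->rx_cb_id;

    if (id >= RX_CB_MAX || rx_cb_table[id] == NULL)
        return nb_pkts;           /* no callback, or unknown id: pass through */
    return rx_cb_table[id](port, nb_pkts, user);
}
```

Note that the user argument has the same problem as the function pointer if it is stored in shared memory and points at process-local data. In this sketch it is supplied by the calling process, so that question is left open.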


