[dpdk-dev] [PATCH v2 15/15] app/test: add unit tests for SW eventdev driver
Van Haaren, Harry
harry.van.haaren at intel.com
Wed Feb 8 11:44:11 CET 2017
> -----Original Message-----
> From: Jerin Jacob [mailto:jerin.jacob at caviumnetworks.com]
> Sent: Wednesday, February 8, 2017 10:23 AM
> To: Van Haaren, Harry <harry.van.haaren at intel.com>
> Cc: dev at dpdk.org; Richardson, Bruce <bruce.richardson at intel.com>; Hunt, David
> <david.hunt at intel.com>; nipun.gupta at nxp.com; hemant.agrawal at nxp.com; Eads, Gage
> <gage.eads at intel.com>
> Subject: Re: [PATCH v2 15/15] app/test: add unit tests for SW eventdev driver
<snip>
> Thanks for the SW driver specific test cases. They gave me good insight into
> the application behavior expected from the SW driver's perspective, and in turn
> raised some challenges for portable applications.
>
> I would like to highlight a main difference between the implementations and get
> consensus on how to abstract it.
Thanks for taking the time to detail your thoughts - the examples certainly help to get a better picture of the whole.
> Based on the existing header file, we can do event pipelining in two different ways:
> a) Flow-based event pipelining
> b) queue_id based event pipelining
>
> I will provide an example to showcase application flow in both modes.
> Based on my understanding of the SW driver source code, it supports only
> queue_id based event pipelining. I guess flow-based event pipelining will
> work semantically with the SW driver, but it will be very slow.
>
> I think the reason for the difference is what each implementation uses as the context definition:
> In the SW model, the context is queue_id
> In the Cavium HW model, the context is queue_id + flow_id + sub_event_type +
> event_type
>
> AFAIK, queue_id based event pipelining will work with NXP HW, but I am not
> sure about the flow-based event pipelining model with NXP HW. I would appreciate
> any input on this.
>
> In Cavium HW, we support both modes.
>
> As an open question: should we add a capability flag to advertise the supported
> models and let the application choose the model based on implementation capability?
> The downside is that a small portion of the stage advance code will be different,
> but we can reuse the STAGE specific application code (I think it is a fair
> trade-off).
>
> Bruce, Harry, Gage, Hemant, Nipun
> Thoughts? Or any other proposal?
[HvH] Comments inline.
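To make the capability-flag question above concrete, here is a minimal sketch of how an application might pick a model once at init time. The flag names, the enum and ex_select_mode() are hypothetical and invented for illustration; they are not taken from the eventdev header, and preferring the flow-based model when both are offered is only an assumption.

```c
#include <stdint.h>

/* Hypothetical capability bits; these names are NOT part of the eventdev API. */
#define EX_CAP_QUEUE_PIPELINE (1u << 0) /* stage is carried in queue_id */
#define EX_CAP_FLOW_PIPELINE  (1u << 1) /* stage is carried in sub_event_type */

enum ex_pipeline_mode { EX_MODE_NONE, EX_MODE_QUEUE, EX_MODE_FLOW };

/*
 * Choose the pipelining model from the advertised capabilities.
 * Assumption: flow-based is preferred when the device offers both.
 */
static enum ex_pipeline_mode
ex_select_mode(uint32_t dev_caps)
{
	if (dev_caps & EX_CAP_FLOW_PIPELINE)
		return EX_MODE_FLOW;
	if (dev_caps & EX_CAP_QUEUE_PIPELINE)
		return EX_MODE_QUEUE;
	return EX_MODE_NONE;
}
```

With something like this, only the stage advance code would need to branch on the selected mode; the per-stage work functions could stay common.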
> I will take a non-trivial real-world networking use case to show the difference.
> Standard IPSec outbound processing will have a minimum of 4 to 5 stages
>
> stage_0:
> --------
> a) Takes the pkts from ethdev and pushes them to eventdev as
> RTE_EVENT_OP_NEW
> b) In some HW implementations, this will be done by HW. In the SW implementation
> it is done by service cores
>
> stage_1:(ORDERED)
> ------------------
> a) Receive pkts from stage_0 in an ORDERED flow and process them in parallel on N
> cores
> b) Find the SA that the packet belongs to and move to the next stage for SA specific
> outbound operations. Outbound processing starts with updating the
> sequence number in the critical section, followed by packet encryption in
> parallel.
>
> stage_2(ATOMIC) based on SA
> ----------------------------
> a) Update the sequence number and move to ORDERED sched_type for packet
> encryption in parallel
>
> stage_3(ORDERED) based on SA
> ----------------------------
> a) Encrypt the packets in parallel
> b) Do output route look-up and figure out tx port and queue to transmit
> the packet
> c) Move to ATOMIC stage based on tx port and tx queue_id to transmit
> the packet _without_ losing the ingress ordering
>
> stage_4(ATOMIC) based on tx port/tx queue
> -----------------------------------------
> a) enqueue the encrypted packet to ethdev tx port/tx_queue
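
The stage list above can be written down as a small table, which is the part both pipelining models would share. This is only a sketch: the type and field names are made up for illustration, stage_0 (injection from ethdev) is left out, and the flow key for stage 1 is marked as unspecified because the description does not state one.

```c
#include <stddef.h>
#include <stdint.h>

enum ex_sched { EX_ORDERED, EX_ATOMIC };
enum ex_key { EX_KEY_NONE, EX_KEY_SA, EX_KEY_TX_QUEUE };

/* One row per pipeline stage; all names here are illustrative only. */
struct ex_stage {
	uint8_t id;
	enum ex_sched sched;  /* schedule type the stage runs under */
	enum ex_key flow_key; /* what the flow_id is derived from */
	uint8_t next;         /* id of the next stage, 0 = terminal */
};

static const struct ex_stage ex_stages[] = {
	{ 1, EX_ORDERED, EX_KEY_NONE,     2 }, /* SA lookup (key not stated) */
	{ 2, EX_ATOMIC,  EX_KEY_SA,       3 }, /* sequence number update */
	{ 3, EX_ORDERED, EX_KEY_SA,       4 }, /* encrypt + route lookup */
	{ 4, EX_ATOMIC,  EX_KEY_TX_QUEUE, 0 }, /* enqueue to ethdev tx */
};

/* Return the table row for a stage id, or NULL if there is none. */
static const struct ex_stage *
ex_find_stage(uint8_t id)
{
	size_t i;

	for (i = 0; i < sizeof(ex_stages) / sizeof(ex_stages[0]); i++)
		if (ex_stages[i].id == id)
			return &ex_stages[i];
	return NULL;
}
```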
>
>
> 1) queue_id based event pipelining
> =================================
>
> stage_1_work(assigned to event queue 1)# N ports/N cores establish
> link to queue 1 through rte_event_port_link()
>
> on_each_cores_linked_to_queue1(stage1)
[HvH] All worker cores can be linked to all stages - we look up which stage the work belongs to based on event->queue_id.
> while(1)
> {
> /* STAGE 1 processing */
> nr_events = rte_event_dequeue_burst(ev,..);
> if (!nr_events)
> continue;
>
> sa = find_sa_from_packet(ev.mbuf);
>
> /* move to next stage(ATOMIC) */
> ev.event_type = RTE_EVENT_TYPE_CPU;
> ev.sub_event_type = 2;
> ev.sched_type = RTE_SCHED_TYPE_ATOMIC;
> ev.flow_id = sa;
> ev.op = RTE_EVENT_OP_FORWARD;
> ev.queue_id = 2;
> /* move to stage 2(event queue 2) */
> rte_event_enqueue_burst(ev,..);
> }
>
> on_each_cores_linked_to_queue2(stage2)
> while(1)
> {
> /* STAGE 2 processing */
> nr_events = rte_event_dequeue_burst(ev,..);
> if (!nr_events)
> continue;
>
> sa_specific_atomic_processing(sa /* ev.flow_id */);/* seq number update in
> critical section */
>
> /* move to next stage(ORDERED) */
> ev.event_type = RTE_EVENT_TYPE_CPU;
> ev.sub_event_type = 3;
> ev.sched_type = RTE_SCHED_TYPE_ORDERED;
> ev.flow_id = sa;
> ev.op = RTE_EVENT_OP_FORWARD;
> ev.queue_id = 3;
> /* move to stage 3(event queue 3) */
> rte_event_enqueue_burst(ev,..);
> }
>
> on_each_cores_linked_to_queue3(stage3)
> while(1)
> {
> /* STAGE 3 processing */
> nr_events = rte_event_dequeue_burst(ev,..);
> if (!nr_events)
> continue;
>
> sa_specific_ordered_processing(sa /*ev.flow_id */);/* packets encryption in
> parallel */
>
> /* move to next stage(ATOMIC) */
> ev.event_type = RTE_EVENT_TYPE_CPU;
> ev.sub_event_type = 4;
> ev.sched_type = RTE_SCHED_TYPE_ATOMIC;
> output_tx_port_queue = find_output_tx_queue_and_tx_port(ev.mbuf);
> ev.flow_id = output_tx_port_queue;
> ev.op = RTE_EVENT_OP_FORWARD;
> ev.queue_id = 4;
> /* move to stage 4(event queue 4) */
> rte_event_enqueue_burst(ev,...);
> }
>
> on_each_cores_linked_to_queue4(stage4)
> while(1)
> {
> /* STAGE 4 processing */
> nr_events = rte_event_dequeue_burst(ev,..);
> if (!nr_events)
> continue;
>
> rte_eth_tx_buffer();
> }
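
As noted inline above, the four loops do not have to live on separate sets of cores: one worker linked to all queues can switch on queue_id. A minimal sketch of that per-event step follows. struct ex_event is a stand-in holding only the fields used here (it is not struct rte_event), and ex_handle_by_queue() and its sa/txq parameters are hypothetical; the dequeue and enqueue calls around it are omitted.

```c
#include <stdint.h>

/* Stand-in for the event fields used below; NOT the real struct rte_event. */
struct ex_event {
	uint8_t queue_id;
	uint8_t sched_atomic; /* 1 = ATOMIC, 0 = ORDERED */
	uint32_t flow_id;
};

/*
 * Handle one dequeued event by looking up its stage from queue_id and
 * rewriting it for the next stage.
 * Returns 1 if the event should be forwarded, 0 if it was the final
 * (tx) stage, -1 for an unknown queue.
 */
static int
ex_handle_by_queue(struct ex_event *ev, uint32_t sa, uint32_t txq)
{
	switch (ev->queue_id) {
	case 1: /* stage 1: SA lookup done, move to ATOMIC on the SA */
		ev->flow_id = sa;
		ev->sched_atomic = 1;
		ev->queue_id = 2;
		return 1;
	case 2: /* stage 2: sequence number updated, move to ORDERED */
		ev->flow_id = sa;
		ev->sched_atomic = 0;
		ev->queue_id = 3;
		return 1;
	case 3: /* stage 3: encrypted, move to ATOMIC on the tx queue */
		ev->flow_id = txq;
		ev->sched_atomic = 1;
		ev->queue_id = 4;
		return 1;
	case 4: /* stage 4: transmit, nothing left to forward */
		return 0;
	default:
		return -1;
	}
}
```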
>
> 2) flow-based event pipelining
> =============================
>
> - No need to partition queues for different stages
> - All the cores can operate on all the stages, which enables
> automatic multicore scaling and true dynamic load balancing
[HvH] The sw case is the same - all cores can map to all stages; the stage of the work is looked up from the queue_id.
> - A fairly large number of SAs (on the order of 2^16 to 2^20) can be processed in parallel,
> something the existing IPSec application has constraints on:
> http://dpdk.org/doc/guides-16.04/sample_app_ug/ipsec_secgw.html
>
> on_each_worker_cores()
> while(1)
> {
> nr_events = rte_event_dequeue_burst(ev,..);
> if (!nr_events)
> continue;
>
> /* STAGE 1 processing */
> if(ev.event_type == RTE_EVENT_TYPE_ETHDEV) {
> sa = find_sa_from_packet(ev.mbuf);
> /* move to next stage2(ATOMIC) */
> ev.event_type = RTE_EVENT_TYPE_CPU;
> ev.sub_event_type = 2;
> ev.sched_type = RTE_SCHED_TYPE_ATOMIC;
> ev.flow_id = sa;
> ev.op = RTE_EVENT_OP_FORWARD;
> rte_event_enqueue_burst(ev..);
>
> } else if(ev.event_type == RTE_EVENT_TYPE_CPU && ev.sub_event_type == 2) { /* stage 2 */
[HvH] In the case of software eventdev ev.queue_id is used instead of ev.sub_event_type - but this is the same lookup operation as mentioned above. I don't see a fundamental difference between these approaches?
>
> sa_specific_atomic_processing(sa /* ev.flow_id */);/* seq number update in critical
> section */
> /* move to next stage(ORDERED) */
> ev.event_type = RTE_EVENT_TYPE_CPU;
> ev.sub_event_type = 3;
> ev.sched_type = RTE_SCHED_TYPE_ORDERED;
> ev.flow_id = sa;
> ev.op = RTE_EVENT_OP_FORWARD;
> rte_event_enqueue_burst(ev,..);
>
> } else if(ev.event_type == RTE_EVENT_TYPE_CPU && ev.sub_event_type == 3) { /* stage 3 */
>
> sa_specific_ordered_processing(sa /* ev.flow_id */);/* like encrypting packets in
> parallel */
> /* move to next stage(ATOMIC) */
> ev.event_type = RTE_EVENT_TYPE_CPU;
> ev.sub_event_type = 4;
> ev.sched_type = RTE_SCHED_TYPE_ATOMIC;
> output_tx_port_queue = find_output_tx_queue_and_tx_port(ev.mbuf);
> ev.flow_id = output_tx_port_queue;
> ev.op = RTE_EVENT_OP_FORWARD;
> rte_event_enqueue_burst(ev,..);
>
> } else if(ev.event_type == RTE_EVENT_TYPE_CPU && ev.sub_event_type == 4) { /* stage 4 */
> rte_eth_tx_buffer();
> }
> }
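
Comparing the two listings, the per-stage work is the same and only two small pieces differ: how the current stage is read from the event, and which fields are written when advancing. A rough sketch of isolating those two pieces is below. The struct and both helpers are hypothetical; from_ethdev stands in for the event_type == RTE_EVENT_TYPE_ETHDEV check used for stage 1 in the flow-based listing.

```c
#include <stdint.h>

enum ex_model { EX_BY_QUEUE, EX_BY_FLOW };

/* Stand-in event; field names echo the thread but this is not struct rte_event. */
struct ex_ev {
	uint8_t from_ethdev;    /* 1 while the event is fresh from RX */
	uint8_t sub_event_type;
	uint8_t queue_id;
};

/* Read the current stage according to the pipelining model in use. */
static uint8_t
ex_stage_of(enum ex_model m, const struct ex_ev *ev)
{
	if (m == EX_BY_QUEUE)
		return ev->queue_id;
	/* Flow-based: fresh RX events are stage 1, the rest carry the stage. */
	return ev->from_ethdev ? 1 : ev->sub_event_type;
}

/* Stage advance: the only code that writes model-specific fields. */
static void
ex_advance(enum ex_model m, struct ex_ev *ev, uint8_t next)
{
	ev->from_ethdev = 0; /* now a CPU-generated event */
	ev->sub_event_type = next;
	if (m == EX_BY_QUEUE)
		ev->queue_id = next; /* queue model also moves queues */
}
```

The stage handlers themselves would then be called with the value returned by ex_stage_of() and would not need to know which model is active.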
>
> /Jerin
> Cavium