[PATCH v7 15/17] examples/l3fwd-graph: introduce multicore dispatch worker model
Jerin Jacob
jerinjacobk at gmail.com
Tue Jun 6 07:55:24 CEST 2023
On Tue, Jun 6, 2023 at 10:41 AM Yan, Zhirun <zhirun.yan at intel.com> wrote:
>
>
>
> > -----Original Message-----
> > From: Jerin Jacob <jerinjacobk at gmail.com>
> > Sent: Monday, June 5, 2023 9:42 PM
> > To: Yan, Zhirun <zhirun.yan at intel.com>
> > Cc: dev at dpdk.org; jerinj at marvell.com; kirankumark at marvell.com;
> > ndabilpuram at marvell.com; stephen at networkplumber.org;
> > pbhagavatula at marvell.com; Liang, Cunming <cunming.liang at intel.com>; Wang,
> > Haiyue <haiyue.wang at intel.com>; mattias.ronnblom
> > <mattias.ronnblom at ericsson.com>
> > Subject: Re: [PATCH v7 15/17] examples/l3fwd-graph: introduce multicore
> > dispatch worker model
> >
> > On Mon, Jun 5, 2023 at 4:57 PM Zhirun Yan <zhirun.yan at intel.com> wrote:
> > >
> > > Add a new parameter "model" to choose the mcore dispatch or rtc model.
> > > In the dispatch model, nodes are affinitized to worker cores successively.
> > >
> > > Note:
> > > RTE_GRAPH_MODEL_SELECT is set to GRAPH_MODEL_RTC by default. If it is
> > > set explicitly to rtc or mcore dispatch, the "model" parameter must
> > > match it. GRAPH_MODEL_MCORE_RUNTIME_SELECT means the model is chosen
> > > at runtime from the "model" parameter.
> > > Only one RX node is supported for the mcore dispatch model in the
> > > current implementation.
> > >
> > > ./dpdk-l3fwd-graph -l 8,9,10,11 -n 4 -- -p 0x1 --config="(0,0,9)" -P
> > > --model="dispatch"
> > >
> > > Signed-off-by: Haiyue Wang <haiyue.wang at intel.com>
> > > Signed-off-by: Cunming Liang <cunming.liang at intel.com>
> > > Signed-off-by: Zhirun Yan <zhirun.yan at intel.com>
> > > ---
> > > examples/l3fwd-graph/main.c | 231 +++++++++++++++++++++++++++++------
> > > lib/graph/rte_graph_worker.h | 3 +
> > > 2 files changed, 196 insertions(+), 38 deletions(-)
> > >
> > > diff --git a/examples/l3fwd-graph/main.c b/examples/l3fwd-graph/main.c
> > > index 5feeab4f0f..4ecc6c9af4 100644
> > > --- a/examples/l3fwd-graph/main.c
> > > +++ b/examples/l3fwd-graph/main.c
> > > @@ -23,6 +23,12 @@
> > > #include <rte_cycles.h>
> > > #include <rte_eal.h>
> > > #include <rte_ethdev.h>
> > > +#define GRAPH_MODEL_RTC 0 /* Run-to-completion model, set by default. */
> > > +#define GRAPH_MODEL_MCORE_DISPATCH 1 /* Dispatch model. */
> > > +#define GRAPH_MODEL_MCORE_RUNTIME_SELECT 2 /* Support to select model by */
> > > +                                           /* parsing model in cmdline. */
> >
> > After moving model to graph->model, can you check the performance?
>
> In my env, I tested l3fwd-graph and got the same throughput (the slight improvement could be treated as jitter).
> For graph_perf_autotest in the test app, there is a slight drop (about 0.2% in calls, with similar cycles/call).
> Can it be treated as jitter?
Most likely.
Try the following in the fast path:
const ... model = graph->model;
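
For illustration, a minimal sketch of that suggestion. The struct, enum values and walk functions below are hypothetical stand-ins, not the real struct rte_graph or DPDK walk routines; the only point is that the model field is read once into a local const before the loop.

```c
#include <stdint.h>

/* Hypothetical stand-ins; these are NOT the real DPDK definitions. */
enum { STUB_MODEL_RTC = 0, STUB_MODEL_MCORE_DISPATCH = 1 };

struct graph_stub {
	uint8_t model;           /* worker model chosen at configuration time */
	uint64_t rtc_walks;      /* counter standing in for real rtc walk work */
	uint64_t dispatch_walks; /* counter standing in for real dispatch walk work */
};

static inline void walk_rtc(struct graph_stub *g) { g->rtc_walks++; }
static inline void walk_dispatch(struct graph_stub *g) { g->dispatch_walks++; }

/*
 * Read graph->model once, outside the loop, so the per-iteration branch
 * tests a local const instead of reloading the struct field.
 */
static void
main_loop(struct graph_stub *graph, uint64_t iterations)
{
	const uint8_t model = graph->model;
	uint64_t i;

	for (i = 0; i < iterations; i++) {
		if (model == STUB_MODEL_MCORE_DISPATCH)
			walk_dispatch(graph);
		else
			walk_rtc(graph);
	}
}
```

Whether this changes the numbers at all depends on the compiler and target, so it would need to be measured.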
>
> Old:
> +-------------------------------+---------------+---------------+---------------+---------------+---------------+-----------+
> |Node |calls |objs |realloc_count |objs/call |objs/sec(10E6) |cycles/call|
> +-------------------------------+---------------+---------------+---------------+---------------+---------------+-----------+
> |test_graph_perf_worker-0-0 |10175176 |2604845056 |1 |256.000 |2015.394304 |27.0000 |
> |test_graph_perf_worker-1-0 |10175542 |2604938752 |1 |256.000 |2015.488000 |28.0000 |
> |test_graph_perf_worker-2-0 |10175565 |2604944640 |1 |256.000 |2015.493888 |28.0000 |
> |test_graph_perf_worker-3-0 |10175593 |2604951808 |1 |256.000 |2015.501056 |27.0000 |
> |test_graph_perf_source-0 |10175623 |2604959488 |2 |256.000 |2015.508480 |27.0000 |
> |test_graph_perf_sink-0 |10175642 |2604964352 |1 |256.000 |2015.513600 |27.0000 |
> +-------------------------------+---------------+---------------+---------------+---------------+---------------+-----------+
>
> New:
> +-------------------------------+---------------+---------------+---------------+---------------+---------------+-----------+
> |Node |calls |objs |realloc_count |objs/call |objs/sec(10E6) |cycles/call|
> +-------------------------------+---------------+---------------+---------------+---------------+---------------+-----------+
> |test_graph_perf_worker-0-0 |10154953 |2599667968 |1 |256.000 |2010.960128 |27.0000 |
> |test_graph_perf_worker-1-0 |10155316 |2599760896 |1 |256.000 |2011.053056 |27.0000 |
> |test_graph_perf_worker-2-0 |10155338 |2599766528 |1 |256.000 |2011.058688 |28.0000 |
> |test_graph_perf_worker-3-0 |10155357 |2599771392 |1 |256.000 |2011.063552 |28.0000 |
> |test_graph_perf_source-0 |10155394 |2599780864 |2 |256.000 |2011.072768 |27.0000 |
> |test_graph_perf_sink-0 |10155422 |2599788032 |1 |256.000 |2011.080192 |27.0000 |
> +-------------------------------+---------------+---------------+---------------+---------------+---------------+-----------+
>
> > This may not be needed for l3fwd
> >
> Do you mean graph->model?
Yes.
>
> > or if there is not much code duplication,
> >
> > Do the following to remove the limitation:
> > #define RTE_GRAPH_MODEL_SELECT RTE_GRAPH_MODEL_RTC.
> >
> > graph_main_loop change to graph_main_rtc_loop
> >
> > #define RTE_GRAPH_MODEL_SELECT GRAPH_MODEL_MCORE_DISPATCH
> >
> > graph_main_loop change to graph_main_mcore_loop
> >
> > Select the following based on runtime option
> > /* Launch per-lcore init on every worker lcore */
> > rte_eal_mp_remote_launch(graph_main_rtc_loop, NULL, SKIP_MAIN); or
> > rte_eal_mp_remote_launch(graph_main_mcore_loop, NULL, SKIP_MAIN);
> >
>
> We want to: 1. Use the same API (rte_graph_walk()) for different models.
> 2. Have no performance drop for rtc (use RTE_GRAPH_MODEL_SELECT at compile time).
>
> If I understand correctly, I need to remove graph->model and only use
> RTE_GRAPH_MODEL_SELECT to select models?
>
> And change it as
> graph_main_rtc_loop()
> {
> while (1)
> rte_graph_walk_rtc();
> }
>
> But actually, I think graph->model is needed, especially for the config stage and for runtime config
> if RTE_GRAPH_MODEL_SELECT_RUNTIME is set.
Yes, agreed. If there are no MAJOR performance issues, let's use
RTE_GRAPH_MODEL_SELECT_RUNTIME for l3fwd.
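
A rough sketch of how a compile-time selector with a runtime fallback could look, assuming a single walk entry point. All names here (SKETCH_MODEL_*, graph_sketch, sketch_walk) are hypothetical and chosen for illustration; they are not the macros or functions from the patch.

```c
/* Hypothetical model identifiers, for illustration only. */
#define SKETCH_MODEL_RTC            0
#define SKETCH_MODEL_MCORE_DISPATCH 1
#define SKETCH_MODEL_RUNTIME_SELECT 2

/* An application would define SKETCH_MODEL_SELECT before this point. */
#ifndef SKETCH_MODEL_SELECT
#define SKETCH_MODEL_SELECT SKETCH_MODEL_RUNTIME_SELECT
#endif

struct graph_sketch {
	int model;     /* model configured at runtime */
	int last_walk; /* records which walk ran, standing in for real work */
};

static inline void
sketch_walk_rtc(struct graph_sketch *g)
{
	g->last_walk = SKETCH_MODEL_RTC;
}

static inline void
sketch_walk_dispatch(struct graph_sketch *g)
{
	g->last_walk = SKETCH_MODEL_MCORE_DISPATCH;
}

/* One API for all models; the branch exists only in the runtime build. */
static inline void
sketch_walk(struct graph_sketch *g)
{
#if SKETCH_MODEL_SELECT == SKETCH_MODEL_RTC
	sketch_walk_rtc(g);
#elif SKETCH_MODEL_SELECT == SKETCH_MODEL_MCORE_DISPATCH
	sketch_walk_dispatch(g);
#else
	/* Runtime select: pay one branch per walk on g->model. */
	if (g->model == SKETCH_MODEL_MCORE_DISPATCH)
		sketch_walk_dispatch(g);
	else
		sketch_walk_rtc(g);
#endif
}
```

With the selector fixed at compile time the rtc path carries no extra branch; the runtime variant trades one branch per walk for a command-line choice.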
> We need the model type to decide whether to allocate the workqueue, and use RTE_GRAPH_MODEL_SELECT
> to choose the walk.
>
>
> > > > memset(&rewrite_data, 0, sizeof(rewrite_data));
> > > > rewrite_len = sizeof(rewrite_data);
> > > diff --git a/lib/graph/rte_graph_worker.h b/lib/graph/rte_graph_worker.h
> > > index 541c373cb1..19b4c1514f 100644
> > > --- a/lib/graph/rte_graph_worker.h
> > > +++ b/lib/graph/rte_graph_worker.h
> > > @@ -26,6 +26,9 @@ __rte_experimental
> > > static inline void
> > > rte_graph_walk(struct rte_graph *graph) {
> > > +#define RTE_GRAPH_MODEL_RTC 0
> > > +#define RTE_GRAPH_MODEL_MCORE_DISPATCH 1
> >
> > No need for a duplicate enum. Please remove the enum and put these defines in the public header
> > file.
> >
> Yes, it will cause "not defined" warnings.
> Thanks for your comments.
> I will remove the enum and define the model type macros in the public header, and also change
> the related structs/APIs.
Also add a comment at RTE_GRAPH_MODEL_MCORE_DISPATCH: if a new entry
is added, the graph_is_valid API must be updated as well.
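
Something along these lines, as a sketch only. The macro names and the sketch_model_is_valid() helper are hypothetical placeholders; the actual validity API and its signature are not shown in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model identifiers, for illustration only. */
#define SKETCH_VALID_MODEL_RTC            0
/*
 * NOTE: if a new model entry is added after this one, update
 * sketch_model_is_valid() below so the new value is accepted.
 */
#define SKETCH_VALID_MODEL_MCORE_DISPATCH 1

/* Accept only the model values known at build time. */
static inline bool
sketch_model_is_valid(uint8_t model)
{
	return model == SKETCH_VALID_MODEL_RTC ||
	       model == SKETCH_VALID_MODEL_MCORE_DISPATCH;
}
```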
>
> >
> > > +
> >
> > Add a comment here on how an application uses this, i.e. before including the
> > worker header file, #define RTE_GRAPH_MODEL_SELECT
> > RTE_GRAPH_MODEL_RTC.
> > Please change the text as needed.
> Yes, I will add a comment and document the usage.
>
> >
> >
> > > #if !defined(RTE_GRAPH_MODEL_SELECT) || RTE_GRAPH_MODEL_SELECT == RTE_GRAPH_MODEL_RTC
> > > rte_graph_walk_rtc(graph);
> > > #elif RTE_GRAPH_MODEL_SELECT == RTE_GRAPH_MODEL_MCORE_DISPATCH
> > > --
> > > 2.37.2
> > >