[dpdk-dev] [PATCH v13] app/testpmd: support multi-process
Andrew Rybchenko
andrew.rybchenko at oktetlabs.ru
Tue Jun 8 10:42:17 CEST 2021
@Thomas, @Ferruh, please, see question below.
On 4/22/21 4:18 AM, Min Hu (Connor) wrote:
> This patch adds multi-process support for testpmd.
> The test cmd example as follows:
> the primary cmd:
> ./dpdk-testpmd -a xxx --proc-type=auto -l 0-1 -- -i \
> --rxq=4 --txq=4 --num-procs=2 --proc-id=0
>
> the secondary cmd:
> ./dpdk-testpmd -a xxx --proc-type=auto -l 2-3 -- -i \
> --rxq=4 --txq=4 --num-procs=2 --proc-id=1
>
> Signed-off-by: Min Hu (Connor) <humin29 at huawei.com>
> Signed-off-by: Lijun Ou <oulijun at huawei.com>
> Acked-by: Xiaoyun Li <xiaoyun.li at intel.com>
> Acked-by: Ajit Khaparde <ajit.khaparde at broadcom.com>
> Reviewed-by: Ferruh Yigit <ferruh.yigit at intel.com>
> ---
> v13:
> * Modified the doc syntax.
>
> v12:
> * Updated doc info.
>
> v11:
> * Fixed some minor syntax.
>
> v10:
> * Hid process type checks behind new functions.
> * Added comments.
>
> v9:
> * Updated release notes and rst doc.
> * Deleted deprecated codes.
> * move macro and variable.
>
> v8:
> * Added warning info about queue numbers and process numbers.
>
> v7:
> * Fixed compiling error for unexpected unindent.
>
> v6:
> * Add rte flow description for multiple process.
>
> v5:
> * Fixed run_app.rst for multiple process description.
> * Fix compiling error.
>
> v4:
> * Fixed minimum vlaue of Rxq or Txq in doc.
>
> v3:
> * Fixed compiling error using gcc10.0.
>
> v2:
> * Added document for this patch.
> ---
> app/test-pmd/cmdline.c | 6 ++
> app/test-pmd/config.c | 21 +++++-
> app/test-pmd/parameters.c | 11 +++
> app/test-pmd/testpmd.c | 129 ++++++++++++++++++++++++++-------
> app/test-pmd/testpmd.h | 9 +++
> doc/guides/rel_notes/release_21_05.rst | 1 +
> doc/guides/testpmd_app_ug/run_app.rst | 70 ++++++++++++++++++
> 7 files changed, 220 insertions(+), 27 deletions(-)
>
> diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
> index 12efbc0..f0fa6e8 100644
> --- a/app/test-pmd/cmdline.c
> +++ b/app/test-pmd/cmdline.c
> @@ -5450,6 +5450,12 @@ cmd_set_flush_rx_parsed(void *parsed_result,
> __rte_unused void *data)
> {
> struct cmd_set_flush_rx *res = parsed_result;
> +
> + if (num_procs > 1 && (strcmp(res->mode, "on") == 0)) {
> + printf("multi-process doesn't support to flush rx queues.\n");
rx -> Rx
> + return;
> + }
> +
> no_flush_rx = (uint8_t)((strcmp(res->mode, "on") == 0) ? 0 : 1);
> }
>
> diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
> index e189062..9eb1fa7 100644
> --- a/app/test-pmd/config.c
> +++ b/app/test-pmd/config.c
> @@ -2971,6 +2971,8 @@ rss_fwd_config_setup(void)
> queueid_t rxq;
> queueid_t nb_q;
> streamid_t sm_id;
> + int start;
> + int end;
>
> nb_q = nb_rxq;
> if (nb_q > nb_txq)
> @@ -2988,7 +2990,22 @@ rss_fwd_config_setup(void)
> init_fwd_streams();
>
> setup_fwd_config_of_each_lcore(&cur_fwd_config);
> - rxp = 0; rxq = 0;
> +
> + if (proc_id > 0 && nb_q % num_procs)
Please, compare result with 0 explicitly.
> + printf("Warning! queue numbers should be multiple of "
> + "processes, or packet loss will happen.\n");
Do not split format string across multiple lines.
Frankly speaking I don't undertand why. Why is it impossible to
serve 2 queues in the first process and 1 queue in the second
process if 3 queues and 2 processes are configured.
I think RSS redirection table can perfectly do it.
> +
> + /**
> + * In multi-process, All queues are allocated to different
> + * processes based on num_procs and proc_id. For example:
> + * if supports 4 queues(nb_q), 2 processes(num_procs),
> + * the 0~1 queue for primary process.
> + * the 2~3 queue for secondary process.
> + */
> + start = proc_id * nb_q / num_procs;
> + end = start + nb_q / num_procs;
> + rxp = 0;
> + rxq = start;
> for (sm_id = 0; sm_id < cur_fwd_config.nb_fwd_streams; sm_id++) {
> struct fwd_stream *fs;
>
> @@ -3005,6 +3022,8 @@ rss_fwd_config_setup(void)
> continue;
> rxp = 0;
> rxq++;
> + if (rxq >= end)
> + rxq = start;
> }
> }
>
> diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c
> index f3954c1..ece05c1 100644
> --- a/app/test-pmd/parameters.c
> +++ b/app/test-pmd/parameters.c
> @@ -508,6 +508,9 @@ parse_link_speed(int n)
> void
> launch_args_parse(int argc, char** argv)
> {
> +#define PARAM_PROC_ID "proc-id"
> +#define PARAM_NUM_PROCS "num-procs"
> +
> int n, opt;
> char **argvopt;
> int opt_idx;
> @@ -625,6 +628,8 @@ launch_args_parse(int argc, char** argv)
> { "rx-mq-mode", 1, 0, 0 },
> { "record-core-cycles", 0, 0, 0 },
> { "record-burst-stats", 0, 0, 0 },
> + { PARAM_NUM_PROCS, 1, 0, 0 },
> + { PARAM_PROC_ID, 1, 0, 0 },
> { 0, 0, 0, 0 },
> };
>
> @@ -1391,6 +1396,12 @@ launch_args_parse(int argc, char** argv)
> record_core_cycles = 1;
> if (!strcmp(lgopts[opt_idx].name, "record-burst-stats"))
> record_burst_stats = 1;
> + if (strncmp(lgopts[opt_idx].name,
> + PARAM_NUM_PROCS, 9) == 0)
I strongly dislike 9 here and 7 below. Why is strncmp() used
here, but just strcmp() is used for all other options.
It makes the code inconsistent.
> + num_procs = atoi(optarg);
> + if (strncmp(lgopts[opt_idx].name,
> + PARAM_PROC_ID, 7) == 0)
> + proc_id = atoi(optarg);
> break;
> case 'h':
> usage(argv[0]);
> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
> index d4be23f..afa2a6b 100644
> --- a/app/test-pmd/testpmd.c
> +++ b/app/test-pmd/testpmd.c
> @@ -518,6 +518,62 @@ enum rte_eth_rx_mq_mode rx_mq_mode = ETH_MQ_RX_VMDQ_DCB_RSS;
> */
> uint32_t eth_link_speed;
>
> +/*
> + * Id of the current process in multi-process, used to
Id -> ID in accordance with devtools/words-case.txt
> + * configure the queues to be polled.
> + */
> +int proc_id;
> +
> +/*
> + * Number of processes in multi-process, used to
> + * configure the queues to be polled.
> + */
> +unsigned int num_procs = 1;
> +
> +static int
> +eth_dev_configure_mp(uint16_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
> + const struct rte_eth_conf *dev_conf)
> +{
> + if (is_proc_primary())
> + return rte_eth_dev_configure(port_id, nb_rx_q, nb_tx_q,
> + dev_conf);
> + return 0;
> +}
> +
> +static int
> +eth_dev_start_mp(uint16_t port_id)
> +{
> + if (is_proc_primary())
> + return rte_eth_dev_start(port_id);
> +
> + return 0;
> +}
> +
> +static int
> +eth_dev_stop_mp(uint16_t port_id)
> +{
> + if (is_proc_primary())
> + return rte_eth_dev_stop(port_id);
> +
> + return 0;
> +}
> +
> +static void
> +mempool_free_mp(struct rte_mempool *mp)
> +{
> + if (is_proc_primary())
> + return rte_mempool_free(mp);
As far as I remember some compilers do not like it for void.
Just remove 'return'.
> +}
> +
> +static int
> +eth_dev_set_mtu_mp(uint16_t port_id, uint16_t mtu)
> +{
> + if (is_proc_primary())
> + return rte_eth_dev_set_mtu(port_id, mtu);
> +
> + return 0;
> +}
> +
> /* Forward function declarations */
> static void setup_attached_port(portid_t pi);
> static void check_all_ports_link_status(uint32_t port_mask);
> @@ -977,6 +1033,11 @@ mbuf_pool_create(uint16_t mbuf_seg_size, unsigned nb_mbuf,
> mb_size = sizeof(struct rte_mbuf) + mbuf_seg_size;
> mbuf_poolname_build(socket_id, pool_name, sizeof(pool_name), size_idx);
>
> + if (!is_proc_primary()) {
> + rte_mp = rte_mempool_lookup(pool_name);
> + goto err;
It looks like error path, but it works in the case of success
as well. Looks confusing.
> + }
> +
> TESTPMD_LOG(INFO,
> "create a new mbuf pool <%s>: n=%u, size=%u, socket=%u\n",
> pool_name, nb_mbuf, mbuf_seg_size, socket_id);
> @@ -1059,9 +1120,14 @@ mbuf_pool_create(uint16_t mbuf_seg_size, unsigned nb_mbuf,
>
> err:
> if (rte_mp == NULL) {
> - rte_exit(EXIT_FAILURE,
> - "Creation of mbuf pool for socket %u failed: %s\n",
> - socket_id, rte_strerror(rte_errno));
> + if (is_proc_primary())
> + rte_exit(EXIT_FAILURE,
> + "Creation of mbuf pool for socket %u failed: %s\n",
> + socket_id, rte_strerror(rte_errno));
> + else
> + rte_exit(EXIT_FAILURE,
> + "Get mbuf pool for socket %u failed: %s\n",
> + socket_id, rte_strerror(rte_errno));
> } else if (verbose_level > 0) {
> rte_mempool_dump(stdout, rte_mp);
> }
> @@ -2002,6 +2068,12 @@ flush_fwd_rx_queues(void)
> uint64_t prev_tsc = 0, diff_tsc, cur_tsc, timer_tsc = 0;
> uint64_t timer_period;
>
> + if (num_procs > 1) {
> + printf("multi-process not support for flushing fwd rx "
rx -> Rx
also it is better to avoid like split.
> + "queues, skip the below lines and return.\n");
> + return;
> + }
> +
> /* convert to number of cycles */
> timer_period = rte_get_timer_hz(); /* 1 second timeout */
>
> @@ -2511,21 +2583,24 @@ start_port(portid_t pid)
> return -1;
> }
> /* configure port */
> - diag = rte_eth_dev_configure(pi, nb_rxq + nb_hairpinq,
> - nb_txq + nb_hairpinq,
> - &(port->dev_conf));
> + diag = eth_dev_configure_mp(pi,
> + nb_rxq + nb_hairpinq,
> + nb_txq + nb_hairpinq,
> + &(port->dev_conf));
> if (diag != 0) {
> - if (rte_atomic16_cmpset(&(port->port_status),
> - RTE_PORT_HANDLING, RTE_PORT_STOPPED) == 0)
> - printf("Port %d can not be set back "
> - "to stopped\n", pi);
> + if (rte_atomic16_cmpset(
> + &(port->port_status),
> + RTE_PORT_HANDLING,
> + RTE_PORT_STOPPED) == 0)
> + printf("Port %d can not be set back to stopped\n",
can not -> cannot (since you touch the line anyway)
> + pi);
> printf("Fail to configure port %d\n", pi);
> /* try to reconfigure port next time */
> port->need_reconfig = 1;
> return -1;
> }
> }
> - if (port->need_reconfig_queues > 0) {
> + if (port->need_reconfig_queues > 0 && is_proc_primary()) {
> port->need_reconfig_queues = 0;
> /* setup tx queues */
> for (qi = 0; qi < nb_txq; qi++) {
> @@ -2548,8 +2623,8 @@ start_port(portid_t pid)
> if (rte_atomic16_cmpset(&(port->port_status),
> RTE_PORT_HANDLING,
> RTE_PORT_STOPPED) == 0)
> - printf("Port %d can not be set back "
> - "to stopped\n", pi);
> + printf("Port %d can not be set back to stopped\n",
can not -> cannot
> + pi);
> printf("Fail to configure port %d tx queues\n",
> pi);
> /* try to reconfigure queues next time */
> @@ -2626,16 +2701,16 @@ start_port(portid_t pid)
> cnt_pi++;
>
> /* start port */
> - diag = rte_eth_dev_start(pi);
> + diag = eth_dev_start_mp(pi);
> if (diag < 0) {
> printf("Fail to start port %d: %s\n", pi,
> rte_strerror(-diag));
>
> /* Fail to setup rx queue, return */
> if (rte_atomic16_cmpset(&(port->port_status),
> - RTE_PORT_HANDLING, RTE_PORT_STOPPED) == 0)
> - printf("Port %d can not be set back to "
> - "stopped\n", pi);
> + RTE_PORT_HANDLING, RTE_PORT_STOPPED) == 0)
> + printf("Port %d can not be set back to stopped\n",
can not -> cannot
> + pi);
> continue;
> }
>
> @@ -2765,7 +2840,7 @@ stop_port(portid_t pid)
> if (port->flow_list)
> port_flow_flush(pi);
>
> - if (rte_eth_dev_stop(pi) != 0)
> + if (eth_dev_stop_mp(pi) != 0)
> RTE_LOG(ERR, EAL, "rte_eth_dev_stop failed for port %u\n",
> pi);
>
> @@ -2834,8 +2909,10 @@ close_port(portid_t pid)
> continue;
> }
>
> - port_flow_flush(pi);
> - rte_eth_dev_close(pi);
> + if (is_proc_primary()) {
> + port_flow_flush(pi);
> + rte_eth_dev_close(pi);
> + }
> }
>
> remove_invalid_ports();
> @@ -3101,7 +3178,7 @@ pmd_test_exit(void)
> }
> for (i = 0 ; i < RTE_DIM(mempools) ; i++) {
> if (mempools[i])
> - rte_mempool_free(mempools[i]);
> + mempool_free_mp(mempools[i]);
> }
>
> printf("\nBye...\n");
> @@ -3432,7 +3509,7 @@ update_jumbo_frame_offload(portid_t portid)
> * if unset do it here
> */
> if ((rx_offloads & DEV_RX_OFFLOAD_JUMBO_FRAME) == 0) {
> - ret = rte_eth_dev_set_mtu(portid,
> + ret = eth_dev_set_mtu_mp(portid,
> port->dev_conf.rxmode.max_rx_pkt_len - eth_overhead);
> if (ret)
> printf("Failed to set MTU to %u for port %u\n",
> @@ -3622,6 +3699,10 @@ init_port_dcb_config(portid_t pid,
> int retval;
> uint16_t i;
>
> + if (num_procs > 1) {
> + printf("The multi-process feature doesn't support dcb.\n");
> + return -ENOTSUP;
> + }
> rte_port = &ports[pid];
>
> memset(&port_conf, 0, sizeof(struct rte_eth_conf));
> @@ -3787,10 +3868,6 @@ main(int argc, char** argv)
> rte_exit(EXIT_FAILURE, "Cannot init EAL: %s\n",
> rte_strerror(rte_errno));
>
> - if (rte_eal_process_type() == RTE_PROC_SECONDARY)
> - rte_exit(EXIT_FAILURE,
> - "Secondary process type not supported.\n");
> -
> ret = register_eth_event_callback();
> if (ret != 0)
> rte_exit(EXIT_FAILURE, "Cannot register for ethdev events");
> diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
> index 6ca872d..3318e8f 100644
> --- a/app/test-pmd/testpmd.h
> +++ b/app/test-pmd/testpmd.h
> @@ -632,6 +632,15 @@ extern enum rte_eth_rx_mq_mode rx_mq_mode;
>
> extern struct rte_flow_action_conntrack conntrack_context;
>
> +extern int proc_id;
> +extern unsigned int num_procs;
> +
> +static inline bool
> +is_proc_primary(void)
> +{
> + return rte_eal_process_type() == RTE_PROC_PRIMARY;
> +}
> +
> static inline unsigned int
> lcore_num(void)
> {
> diff --git a/doc/guides/rel_notes/release_21_05.rst b/doc/guides/rel_notes/release_21_05.rst
> index b36c120..9361332 100644
> --- a/doc/guides/rel_notes/release_21_05.rst
> +++ b/doc/guides/rel_notes/release_21_05.rst
> @@ -235,6 +235,7 @@ New Features
> ``port cleanup (port_id) txq (queue_id) (free_cnt)``
> * Added command to show link flow control info.
> ``show port (port_id) flow_ctrl``
> + * Added support multi-process for testpmd.
Please, rebase to 21.08.
>
> * **Updated ipsec-secgw sample application.**
>
> diff --git a/doc/guides/testpmd_app_ug/run_app.rst b/doc/guides/testpmd_app_ug/run_app.rst
> index d062165..74efa4f 100644
> --- a/doc/guides/testpmd_app_ug/run_app.rst
> +++ b/doc/guides/testpmd_app_ug/run_app.rst
> @@ -543,3 +543,73 @@ The command line options are:
> bit 1 - two hairpin ports paired
> bit 0 - two hairpin ports loop
> The default value is 0. Hairpin will use single port mode and implicit Tx flow mode.
> +
> +Testpmd Multi-Process Command-line Options
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +The following are the command-line options for testpmd multi-process support:
> +
> +.. code-block:: console
> +
> + primary process:
> + sudo ./dpdk-testpmd -a xxx --proc-type=auto -l 0-1 -- -i --rxq=4 --txq=4 \
> + --num-procs=2 --proc-id=0
> +
> + secondary process:
> + sudo ./dpdk-testpmd -a xxx --proc-type=auto -l 2-3 -- -i --rxq=4 --txq=4 \
> + --num-procs=2 --proc-id=1
Do we really need --rxq=4 --txq=4 in secodary processes?
Can't we get it from already configured device in primary
process?
Same question is applicable to --num-procs. May be testpmd
can publish in shared memory? If yes, I'm OK to either
do it in the patch or defer as a later improvement.
> +
> +The command line options are:
> +
> +* ``--num-procs=N``
> +
> + The number of processes which will be used.
> +
> +* ``--proc-id=id``
> +
> + The id of the current process (id < num-procs). id should be different in primary
> + process and secondary process, which starts from '0'.
> +
> +Calculation rule for queue:
> +All queues are allocated to different processes based on ``proc_num`` and ``proc_id``.
> +Calculation rule for the testpmd to allocate queues to each process:
> +start(queue start id) = proc_id * nb_q / num_procs;
> +end(queue end id) = start + nb_q / num_procs;
> +
> +For example, if testpmd supports 4 Tx and Rx queues
> +the 0~1 for primary process
> +the 2~3 for secondary process
> +
> +The number of rings should be a multiple of the number of processes. If not,
> +redundant queues will exist after queues are allocated to processes. After RSS is
> +enabled, packet loss occurs when traffic is sent to all processes at the same time.
> +Some traffic enters redundant queues and cannot be forwarded.
> +
> +All the dev ops is supported in primary process. While secondary process is not permitted
> +to allocate or release shared memory, so some ops are not supported as follows::
> +``dev_configure``
> +``dev_start``
> +``dev_stop``
> +``rx_queue_setup``
> +``tx_queue_setup``
> +``rx_queue_release``
> +``tx_queue_release``
@Thomas, @Ferrh, shouldn't it be handled on ethdev level as
well if it is really that strict.
> +
> +So, any command from testpmd which calls those APIs will not be supported in secondary
> +process, like::
> +``port config all rxq|txq|rxd|txd <value>``
> +``port config <port_id> rx_offload xxx on/off ``
> +``port config <port_id> tx_offload xxx on/off``
> +etc.
> +
> +Flow API is supported, it applies only on its own process on SW side, but all on HW side.
Sorry, I don't understand it.
> +
> +Stats is supported, stats will not change when one quit and start, as they share the same
> +buffer to store the stats. Flow rules are maintained in process level: primary and secondary
> +has its own flow list (but one flow list in HW). The two can see all the queues, so setting
> +the flow rules for the other is OK. But in the testpmd primary process receiving or transmitting
> +packets from the queue allocated for secondary process is not permitted, and same for
> +secondary process.
> +
> +RSS is supported, primary process and secondary process has separate queues to use, RSS
> +will work in their own queues whether primary or secondary process.
For me it sounds like secondary process has own RSS
configuration. If so, it is false. I guess I simply
misunderstand above paragraph.
More information about the dev
mailing list