[dpdk-stable] [dpdk-dev] [PATCH 1/5] app/testpmd: clock gettime call in throughput calculation

Jerin Jacob jerinjacobk at gmail.com
Thu Jun 18 12:16:13 CEST 2020


On Thu, Jun 18, 2020 at 9:33 AM Honnappa Nagarahalli
<Honnappa.Nagarahalli at arm.com> wrote:
>
> Thanks Jerin for the feedback
>
> > -----Original Message-----
> > From: Jerin Jacob <jerinjacobk at gmail.com>
> > Sent: Wednesday, June 17, 2020 10:16 AM
> > To: Honnappa Nagarahalli <Honnappa.Nagarahalli at arm.com>
> > Cc: dpdk-dev <dev at dpdk.org>; Ali Alnubani <alialnu at mellanox.com>;
> > orgerlitz at mellanox.com; Wenzhuo Lu <wenzhuo.lu at intel.com>; Beilei Xing
> > <beilei.xing at intel.com>; Bernard Iremonger <bernard.iremonger at intel.com>;
> > hemant.agrawal at nxp.com; jerinj at marvell.com; Slava Ovsiienko
> > <viacheslavo at mellanox.com>; thomas at monjalon.net; Ruifeng Wang
> > <Ruifeng.Wang at arm.com>; Phil Yang <Phil.Yang at arm.com>; nd
> > <nd at arm.com>; Zhihong Wang <zhihong.wang at intel.com>; dpdk stable
> > <stable at dpdk.org>
> > Subject: Re: [dpdk-dev] [PATCH 1/5] app/testpmd: clock gettime call in
> > throughput calculation
> >
> > On Wed, Jun 17, 2020 at 8:13 PM Honnappa Nagarahalli
> > <honnappa.nagarahalli at arm.com> wrote:
> > >
> > > The throughput calculation requires a counter that measures the
> > > passing of time. The PMU cycle counter does not do that. This
> >
> >
> > It is not clear from the git commit why the PMU cycle counter does
> > not do that. On DPDK bootup, we figure out the Hz value based on PMU
> > counter cycles.
> > What is the missing piece here?
> As I understand it, the Linux kernel saves the PMU state and restores
> it every time a thread is scheduled out and back in, so the PMU cycles
> are not counted towards a thread while it is scheduled out. The thread
> that prints the statistics issues a good number of system calls, and I
> am guessing it is getting scheduled out. So, it is reporting a very low
> cycle count.

OK. Probably add this info to the git commit.
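
For illustration, here is a minimal standalone sketch (hypothetical, not
from the patch) that can expose the effect on an ARM build with
RTE_ARM_EAL_RDTSC_USE_PMU: if the kernel stops counting per-thread PMU
cycles across a sleep(), the cycle-based elapsed time comes out far
shorter than the wall-clock time.

    /* Hypothetical demo, not part of the patch: compare cycle-counter
     * elapsed time against CLOCK_MONOTONIC across a blocking call. With
     * a per-thread PMU counter, "cyc" is expected to be much smaller
     * than "wall". */
    #include <stdio.h>
    #include <stdint.h>
    #include <time.h>
    #include <unistd.h>
    #include <rte_eal.h>
    #include <rte_cycles.h>

    int main(int argc, char **argv)
    {
            struct timespec t0, t1;
            uint64_t c0, c1;

            if (rte_eal_init(argc, argv) < 0)
                    return 1;

            clock_gettime(CLOCK_MONOTONIC, &t0);
            c0 = rte_rdtsc();
            sleep(1);               /* thread is scheduled out here */
            c1 = rte_rdtsc();
            clock_gettime(CLOCK_MONOTONIC, &t1);

            double wall = (t1.tv_sec - t0.tv_sec) +
                          (t1.tv_nsec - t0.tv_nsec) / 1e9;
            double cyc = (double)(c1 - c0) / rte_get_tsc_hz();

            printf("wall: %.3f s, cycle-based: %.3f s\n", wall, cyc);
            return 0;
    }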

> >
> > IMO, the PMU counter should have lower latency and finer granularity
> > than clock_gettime.
> In general, agree. In this particular calculation the granularity has
> not mattered much (for example, the numbers are fine with a 50 MHz
> generic counter and a 2.5 GHz CPU). The latency also does not matter,
> as it is amortized over a large number of packets. So, I do not see it
> affecting the reported PPS/BPS numbers.

Reasonable to use clock_gettime for the control thread.
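
To put rough numbers on the granularity point (my arithmetic, using the
figures above): a 50 MHz generic counter ticks every 20 ns, so a 1 s
stats window spans 5 * 10^7 ticks and the quantization error is at most
one tick, a relative error of about 2 * 10^-8. Even at 100 Mpps that is
roughly 2 packets per second, well below what the display resolves.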

>
> >
> > > results in incorrect throughput numbers when
> > > RTE_ARM_EAL_RDTSC_USE_PMU is enabled. Use the clock_gettime system
> > > call to calculate the time passed since the last call.
> > >
> > > Bugzilla ID: 450
> > > Fixes: 0e106980301d ("app/testpmd: show throughput in port stats")
> > > Cc: zhihong.wang at intel.com
> > > Cc: stable at dpdk.org
> > >
> > > Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli at arm.com>
> > > Reviewed-by: Phil Yang <phil.yang at arm.com>
> > > Reviewed-by: Ruifeng Wang <ruifeng.wang at arm.com>
> > > ---
> > >  app/test-pmd/config.c | 44 +++++++++++++++++++++++++++++--------------
> > >  1 file changed, 30 insertions(+), 14 deletions(-)
> > >
> > > diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
> > > index 016bcb09c..91fbf99f8 100644
> > > --- a/app/test-pmd/config.c
> > > +++ b/app/test-pmd/config.c
> > > @@ -54,6 +54,14 @@
> > >
> > >  #define ETHDEV_FWVERS_LEN 32
> > >
> > > +#ifdef CLOCK_MONOTONIC_RAW /* Defined in glibc bits/time.h */
> > > +#define CLOCK_TYPE_ID CLOCK_MONOTONIC_RAW
> > > +#else
> > > +#define CLOCK_TYPE_ID CLOCK_MONOTONIC
> > > +#endif
> > > +
> > > +#define NS_PER_SEC 1E9
> > > +
> > >  static char *flowtype_to_str(uint16_t flow_type);
> > >
> > >  static const struct {
> > > @@ -136,9 +144,10 @@ nic_stats_display(portid_t port_id)
> > >         static uint64_t prev_pkts_tx[RTE_MAX_ETHPORTS];
> > >         static uint64_t prev_bytes_rx[RTE_MAX_ETHPORTS];
> > >         static uint64_t prev_bytes_tx[RTE_MAX_ETHPORTS];
> > > -       static uint64_t prev_cycles[RTE_MAX_ETHPORTS];
> > > +       static uint64_t prev_ns[RTE_MAX_ETHPORTS];
> > > +       struct timespec cur_time;
> > >         uint64_t diff_pkts_rx, diff_pkts_tx, diff_bytes_rx, diff_bytes_tx,
> > > -                                                               diff_cycles;
> > > +                                                               diff_ns;
> > >         uint64_t mpps_rx, mpps_tx, mbps_rx, mbps_tx;
> > >         struct rte_eth_stats stats;
> > >         struct rte_port *port = &ports[port_id];
> > > @@ -195,10 +204,17 @@ nic_stats_display(portid_t port_id)
> > >                 }
> > >         }
> > >
> > > -       diff_cycles = prev_cycles[port_id];
> > > -       prev_cycles[port_id] = rte_rdtsc();
> > > -       if (diff_cycles > 0)
> > > -               diff_cycles = prev_cycles[port_id] - diff_cycles;
> > > +       diff_ns = 0;
> > > +       if (clock_gettime(CLOCK_TYPE_ID, &cur_time) == 0) {
> > > +               uint64_t ns;
> > > +
> > > +               ns = cur_time.tv_sec * NS_PER_SEC;
> > > +               ns += cur_time.tv_nsec;
> > > +
> > > +               if (prev_ns[port_id] != 0)
> > > +                       diff_ns = ns - prev_ns[port_id];
> > > +               prev_ns[port_id] = ns;
> > > +       }
> > >
> > >         diff_pkts_rx = (stats.ipackets > prev_pkts_rx[port_id]) ?
> > >                 (stats.ipackets - prev_pkts_rx[port_id]) : 0;
> > > @@ -206,10 +222,10 @@ nic_stats_display(portid_t port_id)
> > >                 (stats.opackets - prev_pkts_tx[port_id]) : 0;
> > >         prev_pkts_rx[port_id] = stats.ipackets;
> > >         prev_pkts_tx[port_id] = stats.opackets;
> > > -       mpps_rx = diff_cycles > 0 ?
> > > -               diff_pkts_rx * rte_get_tsc_hz() / diff_cycles : 0;
> > > -       mpps_tx = diff_cycles > 0 ?
> > > -               diff_pkts_tx * rte_get_tsc_hz() / diff_cycles : 0;
> > > +       mpps_rx = diff_ns > 0 ?
> > > +               (double)diff_pkts_rx / diff_ns * NS_PER_SEC : 0;
> > > +       mpps_tx = diff_ns > 0 ?
> > > +               (double)diff_pkts_tx / diff_ns * NS_PER_SEC : 0;
> > >
> > >         diff_bytes_rx = (stats.ibytes > prev_bytes_rx[port_id]) ?
> > >                 (stats.ibytes - prev_bytes_rx[port_id]) : 0;
> > > @@ -217,10 +233,10 @@ nic_stats_display(portid_t port_id)
> > >                 (stats.obytes - prev_bytes_tx[port_id]) : 0;
> > >         prev_bytes_rx[port_id] = stats.ibytes;
> > >         prev_bytes_tx[port_id] = stats.obytes;
> > > -       mbps_rx = diff_cycles > 0 ?
> > > -               diff_bytes_rx * rte_get_tsc_hz() / diff_cycles : 0;
> > > -       mbps_tx = diff_cycles > 0 ?
> > > -               diff_bytes_tx * rte_get_tsc_hz() / diff_cycles : 0;
> > > +       mbps_rx = diff_ns > 0 ?
> > > +               (double)diff_bytes_rx / diff_ns * NS_PER_SEC : 0;
> > > +       mbps_tx = diff_ns > 0 ?
> > > +               (double)diff_bytes_tx / diff_ns * NS_PER_SEC : 0;
> > >
> > >         printf("\n  Throughput (since last show)\n");
> > >         printf("  Rx-pps: %12"PRIu64"          Rx-bps: %12"PRIu64"\n  Tx-pps: %12"
> > > --
> > > 2.17.1
> > >
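
For reference, here is the core of the patched calculation pulled out
into a self-contained sketch. The logic mirrors the diff above; the
helper names monotonic_ns() and calc_rate() are mine, not from the
patch.

    #include <stdint.h>
    #include <time.h>

    #ifdef CLOCK_MONOTONIC_RAW /* Defined in glibc bits/time.h */
    #define CLOCK_TYPE_ID CLOCK_MONOTONIC_RAW
    #else
    #define CLOCK_TYPE_ID CLOCK_MONOTONIC
    #endif

    #define NS_PER_SEC 1E9

    /* Current monotonic time in nanoseconds; 0 if the clock fails. */
    static uint64_t
    monotonic_ns(void)
    {
            struct timespec t;
            uint64_t ns;

            if (clock_gettime(CLOCK_TYPE_ID, &t) != 0)
                    return 0;
            ns = t.tv_sec * NS_PER_SEC;
            ns += t.tv_nsec;
            return ns;
    }

    /* Packets (or bytes) per second from a delta and elapsed ns. */
    static uint64_t
    calc_rate(uint64_t diff, uint64_t diff_ns)
    {
            return diff_ns > 0 ? (double)diff / diff_ns * NS_PER_SEC : 0;
    }

With helpers like these, each per-port update in nic_stats_display()
reduces to computing diff_ns from successive monotonic_ns() readings
and calling calc_rate() for the pps and bps figures.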

