[PATCH v4] pcap: support MTU set

Ido Goshen Ido at cgstowernetworks.com
Mon Jun 6 21:07:58 CEST 2022



> -----Original Message-----
> From: Stephen Hemminger <stephen at networkplumber.org>
> Sent: Monday, 6 June 2022 20:10
> To: Ido Goshen <Ido at cgstowernetworks.com>
> Cc: ferruh.yigit at xilinx.com; dev at dpdk.org
> Subject: Re: [PATCH v4] pcap: support MTU set
> 
> On Mon,  6 Jun 2022 19:21:47 +0300
> Ido Goshen <ido at cgstowernetworks.com> wrote:
> 
> > Support rte_eth_dev_set_mtu by pcap vdevs Enforce mtu on rx/tx
> >
> > Bugzilla ID: 961
> 
> This is not really a bug, it is an enhancement specific to your test setup. It should
> not be backported to stable.
> 
> Since it is change in behavior it might be better to add a vdev argument for this
> rather than overloading meaning of MTU.

[idog] The default behavior stays the same: long packets will continue to pass as before.
The MTU takes effect only if 'rte_eth_dev_set_mtu' is called explicitly.
I doubt it will break anything, because no one could use it so far (it returns -ENOTSUP),
and I assume enforcing the MTU is the behavior anyone who sets it would expect.

Adding it as a vdev argument (e.g. vdev='net_pcap0,iface=eth0,mtu=9400') seems to me
like a duplication of an existing API.

> Also, this does not behave the same as virtio or hardware drivers.

[idog] The idea of this patch is to make pcap behave more like HW NICs.
A couple of HW NICs I've checked (ixgbe, i40e) do respect the MTU.
Please see the test outputs in https://bugs.dpdk.org/show_bug.cgi?id=961
Though it's probably done by the HW and not by the driver.

An alternative might be to set the network interface's MTU and not do it in the PMD, so
it'll be like the "HW" is doing it, but this will work only for ifaces and not for pcap files.

> 
> The mtu is already in dev->data->mtu, why copy it?
> 

[idog] That's what I was using so far, but I got a request from ferruh.yigit at xilinx.com
not to use 'dev' but to access 'internals' via the 'pcap_rx/tx_queue' struct.

> > +		if (unlikely(header.caplen > internals->mtu)) {
> > +			pcap_q->rx_stat.err_pkts++;
> > +			rte_pktmbuf_free(mbuf);
> > +			break;
> > +		}
> 
> This doesn't account for VLAN header.

[idog] Good point. I'm never sure which overhead should be considered.
Please advise what I should add,
e.g.  '(RTE_ETHER_HDR_LEN + RTE_ETHER_CRC_LEN + RTE_VLAN_HLEN * 2)'

However, the caller can always set it a bit higher if needed.
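
To make the overhead question concrete, here is a minimal sketch of how the limit could be derived from the MTU. It is only an illustration, not the patch: 'max_frame_len' and 'frame_exceeds_mtu' are hypothetical helpers, the constants are local copies of the DPDK values (14, 4 and 4 bytes), and whether the CRC should be counted is left as a parameter because captures often do not include the FCS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the DPDK constants, so the sketch builds without DPDK. */
#define ETHER_HDR_LEN 14u /* RTE_ETHER_HDR_LEN: dst MAC + src MAC + ethertype */
#define ETHER_CRC_LEN 4u  /* RTE_ETHER_CRC_LEN: frame check sequence */
#define VLAN_HLEN     4u  /* RTE_VLAN_HLEN: one 802.1Q tag */

/*
 * Hypothetical helper: largest frame length accepted for a given MTU.
 * with_crc is an assumption left to the caller, since captured frames
 * often do not carry the FCS.
 */
static uint32_t
max_frame_len(uint16_t mtu, unsigned int vlan_tags, bool with_crc)
{
	uint32_t len = (uint32_t)mtu + ETHER_HDR_LEN + vlan_tags * VLAN_HLEN;

	if (with_crc)
		len += ETHER_CRC_LEN;
	return len;
}

/*
 * Hypothetical predicate for the rx/tx checks: true means "drop".
 * Allows for up to two VLAN tags (QinQ) and no CRC.
 */
static bool
frame_exceeds_mtu(uint32_t frame_len, uint16_t mtu)
{
	return frame_len > max_frame_len(mtu, 2, false);
}
```

With an MTU of 1500 this sketch accepts frames of up to 1522 bytes (1500 + 14 + 2 * 4) and drops anything longer.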

> > +
> > +		if (unlikely(len > internals->mtu)) {
> > +			rte_pktmbuf_free(mbuf);
> > +			continue;
> > +		}
> 
> There needs to be a per queue counter any and all drops.

[idog] It will be counted a few lines below by
	'dumper_q->tx_stat.err_pkts += nb_pkts - num_tx;'
as this case doesn't increment 'num_tx'.
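
The accounting can be illustrated with a simplified model of the tx loop. This is a sketch under stated assumptions, not the driver code: 'queue_stat' and 'tx_burst_model' are hypothetical names, packets are reduced to their lengths, and the return value is chosen for the illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for the per-queue counters kept by the pcap PMD. */
struct queue_stat {
	uint64_t pkts;
	uint64_t err_pkts;
};

/*
 * Model of the tx burst loop: pkt_len[] holds the length of each packet
 * in the burst. Oversized packets are skipped without touching num_tx,
 * so the final subtraction counts them as errors.
 */
static uint16_t
tx_burst_model(struct queue_stat *stat, const uint32_t *pkt_len,
	       uint16_t nb_pkts, uint32_t mtu)
{
	uint16_t num_tx = 0;
	uint16_t i;

	for (i = 0; i < nb_pkts; i++) {
		if (pkt_len[i] > mtu)
			continue; /* dropped: num_tx is not incremented */
		num_tx++;         /* the real driver would write the packet here */
	}

	stat->pkts += num_tx;
	stat->err_pkts += nb_pkts - num_tx;
	return num_tx;
}
```

In this model every packet skipped by the 'continue' ends up in 'err_pkts', which is the per-queue counter the review asks for.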

> >
> > +static int
> > +eth_mtu_set(struct rte_eth_dev *dev, uint16_t mtu) {
> > +	struct pmd_internals *internals = dev->data->dev_private;
> > +
> > +	PMD_LOG(INFO, "MTU set %s %u\n", dev->device->name, mtu);
> > +	internals->mtu = mtu;
> > +	return 0;
> > +}
> 
> If you drop internals->mtu (redundant) then this just becomes stub (ie return 0)
> 

[idog] Again, I'm not sure if it's right to use 'dev->data->mtu' directly where it's needed later.
ferruh.yigit at xilinx.com ?
Anyway, this function is needed even if it does nothing (or just logs) in order for
eth_dev_ops.mtu_set to be supported.
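
A simplified model of why even a stub callback matters. This is a sketch, not the ethdev code: 'dev_model', 'dev_ops_model', 'stub_mtu_set' and 'set_mtu_model' are hypothetical stand-ins, and the model assumes the generic layer returns -ENOTSUP when the callback is missing and records the new value itself when the callback succeeds.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct dev_model;

/* Simplified stand-in for the mtu_set slot of eth_dev_ops. */
struct dev_ops_model {
	int (*mtu_set)(struct dev_model *dev, uint16_t mtu);
};

/* Simplified stand-in for rte_eth_dev and its dev->data->mtu field. */
struct dev_model {
	const struct dev_ops_model *ops;
	uint16_t mtu;
};

/* Stub callback: nothing to program, just report success. */
static int
stub_mtu_set(struct dev_model *dev, uint16_t mtu)
{
	(void)dev;
	(void)mtu;
	return 0;
}

/* Model of the generic layer: a missing callback means -ENOTSUP. */
static int
set_mtu_model(struct dev_model *dev, uint16_t mtu)
{
	int ret;

	if (dev->ops->mtu_set == NULL)
		return -ENOTSUP;

	ret = dev->ops->mtu_set(dev, mtu);
	if (ret == 0)
		dev->mtu = mtu; /* the generic layer records the value */
	return ret;
}
```

In this model a driver without the callback rejects every MTU change, while a driver with the empty stub accepts it and the value lands in the generic 'mtu' field.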


> >
> >  static int
> > @@ -1233,6 +1270,7 @@ pmd_init_internals(struct rte_vdev_device *vdev,
> >  		.addr_bytes = { 0x02, 0x70, 0x63, 0x61, 0x70, iface_idx++ }
> >  	};
> >  	(*internals)->phy_mac = 0;
> > +	(*internals)->mtu = RTE_ETH_PCAP_SNAPLEN;
> 
> 
> Use dev->data->mtu not internal value.
> 

[idog] This runs early, when the probe creates the device.
Later, 'dev->data->mtu' will be overwritten in 'rte_eth_dev_configure'
with the hard-coded 1500:

	if (dev_conf->rxmode.mtu == 0)
		dev->data->dev_conf.rxmode.mtu = RTE_ETHER_MTU;
	ret = eth_dev_validate_mtu(port_id, &dev_info,
			dev->data->dev_conf.rxmode.mtu);
	if (ret != 0)
		goto rollback;
	dev->data->mtu = dev->data->dev_conf.rxmode.mtu;

I tried to overcome it in [PATCH v2] http://mails.dpdk.org/archives/dev/2022-May/241974.html
but that code change spills out of the pcap PMD and changes the rte_ethdev ABI, which I would rather avoid.
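
The ordering problem can be shown with a small model. It is a sketch under assumptions, not the real code path: 'port_model', 'probe_model' and 'configure_model' are hypothetical names, SNAPLEN_PLACEHOLDER is an arbitrary value standing in for RTE_ETH_PCAP_SNAPLEN, and the probe step is assumed to seed both fields so the overwrite is visible.

```c
#include <assert.h>
#include <stdint.h>

#define ETHER_MTU_DEFAULT   1500u /* value of RTE_ETHER_MTU */
#define SNAPLEN_PLACEHOLDER 9000u /* arbitrary stand-in, NOT the real RTE_ETH_PCAP_SNAPLEN */

struct port_model {
	uint16_t data_mtu;      /* stands in for dev->data->mtu */
	uint16_t internals_mtu; /* stands in for internals->mtu */
};

/* Probe time: assume the driver seeds both fields with its own default. */
static void
probe_model(struct port_model *p)
{
	p->data_mtu = SNAPLEN_PLACEHOLDER;
	p->internals_mtu = SNAPLEN_PLACEHOLDER;
}

/*
 * Configure time: mirrors the quoted snippet. A zero rxmode.mtu is
 * replaced by the 1500 default and stored in data_mtu only.
 */
static void
configure_model(struct port_model *p, uint16_t rxmode_mtu)
{
	if (rxmode_mtu == 0)
		rxmode_mtu = ETHER_MTU_DEFAULT;
	p->data_mtu = rxmode_mtu;
}
```

After probe followed by configure with a zero 'rxmode.mtu', the generic field in this model holds 1500 while the driver-private copy still holds the probe-time default.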


