[dpdk-dev] [PATCH v3 5/6] ixgbe: Config VF RSS

Ouyang, Changchun changchun.ouyang at intel.com
Fri Dec 26 02:52:25 CET 2014



> -----Original Message-----
> From: Vlad Zolotarov [mailto:vladz at cloudius-systems.com]
> Sent: Thursday, December 25, 2014 9:20 PM
> To: Ouyang, Changchun; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v3 5/6] ixgbe: Config VF RSS
> 
> 
> On 12/25/14 04:43, Ouyang, Changchun wrote:
> > Hi,
> > Sorry miss some comments, so continue my response below,
> >
> >> -----Original Message-----
> >> From: Vlad Zolotarov [mailto:vladz at cloudius-systems.com]
> >> Sent: Wednesday, December 24, 2014 6:40 PM
> >> To: Ouyang, Changchun; dev at dpdk.org
> >> Subject: Re: [dpdk-dev] [PATCH v3 5/6] ixgbe: Config VF RSS
> >>
> >>
> >> On 12/24/14 07:23, Ouyang Changchun wrote:
> >>> To enable VF RSS, RSS must be configured along with the IXGBE_MRQC
> >>> and IXGBE_VFPSRTYPE registers. The psrtype determines how many queues
> >>> the received packets are distributed to; its value depends on two
> >>> factors: the maximum VF rxq number negotiated with the PF, and the
> >>> number of rxq specified in the configuration on the guest.
> >>> Signed-off-by: Changchun Ouyang <changchun.ouyang at intel.com>
> >>> ---
> >>>    lib/librte_pmd_ixgbe/ixgbe_pf.c   | 15 +++++++
> >>>    lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 92 ++++++++++++++++++++++++++++++++++-----
> >>>    2 files changed, 97 insertions(+), 10 deletions(-)
> >>>
> >>> diff --git a/lib/librte_pmd_ixgbe/ixgbe_pf.c b/lib/librte_pmd_ixgbe/ixgbe_pf.c
> >>> index cbb0145..9c9dad8 100644
> >>> --- a/lib/librte_pmd_ixgbe/ixgbe_pf.c
> >>> +++ b/lib/librte_pmd_ixgbe/ixgbe_pf.c
> >>> @@ -187,6 +187,21 @@ int ixgbe_pf_host_configure(struct rte_eth_dev *eth_dev)
> >>>    	IXGBE_WRITE_REG(hw, IXGBE_MPSAR_LO(hw->mac.num_rar_entries), 0);
> >>>    	IXGBE_WRITE_REG(hw, IXGBE_MPSAR_HI(hw->mac.num_rar_entries), 0);
> >>>
> >>> +	/*
> >>> +	 * VF RSS can support at most 4 queues for each VF. Even if
> >>> +	 * 8 queues are available for each VF, it needs to be limited to
> >>> +	 * 4 queues here due to this restriction; otherwise no queue
> >>> +	 * will receive any packet even when RSS is enabled.
> >> According to Table 7-3 in the 82599 spec, RSS is not available when
> >> the port is configured to have 8 queues per pool. This means that if
> >> you see this configuration you may immediately disable the RSS flow
> >> in your code.
> >>
> >>> +	 */
> >>> +	if (eth_dev->data->dev_conf.rxmode.mq_mode == ETH_MQ_RX_VMDQ_RSS) {
> >>> +		if (RTE_ETH_DEV_SRIOV(eth_dev).nb_q_per_pool == 8) {
> >>> +			RTE_ETH_DEV_SRIOV(eth_dev).active = ETH_32_POOLS;
> >>> +			RTE_ETH_DEV_SRIOV(eth_dev).nb_q_per_pool = 4;
> >>> +			RTE_ETH_DEV_SRIOV(eth_dev).def_pool_q_idx =
> >>> +				dev_num_vf(eth_dev) * 4;
> >> According to the 82599 spec you can't do that, since RSS is not
> >> allowed when the port is configured to have 8 queues per VF. Have you
> >> verified that this works? If yes, then the spec should be updated.
> >>
> >>> +		}
> >>> +	}
> >>> +
> >>>    	/* set VMDq map to default PF pool */
> >>>    	hw->mac.ops.set_vmdq(hw, 0, RTE_ETH_DEV_SRIOV(eth_dev).def_vmdq_idx);
> >>>
> >>> diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> >>> b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> >>> index f69abda..a7c17a4 100644
> >>> --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> >>> +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> >>> @@ -3327,6 +3327,39 @@ ixgbe_alloc_rx_queue_mbufs(struct igb_rx_queue *rxq)
> >>>    }
> >>>
> >>>    static int
> >>> +ixgbe_config_vf_rss(struct rte_eth_dev *dev)
> >>> +{
> >>> +	struct ixgbe_hw *hw;
> >>> +	uint32_t mrqc;
> >>> +
> >>> +	ixgbe_rss_configure(dev);
> >>> +
> >>> +	hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
> >>> +
> >>> +	/* MRQC: enable VF RSS */
> >>> +	mrqc = IXGBE_READ_REG(hw, IXGBE_MRQC);
> >>> +	mrqc &= ~IXGBE_MRQC_MRQE_MASK;
> >>> +	switch (RTE_ETH_DEV_SRIOV(dev).active) {
> >>> +	case ETH_64_POOLS:
> >>> +		mrqc |= IXGBE_MRQC_VMDQRSS64EN;
> >>> +		break;
> >>> +
> >>> +	case ETH_32_POOLS:
> >>> +	case ETH_16_POOLS:
> >>> +		mrqc |= IXGBE_MRQC_VMDQRSS32EN;
> >> Again, this contradicts the spec.
> > Yes, the spec says the hw can't support VF RSS at all, but experiments
> > found that it could be done.
> 
> The spec explicitly says that VF RSS *is* supported, in particular in the
> table mentioned above.
But the spec (January 2014, revision 2.9) in my hand says "in IOV mode, VMDq+RSS mode is not available" in a note in section 4.6.10.2.1.
> What your code is doing is that in the case of 16 VFs you set up a
> 32-pool configuration and use only 16 of them.
But I don't see any big issue here. In this case each VF could have 8
queues, as I said before, but that is an estimated value; actually only 4
queues are really available for one VF. You can refer to the spec for the
correctness here.
> 
> > We can focus on discussing the implementation first.
> 
> >
> >>> +		break;
> >>> +
> >>> +	default:
> >>> +		PMD_INIT_LOG(ERR, "Invalid pool number in IOV mode");
> >>> +		return -EINVAL;
> >>> +	}
> >>> +
> >>> +	IXGBE_WRITE_REG(hw, IXGBE_MRQC, mrqc);
> >>> +
> >>> +	return 0;
> >>> +}
> >>> +
> >>> +static int
> >>>    ixgbe_dev_mq_rx_configure(struct rte_eth_dev *dev)
> >>>    {
> >>>    	struct ixgbe_hw *hw =
> >>> @@ -3358,24 +3391,38 @@ ixgbe_dev_mq_rx_configure(struct
> >> rte_eth_dev *dev)
> >>>    			default: ixgbe_rss_disable(dev);
> >>>    		}
> >>>    	} else {
> >>> -		switch (RTE_ETH_DEV_SRIOV(dev).active) {
> >>>    		/*
> >>>    		 * SRIOV active scheme
> >>>    		 * FIXME if support DCB/RSS together with VMDq & SRIOV
> >>>    		 */
> >>> -		case ETH_64_POOLS:
> >>> -			IXGBE_WRITE_REG(hw, IXGBE_MRQC, IXGBE_MRQC_VMDQEN);
> >>> +		switch (dev->data->dev_conf.rxmode.mq_mode) {
> >>> +		case ETH_MQ_RX_RSS:
> >>> +		case ETH_MQ_RX_VMDQ_RSS:
> >>> +			ixgbe_config_vf_rss(dev);
> >>>    			break;
> >>>
> >>> -		case ETH_32_POOLS:
> >>> -			IXGBE_WRITE_REG(hw, IXGBE_MRQC, IXGBE_MRQC_VMDQRT4TCEN);
> >>> -			break;
> >>> +		default:
> >>> +			switch (RTE_ETH_DEV_SRIOV(dev).active) {
> >> Sorry for nitpicking, but have you considered moving this nested
> >> "switch-case" block into a separate function? This could make the
> >> code look a lot nicer. ;)
> >>
> >>> +			case ETH_64_POOLS:
> >>> +				IXGBE_WRITE_REG(hw, IXGBE_MRQC,
> >>> +					IXGBE_MRQC_VMDQEN);
> >>> +				break;
> >>>
> >>> -		case ETH_16_POOLS:
> >>> -			IXGBE_WRITE_REG(hw, IXGBE_MRQC, IXGBE_MRQC_VMDQRT8TCEN);
> >>> +			case ETH_32_POOLS:
> >>> +				IXGBE_WRITE_REG(hw, IXGBE_MRQC,
> >>> +					IXGBE_MRQC_VMDQRT4TCEN);
> >>> +				break;
> >>> +
> >>> +			case ETH_16_POOLS:
> >>> +				IXGBE_WRITE_REG(hw, IXGBE_MRQC,
> >>> +					IXGBE_MRQC_VMDQRT8TCEN);
> >>> +				break;
> >>> +			default:
> >>> +				PMD_INIT_LOG(ERR,
> >>> +					"invalid pool number in IOV mode");
> >>> +				break;
> >>> +			}
> >>>    			break;
> >>> -		default:
> >>> -			PMD_INIT_LOG(ERR, "invalid pool number in IOV mode");
> >>>    		}
> >>>    	}
> >>>
> >>> @@ -3989,10 +4036,32 @@ ixgbevf_dev_rx_init(struct rte_eth_dev *dev)
> >>>    	uint16_t buf_size;
> >>>    	uint16_t i;
> >>>    	int ret;
> >>> +	uint16_t valid_rxq_num;
> >>>
> >>>    	PMD_INIT_FUNC_TRACE();
> >>>    	hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
> >>>
> >>> +	valid_rxq_num = RTE_MIN(dev->data->nb_rx_queues,
> >>> +		hw->mac.max_rx_queues);
> >>> +
> >>> +	/*
> >>> +	 * VMDq RSS can't support 3 queues, so configure it to 4 queues,
> >>> +	 * and give the user a hint that some packets may be lost if the
> >>> +	 * queue those packets are distributed to is not polled.
> >>> +	 */
> >>> +	if (valid_rxq_num == 3)
> >>> +		valid_rxq_num = 4;
> >> Why configure more queues than requested and not fewer (2)? Why
> >> configure anything at all instead of returning an error?
> > Sorry, I don't agree that this is "anything" as you say, because I
> > don't use 5, 6, 7, 8, ..., 16, 2014, 2015, etc.
> > Considering 2 versus 4, I prefer 4. The reason is that if the user
> > needs more than 3 queues per VF to do something, and the PF also has
> > the capability to set up 4 queues per VF, then confining it to 2
> > queues is not a good thing either. So here I try to enable 4 queues
> > and give the user a hint.
> > Btw, changing it to 2 is another option; it depends on other people's
> > insight here.
> 
> Like I said before, trying to guess what the user wants is a way to make
> code that is very hard to use and to maintain. Please just return an
> error and let the user code deal with it the way the user really wants,
> and not the way you *think* they want.
> 
I don't disagree with this either :-)
If you have a strong reason for this approach and more people agree with
it, I will probably modify it in v4.
> >
> >>> +
> >>> +	if (dev->data->nb_rx_queues > valid_rxq_num) {
> >>> +		PMD_INIT_LOG(ERR, "The number of Rx queues is invalid; "
> >>> +			"it should be equal to or less than %d",
> >>> +			valid_rxq_num);
> >>> +		return -1;
> >>> +	} else if (dev->data->nb_rx_queues < valid_rxq_num)
> >>> +		PMD_INIT_LOG(ERR, "The number of Rx queues is less "
> >>> +			"than the number of available Rx queues: %d; "
> >>> +			"packets in Rx queues (q_id >= %d) may be lost.",
> >>> +			valid_rxq_num, dev->data->nb_rx_queues);
> >> Who ever looks in the "INIT_LOG" when everything "works well"? And
> >> you make it look so by allowing this call to succeed. And then some
> >> packets will just silently not arrive?! What is the user supposed to
> >> do - somehow guess to look in the "INIT_LOG"?! This is a nightmare!
> >>
> >>> +
> >>>    	/*
> >>>    	 * When the VF driver issues a IXGBE_VF_RESET request, the PF driver
> >>>    	 * disables the VF receipt of packets if the PF MTU is > 1500.
> >>> @@ -4094,6 +4163,9 @@ ixgbevf_dev_rx_init(struct rte_eth_dev *dev)
> >>>    			IXGBE_PSRTYPE_IPV6HDR;
> >>>    #endif
> >>>
> >>> +	/* Set RQPL for VF RSS according to max Rx queue */
> >>> +	psrtype |= (valid_rxq_num >> 1) <<
> >>> +		IXGBE_PSRTYPE_RQPL_SHIFT;
> >>>    	IXGBE_WRITE_REG(hw, IXGBE_VFPSRTYPE, psrtype);
> >>>
> >>>    	if (dev->data->dev_conf.rxmode.enable_scatter) {


