[PATCH v2 6/7] baseband/acc: introduce the new VRB2 variant

Chautru, Nicolas nicolas.chautru at intel.com
Thu Sep 28 01:46:12 CEST 2023


Hi Maxime, 

> -----Original Message-----
> From: Maxime Coquelin <maxime.coquelin at redhat.com>
> Sent: Wednesday, September 27, 2023 6:40 AM
> To: Chautru, Nicolas <nicolas.chautru at intel.com>; dev at dpdk.org
> Cc: hemant.agrawal at nxp.com; david.marchand at redhat.com; Vargas, Hernan
> <hernan.vargas at intel.com>
> Subject: Re: [PATCH v2 6/7] baseband/acc: introduce the new VRB2 variant
> 
> 
> 
> On 9/21/23 22:43, Nicolas Chautru wrote:
> > This extends the unified driver to support both the
> > VRB1 and VRB2 implementations of Intel vRAN Boost.
> >
> > Signed-off-by: Nicolas Chautru <nicolas.chautru at intel.com>
> > ---
> >   doc/guides/bbdevs/index.rst            |    1 +
> >   doc/guides/bbdevs/vrb2.rst             |  269 +++++
> >   doc/guides/rel_notes/release_23_11.rst |    3 +
> >   drivers/baseband/acc/acc_common.h      |   84 +-
> >   drivers/baseband/acc/rte_acc100_pmd.c  |    4 +-
> >   drivers/baseband/acc/rte_vrb_pmd.c     | 1442 +++++++++++++++++++++---
> >   drivers/baseband/acc/vrb1_pf_enum.h    |   17 +-
> >   drivers/baseband/acc/vrb2_pf_enum.h    |  124 ++
> >   drivers/baseband/acc/vrb2_vf_enum.h    |  121 ++
> >   drivers/baseband/acc/vrb_pmd.h         |  161 ++-
> >   10 files changed, 2062 insertions(+), 164 deletions(-)
> >   create mode 100644 doc/guides/bbdevs/vrb2.rst
> >   create mode 100644 drivers/baseband/acc/vrb2_pf_enum.h
> >   create mode 100644 drivers/baseband/acc/vrb2_vf_enum.h
> 
> 
> As a general comment, the review process would be easier if the patch was
> split by features introduced:
> - Base init, no capabilities exposed
> - Add LDPC capabilities and related code
> - ADD FFT capabilities and related code
> - MLD-TS ...
> 
> We had this very discussion last week at the techboard for a Net PMD, and it
> was a shared opinion across the board.

Yes I realize there is more code that I first thought. Will do in v3.

> 
> > diff --git a/doc/guides/bbdevs/index.rst b/doc/guides/bbdevs/index.rst
> > index 77d4c54664..269157d77f 100644
> > --- a/doc/guides/bbdevs/index.rst
> > +++ b/doc/guides/bbdevs/index.rst
> > @@ -15,4 +15,5 @@ Baseband Device Drivers
> >       fpga_5gnr_fec
> >       acc100
> >       vrb1
> > +    vrb2
> >       la12xx
> > diff --git a/doc/guides/bbdevs/vrb2.rst b/doc/guides/bbdevs/vrb2.rst
> > new file mode 100644 index 0000000000..8d8e094660
> > --- /dev/null
> > +++ b/doc/guides/bbdevs/vrb2.rst
> > @@ -0,0 +1,269 @@
> > +..  SPDX-License-Identifier: BSD-3-Clause
> > +    Copyright(c) 2023 Intel Corporation
> > +
> > +.. include:: <isonum.txt>
> > +
> > +Intel\ |reg| vRAN Boost v2 Poll Mode Driver (PMD)
> > +=================================================
> > +
> > +The Intel\ |reg| vRAN Boost integrated accelerator enables
> > +cost-effective 4G and 5G next-generation virtualized Radio Access
> > +Network (vRAN) solutions.
> > +The Intel vRAN Boost v2.0 (VRB2 in the code) is specifically
> > +integrated on the Intel\ |reg| Xeon\ |reg| Granite Rapids-D Process (GNR-
> D).
> > +
> > +Features
> > +--------
> > +
> > +Intel vRAN Boost v2.0 includes a 5G Low Density Parity Check (LDPC)
> > +encoder/decoder, rate match/dematch, Hybrid Automatic Repeat Request
> > +(HARQ) with access to DDR memory for buffer management, a 4G Turbo
> > +encoder/decoder, a Fast Fourier Transform (FFT) block providing
> > +DFT/iDFT processing offload for the 5G Sounding Reference Signal
> > +(SRS), a MLD-TS accelerator, a Queue Manager (QMGR), and a DMA
> subsystem.
> > +There is no dedicated on-card memory for HARQ, the coherent memory on
> the CPU side is being used.
> > +
> > +These hardware blocks provide the following features exposed by the PMD:
> > +
> > +- LDPC Encode in the Downlink (5GNR)
> > +- LDPC Decode in the Uplink (5GNR)
> > +- Turbo Encode in the Downlink (4G)
> > +- Turbo Decode in the Uplink (4G)
> > +- FFT processing
> > +- MLD-TS processing
> > +- Single Root I/O Virtualization (SR-IOV) with 16 Virtual Functions
> > +(VFs) per Physical Function (PF)
> > +- Maximum of 2048 queues per VF
> > +- Message Signaled Interrupts (MSIs)
> > +
> > +The Intel vRAN Boost v2.0 PMD supports the following bbdev capabilities:
> > +
> > +* For the LDPC encode operation:
> > +   - ``RTE_BBDEV_LDPC_CRC_24B_ATTACH``: set to attach CRC24B to CB(s).
> > +   - ``RTE_BBDEV_LDPC_RATE_MATCH``: if set then do not do Rate Match
> bypass.
> > +   - ``RTE_BBDEV_LDPC_INTERLEAVER_BYPASS``: if set then bypass
> interleaver.
> > +   - ``RTE_BBDEV_LDPC_ENC_SCATTER_GATHER``: supports scatter-gather
> for input/output data.
> > +   - ``RTE_BBDEV_LDPC_ENC_CONCATENATION``: concatenate code blocks
> with bit granularity.
> > +
> > +* For the LDPC decode operation:
> > +   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK``: check CRC24B from CB(s).
> > +   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP``: drops CRC24B bits
> appended while decoding.
> > +   - ``RTE_BBDEV_LDPC_CRC_TYPE_24A_CHECK``: check CRC24A from CB(s).
> > +   - ``RTE_BBDEV_LDPC_CRC_TYPE_16_CHECK``: check CRC16 from CB(s).
> > +   - ``RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE``: provides an input for
> HARQ combining.
> > +   - ``RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE``: provides an input
> for HARQ combining.
> > +   - ``RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE``: disable early
> termination.
> > +   - ``RTE_BBDEV_LDPC_DEC_SCATTER_GATHER``: supports scatter-gather
> for input/output data.
> > +   - ``RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION``: supports
> compression of the HARQ input/output.
> > +   - ``RTE_BBDEV_LDPC_LLR_COMPRESSION``: supports LLR input
> compression.
> > +   - ``RTE_BBDEV_LDPC_HARQ_4BIT_COMPRESSION``: supports
> compression of the HARQ input/output.
> > +   - ``RTE_BBDEV_LDPC_SOFT_OUT_ENABLE``: set the APP LLR soft output.
> > +   - ``RTE_BBDEV_LDPC_SOFT_OUT_RM_BYPASS``: set the APP LLR soft
> output after rate-matching.
> > +   - ``RTE_BBDEV_LDPC_SOFT_OUT_DEINTERLEAVER_BYPASS``: disables the
> de-interleaver.
> > +
> > +* For the turbo encode operation:
> > +   - ``RTE_BBDEV_TURBO_CRC_24B_ATTACH``: set to attach CRC24B to
> CB(s).
> > +   - ``RTE_BBDEV_TURBO_RATE_MATCH``: if set then do not do Rate Match
> bypass.
> > +   - ``RTE_BBDEV_TURBO_ENC_INTERRUPTS``: set for encoder dequeue
> interrupts.
> > +   - ``RTE_BBDEV_TURBO_RV_INDEX_BYPASS``: set to bypass RV index.
> > +   - ``RTE_BBDEV_TURBO_ENC_SCATTER_GATHER``: supports scatter-
> gather for input/output data.
> > +
> > +* For the turbo decode operation:
> > +   - ``RTE_BBDEV_TURBO_CRC_TYPE_24B``: check CRC24B from CB(s).
> > +   - ``RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE``: perform subblock
> de-interleave.
> > +   - ``RTE_BBDEV_TURBO_DEC_INTERRUPTS``: set for decoder dequeue
> interrupts.
> > +   - ``RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN``: set if negative LLR input is
> supported.
> > +   - ``RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP``: keep CRC24B bits
> appended while decoding.
> > +   - ``RTE_BBDEV_TURBO_DEC_CRC_24B_DROP``: option to drop the code
> block CRC after decoding.
> > +   - ``RTE_BBDEV_TURBO_EARLY_TERMINATION``: set early termination
> feature.
> > +   - ``RTE_BBDEV_TURBO_DEC_SCATTER_GATHER``: supports scatter-
> gather for input/output data.
> > +   - ``RTE_BBDEV_TURBO_HALF_ITERATION_EVEN``: set half iteration
> granularity.
> > +   - ``RTE_BBDEV_TURBO_SOFT_OUTPUT``: set the APP LLR soft output.
> > +   - ``RTE_BBDEV_TURBO_EQUALIZER``: set the turbo equalizer feature.
> > +   - ``RTE_BBDEV_TURBO_SOFT_OUT_SATURATE``: set the soft output
> saturation.
> > +   - ``RTE_BBDEV_TURBO_CONTINUE_CRC_MATCH``: set to run an extra
> odd iteration after CRC match.
> > +   - ``RTE_BBDEV_TURBO_NEG_LLR_1_BIT_SOFT_OUT``: set if negative APP
> LLR output supported.
> > +   - ``RTE_BBDEV_TURBO_MAP_DEC``: supports flexible parallel MAP
> engine decoding.
> > +
> > +* For the FFT operation:
> > +   - ``RTE_BBDEV_FFT_WINDOWING``: flexible windowing capability.
> > +   - ``RTE_BBDEV_FFT_CS_ADJUSTMENT``: flexible adjustment of Cyclic Shift
> time offset.
> > +   - ``RTE_BBDEV_FFT_DFT_BYPASS``: set for bypass the DFT and get
> directly into iDFT input.
> > +   - ``RTE_BBDEV_FFT_IDFT_BYPASS``: set for bypass the IDFT and get
> directly the DFT output.
> > +   - ``RTE_BBDEV_FFT_WINDOWING_BYPASS``: set for bypass time domain
> windowing.
> > +
> > +* For the MLD-TS operation:
> > +   - ``RTE_BBDEV_MLDTS_REP``: set to repeat and reuse channel across
> operations.
> > +
> > +Installation
> > +------------
> > +
> > +Section 3 of the DPDK manual provides instructions on installing and
> compiling DPDK.
> > +
> > +DPDK requires hugepages to be configured as detailed in section 2 of the
> DPDK manual.
> > +The bbdev test application has been tested with a configuration 40 x 1GB
> hugepages.
> > +The hugepage configuration of a server may be examined using:
> > +
> > +.. code-block:: console
> > +
> > +   grep Huge* /proc/meminfo
> > +
> > +
> > +Initialization
> > +--------------
> > +
> > +When the device first powers up, its PCI Physical Functions (PF) can
> > +be listed through these commands for Intel vRAN Boost v2:
> > +
> > +.. code-block:: console
> > +
> > +   sudo lspci -vd8086:57c2
> > +
> > +The physical and virtual functions are compatible with Linux UIO drivers:
> > +``vfio`` and ``igb_uio``.
> > +However, in order to work the 5G/4G FEC device first needs to be
> > +bound to one of these Linux drivers through DPDK.
> > +
> > +
> > +Bind PF UIO driver(s)
> > +~~~~~~~~~~~~~~~~~~~~~
> >
> 
> Please document with VFIO instead of IGB_UIO, which is being deprecated.

There are still some deployment with igb_uio, so keeping for a bit. Then we can remove in all documentations. 

> 
> > +Install the DPDK igb_uio driver, bind it with the PF PCI device ID
> > +and use ``lspci`` to confirm the PF device is under use by ``igb_uio`` DPDK
> UIO driver.
> > +
> > +The igb_uio driver may be bound to the PF PCI device using one of two
> > +methods for Intel vRAN Boost v2:
> > +
> > +#. PCI functions (physical or virtual, depending on the use case) can
> > +be bound to the UIO driver by repeating this command for every function.
> > +
> > +.. code-block:: console
> > +
> > +   cd <dpdk-top-level-directory>
> > +   insmod build/kmod/igb_uio.ko
> > +   echo "8086 57c2" > /sys/bus/pci/drivers/igb_uio/new_id
> > +   lspci -vd8086:57c2
> > +
> > +#. Another way to bind PF with DPDK UIO driver is by using the
> > +``dpdk-devbind.py`` tool
> > +
> > +.. code-block:: console
> > +
> > +   cd <dpdk-top-level-directory>
> > +   usertools/dpdk-devbind.py -b igb_uio 0000:f7:00.0
> > +
> > +where the PCI device ID (example: 0000:f7:00.0) is obtained using ``lspci -
> vd8086:57c2``.
> > +
> > +In a similar way the PF may be bound with vfio-pci as any PCIe device.
> > +
> > +
> > +Enable Virtual Functions
> > +~~~~~~~~~~~~~~~~~~~~~~~~
> > +
> > +Now, it should be visible in the printouts that PCI PF is under
> > +igb_uio control "``Kernel driver in use: igb_uio``"
> > +
> > +To show the number of available VFs on the device, read ``sriov_totalvfs``
> file.
> > +
> > +.. code-block:: console
> > +
> > +   cat /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_totalvfs
> > +
> > +where ``0000\:<b>\:<d>.<f>`` is the PCI device ID
> > +
> > +To enable VFs via igb_uio, echo the number of virtual functions
> > +intended to enable to ``max_vfs`` file.
> > +
> > +.. code-block:: console
> > +
> > +   echo <num-of-vfs> >
> > + /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/max_vfs
> > +
> > +Afterwards, all VFs must be bound to appropriate UIO drivers as
> > +required, same way it was done with the physical function previously.
> > +
> > +Enabling SR-IOV via VFIO driver is pretty much the same, except that
> > +the file name is different:
> > +
> > +.. code-block:: console
> > +
> > +   echo <num-of-vfs> >
> > + /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_numvfs
> > +
> > +
> > +Configure the VFs through PF
> > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > +
> > +The PCI virtual functions must be configured before working or
> > +getting assigned to VMs/Containers.
> > +The configuration involves allocating the number of hardware queues,
> > +priorities, load balance, bandwidth and other settings necessary for
> > +the device to perform FEC functions.
> > +
> > +This configuration needs to be executed at least once after reboot or
> > +PCI FLR and can be achieved by using the functions
> > +``rte_acc_configure()``, which sets up the parameters defined in the
> compatible ``rte_acc_conf`` structure.
> > +
> > +
> > +Test Application
> > +----------------
> > +
> > +The bbdev class is provided with a test application,
> > +``test-bbdev.py`` and range of test data for testing the
> > +functionality of the device, depending on the device's capabilities.
> > +The test application is located under app/test-bbdev folder and has
> > +the following options:
> > +
> > +.. code-block:: console
> > +
> > +   "-p", "--testapp-path": specifies path to the bbdev test app.
> > +   "-e", "--eal-params": EAL arguments which are passed to the test app.
> > +   "-t", "--timeout": Timeout in seconds (default=300).
> > +   "-c", "--test-cases": Defines test cases to run. Run all if not specified.
> > +   "-v", "--test-vector": Test vector path.
> > +   "-n", "--num-ops": Number of operations to process on device
> (default=32).
> > +   "-b", "--burst-size": Operations enqueue/dequeue burst size (default=32).
> > +   "-s", "--snr": SNR in dB used when generating LLRs for bler tests.
> > +   "-s", "--iter_max": Number of iterations for LDPC decoder.
> > +   "-l", "--num-lcores": Number of lcores to run (default=16).
> > +   "-i", "--init-device": Initialise PF device with default values.
> > +
> > +
> > +To execute the test application tool using simple decode or encode
> > +data, type one of the following:
> > +
> > +.. code-block:: console
> > +
> > +  ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_dec_default.data
> > + ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_enc_default.data
> > +
> > +
> > +The test application ``test-bbdev.py``, supports the ability to
> > +configure the PF device with a default set of values, if the "-i" or
> > +"- -init-device" option is included. The default values are defined in
> test_bbdev_perf.c.
> > +
> > +
> > +Test Vectors
> > +~~~~~~~~~~~~
> > +
> > +In addition to the simple LDPC decoder and LDPC encoder tests, bbdev
> > +also provides a range of additional tests under the test_vectors
> > +folder, which may be useful.
> > +The results of these tests will depend on the device capabilities
> > +which may cause some test cases to be skipped, but no failure should be
> reported.
> > +
> > +
> > +Alternate Baseband Device configuration tool
> > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > +
> > +On top of the embedded configuration feature supported in test-bbdev
> > +using
> > +"- -init-device" option mentioned above, there is also a tool
> > +available to perform that device configuration using a companion
> application.
> > +The ``pf_bb_config`` application notably enables then to run
> > +bbdev-test from the VF and not only limited to the PF as captured above.
> > +
> > +See for more details: https://github.com/intel/pf-bb-config
> > +
> > +Specifically for the bbdev Intel vRAN Boost v2 PMD, the command below
> > +can be used (note that ACC200 was used previously to refer to VRB2):
> > +
> > +.. code-block:: console
> > +
> > +   pf_bb_config VRB2 -c ./vrb2/vrb2_config_vf_5g.cfg
> > +   test-bbdev.py -e="-c 0xff0 -a${VF_PCI_ADDR}" -c validation -n 64
> > + -b 64 -l 1 -v ./ldpc_dec_default.data
> > diff --git a/doc/guides/rel_notes/release_23_11.rst
> > b/doc/guides/rel_notes/release_23_11.rst
> > index 333e1d95a2..668dd58ee3 100644
> > --- a/doc/guides/rel_notes/release_23_11.rst
> > +++ b/doc/guides/rel_notes/release_23_11.rst
> > @@ -78,6 +78,9 @@ New Features
> >   * build: Optional libraries can now be selected with the new ``enable_libs``
> >     build option similarly to the existing ``enable_drivers`` build option.
> >
> > +* **Updated Intel vRAN Boost bbdev PMD.**
> > +
> > +  Added support for the new Intel vRAN Boost v2 device variant (GNR-D)
> within the unified driver.
> >
> >   Removed Items
> >   -------------
> > diff --git a/drivers/baseband/acc/acc_common.h
> > b/drivers/baseband/acc/acc_common.h
> > index 5de58dbe36..b71292af94 100644
> > --- a/drivers/baseband/acc/acc_common.h
> > +++ b/drivers/baseband/acc/acc_common.h
> > @@ -18,6 +18,7 @@
> >   #define ACC_DMA_BLKID_OUT_HARQ      3
> >   #define ACC_DMA_BLKID_IN_HARQ       3
> >   #define ACC_DMA_BLKID_IN_MLD_R      3
> > +#define ACC_DMA_BLKID_DEWIN_IN      3
> >
> >   /* Values used in filling in decode FCWs */
> >   #define ACC_FCW_TD_VER              1
> > @@ -103,6 +104,9 @@
> >   #define ACC_MAX_NUM_QGRPS              32
> >   #define ACC_RING_SIZE_GRANULARITY      64
> >   #define ACC_MAX_FCW_SIZE              128
> > +#define ACC_IQ_SIZE                    4
> > +
> > +#define ACC_FCW_FFT_BLEN_3             28
> >
> >   /* Constants from K0 computation from 3GPP 38.212 Table 5.4.2.1-2 */
> >   #define ACC_N_ZC_1 66 /* N = 66 Zc for BG 1 */ @@ -132,6 +136,11 @@
> >   #define ACC_LIM_21 14 /* 0.21 */
> >   #define ACC_LIM_31 20 /* 0.31 */
> >   #define ACC_MAX_E (128 * 1024 - 2)
> > +#define ACC_MAX_CS 12
> > +
> > +#define ACC100_VARIANT          0
> > +#define VRB1_VARIANT		2
> > +#define VRB2_VARIANT		3
> >
> >   /* Helper macro for logging */
> >   #define rte_acc_log(level, fmt, ...) \ @@ -332,6 +341,37 @@ struct
> > __rte_packed acc_fcw_fft {
> >   		res:19;
> >   };
> >
> > +/* FFT Frame Control Word. */
> > +struct __rte_packed acc_fcw_fft_3 {
> > +	uint32_t in_frame_size:16,
> > +		leading_pad_size:16;
> > +	uint32_t out_frame_size:16,
> > +		leading_depad_size:16;
> > +	uint32_t cs_window_sel;
> > +	uint32_t cs_window_sel2:16,
> > +		cs_enable_bmap:16;
> > +	uint32_t num_antennas:8,
> > +		idft_size:8,
> > +		dft_size:8,
> > +		cs_offset:8;
> > +	uint32_t idft_shift:8,
> > +		dft_shift:8,
> > +		cs_multiplier:16;
> > +	uint32_t bypass:2,
> > +		fp16_in:1,
> > +		fp16_out:1,
> > +		exp_adj:4,
> > +		power_shift:4,
> > +		power_en:1,
> > +		enable_dewin:1,
> > +		freq_resample_mode:2,
> > +		depad_ouput_size:16;
> > +	uint16_t cs_theta_0[ACC_MAX_CS];
> > +	uint32_t cs_theta_d[ACC_MAX_CS];
> > +	int8_t cs_time_offset[ACC_MAX_CS];
> > +};
> > +
> > +
> >   /* MLD-TS Frame Control Word */
> >   struct __rte_packed acc_fcw_mldts {
> >   	uint32_t fcw_version:4,
> > @@ -473,14 +513,14 @@ union acc_info_ring_data {
> >   		uint16_t valid: 1;
> >   	};
> >   	struct {
> > -		uint32_t aq_id_3: 6;
> > -		uint32_t qg_id_3: 5;
> > -		uint32_t vf_id_3: 6;
> > -		uint32_t int_nb_3: 6;
> > -		uint32_t msi_0_3: 1;
> > -		uint32_t vf2pf_3: 6;
> > -		uint32_t loop_3: 1;
> > -		uint32_t valid_3: 1;
> > +		uint32_t aq_id_vrb2: 6;
> > +		uint32_t qg_id_vrb2: 5;
> > +		uint32_t vf_id_vrb2: 6;
> > +		uint32_t int_nb_vrb2: 6;
> > +		uint32_t msi_0_vrb2: 1;
> > +		uint32_t vf2pf_vrb2: 6;
> > +		uint32_t loop_vrb2: 1;
> > +		uint32_t valid_vrb2: 1;
> >   	};
> >   } __rte_packed;
> >
> > @@ -765,16 +805,20 @@ alloc_sw_rings_min_mem(struct rte_bbdev *dev,
> struct acc_device *d,
> >    */
> >   static inline uint16_t
> >   get_queue_id_from_ring_info(struct rte_bbdev_data *data,
> > -		const union acc_info_ring_data ring_data)
> > +		const union acc_info_ring_data ring_data, uint16_t
> device_variant)
> 
> The device variant ID should be derived from the data, which stores device
> private pointer. No need to add an extra parameter.

the device variant comes from directly forom dev-> not from dev->data. 

> 
> >   {
> >   	uint16_t queue_id;
> > +	struct acc_queue *acc_q;
> > +	uint16_t vf_id = (device_variant == VRB2_VARIANT) ?
> ring_data.vf_id_vrb2 : ring_data.vf_id;
> > +	uint16_t aq_id = (device_variant == VRB2_VARIANT) ?
> ring_data.aq_id_vrb2 : ring_data.aq_id;
> > +	uint16_t qg_id = (device_variant == VRB2_VARIANT) ?
> > +ring_data.qg_id_vrb2 : ring_data.qg_id;
> 
> Wouldn't it be better to have an intermediate representation, and each variant
> implements a callback to fill it?
> 
> It will become messy otherwise when other variants will be added.
> 
> Something like the existing queue_offset callback would be great.

ok fair enough. 

> 
> >
> >   	for (queue_id = 0; queue_id < data->num_queues; ++queue_id) {
> > -		struct acc_queue *acc_q =
> > -				data->queues[queue_id].queue_private;
> > -		if (acc_q != NULL && acc_q->aq_id == ring_data.aq_id &&
> > -				acc_q->qgrp_id == ring_data.qg_id &&
> > -				acc_q->vf_id == ring_data.vf_id)
> > +		acc_q = data->queues[queue_id].queue_private;
> > +
> > +		if (acc_q != NULL && acc_q->aq_id == aq_id &&
> > +				acc_q->qgrp_id == qg_id &&
> > +				acc_q->vf_id == vf_id)
> >   			return queue_id;
> >   	}
> >
> > @@ -1436,4 +1480,16 @@ get_num_cbs_in_tb_ldpc_enc(struct
> rte_bbdev_op_ldpc_enc *ldpc_enc)
> >   	return cbs_in_tb;
> >   }
> >
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +static inline void
> > +acc_memdump(const char *string, void *buf, uint16_t bytes) {
> > +	printf("%s\n", string);
> > +	uint32_t *data = buf;
> > +	uint16_t i;
> > +	for (i = 0; i < bytes / 4; i++)
> > +		printf("0x%08X\n", data[i]);
> > +}
> 
> For consistency and to avoid code dumplication, please use rte_memdump()
> directly as you did in ACC100.

I can do for now, it is just that for format is quite poor. Will change. 

> 
> > +#endif
> > +
> >   #endif /* _ACC_COMMON_H_ */
> > diff --git a/drivers/baseband/acc/rte_acc100_pmd.c
> > b/drivers/baseband/acc/rte_acc100_pmd.c
> > index 5362d39c30..7f8d05b5a9 100644
> > --- a/drivers/baseband/acc/rte_acc100_pmd.c
> > +++ b/drivers/baseband/acc/rte_acc100_pmd.c
> > @@ -294,7 +294,7 @@ acc100_pf_interrupt_handler(struct rte_bbdev *dev)
> >   		case ACC100_PF_INT_DMA_UL5G_DESC_IRQ:
> >   		case ACC100_PF_INT_DMA_DL5G_DESC_IRQ:
> >   			deq_intr_det.queue_id =
> get_queue_id_from_ring_info(
> > -					dev->data, *ring_data);
> > +					dev->data, *ring_data, acc100_dev-
> >device_variant);
> >   			if (deq_intr_det.queue_id == UINT16_MAX) {
> >   				rte_bbdev_log(ERR,
> >   						"Couldn't find queue: aq_id:
> %u, qg_id: %u, vf_id: %u", @@
> > -348,7 +348,7 @@ acc100_vf_interrupt_handler(struct rte_bbdev *dev)
> >   			 */
> >   			ring_data->vf_id = 0;
> >   			deq_intr_det.queue_id =
> get_queue_id_from_ring_info(
> > -					dev->data, *ring_data);
> > +					dev->data, *ring_data, acc100_dev-
> >device_variant);
> >   			if (deq_intr_det.queue_id == UINT16_MAX) {
> >   				rte_bbdev_log(ERR,
> >   						"Couldn't find queue: aq_id:
> %u, qg_id: %u", diff --git
> > a/drivers/baseband/acc/rte_vrb_pmd.c
> > b/drivers/baseband/acc/rte_vrb_pmd.c
> > index e82ed55ca7..a787592ec9 100644
> > --- a/drivers/baseband/acc/rte_vrb_pmd.c
> > +++ b/drivers/baseband/acc/rte_vrb_pmd.c
> > @@ -37,6 +37,15 @@ vrb1_queue_offset(bool pf_device, uint8_t vf_id,
> uint8_t qgrp_id, uint16_t aq_id
> >   		return ((qgrp_id << 7) + (aq_id << 3) +
> VRB1_VfQmgrIngressAq);
> >   }
> >
> > +static inline uint32_t
> > +vrb2_queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id,
> > +uint16_t aq_id) {
> > +	if (pf_device)
> > +		return ((vf_id << 14) + (qgrp_id << 9) + (aq_id << 3) +
> VRB2_PfQmgrIngressAq);
> > +	else
> > +		return ((qgrp_id << 9) + (aq_id << 3) +
> VRB2_VfQmgrIngressAq); }
> > +
> >   enum {UL_4G = 0, UL_5G, DL_4G, DL_5G, FFT, MLD, NUM_ACC};
> >
> >   /* Return the accelerator enum for a Queue Group Index. */ @@ -197,7
> > +206,7 @@ fetch_acc_config(struct rte_bbdev *dev)
> >   	struct acc_device *d = dev->data->dev_private;
> >   	struct rte_acc_conf *acc_conf = &d->acc_conf;
> >   	uint8_t acc, qg;
> > -	uint32_t reg_aq, reg_len0, reg_len1, reg0, reg1;
> > +	uint32_t reg_aq, reg_len0, reg_len1, reg_len2, reg_len3, reg0, reg1,
> > +reg2, reg3;
> >   	uint32_t reg_mode, idx;
> >   	struct rte_acc_queue_topology *q_top = NULL;
> >   	int qman_func_id[VRB_NUM_ACCS] = {ACC_ACCMAP_0,
> ACC_ACCMAP_1, @@
> > -219,32 +228,81 @@ fetch_acc_config(struct rte_bbdev *dev)
> >   	acc_conf->num_vf_bundles = 1;
> >   	initQTop(acc_conf);
> >
> > -	reg0 = acc_reg_read(d, d->reg_addr->qman_group_func);
> > -	reg1 = acc_reg_read(d, d->reg_addr->qman_group_func + 4);
> > -	for (qg = 0; qg < d->num_qgroups; qg++) {
> > -		reg_aq = acc_reg_read(d, d->queue_offset(d->pf_device, 0,
> qg, 0));
> > -		if (reg_aq & ACC_QUEUE_ENABLE) {
> > -			if (qg < ACC_NUM_QGRPS_PER_WORD)
> > -				idx = (reg0 >> (qg * 4)) & 0x7;
> > +	if (d->device_variant == VRB1_VARIANT) {
> > +		reg0 = acc_reg_read(d, d->reg_addr->qman_group_func);
> > +		reg1 = acc_reg_read(d, d->reg_addr->qman_group_func + 4);
> > +		for (qg = 0; qg < d->num_qgroups; qg++) {
> > +			reg_aq = acc_reg_read(d, d->queue_offset(d-
> >pf_device, 0, qg, 0));
> > +			if (reg_aq & ACC_QUEUE_ENABLE) {
> > +				if (qg < ACC_NUM_QGRPS_PER_WORD)
> > +					idx = (reg0 >> (qg * 4)) & 0x7;
> > +				else
> > +					idx = (reg1 >> ((qg -
> ACC_NUM_QGRPS_PER_WORD) * 4)) & 0x7;
> > +				if (idx < VRB1_NUM_ACCS) {
> > +					acc = qman_func_id[idx];
> > +					updateQtop(acc, qg, acc_conf, d);
> > +				}
> > +			}
> > +		}
> > +
> > +		/* Check the depth of the AQs. */
> > +		reg_len0 = acc_reg_read(d, d->reg_addr->depth_log0_offset);
> > +		reg_len1 = acc_reg_read(d, d->reg_addr->depth_log1_offset);
> > +		for (acc = 0; acc < NUM_ACC; acc++) {
> > +			qtopFromAcc(&q_top, acc, acc_conf);
> > +			if (q_top->first_qgroup_index <
> ACC_NUM_QGRPS_PER_WORD)
> > +				q_top->aq_depth_log2 =
> > +						(reg_len0 >> (q_top-
> >first_qgroup_index * 4)) & 0xF;
> >   			else
> > -				idx = (reg1 >> ((qg -
> ACC_NUM_QGRPS_PER_WORD) * 4)) & 0x7;
> > -			if (idx < VRB1_NUM_ACCS) {
> > -				acc = qman_func_id[idx];
> > -				updateQtop(acc, qg, acc_conf, d);
> > +				q_top->aq_depth_log2 = (reg_len1 >> ((q_top-
> >first_qgroup_index -
> > +
> 	ACC_NUM_QGRPS_PER_WORD) * 4)) & 0xF;
> > +		}
> > +	} else {
> > +		reg0 = acc_reg_read(d, d->reg_addr->qman_group_func);
> > +		reg1 = acc_reg_read(d, d->reg_addr->qman_group_func + 4);
> > +		reg2 = acc_reg_read(d, d->reg_addr->qman_group_func + 8);
> > +		reg3 = acc_reg_read(d, d->reg_addr->qman_group_func + 12);
> > +		/* printf("Debug Function %08x %08x %08x %08x\n", reg0,
> reg1, reg2, reg3);*/
> > +		for (qg = 0; qg < VRB2_NUM_QGRPS; qg++) {
> > +			reg_aq = acc_reg_read(d, vrb2_queue_offset(d-
> >pf_device, 0, qg, 0));
> > +			if (reg_aq & ACC_QUEUE_ENABLE) {
> > +				/* printf("Qg enabled %d %x\n", qg,
> reg_aq);*/
> > +				if (qg / ACC_NUM_QGRPS_PER_WORD == 0)
> > +					idx = (reg0 >> ((qg %
> ACC_NUM_QGRPS_PER_WORD) * 4)) & 0x7;
> > +				else if (qg / ACC_NUM_QGRPS_PER_WORD ==
> 1)
> > +					idx = (reg1 >> ((qg %
> ACC_NUM_QGRPS_PER_WORD) * 4)) & 0x7;
> > +				else if (qg / ACC_NUM_QGRPS_PER_WORD ==
> 2)
> > +					idx = (reg2 >> ((qg %
> ACC_NUM_QGRPS_PER_WORD) * 4)) & 0x7;
> > +				else
> > +					idx = (reg3 >> ((qg %
> ACC_NUM_QGRPS_PER_WORD) * 4)) & 0x7;
> > +				if (idx < VRB_NUM_ACCS) {
> > +					acc = qman_func_id[idx];
> > +					updateQtop(acc, qg, acc_conf, d);
> > +				}
> >   			}
> >   		}
> > -	}
> >
> > -	/* Check the depth of the AQs. */
> > -	reg_len0 = acc_reg_read(d, d->reg_addr->depth_log0_offset);
> > -	reg_len1 = acc_reg_read(d, d->reg_addr->depth_log1_offset);
> > -	for (acc = 0; acc < NUM_ACC; acc++) {
> > -		qtopFromAcc(&q_top, acc, acc_conf);
> > -		if (q_top->first_qgroup_index <
> ACC_NUM_QGRPS_PER_WORD)
> > -			q_top->aq_depth_log2 = (reg_len0 >> (q_top-
> >first_qgroup_index * 4)) & 0xF;
> > -		else
> > -			q_top->aq_depth_log2 = (reg_len1 >> ((q_top-
> >first_qgroup_index -
> > -					ACC_NUM_QGRPS_PER_WORD) * 4))
> & 0xF;
> > +		/* Check the depth of the AQs. */
> > +		reg_len0 = acc_reg_read(d, d->reg_addr->depth_log0_offset);
> > +		reg_len1 = acc_reg_read(d, d->reg_addr->depth_log0_offset +
> 4);
> > +		reg_len2 = acc_reg_read(d, d->reg_addr->depth_log0_offset +
> 8);
> > +		reg_len3 = acc_reg_read(d, d->reg_addr->depth_log0_offset +
> 12);
> > +
> > +		for (acc = 0; acc < NUM_ACC; acc++) {
> > +			qtopFromAcc(&q_top, acc, acc_conf);
> > +			if (q_top->first_qgroup_index /
> ACC_NUM_QGRPS_PER_WORD == 0)
> > +				q_top->aq_depth_log2 = (reg_len0 >> ((q_top-
> >first_qgroup_index %
> > +
> 	ACC_NUM_QGRPS_PER_WORD) * 4)) & 0xF;
> > +			else if (q_top->first_qgroup_index /
> ACC_NUM_QGRPS_PER_WORD == 1)
> > +				q_top->aq_depth_log2 = (reg_len1 >> ((q_top-
> >first_qgroup_index %
> > +
> 	ACC_NUM_QGRPS_PER_WORD) * 4)) & 0xF;
> > +			else if (q_top->first_qgroup_index /
> ACC_NUM_QGRPS_PER_WORD == 2)
> > +				q_top->aq_depth_log2 = (reg_len2 >> ((q_top-
> >first_qgroup_index %
> > +
> 	ACC_NUM_QGRPS_PER_WORD) * 4)) & 0xF;
> > +			else
> > +				q_top->aq_depth_log2 = (reg_len3 >> ((q_top-
> >first_qgroup_index %
> > +
> 	ACC_NUM_QGRPS_PER_WORD) * 4)) & 0xF;
> > +		}
> >   	}
> >
> >   	/* Read PF mode. */
> > @@ -341,18 +399,29 @@ vrb_check_ir(struct acc_device *acc_dev)
> >   	ring_data = acc_dev->info_ring + (acc_dev->info_ring_head &
> > ACC_INFO_RING_MASK);
> >
> >   	while (ring_data->valid) {
> > -		if ((ring_data->int_nb < ACC_PF_INT_DMA_DL_DESC_IRQ) || (
> > -				ring_data->int_nb >
> ACC_PF_INT_DMA_MLD_DESC_IRQ)) {
> > -			rte_bbdev_log(WARNING, "InfoRing: ITR:%d
> Info:0x%x",
> > -					ring_data->int_nb, ring_data-
> >detailed_info);
> > -			/* Initialize Info Ring entry and move forward. */
> > -			ring_data->val = 0;
> > +		if (acc_dev->device_variant == VRB1_VARIANT) {
> > +			if ((ring_data->int_nb <
> ACC_PF_INT_DMA_DL_DESC_IRQ) || (
> > +					ring_data->int_nb >
> ACC_PF_INT_DMA_MLD_DESC_IRQ)) {
> > +				rte_bbdev_log(WARNING, "InfoRing: ITR:%d
> Info:0x%x",
> > +						ring_data->int_nb, ring_data-
> >detailed_info);
> > +				/* Initialize Info Ring entry and move forward.
> */
> > +				ring_data->val = 0;
> > +			}
> > +		} else { /* VRB2_VARIANT */
> > +			if ((ring_data->int_nb_vrb2 <
> ACC_PF_INT_DMA_DL_DESC_IRQ) || (
> > +					ring_data->int_nb_vrb2 >
> ACC_PF_INT_DMA_MLD_DESC_IRQ)) {
> > +				rte_bbdev_log(WARNING, "InfoRing: ITR:%d
> Info:0x%x",
> > +						ring_data->int_nb_vrb2,
> ring_data->val);
> > +				/* Initialize Info Ring entry and move forward.
> */
> > +				ring_data->val = 0;
> > +			}
> 
> Same as get_queue_id_from_ring_info().
> Having a callback to get the different parameters (int_nb, vf_id, ...)
> implemented by each variant would avoid duplication like above.
> 
> It would make it more easier to add new variants, i.e. just implement the
> callback without needing to patch every function making use of these info.

ok, fair enough.


> 
> >   		}
> >   		info_ring_head++;
> >   		ring_data = acc_dev->info_ring + (info_ring_head &
> ACC_INFO_RING_MASK);
> >   	}
> >   }
> >
> > +
> 
> Remove new line.

ok

> 
> >   /* Interrupt handler triggered by dev for handling specific interrupt. */
> >   static void
> >   vrb_dev_interrupt_handler(void *cb_arg) @@ -361,16 +430,22 @@
> > vrb_dev_interrupt_handler(void *cb_arg)
> >   	struct acc_device *acc_dev = dev->data->dev_private;
> >   	volatile union acc_info_ring_data *ring_data;
> >   	struct acc_deq_intr_details deq_intr_det;
> > +	uint16_t vf_id, aq_id, qg_id, int_nb;
> > +	bool isVrb1 = (acc_dev->device_variant == VRB1_VARIANT);
> >
> >   	ring_data = acc_dev->info_ring + (acc_dev->info_ring_head &
> > ACC_INFO_RING_MASK);
> >
> >   	while (ring_data->valid) {
> > +		vf_id = isVrb1 ? ring_data->vf_id : ring_data->vf_id_vrb2;
> > +		aq_id = isVrb1 ? ring_data->aq_id : ring_data->aq_id_vrb2;
> > +		qg_id = isVrb1 ? ring_data->qg_id : ring_data->qg_id_vrb2; > +
> 		int_nb = isVrb1 ? ring_data->int_nb : ring_data->int_nb_vrb2;
> 
> Same comment as above.

ok

> 
> >   		if (acc_dev->pf_device) {
> >   			rte_bbdev_log_debug(
> > -					"VRB1 PF Interrupt received, Info Ring
> data: 0x%x -> %d",
> > -					ring_data->val, ring_data->int_nb);
> > +					"PF Interrupt received, Info Ring data:
> 0x%x -> %d",
> > +					ring_data->val, int_nb);
> >
> > -			switch (ring_data->int_nb) {
> > +			switch (int_nb) {
> >   			case ACC_PF_INT_DMA_DL_DESC_IRQ:
> >   			case ACC_PF_INT_DMA_UL_DESC_IRQ:
> >   			case ACC_PF_INT_DMA_FFT_DESC_IRQ:
> > @@ -378,13 +453,11 @@ vrb_dev_interrupt_handler(void *cb_arg)
> >   			case ACC_PF_INT_DMA_DL5G_DESC_IRQ:
> >   			case ACC_PF_INT_DMA_MLD_DESC_IRQ:
> >   				deq_intr_det.queue_id =
> get_queue_id_from_ring_info(
> > -						dev->data, *ring_data);
> > +						dev->data, *ring_data,
> acc_dev->device_variant);
> >   				if (deq_intr_det.queue_id == UINT16_MAX) {
> >   					rte_bbdev_log(ERR,
> >   							"Couldn't find queue:
> aq_id: %u, qg_id: %u, vf_id: %u",
> > -							ring_data->aq_id,
> > -							ring_data->qg_id,
> > -							ring_data->vf_id);
> > +							aq_id, qg_id, vf_id);
> >   					return;
> >   				}
> >   				rte_bbdev_pmd_callback_process(dev,
> > @@ -396,9 +469,9 @@ vrb_dev_interrupt_handler(void *cb_arg)
> >   			}
> >   		} else {
> >   			rte_bbdev_log_debug(
> > -					"VRB1 VF Interrupt received, Info Ring
> data: 0x%x\n",
> > +					"VRB VF Interrupt received, Info Ring
> data: 0x%x\n",
> >   					ring_data->val);
> > -			switch (ring_data->int_nb) {
> > +			switch (int_nb) {
> >   			case ACC_VF_INT_DMA_DL_DESC_IRQ:
> >   			case ACC_VF_INT_DMA_UL_DESC_IRQ:
> >   			case ACC_VF_INT_DMA_FFT_DESC_IRQ:
> > @@ -406,14 +479,16 @@ vrb_dev_interrupt_handler(void *cb_arg)
> >   			case ACC_VF_INT_DMA_DL5G_DESC_IRQ:
> >   			case ACC_VF_INT_DMA_MLD_DESC_IRQ:
> >   				/* VFs are not aware of their vf_id - it's set to
> 0.  */
> > -				ring_data->vf_id = 0;
> > +				if (acc_dev->device_variant ==
> VRB1_VARIANT)
> > +					ring_data->vf_id = 0;
> > +				else
> > +					ring_data->vf_id_vrb2 = 0;
> 
> Ditto, you also need a way to set these fields in a generic way.

ok

> 
> >   				deq_intr_det.queue_id =
> get_queue_id_from_ring_info(
> > -						dev->data, *ring_data);
> > +						dev->data, *ring_data,
> acc_dev->device_variant);
> >   				if (deq_intr_det.queue_id == UINT16_MAX) {
> >   					rte_bbdev_log(ERR,
> >   							"Couldn't find queue:
> aq_id: %u, qg_id: %u",
> > -							ring_data->aq_id,
> > -							ring_data->qg_id);
> > +							aq_id, qg_id);
> >   					return;
> >   				}
> >   				rte_bbdev_pmd_callback_process(dev,
> > @@ -428,8 +503,7 @@ vrb_dev_interrupt_handler(void *cb_arg)
> >   		/* Initialize Info Ring entry and move forward. */
> >   		ring_data->val = 0;
> >   		++acc_dev->info_ring_head;
> > -		ring_data = acc_dev->info_ring +
> > -				(acc_dev->info_ring_head &
> ACC_INFO_RING_MASK);
> > +		ring_data = acc_dev->info_ring + (acc_dev->info_ring_head &
> > +ACC_INFO_RING_MASK);
> >   	}
> >   }
> >
> > @@ -461,7 +535,10 @@ allocate_info_ring(struct rte_bbdev *dev)
> >   	phys_low  = (uint32_t)(info_ring_iova);
> >   	acc_reg_write(d, d->reg_addr->info_ring_hi, phys_high);
> >   	acc_reg_write(d, d->reg_addr->info_ring_lo, phys_low);
> > -	acc_reg_write(d, d->reg_addr->info_ring_en,
> VRB1_REG_IRQ_EN_ALL);
> > +	if (d->device_variant == VRB1_VARIANT)
> > +		acc_reg_write(d, d->reg_addr->info_ring_en,
> VRB1_REG_IRQ_EN_ALL);
> > +	else
> > +		acc_reg_write(d, d->reg_addr->info_ring_en,
> VRB2_REG_IRQ_EN_ALL);
> 
> My guess is the mask value difference between the two is the VRB2 added
> MLD-TS related interrupts, is it?
> 
> If so, it would be more future proof to configure this based on the device
> capabilities instead of its variant?

There are some differences in these implementations, best to to keep them separate. 

> 
> >   	d->info_ring_head = (acc_reg_read(d, d->reg_addr->info_ring_ptr) &
> >   			0xFFF) / sizeof(union acc_info_ring_data);
> >   	return 0;
> > @@ -516,6 +593,7 @@ vrb_setup_queues(struct rte_bbdev *dev, uint16_t
> num_queues, int socket_id)
> >   	phys_high = (uint32_t)(d->sw_rings_iova >> 32);
> >   	phys_low  = (uint32_t)(d->sw_rings_iova & ~(ACC_SIZE_64MBYTE-1));
> >
> > +
> 
> Remove new line.

ok

> 
> >   	/* Read the populated cfg from device registers. */
> >   	fetch_acc_config(dev);
> >
> > @@ -540,6 +618,10 @@ vrb_setup_queues(struct rte_bbdev *dev, uint16_t
> num_queues, int socket_id)
> >   	acc_reg_write(d, d->reg_addr->dma_ring_dl4g_lo, phys_low);
> >   	acc_reg_write(d, d->reg_addr->dma_ring_fft_hi, phys_high);
> >   	acc_reg_write(d, d->reg_addr->dma_ring_fft_lo, phys_low);
> > +	if (d->device_variant == VRB2_VARIANT) {
> > +		acc_reg_write(d, d->reg_addr->dma_ring_mld_hi, phys_high);
> > +		acc_reg_write(d, d->reg_addr->dma_ring_mld_lo, phys_low);
> 
> Same, wouldn't it be more future-proof to set this based on the capabilities of
> the device and not its variant?

Checking capability on the fly would be cumbersome I think.
I dont see problem related to next gen extension for this. 
Thanks

> 
> > +	}
> >   	/*
> >   	 * Configure Ring Size to the max queue ring size
> >   	 * (used for wrapping purpose).
> > @@ -549,8 +631,7 @@ vrb_setup_queues(struct rte_bbdev *dev, uint16_t
> > num_queues, int socket_id)
> >
> >   	/* Configure tail pointer for use when SDONE enabled. */
> >   	if (d->tail_ptrs == NULL)
> > -		d->tail_ptrs = rte_zmalloc_socket(
> > -				dev->device->driver->name,
> > +		d->tail_ptrs = rte_zmalloc_socket(dev->device->driver->name,
> >   				VRB_MAX_QGRPS * VRB_MAX_AQS *
> sizeof(uint32_t),
> >   				RTE_CACHE_LINE_SIZE, socket_id);
> >   	if (d->tail_ptrs == NULL) {
> > @@ -574,6 +655,10 @@ vrb_setup_queues(struct rte_bbdev *dev, uint16_t
> num_queues, int socket_id)
> >   	acc_reg_write(d, d->reg_addr->tail_ptrs_dl4g_lo, phys_low);
> >   	acc_reg_write(d, d->reg_addr->tail_ptrs_fft_hi, phys_high);
> >   	acc_reg_write(d, d->reg_addr->tail_ptrs_fft_lo, phys_low);
> > +	if (d->device_variant == VRB2_VARIANT) {
> > +		acc_reg_write(d, d->reg_addr->tail_ptrs_mld_hi, phys_high);
> > +		acc_reg_write(d, d->reg_addr->tail_ptrs_mld_lo, phys_low);
> > +	}
> 
> Ditto.
> 
> >
> >   	ret = allocate_info_ring(dev);
> >   	if (ret < 0) {
> > @@ -671,10 +756,17 @@ vrb_intr_enable(struct rte_bbdev *dev)
> >   			return ret;
> >   		}
> >
> > -		if (acc_dev->pf_device)
> > -			max_queues = VRB1_MAX_PF_MSIX;
> > -		else
> > -			max_queues = VRB1_MAX_VF_MSIX;
> > +		if (d->device_variant == VRB1_VARIANT) {
> > +			if (acc_dev->pf_device)
> > +				max_queues = VRB1_MAX_PF_MSIX;
> > +			else
> > +				max_queues = VRB1_MAX_VF_MSIX;
> > +		} else {
> > +			if (acc_dev->pf_device)
> > +				max_queues = VRB2_MAX_PF_MSIX;
> > +			else
> > +				max_queues = VRB2_MAX_VF_MSIX;
> > +		}
> >
> >   		if (rte_intr_efd_enable(dev->intr_handle, max_queues)) {
> >   			rte_bbdev_log(ERR, "Failed to create fds for %u
> queues", @@
> > -776,7 +868,10 @@ vrb_find_free_queue_idx(struct rte_bbdev *dev,
> >   			/* Mark the Queue as assigned. */
> >   			d->q_assigned_bit_map[group_idx] |= (1ULL <<
> aq_idx);
> >   			/* Report the AQ Index. */
> > -			return (group_idx << VRB1_GRP_ID_SHIFT) + aq_idx;
> > +			if (d->device_variant == VRB1_VARIANT)
> > +				return (group_idx << VRB1_GRP_ID_SHIFT) +
> aq_idx;
> > +			else
> > +				return (group_idx << VRB2_GRP_ID_SHIFT) +
> aq_idx;
> 
> This is the same bitmap as in acc_info_ring_data, isn't it?
> We could maybe share some code to to the decoding as suggested ealier in
> this patch.

you mean just to add an inline function for the grp_id_shift(variant) right? 
fair enough


> 
> >   		}
> >   	}
> >   	rte_bbdev_log(INFO, "Failed to find free queue on %s, priority %u",
> > @@ -819,6 +914,9 @@ vrb_queue_setup(struct rte_bbdev *dev, uint16_t
> queue_id,
> >   			ACC_FCW_LD_BLEN : (conf->op_type ==
> RTE_BBDEV_OP_FFT ?
> >   			ACC_FCW_FFT_BLEN : ACC_FCW_MLDTS_BLEN))));
> >
> > +	if ((q->d->device_variant == VRB2_VARIANT) && (conf->op_type ==
> RTE_BBDEV_OP_FFT))
> > +		fcw_len = ACC_FCW_FFT_BLEN_3;
> > +
> >   	for (desc_idx = 0; desc_idx < d->sw_ring_max_depth; desc_idx++) {
> >   		desc = q->ring_addr + desc_idx;
> >   		desc->req.word0 = ACC_DMA_DESC_TYPE; @@ -915,9
> +1013,16 @@
> > vrb_queue_setup(struct rte_bbdev *dev, uint16_t queue_id,
> >   		}
> >   	}
> >
> > -	q->qgrp_id = (q_idx >> VRB1_GRP_ID_SHIFT) & 0xF;
> > -	q->vf_id = (q_idx >> VRB1_VF_ID_SHIFT)  & 0x3F;
> > -	q->aq_id = q_idx & 0xF;
> > +	if (d->device_variant == VRB1_VARIANT) {
> > +		q->qgrp_id = (q_idx >> VRB1_GRP_ID_SHIFT) & 0xF;
> > +		q->vf_id = (q_idx >> VRB1_VF_ID_SHIFT)  & 0x3F;
> > +		q->aq_id = q_idx & 0xF;
> > +	} else {
> > +		q->qgrp_id = (q_idx >> VRB2_GRP_ID_SHIFT) & 0x1F;
> > +		q->vf_id = (q_idx >> VRB2_VF_ID_SHIFT)  & 0x3F;
> > +		q->aq_id = q_idx & 0x3F;
> > +	}
> > +
> 
> Same remark as before.

yes, still valid

> 
> >   	q->aq_depth = 0;
> >   	if (conf->op_type ==  RTE_BBDEV_OP_TURBO_DEC)
> >   		q->aq_depth = (1 << d->acc_conf.q_ul_4g.aq_depth_log2);
> > @@ -1149,6 +1254,127 @@ vrb_dev_info_get(struct rte_bbdev *dev, struct
> rte_bbdev_driver_info *dev_info)
> >   		RTE_BBDEV_END_OF_CAPABILITIES_LIST()
> >   	};
> >
> > +	static const struct rte_bbdev_op_cap vrb2_bbdev_capabilities[] = {
> > +		{
> > +			.type = RTE_BBDEV_OP_TURBO_DEC,
> > +			.cap.turbo_dec = {
> > +				.capability_flags =
> > +
> 	RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE |
> > +					RTE_BBDEV_TURBO_CRC_TYPE_24B |
> > +
> 	RTE_BBDEV_TURBO_DEC_CRC_24B_DROP |
> > +					RTE_BBDEV_TURBO_EQUALIZER |
> > +
> 	RTE_BBDEV_TURBO_SOFT_OUT_SATURATE |
> > +
> 	RTE_BBDEV_TURBO_HALF_ITERATION_EVEN |
> > +
> 	RTE_BBDEV_TURBO_CONTINUE_CRC_MATCH |
> > +					RTE_BBDEV_TURBO_SOFT_OUTPUT |
> > +
> 	RTE_BBDEV_TURBO_EARLY_TERMINATION |
> > +
> 	RTE_BBDEV_TURBO_DEC_INTERRUPTS |
> > +
> 	RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN |
> > +
> 	RTE_BBDEV_TURBO_NEG_LLR_1_BIT_SOFT_OUT |
> > +					RTE_BBDEV_TURBO_MAP_DEC |
> > +
> 	RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP |
> > +
> 	RTE_BBDEV_TURBO_DEC_SCATTER_GATHER,
> > +				.max_llr_modulus = INT8_MAX,
> > +				.num_buffers_src =
> > +
> 	RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
> > +				.num_buffers_hard_out =
> > +
> 	RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
> > +				.num_buffers_soft_out =
> > +
> 	RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
> > +			}
> > +		},
> > +		{
> > +			.type = RTE_BBDEV_OP_TURBO_ENC,
> > +			.cap.turbo_enc = {
> > +				.capability_flags =
> > +
> 	RTE_BBDEV_TURBO_CRC_24B_ATTACH |
> > +
> 	RTE_BBDEV_TURBO_RV_INDEX_BYPASS |
> > +					RTE_BBDEV_TURBO_RATE_MATCH |
> > +
> 	RTE_BBDEV_TURBO_ENC_INTERRUPTS |
> > +
> 	RTE_BBDEV_TURBO_ENC_SCATTER_GATHER,
> > +				.num_buffers_src =
> > +
> 	RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
> > +				.num_buffers_dst =
> > +
> 	RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
> > +			}
> > +		},
> > +		{
> > +			.type   = RTE_BBDEV_OP_LDPC_ENC,
> > +			.cap.ldpc_enc = {
> > +				.capability_flags =
> > +					RTE_BBDEV_LDPC_RATE_MATCH |
> > +					RTE_BBDEV_LDPC_CRC_24B_ATTACH
> |
> > +
> 	RTE_BBDEV_LDPC_INTERLEAVER_BYPASS |
> > +					RTE_BBDEV_LDPC_ENC_INTERRUPTS
> |
> > +
> 	RTE_BBDEV_LDPC_ENC_SCATTER_GATHER |
> > +
> 	RTE_BBDEV_LDPC_ENC_CONCATENATION,
> > +				.num_buffers_src =
> > +
> 	RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> > +				.num_buffers_dst =
> > +
> 	RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> > +			}
> > +		},
> > +		{
> > +			.type   = RTE_BBDEV_OP_LDPC_DEC,
> > +			.cap.ldpc_dec = {
> > +			.capability_flags =
> > +				RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK |
> > +				RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP |
> > +				RTE_BBDEV_LDPC_CRC_TYPE_24A_CHECK |
> > +				RTE_BBDEV_LDPC_CRC_TYPE_16_CHECK |
> > +				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE
> |
> > +
> 	RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
> > +				RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE
> |
> > +				RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS |
> > +				RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
> > +
> 	RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
> > +
> 	RTE_BBDEV_LDPC_HARQ_4BIT_COMPRESSION |
> > +				RTE_BBDEV_LDPC_LLR_COMPRESSION |
> > +				RTE_BBDEV_LDPC_SOFT_OUT_ENABLE |
> > +				RTE_BBDEV_LDPC_SOFT_OUT_RM_BYPASS |
> > +
> 	RTE_BBDEV_LDPC_SOFT_OUT_DEINTERLEAVER_BYPASS |
> > +				RTE_BBDEV_LDPC_DEC_INTERRUPTS,
> > +			.llr_size = 8,
> > +			.llr_decimals = 2,
> > +			.num_buffers_src =
> > +
> 	RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> > +			.num_buffers_hard_out =
> > +
> 	RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> > +			.num_buffers_soft_out = 0,
> > +			}
> > +		},
> > +		{
> > +			.type	= RTE_BBDEV_OP_FFT,
> > +			.cap.fft = {
> > +				.capability_flags =
> > +
> 	RTE_BBDEV_FFT_WINDOWING |
> > +
> 	RTE_BBDEV_FFT_CS_ADJUSTMENT |
> > +
> 	RTE_BBDEV_FFT_DFT_BYPASS |
> > +
> 	RTE_BBDEV_FFT_IDFT_BYPASS |
> > +						RTE_BBDEV_FFT_FP16_INPUT
> |
> > +
> 	RTE_BBDEV_FFT_FP16_OUTPUT |
> > +
> 	RTE_BBDEV_FFT_POWER_MEAS |
> > +
> 	RTE_BBDEV_FFT_WINDOWING_BYPASS,
> > +				.num_buffers_src =
> > +
> 	RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> > +				.num_buffers_dst =
> > +
> 	RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> 
> Is that a copy/pasta error or is it expected to have LDPC defines in FFT
> capabilities?

I need to change that, this is not quite correct actually even previously.
Probably through other serie though. No one uses this currently.

> 
> > +			}
> > +		},
> > +		{
> > +			.type	= RTE_BBDEV_OP_MLDTS,
> > +			.cap.fft = {
> > +				.capability_flags =
> > +						RTE_BBDEV_MLDTS_REP,
> > +				.num_buffers_src =
> > +
> 	RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> > +				.num_buffers_dst =
> > +
> 	RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> 
> Same question here, that looks suspicious but maybe I'm wrong.
> 
> > +			}
> > +		},
> > +		RTE_BBDEV_END_OF_CAPABILITIES_LIST()
> > +	};
> > +
> >   	static struct rte_bbdev_queue_conf default_queue_conf;
> >   	default_queue_conf.socket = dev->data->socket_id;
> >   	default_queue_conf.queue_size = ACC_MAX_QUEUE_DEPTH; @@ -
> 1193,7
> > +1419,10 @@ vrb_dev_info_get(struct rte_bbdev *dev, struct
> rte_bbdev_driver_info *dev_info)
> >   	dev_info->default_queue_conf = default_queue_conf;
> >   	dev_info->cpu_flag_reqs = NULL;
> >   	dev_info->min_alignment = 1;
> > -	dev_info->capabilities = vrb1_bbdev_capabilities;
> > +	if (d->device_variant == VRB1_VARIANT)
> > +		dev_info->capabilities = vrb1_bbdev_capabilities;
> > +	else
> > +		dev_info->capabilities = vrb2_bbdev_capabilities;
> >   	dev_info->harq_buffer_size = 0;
> >
> >   	vrb_check_ir(d);
> > @@ -1242,6 +1471,9 @@ static struct rte_pci_id pci_id_vrb_pf_map[] = {
> >   	{
> >   		RTE_PCI_DEVICE(RTE_VRB1_VENDOR_ID,
> RTE_VRB1_PF_DEVICE_ID)
> >   	},
> > +	{
> > +		RTE_PCI_DEVICE(RTE_VRB2_VENDOR_ID,
> RTE_VRB2_PF_DEVICE_ID)
> > +	},
> >   	{.device_id = 0},
> >   };
> >
> > @@ -1250,6 +1482,9 @@ static struct rte_pci_id pci_id_vrb_vf_map[] = {
> >   	{
> >   		RTE_PCI_DEVICE(RTE_VRB1_VENDOR_ID,
> RTE_VRB1_VF_DEVICE_ID)
> >   	},
> > +	{
> > +		RTE_PCI_DEVICE(RTE_VRB2_VENDOR_ID,
> RTE_VRB2_VF_DEVICE_ID)
> > +	},
> >   	{.device_id = 0},
> >   };
> >
> 
> I'm stopping the review here for now.

Thanks expecting a v3 this week based on your suggestions. 

> 
> Thanks,
> Maxime



More information about the dev mailing list