[dpdk-dev] [PATCH v1 00/12] mldev: introduce machine learning device library

Thomas Monjalon thomas at monjalon.net
Thu Jan 26 12:11:53 CET 2023


25/01/2023 20:01, Jerin Jacob:
> On Wed, Jan 25, 2023 at 7:50 PM Thomas Monjalon <thomas at monjalon.net> wrote:
> > 14/11/2022 13:02, jerinj at marvell.com:
> > > ML Model: An ML model is an algorithm trained over a dataset. A model consists of
> > > the procedure/algorithm and the data/pattern required to make predictions on live data.
> > > Once the model is created and trained outside of the DPDK scope, it can be loaded
> > > via rte_ml_model_load() and then started using the rte_ml_model_start() API.
> > > rte_ml_model_params_update() can be used to update model parameters such as weights
> > > and biases without unloading the model via rte_ml_model_unload().
> >
> > The fact that the model is prepared outside of DPDK means the model format is free
> > and probably different per mldev driver.
> > I think that is OK, but it requires a lot of documentation effort to explain
> > how to bind the model and its parameters to the DPDK API.
> > Also, we may need to pass some metadata from the model builder
> > to the inference engine in order to enable optimizations prepared in the model.
> > And the other way around, we may need inference capabilities in order to generate
> > an optimized model which can run in the inference engine.
> 
> The base API specification is kept to an absolute minimum. Currently, the weight
> and bias parameters are updated through rte_ml_model_params_update(). It can be
> extended when there are drivers that support it, or if there is a specific
> parameter you would like to add to rte_ml_model_params_update().

This function is
int rte_ml_model_params_update(int16_t dev_id, int16_t model_id, void *buffer);

How are we supposed to provide separate parameters in this void * buffer?
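
For illustration, with the current signature the layout of that buffer is
entirely driver-defined, so an application would have to do something like
the sketch below. The struct here is purely hypothetical (nothing in the
patch defines it), and dev_id/model_id are assumed to be already set up:

/* Hypothetical driver-specific layout -- not defined by the mldev API. */
struct drvX_params {
        float weights[128];     /* new weights, size known a priori */
        float bias;             /* new bias */
};

struct drvX_params p;
/* ... fill p.weights and p.bias ... */
int ret = rte_ml_model_params_update(dev_id, model_id, &p);

If each driver expects a different layout, portability of this call is limited.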


> Other metadata, like batch size, shapes and formats, can be queried using rte_ml_io_info().

Copying:
+/** Input and output data information structure
+ *
+ * Specifies the type and shape of input and output data.
+ */
+struct rte_ml_io_info {
+       char name[RTE_ML_STR_MAX];
+       /**< Name of data */
+       struct rte_ml_io_shape shape;
+       /**< Shape of data */
+       enum rte_ml_io_type qtype;
+       /**< Type of quantized data */
+       enum rte_ml_io_type dtype;
+       /**< Type of de-quantized data */
+};

Is it the right place to notify the app that some model optimizations
are supported? (example: merge some operations in the graph)
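
For reference, a minimal sketch of how an application could consume this
structure, assuming it has already obtained an array of rte_ml_io_info
entries via rte_ml_io_info() (whose exact signature is not shown here);
the fields of struct rte_ml_io_shape are not shown either, so only the
types are printed:

static void
dump_io_info(const struct rte_ml_io_info *info, uint16_t nb)
{
        uint16_t i;

        for (i = 0; i < nb; i++)
                printf("%s: qtype=%d dtype=%d\n",
                       info[i].name, info[i].qtype, info[i].dtype);
}

Nothing in this structure seems to carry optimization hints, hence the
question above.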

> > [...]
> > > A typical application will use the ML API in the following
> > > programming flow.
> > >
> > > - rte_ml_dev_configure()
> > > - rte_ml_dev_queue_pair_setup()
> > > - rte_ml_model_load()
> > > - rte_ml_model_start()
> > > - rte_ml_model_info()
> > > - rte_ml_dev_start()
> > > - rte_ml_enqueue_burst()
> > > - rte_ml_dequeue_burst()
> > > - rte_ml_model_stop()
> > > - rte_ml_model_unload()
> > > - rte_ml_dev_stop()
> > > - rte_ml_dev_close()
> >
> > Where does parameter update fit in this flow?
> 
> Added the mandatory APIs in the top-level flow doc.
> rte_ml_model_params_update() is used to update the parameters.

The question is "where" it should be done:
before or after start?
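
If the answer is "while the device is stopped", as the reply below suggests,
I would expect a sketch like this (new_params being an opaque buffer, as
discussed above):

rte_ml_dev_stop(dev_id);
rte_ml_model_params_update(dev_id, model_id, new_params);
rte_ml_dev_start(dev_id);

This should be spelled out in the programming flow documentation.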

> > Should we update all parameters at once, or can it be done in a more fine-grained way?
> 
> Currently, rte_ml_model_params_update() can be used to update weights
> and biases via a buffer when the device is in the stopped state,
> without unloading the model.

The question is "can we update a single parameter"?
And how?
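
For example, a fine-grained variant could look like the following, which is
purely hypothetical and not part of the patch (param_name identifying a
single tensor of the model):

/* Hypothetical fine-grained update -- not in the proposed API. */
int rte_ml_model_param_update(int16_t dev_id, int16_t model_id,
                              const char *param_name,
                              void *buffer, size_t size);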

> > Question about the memory used by mldev:
> > Can we manage where the memory is allocated (host, device, mix, etc)?
> 
> We are just passing buffer pointers now, like other subsystems do.
> Other EAL infra services can take care of memory locality, as it
> is not specific to mldev.

I was thinking about the memory allocations required by the inference engine.
How do we specify where to allocate? Is it just hardcoded in the driver?
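
For host-side buffers, EAL already lets the application pin allocations to a
NUMA node, e.g. with rte_malloc_socket() (len and socket_id assumed here):

/* Host buffer placed on a specific NUMA node via existing EAL API. */
void *buf = rte_malloc_socket("ml_io", len, RTE_CACHE_LINE_SIZE, socket_id);

My question is rather about the allocations the inference engine itself
makes (device memory, intermediate buffers, etc.).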



