[v2] doc: announce changes to ethdev rxconf structure

Message ID 1596617395-29271-1-git-send-email-viacheslavo@mellanox.com (mailing list archive)
State Accepted, archived
Delegated to: Thomas Monjalon
Series: [v2] doc: announce changes to ethdev rxconf structure

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation success Compilation OK
ci/travis-robot success Travis build: passed

Commit Message

Slava Ovsiienko Aug. 5, 2020, 8:49 a.m. UTC
  The DPDK datapath in the transmit direction is very flexible.
Applications can build multi-segment packets and manage almost
all data aspects: the memory pools the segments are allocated
from, the segment lengths, and the memory attributes such as
external, registered, etc.

In the receive direction, the datapath is much less flexible:
the application can only specify the memory pool used to
configure the receive queue, and nothing more. A received packet
can only be pushed into a chain of mbufs with the same data
buffer size, all allocated from that single pool. To extend the
receive datapath buffer description, it is proposed to add the
following new fields to the rte_eth_rxconf structure:

struct rte_eth_rxconf {
    ...
    uint16_t rx_split_num; /* number of segments to split */
    uint16_t *rx_split_len; /* array of segment lengths */
    struct rte_mempool **mp; /* array of segment memory pools */
    ...
};

A non-zero value in the rx_split_num field configures the receive
queue to split ingress packets into multiple segments, placed in
mbufs allocated from the specified memory pools according to the
specified lengths. A zero value in rx_split_num preserves backward
compatibility: the queue is configured in the regular way, with
single or multiple mbufs of the same data buffer length allocated
from a single memory pool.
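
To illustrate, a minimal configuration sketch with the proposed fields
is shown below. The field names are the proposed ones and may change
during the design discussion; the pool names and sizes, and passing the
first segment pool as the classic per-queue pool, are illustrative
assumptions only:

/* A sketch only - rx_split_num, rx_split_len and mp are the proposed
 * (not yet existing) rte_eth_rxconf fields. */
static int
setup_split_rx_queue(uint16_t port_id, uint16_t queue_id,
                     uint16_t nb_rxd, int socket_id)
{
    static uint16_t seg_len[2] = { 128, 2048 };   /* split lengths, bytes */
    static struct rte_mempool *seg_pool[2];
    struct rte_eth_dev_info dev_info;
    struct rte_eth_rxconf rxconf;

    /* Headers go to small embedded buffers, payload to larger ones. */
    seg_pool[0] = rte_pktmbuf_pool_create("hdr_pool", 8192, 256, 0,
                    128 + RTE_PKTMBUF_HEADROOM, socket_id);
    seg_pool[1] = rte_pktmbuf_pool_create("pay_pool", 8192, 256, 0,
                    2048 + RTE_PKTMBUF_HEADROOM, socket_id);
    if (seg_pool[0] == NULL || seg_pool[1] == NULL)
        return -rte_errno;

    rte_eth_dev_info_get(port_id, &dev_info);
    rxconf = dev_info.default_rxconf;
    rxconf.rx_split_num = 2;        /* two segments per received packet */
    rxconf.rx_split_len = seg_len;  /* first 128 bytes, then up to 2048 */
    rxconf.mp = seg_pool;           /* per-segment memory pools */

    /* Passing the first segment pool as the legacy per-queue pool is an
     * assumption made for backward compatibility in this sketch. */
    return rte_eth_rx_queue_setup(port_id, queue_id, nb_rxd,
                                  socket_id, &rxconf, seg_pool[0]);
}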

The new approach would allow splitting ingress packets into
multiple parts pushed to memory with different attributes.
For example, the packet headers can be pushed to the embedded
data buffers within mbufs, and the application data to external
buffers attached to mbufs allocated from different memory pools.
The memory attributes of the split parts may differ as well -
for example, the application data may be pushed to external
memory located on a dedicated physical device, say a GPU or an
NVMe drive. This would improve the flexibility of the DPDK
receive datapath while preserving compatibility with the
existing API.
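
For instance, the payload pool could be built over application-provided
external memory with the existing rte_pktmbuf_pool_create_extbuf() API.
The sketch below assumes ext_base/ext_size describe a region already
allocated and DMA-mapped by the application; the IOVA lookup is only a
placeholder, as device memory usually needs its own mapping:

/* Sketch: mbufs from this pool carry data buffers located in the
 * external region instead of the embedded data room. */
struct rte_pktmbuf_extmem ext_mem = {
    .buf_ptr = ext_base,                      /* VA of external region  */
    .buf_iova = rte_mem_virt2iova(ext_base),  /* placeholder IO address */
    .buf_len = ext_size,                      /* region length in bytes */
    .elt_size = 2048 + RTE_PKTMBUF_HEADROOM,  /* buffer size per mbuf   */
};

struct rte_mempool *pay_pool =
    rte_pktmbuf_pool_create_extbuf("pay_pool_ext", 8192, 256, 0,
                                   ext_mem.elt_size, socket_id,
                                   &ext_mem, 1);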

The proposed extended description of receive buffers might also
be adopted by other vendors to support similar features; this is
a subject for further discussion.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Jerin Jacob <jerinjacobk@gmail.com>

---
v1->v2: commit message updated, proposed to consider the new
        fields for supporting similar features by multiple
        vendors
---
 doc/guides/rel_notes/deprecation.rst | 5 +++++
 1 file changed, 5 insertions(+)
  

Comments

Andrew Rybchenko Aug. 5, 2020, 11:14 a.m. UTC | #1
On 8/5/20 11:49 AM, Viacheslav Ovsiienko wrote:
> The DPDK datapath in the transmit direction is very flexible.
> The applications can build multi-segment packets and manages
> almost all data aspects - the memory pools where segments
> are allocated from, the segment lengths, the memory attributes
> like external, registered, etc.
> 
> In the receiving direction, the datapath is much less flexible,
> the applications can only specify the memory pool to configure
> the receiving queue and nothing more. The packet being received
> can only be pushed to the chain of the mbufs of the same data
> buffer size and allocated from the same pool. In order to extend
> the receiving datapath buffer description it is proposed to add
> the new fields into rte_eth_rxconf structure:
> 
> struct rte_eth_rxconf {
>     ...
>     uint16_t rx_split_num; /* number of segments to split */
>     uint16_t *rx_split_len; /* array of segment lengths */
>     struct rte_mempool **mp; /* array of segment memory pools */
>     ...
> };
> 
> The non-zero value of rx_split_num field configures the receiving
> queue to split ingress packets into multiple segments to the mbufs
> allocated from various memory pools according to the specified
> lengths. The zero value of rx_split_num field provides the
> backward compatibility and queue should be configured in a regular
> way (with single/multiple mbufs of the same data buffer length
> allocated from the single memory pool).
> 
> The new approach would allow splitting the ingress packets into
> multiple parts pushed to the memory with different attributes.
> For example, the packet headers can be pushed to the embedded data
> buffers within mbufs and the application data into the external
> buffers attached to mbufs allocated from the different memory
> pools. The memory attributes for the split parts may differ
> either - for example the application data may be pushed into
> the external memory located on the dedicated physical device,
> say GPU or NVMe. This would improve the DPDK receiving datapath
> flexibility preserving compatibility with existing API.
> 
> The proposed extended description of receiving buffers might be
> considered by other vendors to be involved into similar features
> support, it is the subject for the further discussion.
> 
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> Acked-by: Jerin Jacob <jerinjacobk@gmail.com>

I"m OK with the idea in general and we'll work on details
in the next release cycle.

Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
  
Thomas Monjalon Aug. 6, 2020, 12:39 p.m. UTC | #2
05/08/2020 13:14, Andrew Rybchenko:
> On 8/5/20 11:49 AM, Viacheslav Ovsiienko wrote:
> > The DPDK datapath in the transmit direction is very flexible.
> > The applications can build multi-segment packets and manages
> > almost all data aspects - the memory pools where segments
> > are allocated from, the segment lengths, the memory attributes
> > like external, registered, etc.
> > 
> > In the receiving direction, the datapath is much less flexible,
> > the applications can only specify the memory pool to configure
> > the receiving queue and nothing more. The packet being received
> > can only be pushed to the chain of the mbufs of the same data
> > buffer size and allocated from the same pool. In order to extend
> > the receiving datapath buffer description it is proposed to add
> > the new fields into rte_eth_rxconf structure:
> > 
> > struct rte_eth_rxconf {
> >     ...
> >     uint16_t rx_split_num; /* number of segments to split */
> >     uint16_t *rx_split_len; /* array of segment lengths */
> >     struct rte_mempool **mp; /* array of segment memory pools */
> >     ...
> > };
> > 
> > The non-zero value of rx_split_num field configures the receiving
> > queue to split ingress packets into multiple segments to the mbufs
> > allocated from various memory pools according to the specified
> > lengths. The zero value of rx_split_num field provides the
> > backward compatibility and queue should be configured in a regular
> > way (with single/multiple mbufs of the same data buffer length
> > allocated from the single memory pool).
> > 
> > The new approach would allow splitting the ingress packets into
> > multiple parts pushed to the memory with different attributes.
> > For example, the packet headers can be pushed to the embedded data
> > buffers within mbufs and the application data into the external
> > buffers attached to mbufs allocated from the different memory
> > pools. The memory attributes for the split parts may differ
> > either - for example the application data may be pushed into
> > the external memory located on the dedicated physical device,
> > say GPU or NVMe. This would improve the DPDK receiving datapath
> > flexibility preserving compatibility with existing API.
> > 
> > The proposed extended description of receiving buffers might be
> > considered by other vendors to be involved into similar features
> > support, it is the subject for the further discussion.
> > 
> > Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> > Acked-by: Jerin Jacob <jerinjacobk@gmail.com>
> 
> I"m OK with the idea in general and we'll work on details
> in the next release cycle.
> 
> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>

I agree we need to be more flexible with the mempools in Rx.

Acked-by: Thomas Monjalon <thomas@monjalon.net>
  
Ferruh Yigit Aug. 6, 2020, 4:31 p.m. UTC | #3
On 8/5/2020 9:49 AM, Viacheslav Ovsiienko wrote:
> The DPDK datapath in the transmit direction is very flexible.
> The applications can build multi-segment packets and manages
> almost all data aspects - the memory pools where segments
> are allocated from, the segment lengths, the memory attributes
> like external, registered, etc.
> 
> In the receiving direction, the datapath is much less flexible,
> the applications can only specify the memory pool to configure
> the receiving queue and nothing more. The packet being received
> can only be pushed to the chain of the mbufs of the same data
> buffer size and allocated from the same pool. In order to extend
> the receiving datapath buffer description it is proposed to add
> the new fields into rte_eth_rxconf structure:
> 
> struct rte_eth_rxconf {
>     ...
>     uint16_t rx_split_num; /* number of segments to split */
>     uint16_t *rx_split_len; /* array of segment lengths */
>     struct rte_mempool **mp; /* array of segment memory pools */
>     ...
> };

What is the way to say the first 14 bytes will go to the first mempool and the
rest will go to the second one?
Or do you have to define fixed sizes for all segments?
What if that 'rest' part is larger than the given buffer size for that mempool?

Intel NICs also have header split support, similar to what Jerin described:
header and data go to different buffers, which doesn't require fixed sizes and
needs only two mempools. Not sure if it should be integrated into this feature,
but we can discuss that later.

Also, there are some valid concerns Andrew highlighted, like how the application
will know whether the PMD supports this feature, and more.
But since these are design/implementation related concerns, not a blocker for a
deprecation notice I think, overall no objection to the config structure change, hence:

Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>

> 
> The non-zero value of rx_split_num field configures the receiving
> queue to split ingress packets into multiple segments to the mbufs
> allocated from various memory pools according to the specified
> lengths. The zero value of rx_split_num field provides the
> backward compatibility and queue should be configured in a regular
> way (with single/multiple mbufs of the same data buffer length
> allocated from the single memory pool).
> 
> The new approach would allow splitting the ingress packets into
> multiple parts pushed to the memory with different attributes.
> For example, the packet headers can be pushed to the embedded data
> buffers within mbufs and the application data into the external
> buffers attached to mbufs allocated from the different memory
> pools. The memory attributes for the split parts may differ
> either - for example the application data may be pushed into
> the external memory located on the dedicated physical device,
> say GPU or NVMe. This would improve the DPDK receiving datapath
> flexibility preserving compatibility with existing API.
> 
> The proposed extended description of receiving buffers might be
> considered by other vendors to be involved into similar features
> support, it is the subject for the further discussion.
> 
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> Acked-by: Jerin Jacob <jerinjacobk@gmail.com>
> 
> ---
> v1->v2: commit message updated, proposed to consider the new
>         fields for supporting similar features by multiple
> 	vendors
> ---
>  doc/guides/rel_notes/deprecation.rst | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
> index acf87d3..b6bdb83 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -99,6 +99,11 @@ Deprecation Notices
>    In 19.11 PMDs will still update the field even when the offload is not
>    enabled.
>  
> +* ethdev: add new fields to ``rte_eth_rxconf`` to configure the receiving
> +  queues to split ingress packets into multiple segments according to the
> +  specified lengths into the buffers allocated from the specified
> +  memory pools. The backward compatibility to existing API is preserved.
> +
>  * ethdev: ``rx_descriptor_done`` dev_ops and ``rte_eth_rx_descriptor_done``
>    will be deprecated in 20.11 and will be removed in 21.11.
>    Existing ``rte_eth_rx_descriptor_status`` and ``rte_eth_tx_descriptor_status``
>
  
Slava Ovsiienko Aug. 6, 2020, 5 p.m. UTC | #4
> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit@intel.com>
> Sent: Thursday, August 6, 2020 19:32
> To: Slava Ovsiienko <viacheslavo@mellanox.com>; dev@dpdk.org
> Cc: Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
> <rasland@mellanox.com>; Thomas Monjalon <thomas@monjalon.net>;
> jerinjacobk@gmail.com; stephen@networkplumber.org;
> arybchenko@solarflare.com; ajit.khaparde@broadcom.com;
> maxime.coquelin@redhat.com; olivier.matz@6wind.com;
> david.marchand@redhat.com
> Subject: Re: [PATCH v2] doc: announce changes to ethdev rxconf structure
> 
> On 8/5/2020 9:49 AM, Viacheslav Ovsiienko wrote:
> > The DPDK datapath in the transmit direction is very flexible.
> > The applications can build multi-segment packets and manages almost
> > all data aspects - the memory pools where segments are allocated from,
> > the segment lengths, the memory attributes like external, registered,
> > etc.
> >
> > In the receiving direction, the datapath is much less flexible, the
> > applications can only specify the memory pool to configure the
> > receiving queue and nothing more. The packet being received can only
> > be pushed to the chain of the mbufs of the same data buffer size and
> > allocated from the same pool. In order to extend the receiving
> > datapath buffer description it is proposed to add the new fields into
> > rte_eth_rxconf structure:
> >
> > struct rte_eth_rxconf {
> >     ...
> >     uint16_t rx_split_num; /* number of segments to split */
> >     uint16_t *rx_split_len; /* array of segment lengths */
> >     struct rte_mempool **mp; /* array of segment memory pools */
> >     ...
> > };
> 
> What is the way to say first 14 bytes will go first mempool and rest will go
> second one?
> Or do you have to define fixed sizes for all segments?
Yes - "rx_split_len" array defines the sizes of segments.

> What if that 'rest' part larger than given buffer size for that mempool?
Error. The size supposed for the appropriate segment must conform to the pool's
requirements, so the segment size must be less than or equal to the mbuf data
buffer size of that pool. It must also be supported by the HW and the PMD; if
there are other limitations, these should be checked in rx_queue_setup and an
error returned if needed.
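
Roughly, a check like the sketch below could be done in rx_queue_setup
(it uses the proposed rxconf fields; the helper name and error code are
illustrative):

/* Illustrative check of the proposed split configuration against the
 * data room of each segment pool (HW/PMD limits would be checked too). */
static int
validate_rx_split(const struct rte_eth_rxconf *rxconf)
{
    uint16_t i;

    for (i = 0; i < rxconf->rx_split_num; i++) {
        uint16_t room = rte_pktmbuf_data_room_size(rxconf->mp[i]) -
                        RTE_PKTMBUF_HEADROOM;

        if (rxconf->rx_split_len[i] > room)
            return -EINVAL; /* segment does not fit this pool's mbufs */
    }
    return 0;
}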

> 
> Intel NICs also has header split support, similar to what Jerin described,
> header and data goes to different buffers, which doesn't require fixed sizes
> and need only two mempools, not sure if it should be integrated to this
> feature but we can discuss later.
HEADER_SPLIT is a port-wide feature, not a per-queue one.
BTW, where is the second pool (for the payload) specified? Am I missing the second pool?

> 
> Also there are some valid concerns Andrew highlighted, like how application
> will know if PMD supports this feature etc.. and more.
> But since these are design/implementation related concerns, not a blocker
> for deprecation notice I think, overall no objection to config structure
> change, hence:
> 
> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>

Thanks a lot for the review, discussion and Deprecation Note ack.

> 
> >
> > The non-zero value of rx_split_num field configures the receiving
> > queue to split ingress packets into multiple segments to the mbufs
> > allocated from various memory pools according to the specified
> > lengths. The zero value of rx_split_num field provides the backward
> > compatibility and queue should be configured in a regular way (with
> > single/multiple mbufs of the same data buffer length allocated from
> > the single memory pool).
> >
> > The new approach would allow splitting the ingress packets into
> > multiple parts pushed to the memory with different attributes.
> > For example, the packet headers can be pushed to the embedded data
> > buffers within mbufs and the application data into the external
> > buffers attached to mbufs allocated from the different memory pools.
> > The memory attributes for the split parts may differ either - for
> > example the application data may be pushed into the external memory
> > located on the dedicated physical device, say GPU or NVMe. This would
> > improve the DPDK receiving datapath flexibility preserving
> > compatibility with existing API.
> >
> > The proposed extended description of receiving buffers might be
> > considered by other vendors to be involved into similar features
> > support, it is the subject for the further discussion.
> >
> > Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> > Acked-by: Jerin Jacob <jerinjacobk@gmail.com>
> >
> > ---
> > v1->v2: commit message updated, proposed to consider the new
> >         fields for supporting similar features by multiple
> > 	vendors
> > ---
> >  doc/guides/rel_notes/deprecation.rst | 5 +++++
> >  1 file changed, 5 insertions(+)
> >
> > diff --git a/doc/guides/rel_notes/deprecation.rst
> > b/doc/guides/rel_notes/deprecation.rst
> > index acf87d3..b6bdb83 100644
> > --- a/doc/guides/rel_notes/deprecation.rst
> > +++ b/doc/guides/rel_notes/deprecation.rst
> > @@ -99,6 +99,11 @@ Deprecation Notices
> >    In 19.11 PMDs will still update the field even when the offload is not
> >    enabled.
> >
> > +* ethdev: add new fields to ``rte_eth_rxconf`` to configure the
> > +receiving
> > +  queues to split ingress packets into multiple segments according to
> > +the
> > +  specified lengths into the buffers allocated from the specified
> > +  memory pools. The backward compatibility to existing API is preserved.
> > +
> >  * ethdev: ``rx_descriptor_done`` dev_ops and
> ``rte_eth_rx_descriptor_done``
> >    will be deprecated in 20.11 and will be removed in 21.11.
> >    Existing ``rte_eth_rx_descriptor_status`` and
> > ``rte_eth_tx_descriptor_status``
> >
  
Thomas Monjalon Aug. 6, 2020, 9:42 p.m. UTC | #5
06/08/2020 14:39, Thomas Monjalon:
> 05/08/2020 13:14, Andrew Rybchenko:
> > On 8/5/20 11:49 AM, Viacheslav Ovsiienko wrote:
> > > The DPDK datapath in the transmit direction is very flexible.
> > > The applications can build multi-segment packets and manages
> > > almost all data aspects - the memory pools where segments
> > > are allocated from, the segment lengths, the memory attributes
> > > like external, registered, etc.
> > > 
> > > In the receiving direction, the datapath is much less flexible,
> > > the applications can only specify the memory pool to configure
> > > the receiving queue and nothing more. The packet being received
> > > can only be pushed to the chain of the mbufs of the same data
> > > buffer size and allocated from the same pool. In order to extend
> > > the receiving datapath buffer description it is proposed to add
> > > the new fields into rte_eth_rxconf structure:
> > > 
> > > struct rte_eth_rxconf {
> > >     ...
> > >     uint16_t rx_split_num; /* number of segments to split */
> > >     uint16_t *rx_split_len; /* array of segment lengths */
> > >     struct rte_mempool **mp; /* array of segment memory pools */
> > >     ...
> > > };
> > > 
> > > The non-zero value of rx_split_num field configures the receiving
> > > queue to split ingress packets into multiple segments to the mbufs
> > > allocated from various memory pools according to the specified
> > > lengths. The zero value of rx_split_num field provides the
> > > backward compatibility and queue should be configured in a regular
> > > way (with single/multiple mbufs of the same data buffer length
> > > allocated from the single memory pool).
> > > 
> > > The new approach would allow splitting the ingress packets into
> > > multiple parts pushed to the memory with different attributes.
> > > For example, the packet headers can be pushed to the embedded data
> > > buffers within mbufs and the application data into the external
> > > buffers attached to mbufs allocated from the different memory
> > > pools. The memory attributes for the split parts may differ
> > > either - for example the application data may be pushed into
> > > the external memory located on the dedicated physical device,
> > > say GPU or NVMe. This would improve the DPDK receiving datapath
> > > flexibility preserving compatibility with existing API.
> > > 
> > > The proposed extended description of receiving buffers might be
> > > considered by other vendors to be involved into similar features
> > > support, it is the subject for the further discussion.
> > > 
> > > Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> > > Acked-by: Jerin Jacob <jerinjacobk@gmail.com>
> > 
> > I"m OK with the idea in general and we'll work on details
> > in the next release cycle.
> > 
> > Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
> 
> I agree we need to be more flexible with the mempools in Rx.
> 
> Acked-by: Thomas Monjalon <thomas@monjalon.net>
> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>

Applied

Implementation will require more design discussions.
  

Patch

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index acf87d3..b6bdb83 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -99,6 +99,11 @@  Deprecation Notices
   In 19.11 PMDs will still update the field even when the offload is not
   enabled.
 
+* ethdev: add new fields to ``rte_eth_rxconf`` to configure the receiving
+  queues to split ingress packets into multiple segments according to the
+  specified lengths into the buffers allocated from the specified
+  memory pools. The backward compatibility to existing API is preserved.
+
 * ethdev: ``rx_descriptor_done`` dev_ops and ``rte_eth_rx_descriptor_done``
   will be deprecated in 20.11 and will be removed in 21.11.
   Existing ``rte_eth_rx_descriptor_status`` and ``rte_eth_tx_descriptor_status``