[dpdk-dev] [PATCH 04/29] mbuf: added offset of packet meta-data in the packet buffer just after mbuf

Dumitrescu, Cristian cristian.dumitrescu at intel.com
Fri May 30 00:28:16 CEST 2014


Hi Ivan,

I hate to disagree with you :), but no, we do not break the scatter-gather feature. We actually implemented the Packet Framework IPv4 fragmentation and reassembly ports to validate exactly this.

Maybe the confusion comes from the RTE_MBUF_TO_BADDR macro. This macro only works (and it is only used) for direct mbufs, so probably the correct name for this macro should be RTE_MBUF_DIRECT_TO_BADDR, as it cannot work (and it is not used) for indirect mbufs.

I am describing the rationale behind meta-data design below, sorry for the long description that looks like a design document ...

Quick recap:
- mbuf->buf_addr points to the first address where a byte of data for the current segment _can_ be placed
- direct mbuf: the packet descriptor (mbuf) sits in the same mempool buffer with the packet itself, so mbuf->buf_addr = mbuf + 1;
- indirect mbuf: the packet descriptor is located in a different mempool buffer than the packet itself, so mbuf->buf_addr != mbuf + 1;
- Regardless of the mbuf type, it is not necessarily true that the first byte of data is located at mbuf->buf_addr, as we save a headroom (configurable, by default initially of CONFIG_RTE_PKTMBUF_HEADROOM = 128 bytes) at the start of the data buffer (mbuf->buf_addr) for prepending packet headers ( ... or, why not, meta-data!).  The location of the first real data byte is mbuf->pkt.data, which is variable, as opposed to mbuf->buf_addr, which does not change.

Packet meta-data rationale:
- I think we both agree that we need to be able to store packet meta-data in the packet buffer. The packet buffer regions where meta-data could be stored can only be the in the headroom or in the tailroom, which are both managed by the application and both go up and down during the packet lifetime.

- To me, packet-metadata is part of the packet descriptor: mbuf is the mandatory & standard part of the packet descriptor, while meta-data is the optional & non-standard (per application) part of the packet descriptor. Therefore, it makes sense to put the meta-data immediately after mbuf, but this is not mandatory.

- The zero-size field called meta-data is really just an offset: it points to the first buffer location where meta-data _can_ be placed. The reason for having this field is to provide a standard way to place meta-data in the packet buffer rather that hide it in the Packet Framework libraries and potentially conflict with other mbuf changes in the future. It flags that meta-data should be taken into consideration when any mbuf change is done in  the future.

- For direct mbufs, the same buffer space (the headroom) is shared between packet data (header prepending) and meta-data similar to how the stack and the heap manage the same memory. Of course, since it is managed by the application, it is the responsibility of the application to make sure the packet bytes and the meta-data do not step on one another, but this problem is not at all new, nor introduced by the meta-data field: even currently, the application has to make sure the headroom is dimensioned correctly, so that even in the worst case scenario (application-specific), the packet bytes do not run into the mbuf fields, right?

- For indirect mbufs, the packet meta-data is associated with the indirect mbuf rather than the direct mbuf (the one the indirect mbuf attaches to, as the direct mbuf contains a different packet that has different meta-data), so it makes sense that meta-data of the indirect mbuf is stored in the same buffer as the indirect mbuf (rather than the buffer of the attached direct mbuf). So this means that the buffer size used to store the indirect mbuf is sized appropriately (mbuf size, currently 64 bytes, plus the max size for additional meta-data). This is illustrated in the code of the IPv4 fragmentation port, where for every child packet (output fragment) we copy the parent meta-data in the child buffer (which stores an indirect buffer attached to the direct buffer of the input jumbo, plus now additional meta-data).

- We are also considering having a user_data field in the mbuf itself to point to meta-data in the same buffer or any other buffer. The Packet Framework functionally works with both approaches, but there is a performance problem associated with the mbuf->user_data approach that we are not addressing for this 1.7 release timeframe. The issue is the data dependency that is created, as in order to find out the location of the meta-data, we need to first read the mbuf  and then read meta-data from mbuf->user_data. The cost of the additional memory access is high, due to data dependency preventing efficient prefetch, therefore (at least for the time being) we need to use a compile-time known meta-data offset. The application can fill the meta-data on packet RX and then all additional CPU cores processing the packet can leave of the meta-data for a long time with no need to access the mbuf or the packet (efficiency). For the time being, if somebody needs more yet meta-data in yet another buffer, they can add (for their application) a user_data pointer as part of their application meta-data (instead of standard mbuf).

- As said, the mbuf->metadata points to the first location where meta-data _can_ be placed, if somebody needs a different offset, they can add it on top of the mbuf->metadata (e.g. by having a reserved field in their struct app_pkt_metadata). We have demonstrated the use of meta-data in the examples/ip_pipeline sample app (see struct app_pkt_metadata in "main.h").

Please let me know if this makes sense?

Regards,
Cristian
 
PS: This is where a white board would help a lot ...


-----Original Message-----
From: Ivan Boule [mailto:ivan.boule at 6wind.com] 
Sent: Wednesday, May 28, 2014 1:03 PM
To: Dumitrescu, Cristian; dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH 04/29] mbuf: added offset of packet meta-data in the packet buffer just after mbuf

Hi Cristian,

Currently, the DPDK framework does not make any assumption on the actual
layout of a mbuf.
More precisely, the DPDK does not impose any constraint on the actual
location of additional metadata, if any, or on the actual location and
size of its associated payload data buffer.
This is coherent with the fact that mbuf pools are not created by the
DPDK itself, but by network applications that are free to choose
whatever packet mbuf layout that fits their particular needs.

There is one exception to this basic DPDK rule: the mbuf cloning feature 
available through the RTE_MBUF_SCATTER_GATHER configuration option 
assumes that the payload data buffer of the mbuf immediately follows the 
rte_mbuf data structure (see the macros RTE_MBUF_FROM_BADDR, 
RTE_MBUF_TO_BADDR, RTE_MBUF_INDIRECT, and RTE_MBUF_DIRECT in the file 
lib/librte_mbuf/rte_mbuf.h).

The cloning feature prevents to build packet mbufs with their metadata 
located immediately after the rte_mbuf data structure, which is exactly 
what your patch introduces.

At least, a comment that clearly warns the user of this incompatibility
might be worth adding into both the code and your patch log.

Regards,
Ivan

On 05/27/2014 07:09 PM, Cristian Dumitrescu wrote:
> Added zero-size field (offset in data structure) to specify the beginning of packet meta-data in the packet buffer just after the mbuf.
>
> The size of the packet meta-data is application specific and the packet meta-data is managed by the application.
>
> The packet meta-data should always be accessed through the provided macros.
>
> This is used by the Packet Framework libraries (port, table, pipeline).
>
> There is absolutely no performance impact due to this mbuf field, as it does not take any space in the mbuf structure (zero-size field).
>
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu at intel.com>
> ---
>   lib/librte_mbuf/rte_mbuf.h |   17 +++++++++++++++++
>   1 files changed, 17 insertions(+), 0 deletions(-)
>
> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> index 4a9ab41..bf09618 100644
> --- a/lib/librte_mbuf/rte_mbuf.h
> +++ b/lib/librte_mbuf/rte_mbuf.h
> @@ -201,8 +201,25 @@ struct rte_mbuf {
>   		struct rte_ctrlmbuf ctrl;
>   		struct rte_pktmbuf pkt;
>   	};
> +	
> +	union {
> +		uint8_t metadata[0];
> +		uint16_t metadata16[0];
> +		uint32_t metadata32[0];
> +		uint64_t metadata64[0];
> +	};
>   } __rte_cache_aligned;
>
> +#define RTE_MBUF_METADATA_UINT8(mbuf, offset)       (mbuf->metadata[offset])
> +#define RTE_MBUF_METADATA_UINT16(mbuf, offset)      (mbuf->metadata16[offset/sizeof(uint16_t)])
> +#define RTE_MBUF_METADATA_UINT32(mbuf, offset)      (mbuf->metadata32[offset/sizeof(uint32_t)])
> +#define RTE_MBUF_METADATA_UINT64(mbuf, offset)      (mbuf->metadata64[offset/sizeof(uint64_t)])
> +
> +#define RTE_MBUF_METADATA_UINT8_PTR(mbuf, offset)   (&mbuf->metadata[offset])
> +#define RTE_MBUF_METADATA_UINT16_PTR(mbuf, offset)  (&mbuf->metadata16[offset/sizeof(uint16_t)])
> +#define RTE_MBUF_METADATA_UINT32_PTR(mbuf, offset)  (&mbuf->metadata32[offset/sizeof(uint32_t)])
> +#define RTE_MBUF_METADATA_UINT64_PTR(mbuf, offset)  (&mbuf->metadata64[offset/sizeof(uint64_t)])
> +
>   /**
>    * Given the buf_addr returns the pointer to corresponding mbuf.
>    */
>


-- 
Ivan Boule
6WIND Development Engineer
--------------------------------------------------------------
Intel Shannon Limited
Registered in Ireland
Registered Office: Collinstown Industrial Park, Leixlip, County Kildare
Registered Number: 308263
Business address: Dromore House, East Park, Shannon, Co. Clare

This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies.




More information about the dev mailing list