[dpdk-dev] [RFC 0/8] mbuf: structure reorganization

Olivier Matz olivier.matz at 6wind.com
Thu Mar 2 17:46:23 CET 2017


Hi Konstantin,

On Tue, 28 Feb 2017 22:53:55 +0000, "Ananyev, Konstantin" <konstantin.ananyev at intel.com> wrote:
> > > Another thing that doesn't look very convenient to me here -
> > > We can have 2 different values of timestamp (both normalized and not)
> > > and there is no clear way for the application to know which one is in
> > > use right now. So each app writer would have to come-up with his own
> > > solution.  
> > 
> > It depends:
> > - the solution you describe is to have the application storing the
> >   normalized value in its private metadata.
> > - another solution would be to store the normalized value in
> >   m->timestamp. In this case, we would need a flag to tell if the
> >   timestamp value is normalized.  
> 
> My first thought also was about second flag to specify was timestamp
> already normalized or not.
> Though I still in doubt - is it all really worth it: extra ol_flag, new function in eth_dev API.
> My feeling that we trying to overcomplicate things.

I don't see what is so complicated. The idea is just to let the
application do the normalization if it is required.

If the time is normalized in nanosecond in the PMD, we would still
need to normalized the time reference (the 0). And for that we'd need
a call to a synchronization code as well.



> > The problem pointed out by Jan is that doing the timestamp
> > normalization may take some CPU cycles, even if a small part of packets
> > requires it.  
> 
> I understand that point, but from what I've seen with real example:
> http://dpdk.org/ml/archives/dev/2016-October/048810.html
> the amount of calculations at RX is pretty small.
> I don't think it would affect performance in a noticeable way
> (though I don't have any numbers here to prove it).

I think we can consider by default that adding code in the data path
impacts performance.


> From other side, if user doesn't want a timestamp he can always disable
> that feature anad save cycles, right? 
> 
> BTW, you and Jan both mention that not every packet would need a timestamp.
> Instead we need sort of a timestamp for the group of packets?

I think that for many applications the timestamp should be as precise
as possible for each packet.


> Is that really the only foreseen usage model?

No, but it could be one.


> If so, then why not to have a special function that would extract 'latest' timestamp
> from the dev?
> Or even have tx_burst_extra() that would return a latest timestamp (extra parameter or so).
> Then there is no need to put timestamp into mbuf at all. 

Doing that will give a poor precision for the timestamp.


> > > > Applications that
> > > > are doing this are responsible of what they change.
> > > >
> > > >  
> > > > > 3. In theory with eth_dev_detach() - mbuf->port value might be
> > > > > not valid at the point when application would decide to do
> > > > > normalization.
> > > > >
> > > > > So to me all that approach with delayed normalization seems
> > > > > unnecessary overcomplicated. Original one suggested by Olivier,
> > > > > when normalization is done in PMD at RX look much cleaner and
> > > > > more manageable.  
> > > >
> > > > Detaching a device requires a synchronization between control and
> > > > data plane, and not only for this use case.  
> > >
> > > Of course it does.
> > > But right now it is possible to do:
> > >
> > > eth_rx_burst(port=0, ..., &mbuf, 1);
> > > eth_dev_detach(port=0, ...);
> > > ...
> > > /*process previously received mbuf */
> > >
> > > With what you are proposing it would be not always possible any more.  
> > 
> > With your example, it does not work even without the timestamp feature,
> > since the mbuf input port would reference an invalid port.
> > This port  is usually used in the application to do a lookup for an port structure,
> > so it is expected that the entry is valid. It would be even worse if you
> > do a detach + attach.  
> 
> I am not talking about the mbuf->port value usage.
> Right now user can access/interpret  all metadata fields set by PMD RX routines
> (vlan, rss hash, ol_flags, ptype, etc.) without need to accessing the device data or
> calling device functions.
> With that change it wouldn't be the case anymore. 

That's the same for some other functions. If in my application I want
to call eth_rx_queue_count(m->port), I will have the same problem.

I think we also have something quite similar in examples/ptpclient:

  rte_eth_rx_burst(portid, 0, &m, 1);
  ...
  parse_ptp_frames(portid, m);
  ...
  ptp_data.portid = portid;
  ...
  rte_eth_timesync_read_tx_timestamp(ptp_data->portid, ...)


So, really, I think it's an application issue: when the app deletes
a port, it should ask itself if there are remaining references to
it (m->port).

> > So, I think it is already the responsibility of the application to do
> > the sync (flush retrieved packets before detaching a port).  
> 
> The packets are not in RX or TX queue of detaching device any more.
> I received a packet, after that I expect to have all its data and metadata inside mbuf.
> So I can store mbufs somewhere and process them much later.
> Or might be I would like to pass it to the secondary process for logging/analyzing, etc.

Yes, but that's still an app problem for me.


> > > >In the first solution, the normalization is
> > > > partial: unit is nanosecond, but the time reference is different.  
> > >
> > > Not sure I get you here...  
> > 
> > In the first solution I described, each PMD had to convert its unit
> > into nanosecond. This is easy because we assume the PMD knows the
> > value of its clock. But to get a fully normalized value, it also has to
> > use the same time reference, so we would also need to manage an offset
> > (we need a new API to give this value to the PMD).  
> 
> Yes, I suppose we do need an start timestamp and sort of factor() to convert
> HW value, something like:
> 
> mbuf->timestamp = rxq->start_timestamp  + factor(hw_timestamp);
> 
> Right?
> Why passing start_timestamp at the configure() phase will be a problem?
> 
> > 
> > I have another fear related to hardware clocks: if clocks are not
> > synchronized between PMDs, the simple operation "t * ratio - offset"
> > won't work. That's why I think we could delegate this job in a specific
> > library that would manage this.  
> 
> But then that library would need to account all PMDs inside the system,
> and be aware about each HW clock skew, etc.
> Again, doesn't' sound like an simple task to me. 

Exactly, that's also why I want to let the specialists take care of
it. Having non-normalized timestamps now allow to do the job later
when required, while allowing basic usages as required by metrics
libraries and mlx pmd.



Olivier


More information about the dev mailing list